Fasi, M and Mikaitis, M orcid.org/0000-0001-8706-1436 (2023) CPFloat: A C library for simulating low-precision arithmetic. ACM Transactions on Mathematical Software, 49 (2). 18. ISSN 0098-3500
Abstract
One can simulate low-precision floating-point arithmetic via software by executing each arithmetic operation in hardware and then rounding the result to the desired number of significant bits. For IEEE-compliant formats, rounding requires only standard mathematical library functions, but handling subnormals, underflow, and overflow demands special attention, and numerical errors can cause mathematically correct formulae to behave incorrectly in finite arithmetic. Moreover, the ensuing implementations are not necessarily efficient, as the library functions these techniques build upon are typically designed to handle a broad range of cases and may not be optimized for the specific needs of rounding algorithms. CPFloat is a C library for simulating low-precision arithmetics. It offers efficient routines for rounding, performing mathematical computations, and querying properties of the simulated low-precision format. The software exploits the bit-level floating-point representation of the format in which the numbers are stored, and replaces costly library calls with low-level bit manipulations and integer arithmetic. In numerical experiments, the new techniques bring a considerable speedup (typically one order of magnitude or more) over existing alternatives in C, C++, and MATLAB. To our knowledge, CPFloat is currently the most efficient and complete library for experimenting with custom low-precision floating-point arithmetic.
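The abstract's core idea (compute in hardware, then round to fewer significant bits using bit manipulation and integer arithmetic) can be illustrated with a minimal sketch. This is not CPFloat's API: the function name `round_to_precision` and its interface are hypothetical, chosen only for illustration. The sketch assumes IEEE 754 binary64 storage and a finite, normal input, and it implements only round-to-nearest with ties-to-even. It reduces precision without narrowing the exponent range, so the overflow, underflow, and subnormal handling that the abstract flags as needing special attention is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, NOT part of CPFloat: round a binary64 value to
 * p significant bits (2 <= p <= 53) using round-to-nearest, ties-to-even.
 * Assumption: x is finite and normal. The exponent range of the target
 * format is not enforced, so target-format overflow, underflow, and
 * subnormals are not simulated. */
double round_to_precision(double x, int p)
{
    assert(p >= 2 && p <= 53);
    if (p == 53)
        return x;                       /* already at binary64 precision */

    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);     /* well-defined way to read the bits */

    if (((bits >> 52) & 0x7FF) == 0x7FF)
        return x;                       /* leave infinities and NaNs untouched */

    int drop = 53 - p;                  /* trailing significand bits to discard */
    uint64_t mask = (UINT64_C(1) << drop) - 1;
    uint64_t half = UINT64_C(1) << (drop - 1);
    uint64_t lsb  = (bits >> drop) & 1; /* last bit that will be kept */

    /* Adding (half - 1 + lsb) rounds up when the discarded part exceeds
     * half, or equals half and the kept part is odd. A carry out of the
     * significand propagates into the exponent field, which is the
     * correct result when rounding up to the next power of two. */
    bits += half - 1 + lsb;
    bits &= ~mask;                      /* clear the discarded bits */

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For instance, `round_to_precision(0.1, 8)` keeps the 8 significant bits of a bfloat16 significand and yields 0.10009765625, while `round_to_precision(1.0 / 3.0, 11)` keeps the 11 bits of a binary16 significand and yields 0.333251953125. No mathematical library function is called; the rounding is a single integer addition and a mask.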
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Fasi, M; Mikaitis, M (orcid.org/0000-0001-8706-1436) |
Copyright, Publisher and Additional Information: | © 2023 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution International 4.0 License. |
Keywords: | low-precision arithmetic, floating-point arithmetic, mixed precision, IEEE 754 standard, binary16, bfloat16, round-to-nearest, directed rounding, round-to-odd, stochastic rounding |
Dates: |
|
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 18 Apr 2023 15:27 |
Last Modified: | 15 Nov 2024 11:36 |
Published Version: | https://dl.acm.org/doi/10.1145/3585515#sec-10 |
Status: | Published |
Publisher: | ACM |
Identification Number: | 10.1145/3585515 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:196270 |
Download
Filename: CPFloat A C Library for Simulating Low-precision Arithmetic.pdf
Licence: CC-BY 4.0