Error Analysis of Matrix Multiplication with Narrow Range Floating-Point Arithmetic

Abstract

High-performance computing hardware now supports many different floating-point formats, from 64 bits to only 4 bits. While the effects of reducing precision in numerical linear algebra computations have been extensively studied, some of these low precision formats also possess a very narrow range of representable values, meaning underflow and overflow are very likely. The goal of this article is to analyze the consequences of this narrow range on the accuracy of matrix multiplication. We describe a simple scaling that can prevent overflow while minimizing underflow. We carry out an error analysis to bound the underflow errors and show that they should remain dominated by the rounding errors in most practical scenarios. We also show that this conclusion remains true when multiword arithmetic is used. We perform extensive numerical experiments that confirm that the narrow range of low precision arithmetics should not significantly affect the accuracy of matrix multiplication, provided a suitable scaling is used.
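The abstract does not spell out the scaling, so the following is only a rough sketch of the general idea, not the authors' method. It assumes power-of-two scaling of the rows of A and the columns of B, fp16 inputs, and fp32 accumulation. The helper name `scaled_matmul_fp16` and the target `theta` are hypothetical choices made for illustration.

```python
import numpy as np

def scaled_matmul_fp16(A, B, theta=2.0**15):
    """Hypothetical helper: compute A @ B with fp16 inputs and fp32 accumulation.

    Rows of A and columns of B are scaled by powers of two so that their
    largest entry lies just below theta. theta = 2**15 is an assumed target
    that leaves headroom below the fp16 maximum of 65504.
    """
    A = np.asarray(A, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)

    # Largest magnitude in each row of A and in each column of B.
    row_max = np.max(np.abs(A), axis=1)
    col_max = np.max(np.abs(B), axis=0)

    # Replace zeros by 1 so the logarithm is defined; those rows/columns
    # end up with a scale factor of 1.
    safe_row = np.where(row_max > 0, row_max, 1.0)
    safe_col = np.where(col_max > 0, col_max, 1.0)
    row_exp = np.where(row_max > 0, np.floor(np.log2(theta / safe_row)), 0.0)
    col_exp = np.where(col_max > 0, np.floor(np.log2(theta / safe_col)), 0.0)

    # Power-of-two factors, so scaling and unscaling add no rounding error.
    r = np.exp2(row_exp)
    c = np.exp2(col_exp)

    # Scale up towards the top of the fp16 range, then round to fp16.
    A16 = (A * r[:, None]).astype(np.float16)
    B16 = (B * c[None, :]).astype(np.float16)

    # Multiply the fp16 values with accumulation in fp32.
    C32 = A16.astype(np.float32) @ B16.astype(np.float32)

    # Undo the scaling in higher precision.
    return C32.astype(np.float64) / r[:, None] / c[None, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 16)) * 1e-6   # entries would be subnormal in fp16
B = rng.standard_normal((16, 8)) * 1e6    # most entries would overflow in fp16
C = scaled_matmul_fp16(A, B)
rel_err = np.linalg.norm(C - A @ B) / np.linalg.norm(A @ B)
```

Pushing each row and column towards the top of the representable range is what keeps the entries away from the underflow threshold, while the cap at `theta` keeps them from overflowing when rounded to fp16.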

Metadata

Item Type:	Article
Authors/Creators:	Mary, T.; Mikaitis, M. (ORCID: https://orcid.org/0000-0001-8706-1436)
Copyright, Publisher and Additional Information:	This is an author produced version of an article published in SIAM Journal on Scientific Computing, made available under the terms of the Creative Commons Attribution License (CC-BY), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited.
Keywords:	rounding error analysis, floating-point arithmetic, low precision, matrix multiplication, underflow, GPU tensor cores
Dates:	Accepted: 14 March 2025; Published (online): 1 July 2025; Published: August 2025
Institution:	The University of Leeds
Academic Units:	The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds)
Depositing User:	Symplectic Publications
Date Deposited:	17 Mar 2025 15:45
Last Modified:	06 Aug 2025 10:47
Status:	Published
Publisher:	Society for Industrial and Applied Mathematics
Identification Number:	10.1137/24M1685109
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:224490

Download

Accepted Version

Filename: mami25-aam.pdf

Licence: CC-BY 4.0

