FPTalks on May 7: Mantas Mikaitis (University of Leeds)

Pavel Panchekha

Apr 30, 2026, 11:47:38 AM

to fpb...@fpbench.org

Hello all!

The next FPTalks Seminar is Thursday, May 7 from 9–10am Pacific Time on Zoom:

https://washington.zoom.us/j/99708186928?pwd=HbhpebAtCWvoP4VQahYb8G1QpQnTgm.1

I'm super excited for Mantas to talk about the accurate models of NVIDIA Tensor Cores his team has developed. Here's the abstract:

Matrix multiplication is a fundamental operation in both training of neural networks and inference. To accelerate matrix multiplication, Graphics Processing Units (GPUs) provide it implemented in hardware. Due to the increased throughput over software-based matrix multiplication, the multipliers are increasingly used outside of AI, to accelerate various applications in scientific computing. However, matrix multipliers targeted at AI are at present not compliant with IEEE 754 floating-point arithmetic behaviour, with different vendors offering different numerical features. This leads to non-reproducible results across different generations of GPU architectures, at the matrix multiply-accumulate instruction level.

To study numerical characteristics of matrix multipliers—such as rounding behaviour, accumulator width, normalization points, extra carry bits, and others—test vectors are typically constructed. Yet, these vectors may or may not distinguish between different hardware models, and due to limited hardware availability, their reliability across many different platforms remains largely untested. We present software models for emulating the inner product behavior of low- and mixed-precision matrix multipliers in the V100, A100, H100 and B200 data center GPUs in most supported input formats of interest to mixed-precision algorithm developers: 8-, 16-, and 19-bit floating point. These matrix multiplier models are first approximated by determining the numerical features via test vectors designed to trigger outputs sensitive to bit level differences in the implementation, followed by semi-exhaustive comparison (randomised input vectors of 10⁷ values) between the models and the actual GPU matrix multipliers—this process is repeated until the model is bit accurate. These models enable verification of test vectors before applying them to real hardware and also support computational scientists and mixed-precision algorithm developers with easy-to-use accurate models available in MATLAB—we demonstrate their use in multi-word emulation algorithms for matrix multiplication.
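
To make the idea of a distinguishing test vector concrete, here is a minimal sketch in Python. It is not one of the models from the talk: `aligned_sum` and `keep_bits` are hypothetical names, and the "align to the largest term, truncate, then add" behaviour is only an assumed toy accumulator. The point is that a single input vector can give different outputs under two candidate accumulator widths, so running it on hardware would rule one of them out.

```python
from fractions import Fraction
import math


def aligned_sum(terms, keep_bits):
    """Toy accumulator (an assumption, not a real GPU model).

    Every term is aligned to the exponent of the largest-magnitude term,
    truncated toward zero to `keep_bits` fractional bits below that
    exponent, and the truncated terms are then added exactly.
    """
    terms = [Fraction(t) for t in terms]
    largest = max(abs(t) for t in terms)
    if largest == 0:
        return Fraction(0)
    # frexp returns (m, exp) with 0.5 <= m < 1, so 2**(exp - 1) <= largest.
    e = math.frexp(float(largest))[1] - 1
    quantum = Fraction(2) ** (e - keep_bits)
    # Bits below `quantum` are shifted out (truncation toward zero).
    return sum(math.trunc(t / quantum) * quantum for t in terms)


# Test vector: one large product and two products just below the last
# kept bit of a 23-fractional-bit accumulator.
vector = [Fraction(1), Fraction(1, 2**24), Fraction(1, 2**24)]

narrow = aligned_sum(vector, keep_bits=23)  # small terms are shifted out
wide = aligned_sum(vector, keep_bits=24)    # one extra bit keeps them

print(narrow == wide)  # the two candidate models disagree on this vector
```

With 23 fractional bits both small terms are truncated to zero, while one extra bit preserves them and they add up to one unit in the last place of the 23-bit grid. A vector on which every candidate model agrees would tell you nothing about the hardware, which is why such vectors are worth checking against models first.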

The work is based on a preprint available on arXiv. Looking forward to seeing everyone!

As a reminder, if you would like to give a talk, or know of someone who would be great for an FPBench Community meeting, please fill out the speaker suggestion form or pass it along to them!

—Pavel Panchekha

Pavel Panchekha

May 5, 2026, 12:00:25 PM

to fpb...@fpbench.org

Reminder: FPTalks Seminar on Accurate models of NVIDIA Tensor Cores by Mantas Mikaitis this Thursday, 9–10am Pacific Time on Zoom.