Although the basic principles of IEEE floating point arithmetic have remained largely unchanged since the first edition of my book Numerical Computing with IEEE Floating Point Arithmetic was published by SIAM in 2001, the technology that supports it has changed enormously. Every chapter of the book has been rewritten extensively, and two new chapters have been added: one on computations with higher precision than that mandated by the standard, needed for a variety of scientific applications, and one on computations with lower precision than was ever contemplated by those who wrote the standard, driven by the massive computational demands of machine learning. Topics include the rationale for floating point representation, correctly rounded arithmetic and exception handling, support for the standard by floating point microprocessors and programming languages, and an introduction to the key concepts of cancellation, conditioning and stability. The book gives many technical details that are not readily available elsewhere. The second edition was published by SIAM in May 2025.
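As a quick taste of one of those key concepts, here is a minimal Python sketch of catastrophic cancellation (a standard textbook illustration, not an excerpt from the book):

    # Catastrophic cancellation: subtracting nearly equal quantities
    # exposes the rounding error committed in earlier operations.
    x = 1e-15
    y = (1.0 + x) - 1.0   # 1 + x rounds; the subtraction itself is exact
    print(y)              # 1.1102230246251565e-15
    print(x)              # 1e-15: the computed y has ~11% relative error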
Hundreds of GPU programmers are being stymied by the NaNs and INFs that arise during computation, often polluting loss functions (ML) and residuals (HPC). The debugging problem is exacerbated by GPU kernels being closed-source and launched from scripts written in Python, Julia, etc. While one may build binary analysis tools to analyze exceptions, separate tools are needed for different GPUs. Finally, one would like to detect exceptions at a higher level (e.g., LLVM), but the lack of publicly available GPU support in LLVM makes such tools more readily targetable at CPUs than at GPUs.
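To make the failure mode concrete, here is a minimal sketch (plain NumPy on the CPU, since the same IEEE semantics apply; the loss function and data are hypothetical, not from any of the tools discussed) of how one invalid operation silently pollutes a loss:

    import numpy as np

    def normalized_mse(pred, target):
        centered = pred - pred.mean()
        norm = np.linalg.norm(centered)    # 0.0 for an all-equal batch
        with np.errstate(invalid="ignore"):  # mirror the GPU's non-trapping default
            scaled = centered / norm         # 0/0 elementwise -> NaN
        return float(np.mean((scaled - target) ** 2))

    pred = np.zeros(4)     # degenerate batch: all predictions equal
    target = np.ones(4)
    print(normalized_mse(pred, target))  # nan: the loss is silently polluted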
In this talk, we will briefly survey tools that can help detect and diagnose floating-point exceptions. The bulk of the talk will be devoted to the tools written at Utah: namely GPU-FPX (for GPU SIMT cores) and its 'nixnan' variant (for GPU Tensor Cores). We will run a few demos that illustrate the ease of use of GPU-FPX on a variety of codes: simple data compressors, simple GPTs, and Python/Julia codes. While GPU-FPX currently helps "X-ray" down the stack of kernel calls, knowing what these kernels do, which of the detected exceptions are relevant, and which exception coercion rules (to normal values) are sound remains unsolved. The only clear guidance we know of, consistent exception handling due to Demmel, does not seem to hold in practice and is inefficient if followed literally. Given that exceptions already occur at high frequency and will multiply in their manifestations across different hardware and software, clear guidelines for exception coercion and blame assignment are needed.
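As an illustration of what coercion to normal values looks like in practice, here is a small sketch (plain NumPy; the clamping constants are arbitrary assumptions, not a recommended or sound rule):

    import numpy as np

    x = np.array([1.0, np.nan, np.inf, -np.inf])
    # One ad hoc coercion rule: clamp NaN to 0 and +/-Inf to large finite
    # values; NumPy's nan_to_num implements exactly this kind of clamping.
    coerced = np.nan_to_num(x, nan=0.0, posinf=1e38, neginf=-1e38)
    print(coerced)                       # [ 1.e+00  0.e+00  1.e+38 -1.e+38]
    # Coercion makes downstream numbers look plausible while erasing the
    # evidence needed for blame assignment:
    print(np.mean(x), np.mean(coerced))  # nan vs. a finite but dubious mean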
The talk will highlight how we mined exceptions from Tensor Cores in nixnan (reference: github.com/ganeshutah/PLDI25-Array-Workshop), and will also summarize FloatGuard (an AMD exception-checking tool from UC Davis; HPDC'25) and FPChecker (an LLVM-based exception-checking tool from Livermore; IISWC'22). We will devote ~15 minutes to gathering audience feedback to help us prepare for our SC'25 tutorial on this topic this November in St. Louis, MO.
Additional input from: Xinyi Li (Utah), Dolores Miao (UC Davis), Harvey Dam (Utah), Cindy Rubio-Gonzalez (UC Davis), and Ignacio Laguna (LLNL).