Operator Learning: From Theory to Practice | 9am PT Mar 26

Grigory Bronevetsky

Mar 24, 2024, 12:30:49 AM
to ta...@modelingtalks.org

Modeling Talks

Operator Learning: From Theory to Practice

Nikola Kovachki, NVIDIA

Tuesday, Mar 26 | 9am PT

Meet | YouTube Stream


Hi all,


The presentation will be via Meet and all questions will be addressed there. If you cannot attend live, the event will be recorded and can be found afterward at
https://sites.google.com/modelingtalks.org/entry/operator-learning-from-theory-to-practice


Abstract: We present a general framework for approximating non-linear maps between infinite dimensional Banach spaces from observations. Our approach follows the "discretize last" philosophy by designing approximation architectures directly on the function spaces of interest without tying parameters to any finite dimensional discretization. Such architectures exhibit an approximation error that is independent of the training data discretization and can utilize data sources with diverse discretizations, common to many engineering problems. We review the infinite-dimensional approximation theory for such architectures, showing the universal approximation property and the manifestation of the curse of dimensionality, translating algebraic rates in finite dimensions to exponential rates in infinite dimensions. We discuss efficient approximation of certain operators arising from parametric partial differential equations (PDEs) and show that efficient parametric approximation implies efficient data approximation. We demonstrate the utility of our framework numerically on a variety of large-scale problems arising in fluid dynamics, porous media flow, weather modeling, and crystal plasticity. Our results show that data-driven methods can provide orders of magnitude in computational speed-up at a fixed accuracy compared to classical numerical methods and hold immense promise in modeling complex physical phenomena across multiple scales.


Bio: Nik Kovachki is a research scientist at NVIDIA in the Learning and Perception group. His work focuses on understanding the connections between machine learning and scientific computing. Nik received a B.Sc. in mathematics from Caltech in 2016, and a Ph.D. in applied and computational mathematics from Caltech in 2022 under the supervision of Prof. Andrew Stuart. In 2025, Nik will join the mathematics faculty at the Courant Institute of Mathematical Sciences at New York University.


More information on previous and future talks: https://sites.google.com/modelingtalks.org/entry/home

Grigory Bronevetsky

Mar 31, 2024, 3:12:46 AM
to Talks, Grigory Bronevetsky
Video Recording: https://youtu.be/zjxzzNl3ZRA

Summary

  • Focus: Operator learning and how it is useful for scientific computing

  • Motivation:

    • Goal is modeling fluids, materials, weather

    • Operator learning applications: 

      • Speed up expensive simulations

      • Model unknown dynamics from data

  • Setting:

    • Map between separable Banach spaces

    • Function -> Function

    • A function represents a mapping from some coordinate space to the state of the world at points in that space

    • Goal is to approximate the function->function mapping to minimize approximation error
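
A rough formalization of this setting (notation mine, not from the talk): for a ground-truth operator G† : A -> U between separable Banach spaces and a probability measure mu on the input functions, the objective is

  \min_{\theta} \; \mathbb{E}_{a \sim \mu} \, \big\| G^{\dagger}(a) - G_{\theta}(a) \big\|_{U}

estimated in practice from finitely many training pairs (a_i, G†(a_i)).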

  • Example:

    • Semi-discrete heat equation (discretized in time, not space)

      • Forward Euler formulation loses stability when represented as a function-space transform

      • Backward Euler is stable; the choice of approximation parameters is independent of the discretization parameters (see the sketch after this section)

    • Designing an operator learning architecture involves the same type of design choice

      • Goal: discretization invariance:

        • decouple cost from discretization

        • use information at different discretizations

        • transfer learn across discretizations
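
A small numerical sketch of the forward/backward Euler comparison above (a toy example of my own, not code from the talk): the same fixed time step dt is reused on finer and finer spatial grids. Forward Euler blows up once the grid refinement violates its stability restriction (roughly dt <= h^2/2), while backward Euler stays stable at every resolution, which is the sense in which its parameters can be chosen independently of the discretization.

import numpy as np

def laplacian(n):
    # Second-order finite-difference Laplacian on (0, 1) with Dirichlet boundaries.
    h = 1.0 / (n + 1)
    return (np.diag(-2.0 * np.ones(n))
            + np.diag(np.ones(n - 1), 1)
            + np.diag(np.ones(n - 1), -1)) / h**2

dt, steps = 1e-3, 10                      # fixed time step, chosen independently of the grid
for n in (16, 64, 256):                   # progressively finer spatial grids
    x = np.linspace(0.0, 1.0, n + 2)[1:-1]
    u0 = np.where(x < 0.5, 1.0, -1.0)     # a discontinuity excites high-frequency modes
    A = laplacian(n)
    u_fwd, u_bwd = u0.copy(), u0.copy()
    for _ in range(steps):
        u_fwd = u_fwd + dt * (A @ u_fwd)                    # forward (explicit) Euler
        u_bwd = np.linalg.solve(np.eye(n) - dt * A, u_bwd)  # backward (implicit) Euler
    print(f"n={n:4d}  forward max|u|={np.abs(u_fwd).max():.2e}  "
          f"backward max|u|={np.abs(u_bwd).max():.2e}")

With dt = 1e-3, the n = 16 grid satisfies the explicit stability bound and decays, while n = 64 and n = 256 diverge by many orders of magnitude; backward Euler stays bounded in all three cases.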

  • Architectures and Approximation Theory

    • Reduced order modeling: 

      • Idea:

        • Encode original function into a low-dim approximation, 

        • Transform this low-dim approximation to another function, 

        • Then decode back into original representation

      • Goal: make the low-dim approximation/transform accurate, i.e., it approximately commutes with the original transform

      • Universal approximation is possible in this framework

    • PCA-based instantiation:

      • Encode the original function using PCA -> coefficients on the leading eigenfunctions of the input data

      • Decode via inverse PCA

      • Low-dim transform: neural net

      • Can prove that any level of accuracy is achievable given enough PCA eigenfunctions

      • Challenge: for this linear encoding the number of eigenfunctions needed can grow exponentially; a non-linear approximation is needed
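
A minimal sketch of the PCA-based encode/transform/decode instantiation above (the construction, the toy operator, and all names are mine, not from the talk), assuming scikit-learn is available for the small neural net:

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_grid, n_train, d_latent = 128, 400, 16
x = np.linspace(0.0, 1.0, n_grid)

# Toy dataset of function pairs: random smooth inputs a(x) and outputs u = G(a),
# where G is a running integral chosen purely for illustration.
a = sum(rng.standard_normal((n_train, 1)) * np.sin((k + 1) * np.pi * x) / (k + 1)
        for k in range(8))
u = np.cumsum(a, axis=1) / n_grid

# Linear encoder/decoder: top d_latent PCA modes of the training snapshots.
Va = np.linalg.svd(a, full_matrices=False)[2][:d_latent]   # input basis
Vu = np.linalg.svd(u, full_matrices=False)[2][:d_latent]   # output basis

def encode(f, V):   # function values -> PCA coefficients
    return f @ V.T

def decode(c, V):   # PCA coefficients -> function values
    return c @ V

# Finite-dimensional neural net mapping input coefficients to output coefficients.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000, random_state=0)
net.fit(encode(a, Va), encode(u, Vu))

u_pred = decode(net.predict(encode(a, Va)), Vu)
print("relative training error:", np.linalg.norm(u_pred - u) / np.linalg.norm(u))

The structural point is that only PCA coefficients, never grid values, enter the neural net; the grid appears only in the (linear) encode and decode steps.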

    • Non-linear instantiation (neural operator layers):

      • Sequence of neural layers

      • Kernel transforms data into reduced form

      • Apply linear weights

      • Push through a non-linear activation (e.g. sigmoid or tanh, as in normal neural nets); a one-layer sketch follows this subsection

      • Approximation is more non-linear and efficient

      • Challenge: choice of kernel

        • Many kernels directly imply a data representation, 

        • E.g. CNNs impose a specific grid

        • More flexible: 

          • Transforms: Fourier, spherical harmonics, wavelets, Laplace-Beltrami

          • Adaptive meshing / multipole

          • Allow selective discretization that uses different levels of approximation in different spatial regions

      • Approximation

        • Can show that for each architecture there is some bad map that requires exponentially many parameters

        • So the worst case is bad, but what can we approximate successfully?

        • For each approximation method try to find the space of functions the method can approximate efficiently (polynomially many parameters)

        • Hard to characterize this space but can show it is non-empty

          • E.g. Navier-Stokes model of incompressible fluids

          • Can prove this can be approximated with only polynomially many parameters
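
As referenced above, a sketch of a single Fourier-type kernel layer (my own construction with random, untrained weights; not code from the talk). The learnable parameters act on a fixed number of Fourier modes plus a pointwise linear term, so the same weights can be applied to a function sampled at any resolution:

import numpy as np

class FourierLayer1D:
    """One kernel-integral layer acting in Fourier space (random weights, untrained)."""
    def __init__(self, n_modes, rng):
        self.n_modes = n_modes
        # Spectral weights for the lowest n_modes frequencies, plus a pointwise
        # (local) linear term; none of these are tied to a grid size.
        self.R = rng.standard_normal(n_modes) + 1j * rng.standard_normal(n_modes)
        self.w = rng.standard_normal()
        self.b = rng.standard_normal()

    def __call__(self, v):
        # v: values of a periodic function on a uniform grid of arbitrary length.
        v_hat = np.fft.rfft(v)
        out_hat = np.zeros_like(v_hat)
        k = min(self.n_modes, v_hat.size)
        out_hat[:k] = self.R[:k] * v_hat[:k]             # kernel applied mode by mode
        nonlocal_part = np.fft.irfft(out_hat, n=v.size)
        return np.tanh(self.w * v + self.b + nonlocal_part)   # pointwise nonlinearity

rng = np.random.default_rng(0)
layer = FourierLayer1D(n_modes=12, rng=rng)

# The same (untrained) layer applied to the same function sampled at two resolutions.
for n in (128, 512):
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    out = layer(np.sin(2 * np.pi * x))
    print(n, out[[0, n // 4, n // 2]])    # values at the shared points x = 0, 0.25, 0.5

The printed values at the shared physical points agree across the two resolutions, which is the discretization-invariance property in miniature.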

    • Data complexity

      • Instantiate the framework with an encoder that uses a differentiable function to sample and encode the data, then decode it (a toy sampler sketch follows this subsection)

        • E.g. finite sampler

      • Can prove that in the worst case the number of samples required for an approximation grows exponentially

      • But can show that if the operator can be approximated with polynomially many parameters, the data-driven approximation needs only polynomially many samples
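
A toy illustration of the "finite sampler" encoder above (the functions, sensor locations, and example operator are mine): the learner never sees the input and output functions themselves, only their values at a fixed set of sensor points.

import numpy as np

def finite_sampler(f, sensors):
    # Encode a function, given as a callable, by evaluating it at fixed sensor points.
    return np.array([f(s) for s in sensors])

sensors = np.linspace(0.0, 1.0, 32)     # fixed observation locations

def a_fn(t):                            # an input function a(t)
    return np.sin(2 * np.pi * t)

def u_fn(t):                            # its image u = G(a) for G = antiderivative
    return (1.0 - np.cos(2 * np.pi * t)) / (2 * np.pi)

# One training pair as the learner actually sees it: finite point samples of (a, G(a)).
a_obs, u_obs = finite_sampler(a_fn, sensors), finite_sampler(u_fn, sensors)
print(a_obs.shape, u_obs.shape)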

  • Applications

    • 3D RANS Simulations

      • Training: 500 converged simulations at Reynolds number 5,000,000

      • Map: inlet velocity to wall shear stress

      • Used Geometry-Informed Neural Operator to approximate simulation efficiently

    • Weather modeling

      • Used ERA5 Reanalysis from ECMWF

      • 1979-2018, at 1-hour intervals

      • 721x1440 equiangular grid

      • Parameterized using spherical harmonics

      • Matches the accuracy of the physics-based model at much lower computational cost

