# Seminar for machine learning and UQ in scientific computing

This seminar is organized by the Scientific Computing group of CWI Amsterdam. The focus is on the application of Machine Learning (ML) and Uncertainty Quantification (UQ) in scientific computing. Topics of interest include, among others:

- combination of data-driven models and (multiscale) simulations,
- new ML architectures suited for scientific computing or UQ,
- incorporation of (physical) constraints into data-driven models,
- efficient (online) learning strategies,
- using ML for dimension reduction / creating surrogates,
- inverse problems using ML surrogates,

and any other topic in which some form of ML and/or UQ is used to enhance (existing) scientific computing methodologies. All applications are welcome, be it financial, physical, biological or otherwise.

For more information, or if you'd like to attend one of the talks, please contact Wouter Edeling of the SC group.

**Schedule of upcoming talks:**

20 May 2022 11h00 CET: *Cecelia Pagliantini (Eindhoven University of Technology): Structure-preserving dynamical model order reduction of Hamiltonian systems*

In this talk we will consider reduced basis methods (RBM) for the model order reduction of parametric Hamiltonian dynamical systems describing nondissipative phenomena. The development of RBM for Hamiltonian systems is challenged by two main factors: (i) failing to preserve the geometric structure encoding the physical properties of the dynamics, such as invariants of motion or symmetries, might lead to instabilities and unphysical behaviors of the resulting approximate solutions; (ii) the local low-rank nature of transport-dominated and nondissipative phenomena demands large reduced spaces to achieve sufficiently accurate approximations. We will discuss how to address these aspects via a structure-preserving nonlinear reduced basis approach based on dynamical low-rank approximation. The gist of the proposed method is to evolve low-dimensional surrogate models on a phase space that adapts in time while being endowed with the geometric structure of the full model. If time permits, we will also discuss a rank-adaptive extension of the proposed method where the dimension of the reduced space can change during the time evolution.

28 Apr. 2022 14h00 CET: *Nazanin Abedini (Vrije Universiteit Amsterdam): Convergence properties of a data-assimilation method based on a Gauss-Newton iteration*

Data assimilation is broadly used in many practical situations, such as weather forecasting, oceanography and subsurface modelling. There are some challenges in studying these physical systems. For example, their state cannot be directly and accurately observed, or the underlying time-dependent system is chaotic, which means that small changes in initial conditions can lead to large changes in prediction accuracy. The aim of data assimilation is to correct errors in the state estimation by incorporating information from measurements into the mathematical model. Variational methods are a widely used class of data-assimilation methods. They aim at finding an optimal initial condition of the dynamical model such that the distance to observations is minimized (under the constraint that the estimate is a solution of the dynamical system). The problem is formulated as the minimization of a nonlinear least-squares problem with respect to the initial condition, and it is usually solved using a Gauss-Newton method. We propose a variational data-assimilation method that also minimizes a nonlinear least-squares problem, but with respect to a trajectory over a time window at once. The goal is to obtain a more accurate estimate. We prove convergence of the method in the case of noise-free observations and provide an error bound in the case of noisy observations. We confirm our theoretical results with numerical experiments using Lorenz models.
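As an illustration of the standard formulation mentioned in the abstract (not the trajectory-based method proposed in the talk), the following is a minimal sketch of a Gauss-Newton iteration that estimates the initial condition of the Lorenz-63 model from noise-free observations. The choice of Lorenz-63, the RK4 integrator, the finite-difference Jacobian, the window length, and all function names are assumptions made for this example.

```python
import numpy as np

def lorenz63(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 system (standard parameter values)."""
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])

def rk4_step(x, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lorenz63(x)
    k2 = lorenz63(x + 0.5 * dt * k1)
    k3 = lorenz63(x + 0.5 * dt * k2)
    k4 = lorenz63(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def trajectory(x0, n_steps, dt):
    """Stack the states at steps 1..n_steps into one long vector."""
    x = np.asarray(x0, dtype=float)
    states = []
    for _ in range(n_steps):
        x = rk4_step(x, dt)
        states.append(x)
    return np.concatenate(states)

def gauss_newton_initial_condition(y_obs, x_guess, n_steps, dt, n_iter=10, eps=1e-6):
    """Minimize ||trajectory(x0) - y_obs||^2 over x0 with Gauss-Newton."""
    x = np.array(x_guess, dtype=float)
    for _ in range(n_iter):
        base = trajectory(x, n_steps, dt)
        residual = base - y_obs
        # Forward-difference approximation of the Jacobian d(trajectory)/d(x0).
        jac = np.empty((residual.size, x.size))
        for j in range(x.size):
            dx = np.zeros_like(x)
            dx[j] = eps
            jac[:, j] = (trajectory(x + dx, n_steps, dt) - base) / eps
        # Gauss-Newton step: least-squares solution of jac @ step = -residual.
        step, *_ = np.linalg.lstsq(jac, -residual, rcond=None)
        x = x + step
    return x

# Synthetic, noise-free observations of the full state over a short window.
x_true = np.array([1.0, 1.0, 1.0])
n_steps, dt = 10, 0.01
y_obs = trajectory(x_true, n_steps, dt)

x_est = gauss_newton_initial_condition(
    y_obs, x_true + np.array([0.1, -0.1, 0.1]), n_steps, dt
)
```

The short window and the small initial perturbation are deliberate: for a chaotic system, a plain Gauss-Newton iteration without line search or regularization can only be expected to converge when the initial guess is close to the truth.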

15 June 2022: *Cristóbal Bertoglio (Bernoulli Institute - U Groningen): Sequential data assimilation in blood flow models*

**Previous talks:**

24 Feb. 2022 15h00 CET: *Laura Scarabosio (Radboud University): Deep neural network surrogates for transmission problems with geometric uncertainties*

We consider the point evaluation of the solution to interface problems with geometric uncertainties, where the uncertainty in the obstacle is described by a high-dimensional parameter, as a prototypical example of non-smooth dependence of a quantity of interest on the parameter. We focus in particular on an elliptic interface problem and a Helmholtz transmission problem. The non-smooth parameter dependence poses a challenge when one is interested in building surrogates. In this talk we propose to use deep neural networks for this purpose. We provide a theoretical justification for why we expect neural networks to provide good surrogates. Furthermore, we present numerical experiments showing their good performance in practice. We observe in particular that neural networks do not suffer from the curse of dimensionality, and we study the dependence of the error on the number of point evaluations (which coincides with the number of discontinuities in the parameter space), as well as on several modeling parameters, such as the contrast between the two materials and, for the Helmholtz transmission problem, the wavenumber.

*22 Jul. 2021 15h00 CET: Ilias Bilionis (School of Mechanical Engineering, Purdue University): Situational awareness in extraterrestrial habitats: Open challenges, potential applications of physics-informed neural networks, and limitations*

I will start with an overview of the research activities carried out by the Predictive Science Laboratory (PSL) at Purdue. In particular, I will use our work at the Resilient Extra-Terrestrial Habitats Institute (NASA) to motivate the need for physics-informed neural networks (PINNs) for high-dimensional uncertainty quantification (UQ), automated discovery of physical laws, and complex planning. The current state of these three problems ranges from manageable to challenging to open, respectively. The rest of the talk will focus on PINNs for high-dimensional UQ and, in particular, on stochastic PDEs. I will argue that for such problems, the squared integrated residual is not always the right choice. Using a stochastic elliptic PDE, I will derive a suitable variational loss function by extending the Dirichlet principle. This loss function exhibits (in the appropriate Hilbert space) a unique minimum that provably solves the desired stochastic PDE. Then, I will show how one can parameterize the solution using DNNs and construct a stochastic gradient descent algorithm that converges. Subsequently, I will present numerical evidence illustrating the benefits of this approach over the squared integrated residual, and I will highlight its capabilities and limitations, including some of the remaining open problems.

*23 Jun. 2021 10h00 CET: Christian Franzke (IBS Center for Climate Physics, Pusan National University in South Korea): Causality Detection and Multi-Scale Decomposition of the Climate System using Machine Learning*

Detecting causal relationships and physically meaningful patterns from the complex climate system is an important but challenging problem. In my presentation I will show recent progress for both problems using Machine Learning approaches. First, I will show that Reservoir Computing is able to systematically identify causal relationships between variables. I will show evidence that Reservoir Computing is able to systematically identify the causal direction, coupling delay, and causal chain relations from time series. Reservoir Computing Causality has three advantages: (i) robustness to noisy time series; (ii) computational efficiency; and (iii) seamless causal inference from high-dimensional data. Second, I will demonstrate that Multi-Resolution Dynamic Mode Decomposition can systematically identify physically meaningful patterns in high-dimensional climate data. In particular, Multi-Resolution Dynamic Mode Decomposition is able to extract the changing annual cycle.

*17 June 2021 15h00 CET: Bruno Sudret (ETH Zürich, chair Risk Safety and Uncertainty Quantification): Surrogate modelling approaches for stochastic simulators*

Computational models, a.k.a. simulators, are used in all fields of engineering and applied sciences to help design and assess complex systems in silico. Advanced analyses such as optimization or uncertainty quantification, which require repeated runs with varying input parameters, cannot be carried out with brute-force methods such as Monte Carlo simulation due to computational costs. This has motivated the recent development of surrogate models such as polynomial chaos expansions and Gaussian processes, among others. For so-called stochastic simulators used e.g. in epidemiology, mathematical finance or wind turbine design, an intrinsic source of stochasticity exists on top of well-identified system parameters. As a consequence, for a given vector of inputs, repeated runs of the simulator (called replications) will provide different results, as opposed to the case of deterministic simulators. Consequently, for each single input, the response is a random variable to be characterized.
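The notion of replications can be made concrete with a small sketch. The simulator below is entirely hypothetical (a toy function with input-dependent noise, not one of the applications named in the abstract); it only illustrates that, for a fixed input, the response is a random variable whose distribution can be estimated from repeated runs.

```python
import numpy as np

def toy_stochastic_simulator(x, rng):
    """Hypothetical simulator: mean response x**2, noise std 1 + x."""
    return x**2 + (1.0 + x) * rng.standard_normal()

def replicate(simulator, x, n_replications, seed=0):
    """Run the simulator repeatedly at the same input x."""
    rng = np.random.default_rng(seed)
    return np.array([simulator(x, rng) for _ in range(n_replications)])

# Same input, many runs: each replication returns a different value.
samples = replicate(toy_stochastic_simulator, x=2.0, n_replications=20000)
response_mean = samples.mean()       # close to 2.0**2 = 4
response_std = samples.std(ddof=1)   # close to 1 + 2.0 = 3
```

Characterizing the response this way at every input of interest is the brute-force approach; its cost is what makes surrogates of the whole response distribution attractive.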

In this talk we present an overview of the literature devoted to building surrogate models of such simulators, which we call stochastic emulators. Then we focus on a recent approach based on generalized lambda distributions and polynomial chaos expansions. The approach can be used with or without replications, which brings efficiency and versatility. As an outlook, practical applications to sensitivity analysis will also be presented.

Acknowledgments: This work is carried out together with Xujia Zhu, a PhD student supported by the Swiss National Science Foundation under Grant Number #175524 “SurrogAte Modelling for stOchastic Simulators (SAMOS)”.

*10 Jun. 2021 16h00 CET: Hannah Christensen (Oxford): Machine Learning for Stochastic Parametrisation*

Atmospheric models used for weather and climate prediction are traditionally formulated in a deterministic manner. In other words, given a particular state of the resolved scale variables, the most likely forcing from the sub-grid scale motion is estimated and used to predict the evolution of the large-scale flow. However, the lack of scale-separation in the atmosphere means that this approach is a large source of error in forecasts. Over the last decade an alternative paradigm has developed: the use of stochastic techniques to characterise uncertainty in small-scale processes. These techniques are now widely used across weather, seasonal forecasting, and climate timescales.

While there has been significant progress in emulating parametrisation schemes using machine learning, the focus has been entirely on deterministic parametrisations. In this presentation I will discuss data driven approaches for stochastic parametrisation. I will describe experiments which develop a stochastic parametrisation using the generative adversarial network (GAN) machine learning framework for a simple atmospheric model. I will conclude by discussing the potential for this approach in complex weather and climate prediction models.

*21 May 2021 16h00: John Harlim (Penn State): Machine learning of missing dynamical systems*

In the talk, I will discuss a general closure framework to compensate for the model error arising from missing dynamical systems. The proposed framework reformulates the model error problem into a supervised learning task to estimate a very high-dimensional closure model, deduced from the Mori-Zwanzig representation of a projected dynamical system with projection operator chosen based on Takens embedding theory. Besides theoretical convergence, this connection provides a systematic framework for closure modeling using available machine learning algorithms. I will demonstrate numerical results using a kernel-based linear estimator as well as neural network-based nonlinear estimators. If time permits, I will also discuss error bounds and mathematical conditions that allow for the estimated model to reproduce the underlying stationary statistics, such as one-point statistical moments and auto-correlation functions, in the context of learning Ito diffusions.

*29 Apr. 2021 16h15: Nathaniel Trask (Sandia): Structure preserving deep learning architectures for convergent and stable data-driven modeling*

The unique approximation properties of deep architectures have attracted attention in recent years as a foundation for data-driven modeling in scientific machine learning (SciML) applications. The "black-box" nature of DNNs, however, means that they require large amounts of data and generalize poorly in traditional engineering settings where the available data is relatively limited, and it is generally difficult to provide a priori guarantees about the accuracy and stability of extracted models. We adopt the perspective that tools from mimetic discretization of PDEs may be adapted to SciML settings, developing architectures and fast optimizers tailored to the specific needs of SciML. In particular, we focus on: realizing convergence competitive with FEM, preserving topological structure fundamental to conservation and multiphysics, and providing stability guarantees. In this talk we introduce some motivating applications at Sandia spanning shock magnetohydrodynamics and semiconductor physics before providing an overview of the mathematics underpinning these efforts.