# Seminar for machine learning and UQ in scientific computing

This seminar focuses on the application of Machine Learning (ML) and Uncertainty Quantification (UQ) in scientific computing. Topics of interest include, among others:

- combination of data-driven models and (multiscale) simulations,
- new ML architectures suited for scientific computing or UQ,
- incorporation of (physical) constraints into data-driven models,
- efficient (online) learning strategies,
- using ML for dimension reduction / creating surrogates,
- inverse problems using ML surrogates,

and any other topic in which some form of ML and/or UQ is used to enhance (existing) scientific computing methodologies. All applications are welcome, be it financial, physical, biological or otherwise.

For more information, please contact Wouter Edeling of the SC group.

**Upcoming talks:**

*The seminar is currently being set up and will commence shortly.*

**Previous UQ seminar talks:**

This seminar is a continuation of the UQ seminar. Presentations are available internally at https://oc.cwi.nl.

2020

*12 Mar. 2020: Martin Janssens (TU Delft): Machine learning unresolved turbulence in a variational multiscale model*

Models for the influence of unresolved processes on resolved scales remain a prominent error source in numerical simulations of engineering and atmospheric flows. In recent years, improvements in the capacity of machine learning algorithms and the increasing availability of high-fidelity datasets have identified data-driven unresolved scales models in general and Artificial Neural Networks (ANNs) in particular as high-potential options to break the deadlock. Yet, early contributions in this field rely on inconsistent multiscale model formulations and are plagued by numerical instability. To sketch a clearer picture of the sources of the accuracy and instability of ANN unresolved scales models, we have developed a framework in which no assumptions on the model form are made. We use ANNs to infer exact projections of the unresolved scales processes on the resolved degrees of freedom. Such "interaction terms" naturally arise from Variational Multiscale Model (VMM) formulations. Our VMM-ANN framework limits error to the data-driven interaction term approximations, offering explicit insight into their functioning. We assess our model in the context of a one-dimensional, forced Burgers’ problem, for a range of simple and realistic forcings. Simple, feedforward ANNs with local input, trained on error-free data prior to their insertion in forward simulations (offline), strongly improve the prediction of the interaction terms of our problem compared to traditional, algebraic VMM closures in offline settings at various levels of discretisation; they also generalise well to uncorrelated instances of our forcing. However, this performance does not translate to simulations of forward problems. The model suffers from instability due to i) ill-posed nonlinear solution procedures and ii) self-inflicted error accumulation. These correspond to two dimensions of forward simulations that are not accounted for by offline training on error-free data. We show that introducing noise to the training data can partly remedy these problems in our simple setting. Yet, we conclude that appreciable challenges remain in order to capitalise on the promise offered by ANNs to improve the unresolved scales modelling of turbulence.

*3 Feb. 2020: Akil Narayan (Uni Utah, US): Low-rank algorithms for multi-level and multi-fidelity simulations in uncertainty quantification*

High-fidelity computational software is often a trusted simulation-based tool for physics-based prediction of real-world phenomena. The price such high-fidelity software pays for delivering an accurate prediction is substantial computational expense. In such situations, performing uncertainty quantification on these simulations, which often requires several queries of the software, is infeasible. Alternatively, low-fidelity simulations are often orders of magnitude less expensive, but yield inaccurate physical predictions. In more complicated cases, one is faced with multiple models with multiple fidelity levels, and must make a decision about where to allocate computational resources. In this talk we explore low-rank strategies for addressing such multi-fidelity problems. While low-fidelity models are of dubious predictive value, they can be of substantial value in terms of quantifying uncertainty through low-rank methods. We explore the practical and mathematical aspects of these approaches, and demonstrate their effectiveness on problems in molecular dynamics, linear elasticity, and topology optimization.

2019

*5 Dec. 2019: Yous van Halder (CWI): Multi-level surrogate model construction with convolutional neural networks*

In this talk, we will present a novel approach to use convolutional neural networks to improve multi-level Monte Carlo estimators. Next to the classic idea of variance decay upon grid refinement, we employ the idea that the PDE error behavior is similar between consecutive levels when the grid is fine enough. Based on these ideas, we design the following neural network: a convolutional neural network that extracts local error features as latent quantities, a fully connected network to map local errors to global errors, subsequently extended with transfer learning to efficiently learn the error behavior on finer grids. We show promising results on nonlinear PDEs, including the incompressible Navier-Stokes equations.

*21 Nov. 2019: Philippe Blondeel (KU Leuven): p-refined Multilevel Quasi-Monte Carlo for Galerkin Finite Element Methods with applications in Civil Engineering*

Practical civil engineering problems are often characterized by uncertainty in their material parameters. Discretization of the underlying equations is typically done by means of the Galerkin Finite Element method. The uncertain material parameter can then be expressed as a random field that can be represented by, for example, a Karhunen–Loève expansion. Computation of the stochastic response is very costly, even if state-of-the-art Multilevel Monte Carlo (MLMC) is used. A significant cost reduction can be achieved by using p-refined Multilevel Quasi-Monte Carlo (p-MLQMC). The method is based on the idea of variance reduction by employing a hierarchical discretization of the problem based on a p-refinement scheme. This novel method is then combined with a rank-1 lattice rule yielding faster convergence compared to the method based on random Monte Carlo points. This method is first benchmarked on an academic beam bending problem. Finally, we use our algorithm for the assessment of the stability of slopes, a problem that arises in geotechnical engineering.

*7 Nov. 2019: Georgios Pilikos (CWI): Bayesian Machine Learning for Seismic Compressive Sensing*

We will introduce Bayesian machine learning for efficient seismic surveys. This will include data-driven models that learn sparse representations (features) of the data and are built around the framework of Compressive Sensing. Furthermore, we will show that, using these models, it is possible to predict missing receivers' values and simultaneously quantify the uncertainty of these predictions.

*24 Oct. 2019: Anna Nikishova (UvA): Sensitivity analysis based dimension reduction of multiscale models*

Sensitivity analysis (SA) quantifies the effects of uncertainty in the model inputs on the model outputs. Whenever the variance is a representative measure of model uncertainty, the Sobol variance-based method is the preferred approach to identify the main sources of uncertainty. Additionally, SA can be applied to identify ineffectual input parameters in order to decrease the model dimensionality by fixing such parameters at their mean values.

In this talk, we demonstrate that in some cases SA of a single scale model provides information on the sensitivity of the final multiscale model output. This then can be employed to reduce the dimensionality of the multiscale model input. However, the sensitivity of a single scale model response does not always bound the sensitivity of the multiscale model output. Hence, an analysis of the function defining the relation between single scale components is required to understand whether single scale SA can be used to reduce the dimensionality of the overall multiscale model input space.
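As a rough illustration of the variance-based screening mentioned in this abstract, the sketch below estimates first-order Sobol indices with a standard Monte Carlo pick-freeze estimator. It is not the speaker's method: the toy model, the uniform input distribution on [-1, 1], the sample size, and the 0.01 threshold are all assumptions chosen for illustration.

```python
import numpy as np

def first_order_sobol(model, n_inputs, n_samples=100_000, seed=0):
    """Estimate first-order Sobol indices for independent inputs, uniform on [-1, 1]."""
    rng = np.random.default_rng(seed)
    # Two independent sample matrices, one row per sample.
    A = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
    B = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
    f_A, f_B = model(A), model(B)
    total_variance = np.var(np.concatenate([f_A, f_B]))
    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        # A with column i taken from B: shares only input i with B.
        AB_i = A.copy()
        AB_i[:, i] = B[:, i]
        indices[i] = np.mean(f_B * (model(AB_i) - f_A)) / total_variance
    return indices

def toy_model(x):
    # Hypothetical model: the third input has no influence on the output.
    return x[:, 0] + 0.5 * x[:, 1]

sobol = first_order_sobol(toy_model, n_inputs=3)
# Inputs with a negligible index are candidates for fixing at their mean value.
negligible = np.flatnonzero(sobol < 0.01)
```

For this additive toy model the exact first-order indices are 0.8, 0.2 and 0, so the third input would be the one to fix.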

*26 Sept. 2019: Nikolaj Mucke (CWI): Non-Intrusive Reduced Order Modeling of Nonlinear PDE-Constrained Optimization using Artificial Neural Networks*

Nonlinear PDE-constrained optimization problems are computationally very time-consuming to solve numerically. Hence, there is much to be gained from replacing the full order model with a reduced order surrogate model. Using conventional methods, such as proper orthogonal decomposition (POD), often yields such a reduced order model but doesn't necessarily cut down computation time for nonlinear problems due to the intrusive nature of the method. As an alternative, artificial neural networks, combined with POD, are here presented as a viable non-intrusive surrogate model that cuts down computation time significantly.

The talk will be divided into three parts: 1) a brief introduction to PDE-constrained optimization and a discussion about why such problems are computationally heavy to solve, 2) an introduction to POD in the context of PDE-constrained optimization, 3) a presentation of how neural networks can be utilized as a surrogate model.

*5 Sept. 2019: Kelbij Star (SCK•CEN / Ghent University), POD-Galerkin Reduced Order Model of the Boussinesq approximation for buoyancy-driven flows*

A parametric Reduced Order Model (ROM) for buoyancy-driven flows is presented for which the Full Order Model (FOM) is based on the finite volume approximation. To model the buoyancy, a Boussinesq approximation is applied. Therefore, there exists a two-way coupling between the incompressible Boussinesq equations and the energy equation. The ROM is obtained by performing a Galerkin projection of the governing equations onto a reduced basis space that has been constructed using a Proper Orthogonal Decomposition approach. The ROM is tested on a 2D differentially heated cavity, of which the wall temperatures are parametrized using a control function method. Furthermore, the issues and challenges of Reduced Order Modeling, like stability, non-linearity and boundary control, are discussed. Finally, attention will be paid to the training of the ROM, especially for application to Uncertainty Quantification.

*22 Aug. 2019: Hemaditya Malla (CWI - TU/e), Response-based quadrature rules for uncertainty quantification*

Forward propagation problems in uncertainty quantification using polynomial-based surrogates involve the numerical approximation of integrals using quadrature rules. Such numerical approximations require function evaluations that are often costly. Most quadrature rules in the literature are constructed so as to integrate polynomials exactly up to a given degree. The accuracy of such quadrature rules depends on the smoothness of the function being integrated. To integrate functions that lack smoothness, this approach requires a large number of function evaluations to reach a certain accuracy. In this talk, an algorithm is introduced that generates quadrature rules that integrate such functions with higher accuracy and require fewer function evaluations.

*25 Apr. 2019: Laurent van den Bos (CWI), Inferring predictions under uncertainty using quadrature rules*

A novel numerical integration framework for the purpose of uncertainty propagation is proposed. The framework is based on the calculation of quadrature rules, which are weighted averages of (costly) function evaluations, and is therefore very suitable for the calculation of moments and pseudo-spectral expansions. The key challenge considered in this work is that the distribution of the input parameters (that should be propagated) is not known explicitly and is correlated. This situation happens often if the distribution is inferred from measurements or if the distribution is obtained through a Bayesian framework, i.e. it is a posterior. Both scenarios are considered in this work. All quadrature rules have positive weights and are interpolatory. As a result, the efficiency of the proposed integration framework can be demonstrated mathematically using relatively straightforward concepts. The theory is validated using test functions and the problem of predicting the flow over an airfoil is considered to demonstrate the applicability of the framework to practical cases.
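To make "weighted averages of function evaluations" concrete, here is a minimal sketch that computes the mean and variance of a function of a standard normal input. It uses a plain Gauss-Hermite rule from NumPy, not the rule construction proposed in the talk (which targets distributions that are not known explicitly); the test function and the number of nodes are assumptions.

```python
import numpy as np

def gauss_hermite_moments(f, n_nodes=5):
    """Approximate the mean and variance of f(X), with X ~ N(0, 1)."""
    # Nodes and weights for the weight function exp(-x^2 / 2).
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    # Normalise so the (positive) weights sum to one, as for a probability measure.
    weights = weights / weights.sum()
    values = f(nodes)  # in practice these are the costly model evaluations
    mean = np.sum(weights * values)
    variance = np.sum(weights * (values - mean) ** 2)
    return mean, variance

# Hypothetical test function standing in for an expensive model.
mean, variance = gauss_hermite_moments(lambda x: x**2 + 1.0)
```

Both moments are obtained from the same five function evaluations, which is why such rules are attractive when each evaluation is a full simulation.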

*18 Apr. 2019: Tristan van Leeuwen (Utrecht), A Kernel Method for Seismic Full Waveform Inversion*

Full waveform inversion aims to estimate subsurface medium parameters from seismic measurements. It is typically cast as a PDE-constrained optimization problem which can be solved using the well-known reduced-space method. It has been shown to be advantageous to relax the constraints and solve a joint parameter-state estimation problem. The challenge here is to devise an algorithm that avoids having to store and update the full state. In this talk I discuss a formulation of this problem in a Reproducing Kernel Hilbert Space, where the Representer Theorem can be used to yield a finite-dimensional optimization problem whose dimension is dictated by the size of the data, which is typically much smaller than the size of the discretized state.

*11 Apr. 2019: Benjamin Sanderse (CWI), Stable model order reduction for incompressible flows through energy conservation*

The simulation of complex fluid flows is an ongoing challenge in the scientific community. The computational cost of Direct Numerical Simulation (DNS) or Large Eddy Simulation (LES) of turbulent flows quickly becomes prohibitive when one is interested in control, design, optimization and uncertainty quantification. For these purposes, simplified models are typically used, such as reduced order models, surrogate models, low-fidelity models, etc. In this work we study reduced order models (ROMs) that are obtained by projecting the Navier-Stokes equations onto a lower-dimensional space. Classically, this is performed by using a POD-Galerkin method, where the basis for the projection is built from a proper orthogonal decomposition of the snapshot matrix of a set of high-fidelity simulations. Ongoing issues of this approach are, amongst others, the stability of the ROM, handling turbulent flows, and conservation properties. We will address the stability of the ROM for the particular case of the incompressible Navier-Stokes equations. We propose to use an energy-conserving discretization of the Navier-Stokes equations as full-order model (FOM), which we project on a lower-dimensional space in such a way that the resulting ROM inherits the energy conservation property of the FOM, and consequently its nonlinear stability properties.
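The step of building a projection basis from a snapshot matrix can be sketched as follows. This is a generic POD via the singular value decomposition on synthetic data, not the energy-conserving ROM of the talk; the snapshot data, the function name `pod_basis`, and the 99.9% energy threshold are assumptions for illustration.

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """Return the leading POD modes of a (space x time) snapshot matrix.

    Modes are kept until the retained fraction of squared singular values
    reaches `energy`.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    n_modes = int(np.searchsorted(cumulative, energy) + 1)
    return U[:, :n_modes], s

# Synthetic snapshots built from two spatial structures (stand-in for simulation output).
x = np.linspace(0.0, 1.0, 200)
t = np.linspace(0.0, 1.0, 50)
snapshots = (np.outer(np.sin(np.pi * x), np.cos(2 * np.pi * t))
             + 0.1 * np.outer(np.sin(2 * np.pi * x), np.sin(4 * np.pi * t)))

basis, singular_values = pod_basis(snapshots)
# Project the snapshots onto the reduced space and lift them back.
reconstruction = basis @ (basis.T @ snapshots)
```

In a POD-Galerkin ROM, the same orthonormal `basis` would be used to project the governing equations rather than just the data.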

*28 Mar. 2019: Erik Quaeghebeur (TU Delft), Robust wind farm layout optimization using pseudo-gradients*

The layout of a wind farm has an important impact on its power production. Layouts that are robust against, for example, wind climate variability, can provide production guarantees that enable easier access to project funding. Because layout optimization is a computationally demanding task and robust optimization even more so, I developed a substantially more computationally efficient optimization approach using so-called pseudo-gradients. This approach still results in layouts comparable in quality to those typically generated. Its efficiency is an enabler of my robust optimization investigations. I will present the layout optimization problem, my pseudo-gradient based approach, and results of my ongoing investigations into robust layout optimization.

*7 Mar. 2019: Andreas van Barel (KU Leuven), MG/OPT and MLMC for robust optimization of PDEs*

We present an algorithm based on the MG/OPT framework (a multigrid optimization framework) to solve optimization problems constrained by PDEs with uncertain coefficients. The gradients and Hessians contain expected value operators, which are evaluated using a multilevel Monte Carlo (MLMC) method. Each of the MG/OPT levels then contains multiple underlying MLMC levels. The MG/OPT hierarchy allows the algorithm to exploit the structure inherent in the PDE, speeding up the convergence to the optimum (regardless of the problem being deterministic or stochastic). In contrast, the MLMC hierarchy exists to exploit structure present in the stochastic dimensions of the problem. One can observe a large reduction in the number of samples required on expensive levels, and therefore in computational time.

*21 Feb. 2019: Anne Eggels (CWI), Sensitivity analysis by Gaussian processes*

In UQ, Sobol indices are the best-known method for computing sensitivity indices. Recently, however, methods based on divergences have emerged. These divergence-based indices can be computed by Monte Carlo sampling of the input space to obtain output samples, which are then combined with kernel density estimation. In practice, the number of output samples available is limited, which leads to multiple problems. Therefore, we propose to use a Gaussian process in order to obtain predictions for more locations in the input space. An extra benefit of this is the availability of confidence intervals for the sensitivity indices computed in this way.

*7 Feb. 2019: Jurriaan Buist (CWI), Neural networks for closure models in multiphase flow*

Multiphase flows are described by the multiphase Navier-Stokes equations. Numerically solving these equations is computationally expensive, and performing many simulations for the purpose of design, optimization and UQ is often prohibitively expensive. A simplified model, the so-called two-fluid model, can be derived from a spatial averaging process. The averaging process introduces a closure problem, which is represented by unknown friction terms in the two-fluid model. Correctly modeling these friction terms is a long-standing problem in two-fluid model development. In this work we take a new approach and learn the closure terms in the two-fluid model from a set of unsteady high-fidelity simulations, which are performed with the open source code Gerris and form the training data for a neural network (NN). The NN provides a functional relation between the two-fluid model inputs and the closure terms, which are added as source terms in the two-fluid model. In our presentation we will show the benefits of this approach by comparing the original two-fluid model, the trained two-fluid model, and the high-fidelity simulations.

*24 Jan. 2019: Bart de Leeuw (CWI), Ensemble shadowing for imperfect models*

We discuss the use of shadowing data assimilation for models with structural error. We propose using a regularized shadowing method for this purpose. We then turn to the question of how to do error analysis and generate an ensemble based on shadowing.

*10 Jan. 2019: Nassim Razaaly (INRIA), An efficient reliability analysis tool, for the computation of low tail probabilities and extreme quantiles characterized by multiple failure regions*

Calculation of tail probabilities and small quantiles is of fundamental importance in several domains, such as risk assessment or optimization. One major challenge is their computation when they are characterized by multiple failure regions and rare events, say an occurrence probability smaller than 1e-7. Here, we focus on cases where the function of interest is the output of a computationally expensive code such as CFD or structural analysis. We propose a novel algorithm that builds an accurate Kriging metamodel and exploits it using Importance Sampling techniques in order to estimate the required statistics (either quantile or tail probability). It relies on a novel metamodel building strategy, which aims to refine the limit-state region in all the branches "equally", even in the case of multiple failure regions. Due to Kriging limitations, the method is suitable for low stochastic dimension (say less than 10). This refinement step is formulated in such a way that the computation of both small probabilities of failure and extreme quantiles is unified (Schobi 2016). Parallel strategies are proposed. Several numerical examples taken from Bect 2017 and Schobi 2016 are carried out (2D, 6D, 8D), where small failure probabilities (10^-6 - 10^-9) are efficiently and accurately evaluated with a low number of evaluations of the original performance function (fewer than 100). The corresponding inverse problems, which consist in evaluating the quantile associated with the *true* failure probability, are evaluated with the proposed framework, with the same order of performance function calls. The main novelty consists in adapting the method proposed by Schobi (2016) for the efficient evaluation of extreme quantiles.

2018

*20 Dec. 2018: Yous van Halder (CWI), Neural networks for multifidelity surrogate modelling*

In this talk we discuss how neural networks can be used together with multifidelity techniques to accelerate the parametric solution of time-dependent partial differential equations. Based on a small number of high-resolution (high-fidelity) simulations, a neural network is trained ('offline') that maps low-fidelity simulation outputs to achieve an accuracy similar to the high-fidelity simulations in 'online' mode. We show promising results for non-linear equations including the flow over a backward-facing step.

*11-12 Dec. 2018: Meeting of the CWI-INRIA Associate team COMMUNES (Computational Methods for Uncertainties in Fluids and Energy Systems)*

As part of this meeting, there will be 8 presentations on topics in UQ. All presentations are in CWI room L016.

*Tuesday Dec 11th*:

13:30-14:00 Francois Sanson (INRIA Bordeaux): '*Emulation of a discontinuous function, application to space object reentry risk estimation*'

14:00-14:30 Anne Eggels (CWI): '*Dependency and sensitivity indices*'

break

15:00-15:30 Svetlana Dubinkina (CWI): '*Uncertainty quantification for random energy systems*'

15:30-16:00 Pietro Congedo (INRIA Paris): '*A novel approach for constrained multi-objective optimization under uncertainty*'

*Wednesday Dec 12th:*

9:30-10:00 Nassim Razaaly (INRIA Bordeaux): '*Robust Optimization of a Supersonic ORC Turbine Cascade under a probabilistic constraint: a Quantile-based Approach*'

10:00-10:30 Olivier le Maître (LIMSI/CNRS, Paris): '*Some complexity reduction methods for the Bayesian inference of model parameters*'

break

11:00-11:30 Benjamin Sanderse / Yous van Halder (CWI): '*Physics-informed surrogate models for fluid flows*'

11:30-12:00 Daan Crommelin / Wouter Edeling (CWI): '*Data-driven stochastic closures for multiscale dynamical systems*'

*6 Dec. 2018: Anastasia Borovykh (CWI), Neural networks as Gaussian processes*

In this talk we will discuss when neural networks tend to behave like Gaussian processes. In particular, as the number of nodes tends to infinity, by an application of the central limit theorem the output of a neural network layer, under particular assumptions, is given by a Gaussian process. Both deep neural networks and convolutional neural networks can behave like Gaussian processes, and we will study when and how this behaviour can be obtained or avoided. Furthermore, we discuss how to obtain uncertainty estimates from the neural network output through the link with the Gaussian process posterior.

*11 Oct. 2018: Continuous-level Monte Carlo methods (CLMC)*

Article discussion: article can be found here.

*13 Sep. 2018: Wouter Edeling (CWI), Dynamic mode decomposition (DMD)*

Introduction to dynamic mode decomposition.

*31 May 2018: Laurent van den Bos (CWI), Generating nested and positive quadrature rules based on arbitrary sample sets*

*17 May 2018: Daan Crommelin (CWI), Data-driven stochastic closures for multiscale dynamical systems*

*3 May 2018: Prashant Kumar (CWI), MLMC for model-form uncertainties in RANS simulations*

*5 Apr. 2018: Enrico Camporeale (CWI), Accuracy-reliability cost function for empirical variance estimation*

*22 Mar. 2018: Learning about physical parameters: The importance of model discrepancy*

Article discussion: article can be found here.

*8 Mar. 2018: Yous van Halder (CWI), Residual based adaptive sampling for surrogate modelling*

*22 Feb. 2018: Yu Zhang (TU Delft), Efficient Global Optimization accelerated with low-fidelity analysis*

*8 Feb. 2018: Bart de Leeuw (CWI), Shadowing data assimilation for imperfect models*

*25 Jan. 2018: Sangeetika Ruchi (CWI), Bayesian inversion for high dimensional systems using data assimilation*

*11 Jan. 2018: Rakesh Sarma (TU Delft), Bayesian estimation with reduced order models for aeroelastic predictions*

2017

14 Dec. 2017: Wouter Edeling; PDE-constrained UQ for turbulence models

23 Nov. 2017: Anna Nikishova; Semi-intrusive uncertainty quantification for multiscale models

10 Oct. 2017: Julia Klinkert, Benjamin Sanderse; Polynomial chaos expansions and UQLab

21 Sep. 2017: Yous van Halder; Adaptive uncertainty quantification for high-dimensional and discontinuous models

29 Jun. 2017: Jouke de Baar; Multifidelity uncertainty quantification

15 Jun. 2017: Richard Dwight; Gaussian process regression

18 May 2017: Article discussion; Data-driven discovery of governing physical laws (Rudy, Brunton, Proctor, Kutz)

26 Apr. 2017: Krzysztof Bisewski; Introduction to rare event simulation

06 Apr. 2017: Anne Eggels; Sensitivity analysis for correlated variables

23 Feb. 2017: Laurent van den Bos; Bayesian calibration of model uncertainty by using adaptive surrogate methods

02 Feb. 2017: Yous van Halder; Local stochastic collocation and multi-element methods

12 Jan. 2017: Yous van Halder; Intrusive vs. non-intrusive UQ

2016

08 Dec. 2016: Sangeetika Ruchi; Kalman filtering

10 Nov. 2016: Enrico Camporeale; Adaptive selection of sampling points

27 Oct. 2016: Anne Eggels; UQ for correlated datasets

13 Oct. 2016: Bart de Leeuw; Data assimilation and shadowing

22 Sep. 2016: Joke Blom; Sensitivity analysis

01 Sep. 2016: Prashant Kumar; Multilevel Monte Carlo methods

30 Jun. 2016: Benjamin Sanderse, Laurent van den Bos; UQ review paper of Xiu

09 Jun. 2016: Benjamin Sanderse; Introduction