
Workshop on Modern Applications of Control Theory and Reinforcement Learning

Following our Spring School and workshop on Themes across Control and Reinforcement Learning, both part of the research semester programme on Control Theory and Reinforcement Learning, we are organizing a workshop on Modern Applications of Control Theory and Reinforcement Learning.

When
20 May 2025, 9:15 a.m. to 21 May 2025, 6 p.m. CEST (GMT+0200)
Where
Science Park 125, Turingzaal

Register here for the workshop

DEADLINE: 1 May 2025

On 20 and 21 May 2025, we are organizing the workshop on Modern Applications of Control Theory and Reinforcement Learning. You are welcome to bring along a poster. Please submit your abstract while registering.

Applications of control theory and reinforcement learning are increasingly diverse. This workshop aims to foster the transfer of control and RL methods to emerging domains, especially complex adaptive systems such as climate-socio-economic systems and neuroscience.

Speaker information

Diederik M. Roijers is a senior researcher at the AI lab of the Vrije Universiteit Brussel (VUB) and business developer at the Innovation Department of the Municipality of Amsterdam. His main aim is to create trustworthy AI that works for and with people through oversight and rapid user adaptation. He is a (founding) member of the PEER project, where he investigates human-AI collaboration systems. In Amsterdam, this is done in the context of an important use case: accessible route planning. At the VUB, he focuses on rapidly learning and adapting to user preferences.

His main research interests are urban AI, reinforcement learning, planning, multi-agent systems, and multi-objective decision making. See his tutorials or his book for an introduction to multi-objective models and methods for multi-agent systems, RL and planning, or his publication page for information about his latest research. His other interests are game theory, machine learning, robotics, e-tutoring systems, and education.

He obtained his PhD at the University of Amsterdam under the supervision of Shimon Whiteson and Frans Oliehoek; see his PhD thesis for further details. After his PhD, he worked on social robotics in the TERESA project at the Department of Computer Science of the University of Oxford; as an FWO Postdoctoral Fellow on Multi-objective Reinforcement Learning with Guarantees at the Vrije Universiteit Brussel; as assistant professor at the Vrije Universiteit Amsterdam; and as senior lecturer and researcher at the Institute of ICT and the Microsystems Technology Research Group, where he worked on efficient AI for microprocessing systems and sensor data.

Talk details

A Plea for User-Centred RL

Much RL research naturally focuses on the methods and algorithms themselves, as these are ultimately the core of our field. As a result, the whole picture of a system, including its design, development and deployment cycle with its advantages and disadvantages, is understandably left out of scope of the typical research paper. In this talk, I will go over what this might obfuscate, and what consequences this has for the evaluation and applicability of RL in practice, using practical examples. Subsequently, I will make a plea for starting at the end, i.e., how a system is used in practice, the requirements that entails, and the way we might more fairly report results.

Elena Rovenskaya is the IIASA Advancing Systems Analysis (ASA) Program Director. She is also a research scholar at the Optimal Control Department of the Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russia (on leave). Her scientific interests lie in the fields of optimization, decision science, and mathematical modeling of complex socio-environmental systems.

Dr. Rovenskaya graduated in 2003 from the Faculty of Physics, Lomonosov Moscow State University, Russia. She received her PhD in 2006 at the Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russia. In her PhD dissertation, Dr. Rovenskaya developed a new numerical method for solving a broad class of non-convex optimization problems.

Dr. Rovenskaya was appointed Advancing Systems Analysis (ASA) Program Director in January 2021. Currently, the ASA Program includes 140+ scientists and aims to identify, develop, and deploy new systems-analytical methods, tools, and data that address the most pressing global sustainability challenges with greater agility, and help find solutions to those challenges that are both realistic and appropriate.

Talk details

Optimal Control and the Stories We Tell About Climate Change Economics

The talk will present three stylized models based on optimal control theory, each addressing a distinct aspect of the economics of climate change. The first model explores optimal transition pathways from 'brown' to 'green' capital. The second investigates potential incentive structures to enhance the contribution of wealthier segments of the society to global climate efforts. Finally, the third model demonstrates how the optimal control approach using reachable sets can help tackle the complex challenge of measuring sustainability.

Herke van Hoof is associate professor at the University of Amsterdam. Before that, he was a postdoc at McGill University in Montreal, Canada, where he worked with Professors Joelle Pineau, Dave Meger, and Gregory Dudek. He obtained his PhD at TU Darmstadt, Germany, under the supervision of Professor Jan Peters, graduating in November 2016. He received his bachelor's and master's degrees in Artificial Intelligence from the University of Groningen in the Netherlands. His group works on various aspects of modular reinforcement learning. To address the low data efficiency of reinforcement learning from scratch, the group investigates topics such as using (symbolic) prior knowledge, modularity, and transferring knowledge between tasks.

Talk details

Reinforcement learning for real-world network infrastructure


Marcel van Gerven (Radboud University) is Professor of Artificial Cognitive Systems and Principal Investigator in the Department of Machine Learning and Neural Computing of the Donders Institute for Brain, Cognition and Behaviour. Prof. van Gerven is an expert in machine learning and neuromorphic computing. His work ranges from understanding the computational mechanisms of learning, inference and control in natural and artificial systems to the development of new AI technology with applications in e.g. neuroscience, neurotechnology, healthcare and smart industry. Prof. van Gerven is the recipient of several grants at the intersection of AI and neuroscience, such as Dutch Vidi, Crossover, Perspective and Gravitation grants as well as EU HBP and FET grants. He also received the Radboud Science Award for his scientific work. Prof. van Gerven is cofounder of Radboud AI and directs one of the European ELLIS units as part of the European Excellence Network in Machine Learning. He also contributes to the Healthy Data program, which aims to make AI accessible in healthcare, and is director of an Innovation Centre in AI for semiconductor manufacturing. Through his work, he aims to bridge the gap between natural and artificial intelligence and contribute to the development of sustainable AI solutions that make a positive impact in science, industry and society.

Talk details

Harnessing Noise for Neuromorphic Control

Traditional AI control methods predominantly depend on deep reinforcement learning, which necessitates extensive and energy-intensive compute clusters. This presentation introduces a novel approach capable of learning to control systems directly on edge devices. By leveraging stochastic differential equations and utilizing system noise to facilitate learning, our method circumvents several challenges associated with on-chip AI model training. I will delve into the theoretical foundations of this approach and showcase our efforts in adapting these solutions for edge device implementation.
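The abstract above describes learning driven by system noise rather than by backpropagated gradients. As a rough, hypothetical illustration of that general idea (a generic weight-perturbation sketch, not the speaker's actual SDE-based method; all names and parameters here are invented), the following minimises a loss using only forward evaluations and injected noise:

```python
import numpy as np

def noise_descent(loss, theta, sigma=0.1, lr=0.05, steps=2000, seed=0):
    """Minimise `loss` with forward evaluations only: correlate injected
    noise with the change in loss it causes (weight perturbation), so
    no backpropagated gradients are required."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        xi = rng.standard_normal(theta.shape)           # injected noise
        delta = loss(theta + sigma * xi) - loss(theta)  # its effect on the loss
        theta = theta - lr * (delta / sigma) * xi       # descend along the noise
    return theta

# Toy use: noise alone steers theta towards the minimum at (3, 3).
theta = noise_descent(lambda t: float(np.sum((t - 3.0) ** 2)), np.zeros(2))
```

Because the update needs only loss evaluations, a rule of this flavour is attractive for hardware where backpropagation is impractical, which is one motivation behind on-device learning.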

Marta Kwiatkowska is Professor of Computing Systems and Fellow of Trinity College, University of Oxford, and Associate Head of MPLS. Prior to this she was Professor in the School of Computer Science at the University of Birmingham, Lecturer at the University of Leicester and Assistant Professor at the Jagiellonian University in Cracow, Poland. She holds a BSc/MSc in Computer Science from the Jagiellonian University, MA from Oxford and a PhD from the University of Leicester. In 2014 she was awarded an honorary doctorate from KTH Royal Institute of Technology in Stockholm.

Marta Kwiatkowska spearheaded the development of probabilistic and quantitative methods in verification on the international scene and is currently working on safety and robustness for machine learning and AI. She led the development of the PRISM model checker, the leading software tool in the area, widely used for research and teaching and winner of the HVC 2016 Award. Applications of probabilistic model checking have spanned communication and security protocols, nanotechnology designs, power management, game theory, planning and systems biology, with genuine flaws found and corrected in real-world protocols. Kwiatkowska gave the Milner Lecture in 2012 in recognition of "excellent and original theoretical work which has a perceived significance for practical computing". She was the first female winner of the Royal Society Milner Award and Lecture (2018) and won the BCS Lovelace Medal in 2019. Marta Kwiatkowska was invited to give keynotes at the LICS 2003, ESEC/FSE 2007 and 2019, ETAPS/FASE 2011, ATVA 2013, ICALP 2016, CAV 2017, CONCUR 2019 and UbiComp 2019 conferences.

She is a Fellow of the Royal Society, Fellow of ACM, member of Academia Europaea, Fellow of EATCS, Fellow of the BCS and Fellow of the Polish Society of Arts & Sciences Abroad. She serves on the editorial boards of several journals, including Information and Computation, Formal Methods in System Design, Logical Methods in Computer Science, Science of Computer Programming and Royal Society Open Science. Kwiatkowska's research has been supported by grant funding from EPSRC, ERC, EU, DARPA and Microsoft Research Cambridge, including two prestigious ERC Advanced Grants, VERIWARE ("From software verification to everyware verification") and FUN2MODEL ("From FUNction-based TO MOdel-based automated probabilistic reasoning for DEep Learning"), and the EPSRC Programme Grant on Mobile Autonomy.

Talk details

Provable guarantees for data-driven policy synthesis: a formal verification perspective

Machine learning solutions are revolutionizing AI, but their instability against adversarial examples – small perturbations to inputs that can drastically change the output – raises concerns about the readiness of this technology for widespread deployment. Formal verification, and particularly probabilistic verification, have become indispensable components of rigorous engineering methodologies to ensure system safety and dependability. Using illustrative examples, this lecture will discuss the role that formal verification technology can play in motion planning by providing provable guarantees on safety and optimality of neural network policies.
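As a small illustration of the kind of provable guarantee the abstract refers to (a generic interval-bound-propagation sketch, not the tools or methods of the talk; the network weights are invented for the example), the following propagates an input box through a tiny ReLU network. If the certified output bounds stay on one side of a decision threshold, no perturbation inside the box can flip the decision:

```python
import numpy as np

def interval_forward(lo, hi, layers):
    """Propagate an input box [lo, hi] through affine + ReLU layers,
    returning sound (if conservative) bounds on all reachable outputs."""
    for i, (W, b) in enumerate(layers):
        Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)   # split weights by sign
        lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
        if i < len(layers) - 1:                        # ReLU on hidden layers
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi

# Tiny 2-2-1 network (hypothetical weights) and an input box of radius 0.1
# around the point (1, 0).
layers = [(np.array([[1.0, -1.0], [0.5, 1.0]]), np.zeros(2)),
          (np.array([[1.0, 1.0]]), np.array([0.1]))]
lo_out, hi_out = interval_forward(np.array([0.9, -0.1]),
                                  np.array([1.1, 0.1]), layers)
# lo_out > 0 certifies the output is positive for *every* input in the box.
```

Production verifiers use much tighter relaxations and probabilistic reasoning, but the principle is the same: a symbolic over-approximation of all reachable behaviours yields a guarantee that no amount of adversarial testing can.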

Prof.dr. Sander M. Bohté heads the CWI Machine Learning group, and is also a part-time full professor of Cognitive Computational Neuroscience at the University of Amsterdam, The Netherlands. He received his PhD in 2003 at CWI on the topic of “Spiking Neural Networks”. He was then awarded an NWO TALENT grant, which he spent with Michael Mozer at the University of Colorado in Boulder. In 2004, he rejoined CWI as junior permanent staff to work on distributed spiking neural network models and multi-agent systems. In 2016, he co-founded the CWI Machine Learning group, where his research bridges the field of neuroscience with applications in the form of advanced neural networks. His work has been pioneering in the development of advanced and efficient spiking neural networks, including seminal work on supervised learning with spike-time coded networks. Recent work has also developed biologically plausible deep learning and deep reinforcement learning models for cognition, and spiking neural network versions thereof.

Talk details

Scaling Biologically Plausible Deep Reinforcement Learning

Humans have a remarkable capacity for learning, yet neuronal learning is constrained to locality in time and space and limited feedback.

While neural learning rules have been designed that adhere to these principles and constraints, they exhibit difficulty in scaling to deep networks and complicated datasets. Here, we leverage insights from behavioural science by developing a curriculum that structures how samples are presented to a network to optimise learning. The key features of the curriculum involve progressively introducing new classes to the dataset based on performance metrics, and using a recency bias to protect recently acquired classes. We demonstrate that our curriculum approach makes feedback-based “BrainProp”-style learning robust and more rapid, while substantially improving classification accuracy. We also show that the curriculum similarly improves performance for networks trained using error-backpropagation. Our results show the potential of curriculum learning in local learning settings with limited feedback and further bridge the gap between biologically plausible learning rules and error-backpropagation.
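The two curriculum ingredients named above, performance-gated class introduction and a recency bias, can be sketched as a batch sampler. This is an illustrative toy (the function names, thresholds and weights are invented, not the authors' code):

```python
import random

def curriculum_batches(num_classes, data_by_class, accuracy_fn,
                       threshold=0.7, recency_boost=3.0,
                       batch_size=32, max_rounds=100, seed=0):
    """Yield (batch, active_classes): classes are introduced one at a
    time once accuracy on the active set passes `threshold`, and the
    most recently added class is oversampled (recency bias)."""
    rng = random.Random(seed)
    active = [0]                                  # start with a single class
    for _ in range(max_rounds):
        # Recency bias: weight the newest class more heavily when sampling.
        weights = [recency_boost if c == active[-1] else 1.0 for c in active]
        chosen = rng.choices(active, weights=weights, k=batch_size)
        yield [rng.choice(data_by_class[c]) for c in chosen], tuple(active)
        # Introduce the next class once performance is good enough.
        if accuracy_fn(active) >= threshold and len(active) < num_classes:
            active.append(len(active))
```

In a real training loop, `accuracy_fn` would evaluate the network on held-out samples of the active classes, so the curriculum advances only as fast as learning allows.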

Sander Keemink is fascinated by how neurons and networks (in brains and machines) encode information and perform computations. While it is now possible to train and build highly effective networks, this does not mean we understand them. This hampers our interpretation of what neural networks are really doing, which is undesirable for a technology in such increasingly widespread use. His research focus is therefore always to find the core underlying principles of network function (currently mainly in spiking networks, but also more generally).

Talk details

To spike or not to spike: using brain-like signals for control

Neurons communicate with downstream systems through sparse and short-lived electrical pulses, or spikes. Using these incredibly brief spikes, they must affect and control those downstream systems. With the ascent of neuromorphic devices, spiking signals are increasingly studied as a control signal, but advances have mostly focused on using AI techniques to train spiking networks; it remains unclear how spikes could be used in a more principled way. In this talk I take a different approach: how should spiking signals be used for control, given knowledge from both computational neuroscience and control theory? We will first look at how to translate the classic optimal control theory of linear-quadratic-Gaussian (LQG) control directly to spiking signals, and we will show that spiking networks can implement these algorithms efficiently and realistically. However, this approach relies on filtering the spiking signal to approximate an analogue control signal, which ultimately means the neurons have to output a continuous control signal (either through synapses or, on hardware, by filtering the spikes). We therefore next consider how downstream linear dynamical systems could be controlled solely by brief spiking events, as if a neuron can only give brief 'kicks'. Inspired by linear-quadratic control, we require a spike to occur only if it brings the controlled system closer to a target. From this principle, we derive the required connectivity for spiking networks, and show that they can successfully control linear systems. This work gives insight into how real neurons could control downstream systems such as other neurons and muscles, and has applications in neuromorphic hardware design for control tasks where the control output has to be brief and sparse.
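The spike-only-if-it-helps principle mentioned in the abstract can be illustrated on a one-dimensional system. The following is a deliberately minimal sketch (parameter values and the scalar setting are invented for illustration; the talk derives network connectivity for general linear systems):

```python
def spike_control(a, kick, x0, target, steps):
    """Greedy spike rule for a leaky scalar system x <- a*x + kick*spike:
    the neuron fires only if its fixed 'kick' brings the state closer
    to the target."""
    x, xs, spikes = x0, [x0], []
    for _ in range(steps):
        drift = a * x                                  # autonomous decay
        fire = abs(drift + kick - target) < abs(drift - target)
        x = drift + (kick if fire else 0.0)
        xs.append(x)
        spikes.append(fire)
    return xs, spikes

# The state is driven to, and then held near, the target by sparse kicks.
xs, spikes = spike_control(a=0.9, kick=0.3, x0=0.0, target=1.0, steps=100)
```

Note that the control signal is purely event-based: between spikes the system simply decays, yet the state hovers around the target, which is exactly the regime relevant for neurons and sparse neuromorphic outputs.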

Timm Faulwasser is Full Professor at TU Hamburg and leads the Institute of Control Systems. He studied Engineering Cybernetics at the University of Stuttgart, with majors in systems and control and philosophy. From 2008 until 2012 he was a member of the International Max Planck Research School for Analysis, Design and Optimization in Chemical and Biochemical Process Engineering in Magdeburg. In 2012 he obtained his PhD from the Department of Electrical Engineering and Information Engineering, Otto-von-Guericke-University Magdeburg, Germany. From 2013 to 2016 he was with the Laboratoire d’Automatique, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. From 2015 to 2019 he led the Optimization and Control Group at the Institute for Automation and Applied Informatics at Karlsruhe Institute of Technology (KIT). From 2019 to 2024 he held the professorship for energy efficiency in the Department of Electrical Engineering and Information Technology, TU Dortmund University, Germany. Since April 2024 he has led the Institute of Control Systems at Hamburg University of Technology, Germany. He currently serves as associate editor for the IEEE Transactions on Automatic Control and the IEEE Control Systems Letters, as well as Mathematics of Control, Signals, and Systems. Dr. Faulwasser received the 2021-2023 Automatica Paper Prize. His research interests include optimal and predictive control of nonlinear systems and networks.

Wolfram Barfuss is the Argelander (Tenure-Track) Assistant Professor of the Transdisciplinary Research Area (TRA) Sustainable Futures and based at the Center for Development Research (ZEF) at the University of Bonn, Germany. He obtained his doctoral degree in theoretical physics of complex systems from the Potsdam Institute for Climate Impact Research and the Humboldt University Berlin (2019). Before joining the University of Bonn in 2023, he was a postdoctoral scientist at the Tübingen AI Center at the University of Tübingen (2021-2023), the School of Mathematics at the University of Leeds (2020-2021), and the Max Planck Institute for Mathematics in the Sciences in Leipzig (2019-2020). His research centers around the question, "Are we smart enough for the good life?" To answer this question, the BarfussLab develops formal models of collective reinforcement learning dynamics to better understand how, in complex environments, individual decisions become collective action for a sustainable future.

Talk details

Collective Reinforcement Learning Dynamics for Sustainability Economics

Cooperation at scale is critical for achieving a sustainable future for humanity. However, achieving collective, cooperative behavior—in which intelligent actors in complex environments jointly improve their well-being—remains poorly understood. Complex systems science (CSS) provides a rich understanding of collective phenomena, the evolution of cooperation, and the institutions that can sustain both. Yet, much of the theory in this area fails to fully consider individual-level complexity and environmental context—mainly for the sake of tractability and because it has not been clear how to do so rigorously. These elements are well captured in multiagent reinforcement learning (MARL), which has recently focused on cooperative (artificial) intelligence. However, typical MARL simulations can be computationally expensive and challenging to interpret. In this presentation, I propose that bridging CSS and MARL offers new directions. By investigating the non-linear dynamics of collective reinforcement learning, we can better understand how, in complex environments, individual decisions become collective action for a sustainable future.
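A classic, minimal example of the deterministic learning dynamics the abstract alludes to is replicator dynamics in a two-player game, where cooperation either takes hold or collapses depending on the starting point. The payoffs and parameters below are invented for illustration and are far simpler than the collective-RL dynamics studied in the talk:

```python
import numpy as np

# Symmetric two-player Stag Hunt (hypothetical payoffs):
# action 0 = cooperate (stag), action 1 = defect (hare).
A = np.array([[4.0, 0.0],
              [3.0, 3.0]])

def replicator_step(x, y, lr=0.1):
    """One step of discrete-time replicator dynamics, a classic
    deterministic limit of simple reinforcement learning."""
    fx, fy = A @ y, A @ x            # expected payoff of each action
    x = x + lr * x * (fx - x @ fx)   # grow above-average actions
    y = y + lr * y * (fy - y @ fy)
    return x / x.sum(), y / y.sum()

def final_cooperation(p0, steps=500):
    """Run the dynamics from initial cooperation probability p0."""
    x = y = np.array([p0, 1.0 - p0])
    for _ in range(steps):
        x, y = replicator_step(x, y)
    return float(x[0])
```

With these payoffs the mixed equilibrium sits at a cooperation probability of 0.75, so starting above it the population learns full cooperation, while starting below it the dynamics collapse to defection: a tiny instance of how individual learning rules determine which collective outcome is reached.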

Tentative programme

Day 1: Tuesday 20 May

09:15 - 09:45 Registration and tea/coffee

09:45 - 10:00 Welcome

10:00 - 11:00 Marta Kwiatkowska, Provable guarantees for data-driven policy synthesis: a formal verification perspective

11:00 - 11:30 Break

11:30 - 12:10 Diederik Roijers, A Plea for User-Centred RL

12:10 - 12:50 Herke van Hoof, Reinforcement learning for real-world network infrastructure

12:50 - 14:00 Lunch and Poster session

14:00 - 15:00 Elena Rovenskaya, Optimal Control and the Stories We Tell About Climate Change Economics

15:00 - 15:30 Break

15:30 - 17:00 Panel Discussion with Workshop Speakers

17:00 - 18:30 Discussion over drinks and bites

Day 2: Wednesday 21 May

09:30 - 10:00 Registration and tea/coffee

10:00 - 11:00 Timm Faulwasser

11:00 - 11:30 Break

11:30 - 12:10 Marcel van Gerven, Harnessing Noise for Neuromorphic Control

12:10 - 13:30 Lunch and Poster session

13:30 - 14:30 Wolfram Barfuss, Collective Reinforcement Learning Dynamics for Sustainability Economics

14:30 - 15:00 Break

15:00 - 15:40 Sander Keemink, To spike or not to spike: using brain-like signals for control

15:40 - 16:20 Sander Bohté, Scaling Biologically Plausible Deep Reinforcement Learning

16:20 - 17:30 Discussion over drinks

Logistics

The conference will be held in the Turing room at the Congress Centre of Amsterdam Science Park, next to Centrum Wiskunde & Informatica (CWI).
Address: Science Park 125, 1098 XG Amsterdam.

Please be aware that hotel prices in Amsterdam can be quite steep. We strongly recommend that all participants secure their hotel reservations as early as possible!

Hotel Recommendations
* Generator Hostel
* MEININGER Hotel (Amsterdam Amstel)
* Hotel Casa
* The Manor Amsterdam
* The Lancaster Hotel Amsterdam

From these hotels, the venue can be reached in 15-30 minutes with public transport. In all public transportation, you can check in and out with a Mastercard or Visa contactless credit card and also with Apple Pay and Google Wallet.

Sharing a hotel room is a great way to reduce costs! If you are attending and are interested in room-sharing arrangements, please send an email to events@cwi.nl .
