Evolutionary Intelligence Seminar - Tobias Moxter

Semantic Representations in Genetic Programming for Symbolic Regression: Explicit and Model-based Perspectives on Locality

When
8 Aug 2023, 4 p.m. to 5 p.m. CEST (GMT+0200)
Where
CWI, L016

Machine learning is rapidly changing the world in exciting and formidable ways. Its unprecedented ability to capture complex, non-linear patterns has made it a powerful tool, enabling profound scientific and technological breakthroughs. Yet, against the backdrop of numerous beneficial instances, we are observing many erroneous and, unfortunately, harmful applications. Perhaps the most substantial driver for the repeated occurrence of "AI failures" is our limited ability to understand the rules discovered by machine learning algorithms, particularly by neural networks and deep learning. This thesis contributes to genetic programming (GP), a meta-heuristic addressing the need for explanations by evolving human-readable instructions to form accurate yet comprehensible programs. Since GP generates programs using the framework of evolutionary algorithms, it relies on intelligent, cost-effective variation operators. A research direction yielding promising results focuses increasingly on leveraging semantics, i.e., how programs behave, rather than solely on syntax, i.e., how programs are written. Locality, loosely understood as the correspondence of neighborhood structure between syntax and semantics, is widely expected to bear significant benefits for GP. This thesis contributes to the research into GP for symbolic regression (GPSR) by introducing an explicit semantic program representation along with an encode-perturb-decode scheme capable of producing semantically similar offspring. Novel approaches to search space analysis are applied to GPSR to investigate the role of locality and its relation to landscape modality in semantic representations. Results suggest that higher locality by itself may be an insufficient indicator of performance in GPSR. Finally, deep learning is leveraged to study the properties and potential of learned semantic representations with modern model architectures.
Experiments on known, synthetic benchmark problems demonstrate notable improvements of the explicit encoding over classic variation operators, while positive characteristics of the learned representation justify further research into model-based semantic algorithms for GPSR.