Life Sciences and Health Seminar Marco Virgolin, Esteban Gabory

Counterfactual explanations: A bird-eye view; On Strings Having the Same Length-k Substrings

When
22 Feb 2022 from 4 p.m. to 22 Feb 2022 5 p.m. CET (GMT+0100)
Where
online
Web
Add

Join Zoom Meeting
https://cwi-nl.zoom.us/j/81535249101?pwd=ZDh3Y1kyZmxBUml1Tm00Rkx0NXBwdz09

Meeting ID: 815 3524 9101
Passcode: 967201

Title: Counterfactual explanations: A bird-eye view
Speaker: Marco Virgolin
Abstract: Counterfactual explanations (CEs) are a powerful means for understanding how decisions made by machine learning models can be changed. In particular, a CE prescribes how a user (a point in feature space x with coordinates age_x=32, gender_x=Male, salary_x=...) needs to intervene (x->x') so that the classification made by a model (f(x)="no loan") changes, typically in favor of the user (f(x')="yes loan"). Obtaining "useful" CEs is a search problem that depends on many aspects. To begin with, we must define a distance d such that x'=argmin d(x,x') will reasonably translate to minimal intervention effort in real life. Next, multiple constraints and desiderata must be considered for practical adoption, such as plausibility of intervention (age_x' - age_x > 0), causal relationships (increment in salary->increment in savings), and robustness to perturbations of different nature (e.g., to measurements in x or updates to f). Finally, availability of access to and information on the model is important: f may or may not be re-trained (e.g., to smoothen its decision boundary), expose gradients or provide only labels, be queried an unlimited or limited number of times, and so on. This multitude of important aspects contributes to establishing important links to many research areas, such as optimization (multi-objective, gradient-based or gradient-free, expensive), machine learning, and also areas concerning ethics and fairness.

Title: On Strings Having the Same Length-k Substrings
Speaker: Esteban Gabory
Abstract: Let Substr_k(X) denote the set of length-k substrings of a given string X for a given integer k>0. We study the following basic string problem, called Shortest equivalent string: Given a set S_k of n length-k strings and an integer z>0, find a shortest string T such that Substr_k(T)=S_k. The Shortest equivalent string problem arises naturally as an encoding problem in many real-world applications; e.g., in data privacy, in data compression, and in bioinformatics. We answer this problem by showing that it is equivalent to finding a shortest walk that crosses every edge of a graph, and we give an algorithm for the latter problem.