LSH Seminar Mick van Dijk

Classifying C. albicans species by using mixed integer optimization based optimal classification trees

Worldwide medical use of azole antifungals has led to an enormous increase in azole-resistant isolates of C. albicans, the species most commonly associated with fungal infections. One possible cause of resistance is point mutations in the ERG11 gene. To provide patient-specific medication, it would be beneficial to classify fungal isolates in silico as resistant or susceptible and to identify the locations of the responsible point mutations. The aim of this thesis is to apply and compare several classification algorithms, in particular decision tree algorithms. Bertsimas and Dunn recently introduced a novel formulation based on Mixed Integer Optimization to generate optimal classification trees. We have implemented this method and applied it to the C. albicans data set to construct univariate and multivariate classification trees. Moreover, by adding extra constraints and variables to the original formulation, we are able to model extensions such as minimizing false negative misclassifications and non-binary classification trees.
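
To make the idea of penalizing false negatives concrete, the following minimal Python sketch picks the best single-split (depth-1, univariate) tree by exhaustive search over a weighted misclassification cost. It is not the Mixed Integer Optimization formulation used in the thesis, only a toy stand-in for its objective on a tree small enough to enumerate. The function names, the fn_weight parameter, and the tiny mutation matrix are all hypothetical and are not taken from the thesis or its data set.

```python
from itertools import product


def weighted_errors(y_true, y_pred, fn_weight):
    """Sum misclassification costs; label 1 = resistant, 0 = susceptible."""
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:
            cost += fn_weight  # false negative: resistant isolate missed
        elif t == 0 and p == 1:
            cost += 1.0        # false positive: unit cost
    return cost


def best_stump(X, y, fn_weight=1.0):
    """Enumerate every depth-1 tree on binary features and keep the cheapest.

    X: list of rows of 0/1 flags (hypothetically, mutation present at a site).
    Returns (cost, feature_index, label_if_absent, label_if_present).
    Ties are broken in favor of the first candidate found.
    """
    best = None
    n_features = len(X[0])
    labelings = product((0, 1), repeat=2)
    for j, (l0, l1) in product(range(n_features), labelings):
        pred = [l1 if row[j] else l0 for row in X]
        cost = weighted_errors(y, pred, fn_weight)
        if best is None or cost < best[0]:
            best = (cost, j, l0, l1)
    return best


# Hypothetical toy data: 6 isolates, 2 candidate mutation sites.
X = [[1, 0], [1, 0], [0, 1], [0, 0], [0, 1], [1, 1]]
y = [1, 1, 0, 0, 1, 0]

print(best_stump(X, y, fn_weight=1.0))  # equal costs for both error types
print(best_stump(X, y, fn_weight=5.0))  # missed resistance is 5x as costly
```

Raising fn_weight shifts the chosen tree toward predicting "resistant" more often, which is the trade-off the false-negative extension is meant to control. An exact formulation is needed for deeper trees, where enumeration of this kind quickly becomes infeasible.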