CWI develops better techniques for ambiguity detection

Publication date: 08-12-2011


Programming languages have grammars, just like ordinary languages, in which ambiguities can occur that make a computer program - written in that language - do something that you do not want, causing mistakes. This type of error is very difficult to find and very expensive to fix. Bas Basten, PhD student from the Centrum Wiskunde & Informatica (CWI) in Amsterdam, designed new tools and techniques to find such double meanings. On 15 December he defends his PhD thesis ‘Ambiguity Detection for Programming Language Grammars’ at the University of Amsterdam. With the new CWI techniques grammars for programming languages can be checked faster for ambiguities – sometimes even thousands of times faster. Results from this study are useful for software engineers, for example in the design of domain specific languages (DSLs).

 

PEMDAS, or Mr. Van Dalen
In the past, children in the Netherlands learned the (now obsolete) mnemonic ‘Mr. Van Dalen waits for an answer’ to remember the calculation rule, which is in the USA commonly known as PEMDAS: Parentheses come before Exponents, which precede Multiplication and Division, who in turn have priority over Addition and Subtraction. The sum 1+1*3 equals, according to the calculation rule, 4 but for someone who does not know these rules the answer could also be (1+1)*3=6. If it is not clear for a computer which operation it should handle first, confusion may occur between programmer and computer, causing errors in the software. Therefore it is important that all ambiguities are resolved before a language is put to use. Since an infinite amount of combinations of operators such as + and * exist, it frequently happens that authors of new programming languages overlook an ambiguous combination in the design.

In his research Bas Basten first examined existing methods to detect ambiguities in grammars of programming languages. Then he developed his own, improved techniques. He designed a method and corresponding tool ‘AmbiDexter’, which first filters out parts of grammars that cannot induce ambiguities. This allows the remainder to be searched faster for double meanings. When an ambiguity is found, it is often very difficult to see what exactly is wrong. Therefore Basten and his colleagues also developed the expert system ‘Dr. Ambiguity’ to analyse the cause of the double meaning. Both tools are integrated in Rascal – the meta-programming language of CWI. They are freely available for users (open source). In a case study, several ambiguities were found in grammars of programming languages. AmbiDexter is one of the first useful detection systems for ambiguities in software, and at the moment it is the fastest. Benchmark studies have shown that the filtering method can be up to thousands of times faster than existing methods. This research is a good example of ‘Software as a Service’ – an important research area for CWI.  

 
More information: http://homepages.cwi.nl/~basten/ambiguity/ and http://www.rascal-mpl.org/ 

Supervisor: Prof. Dr. Paul Klint (CWI and UvA), co-supervisor: Dr Jurgen J. Vinju (CWI).

PhD defence: 15 December 2011, Agnietenkapel, Oudezijds Voorburgwal 231, Amsterdam, 14:00h.

Picture: Ambigram, by Bas Basten (CWI).