Software Analysis and Transformation

SWAT studies software systems: their design, their construction, and their inevitable evolution. Our mission is to learn to understand software systems and to improve their quality. We focus on complexity as the primary quality attribute of software systems.

SWAT studies software systems: their design, construction and evolution. Our mission is to learn to understand software systems and to improve their quality. In particular, we study the causes of software complexity – a major cause of technology failure in society – and investigate how complex systems can be made simpler and more reliable. We analyze and visualize software systems, transforming them into better versions of themselves. We also generate new software with the goal of simplifying it through automation and abstraction. We keep our feet on the ground by working with corporate IT departments on streamlining their software systems and making them more reliable. Putting our ideas into practice is the best way to ensure they work.

Events


Logic, Rationality and Common Sense - Workshop on the occasion of Jan van Eijck's retirement
Jun 02, 2017, Turing room, CWI building, Science Park 123, Amsterdam

We cordially invite you to attend this workshop on the occasion of Jan van Eijck's retirement as researcher at CWI and as professor at ILLC. The topics are reflections on the possibilities and the limitations of applications of logic to the analysis of human behaviour. More information can be found on the website.

Key publications

Software Renovation - key publications

  • A. van Deursen and L. Moonen, Exploring Legacy Systems Using Types, 7th Working Conference on Reverse Engineering, IEEE Computer Society, 2000.
  • A. van Deursen and T. Kuipers, Building Documentation Generators, In Proceedings International Conference on Software Maintenance (ICSM'99), IEEE Computer Society, 1999, 40-49.
  • A. van Deursen and T. Kuipers, Identifying Objects using Cluster and Concept Analysis, In Proceedings 21st International Conference on Software Engineering (ICSE'99), ACM, 1999, pp. 246-255.

Domain-Specific Languages - key publications

  • Arie van Deursen, Paul Klint, Little Languages: Little Maintenance? In Journal of Software Maintenance, 10, 75-92, 1998.
  • A. van Deursen, P. Klint and J. Visser, Domain-Specific Languages: An Annotated Bibliography. In ACM SIGPLAN Notices, 35: 26-36.

Generic Language Technology

The Generic Language Technology project hosts the development of programming language technology that supports the research of SEN1 (SWAT):

Collaboration Partners SEN1 (SWAT)

AG5; ASML; Bell Labs, Alcatel-Lucent; Delft University of Technology; E

Key publications of SEN1: SWAT

Please find our key publications via the project pages of SEN1 (SWAT) that are linked from this page.

ATEAMS: Analysis and Transformation based on rEliAble tool coMpositionS

Special collaboration

ATEAMS is an INRIA project team, which means it is funded directly by INRIA Lille Nord Europe. The project covers most of the research interests of SEN1 (SWAT).

Description

Software is still very complex, and it seems to become more complex every year. Over the last decades, computer science has delivered various insights into how to organize software better. Via structured programming, modules, objects, components and agents, software systems are increasingly evolving into “systems of systems” that provide services to each other. Each system is large, uses incompatible (new, outdated or non-standard) technology and, above all, exhibits failures.

It is becoming more and more urgent to analyze the properties of these complicated, heterogeneous and very large software systems, and to refactor and transform them to make them simpler and keep them up-to-date. With the plethora of different languages and technology platforms, it is becoming very difficult and very expensive to construct tools to achieve this.

The main challenge of ATEAMS is to address both the need to combine different kinds of novel analysis and transformation tools and the diversity of programming environments. We do this by investing in a virtual laboratory called “Rascal”. It is a domain-specific programming language for source code analysis, transformation and generation. Rascal is parametric in the programming language, so that it can be used to analyze, transform or generate source code in any language. By combining concepts from both program analysis and transformation in this language, we can efficiently experiment with all kinds of tools and algorithms.

We now focus on three sub-problems. First, we study fact extraction: to extract information from existing software systems. This extracted information is vital to construct sound abstract models that can be used in further analysis (such as model checking or static analysis). Automated fact extraction is still expensive and error-prone.
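To make the idea of fact extraction concrete, here is a minimal sketch in Python. It is not the Rascal-based tooling described here. It uses Python's standard `ast` module as a stand-in, and the function name `extract_call_facts` and the sample program are hypothetical. It extracts a "calls" relation of (caller, callee) pairs that a later analysis could query.

```python
import ast

def extract_call_facts(source):
    """Return a set of (caller, callee) facts for calls made by simple name.

    Simplification: calls inside nested functions are also attributed to the
    enclosing function, and method calls (obj.f()) are ignored.
    """
    tree = ast.parse(source)
    facts = set()
    for func in ast.walk(tree):
        if isinstance(func, ast.FunctionDef):
            for node in ast.walk(func):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    facts.add((func.name, node.func.id))
    return facts

# Hypothetical input program, used only for illustration.
SAMPLE = """
def helper(x):
    return x + 1

def main():
    print(helper(41))
"""

if __name__ == "__main__":
    for caller, callee in sorted(extract_call_facts(SAMPLE)):
        print(f"{caller} calls {callee}")
```

Even this small extractor shows why the step is error-prone: the listed simplifications silently drop or misattribute facts, which affects the soundness of any model built on top of them.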

Second, we study refactoring: to semi-automatically improve the quality of a software system without changing its behavior. Refactoring tools are a combination of analysis and transformations. Implementations of refactoring tools are complex and often broken. We study better ways of designing refactorings and we study ways to enable new (more advanced and useful) refactorings.
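The combination of analysis and transformation can be sketched with a deliberately simplified rename refactoring in Python. This is an illustration under stated assumptions, not one of the group's tools: the names `rename_function`, `names_in_use` and `_Renamer` are hypothetical, all names are treated as living in a single scope, and `ast.unparse` requires Python 3.9 or later. The analysis step checks preconditions; the transformation step only runs if they hold.

```python
import ast

class _Renamer(ast.NodeTransformer):
    """Transformation: rewrite the definition and all simple-name uses."""
    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_FunctionDef(self, node):
        if node.name == self.old:
            node.name = self.new
        self.generic_visit(node)  # also rename uses inside the body
        return node

def names_in_use(tree):
    """Analysis: collect identifiers that a new name could clash with."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
    return names

def rename_function(source, old, new):
    tree = ast.parse(source)
    defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    # Preconditions: the function exists and the new name is not yet taken.
    if old not in defined:
        raise ValueError(f"no function named {old!r}")
    if new in names_in_use(tree):
        raise ValueError(f"{new!r} is already in use")
    return ast.unparse(_Renamer(old, new).visit(tree))

if __name__ == "__main__":
    print(rename_function("def f(x):\n    return x + 1\n\ny = f(1)\n", "f", "g"))
```

The single-scope assumption is exactly the kind of shortcut that makes real refactoring tools fragile: a correct implementation needs name resolution that respects scoping, imports and attribute access.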

Finally, we study code generation from domain specific languages (DSLs). Here we also find a combination of analysis and transformation. Designing, implementing and, very importantly, maintaining DSLs is costly. We focus on application areas such as Computational Auditing and Digital Forensics to experiment with this subject.

Members

  • Prof. Dr. Paul Klint (CWI)
  • Dr. Mark Hills (INRIA)
  • Prof. Dr. Jan van Eijck (CWI)
  • Dr. Jurgen Vinju (CWI)
  • Dr. Tijs van der Storm (CWI)
  • Dr. Vadim Zaytsev (CWI)
  • Dr. Sunil Simon (CWI)
  • Drs. ing. Jeroen van den Bos (PhD student, CWI)
  • Drs. Bas Basten (PhD student, CWI)
  • Drs. Paul Griffioen (PhD student, CWI)
  • Drs. Floor Sietsma (PhD student, CWI)
  • Drs. Arnold Lankamp (scientific programmer, CWI)
  • Drs. Bert Lisser (scientific programmer, CWI)
  • Maarten Dijkema (support, CWI)

Former members

  • Yaroslav Usenko (INRIA)

Partners

Model-Driven Engineering in Digital Forensics

To keep up with the size of storage devices, the speed of network connections and the number of digital devices in use, digital forensic investigations rely heavily on high-performance custom software applications to perform large parts of their analyses. However, the continuous introduction of new consumer applications and devices, along with regularly encountered variants of data storage formats, requires forensic software to be exceptionally flexible and adaptable.

To realize these requirements, using domain-specific languages to raise the level of abstraction and separate different concerns in the domain is a viable approach. Additionally, model-driven engineering may enable additional capabilities such as reuse of models, deep application integration and extensive optimizations.

Investigating these requirements and capabilities requires analysis, design and implementation of systems employing these techniques, evaluation comparing existing systems to these alternative solutions and the development of innovative tools and techniques.

Members

  • Prof. Dr. Paul Klint (project leader)
  • Dr. Tijs van der Storm
  • Drs. ing. Jeroen van den Bos (PhD student)

Key publications

  • Jeroen van den Bos and Tijs van der Storm. A Case Study in Evidence-Based DSL Evolution, in: Proceedings of 9th European Conference on Modelling Foundations and Applications (ECMFA’13), volume 7949 of Lecture Notes in Computer Science, pages 207–219. Springer, 2013 (Experimental data: https://github.com/jvdb/derric-eval).
  • Jeroen van den Bos and Tijs van der Storm. TRINITY: An IDE for The Matrix, in: Proceedings of the 29th IEEE International Conference on Software Maintenance (ICSM’13). IEEE, 2013 (tool paper).
  • Jeroen van den Bos and Tijs van der Storm, Domain-Specific Optimization in Digital Forensics, in: Proceedings of the 5th International Conference on Model Transformation (ICMT'12), 2012.
  • L. Aronson, J. van den Bos. Towards an Engineering Approach to File Carver Construction. 2011 IEEE 35th Annual Computer Software and Applications Conference Workshops, Munich, Germany, 368–373, 2011.
  • Jeroen van den Bos and Tijs van der Storm, Bringing Domain-Specific Languages to Digital Forensics, in: Proceedings of the 33rd International Conference on Software Engineering (ICSE'11), Software Engineering in Practice, ACM, 2011.

Software

Partners

NWO Stimuleringsregeling Kennisbenutting: Analysis of C/C++ source code

Making Rascal suitable for the analysis and transformation of programming languages requires a substantial amount of work that cannot easily be carried out in a research environment: the results are very useful but hardly publishable. This project focuses on developing a C/C++ front-end for Rascal. The project can be divided into the following parts:

  • Adapting the already available C and C++ grammars so that they can handle the full C/C++ standard.

  • Designing a common data model for representing facts extracted from C/C++.

  • Designing and implementing fact extractors that populate the data model with facts extracted from C/C++ source code.

  • Designing and implementing several standard analyses on C/C++ code (including name and type resolution).

Participants

  • Prof. dr. Paul Klint
  • Dr. Jurgen Vinju
  • Drs. Arnold Lankamp

GrammarLab: Foundations For a Grammar Laboratory

The shape of chemical compounds is governed by principles of molecular structure which can be understood using valence bond theory. The acquired understanding can be used to manipulate chemical compounds to obtain useful products such as plastics. Similarly, the internal shape of software is governed by the principles of source code structure which can be made insightful by the theory of grammars. This has enabled the creation of useful products such as compilers.


Grammars for programming languages are complex: two grammar rules can generate an exponential number of source code structures. Programming languages have hundreds of grammar rules. In the context of compiler construction (one language, one grammar, one compiler) this complexity is barely manageable. In the context of IDE construction, refactoring, and reverse engineering it is multiplied by the number of different languages, dialects, versions, embeddings and grammar use cases.
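One way to read the claim about two grammar rules is through ambiguity; this reading is an assumption made for illustration. Take the hypothetical two-rule grammar E ::= E "+" E and E ::= "a". The Python sketch below counts the distinct parse trees for a sentence a + a + ... + a, and the counts follow the Catalan numbers, which grow exponentially with the sentence length.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_parse_trees(operands):
    """Number of parse trees for `a + a + ... + a` with `operands` copies of `a`,
    under the ambiguous grammar E ::= E "+" E | "a"."""
    if operands < 1:
        raise ValueError("a sentence needs at least one operand")
    if operands == 1:
        return 1  # only the rule E ::= "a" applies
    # Rule E ::= E "+" E: pick which "+" is the root, then multiply the
    # number of trees for the left part by the number for the right part.
    return sum(
        count_parse_trees(left) * count_parse_trees(operands - left)
        for left in range(1, operands)
    )

if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 11):
        print(n, count_parse_trees(n))
```

With 4 operands there are 5 trees, and with 11 operands there are already 16796, which is why disambiguation and grammar quality matter in practice.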


The theory of grammars does not provide insight into this complexity. It fails to provide answers to common engineering questions such as:

  • How to efficiently construct grammars?
  • How to assess the quality of grammars?
  • When an erroneous structure is detected, how to relate this effect to its grammatical cause?

These common engineering questions imply that grammars are software, thus requiring well-founded engineering principles and practices. The domain and theory of grammarware engineering are underdeveloped. This research contributes to this field by developing the scientific framework and tools for understanding, creating, versioning, analyzing, testing, debugging, visualizing, and maintaining grammars.

Key publications

  • P. Klint, R. Lämmel, C. Verhoef, Toward an Engineering Discipline for Grammarware. Transactions on Software Engineering and Methodology, Vol. 14, No. 3, July 2005, 331-380.
  • V. Zaytsev, Recovery, Convergence and Documentation of Languages. PhD thesis, Vrije Universiteit, October 2010.

Members

Generic Language Technology - key publications

  • M.G.J. van den Brand, P. Klint, and J.J. Vinju. "Term rewriting with Traversal functions". ACM Transactions on Software Engineering and Methodology (TOSEM), 12(2):152-190, 2003.
  • Mark van den Brand, Jan Heering, Paul Klint, and Pieter Olivier. "Compiling Rewrite Systems: The ASF+SDF Compiler". ACM Transactions on Programming Languages and Systems 24:334--368. 2002.
  • Mark van den Brand, Jeroen Scheerder, Jurgen Vinju, and Eelco Visser. Disambiguation Filters for Scannerless Generalized LR Parsers. In R. Nigel Horspool, editor, Compiler Construction, volume 2304 of LNCS, pages 143-158. Springer-Verlag, 2002.
  • M.G.J. van den Brand, H.A. de Jong, P. Klint and P.A. Olivier, Efficient Annotated Terms, Software-Practice and Experience 2000; 30:259-291.
  • M.G.J. van den Brand, P. Klint and P.A. Olivier, Compilation and Memory Management for ASF+SDF, In Compiler Construction'99 (CC'99), Lecture Notes in Computer Science Vol. 1575, Springer-Verlag, 1999, 198-213.
  • J.A. Bergstra and P. Klint, The discrete time ToolBus, Science of Computer Programming 1998; 31:205-229.

Next Generation Auditing: Data-assurance as a Service

Fraud and fraudulent business practices have always existed. Legislators, regulators and financial authorities try to create legal frameworks and procedures for early detection and prevention, but they have failed consistently during the last 10-15 years. This proposal will create a unique cooperation between experts in Dutch accountancy theory and software engineering researchers specialized in software analysis and domain-specific languages.

The grand vision of this proposal is to create continuous auditing services that can perform real-time monitoring of companies in order to fundamentally increase the reliability and transparency of financial systems and supporting IT systems.

We start with the design of a meta-model for auditing and create the concepts and techniques to describe, analyze, and maintain specific company models and to automatically generate auditing services from them. These services include intake (establishing a company model based on the mining of available data) and monitoring (comparing actual data with the company's model). Key elements of our approach are the use of software analysis to extract facts from financial reports and source code of IT systems, to link the software with actions in the company's model, and to address global issues of maintenance and evolution of these models. The increased trust that is created by continuous auditing services is in itself an enabler for software services in general.

Members

  • Prof. Dr. Paul Klint (Project leader)
  • Drs. Paul Griffioen (PhD student)
  • Dr. Tijs van der Storm

Partners

More SWAT

The group contributes to the following topics:

Software Analysis

With software analysis, we propose and evaluate methods for observing software both quantitatively and qualitatively. We automatically extract models of software systems, which can then be simulated, measured, checked and visualized. On the one hand, such analyses may provide insight into specific software systems, which is valuable in itself. On the other hand, by collecting information about sets of software systems, we may also use software analysis to come to general insights. We work on a variety of increasingly rich models of software systems that would allow increasingly meaningful analyses to be performed.

Relevant keywords in this area are:

  • Static Analysis
  • Software Analytics
  • Software Metrics
  • Software Visualization
  • Software Evolution
  • Architecture Conformance
  • Software Quality
  • Software Complexity
  • Code smells
  • Reverse Engineering
  • Re-engineering
  • Type checkers and type inference
  • Relational Calculus
  • Pattern Matching
  • Model Checking
  • Rascal

Software Transformation

With software transformation, we propose and evaluate automated methods of construction and maintenance of software systems. Using large-scale automated software renovation we improve software quality by transforming existing systems to better systems. Software transformation and analysis go hand in hand. Automated software analysis is used for checking preconditions of automated software transformations such as refactoring.

Relevant keywords in this area are:

  • Refactoring
  • Re-engineering
  • Source-to-source transformation
  • Compilers
  • Code-to-model

Software Generation

With software generation, the goal is to simplify software by automation and abstraction. Using the construction of domain-specific languages we improve the quality of newly designed systems by carefully constructing automated but reconfigurable transformations from high-level domain concepts to high-quality source code. The questions are which domain-specific languages should be defined, and how to implement them effectively.
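As a minimal sketch of such a transformation from domain concepts to source code, consider a tiny state-machine notation. The notation (`state --event--> state`), the function `generate_state_machine` and the door example are all hypothetical and chosen only for illustration; they are not taken from the group's DSLs. The generator below is written in Python and emits Python source text.

```python
def generate_state_machine(spec, class_name="Machine"):
    """Translate lines of the form `state --event--> state` into Python source.

    The source state of the first transition is taken as the initial state.
    """
    transitions = []
    for line in spec.strip().splitlines():
        if not line.strip():
            continue  # allow blank lines in the specification
        source, rest = line.split("--", 1)
        event, target = rest.split("-->", 1)
        transitions.append((source.strip(), event.strip(), target.strip()))
    initial = transitions[0][0]
    lines = [
        f"class {class_name}:",
        "    def __init__(self):",
        f"        self.state = {initial!r}",
        "",
        "    def fire(self, event):",
        "        table = {",
    ]
    for source, event, target in transitions:
        lines.append(f"            ({source!r}, {event!r}): {target!r},")
    lines += [
        "        }",
        "        self.state = table[(self.state, event)]",
        "        return self.state",
    ]
    return "\n".join(lines) + "\n"

# Hypothetical domain model: a door that can be opened and closed.
DOOR_SPEC = """
closed --open--> opened
opened --close--> closed
"""

if __name__ == "__main__":
    print(generate_state_machine(DOOR_SPEC, "Door"))
```

The two-line specification stays readable for a domain expert, while the generated class can be regenerated with a different target design (for example, explicit error handling for unknown events) without touching the specification.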

Relevant keywords in this area are:

  • Model Driven Engineering
  • Model Based Testing
  • Domain Specific Languages
  • Software Simulation
  • Software Verification
  • Software Synthesis

Meta programming and Software Language Engineering

 

Underlying the previous topics is the infrastructure topic of meta-programming. Meta programs are programs that take programs (source code) as input or produce them as output. To efficiently

Relevant keywords in this area are:

  • General context-free parsing algorithms and data-dependent parsing algorithms
  • Modular Language Specification and Implementation Mechanisms
  • (Data-dependent) Context-free Grammars
  • Object Algebras
  • Algebraic Specification
  • Term Rewriting
  • Generic Tree Traversal
  • Pattern Matching (Modulo Theories)
  • Relational Calculus
  • Persistent data-structures
  • Immutable data-structures

Teaching

SWAT is heavily involved in the Master Software Engineering. This successful research master covers a big part of the SWEBOK. It is a collaboration between Universiteit van Amsterdam, Vrije Universiteit, CWI and Hogeschool van Amsterdam. 

In particular, we teach and assist with master's-level courses at several universities, such as:

  • Software Construction (UvA)
  • Software Evolution (UvA, TUE)
  • Software Testing (UvA)
  • Software Quality and Testing (RUG)

Software

SWAT has a strong tradition in building and contributing to open source software.