Rascal Tutorial

Versions of this document:

The slides for this tutorial can be found online at:

The archive with a project containing solutions to the exercises can be found here miss-grant-full.zip

Comments and suggestions: Tijs van der Storm

Introduction

Rascal is programming language for source code analysis and transformation. The primary application areas are (legacy) system renovation, reverse engineering and re-engineering, and the implementation of domain specific languages (DSLs). DSLs are languages tailored to a specific application domain. Examples include SQL, Excel, Make, Latex, VHDL, etc. Today we are going to use Rascal for some aspects of implementing a small domain specific language for state machines. The example is derived from Martin Fowler's book on Domain Specific Languages. The relevant chapter is published on-line:

http://www.informit.com/articles/article.aspx?p=1592379

State machines are useful for describing state dependent behaviour, for instance to control machines or work-flow engines. For more information on the theory of this kind of state machines, you may want to consult Wikipedia http://en.wikipedia.org/wiki/Finite_state_transducer.

In this tutorial we will explore the Rascal language and environment by encountering the following aspects of DSL implementation:

Installation

If you haven't installed the Rascal Eclipse Plugin already, go to the following URL and follow the instructions.

Some notes:

Download the zip archive for the tutorial.

To start, open the Rascal perspective and "Import..." the Eclipse project contained in the zip file (select the archive file).

To open a Rascal console, open a Rascal file from the project and right-click on the Rascal editor. In the context-menu you'll find Launch Console.

Rascal has an interactive documentation system which can started from within Eclipse via the Rascal menu. It can be found online at http://tutor.rascal-mpl.org.

Structure of the project

The source directory of project contains a number of files. Some of them have real code in them, other have stubs that you will implement during this tutorial. The files are briefly described below:

The special folders eclipse and std contain Rascal's standard library. Have a peek to see what's there. Finally, the input directory contains some example state machine definitions. If you've opened a console and imported the Plugin module (import Plugin;) you may invoke the main() function. If you then double-click controller files an editor will open with syntax highlighting.

Syntax Definition for State Machines

Rascal has built-in support for context-free grammars which can be used to define the syntax of programming languages. A syntax rule consists of the following parts:

syntax NonTerminal = label_1: Element_1i ... Element_1m1
                   | ...
                   | label_n: Element ... Element_nmn ;

This defines a rule named "NonTerminal" with n alternatives. Each alternative has a label and a sequence of Elements. The elements of an alternative define the syntax to be recognized by this rule. An element can be one of the following symbols:

To define lexical rules (e.g., for identifiers) character classes are used (similar to regular expressions):

The Syntax module of the project already defines the syntax of Fowler's state machines. Note that our syntax definition extends the Layout module from the Rascal standard library. Have a look and try to understand the how Rascal's syntax formalism works.

Tip: open the std entry in the package explorer of the MissGrant project to see what's in the standard library. Locate and open the Layout module and digest what's in there.

Quiz: what is the layout convention defined in the Comment module?

Quiz: what is the meaning of the !<< and !>> constructs? Why are they needed?

If you've run main() from the Plugin module, you can already open editors with state machines. If you want to parse state machines from the console, import module Parse, and use the provided parse functions.

If you import Implode, you can use the load function to directly obtain ASTs for a source location. You can inspect parse trees and ASTs by importing util::ValueUI, and entering tree(the tree) in the console.

Quiz: what is the meaning of # in the definition of parse? Hint: type in "#int" in the Rascal console.

Abstract Syntax

Rascal uses algebraic data types (ADTs) for describing abstract syntax, as is common in functional programming languages. The Rascal standard library defines a function "implode" that turns a concrete syntax tree into an abstract syntax tree.

To make this work, an abstract syntax ADT should correspond to the syntax definition in the following way:

You can verify in the AST module that the abstract syntax definition indeed follows this scheme.

Quiz: what are source code transformations where ASTs are cumbersome to use?

Exercise 1: source to source transformation

Consider the following paragraph in Fowler's text:

In particular, you should note that reset events aren't strictly necessary to express Miss Grant's controller. As an alternative, I could just add a transition to every state, triggered by doorOpened, leading to the idle state. The notion of a reset event is useful because it simplifies the diagram.

What this means is, that it is possible to construct an equivalent state machine that does not depend on reset events. In other words, the resetEvents feature of the state machine language is syntactic sugar. In this assignment you are to write a transformation that desugars reset events according to the quote above.

Tip: use the Rascal visit construct to transform the state machine AST.

Bonus: implement the reverse, i.e. 'resugaring' of resetEvents.

Quiz: does resugaring after desugaring always give the original state machine? If not, why not, and would that be a problem?

Exercise 2: Fact extraction

For some applications, especially source code analyses, the tree structure of the AST is not ideal. In this assignment you will write a function that extracts a relational abstraction of state machines. Relations are natural representations for graphs. And a state machine can be considered as a special kind of graph.

In order to connect the resulting analysis to the state machine visualizer, we use the following types as interfaces (see module Extract):

 alias TransRel = rel[str state, str event, str toState]
 alias ActionRel = rel[str state, str command]

The first relation captures the transition structure of a state machine: it contains tuples <s, t, s'>, where s is the source state, t the triggering event, and s' the target state. The second relation captures which commands should be executed upon entering a certain state.

NB: You may assume that reset events have been desugared as in the previous step.

Quiz: what is the difference between a binary relation and a map?

Exercise (optional): implement the extraction of two similar relations, only now the produce a transition relation between state, event token and state, and an action relation between state and command token.

Visualizing a state machine

The relations of the previous assignment are an abstract representation of a state machine. This provides excellent input to graphically visualizing a state machine. The provided Rascal project contains a module to visualize state machines (Visualize). If you've implemented the relations of the previous exercise correctly, you can visualize a state machine clicking MissGrant -> Visualize in the context menu of a state machine editor. You can also import the module Visualize in the console and then call the function renderController on a controller AST.

In the visualization there are buttons for each event. You can click on them to interactively simulate the execution of the state machine.

Exercise 3: Well-formedness checking of state machines

Many programming languages have type checkers. In the case of state machines, there isn't really a notion of types. Nevertheless, it is still possible to make mistakes. In this assignment the goal is to make a checker function that detects such mistakes. This function will return a collection of error or warning messages. The data type to be used for this can be found in the standard library module Message.

The list of things you could check for includes (but might not be limited by) the following:

Rascal checkers typically return a set of error messages which can be used by the IDE to do error marking in editors. The Message data type can be found in the Message module of the standard library. To obtain the source location of AST nodes use aNode@location (see the AST module to see which nodes have these annotations).

Exercise: write a checker for state machines in the Check module. The toplevel function is provided and will be used by the IDE.

Tip: write one helper function for each violation (e.g., duplicate events) and call each helper function in the top-level check function, unioning the respective sets of error messages.

Tip: for reachability analysis use the built-in Rascal operator for transitive closure (post-fix +).

Note: you may put the facts you've extracted from the previous assignment to good use in some of the analyses that the checker performs.

Exercise 4: Unparsing using string templates.

In this assignment you are to write a function that converts ASTs back to source code using Rascal's built-in string templates. String templates are string literals with advanced mechanisms for interpolation:

The Unparse module shows function definition stubs in the pattern-based invocation style: each AST node has a corresponding "equation". Fill in each definition with the appropriate string templates and recurse on unparse where needed.

Note: the result does not have to be pretty. But the provided tests at the top of Unparse should succeed. You can run them by entering ":test" on the command line after importing Unparse.

Tip: you may use the ' (single quote) character to add artifical margins in strings. String templates will be expanded relative to the margin. This improves readability of the templates.

Quiz: why wouldn't you want to use this unparse function as a real pretty printer?

Exercise 5: Code-generation to Java

State machines have to be executed in code somehow. One approach is to generate Java code. There are multiple ways for generating code from a state machine. You may consider one of the following alternatives:

Note that upon entering states the actions (if any) should be executed. To be able to run the code, assume the input stream is a java.util.Scanner object, and use nextLine to obtain the next token. Actions print tokens onto an output stream, which can be a java.io.PrintWriter.

Quiz: what is wrong with the second approach? What is the reason that it's a problem in Java, but not, say, in Scheme?

Quiz: what are the draw-backs of using strings for code generation? What are alternatives?

Bonus: parallel merge of two state machines

The desugaring reset events (cf. above) is an instance of a simple source-to-source transformation. In this assignment we will engage in a source-to-source transformation that is slightly more complex. The goal is to take two state machines and produce a new one that runs the two original state machines in parallel.

In the realm of Gothic security systems the company noticed that sometimes a client wanted two or more hidden compartments in the same room. the two controllers may share events, such as opening the drawer. In this case it is off course more cost effective to install a single controller instead of two. This is achieved by merging the two state machines through the algorithm you are going to program now and installing the resulting state machine onto a single controller.

The states in the machine resulting from merging machines S1 and S2 are identified by tuples of the states of both machines. Execution thus starts in a the initial state <s0, u0> where s0 and u0 are the initial states of S1 and S2 respectively. Running S1 and S2 in parallel then entails the following:

Note: you have to decide how commands, events and reset events are combined and how, upon entering a combined state <s', u'>, the actions of both s' and u' are combined.

Since the result of the parallel merge transformation is again just an ordinary state machine, you can reuse the code generator of the previous assignment as is to run two state machines in parallel.

Exercise (optional): write an interpreter that evaluates two state machines in parallel, without performing the parallel merge.

Exercise (optional): implement other operations on state machines (e.g., concat, compose, union etc.---see the Wikipedia page cited above).

Tip: use the unparser of exercise 4 to inspect the result of the parallel merge as source code in an editor as follows: first import util::ValueUI, then call text(unparse(some controller)).

Closing

We have seen a number of features of Rascal for source code analysis and transformation. These features were illustrated with a simple DSL for state machines. This is not all of Rascal, however. Topics that we didn't cover today include:

For more information, check out the online documentation http://tutor.rascal-mpl.org. If you have questions, feel free to ask them on the Rascal StackOverflow site http://ask.rascal-mpl.org.