TP2 - Typing
The goal of this session is to add types, and a static type analysis, to our language.
Start by downloading the package: codeV2.tgz.
The language is the same as last week, but extended with variable declarations. Such declarations are statements of the form:
var var1 [= expr1], var2 [=expr2],...;
A single declaration can declare an arbitrary number of variables var1, var2, etc. Note that there is no static type information in declarations: the language is dynamically typed. Each variable name is optionally followed by an initialization of the form = expr, which behaves similarly to an assignment. If there is no initialization expression, the variable is assumed to be non-initialized. Note that using a non-initialized value in an expression results in a value error, but not a type error, and detecting the use of non initialized variables is thus out of the scope of the type analysis we develop here.
A variable declared in a block starts existing at the point of declaration and stops existing at the end of the block; this is the variable scope. It is important to distinguish variables of the same name, in case several of them exist at the same time (in nested scopes).
The program is dynamically typed, à la Python: upon assignation, a variable takes the type of the value it is assigned. In our simple language, there are only boolean and integer values. There is a type error in case an integer value is used with a boolean operator, or the converse.
The goal of this exercise is to write a static analyzer that automatically discovers whether the program is type safe (no dynamic type error at execution). The analysis is similar to computing the denotational semantics of a non-deterministic language, but sets of values (integers, booleans, errors) are replaced with abstract values, i.e., types.
- Define the type aval of elements of the abstract value lattice. It should represent the type of integers and of booleans, and be closed by intersection and union.
- Define the type env of environments mapping each variable to the type of values it contains.
- Define the lattice join operator aval_join: aval -> aval -> aval (union of abstract values) and its extension to environments: env_join: env -> env -> env. You can exploit the functions map2o and forall2o from the Mapext functor (this is an extension to OCaml's standard Map module, and it is provided in the libs subdirectory).
- Program the functions eval_expr: env -> expr ext -> aval and eval_stat: env -> stat ext -> env. These should be very similar to the non-deterministic denotational semantics of last week, but use abstract values and abstract environments instead of sets of values and sets of environments. They are also extended to handle local variable scope.
- Evaluate your analyzer on the files in the examples directory. Can you spot the cases where the analyzer cannot prove the safety of the program, despite the program having no actual type error at execution?
Possible extensions and variants include:
- Polymorphic types. A polymorphic type system allows several types for programs. As our language has only a finite set of types, we can represent polymorphism by listing the set of possible types. Program a polymorphic type analyzer that propagate sets of type assignments instead of a single type assignment.
- Flow-insensitive types. A static type analysis allows the same variable to change its type throughout the execution of the program. Standard type systems are, however, flow-insensitive: a single type assignment should be valid at all program points. Program a flow-insensitive type analyzer that ensures this property and rejects programs requiring variables to change type.
Author: Antoine Miné