Examples in Program Analysis

1 Structuring
1.1 Specification
1.2 Output of files
2 Dangling else
2.1 Specification
2.2 Output files
3 Switch Case Lables
3.1 Specification
3.2 Output files
4 Variable Usage
4.1 Specification
4.2 Output files
5 For Loop Control
5.1 Specification
5.2 Output files

1 Structuring

1.1 Specification

The structure of the example language is specified by the following concrete grammar productions.

A program is a block that consists of possibly empty sequences of declarations and statements.

Program[1]==
Prog    ::= Block .
Block   ::= '{' Decl* Stmt* '}' .
This macro is invoked in definition 12.
A declaration defines one or more variables of type int. Each declaration is terminated by a ';'. The symbol VarDecl distinguishes this defining occurrence of Ident from other occurrences. Several identifiers are separated by ','.
Declarations[2]==
Decl    ::= 'int' VarDecls ';' .
VarDecls::= VarDecl // ',' .
VarDecl ::= Ident .
This macro is invoked in definition 12.
There are several forms of statements: assignments, conditionals, switch statements, break statements, for loops, blocks, and expression statements. Statements, except blocks and switch statements are terminated by ';'.
Statements[3]==
Stmt    ::= Block .
Stmt    ::= Expr ';' .
Stmt    ::= 'break' ';' .
This macro is invoked in definition 12.
An assignment has the usual notation. The Assign symbol is introduced, because for loops contain assignments without terminating ';'. The symbol VarSet distinguishes the variable on the left-hand side from variable occurrences where their value is accessed rather than set.
Assignments[4]==
Stmt    ::= Assign ';' .
Assign  ::= VarSet '=' Expr .
VarSet  ::= Ident .
This macro is invoked in definition 12.
For statements consist of the initial assignment, the iteration condition, the assignment executed after each iteration, and the iterated statement.
For loop[5]==
Stmt    ::= ForStmt .
ForStmt ::= 'for' '(' ForCtrl ';' Expr ';' Assign ')' Stmt .
ForCtrl ::= Assign .
This macro is invoked in definition 12.
The else branch of an if statement is optional. The dangling else ambiguity is resolved such that an else branch is bound to the innermost if statement. The $'else' modification instructs the parser generator not to reduce that production if an 'else' token follows.
If statement[6]==
Stmt    ::= 'if' '(' Expr ')' Stmt $'else' .
Stmt    ::= 'if' '(' Expr ')' Stmt 'else' Stmt .
This macro is invoked in definition 12.
Switch statements have the same structure as in C.
Switch statement[7]==
Stmt    ::= SwStmt .
SwStmt  ::= 'switch' '(' Expr ')' '{' Case* '}' .
Case    ::= CaseLab Stmt* .
CaseLab ::= 'case' CaseLit ':' .
CaseLab ::= 'default' ':' .
CaseLit ::= Number .
This macro is invoked in definition 12.
The following expression syntax specifies precedence and associativity of operators: MulOpr have highest precedence, CmpOpr have lowest precedence. AddOpr and MulOpr are left-associative, and CmpOpr are not associative.
Expressions[8]==
Expr    ::= Sum CmpOpr Sum / Sum .
Sum     ::= Sum AddOpr Fact / Fact .
Fact    ::= Fact MulOpr Operand / Operand .
CmpOpr  ::= '<' / '>' / '==' / '!=' / '<=' / '>=' .
AddOpr  ::= '+' / '-' .
MulOpr  ::= '*' / '/' .
Operand ::= '(' Expr ')' .
This macro is invoked in definition 12.
The precedence and associativity rules are only relevant for the construction of the tree. The then constructed tree need not distinguish between Expr, Sum, Fact, and Operand. They are all represented by an Expr node. Similarly all operators are represented by an Opr node in the tree:
Expressions in abstract tree[9]==
Expr    ::= Sum Fact Operand .
Opr     ::= CmpOpr AddOpr MulOpr .
This macro is invoked in definition 13.
Operands are integral numbers, identifiers that denote access of a variable value, or calls of predefined functions. The latter may have an arbitrary number of arguments.
Operands[10]==
Operand ::= Number .
Operand ::= VarUse .
VarUse  ::= Ident .
Operand ::= Ident '(' Params ')' .
Params  ::= .
Params  ::= Expr // ',' .
This macro is invoked in definition 12.
The notation for numbers, identifiers, and comments are specified as in C using canned token specifications.
Tokens[11]==
Number: C_INTEGER
Ident:  C_IDENTIFIER
        C_COMMENT
This macro is invoked in definition 14.

1.2 Output of files

tree.con[12]==
Program[1]
Declarations[2]
Statements[3]
Assignments[4]
For loop[5]
If statement[6]
Switch statement[7]
Expressions[8]
Operands[10]
This macro is attached to an output file.
tree.sym[13]==
Expressions in abstract tree[9]
This macro is attached to an output file.
tree.gla[14]==
Tokens[11]
This macro is attached to an output file.

2 Dangling else

2.1 Specification

A warning is issued if the Stmt subtree of a one-sided if statement is a two-sided if statement. In order to produce the warning in the lower context we associate an attribute WarnSingleIf to Stmt. It indicates whether this Stmt occurs immediately below an rIf context. There it is set to true (1). In all other contexts where Stmt occurs on the righthand side it is set to false.

Using a SYMBOL computation for Stmt INH.WarnSingleIf = 0; avoids to take care of any other Stmt context. It is overridden by the RULE computation in the one-sided if context.

dangling else[15]==
ATTR WarnSingleIf: int;

SYMBOL Stmt COMPUTE INH.WarnSingleIf = 0; END;

RULE: Stmt ::= 'if' '(' Expr ')' Stmt COMPUTE
  Stmt[2].WarnSingleIf = 1;
END;

RULE: Stmt ::= 'if' '(' Expr ')' Stmt 'else' Stmt COMPUTE
  IF (Stmt[1].WarnSingleIf,
  message (WARNING, "beware of dangling else!", 0, COORDREF));
END;
This macro is invoked in definition 16.

2.2 Output files

dangling.lido[16]==
dangling else[15]
This macro is attached to an output file.

3 Switch Case Lables

3.1 Specification

For each SwStmt the number of its CaseLit occurrences and their smalles and largest value are computed by three CONSTITUENTS constructs, each combining the CaseLit values by an appropriate triple of functions: ADD, ARGTOONE, ZERO or MIN, IDENTICAL, f_INT_MAX or MAX, IDENTICAL, f_INT_MIN.

The functions f2, f1, f0 are chosen such that f1 yields the appropriate value of a single item, and for any x the equation f2 (x, f0 ()) = x holds.

The first CONSTITUENTS doesn't use the values of CaseLit.Sym at all for counting, since ARGTOONE is a constant (unary!) function, that yields 1 for any argument. f_INT_MAX and f_INT_MIN are constant function that yield the largest repectively the smallest int value.

In case of nested switch statements the CONSTITUENTS do not access CaseLit occurrences of inner switch statements. This fact could be made explicit by using CONSTITUENTS CaseLit.Sym SHIELD (SwStmt) WITH ...

The SYMBOL computation for CaseLit acesses the value of the terminal Number.

Compute Switch Case Lables[17]==
ATTR Sym:       int;

SYMBOL SwStmt COMPUTE
  printf ("switch in line %d has %d case labels in the range %d to %d\n",
        LINE,
        CONSTITUENTS CaseLit.Sym WITH (int, ADD, ARGTOONE, ZERO),
        CONSTITUENTS CaseLit.Sym WITH (int, MIN, IDENTICAL, f_INT_MAX),
        CONSTITUENTS CaseLit.Sym WITH (int, MAX, IDENTICAL, f_INT_MIN));
END;

SYMBOL CaseLit COMPUTE SYNT.Sym = TERM; END;
This macro is invoked in definition 19.
ADD, ZERO, ARGTOONE, and IDENTICAL are predefined macros in LIDO. The other functions are supplied by the following cpp macros:
Define Constituents Procedure Macros[18]==
#include <limits.h>
#define MIN(x,y)        ((x)<=(y) ? (x) : (y))
#define MAX(x,y)        ((x)>=(y) ? (x) : (y))
#define f_INT_MIN()     INT_MIN
#define f_INT_MAX()     INT_MAX
This macro is invoked in definition 20.

3.2 Output files

switch.lido[19]==
Compute Switch Case Lables[17]
This macro is attached to an output file.
switch.head[20]==
Define Constituents Procedure Macros[18]
This macro is attached to an output file.

4 Variable Usage

4.1 Specification

For this analysis we want to associate properties to variables. Hence we first need a name analysis that provides unique keys for variable ocurrences.

We specify C-like scope rules by using the library module CScope:

Instanciating CScope Module[21]==
$/Name/CScope.gnrc:inst
This macro is invoked in definition 34.
This module provides computational roles for a root symbol RootScope that contains all scopes, for a symbol RangeScope that forms a possibly nested scope, and for defining and applied identifier occurrences, IdDefScope, IdUseEnv.

These roles are associated to our grammar symbols.

Inheritance of scope ranges to grammar symbols[22]==
SYMBOL Prog    INHERITS RootScope END;
SYMBOL Block   INHERITS RangeScope, RangeUnique END;
SYMBOL VarDecl INHERITS IdDefScope END;
SYMBOL VarUse  INHERITS IdUseEnv END;
SYMBOL VarSet  INHERITS IdUseEnv END;
This macro is invoked in definition 35.
Additionally we have to make the value of the terminal Ident available one level up at the VarDecl, VarUse, and VarSet nodes. That is achieved by the computation associated to the abstract symbol IdOcc:
Making Available Identifier Code[23]==
SYMBOL IdOcc COMPUTE THIS.Sym = TERM; END;
SYMBOL VarDecl INHERITS IdOcc END;
SYMBOL VarUse  INHERITS IdOcc END;
SYMBOL VarSet  INHERITS IdOcc END;
This macro is invoked in definition 35.
Our scope rules can be violated in any of two cases: A variable is used but not defined, or a variable is defined more than once in a range. In each case we want to produce an error message.

The first case is checked by inspecting the Key attribute of an applied identifier occurrence:

Check if variables are defined[24]==
SYMBOL IsDefined COMPUTE
  IF (EQ (THIS.Key, NoKey),
  message (ERROR, "variable not defined", 0, COORDREF));
END;

SYMBOL VarUse  INHERITS IsDefined END;
SYMBOL VarSet  INHERITS IsDefined END;
This macro is invoked in definition 35.
The check for multiply defined variables is obtained from the library:
Instanciate Module to check for multiple definitions[25]==
$/Prop/Unique.gnrc:inst
This macro is invoked in definition 34.
The computational role Unique provides a compuation for a boolean attribute that indicates uniqueness. The RangeUnique role is associated to the root of the grammar.
Check for multiple definitions of variables[26]==
SYMBOL Prog     INHERITS RangeUnique END;
SYMBOL VarDecl  INHERITS Unique COMPUTE
  IF (NOT (THIS.Unique),
  message (ERROR, "variable is multiply defined", 0, COORDREF));
END;
This macro is invoked in definition 35.
We now get to the analysis of variable usage. We split the task into two subtasks. First it is checked whether a variable is set (used) somewhere in the program. A message is issued at the variable declaration if it is never set (used).

For that purpose we associate properties IsSet and IsUsed to variables:

Set and Used Properties of Variables[27]==
IsSet:          int;
IsUsed:         int;
This macro is invoked in definition 36.
The attribute Prog.GotUseSet represents the computational state when the IsSet and IsUsed properties are set at all variable occurrences. It is a precondition for the check in the declaration context.
Propagate Definitions and Uses of Variables[28]==
SYMBOL Prog COMPUTE 
  THIS.GotUseSet = CONSTITUENTS (VarSet.GotUseSet, VarUse.GotUseSet);
END;

SYMBOL VarSet COMPUTE
  THIS.GotUseSet = ResetIsSet (THIS.Key, 1);
END;

SYMBOL VarUse COMPUTE
  THIS.GotUseSet = ResetIsUsed (THIS.Key, 1);
END;

SYMBOL VarDecl COMPUTE
  IF (NOT (GetIsSet (THIS.Key, 0)),
  message (WARNING, "variable is never set", 0, COORDREF))
  <- INCLUDING Prog.GotUseSet;

  IF (NOT (GetIsUsed (THIS.Key, 0)),
  message (WARNING, "variable is never used", 0, COORDREF))
  <- INCLUDING Prog.GotUseSet;
END;
This macro is invoked in definition 35.
For the second analysis we associate the property LastAccess to variables representing the current state of a variable.
LastAccess property[29]==
LastAccess:     int;
This macro is invoked in definition 36.
It may assume one of the values
Possible Values of LastAccess property[30]==
#define unused  0
#define set     1
#define used    2
This macro is invoked in definition 37.
We simulate assignments and accesses of variables along a left-to-right CHAIN through the program. (This simulation is of course not accurate in the presence of loops or if statements. Hence not all situation where a variable may be used before it is set will be flagged with a warning.)

That CHAIN is started in the Prog context. It does not carry a value.

The computations that set LastAccess are allocated on the UseSetChain CHAIN by each having UseSetChain as both pre- and postcondition.

LastAccess starts with the value unused at the variable declaration and may then alternate between set and used along the CHAIN.

Define SetUsed Chain[31]==
CHAIN UseSetChain:      VOID;

SYMBOL Prog COMPUTE
  CHAINSTART HEAD.UseSetChain = "yes";
END;

SYMBOL VarDecl COMPUTE
  THIS.UseSetChain = 
        ResetLastAccess (THIS.Key, unused)
        <- THIS.UseSetChain;
END;
This macro is invoked in definition 35.
The CHAIN computation for VarUse comprises two actions which have to be executed in that order: first check the LastAccess state, then modify it to "used". The ORDER constructs combines these actions.
Mark Variable Use in SetUsed Chain[32]==
SYMBOL VarUse COMPUTE
  THIS.UseSetChain = ORDER (
        IF (EQ (GetLastAccess (THIS.Key, unused), unused),
                message (WARNING, "variable is used before set", 0, COORDREF)),
        ResetLastAccess (THIS.Key, used))
        <- THIS.UseSetChain;
END;
This macro is invoked in definition 35.
The LastAccess property for an assignment has to occur after the the expression is examined by the CHAIN computations. Hence it is allocated with suitable pre- and postconditions in the RULE context for assignments. (If it were specified as a CHAIN computation in the SYMBOL VarSet context, it would occur BEFORE the CHAIN computations of the righthand side expression.)
Mark Varialbe Definition in SetUsed Chain[33]==
RULE: Assign ::= VarSet '=' Expr COMPUTE
  Assign.UseSetChain =
        ResetLastAccess (VarSet.Key, set)
        <- Expr.UseSetChain;
END;
This macro is invoked in definition 35.

4.2 Output files

vars.specs[34]==
Instanciating CScope Module[21]
Instanciate Module to check for multiple definitions[25]
This macro is attached to an output file.
vars.lido[35]==
Inheritance of scope ranges to grammar symbols[22]
Making Available Identifier Code[23]
Check if variables are defined[24]
Check for multiple definitions of variables[26]
Propagate Definitions and Uses of Variables[28]
Define SetUsed Chain[31]
Mark Variable Use in SetUsed Chain[32]
Mark Varialbe Definition in SetUsed Chain[33]
This macro is attached to an output file.
vars.pdl[36]==
Set and Used Properties of Variables[27]
LastAccess property[29]
This macro is attached to an output file.
vars.head[37]==
Possible Values of LastAccess property[30]
This macro is attached to an output file.

5 For Loop Control

5.1 Specification

In the first step we check whether for control variables are used uniquely. We associate a property IsForCtrl to variable keys (the keys are already specified in the solution of the Variable usage analysis):
for control property[38]==
IsForCtrl:      int;
This macro is invoked in definition 44.
It can have one of three values:
possible values of for control property[39]==
#define NoForCtrl       0
#define ForCtrl         1
#define MultForCtrl     2
This macro is invoked in definition 45.
For variables derived from ForCtrl that property is set to ForCtrl if it is not yet set, or to MultForCtrl if it is already set elsewhere.

A warning is issued if the ForCtrl variable is set to MultForCtrl. The dependency on INCLUDING Prog.GotForCtrl ensures that ALL offending ForCtrl occurrences are indicated (rather than only the last one).

for control used in other statement warning[40]==
SYMBOL Prog COMPUTE
  SYNT.GotForCtrl = CONSTITUENTS ForCtrl.GotForCtrl;
END;

SYMBOL ForCtrl: Key: DefTableKey;
SYMBOL ForCtrl COMPUTE
  SYNT.Key = CONSTITUENT VarSet.Key;
  SYNT.GotForCtrl = SetIsForCtrl (THIS.Key, ForCtrl, MultForCtrl);

  IF (EQ (MultForCtrl, GetIsForCtrl (THIS.Key, ForCtrl)),
  message (WARNING, "ctrl variable used in other for statement", 0, COORDREF))
  DEPENDS_ON INCLUDING Prog.GotForCtrl;
END;
This macro is invoked in definition 46.
The task to check for control variables used outside for statement can be expressed as onother instance of a name analysis task: A for control variable is defined by a ForCtrl with the ForStmt being its range. Usage of a for control variable outside its range is considered to be an undefined occurrence, indicated by a warning.

Hence we use another instance of the CScope module:

CScope instanciation used to check for control variables[41]==
$/Name/CScope.gnrc+instance=For +referto=For:inst
This macro is invoked in definition 47.
The parameter +instance=For modifies the identifiers used in the module to ForKey, ForRootScope, ForRangeScope, ForIdDef, ForIdUse, and such avoids name clashes with identifers of the modules used before. The parameter +referto=For causes the module to use key attributes named ForKey instead of Key. Hence, at a ForCtrl we can have both a Key attribute representing the variable according to the original scope rules and a ForKey attribute representing the object according to our For rules.

The condition for the warning then checks whether the variable is a for control variable (property IsForCtrl) and it is undefined with respect to uses in our "for ranges" (THIS.ForKey is NoKey). The warning has to depend on a computational state where th property IsForCtrl is set for all variables (INCLUDING Prog.GotForCtrl).

for control used outside of scope warning[42]==
SYMBOL Prog     INHERITS ForRootScope END;
SYMBOL ForStmt  INHERITS ForRangeScope END;
SYMBOL ForCtrl  INHERITS ForIdDefScope COMPUTE
  THIS.Sym = CONSTITUENT VarSet.Sym;
END;

SYMBOL VarSet INHERITS ForIdUseEnv, OutsideFor END;
SYMBOL VarUse INHERITS ForIdUseEnv, OutsideFor END;

SYMBOL OutsideFor COMPUTE
  IF (AND (NE (NoForCtrl, GetIsForCtrl (THIS.Key, NoForCtrl)),
        EQ (THIS.ForKey, NoKey)),
  message (WARNING, "for ctrl variable used outside for scope", 0, COORDREF))
  DEPENDS_ON INCLUDING Prog.GotForCtrl;
END;
This macro is invoked in definition 47.
Finally we check for control variables being assigned outside the for control part. We associate an attribute InForCtrl to Assign. It indicates whether the Assign occurs in a for control part. (We use the same technique of SYMBOL computation and overriding as in the dangling else task.) Then the warning condition is similar to that of the above warning.
assignment to for control variable warning[43]==
ATTR InForCtrl: int;

SYMBOL Assign COMPUTE THIS.InForCtrl = 0; END;

RULE: ForStmt ::= 'for' '(' ForCtrl ';' Expr ';' Assign ')' Stmt COMPUTE
  Assign.InForCtrl = 1;
END;

RULE: ForCtrl ::= Assign COMPUTE
  Assign.InForCtrl = 1;
END;

RULE: Assign ::= VarSet '=' Expr COMPUTE
  IF (AND (NE (NoForCtrl, GetIsForCtrl (VarSet.Key, NoForCtrl)),
        NOT (Assign.InForCtrl)),
  message (WARNING, "assignment to for control variable", 0, COORDREF))
  DEPENDS_ON INCLUDING Prog.GotForCtrl;
END;
This macro is invoked in definition 47.

5.2 Output files

loop.pdl[44]==
for control property[38]
This macro is attached to an output file.
loop.head[45]==
possible values of for control property[39]
This macro is attached to an output file.
loop.lido[46]==
for control used in other statement warning[40]
This macro is attached to an output file.
loop.specs[47]==
CScope instanciation used to check for control variables[41]
for control used outside of scope warning[42]
assignment to for control variable warning[43]
This macro is attached to an output file.