Examples in Program Analysis (Solution II)

1 Variable Usage
1.1 Solution
1.2 Further Questions
1.3 Output files

1 Variable Usage

The style guide requires that a program must not have variables which are never used. Furthermore programs must not rely upon default initialisation. Hence each variable must be assigned to before it is used.

You decide to suggest to your manager that the programmers should use lint before they release their programs. Inspite of that you try to let your analysis tool check these requirements, and compare the results with those of lint.

For the program

vars.in[1]==
{ int a, b, c, d, e;
  int f;
  int a;

  a = read (x);
  b = a + c;
  c = a * c;
  e = d + a;
  print (e);
}
This macro is attached to an output file.
your analysis tool shall produce messages like
vars.out[2]==
"vars.in", line 5:13 ERROR: variable is not defined
"vars.in", line 1:7 ERROR: variable is multiply defined
"vars.in", line 2:7 WARNING: variable is never used
"vars.in", line 3:7 ERROR: variable is multiply defined
"vars.in", line 5:13 WARNING: variable is used before set
"vars.in", line 6:11 WARNING: variable is used before set
"vars.in", line 7:11 WARNING: variable is used before set
"vars.in", line 8:7 WARNING: variable is used before set
This macro is attached to an output file.
The following questions may help to focus your attention on the significant aspects of this problem:

1. Decompose the whole task into subproblems: a name analysis task and several single tasks which need to compute a property of variables each. Solve the tasks separately.

2. How do you solve the name analysis task?

3. For each property subtask answer the following questions: What are the property values? In which contexts is the property to be set? In which contexts is the property to be read? What is the sequencing relation between setting and reading the property.

4. Discuss the write before read subproblem for the statement c = a * c;.

1.1 Solution

Name Analysis:

The task requires to associate properties to program objects. We first determine which identifier stands for which program object, represented by a key. The name analysis module CScope is used for that purpose:

Name analysis module instance[3]==
$/Name/CScope.gnrc:inst
This macro is invoked in definition 13.
This module provides computational roles for a root symbol RootScope that contains all scopes, for a symbol RangeScope that forms a possibly nested scope, and for defining and applied identifier occurrences, IdDefScope, IdUseEnv. These roles are associated to our grammar symbols.

Additionally we have to make the value of the terminal Ident available one level up at the VarDecl and VarUse nodes. That is achieved by the computation associated to the CLASS symbol IdOcc.

Name analysis module roles[4]==
SYMBOL Prog    INHERITS RootScope END;
SYMBOL Block   INHERITS RangeScope END;
SYMBOL VarDecl INHERITS IdDefScope END;
SYMBOL VarUse  INHERITS IdUseEnv END;

CLASS SYMBOL IdOcc: Sym: int;
CLASS SYMBOL IdOcc COMPUTE THIS.Sym = TERM; END;
SYMBOL VarDecl INHERITS IdOcc END;
SYMBOL VarUse  INHERITS IdOcc END;
This macro is invoked in definition 14.
Check Variable Definitions:

The scope rules can be violated in any of two cases: A variable is used but not defined, or a variable is defined more than once in a range. In each case we want to produce an error message.

The first case is checked by inspecting the Key attribute of an applied identifier occurrence:

Undefined variables[5]==
SYMBOL VarUse COMPUTE
  IF (EQ (THIS.Key, NoKey),
  message (ERROR, "variable is not defined", 0, COORDREF));
END;
This macro is invoked in definition 14.
In order to check for multiple definitions we use a property:
Multiple definitions property[6]==
MultDef: int;
This macro is invoked in definition 15.
For each key it may have one of 3 values which are encoded for better readability in the following .head file fragment:
Definition state macros[7]==
#define NotDefined      0
#define OnceDefined     1
#define MultipleDefined 2
This macro is invoked in definition 16.
The SetMultDef in VarDecl context sets the state either to 1 if no definition has been encountered so far, or to 2 otherwise. The CONSTITUENTS and INCLUDING pair using the attribute GotMultDef ensures that all set operations are done before any check for a message is executed:
Multiple definitions[8]==
SYMBOL Prog COMPUTE
  SYNT.GotMultDef = CONSTITUENTS VarDecl.GotMultDef;
END;

SYMBOL VarDecl COMPUTE
  SYNT.GotMultDef = 
    SetMultDef (THIS.Key, OnceDefined, MultipleDefined);

  IF (EQ (GetMultDef (THIS.Key, NotDefined), MultipleDefined),
  message (ERROR, "variable is multiply defined", 0, COORDREF))
  <- INCLUDING Prog.GotMultDef;
END;
This macro is invoked in definition 14.
Used Variables:

For checking whether a defined variable is used somewhere in the program we use the same technique as for checking for multiple definitions. A property is introduced:

Used property[9]==
IsUsed: int;
This macro is invoked in definition 15.
All variable uses have to be encountered before the check for an unused variable is done in the context of a variable definition.
Uses of variables[10]==
SYMBOL Prog COMPUTE 
  THIS.GotIsUsed = CONSTITUENTS VarUse.GotIsUsed;
END;

SYMBOL VarUse COMPUTE
  THIS.GotIsUsed = ResetIsUsed (THIS.Key, 1);
END;

SYMBOL VarDecl COMPUTE
  IF (NOT (GetIsUsed (THIS.Key, 0)),
  message (WARNING, "variable is never used", 0, COORDREF))
  <- INCLUDING Prog.GotIsUsed;
END;
This macro is invoked in definition 14.
Read before Write:

We now determine whether a variable is read before written. For that purpose we use a boolean property that describes whether a variable is already written in left-to-right order throughout the program.

Is written property[11]==
IsWritten: int;
This macro is invoked in definition 15.
We simulate assignments and accesses of variables along a left-to-right chain through the program. (This simulation is of course not accurate in the presence of loops or if statements. Hence not all situation where a variable may be used before it is set will be flagged with a warning.)

That chain is started in the Prog context. It does not carry a value. The states have to be computed for each variable individually.

A message is issued when a variable with not IsWritten is encountered in a VarUse which is not the left-hand side of an assignment. The VarUse on the left-hand side of an assignment turns the state into IsWritten.

We use a boolean attribute Lhs to distinguish the left-hand side of an assignment from other VarUses.

Furthermore, we have to locally modify the chain order in assignment contexts such that the left-hand side is encountered after the right-hand side.

Variable access computation[12]==
CHAIN VarAccess: VOID;

SYMBOL Prog COMPUTE
  CHAINSTART HEAD.VarAccess = "yes";
END;

SYMBOL VarUse: Lhs: int;
SYMBOL VarUse COMPUTE
  INH.Lhs = 0; /* default */
  THIS.VarAccess =
    IF (INH.Lhs,
        ResetIsWritten (THIS.Key, 1),
    IF (NOT (GetIsWritten (THIS.Key, 0)),
        message (WARNING, "variable is used before set", 0, COORDREF)))
    <- THIS.VarAccess;
END;

RULE: Assign ::= VarUse '=' Expr COMPUTE
  VarUse.Lhs = 1;
  Expr.VarAccess = Assign.VarAccess;
  VarUse.VarAccess = Expr.VarAccess;
  Assign.VarAccess = VarUse.VarAccess;
END;
This macro is invoked in definition 14.

1.2 Further Questions

1. Do you know a library module role that supports the check for undefind variables?

2. Do you know a library module that supports the check for multiple definitions?

3. How did you solve the problem that a variable use in the right-hand side of an assignment has to be considered before the left-hand side? Do you see alternative solutions?

1.3 Output files

vars.specs[13]==
Name analysis module instance[3]
This macro is attached to an output file.
vars.lido[14]==
Name analysis module roles[4]
Undefined variables[5]
Multiple definitions[8]
Uses of variables[10]
Variable access computation[12]
This macro is attached to an output file.
vars.pdl[15]==
Multiple definitions property[6]
Used property[9]
Is written property[11]
This macro is attached to an output file.
vars.head[16]==
Definition state macros[7]
This macro is attached to an output file.