C-3.1 3. Code Generation

Input: program in intermediate language

Tasks:
* Storage mapping: properties of program objects (size, address) in the definition module
* Code selection: generate instruction sequence, optimizing selection
* Register allocation: use of registers for intermediate results and for variables

Output: abstract machine program, stored in a data structure

Design of code generation:
* analyze properties of the target processor
* plan storage mapping
* design at least one instruction sequence for each operation of the intermediate language

Implementation of code generation:
* Storage mapping: a traversal through the program and the definition module computes sizes and addresses of storage objects
* Code selection: use a generator for pattern matching in trees
* Register allocation: methods for expression trees, basic blocks, and for CFGs

(c) 2002 bei Prof. Dr. Uwe Kastens
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 301

Objectives: Overview on design and implementation
In the lecture:
* Identify the 3 main tasks.
* Emphasize the role of design.
Suggested reading: Kastens / Übersetzerbau, Section 7
--------------------------------------------------------------------------------
C-3.2 3.1 Storage Mapping

Objective: for each storable program object compute storage class, relative address, size
Implementation: use properties in the definition module; traverse defined program objects

Design the use of storage areas:
* code storage: program code
* global data: to be linked for all compilation units
* run-time stack: activation records for function calls
* heap: storage for dynamically allocated objects, garbage collection
* registers: for addressing of storage areas (e.g. stack pointer), function results, arguments, local variables, intermediate results (register allocation)

Design the mapping of data types (next slides)
Design activation records and translation of function calls (next section)
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 302

Objectives: Design the mapping of the program state onto the machine state
In the lecture: Explain storage classes and their use
Suggested reading: Kastens / Übersetzerbau, Section 7.2
--------------------------------------------------------------------------------
C-3.3 Storage Mapping for Data Types

Basic types (arithmetic, boolean, character types): match language requirements and machine properties: data format, available instructions, size and alignment in memory

Structured types: for each type define the representation in memory and code sequences for the operations, e.g. assignment, selection, ...
* record: relative address and alignment of components; reorder components for optimization
* union: storage overlay, tag field for discriminated union
* set: bit vectors, set operations
* arrays and functions: see next slides
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 303

Objectives: Overview on type mapping
In the lecture: The topics on the slide are explained. Examples are given.
* Give examples for the mapping of arithmetic types.
* Explain alignment of record fields.
* Explain overlay of union types.
* Discuss a recursive algorithm for type mapping that traverses type descriptions.
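Such a recursive type-mapping algorithm can be sketched as follows (a minimal illustration, not from the slides; the type classes, the example sizes, and the round-up padding rule are assumptions):

```python
# Sketch of recursive type mapping: traverse a type description and compute
# (size, alignment); for records, also assign relative addresses with padding.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Basic:
    size: int
    align: int

@dataclass
class Array:
    elem: object
    count: int

@dataclass
class Record:
    fields: List[Tuple[str, object]]

def layout(t):
    """Return (size, alignment) of type t; records get an offsets table."""
    if isinstance(t, Basic):
        return t.size, t.align
    if isinstance(t, Array):
        esz, eal = layout(t.elem)
        stride = -(-esz // eal) * eal        # element size rounded up to alignment
        return stride * t.count, eal
    if isinstance(t, Record):
        offset, align = 0, 1
        t.offsets = {}
        for name, ftype in t.fields:
            fsz, fal = layout(ftype)
            offset = -(-offset // fal) * fal  # pad to the field's alignment
            t.offsets[name] = offset          # relative address of the component
            offset += fsz
            align = max(align, fal)
        return -(-offset // align) * align, align  # pad record to its alignment

CHAR, INT = Basic(1, 1), Basic(4, 4)
r = Record([("tag", CHAR), ("val", INT)])
print(layout(r))    # (8, 4): 3 padding bytes after 'tag'
print(r.offsets)    # {'tag': 0, 'val': 4}
```

Reordering the components (here: placing `val` before `tag`) would not shrink this record, but in general sorting fields by decreasing alignment is the optimization the slide alludes to.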
Suggested reading: GdP slides on data types
--------------------------------------------------------------------------------
C-3.4 Array Implementation: Pointer Trees

An n-dimensional array
  a: array[l1..u1, l2..u2, ..., ln..un] of real;
is implemented by a tree of linear arrays: n-1 levels of pointer arrays, and data arrays on the n-th level.

[Figure: a 3-dimensional array as a tree; a pointer array with index range l1..u1 points to pointer arrays with index range l2..u2, which point to data arrays with index range l3..u3]

Each single array can be allocated separately and dynamically; the arrays may be scattered in memory.
In Java arrays are implemented this way.
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 304

Objectives: Understand implementation variant
In the lecture: Aspects of this implementation variant are explained:
* allocation by need,
* non-orthogonal arrays,
* additional storage for pointers,
* costly indirect access
Assignments: Allocate an array in Java that has the shape of a pyramid. How many pointer and data cells are needed?
--------------------------------------------------------------------------------
C-3.5 Array Implementation: Contiguous Storage

An n-dimensional array
  a: array[l1..u1, l2..u2, ..., ln..un] of real;
is mapped to one contiguous storage area, linearized in row-major order.

Linear storage map of array a onto a byte-array store from index start:
  occupied storage:       store[start] ... store[start + elno*elsz - 1]
  number of elements:     elno = st1 * st2 * ... * stn
  i-th index stride:      sti = ui - li + 1
  element size in bytes:  elsz

Index map of a[i1, i2, ..., in]:
  store[start + ((..((i1-l1)*st2 + (i2-l2))*st3 + ..)*stn + (in-ln)) * elsz]
= store[const + ((..(i1*st2 + i2)*st3 + ..)*stn + in) * elsz]
where const collects all terms that do not depend on the index values.
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 305

Objectives: Understand implementation variant
In the lecture: Aspects of this implementation variant are explained:
* Give an example for a 3-dimensional array.
* Explain the index function.
* Explain the index function with constant terms extracted.
* Compare the two array implementation variants:
  * allocation in one chunk,
  * orthogonal arrays only,
  * storage only for data elements,
  * efficient direct addressing.
* FORTRAN: column-major order!
Suggested reading: GdP slides on data types
Questions:
* What information is needed in an array descriptor for a dynamically allocated multi-dimensional array?
--------------------------------------------------------------------------------
C-3.6 Functions as Data Objects

Functions may occur as data objects:
* variables
* parameters
* function results
* lambda expressions (in functional languages)

Functions that are defined on the outermost program level (non-nested) can be implemented by just the address of the code.

Functions that are defined in nested structures have to be implemented by a pair (closure, code). The closure contains all bindings of names to variables or values that are valid when the function definition is executed. In run-time stack implementations the closure is a sequence of activation records on the static predecessor chain.
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 306

Objectives: Understand the concept of closure
In the lecture: The topics on the slide are explained:
* examples for functions as data objects,
* recall functional programming (GdP),
* closures as a sequence of activation records,
* relate closures to run-time stacks
Suggested reading: GdP slides on run-time stack
Questions:
* Why must a functional parameter in Pascal be represented by a pair (closure, code)?
--------------------------------------------------------------------------------
C-3.7 3.2 Run-Time Stack

The run-time stack contains one activation record for each active function call.
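The (closure, code) pairs of C-3.6 can be illustrated in a language whose activation records live on the heap (a sketch; Python's built-in closure mechanism stands in for the slide's explicit pair, and the variable names are invented):

```python
# A nested function must carry more than a code address: it needs the binding
# of i that was valid when its definition was executed -- its closure.
def h():
    i = 10          # binding stored in h's activation record

    def r():        # nested function: accesses i through its closure
        return i + 1

    return r        # the pair (closure, code) escapes the call of h

f = h()
print(f())                              # 11: the binding of i is still alive
print(f.__closure__[0].cell_contents)   # 10: the captured binding itself
```

Note that `f` is called after `h` has returned, so a pure run-time stack could not hold the closure any more; this is exactly the violation discussed on slide C-3.10, and the reason functional languages keep activation records on the heap.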
Activation record: provides storage for the data of a function call:
  parameters
  static link
  return address
  dynamic link
  local variables
  register save area

dynamic link: link from callee to caller, i.e. to the preceding record on the stack
static link: link from a callee record c to the record s in which c's function is defined; s is a call of the function which contains the definition of the function whose call created c.

Variables of surrounding functions are accessed via the static predecessor chain. Only relevant for languages which allow nested functions, classes, objects.

closure of a function call: the activation records on the static predecessor chain
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 307

Objectives: Understand activation records
In the lecture: Explain
* static and dynamic links,
* nesting and closures,
* return address.
See C-3.10 for relation to call code.
--------------------------------------------------------------------------------
C-3.8 Example for a Run-Time Stack

Run-time stack: a call creates an activation record and pushes it onto the stack (push); the record is popped on termination of the call (pop).

Example program with nested functions:
  h { float a;
      q { int i;
          r { int b; b = i + 1; }
          if (..) q();
          r();
        }
      q();
    }

[Figure: run-time stack containing the records h, q1, q2, q3, r3; the static links of q1, q2, q3 point to h, the static link of r3 points to q3]

The static link points to the activation record where the called function is defined, e.g. the static link of r3 points to q3.

Optimization: activation records of non-recursive functions may be allocated statically. Languages without recursive functions (FORTRAN) do not need a run-time stack.
Parallel processes, threads, and coroutines each need a separate run-time stack.
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 308

Objectives: Understand run-time stacks
In the lecture:
* Explain static links.
* Explain nesting and closures.
Questions:
* Why do threads need a separate run-time stack?
--------------------------------------------------------------------------------
C-3.9 Not-Most-Recent Property

The static link of an activation record c for a function r points to an activation record d for the function q in which r is defined. If there are activation records for q on the stack that were created more recently than d, the static link to d is not-most-recent. That effect can be achieved by using functional parameters or variables.

Example (r is passed as a functional parameter):
  h { float a;
      q (funct f) { int i;
          r { int b; b = i + 1; }
          if (..) q(r);
          *f();
        }
      q(q);
    }

[Figure: run-time stack h, q1 (f = q), q2 (f = r[1]), q3 (f = r[2]), r[2]; the static link of r[2] points to q2 -- not-most-recent, since q3 is a more recent record for q]
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 309

Objectives: Really understand static links
In the lecture:
* Explain the not-most-recent property.
* r[1] and r[2] must be represented by different values, because they have different closures.
--------------------------------------------------------------------------------
C-3.10 Closures on Run-Time Stacks

Function calls can be implemented by a run-time stack if the closure of a function is still on the run-time stack when the function is called.

Example for a violation:
  h { float a;
      q { int i;
          r { int b; b = i + 1; }
          return r;
        }
      *(q())();
    }

During the call of q the stack contains the records of h and q; when the result of q() is called, the record of q has already been popped -- the closure for the call of r is missing.

Language conditions to guarantee run-time stack discipline:
* Pascal: functions are not allowed as function results or variables
* C: no nested functions
* Modula-2: nested functions are not allowed as values of variables

Functional languages maintain activation records on the heap instead of the run-time stack.
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 310

Objectives: Language conditions for run-time stacks
In the lecture:
* Explain the language restrictions which ensure that necessary closures are on the run-time stack.
Questions:
* Explain why C, Pascal, and Modula-2 obey the requirement of stack discipline.
--------------------------------------------------------------------------------
C-3.11 Activation Records and Call Code

Call code and function code cooperate in building and removing the activation record:

  call code (in the caller):
      push parameter values
      push static link
      subroutine jump (pushes the return address)
  function code (in the callee):
      push dynamic link
      stack register := top of stack
      increment top of stack for local variables
      save registers
      ... function body ...
      restore registers
      deallocate local variables
      pop stack register (dynamic link)
      return jump
  call code (in the caller):
      pop static link
      pop parameter area
      use and pop result

Activation record layout: result, parameters, static link, and return address lie at negative relative addresses; the dynamic link at relative address 0 is the base address held in the stack register; local variables and the register save area lie at positive relative addresses.
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 311

Objectives: Relation between activation record and call code
In the lecture: Explain
* the contents of the records,
* how to save registers,
* relative addresses of data in the activation record,
* register windowing related to run-time stacks
Suggested reading: Kastens / Übersetzerbau, Sections 7.2.2, 7.3.1
Questions:
* How would you design the layout of activation records for a processor that provides register windowing?
--------------------------------------------------------------------------------
C-3.12 3.3 Code Sequences for Control Statements

A code sequence defines how a control statement is transformed into jumps and labels.
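As a concrete illustration, such a transformation can be written as a tiny generator for the if-else statement (a sketch; the mnemonic `braFalse` and the label scheme M1, M2, ... are assumptions):

```python
# Emit the code sequence  Code(cond,false,M1); Code(ST); goto M2; M1: Code(SE); M2:
# Conditions and statement bodies are given as already-generated instruction lists.
from itertools import count

labels = count(1)                        # fresh-label supply: M1, M2, ...

def code_if(cond, then_code, else_code):
    m1, m2 = f"M{next(labels)}", f"M{next(labels)}"
    return ([f"braFalse {cond}, {m1}"]   # Code(cond, false, M1)
            + then_code                  # Code(ST)
            + [f"goto {m2}", f"{m1}:"]
            + else_code                  # Code(SE)
            + [f"{m2}:"])

print(code_if("a<b", ["call ST"], ["call SE"]))
# ['braFalse a<b, M1', 'call ST', 'goto M2', 'M1:', 'call SE', 'M2:']
```

Nesting works for free: the instruction lists passed as `then_code`/`else_code` may themselves be results of `code_if`, which mirrors the recursive Code(...) scheme on the next slide.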
Notation of the Code constructs:
  Code (S)            generate code for statements S
  Code (C, true, M)   generate code for condition C such that it branches to M if C is true; otherwise control continues without branching
  Code (A, Ri)        generate code for expression A such that the result is in register Ri

Code sequence for the if-else statement if (cond) ST; else SE;:
      Code (cond, false, M1)
      Code (ST)
      goto M2
  M1: Code (SE)
  M2:
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 312

Objectives: Concept of code sequences for control structures
In the lecture:
* Explain the notation.
* Explain the code sequence for if-else statements.
--------------------------------------------------------------------------------
C-3.13 Short Circuit Translation of Boolean Expressions

Boolean expressions are translated into sequences of conditional branches. Operands are evaluated from left to right until the result is determined.
Example: if a or b and c then ST else SE

2 code sequences for each operator, applied to the condition tree on a top-down traversal:

  Code (A and B, true, M):     Code (A, false, N)
                               Code (B, true, M)
                            N:

  Code (A and B, false, M):    Code (A, false, M)
                               Code (B, false, M)

  Code (A or B, true, M):      Code (A, true, M)
                               Code (B, true, M)

  Code (A or B, false, M):     Code (A, true, N)
                               Code (B, false, M)
                            N:

  Code (not A, X, M):          Code (A, not X, M)

  Code (A < B, true, M):       Code (A, Ri); Code (B, Rj)
                               cmp Ri, Rj
                               braLt M

  Code (A < B, false, M):      Code (A, Ri); Code (B, Rj)
                               cmp Ri, Rj
                               braGe M

Code for a leaf: a conditional jump
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 313

Objectives: Special technique for the translation of conditions
In the lecture:
* Explain the transformation of conditions.
* Use the example of C-3.14.
* Use 2 inherited attributes for the target label and the case when to branch.
* Discuss whether the technique may be applied for C, Pascal, and Ada.
Suggested reading: Kastens / Übersetzerbau, Section 7.3.3
Questions:
* Why does the transformation of conditions reduce code size?
* How is the technique described by an attribute grammar?
* Why is no instruction generated for the operator not?
* Discuss whether the technique may or must be applied for C, Pascal, and Ada.
--------------------------------------------------------------------------------
C-3.14 Example for Short Circuit Translation

Condition: if a or b and c then ST else SE
The condition tree is traversed top-down with two inherited attributes: the target label and the case when to branch.

[Figure: condition tree of a or (b and c); the or node inherits (false, M1), its operand a inherits (true, N), the and node and its operands b and c inherit (false, M1)]

Generated code:
      load a, R1
      braNe N
      load b, R1
      braEq M1
      load c, R1
      braEq M1
  N:  Code (ST)
      goto M2
  M1: Code (SE)
  M2:
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 314

Objectives: Illustrate short circuit translation
In the lecture: Discuss together with C-3.13
Suggested reading: Kastens / Übersetzerbau, Section 7.3.3
--------------------------------------------------------------------------------
C-3.15 Code Sequences for Loops

While-loop variant 1: while (Condition) Body
  M1: Code (Condition, false, M2)
      Code (Body)
      goto M1
  M2:

While-loop variant 2: while (Condition) Body
      goto M2
  M1: Code (Body)
  M2: Code (Condition, true, M1)

Pascal for-loop, unsafe variant: for i := Init to Final do Body
      i = Init
  L:  if (i > Final) goto M
      Code (Body)
      i++
      goto L
  M:

Pascal for-loop, safe variant: for i := Init to Final do Body
      i = Init
      if (i == minint) goto L
      i = i - 1
      goto N
  L:  Code (Body)
  N:  if (i >= Final) goto M
      i++
      goto L
  M:
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 315

Objectives: Understand loop code
In the lecture:
* Explain the code sequences for while-loops.
* Discuss the two variants.
* Explain the code sequences for for-loops.
* Variant 1 may cause an exception if Final evaluates to maxint.
* Variant 2 avoids that problem.
* Variant 2 needs further checks to avoid an exception if Init evaluates to minint.
* Both variants should not evaluate the Final expression on every iteration.
Questions:
* What are the advantages or problems of each alternative?
--------------------------------------------------------------------------------
C-3.16 3.4 Code Selection

* Given: tree in intermediate language.
* Optimizing selection: select patterns that translate single nodes or small subtrees into machine instructions; cover the whole tree with as few instructions as possible.
* Method: tree pattern matching; several techniques.

Example: the assignment ... = a[i].s;
Assumptions: R6 points to the current activation record; the relative address of a is 12; the induction variable i is substituted by ix with relative address 8; the record element s has relative address 6.

[Figure: intermediate-language tree for the assignment, built from assign, cont, addradd, addr, and const nodes, covered by instructions in two different ways]

  cheap covering:           expensive covering:
    load (R6,8), R1           load (R6,8), R1
    add R6,R1,R2              add R6,R1,R2
    store (R2,18), ...        move 6,R3
  cost: 3 instructions        add R2,R3,R4
                              load (R4,12),R5
                              store R5, ...
                            cost: 6 instructions
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 316

Objectives: Understand the task
In the lecture: The topics on the slide are explained. Examples are given.
* The task is explained.
* Example: code of different cost for the same tree.
--------------------------------------------------------------------------------
C-3.17 Selection Technique: Value Descriptors

Intermediate language tree: node operators, e.g.:
  addr      address of a variable
  const     constant value c
  cont      load the contents of an address
  addradd   address + value

Value descriptors state how/where the value of a tree node is represented, e.g.:
  Ri      value in register Ri
  c       constant value c
  Ri,c    address Ri + c
  (adr)   contents at the address adr

Alternative translation patterns are to be selected context-dependently, e.g.:
  addradd (Ri,c1) c2  -> Ri,c1+c2   ./.  (no code, deferred operation)
  addradd Ri Rj       -> Rk         add Ri, Rj, Rk
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 317

Objectives: Notion of value descriptors
In the lecture:
* Explain value descriptors.
* Explain alternative translation patterns.
* Concept of deferred operations.
* Different costs of translations.
* Compare with the concept of overloaded operators: here, selection by the kind of value descriptor.
Suggested reading: Kastens / Übersetzerbau, Section 7.3.4
Questions:
* How is the technique related to overloaded operators in source languages?
--------------------------------------------------------------------------------
C-3.18 Example for a Set of Translation Patterns

  #   operator  operands      result       code
  1   addr      Ri, c      -> Ri,c         ./.
  2   const     c          -> c            ./.
  3   const     c          -> Ri           move c, Ri
  4   cont      Ri, c      -> (Ri, c)      ./.
  5   cont      Ri         -> (Ri)         ./.
  6   cont      Ri, c      -> Rj           load (Ri, c), Rj
  7   cont      Ri         -> Rj           load (Ri), Rj
  8   addradd   Ri c       -> Ri, c        ./.
  9   addradd   Ri,c1 c2   -> Ri, c1+c2    ./.
  10  addradd   Ri Rj      -> Rk           add Ri, Rj, Rk
  11  addradd   Ri,c Rj    -> Rk, c        add Ri, Rj, Rk
  12  assign    Ri Rj      -> void         store Rj, Ri
  13  assign    Ri (Rj,c)  -> void         store (Rj,c), Ri
  14  assign    Ri,c Rj    -> void         store Rj, Ri,c
  (./. : no code is emitted; the operation is deferred)
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 318

Objectives: Example
In the lecture:
* Explain the meaning of the patterns.
* Use the example for the tree of C-3.19.
Suggested reading: Kastens / Übersetzerbau, Section 7.3.4
--------------------------------------------------------------------------------
C-3.19 Tree Covered with Translation Patterns

Tree for the assignment ... = a[i].s;

[Figure: the assignment tree with the applicable patterns annotated at each node, e.g. pattern 1 at the addr leaves, patterns 4 and 6 at the cont nodes, patterns 9 and 11 at the addradd nodes, patterns 2 and 3 at the const node, patterns 12 and 13 at the assign root; two alternative coverings]

  cheap covering:           expensive covering:
    load (R6,8), R1           load (R6,8), R1
    add R6,R1,R2              add R6,R1,R2
    store (R2,18), ...        move 6,R3
  cost: 3 instructions        add R2,R3,R4
                              load (R4,12),R5
                              store R5, ...
                            cost: 6 instructions
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 319

Objectives: Example for pattern applications
In the lecture:
* Show applications of patterns.
* Show alternatives and differences.
* Explain costs accumulated for subtrees.
* Compose code in execution order.
--------------------------------------------------------------------------------
C-3.20 Pattern Selection

Pass 1, bottom-up: annotate the nodes with sets of pairs
  { (v, c) | v is a kind of value descriptor that an applicable pattern yields,
             c are the accumulated subtree costs }
If the set contains (v, c1) and (v, c2), keep only the cheaper pair.

Pass 2, top-down: select for each node the cheapest pattern that fits the selection made above.

Pass 3, bottom-up: emit code.

[Figure: the assignment tree annotated with such sets, e.g. {((Ri+c), 2), (Ri, 3)} at a cont node and {(void, lhscost+3)} at the assign root; the selected covering emits load (R6,8), R1; add R6,R1,R2; store (R2,18), ... at a cost of 3 instructions]

Improved technique: relative costs per set => finite number of potential sets; integer encoding of the sets at generation time.
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 320

Objectives: 2-pass selection algorithm
In the lecture:
* Explain the role of the pairs and sets.
* Show the selection using the following pdf file: an example for pattern selection
* Overloading resolution in Ada is performed in a similar way (without costs).
--------------------------------------------------------------------------------
C-3.21 Pattern Matching in Trees: Bottom-up Rewrite

Bottom-up Rewrite Systems (BURS): a general approach of the pattern matching method.

The specification is given in the form of tree patterns, similar to C-3.18 - C-3.20. The set of patterns is analyzed at generation time. The generator produces a tree automaton with a finite set of states. On the bottom-up traversal it annotates each tree node with a set of states: those selection decisions which may lead to an optimal solution. Decisions are made on the basis of the costs of subtrees rather than the costs of single nodes.

Generator: BURG
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 321

Objectives: Get an idea of the BURS method
In the lecture:
* Explain the basic ideas of BURS.
* Compare it to the previous technique.
* It decides on the basis of subtree costs.
* Very many similar patterns are needed.
Suggested reading: Kastens / Übersetzerbau, Section 7.4.3
Questions:
* In what sense must the specification be complete?
--------------------------------------------------------------------------------
C-3.22 Tree Pattern Matching by Parsing

The tree is represented in prefix form. Translation patterns are specified by tuples (CFG production, code, cost); the value descriptors are the nonterminals of the grammar, e.g.

  8   RegConst ::= addradd Reg Const             nop                     0
  11  RegConst ::= addradd RegConst Reg          add Ri, Rj, Rk          1

Deeper patterns allow for more effective optimization:

  Void ::= assign RegConst addradd Reg Const     store (Ri,c1), (Rj,c2)  1

Parsing with an ambiguous CFG: the application of a production is decided on the basis of the production costs rather than the accumulated subtree costs!

Technique: "Graham, Glanville". Generators: GG, GGSS
--------------------------------------------------------------------------------
Lecture Compilation Methods SS 2011 / Slide 322

Objectives: Understand the parsing approach
In the lecture: Explain
* how a parser performs tree matching,
* that the parser decides on the basis of production costs,
* that the grammar must be complete,
* that very many similar patterns are needed.
Suggested reading: Kastens / Übersetzerbau, Section 7.4.3
Questions:
* In what sense must the grammar be complete? What happens if it is not?
* Why is it desirable that the grammar is ambiguous?
* Why is BURS optimization more effective?
--------------------------------------------------------------------------------
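The bottom-up cost annotation of C-3.20 (pass 1) can be sketched as follows (a simplified illustration; the descriptor names and the tiny pattern subset are loosely modeled on the table of C-3.18, not taken from it verbatim):

```python
# Pass 1 of cost-driven pattern selection: annotate each tree node with a
# mapping {result descriptor kind: cheapest accumulated cost}.
# A node is (operator, child, ...); a leaf is given directly as its descriptor kind.

PATTERNS = [
    # (operator, operand descriptor kinds, result kind, instruction cost)
    ("cont",    ("addr+c",),      "addr+c_ind", 0),  # deferred: (Ri,c), no code
    ("cont",    ("addr+c",),      "reg",        1),  # load (Ri,c), Rj
    ("addradd", ("reg", "const"), "addr+c",     0),  # deferred: Ri,c, no code
    ("addradd", ("reg", "reg"),   "reg",        1),  # add Ri, Rj, Rk
]

def annotate(node):
    """Bottom-up: return the set of (descriptor kind -> cheapest cost) pairs."""
    if isinstance(node, str):                  # leaf: its descriptor, cost 0
        return {node: 0}
    op, *children = node
    child_sets = [annotate(c) for c in children]
    best = {}
    for pop, kinds, result, cost in PATTERNS:
        if pop != op or len(kinds) != len(child_sets):
            continue
        if all(k in s for k, s in zip(kinds, child_sets)):
            total = cost + sum(s[k] for k, s in zip(kinds, child_sets))
            if total < best.get(result, 10**9):    # keep only the cheaper pair
                best[result] = total
    return best

# tree for cont(addradd(reg, const)): an indexed load
tree = ("cont", ("addradd", "reg", "const"))
print(annotate(tree))   # {'addr+c_ind': 0, 'reg': 1}: deferred access vs. load
```

The top-down pass 2 would then start from the cheapest pair at the root and, at each node, commit to the pattern that produced the selected pair; deferring the `addradd` (cost 0) is exactly what makes the cheap 3-instruction covering of C-3.19 win.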