CI-1 Compiler I (dt. Übersetzer I) Prof. Dr. Uwe Kastens Winter 2001/2002 (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 01 In the lecture: Welcome to the lecture! -------------------------------------------------------------------------------- CI-2 Objectives The participants are taught to * understand fundamental techniques of language implementation, * use generating tools and standard solutions, * understand compiler construction as a systematic combination of algorithms, theories and software engineering methods for the solution of a precisely specified task, * apply compiler techniques for languages other than programming languages. Forms of teaching: Lectures Tutorials Exercises Homeworks Running project (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 02 Objectives: Understand the objectives of the course. In the lecture: The objectives are explained. Questions: * What are your objectives? * Do they match with these? * When did you last listen to a talk given in English? -------------------------------------------------------------------------------- CI-3 Lectures in English Some agreements about giving lectures in English: * I'll speak English unless someone asks me to explain something in German. * Stop me or slow me down whenever you get lost. * I don`t speak as well as a native speaker; but I'll do my best ... * You may ask questions and give answers in English or in German. * I'll prepare the slides in English. A German version is available. * You`ll have to learn to speak about the material in at least one of the two languages. * You may vote which language to be used in the tutorials. * You may chose German or English for the oral exam. (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 03 Objectives: Clarification about the use of the English language in this course In the lecture: The topics on the slide are discussed. -------------------------------------------------------------------------------- CI-4 Syllabus Week Chapter Topic 1 Introduction Compiler tasks 2 Compiler structure 3 Lexical analysis Scanning, token representation 4 Syntactic analysis Recursive decent parsing 5 LR Parsing 6 Parser generators 7 Grammar design 8 Semantic analysis Attribute grammars 9 Attribute grammar specifications 10 Name analysis 11 Type analysis 12 Transformation Intermediate language, target trees 13 Target texts 14 Synthesis Overview 15 Summary (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 04 Objectives: Overview over the topics of the course In the lecture: Comments on the topics. -------------------------------------------------------------------------------- CI-5 Prerequisites from Lecture Topic here needed for Foundations of Programming Languages: 4 levels of language properties Compiler tasks, compiler structure Context-free grammars Syntactic analysis Scope rules Name analysis Data types Type analysis Lifetime, runtime stack Storage model, code generation Modeling: Finite automata Lexical analysis Context-free grammars Syntactic analysis -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 05 Objectives: Identify concrete topics of other courses In the lecture: Point to material to be used for repetition Suggested reading: Course material for Foundations of Programming Languages Course material for Modeling Questions: * Do you have the prerequisites? * Are you going to learn or to repeat that material? -------------------------------------------------------------------------------- CI-6 References Material for this course Compiler I: http://www.uni-paderborn.de/cs/ag-kastens/compi in German Übersetzer I (1999/2000): http://www.uni-paderborn.de/cs/ag-kastens/uebi in English Compiler II: http://www.uni-paderborn.de/cs/ag-kastens/uebii Modellierung: http://www.uni-paderborn.de/cs/ag-kastens/model Grundlagen der Programmiersprachen: http://www.uni-paderborn.de/cs/ag-kastens/gdp U. Kastens: Übersetzerbau, Handbuch der Informatik 3.3, Oldenbourg, 1990 (not available on the market anymore, available in the library of the University) W. M. Waite, L. R. Carter: An Introduction to Compiler Construction, Harper Collins, New York, 1993 W. M. Waite, G. Goos: Compiler Construction, Springer-Verlag, 1983 R. Wilhelm, D. Maurer: Übersetzerbau - Theorie, Konstruktion, Generierung, Springer-Verlag, 1992 A. Aho, R. Sethi, J. D. Ullman: Compilers - Principles, Techniques and Tools, Addison-Wesley, 1986 A. W. Appel: Modern Compiler Implementation in C, Cambridge University Press, 1997 (available for Java and for ML, too) (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 06 Objectives: Useful references for the course In the lecture: Comments of the course material and books * The material for this course is being translated from the material of "Übersetzer I (WS 1999/2000)" while the course is given * The course "Compiler II" will follow next semester. Questions: * Find the material in the Web, get used to its structure, place suitable bookmarks. -------------------------------------------------------------------------------- CI-7 Course material in the Web (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 07 Objectives: The root page of the course material. In the lecture: The navigation structure is explained. Assignments: Explore the course material. -------------------------------------------------------------------------------- CI-7a Commented slide in the course material (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 07a Objectives: A slide of the course material. In the lecture: The comments are explained. Assignments: Explore the course material. -------------------------------------------------------------------------------- CI-8 What does a compiler compile? A compiler transforms correct sentences of its source language into sentences of its target language such that their meaning is unchanged. Examples: Source language: Target language: Programming language Machine language C++ Sparc code Programming language Abstract machine Java Java Bytecode Programming language Programming language (source-to-source) C++ C Application language Application language LaTeX HTML Data base language (SQL) Data base system calls (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 08 Objectives: Variety of compiler applications In the lecture: Explain examples for pairs of source and target languages. Suggested reading: Kastens / Übersetzerbau, Section 1. Assignments: * Find more examples for application languages. * Exercise 3 Recognize patterns in the target programs compiled from simple source programs. Questions: What are reasons to compile into other than machine languages? -------------------------------------------------------------------------------- CI-9 What is compiled here? class Average class Average { private: { private int sum, count; int sum, count; public: public Average (void) Average () { sum = 0; count = 0; } { sum = 0; count = 0; } void Enter (int val) void Enter (int val) { sum = sum + val; count++; } { sum = sum + val; count++; } float GetAverage (void) float GetAverage () { return sum / count; } { return sum / count; } }; };--------------- -------------- 1: Enter: (int) --> void _Enter__7Averagei: Access: [] pushl \%ebp Attribute quotesinglbase Code` (Length 49) movl \%esp,\%ebp Code: 21 Bytes Stackdepth: 3 Locals: 2 movl 8(\%ebp),\%edx 0: aload_0 movl 12(\%ebp),\%eax 1: aload_0 addl \%eax,(\%edx) 2: getfield cp4 incl 4(\%edx) 5: iload_1 L6: 6: iadd movl \%ebp,\%esp 7: putfield cp4 popl \%ebp 10: aload_0 ret 11: dup 12: getfield cp3 15: iconst_1 16: iadd (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 09 Objectives: Recognize examples for compilations In the lecture: Anwer the questions below. Questions: * Which source and target language are shown here? * How did you recognize them? -------------------------------------------------------------------------------- CI-10 What is compiled here? program Average; \documentstyle[12pt]{article} var sum, count: integer; \begin{document} aver: integer; \section{Introduction} procedure Enter (val: integer); This is a very short document. begin sum := sum + val; It just shows count := count + 1; \begin{itemize} \item an item, and end; \item another item. begin \end{itemize} sum := 0; count := 0; \end{document} Enter (5); Enter (7); aver := sum div count; ------------- end. ----------- \%\%Page: 1 1 void ENTER_5 (char *slnk , int VAL_4) 1 0 bop 164 315 a Fc(1)81 { b(In)n(tro)r(duction) 164 425 y Fb(This)16 {/* data definitions: */ b(is)g(a)h(v)o(ery)e(short) /* executable code: */ i(do)q(cumen)o(t.)j(It)c(just)g { (sho)o(ws)237 527 y Fa(\017)24 b SUM_1 = (SUM_1)+(VAL_4); Fb(an)17 b(item,) COUNT_2 = (COUNT_2)+(1); c(and)237 628 y Fa(\017)24 b ; Fb(another)17 b(item.) } 961 2607 y(1)p }}/* ENTER_5 */ eop (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 10 Objectives: Recognize examples for compilations In the lecture: Anwer the questions below. Questions: * Which source and target language are shown here? * How did you recognize them? -------------------------------------------------------------------------------- CI-11 Languages for specification and modeling SDL (CCITT) UML Specification and Description Language: Unified Modeling Language: (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 11 Objectives: Be aware of specification languages In the lecture: Comments on SDL and UML Suggested reading: Text Questions: What kind of tools are needed for such specification languages? -------------------------------------------------------------------------------- CI-12 Domain Specific Languages (DSL) A language designed for a specific application domain. Application Generator: Implementation of a DSL by a program generator Examples: * Simulation of mechatronic feedback systems * Robot control * Collecting data from instruments * Testing car instruments * Report generator for bibliographies: string name = InString "Which author?"; int since = InInt "Since which year?"; int cnt = 0; "\nPapers of ", name, " since ", since, ":\n"; [ SELECT name <= Author && since <= Year; U. Kastens: Construction of cnt = cnt + 1; Application Generators Year, "\t", Title, "\n"; Using Eli, ] Workshop on Compiler Techniques for Application "\n", name, " published ", cnt, "papers.\n"; Domain Languages ..., Linköping, April 1996 (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 12 Objectives: Understand DSL by examples In the lecture: Explain the examples Suggested reading: * C.W. Krueger: Software Reuse, ACM Computing Surveys 24, June 1992 * Conference on DSL (USENIX), Santa Babara, Oct. 1997 * ACM SIGPLAN Workshop on DSL (POPL), Paris, Jan 1997 Questions: Give examples for tools that can be used for such languages. -------------------------------------------------------------------------------- CI-13 Programming languages as source or target languages Programming languages as source languages: * Program analysis call graphs, control-flow graph, data dependencies, e. g. for the year 2000 problem * Recognition of structures and patterns e. g. for Reengineering Program languages as target languages: * Specifications (SDL, OMT, UML) * graphic modeling of structures * DSL, Application generator => Compiler task: Source-to-source compilation (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 13 Objectives: Understand programming languages in different roles In the lecture: * Comments on the examples * Role of program analysis in software engineering * Role of Source-to-source compilation in software engineering Questions: Give examples for the use of program analysis in software engineering. -------------------------------------------------------------------------------- CI-14 Semester project as running example A Structure Generator We are going to develop a tool that implements record structures. In particular, the structure generator takes a set of record descriptions. Each specifies a set of named and typed fields. For each record a Java class declaration is to be generated. It contains a constructor method and access methods for the specified record fields. The tool will be used in an environment where field description are created by other tools, which for example analyze texts for the occurrence of certain phrases. Hence, the descriptions of fields may occur in arbitrary order, and the same field may be described more than once. The structure generator accumulates the field descriptions such that for each record a single class declaration is generated which has all fields of that record. Design a domain specific language. Implement an application generator for it. Apply all techniques of the course that are useful for the task. (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 14 Objectives: Get an idea of the task In the lecture: * Comment the task description. * Explain the role of the running example. Assignments: In the tutorial * Discuss the task description. * Explain the purpose of such a generator. * Give examples for its input and output. * What are the consequences of the second paragraph of the task description? * Discuss variants of the input. -------------------------------------------------------------------------------- CI-15 Meaning preserving transformation A compiler transforms correct sentences of its source language into sentences of its target language such that their meaning is unchanged. Meaning Language definition described for Source language abstract machine Compilation same results Target language Execution Machine description A meaning is defined only for correct programs. Compiler task: Error handling The compiler analyses static properties of the program at compile time, e. g. definitions of Variables, types of expressions. Decides: Is the program compilable? Dynamic properties of the program are checked at runtime, e. g. indexing of arrays. Decides: Is the program executable? But in Java: Compilation of bytecode at runtime, just in time compilation (JIT) (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 15 Objectives: Understand fundamental notions of compilation In the lecture: The topics on the slide are explained. Examples are given. * Explain the role of the arcs in the commuting diagram. * Distinguish compile time and run-time concepts. * Discuss examples. -------------------------------------------------------------------------------- CI-16 Example: Tokens and structure Character sequence int count = 0; double sum = 0.0; while (count 20 getstatic #4 11 iload_1 23 if_icmplt 7 (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 17 Objectives: Get an idea of the name analysis and transformation task In the lecture: Some requirements for these tasks are discussed along the example: * program objects and their properties, * program constructs and their types * target program Questions: * Why is the name (e.g. count) a property of a program object (e.g. k1)? * Can you impose some structure on the target code? -------------------------------------------------------------------------------- CI-18 Language definition - Compiler task * Notation of tokens lexical analysis keywords, identifiers, literals formal definition: regular expressions * Syntactic structure syntactic analysis formal definition: context-free grammar * Static semantics semantic analysis, transformation binding names to program objects, typing rules usually defined by informal texts * Dynamic semantics transformation, code generation semantics, effect of the execution of constructs usually defined by informal texts in terms of an abstract machine * Definition of the target language (machine) transformation, code generation assembly (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 18 Objectives: Relate language properties to levels of definitions In the lecture: * These are prerequisites of the course "Grundlagen der Programmiersprachen" (see course material GdP-13, GdP13a). * Discuss the examples of the preceding slides under these categories. Suggested reading: Kastens / Übersetzerbau, Section 1.2 Assignments: * Exercise 1 Let the compiler produce error messages for each level. * Exercise 2 Relate concrete language properties to these levels. Questions: Some language properties can be defined on different levels. Discuss the following for hypothetical languages: * "Parameters may not be of array type." Syntax or static semantics? * "The index range of an array may not be empty." Static or dynamic semantics? -------------------------------------------------------------------------------- CI-19 Compiler tasks Scanning Lexical analysis Conversion Structuring Parsing Syntactic analysis Tree construction Name analysis Semantic analysis Type analysis Translation Data mapping Transformation Action mapping Execution-order Code generation Register allocation Instruction selection Encoding Instruction encoding Assembly Internal Addressing External Addressing (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 19 Objectives: Task decomposition leads to compiler structure In the lecture: * Explain tasks of the rightmost column. * Relate the tasks to chapters of the course. Suggested reading: Kastens / Übersetzerbau, Section 2.1 Assignments: Learn the German translations of the technical terms. -------------------------------------------------------------------------------- CI-20 Compiler structure and interfaces Source program Lexical analysis Token sequence Syntactic analysis Abstract program tree Semantic analysis Transformation Analysis (frontend) Intermediate language Synthesis (backend) Optimization Code generation Abstract Peephole optimization machine program Assembly Target program (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 20 Objectives: Derive compiler modules from tasks In the lecture: In this course we focus on the analysis phase (frontend). Suggested reading: Kastens / Übersetzerbau, Section 2.1 Assignments: Compare this slide with U-08 and learn the translations of the technical terms used here. Questions: Use this information to explain the example on slide CI-16 -------------------------------------------------------------------------------- CI-21 Software qualities of the compiler * Correctness Translate correct programs correctly. Reject wrong programs and give error messages * Efficiency Storage and time used by the compiler * Code efficiency Storage and time used by the generated code Compiler task: Optimization * User support Compiler task: Error handling (recognition, message, recovery) * Robustness Give a reasonable reaction on every input (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 21 Objectives: Consider compiler as a software product In the lecture: Give examples for the qualities. Questions: Explain: For a compiler the requirements are specified much more precisely than for other software products. -------------------------------------------------------------------------------- CI-22 Strategies for compiler construction * Obey exactly to the language definition * Use generating tools * Use standard components * Apply standard methods * Validate the compiler against a test suite * Verify components of the compiler (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 22 Objectives: Apply software methods for compiler construction In the lecture: It is explained that effective construction methods exists especially for compilers. Questions: What do the specifications of the compiler tasks contribute to more systematic compiler construction? -------------------------------------------------------------------------------- CI-23 Generators Pattern: Environment Specification Generator Implemented algorithm Interfaces Typical compiler tasks solved by generators: Regular expressions Scanner generator Finite automaton Context-free grammar Parser generator Stack automaton Attribute grammar Attribute evaluator Tree walking algorithm generator Code patterns Code selection Pattern matching generator integrated system Eli: Specifications Cooperating generators Compiler (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 23 Objectives: Usage of generators in compiler construction In the lecture: The topics on the slide are explained. Examples are given. Suggested reading: Kastens / Übersetzerbau, Section 2.5 Assignments: * Exercise 5: Find as many generators as possible in the Eli system. -------------------------------------------------------------------------------- CI-24 Environment of compilers Compilation units Source programs Executable program Preprocessor Source program Libraries Interactive commands Core dump Debugger Compiler Input Output Code files Linker Executable program Interpreter Source program Analysis part Input abstract machine Output (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 24 Objectives: Understand the cooperation between compilers and other language tools In the lecture: * Explain the roles of language tools * Explain the flow of information Suggested reading: Kastens / Übersetzerbau, Section 2.4 -------------------------------------------------------------------------------- CI-25 Compilation and interpretation of Java programs Source modules Java Compiler load needed class files Class files dynamically - in Java Bytecode local or via Internet (intermediate language) Interpreter Bytecode prozessor Java Virtual Machine Class in software JVM loader Just-In-Time Compiler (JIT) Machine code Input Output (c) 2001 bei Prof. Dr. Uwe Kastens -------------------------------------------------------------------------------- Lecture Compiler I WS 2001/2002 / Slide 25 Objectives: Special situation for Java In the lecture: Explain the role of the absctract machine JVM: * Interpretation of bytecode. * Compile and optimize while executing the program. * Load class files while executing the program. Questions: * explain why the JVM can not rely on the type checks made by the compiler. --------------------------------------------------------------------------------