Code generation can be considered the final phase of compilation. Optimization can still be applied after code generation, but that can be seen as part of the code generation phase itself. The code generated by the compiler is object code in some lower-level programming language, for example, assembly language. We have seen that source code written in a higher-level language is transformed into a lower-level language, resulting in lower-level object code, which should have the following minimum properties:

It should carry the exact meaning of the source code.
It should be efficient in terms of CPU usage and memory management.

We will now see how the intermediate code is transformed into target object code (assembly code, in this case).

Directed Acyclic Graph

A Directed Acyclic Graph (DAG) is a tool that depicts the structure of basic blocks, shows how values flow through them, and exposes opportunities for optimization. A DAG makes transformations on basic blocks easy to carry out.

Leaf nodes represent identifiers, names or constants.
Interior nodes represent operators.
Interior nodes also represent the results of expressions or the identifiers/names where the values are to be stored or assigned.

Peephole Optimization

Peephole optimization works locally on the code to transform it into optimized code. By locally, we mean a small portion of the code block at hand. These methods can be applied to intermediate code as well as to target code. A bunch of statements is analyzed and checked for the following possible optimizations:

Redundant instruction elimination

At source code level, the user can remove redundant statements by hand. At compilation level, the compiler searches for instructions that are redundant in nature. A sequence of loads and stores may carry the same meaning even if some of the instructions are removed. In that case the redundant instruction is deleted and the remaining instruction is rewritten to do the work of both.

Unreachable code

Unreachable code is a part of the program code that is never accessed because of programming constructs. Programmers may have accidentally written a piece of code that can never be reached. For example, a printf statement placed after a return statement will never be executed, because program control returns before it can execute; hence the printf can be removed.

Flow of control optimization

There are instances in code where program control jumps back and forth without performing any significant task. For example, if label L1 does nothing but pass control to L2, then L1 can be removed: instead of jumping to L1 and then to L2, control can reach L2 directly.

Algebraic expression simplification

There are occasions where algebraic expressions can be made simpler. For example, the statement a = a + 0 has no effect and can be dropped, and the statement a = a + 1 can simply be replaced by INC a.

Strength reduction

There are operations that consume more time and space. Their ‘strength’ can be reduced by replacing them with other operations that consume less time and space, but produce the same result. For example, x * 2 can be replaced by x << 1, which involves only one left shift. Similarly, a² and a * a produce the same result, but a * a is much cheaper to implement than a general exponentiation.

Machine-specific instructions

The target machine may offer more sophisticated instructions that perform specific operations much more efficiently. If the target code can use those instructions directly, that will not only improve the quality of the code, but also yield more efficient results.

Code Generator

A code generator is expected to have an understanding of the target machine’s runtime environment and its instruction set. The code generator should take the following things into consideration to generate the code:

Target language: The code generator has to be aware of the nature of the target language into which the code is to be transformed. That language may provide some machine-specific instructions to help the compiler generate the code in a more convenient way. The target machine can have either a CISC or a RISC processor architecture.

IR type: Intermediate representation has various forms. It can be an Abstract Syntax Tree (AST) structure, Reverse Polish Notation, or 3-address code.

Selection of instruction: The code generator takes the intermediate representation as input and converts (maps) it into the target machine’s instruction set. One representation can often be converted in many ways (by different instructions), so it becomes the responsibility of the code generator to choose the appropriate instructions wisely.

Register allocation: A program has a number of values to be maintained during execution. The target machine’s architecture may not allow all of the values to be kept in CPU registers. The code generator decides which values to keep in registers, and also which registers are used to hold them.

Ordering of instructions: Finally, the code generator decides the order in which the instructions will be executed. It creates a schedule for the instructions.
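To make the DAG description in this section more concrete, here is a minimal Python sketch of how a DAG might be built for one basic block. It assumes the block is given as three-address statements in the form (dest, op, left, right); the names `Node` and `build_dag` are hypothetical and chosen only for illustration. An interior node is reused whenever the same operator is applied to the same two child nodes, which is how common subexpressions end up shared.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                 # operator for interior nodes, name/constant for leaves
    children: tuple = ()       # indices of the child nodes (empty for a leaf)
    names: list = field(default_factory=list)  # identifiers that hold this value


def build_dag(block):
    """Build a DAG for one basic block of (dest, op, left, right) statements."""
    nodes = []       # every node, addressed by its index
    current = {}     # identifier or constant -> index of the node with its current value
    interior = {}    # (op, left_index, right_index) -> index of an existing interior node

    def node_for(name):
        # First use of a name creates a leaf that stands for its initial value.
        if name not in current:
            nodes.append(Node(label=name))
            current[name] = len(nodes) - 1
        return current[name]

    for dest, op, left, right in block:
        key = (op, node_for(left), node_for(right))
        if key not in interior:
            nodes.append(Node(label=op, children=key[1:]))
            interior[key] = len(nodes) - 1
        index = interior[key]

        # dest stops naming its previous node and now names this one.
        old = current.get(dest)
        if old is not None and dest in nodes[old].names:
            nodes[old].names.remove(dest)
        nodes[index].names.append(dest)
        current[dest] = index
    return nodes


block = [
    ("a", "+", "b", "c"),
    ("b", "-", "a", "d"),
    ("c", "+", "b", "c"),
    ("d", "-", "a", "d"),
]
nodes = build_dag(block)
for i, n in enumerate(nodes):
    print(i, n.label, n.children, n.names)
```

In this example the second and fourth statements compute the same value, so they share one interior node that carries both names. The third statement does not share a node with the first, because b has been reassigned in between.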
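Several of the peephole rewrites described in this section can be illustrated in a few lines of Python. The sketch below is not how any particular compiler does it; it assumes a made-up instruction format of plain tuples such as ("add", dest, src, operand), ("mul", dest, src, operand), ("jmp", label), ("label", name) and ("ret",), and the function name `peephole` is hypothetical. Conditional branches are not modelled.

```python
def peephole(code):
    """Apply a few peephole rewrites to a list of instruction tuples."""
    # Flow of control: a label followed directly by "jmp X" only forwards to X.
    forward = {}
    for cur, nxt in zip(code, code[1:]):
        if cur[0] == "label" and nxt[0] == "jmp":
            forward[cur[1]] = nxt[1]

    def final_target(label):
        # Follow forwarding labels, guarding against cycles such as L1: jmp L1.
        seen = set()
        while label in forward and label not in seen:
            seen.add(label)
            label = forward[label]
        return label

    out = []
    reachable = True
    for ins in code:
        op = ins[0]
        if op == "label":
            reachable = True          # a label may be the target of a jump
        elif not reachable:
            continue                  # unreachable code: drop it

        if op == "jmp":
            ins = ("jmp", final_target(ins[1]))      # jump straight to the end of the chain
        elif op == "add" and ins[1] == ins[2] and ins[3] == 0:
            continue                                 # a = a + 0 has no effect
        elif op == "add" and ins[1] == ins[2] and ins[3] == 1:
            ins = ("inc", ins[1])                    # a = a + 1 becomes INC a
        elif op == "mul" and ins[3] == 2:
            ins = ("shl", ins[1], ins[2], 1)         # x * 2 becomes x << 1

        out.append(ins)
        if op in ("jmp", "ret"):
            reachable = False         # nothing falls through past these
    return out


code = [
    ("add", "a", "a", 0),
    ("add", "a", "a", 1),
    ("mul", "y", "x", 2),
    ("jmp", "L1"),
    ("call", "printf"),
    ("label", "L1"),
    ("jmp", "L2"),
    ("label", "L2"),
    ("ret",),
]
optimized = peephole(code)
for ins in optimized:
    print(ins)
```

This sketch conservatively treats every label as reachable, so L1 and its forwarding jump stay in the output even after the first jump has been retargeted to L2. Removing labels that are no longer referenced would need one more pass that counts the remaining jump targets.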
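The code generator's concerns of instruction selection and register allocation can be sketched together on a toy scale. The Python example below is only an illustration under several assumptions: the input is 3-address code as (dest, op, left, right) tuples, the target is an invented two-address machine whose instructions are written "OP source, destination", every result is stored back to memory immediately, and the names `OPCODES` and `generate` are hypothetical. Ordering of instructions is not modelled; statements are emitted in source order.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}   # hypothetical target opcodes


def generate(block, registers=("R0", "R1")):
    """Translate (dest, op, left, right) statements into pseudo-assembly lines."""
    asm = []
    holds = {}                  # register -> variable whose current value it holds
    free = list(registers)

    for dest, op, left, right in block:
        # Register allocation: reuse a register that already holds `left`.
        reg = next((r for r, v in holds.items() if v == left), None)
        if reg is None:
            if not free:
                # Evict the oldest entry; safe because results are already in memory.
                oldest = next(iter(holds))
                del holds[oldest]
                free.append(oldest)
            reg = free.pop(0)
            asm.append(f"MOV {left}, {reg}")

        # Instruction selection: prefer a cheaper instruction when one applies.
        if op == "+" and right == "1":
            asm.append(f"INC {reg}")
        else:
            asm.append(f"{OPCODES[op]} {right}, {reg}")
        asm.append(f"MOV {reg}, {dest}")          # store the result back to memory

        # Any other register still holding an old value of dest is now stale.
        for r in [r for r, v in holds.items() if v == dest and r != reg]:
            del holds[r]
            free.append(r)
        holds[reg] = dest
    return asm


block = [
    ("t1", "+", "a", "b"),
    ("t2", "*", "t1", "c"),
    ("a", "+", "a", "1"),
]
asm = generate(block)
print("\n".join(asm))
```

Storing every result straight back to memory keeps the sketch short and makes eviction trivial, at the cost of extra MOV instructions. A real allocator would track which values are still live and store them only when a register has to be given up.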