Copy propagation

From Citizendium
Revision as of 11:30, 16 February 2007 by imported>Chris day (Copy propagation ON WHEELSS!!! moved to Copy propagation: revert)
Jump to navigation Jump to search

Copy Propagation is an optimization used in compilers. It is an enabling optimization, which is to say that it does not directly reduce code size or increase code speed; rather, it reorganizes code so that other optimizations may be more effective. Copy propagation operates on a low-level intermediate representation such as quads or register transfer level (RTL), and can operate on either the basic block or control flow graph level.

Basic RTL Demonstration

The basic idea of the copy propagation optimization is to seek out two instructions where the at least one source of the second instruction is the destination operand of the first copy instruction. In RTL,

  1. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B\gets A}
  2. zero or more instructions
  3. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle C\gets f(x_1, x_2, \ldots, B, \ldots, x_n)}

In other words, the code copies the value of A into both B, and then has some instruction which depends on the value of B. If the compiler finds such a pair of instructions, and if the compiler can prove that the intermediate instructions (2) do not modify the values of B or A, then the compiler can transform this to the equivalent code:

  1. zero or more instructions

Since the value of A has not changed, and since B also held that value, the third instruction must compute the same value.

When Copy Propagation may be Applied

In the earlier RTL listings, it was assumed that the instructions take place within a basic block---that is, that there were no branching instructions or function calls. However, the presence of branches or calls does not necessarily prevent the compiler from performing this instruction. It does require that the compiler perform control flow analysis to verify a few characteristics.

First, it must be verified that the basic block which contains the assignment of B dominates the block which contains the use of B, or in other words that the definition is always executed at least once before the use. This test prevents cases such as this conditional statement, expressed in pseudocode:

  1. if( some condition )
  2. then
  3. else
  4. end if

In this example, the assignment does not dominate the use , because there is a path of execution which contains the second assignment, but not the first (namely, the else clause).

Next, it must be shown that no other assignment to A or B dominates the second assingment and post-dominates the first assignment. In other words, the assignment must always be the most recent assignment to A or B before the assignment . Take for example, this loop:

  1. while( some condition )
  2. end loop

In this example, the assignment clearly dominates the assignment . However, we cannot change the second assignment to because A changes within the loop; after the first iteration of the loop, C would contain the value of X and not the value of the initial A.

How Copy Propagation may enable other Optimizations

Creation of Dead Code

Copy propagation may aid the optimization process by creating more dead code, which can later be removed by dead code elimination. For instance,

After copy propagation has been applied twice, we have

  1. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle C\gets A}
  2. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle D\gets A}

Suppose that instruction (3) was the only use of memory location C. If so, then no instructions depend on the value of C, and the instruction may be removed.

Constant Folding

Copy propagation may also aid constant in constant folding. If, for instance, the operand A was a constant, and the use was an addition:

  1. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B\gets 4}
  2. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle C\gets 2 + B}

A single copy propagation will yield,

  1. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B\gets 4}
  2. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle C\gets 2 + 4}

The constant can later be folded to,

  1. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B\gets 4}
  2. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle C\gets 6}


Template:Stub