CS 211- Fall  2008: Homework 3- Due October 14th, 6pm

 

Ques 1. For the following code fragment, assume that all data references are shown, that all values are defined before use, and that only b and c are used again after this segment. You may ignore any possible exceptions. The individual statements are numbered to provide an easy reference.

 
       1.  if (a > c) {
       2.               d = d + 5;
       3.               a= b + d +e;}
          }  else  {
       4.               e = e + 2;
       5.               f = f + 2;
       6.               c = c + f;
           }
       7.   b = a + f;

   
    (a) List the control dependencies in the code. For each control dependence, tell whether the dependent statement can be scheduled before the if statement based on the data references.


    (b) Assume a dynamically scheduled multiple-issue processor without speculation and with a window (i.e., scope of ILP) that is holding the entire code fragment. Find the data dependencies and use this information to make a list of the successive groups of statements that are issued together.


    (c) It is given that only the values b and c are live after the code segment. If it is known that a value is not live at some point in the code, then the statement that defines that value can be deleted without changing the program meaning. Find any values that are not live within the code fragment, and list the statement(s) that a compiler with this information could delete.


    (d) How does the result for part (c) above affect the ILP achieved by the processor in part (b) ? What does this illustrate about measuring computer performance factors such as ILP, making hardware design choices, and compiler technology ?

Ques 2. Consider the following loop. First, list all the dependencies. Can this loop, as written, be unrolled ? Explain. Can this loop be rewritten  so loop unrolling can be applied ? Explain.

 

            ….

                        For (i=1; i<100; i=i+1){

                        A[i] = A[i] + B[i];  /* S1 */

                        B[i]= 2*B[i]; /* S2 */

                        C[i] = a[i-1] + b[i-1];  /*S3 */
                        D[i] = D[i] * A[i-1];  /* S4 */

                        }

 

Practice Problems: (will not be graded so do not turn these in; but you must try to solve these since they will help you understand the material).

 

PP1: Consider the following segment of code, and assume you have to schedule it on a superscalar processor using Tomasulo's method.

load R0, (R4)

add  R1, R0, #10

load R2, (R5)

mul  R0, R2, #10

mul  R4, R2, R1

                    store R0, (R5)

Assume you have a superscalar processor with multiple functional units and you have to schedule the above code using the Tomasulo algorithm. Assume the processor has two reservation stations for the load/store, two reservation stations for Add, and two reservation stations for multiplication. However, the processor has only one load unit with a latency of 2 cycles, one add unit with latency of 1 cycle, and one multiplication unit with latency of 2 cycles. Further, assume that data is written to the Common data bus (CDB) after the execution is complete in the functional units; i.e., the results are available for the next instruction one cycle after the functional unit completes. For example, if a load is issued and started in cycle 1 then it completes in cycle 3, releasing the functional unit, and the instruction using its result can start in cycle 4.

 

 

 

 PP2: Consider the following piece of code, and derive the branch prediction accuracy for the two branch predictors defined in what follows.

             i = 0;
             while (i<100) do {  /*  Branch b1 */

                    a[i] = a[i]*b[i];
                     i = i+1;
             }
             while (i<150) do { /* branch b2 */
                   c[i] = c[i]*10;
                    i = i+1;
             }

First assume you have a 1 bit branch predictor set to T. Derive branch prediction accuracy for the above code using the 1 bit predictor.
Next, assume you have a 2 bit branch predictor as given in Figure 2.4. Compute the branch prediction accuracy for the above code assuming both branches history bits are set to NT.         

 

PP3: Case study 1 from textbook.