Universität Paderborn - Home Universität Paderborn
Die Universität der Informationsgesellschaft

Parallel Programming WS 2014/2015 - Solution 5

Kastens, Pfahler
Institut für Informatik, Fakultät für Elektrotechnik, Informatik und Mathematik, Universität Paderborn
Jan 05, 2015

Solution for Exercise 4

Original loop:

   for I = 1 to 3 do
     for J = 1 to 3 do
       A[I, J] = A[I - 1, J + 1] + 1
     endfor
   endfor

a)
The iteration space including a dependence vector.
   x x x
   x x x      
   x x x

D = 1
    -1
b)
It is illegal to permute the I and J loops because the resulting dependence vector is negative:
   T = 0 1
       1 0

   T * D = 0 1 * 1  = -1
           1 0   -1    1
c)
Applying a skewing transformation with factor 1:
   T = 1 0
       1 1

   T * D = 1 0 *  1  = 1
           1 1   -1    0

Legal.
d)
The resulting iteration space including a dependence vector:
       x       coord (3,6) 
     x x
   x x x
   x x
   x           coord (1,2)

D = 1
    0
e)
Permuting the I and J loops of the resulting code:
   T = 0 1
       1 0

   T * D = 0 1 * 1  = 0
           1 0   0    1

Legal.
f)
The resulting iteration space including a dependence vector:
       x x x      coord (4,3) - coord(6,3)
     x x x
   x x x          coord (2,1) - coord(4,1)

D = 0
    1
g)
The loop bounds of the transformed loop derived from the drawing:
   i' runs from 2 to 6
   j' runs from max(1, i' - 3) to min(3, i' - 1)
h)
Homework: Deriving the loop bounds mathematically:
Skewing matrix S:    1 0
                     1 1

Permutation matrix:  0 1
                     1 0

Overall transformation:
T = P * S:           1 1
                     1 0
The inverse transformation matrix T-1:
   0 1
   1 -1
The bounds equation derived from the original program is
   B * i  <=  -1
       j       3
              -1
               3
==>
   -1  0  * i  <=  -1
    1  0    j       3
    0 -1           -1
    0  1            3
The bounds equation derived for the transformed program is
   B * T-1 * i'  <=  -1
              j'       3
                      -1
                       3
==>
    0 -1  * i'  <=  -1
    0  1    j'       3
   -1  1            -1
    1 -1             3
which yields
   j' >= 1
   j' <= 3
   j' - i' <= -1
   i' - j' <= 3
==>
   2 <= i' <= 6
   j' >= max (1, i' - 3)
   j' y= min (3, i' - 1)
which corresponds to the bounds we derived graphically.
Additionally we compute the transformed loop body:

The old iteration variables in terms of the new ones:

   T-1 * i'  =  i
          j'     j
==>
   0  1  * i' = i
   1 -1    j'   j
==>
   i = j'
   j = i' - j'
Together the transformed program looks as follows:
   for i' = 2 to 6 do
     for j' = max (1, i' - 3) to min (3, i' - 1) do
       A[j', i' - j'] = A[j' - 1, i' - j' + 1] + 1
     endfor
   endfor

It computes the array elements in the order
A[11], A[12], A[21], A[13], A[22], A[31], A[23], A[32], A[33]

Solution for Exercise 5

   for I = 0 to N do
     for J = 0 to M do
       A[I + 1, J] = A[I, J - 1] + A[I + 1, J - 2]
     endfor                    
   endfor
These are the necessary steps: If N = M = 2 the transformed loop executes the computations:
s00: A[1, 0] = A[0, -1] + A[1, -2]
s10: A[1, 1] = A[0, 0] + A[1, -1]
s11: A[2, 0] = A[1, -1] + A[2, -2]
s20: A[1, 2] = A[0, 1] + A[1, 0]
s21: A[2, 1] = A[1, 0] + A[2, -1]
s22: A[3, 0] = A[2, -1] + A[3, -2]
s31: A[2, 2] = A[1, 1] + A[2, 0]
s32: A[3, 1] = A[2, 0] + A[3, -1]
s42: A[3, 2] = A[2, 1] + A[3, 0]

These are the same computations that are executed by the original loop. The computations of the inner loop are parallelizable.

Generiert mit Camelot | Probleme mit Camelot? | Geändert am: 19.02.2016