Chapter 27 Answers |
Document last modified: |
Question 27.0
- Do both sum array A = <1,2,3,4,5,6,7,8>? Yes
- Assume a constant cost of 1, what is the recurrence for the right sum? 2T(n/2) + 1
- Draw the recursion tree for the right sum.

4. What is the upper bound of each? O(n)
From recursion tree, n size 1 problems at bottom:
From Master Theorem, Case 1: f(n) = O(1) = O(nlog22-1) = O(n1-1) = O(n0) , T(n)= Θ(nlog22) = Θ(n1)
5. What is the recursion tree height? lg n
sum(A,n)
- result = 0
- for i = 1 to n
- result = result + A[ i ]
- return result
sum(A, l, r) )
- if l ≥ r return A[ l ]
- m=⌊(l+r)/2⌋
- return sum(A, l, m) + sum(A, m+1, r)
6. Suppose each call to sum was executed on a new processor.
How many total processors would we need to sum the 8 values? 15
What is the upper bound in execution time, in terms of n? O(lg n) because longest path is lg n to last processors at bottom of recursion tree.
7. What is 1024 / lg 210? 102

Question 27.1 - In above, which strands are parallel or serial?
Question 27.2 Does the left subroutine exhibit more parallelism (do more work in less time) than the right or just use more processors?
Draw the execution of strands for the left similar to that below at right.
P-Fib( n )
|
P-Fib( n )
|
Parent thread executing two spawns results in three threads executing. The parent waits for children to complete at sync. Both achieve same parallelism but left has higher cost for spawning one un-necessary child.

Question 27.3
Question 27.4
The following finds the maximum of an array using divide-and-conquer.
For: P-max( <6, 3, 5, 7, 8, 1, 4, 2>, 0, 7)
- Draw the serial recursion tree.
- Draw the parallel threads. Same as recursion tree.
- What is the length of the Critical Path? 3
- What is lg 23? 3
- Determine the parallelism.
Work
Span
Parallelism
|
|
Question 27.5
Give a parallel version that multiplies each element of an array by 2 using divide-and-conquer.
P-double( <6, 3, 5, 7, 8, 1, 4, 2>, 0, 7)
Number of calls? 2T1(n/2)
= 2lg n+1-1
= (n+1)lg 2-1
= n+1-1
= Θ(n)Recursion tree height? lg n
T1(n) = Θ( n )
T∞(n) = Θ( lg n )
T1(n) / T∞(n) = Θ( n / lg n)
P-double(A, l, r)
if l ≥ r then A[ l ] = A[ l ] * 2
m=⌊(l+r)/2⌋
spawn P-sum(A, l, m)
spawn P-sum(A, m+1, r)
sync
return

Question 27.6
1. Give a parallel version using parallel for that searches an array for a value and returns the index if found or -1.
LinearSearch( <16, 13, 15, 17, 15, 12, 15, 14>, 15) returns index 3.
2. Give a parallel version using divide-and-conquer that searches an array for a value and returns the index if found or -1.
LinearSearch(A,x,n)
- location = -1
- for i = 1 to n
- if A[ i ] == x
- location = i
- return location
Hint: For analysis, recall that the parallel for operating on an array can be synthesized by a parallel divide-and-conquer of the array. 3. For the parallel algorithm, what is:
T1(n)
T∞(n), the span of parallel for (recall that parallel for is equivalent to divide-and-conquer)
T1(n) / T∞(n)
4. Is there a best or worst case for your algorithm? No
5. Does your algorithm always return the same value, given the same data? Yes, all threads that execute to a size 1 problem wait at sync, returning the leftmost index when found.
LinearSearch(A, x, n)
- location = -1
- parallel for i = 1 to n
- if A[i] == x
- location = i
- return location
P-LinearSearch(A, x, l, r)
if l ≥ r then
if A[ l ] = x return l
else return -1
m=⌊(l+r)/2⌋
a = spawn P-LinearSearch(A, x, l, m)
b = spawn P-LinearSearch(A, x, m+1, r)
sync
if a != -1 return a
else return bFor the parallel algorithm, what is:
T1(n), the work
O(n)
T∞(n), the span of parallel for
O(lg n)
T1(n) / T∞(n), parallelism
O(n)/O(lg n) = O(n/lg n)
Question 27.6.1
What is the Critical Path of the above execution DAG? There are several of the same length.
What is its length? 10
What is the work? 29
Do those agree with our analysis, T1(n) / T∞(n) = Θ( n )? 29/9 < 4 not equal 8.
Question 27.6.2
What is y at the end of each? 2 on left and 4 on right.
Question 27.6.3 - For below DAG:
What is the work, T1(n)? 29 strands
What is the Critical Path of the following execution DAG?
1 black 1,8
1 gray 1,8
1 black 5,8
1 gray 5,8
1 black 7,8
1 gray 7,8
1 black 8,8
1 white 7,8
1 white 5,8
1 white 1,8What is its length, T∞(n)? 10
Do those agree with our analysis, T1(n) / T∞(n) = Θ( n )? No, 29/10 = 2 9/10 not 64/8
Mat-Vec(A: 8 x 8 matrix, x) DAG is:
Question 27.7
A parallel version using parallel for that searches an array for a value and returns the index if found or -1.
Does P-search( <16, 13, 15, 17, 15, 12, 15, 14>, 15) return index 3, 5 or 7? Any of 3, 5, 7.
Will the answer always be the same given the same array? Not in this case because we do not know exactly how the parallel for is implemented. However, our earlier divide-and-conquer solution was biased to return the left-most index when found.
Will the answer always be the same, given the same array? If not, how is another answer possible? Lines 3 and 4 can be executed concurrently by n processors, each accessing the same location variable. The last processor to execute: location = i will provide the final returned result.
P-search(A, x, n)
- location = -1
- parallel for i = 1 to n
- if A[i] == x
- location = i
- return location