Advanced Computer Architecture
Problem 1: Loop Optimization
For each of the following code fragments, first list all the data dependencies, then rearrange the code to reduce the dependencies and identify regions of parallelism.
1.
for i=1:n
for j=1:m
A(i,j)=A(j-1,i+1)+S
end
end
2.
for i=1:n
for j=1:m
A(i+1,j+1)=A(i,j)+A(i+1,j)
end
end
3.
for i=1:n
for j=1:m
A(i,j)=A(i-1,j)+A(i+1,j)+A(i,j+1)+A(i,j-1)
end
end
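One way to check a hand-derived dependence list is to enumerate the dependences by brute force on a small iteration space. The sketch below is a minimal illustration, not part of the assignment: the function name find_dependences and its parameters are hypothetical, it assumes the loops run in the order written (i outer, j inner), and it reports only loop-carried flow and anti dependences (output dependences and same-iteration reuse are ignored).

```python
from itertools import product

def find_dependences(write, reads, n, m):
    """Enumerate loop-carried dependences of a 2-deep loop nest.

    write: function (i, j) -> array cell written in iteration (i, j)
    reads: list of functions (i, j) -> array cell read in iteration (i, j)
    Returns a set of (kind, distance_vector) pairs.
    """
    # Lexicographic order of (i, j) matches the execution order of the nest.
    iters = list(product(range(1, n + 1), range(1, m + 1)))
    deps = set()
    for a, src in enumerate(iters):
        cell = write(*src)
        for b, dst in enumerate(iters):
            if a == b:
                continue  # same iteration: not loop-carried
            if any(read(*dst) == cell for read in reads):
                # Write first, read later -> flow; read first, write later -> anti.
                kind = "flow" if a < b else "anti"
                first, second = (src, dst) if a < b else (dst, src)
                deps.add((kind, (second[0] - first[0], second[1] - first[1])))
    return deps

# Fragment 2: A(i+1,j+1) = A(i,j) + A(i+1,j)
frag2 = find_dependences(
    write=lambda i, j: (i + 1, j + 1),
    reads=[lambda i, j: (i, j), lambda i, j: (i + 1, j)],
    n=4, m=4,
)
print(sorted(frag2))  # [('flow', (0, 1)), ('flow', (1, 1))]
```

Passing the subscripts as functions rather than fixed offsets lets the same helper handle fragment 1, where the read A(j-1,i+1) swaps the roles of i and j and the distance vectors are no longer constant.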
Problem 2: Branch Prediction
The diagram is attached below.
The figure represents a 2-bit predictor, which predicts whether a branch will be taken or not taken depending on its current state. An (m, n) predictor uses the outcomes of the last m branches to select one of several n-bit predictors, and the selected predictor makes the prediction for the next branch. Assume we are using a (1, 2) predictor, so we have two 2-bit predictors: Predictor 1 is used when the previous branch was taken, and Predictor 2 is used when the previous branch was not taken. Based on this, what will be the prediction accuracy for the following code fragments? Assume that in the first step we start from the state “predict taken”. Show all steps.
1.
a=T;
b=F;
for i=1:5
if(a==T) {a=T;}
{b=!b;}
if(a==b) {a=F; b=T;}
end
2.
a=T;
b=F;
for i=1:5
if(a==T) {a=F;}
if(b==T){b=F;}
if(a==b) {a=T; b=F;}
end
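The (1, 2) scheme described above can be sketched as a small simulator that takes a sequence of branch outcomes and reports the prediction accuracy. This is a minimal sketch under stated assumptions, since the figure is not reproduced here: each 2-bit predictor is modeled as a saturating counter (states 0 to 3, predicting taken when the state is 2 or 3), "predict taken" is read as the strongly-taken state 3, and the branch before the first one is assumed to have been taken. The attached state diagram may use different transitions, so adjust accordingly. The name simulate_1_2_predictor and its parameters are hypothetical.

```python
def simulate_1_2_predictor(outcomes, init_state=3, init_prev_taken=True):
    """Return prediction accuracy of a (1, 2) predictor on a non-empty
    list of branch outcomes (True = taken, False = not taken)."""
    # One 2-bit counter per possible outcome of the previous branch.
    counters = {True: init_state, False: init_state}
    prev_taken = init_prev_taken  # assumption: history starts as "taken"
    correct = 0
    for taken in outcomes:
        state = counters[prev_taken]
        predicted_taken = state >= 2       # states 2, 3 predict taken
        if predicted_taken == taken:
            correct += 1
        # Saturating update of the counter that made the prediction.
        counters[prev_taken] = min(state + 1, 3) if taken else max(state - 1, 0)
        prev_taken = taken                 # 1 bit of global history
    return correct / len(outcomes)

# Hypothetical outcome stream, not taken from the problems above.
print(simulate_1_2_predictor([True, False, True, False]))  # 0.5
```

To apply it to the code fragments, trace each loop by hand to obtain the taken/not-taken outcome of every if statement in execution order, then feed that list to the simulator and compare with the step-by-step table.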
Problem 3: Parallel Algorithm Design
Design and implement a parallel algorithm for finding the (i) maximum, (ii) standard deviation, and (iii) mode of a given set of positive numbers. Submit the code and a table or graph showing how the running time changes for 2, 4, 8, 16, and 32 processors and for set sizes of 5K, 10K, and 20K.
Show the steps for applying the parallel algorithms to this sequence: 8, 24, 4, 32, 128, 64, 12, 56, 48, 4
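One common structure for all three statistics is a partition-and-reduce scheme: split the data into chunks, compute partial results per chunk in parallel, then combine them. The sketch below illustrates that structure only; it is not a reference solution. It assumes the population standard deviation (dividing by n), breaks mode ties by choosing the smallest value, and uses threads purely to show the shape of the algorithm. In CPython, threads will not give a real speedup for this workload, so timing experiments would need processes or another runtime. The names partial_stats and parallel_stats are hypothetical.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partial_stats(chunk):
    """Per-chunk partial results: max, count, sum, sum of squares, frequencies."""
    return max(chunk), len(chunk), sum(chunk), sum(x * x for x in chunk), Counter(chunk)

def parallel_stats(data, workers=4):
    """Return (maximum, population standard deviation, mode) of non-empty data."""
    size = math.ceil(len(data) / workers)
    chunks = [data[k:k + size] for k in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_stats, chunks))

    # Reduction step: combine the partial results from all chunks.
    maximum = max(p[0] for p in partials)
    n = sum(p[1] for p in partials)
    total = sum(p[2] for p in partials)
    total_sq = sum(p[3] for p in partials)
    counts = Counter()
    for p in partials:
        counts.update(p[4])

    mean = total / n
    # Var = E[x^2] - mean^2; clamp at 0 to guard against rounding error.
    std = math.sqrt(max(0.0, total_sq / n - mean * mean))
    top = max(counts.values())
    mode = min(v for v, c in counts.items() if c == top)  # tie -> smallest value
    return maximum, std, mode

seq = [8, 24, 4, 32, 128, 64, 12, 56, 48, 4]
maximum, std, mode = parallel_stats(seq, workers=4)
print(maximum, round(std, 4), mode)
```

With 4 workers the example sequence is split into the chunks [8, 24, 4], [32, 128, 64], [12, 56, 48], and [4]; the partial maxima, sums, and frequency tables from those chunks are the intermediate steps to show. The sum-of-squares formula is convenient to combine across chunks but can lose precision on large or widely spread data, in which case a two-pass or pairwise-merge variance is a safer choice.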