Jacket: Assignment

This exercise compares the performance of matrix multiplication on the CPU and on the GPU, in both single precision and double precision. Please modify the following code:

type='single';
N = 128; % matrix size
M = 400; % number of matrices
trials = 30; % several timing trials, pick best
disp 'Computing the CPU for-loop benchmarks...'
A = ones(N,N,M,type); % many matrices...
B = ones(N,N,type); % ... each multiplied against one
cpu_for_seconds = inf;
for t = 1:trials
    tic
    for i = 1:M
        C = A(:,:,i) * B;
    end
    cpu_for_seconds = min(toc, cpu_for_seconds); % keep the fastest trial
end
cpu_for_gflops = 2 * M * (N^3) ./ (1e9 * cpu_for_seconds)
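
The GFLOPS formula on the last line rests on the usual convention that one N-by-N matrix product costs about 2*N^3 floating-point operations (N^3 multiplications plus roughly N^3 additions), repeated for each of the M matrices. The short sketch below works that count out for the parameters used above. The timing value example_seconds is a made-up number for illustration only, not a measured result, and the variable names are not part of the original code.

```matlab
% Worked flop count for the benchmark parameters (N = 128, M = 400).
N = 128;                              % matrix size
M = 400;                              % number of matrices

flops_per_product = 2 * N^3;          % conventional estimate for one N-by-N product
total_flops = M * flops_per_product;  % work done in one timing trial

% Hypothetical timing, chosen only to illustrate the conversion to GFLOPS.
example_seconds = 0.5;
example_gflops = total_flops / (1e9 * example_seconds);

fprintf('flops per product: %d\n', flops_per_product);
fprintf('total flops per trial: %d\n', total_flops);
fprintf('GFLOPS at %.1f s: %.4f\n', example_seconds, example_gflops);
```

Because the flop count depends only on N and M, the same numerator applies unchanged to the GPU and double-precision variants; only the measured time differs.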