PGI: Assignment

Modify the code CPU_Sgemm.f90 to call the
sgemm subroutine in the CUDA BLAS (cuBLAS)
library to calculate
C = a1 * A * B + a2 * C
where a1 and a2 are real constants, and A, B and C
are real 2D arrays of size NxN.
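
As a starting point, the call could be sketched roughly as follows. This is a minimal sketch, not the reference solution: CPU_Sgemm.f90 is not shown here, so the program name, the value of N, the values of a1 and a2, and the device array names (A_d, B_d, C_d) are all assumptions made for illustration. It assumes the PGI cublas module, whose cublasSgemm interface takes the same argument list as the standard BLAS sgemm but operates on device arrays.

```fortran
! Sketch only: names, N, a1 and a2 are illustrative assumptions,
! not taken from CPU_Sgemm.f90.
program gpu_sgemm_sketch
  use cudafor                 ! CUDA Fortran: device attribute, host/device copies
  use cublas                  ! PGI interface module for the cuBLAS library
  implicit none

  integer, parameter :: N = 512            ! assumed matrix size
  real :: a1, a2                           ! the two real constants
  real, allocatable :: A(:,:), B(:,:), C(:,:), C_ref(:,:)
  real, device, allocatable :: A_d(:,:), B_d(:,:), C_d(:,:)

  a1 = 2.0                                 ! assumed values
  a2 = 0.5

  allocate(A(N,N), B(N,N), C(N,N), C_ref(N,N))
  allocate(A_d(N,N), B_d(N,N), C_d(N,N))

  call random_number(A)
  call random_number(B)
  call random_number(C)

  ! CPU reference result, used only to check the GPU answer
  C_ref = a1 * matmul(A, B) + a2 * C

  ! Copy the host arrays to the device (assignment performs the transfer)
  A_d = A
  B_d = B
  C_d = C

  ! C_d = a1 * A_d * B_d + a2 * C_d
  ! 'N','N': neither matrix is transposed; the leading dimensions are all N
  call cublasSgemm('N', 'N', N, N, N, a1, A_d, N, B_d, N, a2, C_d, N)

  ! Copy the result back to the host and compare with the CPU reference
  C = C_d
  print *, 'max abs difference vs CPU:', maxval(abs(C - C_ref))

  deallocate(A, B, C, C_ref, A_d, B_d, C_d)
end program gpu_sgemm_sketch
```

With the PGI compilers, a build line along the lines of `pgfortran -Mcuda gpu_sgemm_sketch.f90 -lcublas` is typically what is needed; the exact flags may differ by compiler version. Because single-precision results on the GPU and CPU are accumulated in different orders, expect a small nonzero difference rather than exact agreement.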