October 1st
Reading Assignment: LU, QR, and Cholesky Factorizations Using Vector
Capabilities of GPUs
Matrix Multiply exploration - compile and evaluate the labs
(lab1-matrixmul, lab2-, and lab3)
- evaluate different sizes for BLOCK_SIZE
- gather execution time for each BLOCK_SIZE
- compute GFLOPS
- use hardware performance counters to evaluate the change in number
of coalesced memory operations (loads/stores)
Scan exploration - compile and evaluate the scan project in SDK/projects/scan
(Note: the CPU version is built into the code module as
computeGold, in scan_gold.cpp)
- evaluate various data sizes (see /scan/doc directory)
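
As a point of comparison when checking GPU results at different data sizes, a
minimal sketch of a sequential CPU scan, assuming an exclusive prefix sum over
floats. The name exclusiveScanReference is hypothetical; this is not the SDK's
computeGold, whose exact signature and behavior should be read from
scan_gold.cpp.

```cpp
#include <cstddef>
#include <vector>

// Sequential exclusive prefix sum: out[i] is the sum of in[0..i-1],
// so out[0] is 0 and the last input element is not included anywhere.
// (Hypothetical reference; assumed to resemble a CPU "gold" version.)
std::vector<float> exclusiveScanReference(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    float running = 0.0f;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;   // sum of all elements before position i
        running += in[i];   // include in[i] for the next position
    }
    return out;
}
```

For large inputs, single-precision sums accumulated in a different order on
the GPU can differ slightly from this sequential result, so a comparison with
a small tolerance is usually more appropriate than exact equality.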