The focus of Day 3 is tiled matrix multiplication.
mat_mul_tiled.cu (Improved matrix multiplication)
To compile and profile the CUDA code, use:
nvcc -o compiled_code_name source_code.cu
nsys profile --stats=true ./compiled_code_name
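
For orientation, here is a minimal sketch of the tiling idea. It is not the contents of mat_mul_tiled.cu. The names `matmul_tiled_sketch`, `matmul_tiled_host`, and `TILE`, the tile width of 16, and the restriction to square row-major matrices are all assumptions made for illustration. Each thread block stages one tile of A and one tile of B in shared memory, so every value loaded from global memory is reused `TILE` times. Error checking on the CUDA calls is omitted for brevity.

```cuda
#include <cassert>
#include <cuda_runtime.h>

#define TILE 16  // assumed tile width, not taken from mat_mul_tiled.cu

// Computes C = A * B for square N x N row-major matrices (illustrative sketch).
__global__ void matmul_tiled_sketch(const float* A, const float* B, float* C, int N)
{
    // One tile of A and one tile of B, shared by all threads in the block.
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;  // row of C owned by this thread
    int col = blockIdx.x * TILE + threadIdx.x;  // column of C owned by this thread
    float acc = 0.0f;

    int numTiles = (N + TILE - 1) / TILE;
    for (int t = 0; t < numTiles; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;

        // Each thread loads one element per tile; out-of-range slots are zero-padded.
        As[threadIdx.y][threadIdx.x] = (row < N && aCol < N) ? A[row * N + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < N && col < N) ? B[bRow * N + col] : 0.0f;
        __syncthreads();  // wait until both tiles are fully loaded

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // wait before the tiles are overwritten in the next iteration
    }

    if (row < N && col < N)
        C[row * N + col] = acc;
}

// Hypothetical host wrapper: copies inputs to the GPU, launches the kernel, copies C back.
void matmul_tiled_host(const float* hA, const float* hB, float* hC, int N)
{
    assert(N > 0);
    size_t bytes = (size_t)N * N * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((N + TILE - 1) / TILE, (N + TILE - 1) / TILE);
    matmul_tiled_sketch<<<grid, block>>>(dA, dB, dC, N);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
}
```

Calling `matmul_tiled_host(A, B, C, N)` from a `main` function and compiling with the nvcc command above should give an executable that can be profiled with nsys. The second `__syncthreads()` matters as much as the first: without it, a fast thread could overwrite a tile that slower threads are still reading.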
YouTube video by OMean1Sigma: https://www.youtube.com/watch?v=QmKNE3viwIE&t=172s
YouTube video by OMean1Sigma: https://www.youtube.com/watch?v=Q3GgbfGTnVc&t=313s