HPC 101: Introduction to High Performance Computing
0.1.0
Prerequisite
Learning Outcomes
Course Details
Session 0
Session 1
Session 2
Session 3
Session 4
Session 5
GPU Parallelism in HPC
Memory Management
Streams
Vectorization in GPU
Reduction in GPU
Distributed Parallelism
Exercise 8
Exercise 9
Reference
Contributors
Session 5
GPU Parallelism in HPC
Kernel Function
Device Functions
Thread Indexing
Memory Management
Streams
Vectorization in GPU
How the Return Works
Reduction in GPU
Distributed Parallelism
Overview Diagram
Step-by-Step Code Explanation
Initialize MPI
Define Problem Size
Create Arrays on Root Process
Allocate Buffers for Chunks
Distribute the Work (Scatter)
Perform Computation
Gather Results to Root
Print Final Result
Summary Table
Exercise 8
Launch Jupyter Notebook
Exercise 9