HPC 101: Introduction to High Performance Computing
0.1.0

Session 5

  • GPU Parallelism in HPC
    • Kernel Function
    • Device Functions
    • Thread Indexing
  • Memory Management
  • Streams
  • Vectorization in GPU
    • How the Return Works
  • Reduction in GPU
  • Distributed Parallelism
    • Overview Diagram
    • Step-by-Step Code Explanation
      • Initialize MPI
      • Define Problem Size
      • Create Arrays on Root Process
      • Allocate Buffers for Chunks
      • Distribute the Work (Scatter)
      • Perform Computation
      • Gather Results to Root
      • Print Final Result
    • Summary Table
  • Exercise 8
    • Launch Jupyter Notebook
  • Exercise 9

© Copyright 2025, National Computational Infrastructure.