Repo containing blog demo materials for profiling Triton GPU kernels on:
- CUDA
- ROCm
Triton profiling introduction demo materials and runtime container images.
- NVIDIA GPU for the NVIDIA blog
- AMD GPU for the AMD blog
- Podman or Docker
- make
NOTE: As of 02-12-2026, ROCm's Compute profiler only runs on AMD CDNA GPUs (e.g. MI300X).
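
The software requirements above can be checked before building anything. The following is a minimal sketch, not part of this repo: `first_available` is a hypothetical helper name, and the script only tests whether the tools are on `PATH`, not whether a GPU or its driver is usable.

```shell
# Hypothetical helper (not provided by this repo): print the first of the
# given tools that is found on PATH, or return non-zero if none is found.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Podman or Docker: either container engine satisfies the requirement.
engine="$(first_available podman docker)" || echo "missing: podman or docker" >&2

# make drives every target described in this README.
first_available make >/dev/null || echo "missing: make" >&2

echo "container engine: ${engine:-none}"
```

Checking for Podman first is an arbitrary choice in this sketch; swap the argument order if Docker is preferred.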
Container images that provide a runtime environment for the target GPU:
- cuda - Requires an NVIDIA GPU
- nsight - Does not require an NVIDIA GPU
- rocm - Requires an AMD GPU (e.g. MI300X)
NOTE: The nsight target only provides an environment to run the Nsight tools.
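
One way to decide which image to build is to look at the vendor tooling installed on the host. The sketch below is an assumption-laden illustration rather than something this repo ships: `pick_image` is a hypothetical name, and the presence of `nvidia-smi` or `rocm-smi` on `PATH` is used only as a rough hint that the matching GPU and driver are present.

```shell
# Hypothetical helper (not provided by this repo): suggest an image name
# based on which vendor tool is found on PATH.
pick_image() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo cuda      # NVIDIA tooling found: the cuda image needs an NVIDIA GPU
    elif command -v rocm-smi >/dev/null 2>&1; then
        echo rocm      # AMD tooling found: the rocm image needs an AMD GPU
    else
        echo nsight    # no GPU tooling: nsight only provides the Nsight tools
    fi
}

echo "suggested target: make $(pick_image)-image"
```

The fallback to `nsight` reflects that it is the only image listed as not requiring a GPU; it does not make the notebook runnable on such a machine.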

    make cuda-image
    make nsight-image
    make rocm-image

All targets will start a new container and remove it when you exit.

Runs the target image and leaves the user inside it at a bash prompt.

    make [cuda | nsight]-console

The nsight image can only be used to view the Jupyter notebook; an NVIDIA GPU and the cuda image are required to run it.

    make [cuda | nsight]-jupyter

Runs the Nsight Systems UI.

    make [cuda | nsight]-systems

Runs the Nsight Compute UI.

    make [cuda | nsight]-compute

Runs the target image and leaves the user inside it at a bash prompt.

    make rocm-console

    make rocm-jupyter
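
The targets listed above follow an `<image>-<action>` naming pattern, which can be summarized per image. This is a small sketch for orientation only: `list_targets` is a hypothetical helper, it prints the commands without running them, and the action lists are taken from this README rather than read from the Makefile.

```shell
# Hypothetical helper (not provided by this repo): print the make commands
# this README documents for a given image name.
list_targets() {
    case "$1" in
        cuda|nsight) actions="image console jupyter systems compute" ;;
        rocm)        actions="image console jupyter" ;;
        *)           echo "unknown image: $1" >&2; return 1 ;;
    esac
    for action in $actions; do
        echo "make $1-$action"
    done
}

# Example: show the commands documented for the rocm image.
list_targets rocm
```

A typical session would run the `-image` command once to build, then one of the remaining commands to start a container.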