High Performance Computing
CUDA, GPU computing, parallel workflows, and simulation infrastructure for large-scale scientific computation.
CUDA Programming
NVIDIA's parallel computing platform for GPU-accelerated scientific computing. Write kernels that execute across thousands of GPU threads simultaneously. Covers CUDA C/C++, memory hierarchy (global, shared, registers), warp execution, and stream-based concurrency.
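To illustrate how a kernel's work is spread across threads, the sketch below emulates in plain Python the index arithmetic a one-dimensional CUDA launch typically uses: a ceiling division to choose the number of blocks, and a global index computed from block index, block size, and thread index. It is not CUDA code and does not run on a GPU. The function names `launch_config` and `emulate_vector_add` and the default block size of 256 are assumptions made for illustration.

```python
def launch_config(n, threads_per_block=256):
    """Return (blocks, threads_per_block) so that blocks * threads_per_block >= n."""
    if n < 0 or threads_per_block < 1:
        raise ValueError("n must be >= 0 and threads_per_block >= 1")
    # Ceiling division: the last block may be only partly used.
    blocks = (n + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block


def emulate_vector_add(a, b, threads_per_block=256):
    """Sequentially emulate an element-wise add as a 1D grid of threads would do it."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    n = len(a)
    out = [0.0] * n
    blocks, tpb = launch_config(n, threads_per_block)
    for block_idx in range(blocks):        # plays the role of blockIdx.x
        for thread_idx in range(tpb):      # plays the role of threadIdx.x
            i = block_idx * tpb + thread_idx   # global thread index
            if i < n:                      # bounds guard for the partial last block
                out[i] = a[i] + b[i]
    return out
```

On a GPU the two loops disappear: every (block, thread) pair runs the loop body concurrently, which is why the bounds guard is needed whenever the element count is not a multiple of the block size.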
GPU Computing
Leveraging GPU architectures (NVIDIA Ampere, Ada Lovelace, Hopper) for massively parallel workloads. Topics include GPU memory management, kernel optimisation, occupancy analysis, and profiling with NVIDIA Nsight.
Parallel Workflows
Multi-node distributed computing with MPI (Message Passing Interface), shared-memory parallelism with OpenMP, and hybrid MPI+OpenMP approaches. Also covers task scheduling, load balancing, and workflow orchestration for cluster environments.
Performance & Benchmarks
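Standardised benchmarks for evaluating computational performance: LINPACK, HPCG, micromagnetic standard problems, and custom VELSTROM Earth-science workloads. Scaling analysis covers strong scaling (fixed problem size), weak scaling (problem size grows with the number of processors), and the speedup limit given by Amdahl's law.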
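As a rough illustration of the scaling quantities named above, the following Python sketch computes the Amdahl's law speedup and the usual strong- and weak-scaling efficiencies from measured run times. The function names and argument conventions are assumptions for this example and are not taken from any VELSTROM benchmark script.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup when only `parallel_fraction` of the run time is parallelisable."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def strong_scaling_efficiency(t_one, t_n, workers):
    """Fixed total problem size: efficiency = T(1) / (N * T(N))."""
    return t_one / (workers * t_n)


def weak_scaling_efficiency(t_one, t_n):
    """Fixed problem size per worker: efficiency = T(1) / T(N)."""
    return t_one / t_n


if __name__ == "__main__":
    # Hypothetical parallel fraction of 0.95, chosen only for demonstration.
    for n in (1, 8, 64, 512):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

With a parallel fraction of 0.95 the serial 5% caps the speedup at 20 no matter how many workers are added, which is why strong-scaling curves flatten at high processor counts.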
Simulation Resources
Tutorials
Getting started with CUDA, writing your first GPU kernel, MPI basics, and hybrid parallelism patterns.
Benchmarks
Performance baselines for VELSTROM workloads across GPU generations. Reproducible benchmark scripts and result databases.
Reference Models
Pre-configured simulation templates for common HPC patterns: domain decomposition, particle-in-cell, spectral methods, and Monte Carlo.
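As one example of the patterns listed, domain decomposition in its simplest one-dimensional form splits a range of grid cells into contiguous blocks, one per process. The Python sketch below shows one common way to do this so that block sizes differ by at most one cell. The function `block_range` is hypothetical and is not part of the reference models themselves.

```python
def block_range(n_cells, n_ranks, rank):
    """Return the half-open cell range [start, stop) owned by `rank`.

    The first `n_cells % n_ranks` ranks receive one extra cell, so the
    load differs by at most one cell between any two ranks.
    """
    if n_ranks < 1 or not 0 <= rank < n_ranks:
        raise ValueError("need n_ranks >= 1 and 0 <= rank < n_ranks")
    base, extra = divmod(n_cells, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Ten cells over three ranks, sizes chosen only for demonstration.
    for r in range(3):
        print(r, block_range(10, 3, r))
```

In an MPI program each process would call such a function with its own rank to find the slab it owns, then exchange boundary (halo) cells with its neighbours.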