HPC

High Performance Computing

CUDA, GPU computing, parallel workflows, and simulation infrastructure for large-scale scientific computation.

CUDA Programming

NVIDIA's parallel computing platform for GPU-accelerated scientific computing. Write kernels that execute across thousands of GPU threads simultaneously. Covers CUDA C/C++, memory hierarchy (global, shared, registers), warp execution, and stream-based concurrency.

CUDA C/C++cuBLAScuFFTcuRANDNVCC

GPU Computing

Leveraging GPU architectures (NVIDIA Ampere, Ada Lovelace, Hopper) for massively parallel workloads. Topics include GPU memory management, kernel optimisation, occupancy analysis, and profiling with NVIDIA Nsight.

NVIDIA GPUsMemory CoalescingOccupancyNsight Profiling

Parallel Workflows

Multi-node distributed computing using MPI (Message Passing Interface), OpenMP for shared-memory parallelism, and hybrid MPI+OpenMP approaches. Task scheduling, load balancing, and workflow orchestration for cluster environments.

MPIOpenMPHybrid ParallelismSLURMPBS

Performance & Benchmarks

Standardised benchmarks for evaluating computational performance: LINPACK, HPCG, micromagnetic standard problems, and custom VELSTROM Earth-science workloads. Scaling analysis (strong and weak scaling) and Amdahl's law.

LINPACKStrong ScalingWeak ScalingRoofline Model

Simulation Resources

Tutorials

Getting started with CUDA, writing your first GPU kernel, MPI basics, and hybrid parallelism patterns.

Under Development

Benchmarks

Performance baselines for VELSTROM workloads across GPU generations. Reproducible benchmark scripts and result databases.

Under Development

Reference Models

Pre-configured simulation templates for common HPC patterns: domain decomposition, particle-in-cell, spectral methods, and Monte Carlo.

Working towards it