GPU Programming with CUDA

Graphics Processing Units (GPUs) were originally developed for computer gaming and other graphical tasks, but have for many years been exploited for general-purpose computing in a number of areas. They offer advantages over traditional CPUs because they have greater computational capability and use high-bandwidth memory systems (memory bandwidth is the main bottleneck for many scientific applications).


Kevin Stratford

Kevin has a background in computational physics and joined EPCC in 2001. He teaches on courses including 'Scientific Programming with Python' and 'GPU Programming with CUDA'.


Draft page based on previous run of this course. Some details may be subject to change.


This introductory course will describe GPUs, and the advantages they offer.

It will teach participants how to start programming GPUs, which cannot be used in isolation but are usually used in conjunction with CPUs.

Important issues affecting performance will be covered.

The course focuses on NVIDIA GPUs, and the CUDA programming language (an extension to C/C++ or Fortran). Please note the course is aimed at application programmers; it does not consider machine learning or any of the packages available in the machine learning arena.

Hands-on practical sessions are included.

You will need a laptop and your institutional credentials to connect to eduroam. The practical exercises will be run on a web-based system, so all you will need is a relatively recent web browser (Firefox, Chrome and Safari are known to work).

This course is free to all academics.


Provisional timetable based on previous run - may be subject to change.

Day 1

  • 10:00 Introduction
  • 10:20 GPU Concepts/Architectures
  • 11:00 Break
  • 11:20 CUDA Programming
  • 12:00 A first CUDA exercise
  • 13:00 Lunch
  • 14:00 CUDA Optimisations
  • 14:20 Optimisation Exercise
  • 15:00 Break
  • 15:20 Constant and Shared Memory
  • 16:00 Guest Lecture: Alan Gray (NVIDIA), Overview of NVIDIA Volta
  • 17:00 Close

Day 2

  • 10:00 Constant and Shared Memory
  • 10:10 Exercise
  • 10:30 OpenCL and Directives
  • 11:00 Break
  • 11:30 OpenCL and/or Directives Exercises
  • 13:00 Lunch
  • 14:00 Performance portability and Kokkos
  • 14:30 Exercise: Getting started with Kokkos patterns
  • 15:00 Break
  • 15:10 Kokkos memory management
  • 15:30 Memory management exercises
  • 16:00 Close


The course will be held at the University of Birmingham.


Please use the registration page to register for ARCHER courses.


If you have any questions, please contact the ARCHER Helpdesk.