This section contains details on all the training courses offered by the ARCHER service.
To find out the dates and locations of upcoming courses, please see the Training pages.
If you would like us to run one of these courses at a particular location, or would like training not covered by the courses below, please contact us via the ARCHER Helpdesk.
We have classified the ARCHER courses into 3 levels:
- Introductory: Requiring no substantial programming skills or knowledge of HPC; these courses only assume basic computer literacy.
- Intermediate: These require some existing knowledge, for example the ability to program in C or Fortran, or experience of running parallel applications on HPC systems.
- Advanced: These require an existing knowledge of parallel programming.
Outline Course Descriptions
Introductory (level 1) courses
In many domains of research the rapid generation of large amounts of data is fundamentally changing how research is done. The deluge of data presents great opportunities, but also many challenges in managing, analysing and sharing data. Data Carpentry aims to teach the skills that will enable researchers to be more effective and productive. The workshop is designed for learners with little to no prior knowledge of programming, shell scripting, or command line tools.
Better and more effective approaches to managing digital research data are becoming increasingly important in computational science and beyond. The scientific data sets that underpin research papers can occupy many gigabytes of storage, and are increasingly complex and challenging to work with. This course introduces students to the ideas, methods and techniques of modern digital research data management. It covers: managing and moving research data; data formats; metadata; persistence, preservation and provenance of research data; licensing, copyright and access rights.
This course provides an introduction to HPC, using ARCHER for the practical sessions. The first day covers the basic concepts underlying the drivers for HPC development, HPC hardware, software, programming models and applications. The second day provides an opportunity for more practical experience, information on performance and the future of HPC. This foundation gives attendees the ability to appreciate the relevance of HPC to their own work and equips them with the tools to start making effective use of HPC facilities themselves. This course now includes material from the previous "Introduction to ARCHER" course that we have run eight times since the start of the service. From 2015, the "Hands-on Intro to HPC" course will be structured so that those who are already familiar with HPC and only require ARCHER-specific training can attend just the second day. On completion of the course, we expect that attendees will be in a position to undertake the ARCHER Driving Test, and potentially qualify for an account and CPU time on ARCHER.
This course provides an introduction to Fortran 90/95, which contains many powerful features that make it a suitable language for programming scientific, engineering and numerical applications. Topics include: fundamentals, program control, subprograms, variables, input and output, arrays.
Python is becoming popular in all areas of scientific computing due to its flexibility and the availability of many useful libraries. It is increasingly being used for HPC, with typical use cases being: pre- and post-processing of simulation data; coordinating or gluing together existing pieces of code written in Fortran or C; entire applications parallelised using mpi4py. In this course we give an overview of the aspects of Python most relevant to HPC, and cover the pros and cons of typical use cases. The goal is to give guidance on using the many attractive features of Python without sacrificing performance.
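As a small sketch of the performance point above (assuming NumPy is available; the function names here are illustrative, not from the course materials): pure-Python loops run in the interpreter and are slow for large arrays, whereas a vectorised NumPy expression performs the same work in compiled code.

```python
import numpy as np

def sum_of_squares_loop(values):
    """Sum of squares via a pure-Python loop (slow for large inputs)."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_of_squares_vectorised(values):
    """The same reduction as a single NumPy operation, which runs
    in compiled code rather than the Python interpreter."""
    arr = np.asarray(values, dtype=float)
    return float(np.dot(arr, arr))

data = [0.5] * 1000
# Both give 250.0; on large arrays the vectorised form is far faster.
assert abs(sum_of_squares_loop(data) - sum_of_squares_vectorised(data)) < 1e-9
```

The design point is general: keeping the heavy numerical work inside library calls is usually what makes Python viable for HPC workloads.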
Software Carpentry's goal is to help scientists and engineers become more productive by teaching them basic computing skills like program design, version control, testing, and task automation. In this two-day workshop, short tutorials will alternate with hands-on practical exercises. Participants will be encouraged both to help one another, and to apply what they have learned to their own research problems during and between sessions.
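As a minimal illustration of one of the skills mentioned above, automated testing (the exact tools and examples used in a given workshop will vary; this function is purely hypothetical):

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (temp_f - 32) * 5 / 9

# A handful of automated checks: re-running these after every change
# catches regressions immediately, instead of during later analysis.
def test_freezing_point():
    assert fahrenheit_to_celsius(32) == 0

def test_boiling_point():
    assert fahrenheit_to_celsius(212) == 100

test_freezing_point()
test_boiling_point()
```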
Intermediate (level 2) courses
Data Analytics, Data Science and Big Data are just a few of the many terms used in business and academic research, all referring to the manipulation, processing and analysis of data. Fundamentally, these are all concerned with the extraction of knowledge from data that can be used for competitive advantage or to provide scientific insight. In recent years, this area has undergone a revolution in which HPC has been a key driver. This course provides an overview of data science and the analytical techniques that form its basis, as well as exploring how HPC provides the power that has driven their adoption. The course will cover: key data analytical techniques, such as classification, optimisation, and unsupervised learning; key parallel patterns, such as Map Reduce, for implementing analytical techniques; relevant HPC and data infrastructures; case studies from academia and business.
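As a minimal sketch of the Map Reduce pattern mentioned above (illustrative only; production systems use frameworks such as Hadoop or Spark to distribute these phases across many nodes), a word count can be written as a map phase that counts words per document and a reduce phase that merges the partial counts:

```python
from collections import Counter
from functools import reduce

def map_phase(document):
    """Map: produce a (word -> count) table for a single document.
    Each document could be processed on a different worker."""
    return Counter(document.lower().split())

def reduce_phase(counts_a, counts_b):
    """Reduce: merge two partial count tables by summing per key."""
    return counts_a + counts_b

documents = ["the quick brown fox", "the lazy dog", "the fox"]
# The reduce step combines the per-document results pairwise.
word_counts = reduce(reduce_phase, map(map_phase, documents), Counter())
# word_counts["the"] == 3, word_counts["fox"] == 2
```

Because the reduce operation is associative, the merges can themselves be performed in parallel, which is what makes the pattern scale.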
As parallel codes grow ever larger and run on an increasing number of cores, naive methods for debugging and profiling cease to be practical. For large applications it is essential to use tools to aid in debugging for correctness, and in profiling for identification of performance hotspots. This hands-on course covers the use of the portable DDT debugger, instrumentation with Score-P and performance analysis with Scalasca and Vampir.
This course introduces participants to three popular European materials modelling packages - CASTEP, GPAW and CP2K - all of which have been widely used on HPC platforms in the UK and Europe. The course covers the basic theory implemented in each of the codes, as well as instruction on how to use the functionality in each package. Participants undertake tutored practical exercises with each of the codes, in order to gain hands-on experience. Participants can expect to gain enough experience to decide which code is best suited to their particular applications, and the ability to run calculations of moderate complexity using ARCHER.
This course covers all the basic knowledge required to write parallel programs using the Message Passing programming model, which is the method primarily used to run applications on the world's largest supercomputers and is directly applicable to almost every parallel computer architecture. The course uses the de facto standard for message passing, the Message Passing Interface (MPI). It covers point-to-point communication, non-blocking operations, derived datatypes, virtual topologies, collective communication and general design issues.
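The essence of the message-passing model can be sketched in stdlib Python, using thread-safe queues to stand in for MPI channels (this is an illustration of the send/receive idea only, not MPI itself, whose actual API the course teaches): each worker owns private data and interacts with others only through explicit sends and receives.

```python
import threading
import queue

results = {}

def worker(rank, inbox, outbox):
    """Each 'process' has private memory and communicates only by
    explicit send (put) and receive (get) operations."""
    if rank == 0:
        outbox.put([1, 2, 3, 4])   # rank 0 sends a message...
        results[0] = inbox.get()   # ...then blocks awaiting the reply
    else:
        data = inbox.get()         # rank 1 receives the message...
        outbox.put(sum(data))      # ...and sends back a reduction

q01, q10 = queue.Queue(), queue.Queue()  # one channel per direction
t0 = threading.Thread(target=worker, args=(0, q10, q01))
t1 = threading.Thread(target=worker, args=(1, q01, q10))
t0.start(); t1.start(); t0.join(); t1.join()
# results[0] == 10
```

In real MPI the queues become `MPI_Send`/`MPI_Recv` calls between separate processes, potentially on different nodes, but the programming discipline is the same.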
This course is a precursor to the "Programming the Xeon Phi" course being run in collaboration with DiRAC. It will contain material selected from the "Hands-on Introduction to HPC" and "Shared-Memory Programming with OpenMP" courses designed to give attendees a general introduction to the fundamental concepts of programming multicore processors before going on to the specifics of programming the Xeon Phi.
Software development is of key importance for the effective use of HPC facilities. This course looks at the tools and techniques important for enhancing the efficiency of the software engineering process from initial design to final testing. The focus is on the application of practical techniques allowing students to effectively develop, test and maintain high-quality, portable code. The course covers valuable software skills which are vital in the fields of computational science and engineering.
Almost all modern computers have a shared-memory architecture with multiple cores connected to the same physical memory. This course covers OpenMP, the industry standard for shared-memory programming, which enables programs to be parallelised easily using compiler directives. The course covers an introduction to the fundamental concepts of the shared variables model, followed by the syntax and semantics of OpenMP and how it can be used to parallelise real programs.
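The shared variables model can be sketched in stdlib Python (a conceptual illustration only; OpenMP itself uses compiler directives in C, C++ or Fortran, not this API): all threads read and write the same memory directly, so updates to shared data must be synchronised.

```python
import threading

total = 0                        # shared variable, visible to all threads
lock = threading.Lock()

def partial_sum(chunk):
    """Each thread sums its own chunk privately, then updates the
    shared result; the lock plays the role of a critical region."""
    global total
    local = sum(chunk)           # private work needs no synchronisation
    with lock:                   # shared update must be protected
        total += local

data = list(range(1, 101))       # 1 + 2 + ... + 100 = 5050
threads = [threading.Thread(target=partial_sum, args=(data[i::4],))
           for i in range(4)]    # four threads, interleaved chunks
for t in threads: t.start()
for t in threads: t.join()
# total == 5050
```

In OpenMP the same computation would be a single parallel loop with a reduction clause; the compiler generates the thread management and synchronisation that is written out by hand here.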
This is an extended version of Shared-Memory Programming with OpenMP, delivered as part of the CDT programme at Southampton. In addition to covering OpenMP, other threaded programming models such as Java threads and Pthreads will be covered.
Advanced (level 3) courses
Applications must be adapted to utilise GPUs: most lines of source code are executed on the CPU and key computational kernels are distributed to the GPU cores. For NVIDIA GPUs, the most popular method is CUDA, which is powerful but requires significant development effort. OpenCL is an alternative which has portability advantages. Recently, a new higher-level standard has emerged, OpenACC, which uses compiler "directives". In this course we introduce and provide hands-on experience of CUDA, OpenCL (with more emphasis on the former) and OpenACC. In many cases it can be difficult to obtain good performance, so we will cover a range of common GPU optimisation techniques.
This course is aimed at programmers seeking to deepen their understanding of MPI and explore some of its more recent and advanced features. We cover topics including communicator management, non-blocking and neighbourhood collectives, MPI-IO, single-sided MPI and the new MPI memory model. We also look at performance aspects such as which MPI routines to use for scalability, overlapping communication and calculation and MPI internal implementation issues.
This course is aimed at programmers seeking to deepen their understanding of OpenMP and explore some of its more recent and advanced features. We cover topics including nested parallelism, OpenMP tasks, the OpenMP memory model, performance tuning, hybrid OpenMP + MPI, OpenMP implementations, and upcoming features in OpenMP 4.0.
One of the greatest challenges to running parallel applications on large numbers of processors is how to handle file IO. The IO part of the MPI standard gives programmers access to efficient parallel IO in a portable fashion, but there are a large number of different routines available and some can be difficult to use in practice. Despite its apparent complexity, MPI-IO adopts a very straightforward high-level model. If used correctly, almost all the complexities of aggregating data from multiple processes can be dealt with automatically by the library. The first day of the course will cover the MPI-IO standard, developing IO routines for a regular domain decomposition example. It will also briefly cover higher-level standards such as HDF5 and NetCDF. The second day will concentrate on ARCHER, covering how to configure the Lustre file system for best performance and how to tune the Cray MPI-IO library. Case studies from real codes will also be presented.
This course aims to address application performance, which is one of the key requirements for HPC applications. Performance is a difficult requirement to satisfy because issues affecting performance often vary between different hardware and software environments. This requires performance issues to be frequently revisited as the hardware and software environment changes. Furthermore, performance programming requires detailed knowledge of the underlying environment, and the design decisions necessary to achieve good performance are often in conflict with other desirable properties of the program. After taking this course students should have a good practical understanding of the general issues and methodologies associated with designing, building and refactoring codes to meet performance requirements. In addition they will have an overview of a number of subjects that are important in the understanding of performance on current systems.
Partitioned Global Address Space (PGAS) languages such as Unified Parallel C (UPC) and Fortran Coarrays have been the subject of much attention in recent years, in particular due to the exascale challenge. There is a widespread belief that existing message-passing approaches such as MPI will not scale to this level due to issues such as memory consumption and synchronisation overheads. PGAS approaches offer a potential solution as they provide direct access to remote memory. This reduces the need for temporary memory buffers, and may allow for reduced synchronisation and hence improved message latencies. This course covers how the PGAS model is implemented in C (via UPC) and Fortran (via coarrays), and also how to use the OpenSHMEM library for single-sided communication.
The Xeon Phi is Intel's recently released accelerator, which is attracting a lot of interest as it has a familiar x86 CPU architecture as opposed to the radically different architectures of GPUs. This course will cover the Xeon Phi architecture and how to use the Intel compiler and associated tools to exploit its full computational potential using a variety of offload models. It will also cover issues relating to using the Xeon Phi as part of a larger HPC system.