CRAY Inc.

Cray Centre of Excellence for ARCHER

The Cray Centre of Excellence for ARCHER forms part of the ARCHER service and has a mission to engage with the ARCHER community to maximise the use of Cray technologies. In recent years we have concentrated on projects likely to bring the greatest benefit to large groups of users.

The Centre comprises two dedicated staff with many years' combined experience in HPC applications, software and hardware. These staff are augmented by other Cray staff who can be engaged on particular projects as appropriate. Within Cray, the Centre of Excellence staff are part of the wider Cray EMEA Research Lab (CERL). The CERL research interests include high-performance data analytics, I/O and memory hierarchy, data-centric analysis and optimisation, large-scale energy distribution system optimisation, application infrastructure software engineering and experimental system design. CERL is also involved in EU H2020 projects (for example Maestro and EPIGRAM-HS) and in European training network programmes. Please get in contact via the CERL site if you wish to discuss a project that falls outside the focus of the ARCHER CoE.

The CoE offers access to significant experience, ranging from expertise in particular application areas to insights into new technology directions.

The team is heavily involved in international conferences and workshops, having presented tutorials at SC13 and SC14 and been a finalist for best paper at CUG14.

The CoE team has been instrumental in recent developments to important HPC applications and technology areas such as the use of Python in HPC, power monitoring on Cray systems, UM (climate modelling), CASTEP (materials modelling) and OpenFOAM (CFD).

The Centre is available to augment the support offered by the helpdesk; this could cover helping users with porting applications or optimising performance and scaling on the Cray ARCHER system. The Centre is always interested in hearing from current or potential users of ARCHER who feel they could benefit from Centre of Excellence assistance.

To contact us, please email the ARCHER helpdesk at support@archer.ac.uk and mark your message FAO Cray Centre of Excellence.

Projects

ARCHER is used for a wide variety of scientific applications and the Cray Centre of Excellence (CoE) engages with users from many different fields. As well as supporting users day-to-day on the Cray XC30, the Centre of Excellence is available to assist with a variety of user projects.

In each of these projects we aim to bring best practice in software development together with a style and structure that fits comfortably into the existing software. The projects can consist of developments and optimizations to user code, or detailed interaction on the use of Cray tools (Cray compiler, performance tools, MPI, etc.) geared towards the particular project. In the past some projects have involved interaction or collaboration with Cray R&D.

Some examples of past Cray projects can be found in the following case studies that cover activities by Cray CoE and Research Staff based in Edinburgh:

We recently started a major initiative focusing on I/O observation and optimization. This fell into three main areas:

  • Investigation of techniques to improve application I/O
  • Development of an I/O analysis framework
  • Non-invasive application profiling

This activity was targeted as the best use of effort to bring maximum benefit to most users. As a possible way to improve application I/O, we have studied the use of ADIOS as a way to introduce parallel APIs and advanced optimizations, along with XIOS, an I/O server used in the NWP/Climate community. We made available an optimization tool that can improve the small-file I/O of applications, in particular OpenFOAM. Making real advances in I/O optimization requires a more holistic approach, in some cases one based on workflows, and some of our ideas are progressing within the Cray EMEA Research Lab in EU-funded research projects.

The second major activity has been the development of LASSi, a new tool which analyses application I/O by combining I/O statistics from the Lustre filesystem with application information from the ALPS scheduler logs. Using this tool we are able to understand the I/O behaviour of applications, resolve issues in the filesystem faster and have a better understanding of the load on the filesystems at any time. We have also developed profiling tools that can provide an I/O profile of an application with no application changes. Used together with the I/O profiles we gain from the Lustre filesystem, these give us the tools we need to investigate application I/O.
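
As a rough illustration of the general idea of combining filesystem statistics with scheduler information, the Python sketch below joins per-job I/O samples to job records and totals the bytes read and written per application. It is not LASSi's implementation: the record layouts, field names (JobRecord, IoSample, io_by_application) and the sample values are all hypothetical, and real Lustre statistics and ALPS logs would first need parsing into a comparable form.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """Hypothetical scheduler-log entry: which application a job ran, and when."""
    job_id: str
    app_name: str
    start: int  # seconds since an arbitrary epoch
    end: int


@dataclass(frozen=True)
class IoSample:
    """Hypothetical filesystem statistics sample attributed to a job."""
    job_id: str
    time: int
    bytes_read: int
    bytes_written: int


def io_by_application(jobs, samples):
    """Sum read/write bytes per application by joining samples to jobs on job_id.

    Samples whose job is unknown, or whose timestamp falls outside the job's
    run window, are collected under the key "unattributed".
    """
    jobs_by_id = {job.job_id: job for job in jobs}
    totals = defaultdict(lambda: {"read": 0, "written": 0})
    for sample in samples:
        job = jobs_by_id.get(sample.job_id)
        if job is not None and job.start <= sample.time <= job.end:
            key = job.app_name
        else:
            key = "unattributed"
        totals[key]["read"] += sample.bytes_read
        totals[key]["written"] += sample.bytes_written
    return dict(totals)


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    jobs = [JobRecord("101", "app_a", 0, 100), JobRecord("102", "app_b", 50, 200)]
    samples = [
        IoSample("101", 10, 4096, 0),
        IoSample("101", 60, 0, 8192),
        IoSample("102", 70, 1024, 1024),
        IoSample("999", 80, 512, 0),  # no matching job record
    ]
    for app, totals in sorted(io_by_application(jobs, samples).items()):
        print(app, totals)
```

Keeping unmatched samples in a separate "unattributed" bucket, rather than discarding them, is a design choice in this sketch so that total filesystem load remains visible even when attribution fails.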

Meetings

  • The CoE attends various events in order to engage with the wider ARCHER user community. Examples in recent years include consortia-related meetings, the ARCHER Champions meetings, the Computing Insight UK meetings, ECMWF workshops and the UK RSE workshop.
  • The CoE is represented on the eCSE panel meetings as technical reviewer and advisor.
  • The CoE attends the quarterly ARCHER scientific advisory committee (SAC) meetings.

Workshops

  • The CoE provided training material for the ARCHER Virtual Tutorial on preparing people for submitting an eCSE proposal
  • The CoE presented the following ARCHER Virtual Tutorials/Webinars:
    • Not So Old Fortran
    • ARCHER Programming Environment Updates (Feb 2016, May 2018)
    The tutorials/webinars are available here
  • The CoE participated in the PRACE IO optimization workshop at Daresbury Laboratory, the Efficient Parallel I/O on ARCHER course at Durham, the Porting and I/O Optimization workshop around CUG (2016), and the Advanced OpenMP Course (Bristol, May 2016).
  • The ARCHER CoE gave a Reveal Tool demonstration at the Multicore Challenge 2014 Conference
  • The ARCHER CoE organized a workshop "Is the programming environment ready for hybrid supercomputers?" as part of the PARCO 2015 conference held in Edinburgh in September 2015.
  • CoE staff supported various ARCHER Hands-On Porting and Optimization Workshops; the most recent generic workshops were held at Imperial College London (May 2016) and in Birmingham (April 2017).