Implementation and optimisation of advanced solvent modelling functionality in CASTEP and ONETEP


Key Personnel

PI/Co-Is: Chris-Kriton Skylaris - University of Southampton, Matt Probert - University of York, Jacek Dziedzic - University of Southampton, Lucian Anton - Cray

Technical: James Womack - University of Southampton

Relevant Documents

eCSE Technical Report: Implementation and optimisation of advanced solvent modelling functionality in CASTEP and ONETEP

Project summary

Implicit solvent models provide a simple, yet accurate means to incorporate solvent effects into electronic structure calculations. Such models avoid the computational expense of explicitly modelling solvent molecules by representing the solvent implicitly, for example as a polarizable dielectric medium. In this project we have implemented one such model, the minimal parameter solvation model (MPSM), in two electronic structure packages: CASTEP and ONETEP. In the MPSM, the electrostatic potential which describes the interaction of the implicit solvent and solute is determined by direct solution of the nonhomogeneous Poisson equation (NPE), with a dielectric permittivity derived directly from the quantum mechanical electron density.
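For orientation, the equation being solved can be written out. This is a sketch only: the NPE is given in atomic units, and the smooth density-dependent permittivity shown (the form associated with Fattebert and Gygi) is an assumption about the functional form, since this summary does not state it. The symbols \rho_0 (density threshold), \beta (smoothness parameter) and \epsilon_\infty (bulk solvent permittivity) are likewise introduced here for illustration.

```latex
% Nonhomogeneous Poisson equation for the total electrostatic potential \phi
\nabla \cdot \bigl( \epsilon[\rho](\mathbf{r}) \, \nabla \phi(\mathbf{r}) \bigr)
  = -4\pi \, \rho_{\mathrm{tot}}(\mathbf{r})

% Assumed smooth permittivity as a function of the electron density \rho
\epsilon[\rho](\mathbf{r}) = 1 + \frac{\epsilon_\infty - 1}{2}
  \left[ 1 + \frac{1 - \bigl(\rho(\mathbf{r})/\rho_0\bigr)^{2\beta}}
                  {1 + \bigl(\rho(\mathbf{r})/\rho_0\bigr)^{2\beta}} \right]
```

With this form the permittivity tends to 1 inside the solute, where the density is high, and to \epsilon_\infty in the bulk solvent, where the density vanishes.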

The MPSM was previously implemented in ONETEP, a linear scaling density functional theory (DFT) package. In this implementation, an efficient second-order multigrid solver, DL_MG, was employed to solve the NPE with open (Dirichlet) boundary conditions (BCs). As part of this project, the model and software were extended to support periodic BCs, allowing the simulation of systems with natural periodicity, such as surfaces and polymers, using the MPSM. Test calculations were performed which demonstrate the validity of our implementation.

To improve the accuracy of the second-order solutions produced by DL_MG for use in the MPSM, the high-order defect correction method is employed. During this project we ported the original implementation of this method in ONETEP to DL_MG, allowing the solver to directly produce higher-order-corrected solutions to the NPE. We also optimized the implementation, demonstrating a 1.2-1.5x speed-up over the original implementation in large-scale DFT calculations.
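The defect correction idea can be illustrated on a toy problem. The following is a minimal sketch, not the DL_MG or ONETEP implementation. It assumes a 1-D Poisson problem with zero Dirichlet boundaries, a direct tridiagonal solve standing in for the second-order multigrid solver, a fourth-order stencil standing in for the high-order operator, and odd-reflection ghost values that are valid for the sine test function chosen here. All function names are hypothetical.

```python
import math


def solve_second_order(rhs, h):
    """Solve -u'' = rhs with u = 0 at both ends using the 3-point stencil.

    Stand-in for the second-order solver; rhs holds interior values only.
    Uses the Thomas algorithm on (-u[i-1] + 2u[i] - u[i+1]) / h^2 = rhs[i].
    """
    n = len(rhs)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] * h * h / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] * h * h + d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u


def apply_fourth_order(u, h):
    """Apply the 5-point, fourth-order approximation of -u'' to interior values.

    Boundary values are zero; ghost values one step outside the domain are set
    by odd reflection (an assumption that suits the sine test problem below).
    """
    padded = [-u[0], 0.0] + list(u) + [0.0, -u[-1]]
    out = []
    for j in range(len(u)):
        k = j + 2
        out.append(
            (padded[k - 2] - 16.0 * padded[k - 1] + 30.0 * padded[k]
             - 16.0 * padded[k + 1] + padded[k + 2]) / (12.0 * h * h)
        )
    return out


def defect_corrected_solve(f, h, iterations=3):
    """Refine a second-order solution towards the fourth-order discretisation."""
    u = solve_second_order(f, h)
    for _ in range(iterations):
        high_order = apply_fourth_order(u, h)
        # Defect: residual of the current solution under the high-order operator
        defect = [fi - ai for fi, ai in zip(f, high_order)]
        # Correction: solve the cheap second-order problem for the defect
        correction = solve_second_order(defect, h)
        u = [ui + ei for ui, ei in zip(u, correction)]
    return u


# Test problem: -u'' = pi^2 sin(pi x) on [0, 1], exact solution u = sin(pi x)
n = 31
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
exact = [math.sin(math.pi * xi) for xi in x]
f = [math.pi ** 2 * v for v in exact]

u2 = solve_second_order(f, h)
u_dc = defect_corrected_solve(f, h)
err2 = max(abs(a - b) for a, b in zip(u2, exact))
err_dc = max(abs(a - b) for a, b in zip(u_dc, exact))
print("second-order error:   ", err2)
print("defect-corrected error:", err_dc)
```

The point of the design is that only the low-order operator is ever inverted. The high-order operator is merely applied to form the defect, which is why the correction can sit on top of an existing second-order solver.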

The full MPSM was implemented in CASTEP, a state-of-the-art plane-wave-pseudopotential DFT code, using DL_MG to solve the NPE. This involved some modifications to the behaviour of the model compared to the implementation in ONETEP, to account for the differences in the theoretical methods employed by CASTEP. We have tested the CASTEP implementation on several small molecules, obtaining free energies of solvation which differ from ONETEP by approximately 0.1 kcal/mol or less.

The solvent modelling capabilities of ONETEP have already enabled significant scientific work, including the development of models for biomolecular association and interaction and a method for the prediction of the effect of solvent on optical transitions in molecules. It is anticipated that the addition of these capabilities to CASTEP will enable an even wider community of researchers to apply the model to a variety of chemical systems and to develop new methodologies which make use of the model.

The extension of DL_MG to support the high-order defect correction significantly increases the capabilities of the solver library. It has been demonstrated that the defect correction can dramatically improve the accuracy of solutions to the NPE over a second-order solution. Including this directly within the solver library reduces the burden on software developers who wish to use DL_MG in their own software packages, but require greater than second-order accuracy — DL_MG can now provide high-order corrected solutions directly, without the need for an external correction scheme. This is already enabling collaborative work involving the application of DL_MG in electronic structure codes other than CASTEP and ONETEP.

The integration of DL_MG into CASTEP provides additional capabilities beyond the modelling of solvent. Using the multigrid solver, CASTEP can, for the first time, perform calculations with open BC electrostatics, enabling the treatment of isolated systems without the need to correct for the use of periodic BCs. The ability to solve the NPE for an arbitrary dielectric function also opens the door to embedding in non-solvent environments — another first for CASTEP. We anticipate that these additional capabilities will be of great utility for CASTEP users in a variety of contexts.

Achievement of objectives

  1. Enable the CASTEP user community to do simulations in the presence of pure and saline solvent
    • The full ONETEP solvent model will be implemented in CASTEP, adapted to its specific data layout and parallel compute model.

      The complete implicit solvation model was successfully implemented in CASTEP, enabling fully open boundary condition implicit solvent calculations with a fixed dielectric cavity. However, the multigrid solver and the computation of boundary conditions currently run on only a single MPI rank, due to difficulties encountered with CASTEP's non-contiguous real-space data distribution.

    • Accuracy validation

      Solvation energies computed using the solvent model in ONETEP and CASTEP demonstrate excellent numerical agreement. For the small molecules tested (neutral and charged), the free energies of solvation computed by ONETEP and CASTEP differed by approximately 0.1 kcal/mol or less.

    • Commit the solvent model capability to the CASTEP main branch.

      The solvent model capability is committed to the CASTEP development repository and will be merged into the main branch once the details of the user interface are finalised.

    • Demonstrate calculations with more than 1000 cores on ARCHER using CASTEP with parallel performance in solvent equivalent to that of standard CASTEP calculations in vacuum.

      This was not possible due to the issues encountered with CASTEP's real-space density distribution which made a full MPI-parallel implementation infeasible during this project. Work is currently underway as part of a PhD project to address the issue of non-contiguous real-space data, which should allow a full MPI-parallel implementation.

  2. Enable simulations with generalised boundary conditions in the solvent model in ONETEP and CASTEP
    • The solvent model implementation will be extended to periodic and mixed (periodic in some directions and open in others) boundary conditions.

      Support for fully periodic boundary condition (BC) solvent calculations was implemented in ONETEP. Support for calling the multigrid solver with open, periodic and mixed BCs was added to ONETEP. However, mixed BC support was not added to the solvent model. This would require work which is beyond the scope of this project.

    • Commit the extensions to the ONETEP and CASTEP main branches.

      The extension of the solvent model and multigrid solver interface will be merged into the main branch in the near future. The CASTEP solvent model was not extended to support periodic and mixed BCs as this was not feasible during the timescale of this project.

    • Demonstrate comparable calculation speed and parallel scaling with the open boundary condition model on test calculations of surfaces (slabs) in contact with solvent which are periodic in 2 dimensions and open in the third dimension.

      Mixed BC support was not implemented, so this test was not possible.

    • Accuracy validation: Demonstrate solvation energy calculation of molecule in periodic solvent simulation cells.

      The consistency of the treatment of periodic BCs was demonstrated in calculations with small molecules located in different positions in the simulation cell. As expected, the free energy of solvation calculated with the molecule at the centre of the cell differed negligibly from the energy with the molecule positioned straddling cell boundaries. Free energies of solvation for graphene sheets with different unit cell sizes were computed in periodic BCs, and the per-atom solvation energy was shown to be effectively independent of cell size.

  3. Improvements to DL_MG code
    • Transfer to the DL_MG code algorithms currently implemented in ONETEP that can be abstracted from the solvent model in order to avoid duplication of code between ONETEP and CASTEP.

      The defect correction code from ONETEP was successfully ported to DL_MG.

    • Implement these algorithms in DL_MG more efficiently than in the current ONETEP solvent model.

      The defect correction code in DL_MG was modified to support generalized periodic and mixed BCs and a general 3-D parallel decomposition of data. The halo communication and high-order finite difference components of the defect correction code in DL_MG were optimized.

    • Commit all improvements to the DL_MG main branch.

      The defect correction code was committed to the DL_MG development repository and will be incorporated into the next major release.

    • Demonstrate solvation calculations with ONETEP with better parallel performance than the original ONETEP solvent model at all core counts.

      A 1.2-1.5x speedup was obtained for the combined defect correction and multigrid solver component of large scale ONETEP calculations, relative to the previous implementation. This was demonstrated over a range of core counts in vacuum and solvent.

  4. Dissemination of solvent model to CASTEP and ONETEP communities
    • Include presentations and training exercises for the solvent model in the CASTEP and ONETEP summer schools.

      The 2017 ONETEP masterclass is taking place in August 2017 and will provide training on using the implicit solvent model in ONETEP, if suitable for participants. However, the implementation of the solvent model in CASTEP requires some further optimization and usability improvements before being made available to the wider user community, so it will not be demonstrated in the workshop this year.

    • Update the user and developer's documentation for ONETEP, CASTEP and DL_MG.

      Developer documentation for ONETEP, CASTEP and DL_MG was written throughout the project as part of software development. This is in the process of being refined and tidied before being committed to the main branches / forming part of a major release.

    • Present project outcomes at conferences.

      A poster was presented by J. C. Womack at the CCP9 Young Researchers Meeting and Community Meeting event (April 2017). He also gave a talk at the Coding Solvation Workshop, Livorno, Italy (August 2017).

    • Publish paper describing the solvent model with its unified implementation in the CASTEP and ONETEP codes.

      A paper describing DL_MG (including developments during this eCSE project) has been published in the Journal of Chemical Theory and Computation. A paper describing the implementation and application of the solvent model in CASTEP and ONETEP is planned to follow the DL_MG publication.

Summary of the software

The developments completed during this project were committed to the respective version control repositories of ONETEP, CASTEP and DL_MG. The ONETEP and CASTEP repositories are not public, so the developments will become available through future commercial releases (to industrial customers) or through access to the source code via academic license or collaborator agreement. The DL_MG source code is available at its CCPForge project page.

Precompiled ONETEP and CASTEP binaries are currently available on ARCHER, and the developments completed during this work will become available when these binaries are updated.

For further information see the websites for each of the codes: