Baskerville is always looking to showcase and advertise the great work being produced on the system.
If you are a Baskerville user and would like your work displayed here, please contact
baskerville-tier2-support@contact.bham.ac.uk
Author: Kit Windows-Yule, Associate Professor in Chemical Engineering, University of Birmingham
What is this project about?
As a Turing Fellow, I have had the good fortune to lead or otherwise be involved in a number of Data Study Groups (DSGs). DSGs are a unique sprint research activity run by the Alan Turing Institute, in which a group of ~10 PhD-level data science and AI experts from diverse fields come together to address a significant challenge proposed by an industrial or governmental “Challenge Owner”. In one of the more notable projects, the Challenge Owner was Rolls-Royce, and the aim of the challenge was to use AI methods to develop new ways of optimising the production of turbine blades, minimising the waste and cost associated with defective parts.
In order to operate optimally, a whole turbine blade must be formed from a single crystal – that is, all its constituent atoms must be organised neatly into a single crystalline lattice. Without such single-crystal components, which are capable of operating in the very hottest parts of an aeroplane engine, it is impossible to manufacture the modern jet engine. Despite decades of advances, it remains challenging to identify and overcome the reasons why some castings fail to form a single crystal. Multiple mechanisms can cause secondary grains to form and grow, and the experiments required to disentangle them are too costly and too slow to perform in practice. As such, a data-driven approach to solving this problem could prove transformative.
During the DSG, a variety of artificial intelligence and data science methods were tested, including convolutional neural networks (CNNs), generative adversarial networks (GANs), and topological data analysis. While a complete solution to such a complex problem could not realistically be obtained within the DSG timeframe, several of the proposed approaches showed significant promise, and funding was secured to continue exploring them in a PhD project stemming from this initial exploratory work.
Why was Baskerville chosen?
For two main reasons – the scale of the challenge, and the sensitivity of the data.
Concerning the former, the data provided by Rolls-Royce included over 3 GB of process data and around 185 GB of 3D images. Processing these data and using them to develop and train novel AI models within the restrictive one-week timeframe of the DSG required significant compute power. Baskerville’s 208 NVIDIA A100 GPUs provided the ideal solution.
Regarding the latter, despite being desensitised, the data used for the study were real Rolls-Royce process data and, as such, were decidedly commercially sensitive. The secure environment provided by Baskerville allowed the data to be handled and processed safely, without fear of data leaks. Without the security guarantees offered by Baskerville, this project could not have gone ahead.
How has Baskerville been useful?
In addition to the need for extreme compute power and data security mentioned above, the system’s flexibility and ease of use were also pivotal to the project’s success. Due to the nature of the DSG, researchers from different institutes around the world had to learn to use an HPC system they had never worked on before within a very limited timeframe. The extensive documentation and training resources available to the participants, as well as the excellent support provided by the Advanced Research Computing (ARC) team, ensured that they could get up to speed in no time at all, and thus make full use of the time and resources available to them.
Author: Matthew Foulkes, Professor of Physics, Imperial College London
What is your research about?
I am a theoretical materials physicist, interested in all types of everyday
matter, from molecules to metals, insulators, semiconductors, magnets, and even
superconductors. My research ranges from practical and applied at one end (I
study the formation and properties of protective oxide scales on
high-performance alloys containing aluminium) to theoretical and abstract at
the other (I recently worked on superconductivity in an idealised model
system known as the unitary Fermi gas). The common theme is that most of my
research projects start from the quantum mechanical many-particle Schrödinger
equation, which serves, in effect, as a “grand unified theory” for materials
physics. There are more fundamental layers (quantum field theories and string
theories) below it, but for the purposes of a materials physicist the
many-particle Schrödinger equation is all we need.
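For readers who have not met it, the equation in question (written here in Hartree atomic units, with the nuclei held fixed and their mutual repulsion dropped as a constant) is

\hat{H}\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N) = E\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N),
\qquad
\hat{H} = -\tfrac{1}{2}\sum_i \nabla_i^2
\;-\; \sum_{i,I}\frac{Z_I}{|\mathbf{r}_i-\mathbf{R}_I|}
\;+\; \sum_{i<j}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|},

where the \mathbf{r}_i are the positions of the N electrons and the \mathbf{R}_I and Z_I are the positions and charges of the nuclei.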
Gaining the ability to solve the Schrödinger equation accurately and
efficiently would be transformative, allowing materials scientists, physicists,
chemists and biochemists to participate in the simulation-based revolution that
has already overtaken much of engineering. This would greatly reduce the cost
of discovering and developing new materials, chemicals, catalysts, and
medicines. Perhaps unsurprisingly, the many-particle Schrödinger equation is
difficult to solve accurately for any but the smallest systems. In practice, we
almost always rely on mean-field-like approximate methods such as density
functional theory. Most of the limitations in our understanding of molecules
and materials, such as our failure to explain high-temperature
superconductivity in the cuprates, arise when these approaches fail.
Four years ago, working with a small team from Google DeepMind, we introduced
FermiNet [1], a new neural-network-based approach to solving the many-particle
Schrödinger equation. Similar approaches had been tried for model systems, but
ours was the first successful application to real molecules and solids. The
FermiNet neural network takes any set of particle positions as input and
returns the corresponding value of the many-particle wavefunction. In practice,
the network replaces the analytic trial wavefunctions used in many other
computational approaches to quantum theory. The weights and biases that define
the network are optimised using the variational principle, which states that
the best attainable approximation to the ground-state wave function can be
found by minimising the energy expectation value. No external data are
required. The computational cost is high, but FermiNet and its successor, the
Psiformer [2], are much more accurate than density functional theory, scale
better with system size than comparably accurate quantum chemistry approaches,
and run very naturally on massively parallel GPU-based supercomputers such as
Baskerville.
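In symbols: writing \psi_\theta for the wavefunction represented by a network with parameters \theta, the quantity being minimised is the energy expectation value

E(\theta) = \frac{\langle \psi_\theta | \hat{H} | \psi_\theta \rangle}{\langle \psi_\theta | \psi_\theta \rangle} \;\ge\; E_0,

which the variational principle guarantees can never drop below the true ground-state energy E_0. Any change to the weights and biases that lowers E(\theta) therefore moves the network closer, in the variational sense, to the true ground state.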
Recent work, much of which used Baskerville, has shown that neural wave
functions are able to describe strongly correlated systems beyond the purview
of DFT, and that they can “discover” quantum phase transitions, such as Wigner
crystallisation and the onset of superconductivity in the electron gas, without
human guidance [3,4]. They seem to be particularly good at describing unusual
systems, where conventional methods struggle. Our recent work on the binding of
positrons to molecules, for example, produced what we believe to be the most
accurate computations of binding energies and annihilation rates available [5].
Positron density of Li2 and benzene
The ground-state electron and positron densities of positronic (a)
dilithium and (b) benzene molecules. The left-hand/right-hand columns show the
electron/positron density. In the dilithium molecule, the positron density
resembles a doughnut wrapping the covalent bond. In the benzene molecule, the
positron density is very diffuse, resembling a bun surrounding the benzene
hamburger. The scale bar at the bottom left of each subfigure corresponds to
the radius of a hydrogen atom.
Why did you choose Baskerville?
Our codes, developed in close collaboration with researchers at Google
DeepMind, were designed to run on GPU supercomputers from the start, leveraging
their vast low-level parallelism via the JAX library (roughly equivalent to
NumPy with the addition of automatic differentiation and just-in-time
compilation), the XLA (accelerated linear algebra) machine-learning compiler, and
NVIDIA's CUDA programming model. Our simulations are memory-hungry, with large
runs requiring 40 GB A100 GPUs or better. The GPU resources available to
university-based researchers in the UK are limited, and were even more limited
when we applied for our first Baskerville grant via the EPSRC's Access to High
Performance Computing mechanism. Baskerville was one of only two A100 machines
available at the time, and the other was already full.
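To give a concrete flavour of that programming model, here is a minimal sketch, not taken from the FermiNet or Psiformer codes, of a toy neural wavefunction for a single particle in a one-dimensional harmonic well, optimised variationally using the JAX features mentioned above (NumPy-style arrays, automatic differentiation and just-in-time compilation). The function and parameter names are illustrative only, and a simple quadrature grid stands in for the Monte Carlo sampling used in real many-electron calculations.

# Toy variational optimisation of a neural wavefunction in JAX (illustrative only).
import jax
import jax.numpy as jnp

def log_psi(params, x):
    # Log of an unnormalised trial wavefunction: a tiny one-hidden-layer network
    # multiplied by a Gaussian envelope that keeps psi normalisable.
    w1, b1, w2 = params
    hidden = jnp.tanh(w1 * x + b1)
    return jnp.dot(w2, hidden) - 0.5 * x**2

def local_energy(params, x):
    # E_L(x) = -(1/2) psi''(x)/psi(x) + V(x); with f = log psi,
    # psi''/psi = f'' + (f')^2, and both derivatives come from autodiff.
    f = lambda y: log_psi(params, y)
    df = jax.grad(f)(x)
    d2f = jax.grad(jax.grad(f))(x)
    return -0.5 * (d2f + df**2) + 0.5 * x**2

@jax.jit
def energy(params, xs):
    # Variational energy <psi|H|psi> / <psi|psi> evaluated on a quadrature grid
    # (production codes sample |psi|^2 with Monte Carlo in many dimensions).
    log_prob = jax.vmap(lambda x: 2.0 * log_psi(params, x))(xs)
    weights = jnp.exp(log_prob - jnp.max(log_prob))
    e_loc = jax.vmap(lambda x: local_energy(params, x))(xs)
    return jnp.sum(weights * e_loc) / jnp.sum(weights)

params = (jnp.ones(8), jnp.linspace(-1.0, 1.0, 8), 0.1 * jnp.ones(8))
xs = jnp.linspace(-6.0, 6.0, 201)
grad_energy = jax.jit(jax.grad(energy))

# Plain gradient descent; the variational principle guarantees that every value
# of energy(params, xs) is an upper bound on the exact ground-state energy 0.5.
for step in range(500):
    params = jax.tree_util.tree_map(lambda p, g: p - 0.05 * g,
                                    params, grad_energy(params, xs))
print(energy(params, xs))  # decreases towards the exact value 0.5

The same basic pattern, a trial wavefunction whose derivatives come from automatic differentiation and whose energy expectation value is driven down by gradient-based optimisation, underlies the real calculations, only in many dimensions, with Monte Carlo sampling, and at vastly larger scale across many GPUs.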
Although we had no choice but to apply for time on Baskerville, we were happy
to discover that it is powerful, stable and well run. The support we received
was always helpful and efficient, and the installation of a few large-memory
80 GB A100 GPUs was a boon. Much of our research over the past few years would
have been impossible without Baskerville. If only it were 10 times larger!
[1] D. Pfau, J.S. Spencer, A.G.D.G. Matthews, and W.M.C. Foulkes, Phys. Rev. Research 2, 033429 (2020).
[2] I. von Glehn, J.S. Spencer, and D. Pfau, https://arxiv.org/abs/2211.13672v2 (2022).
[3] G. Cassella, H. Sutterud, S. Azadi, N.D. Drummond, D. Pfau, J.S. Spencer, and W.M.C. Foulkes, Phys. Rev. Lett. 130, 036401 (2023) (Editors’ Suggestion).
[4] W.T. Lou, H. Sutterud, G. Cassella, W.M.C. Foulkes, J. Knolle, D. Pfau, and J.S. Spencer, Phys. Rev. X 14, 021030 (2024).
[5] G. Cassella, W.M.C. Foulkes, D. Pfau, and J.S. Spencer, Nat. Commun. 15, 5214 (2024).