Machine-learned interatomic potential

Since the 1990s, researchers have used machine learning to construct interatomic potentials, which map atomic structures to their potential energies. These potentials are generally referred to as 'machine-learned interatomic potentials' (MLIPs) or simply 'machine learning potentials' (MLPs). They promised to fill the gap between density functional theory, a highly accurate but computationally intensive electronic-structure method, and empirically derived or intuitively approximated potentials, which are far cheaper computationally but substantially less accurate. Improvements in artificial intelligence technology have raised the accuracy of MLPs while lowering their computational cost, expanding the role of machine learning in fitting potentials.

Machine learning potentials began with neural networks applied to low-dimensional systems. While promising, these models could not systematically account for interatomic energy interactions; they could be applied to small molecules in a vacuum and to molecules interacting with frozen surfaces, but not much else, and even in these applications they often relied on force fields or potentials derived empirically or from simulations. These models thus remained confined to academia.

Modern neural networks construct highly accurate, computationally light potentials because theoretical understanding of materials science has increasingly been built into their architectures and preprocessing. Almost all are local, accounting for all interactions between an atom and its neighbors up to some cutoff radius. Some nonlocal models exist, but they have remained experimental for almost a decade. For most systems, reasonable cutoff radii enable highly accurate results.
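The locality assumption can be illustrated with a minimal sketch of a neighbor search. The function name, the coordinates, and the 2.0 cutoff below are hypothetical choices for illustration and are not taken from any particular MLIP package; production codes use cell lists and periodic boundary conditions rather than this brute-force loop.

```python
import math

def neighbors_within_cutoff(positions, cutoff):
    """For each atom, list the indices of all other atoms closer than `cutoff`."""
    neighbor_lists = []
    for i, r_i in enumerate(positions):
        close = [
            j for j, r_j in enumerate(positions)
            if j != i and math.dist(r_i, r_j) < cutoff
        ]
        neighbor_lists.append(close)
    return neighbor_lists

# Hypothetical four-atom structure (coordinates in angstroms, no periodicity).
positions = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.5, 0.0),
    (5.0, 5.0, 5.0),  # far from the others, so it has no neighbors
]
neighbor_lists = neighbors_within_cutoff(positions, cutoff=2.0)
```

In a local model, the energy contribution of an atom depends only on the atoms in its neighbor list, so the isolated fourth atom here would be treated as if it were alone.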

Almost all neural networks take atomic coordinates as input and output potential energies. For some, these atomic coordinates are first converted into atom-centered symmetry functions. From this data, a separate atomic neural network is trained for each element; each atomic neural network is evaluated whenever that element occurs in the given structure, and the results are then combined (typically summed) at the end. This process, and in particular the atom-centered symmetry functions, which encode translational, rotational, and permutational invariance, has greatly improved machine learning potentials by significantly constraining the neural networks' search space. Other models use a similar process but emphasize bonds over atoms, using pair symmetry functions and training one neural network per atom pair.
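The scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a reference implementation: it uses a Gaussian radial symmetry function with a cosine cutoff, ignores the chemical species of neighbors (real descriptors are usually element-resolved and include angular terms), and replaces the trained per-element networks with hypothetical fixed linear functions. All parameter values are illustrative.

```python
import math

def cutoff_fn(r, r_c):
    """Cosine cutoff: decays smoothly to zero at the cutoff radius r_c."""
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0) if r < r_c else 0.0

def radial_symmetry_function(positions, i, eta, r_s, r_c):
    """Radial descriptor of atom i: Gaussians of neighbor distances, summed."""
    total = 0.0
    for j, r_j in enumerate(positions):
        if j == i:
            continue
        r = math.dist(positions[i], r_j)
        total += math.exp(-eta * (r - r_s) ** 2) * cutoff_fn(r, r_c)
    return total

def descriptor(positions, i, r_c=4.0):
    """Descriptor vector of atom i from a few illustrative (eta, r_s) pairs."""
    params = [(0.5, 0.0), (1.0, 1.0), (2.0, 2.0)]
    return [radial_symmetry_function(positions, i, eta, r_s, r_c)
            for eta, r_s in params]

def total_energy(elements, positions, atomic_models):
    """Sum per-atom energies, each from the model for that atom's element."""
    return sum(atomic_models[el](descriptor(positions, i))
               for i, el in enumerate(elements))

# Hypothetical stand-ins for trained per-element networks (fixed weights).
atomic_models = {
    "H": lambda g: -0.5 + 0.1 * g[0] - 0.2 * g[1],
    "O": lambda g: -2.0 + 0.3 * g[1] + 0.05 * g[2],
}

elements = ["O", "H", "H"]
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
e_ref = total_energy(elements, positions, atomic_models)

# Translation, rotation (90 degrees about z), and atom reordering.
shifted = [(x + 3.0, y - 1.0, z + 2.0) for x, y, z in positions]
rotated = [(-y, x, z) for x, y, z in positions]
e_shifted = total_energy(elements, shifted, atomic_models)
e_rotated = total_energy(elements, rotated, atomic_models)
e_permuted = total_energy(
    ["H", "O", "H"], [positions[1], positions[0], positions[2]], atomic_models
)
```

Because the descriptors depend only on interatomic distances and the atomic energies are summed, `e_shifted`, `e_rotated`, and `e_permuted` agree with `e_ref` up to floating-point rounding. The model never has to learn these invariances from data.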

Still other models learn their own descriptors rather than using predetermined symmetry-dictating functions. These models, called message-passing neural networks (MPNNs), are graph neural networks. Treating molecules as three-dimensional graphs (where atoms are nodes and bonds are edges), the model takes feature vectors describing the atoms as input and iteratively updates them as information about neighboring atoms is processed through message functions and convolutions. The final feature vectors are then used to predict the potentials. This method gives the models more flexibility, often resulting in stronger and more generalizable potentials. In 2017, the first MPNN model, a deep tensor neural network, was used to calculate the properties of small organic molecules. The technology was later commercialized, leading to the development of Matlantis in 2022, which extracts properties through both the forward and backward passes. Matlantis, which can simulate 72 elements, handle up to 20,000 atoms at a time, and execute calculations up to 20,000,000 times faster than density functional theory with almost indistinguishable accuracy, showcases the power of machine learning potentials in the age of artificial intelligence.
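The iterative update described above can be sketched in a few lines. This is a toy illustration, not the architecture of any published MPNN: the message and update functions here are fixed (an exponential distance weight and a tanh), whereas a real model learns them; edges are assumed to connect atoms within a hypothetical distance cutoff; and the initial feature vectors and readout are invented for the example.

```python
import math

def message_passing_energy(features, positions, cutoff=2.0, n_steps=2):
    """Toy message passing: update atom features from neighbors, then sum."""
    n = len(positions)
    # Edges connect every ordered pair of atoms closer than the cutoff.
    edges = [(i, j) for i in range(n) for j in range(n)
             if i != j and math.dist(positions[i], positions[j]) < cutoff]
    h = [list(f) for f in features]
    for _step in range(n_steps):
        messages = [[0.0] * len(h[0]) for _atom in range(n)]
        for i, j in edges:
            # Fixed distance-dependent weight (a learned filter in practice).
            w = math.exp(-math.dist(positions[i], positions[j]))
            for k, value in enumerate(h[j]):
                messages[i][k] += w * value
        # Fixed update function (a learned network in practice).
        h = [[math.tanh(a + m) for a, m in zip(h[i], messages[i])]
             for i in range(n)]
    # Readout: sum all final features into a single scalar.
    return sum(sum(vec) for vec in h)

# Hypothetical initial embeddings per element.
embedding = {"H": [1.0, 0.0], "O": [0.0, 1.0]}
elements = ["O", "H", "H"]
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

e_ref = message_passing_energy([embedding[el] for el in elements], positions)

# Reordering the atoms leaves the prediction unchanged.
order = [2, 0, 1]
e_permuted = message_passing_energy(
    [embedding[elements[k]] for k in order], [positions[k] for k in order]
)
```

After each step, an atom's features carry information from one bond further away, so two steps let every atom in this small molecule see its neighbors' neighbors. The summed readout keeps the result independent of atom ordering.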

Gaussian Approximation Potential (GAP)
One popular class of machine-learned interatomic potential is the Gaussian Approximation Potential (GAP), which combines compact descriptors of local atomic environments with Gaussian process regression to learn the potential energy surface of a given system. To date, the GAP framework has been used to develop MLIPs for a number of systems, including elemental systems such as carbon, silicon, phosphorus, and tungsten, as well as multicomponent systems such as Ge2Sb2Te5 and austenitic stainless steel, Fe7Cr2Ni.
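The regression step can be illustrated with a minimal Gaussian process fit. This sketch makes strong simplifying assumptions that are not part of the GAP framework itself: the "descriptor" is just the bond length of a dimer, the reference energies come from a Lennard-Jones formula in reduced units rather than from electronic-structure calculations, and the kernel is a plain squared exponential. The function names, length scale, and noise value are all hypothetical.

```python
import math

def kernel(a, b, length=0.15):
    """Squared-exponential similarity between two descriptor values."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def fit_gp(train_x, train_y, noise=1e-8):
    """Return weights alpha solving (K + noise * I) alpha = y."""
    k = [[kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    return solve(k, train_y)

def predict(x, train_x, weights):
    """Posterior mean: weighted sum of kernels to the training points."""
    return sum(w * kernel(x, x_i) for w, x_i in zip(weights, train_x))

def reference_energy(r):
    """Stand-in reference data: Lennard-Jones dimer energy, reduced units."""
    return 4.0 * (r ** -12 - r ** -6)

train_x = [0.95 + 0.1 * i for i in range(11)]  # bond lengths 0.95 to 1.95
train_y = [reference_energy(r) for r in train_x]
weights = fit_gp(train_x, train_y)

# Predicted energy at a bond length that was not in the training set.
e_pred = predict(1.12, train_x, weights)
```

The fitted model reproduces the training energies closely and interpolates between them. In an actual GAP, the total energy is a sum of such local contributions over many-body descriptors of each atomic environment.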