Strategies for generating databases to effectively train machine-learned interatomic potentials

Machine-learned interatomic potentials (MLIPs) predict the total energy, atom-centered forces, and stress tensor of crystalline materials at a fraction of the cost of ab initio methods. These MLIPs are most commonly trained on a database of density functional theory (DFT) calculations.

A key challenge in training accurate MLIPs is ensuring that the database contains diverse structures spanning many different local atomic environments. Two common strategies for achieving wide coverage of these environments are (1) Bayesian error estimation with kernel methods and (2) active learning through committee (ensemble) approaches with neural networks.
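As a concrete illustration of the second strategy, the minimal sketch below shows committee-based active learning on synthetic data: a small ensemble of neural networks is trained on bootstrap resamples of a labeled set, and the spread of the ensemble predictions over an unlabeled pool serves as an error estimate for selecting which structures to send to DFT. All names and numbers here (descriptor size, MLPRegressor settings, the synthetic target) are illustrative assumptions, not part of the project; the kernel-based strategy (1) would analogously use, e.g., the predictive variance of a Gaussian process regressor.

```python
# Minimal sketch of committee-based active learning on synthetic descriptor data.
# All quantities are toy placeholders, not the project's actual data or models.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Toy "local atomic environment" descriptors and per-structure energies.
X_labeled = rng.normal(size=(200, 16))
y_labeled = np.sin(X_labeled).sum(axis=1) + 0.01 * rng.normal(size=200)
X_pool = rng.normal(size=(1000, 16))  # unlabeled candidate structures

# Train a committee of neural networks on bootstrap resamples of the labeled data.
committee = []
for seed in range(5):
    idx = rng.integers(0, len(X_labeled), size=len(X_labeled))
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=seed)
    model.fit(X_labeled[idx], y_labeled[idx])
    committee.append(model)

# Committee disagreement (std. dev. across members) acts as an error estimate;
# the most uncertain candidates would be computed with DFT and added to the database.
preds = np.stack([m.predict(X_pool) for m in committee])  # shape (n_models, n_pool)
uncertainty = preds.std(axis=0)
to_label = np.argsort(uncertainty)[-10:]  # top-10 most uncertain structures
print("Indices selected for DFT labeling:", to_label)
```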

In this work, we will apply both of these techniques to prototypical systems in materials science and electrochemistry, such as bulk metal oxides and aqueous electrolyte interfaces. We will benchmark the two approaches and explain why one performs better than the other.

This project is well suited for students interested in developing and applying machine learning models. Prior experience with DFT calculations or machine learning techniques is not required. Basic knowledge of programming will be useful but can be picked up during the project.

Name of Faculty