Calibrated uncertainty for molecular property prediction using ensembles of message passing neural networks

Research output: Contribution to journal, Journal article, peer-review

Documents

  • Fulltext

    Final published version, 517 KB, PDF document

  • Jonas Busk
  • Peter Bjørn Jørgensen
  • Arghya Bhowmik
  • Mikkel N. Schmidt
  • Ole Winther
  • Tejs Vegge

Data-driven methods based on machine learning have the potential to accelerate computational analysis of atomic structures. In this context, reliable uncertainty estimates are important for assessing confidence in predictions and enabling decision making. However, machine learning models can produce poorly calibrated uncertainty estimates, and it is therefore crucial to detect and handle uncertainty carefully. In this work, we extend a message passing neural network designed specifically for predicting properties of molecules and materials with a calibrated probabilistic predictive distribution. The method presented in this paper differs from previous work by considering both aleatoric and epistemic uncertainty in a unified framework, and by recalibrating the predictive distribution on unseen data. Through computer experiments, we show that our approach results in accurate models for predicting molecular formation energies with well-calibrated uncertainty both within and outside the training data distribution on two public molecular benchmark datasets, QM9 and PC9. The proposed method provides a general framework for training and evaluating neural network ensemble models that are able to produce accurate predictions of properties of molecules with well-calibrated uncertainty estimates.
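
The abstract does not give formulas, so the following is only a minimal sketch of one common way to combine aleatoric and epistemic uncertainty in an ensemble and to recalibrate it on held-out data. It assumes each ensemble member outputs a Gaussian mean and variance per molecule, that the ensemble is summarized by the moments of the equally weighted mixture, and that recalibration is a single variance scale factor fitted by Gaussian maximum likelihood. The function names `combine_ensemble` and `fit_variance_scale` are hypothetical, and the paper's actual procedure may differ.

```python
import numpy as np


def combine_ensemble(means, variances):
    """Summarize an ensemble of Gaussian predictions (assumed setup).

    means, variances: arrays of shape (n_models, n_samples).
    Returns the ensemble mean, the aleatoric variance (average of the
    member variances) and the epistemic variance (spread of member means).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    mean = means.mean(axis=0)
    aleatoric = variances.mean(axis=0)
    epistemic = means.var(axis=0)  # population variance across members
    return mean, aleatoric, epistemic


def fit_variance_scale(y, mean, variance):
    """Scalar s2 minimizing the Gaussian NLL of N(mean, s2 * variance).

    Assumed recalibration scheme: the closed-form optimum is the mean
    squared standardized error on held-out data.
    """
    y = np.asarray(y, dtype=float)
    return float(np.mean((y - mean) ** 2 / variance))


# Toy numbers: two ensemble members, two held-out molecules.
member_means = [[1.0, 2.0], [3.0, 2.0]]
member_vars = [[0.5, 0.1], [1.5, 0.3]]
mean, aleatoric, epistemic = combine_ensemble(member_means, member_vars)
total_var = aleatoric + epistemic

y_val = np.array([6.0, 2.0])  # made-up validation targets
s2 = fit_variance_scale(y_val, mean, total_var)
calibrated_var = s2 * total_var
```

A scale factor above 1 indicates that the uncalibrated ensemble was overconfident on the held-out data, and a factor below 1 indicates that it was underconfident.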

Original language: English
Article number: 015012
Journal: Machine Learning: Science and Technology
Volume: 3
Issue number: 1
Number of pages: 12
DOIs
Publication status: Published - 2022

    Research areas

  • molecular property prediction, machine learning potential, uncertainty quantification, uncertainty calibration, message passing neural network, graph neural network, ensemble model, DESIGN


ID: 288267480