HOAX: a hyperparameter optimization algorithm applied in computational chemistry

HOAX (hyperparameter optimization algorithm explorer) is an open-source machine learning tool that searches for optimal hyperparameters for the neural networks (NNs) used in computational chemistry.

The model was proposed by a research team from the University of Groningen in the Netherlands, and it is designed to automate the search for NN hyperparameters.

Hyperparameter optimization

NNs are frequently used in computational chemistry, especially for describing molecular properties such as the potential energy surface (PES).

NNs can be trained to capture the complex relationship between the input features of the molecules and their PESs. However, to ensure accurate and reliable PES generation, the hyperparameters must be very carefully selected.

Hyperparameters are typically tuned manually by the user, which is slow and depends on experience. HOAX is a user-friendly framework that automates this search and makes it more efficient.

The model

HOAX is an extension of the PySurf package, but it can also be used as a standalone package. It is written in Python, and its NNs are implemented in PyTorch.

The hyperparameters explored by HOAX are the number of layers, the number of nodes in each layer, the learning rate, and the batch size.
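To make the idea of a search space concrete, the four hyperparameters listed above can be written down as a small dictionary and enumerated. This is only a minimal sketch in plain Python, not HOAX's own interface: the names `SEARCH_SPACE`, `grid`, and `sample`, as well as all candidate values, are assumptions chosen for illustration.

```python
import itertools
import random

# Hypothetical search space; names and candidate values are illustrative
# assumptions, not settings taken from HOAX.
SEARCH_SPACE = {
    "n_layers": [1, 2, 3],
    "n_nodes": [16, 32, 64],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [16, 32],
}


def grid(space):
    """Yield every combination in the search space as a dict."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def sample(space, n, seed=0):
    """Draw n random configurations (with replacement) from the space."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n)
    ]


all_configs = list(grid(SEARCH_SPACE))        # exhaustive enumeration
random_configs = sample(SEARCH_SPACE, n=5)    # cheaper random subset
```

An exhaustive grid grows multiplicatively with every added hyperparameter, which is why a random subset is often a reasonable first pass before a finer search.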

The training data consisted of a set of trajectories of SO2, pyrazine (C4H4N2), furan (C4H4O), and pyrrole (C4H5N).

All NN calculations were carried out using:

  • the Adam optimizer
  • the tanh activation function
  • the mean squared error (MSE) as the error function

Each training session was limited to fewer than 10,000 epochs.
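The training setup described above (Adam, tanh activations, MSE loss, a capped number of epochs) can be sketched in PyTorch as follows. This is a hedged illustration rather than HOAX's actual code: the helper names `build_mlp` and `train` are hypothetical, the data are random placeholders instead of molecular trajectories, and the epoch count is kept small so the example runs quickly.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def build_mlp(n_inputs, n_outputs, n_layers, n_nodes):
    """Fully connected network with a tanh activation after each hidden layer."""
    layers = []
    width = n_inputs
    for _ in range(n_layers):
        layers += [nn.Linear(width, n_nodes), nn.Tanh()]
        width = n_nodes
    layers.append(nn.Linear(width, n_outputs))
    return nn.Sequential(*layers)


def train(model, x, y, learning_rate, batch_size, max_epochs):
    """Minimise the MSE with Adam; return the mean training loss per epoch."""
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    loss_fn = nn.MSELoss()
    loader = DataLoader(TensorDataset(x, y), batch_size=batch_size, shuffle=True)
    history = []
    for _ in range(max_epochs):
        total = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            optimizer.step()
            total += loss.item() * len(xb)
        history.append(total / len(x))
    return history


torch.manual_seed(0)
x = torch.rand(64, 3)             # placeholder inputs, not real molecular data
y = x.sum(dim=1, keepdim=True)    # placeholder target standing in for an energy
model = build_mlp(n_inputs=3, n_outputs=1, n_layers=2, n_nodes=16)
history = train(model, x, y, learning_rate=1e-2, batch_size=16, max_epochs=50)
```

Here `n_layers`, `n_nodes`, `learning_rate`, and `batch_size` are exactly the quantities a hyperparameter search would vary, while the optimizer, activation, and loss stay fixed.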

Once the neural network is trained and validated, it can be used to predict PESs for new molecular structures.

Schematic overview of the HOAX package

The package is divided into three parts: the interpreter, the neural network generator, and the hyperparameter explorer.

Results and future research

The results show that the new framework requires less human effort and reduces much of the computational cost associated with constructing the reference PESs.

The framework’s robustness is confirmed by the fact that good models of PESs are still produced even with a 50% reduction in the data set.

HOAX can be easily adapted to other ML packages and other types of data, such as charges, dipole moments, and orbital energies. The only restriction is that data must be provided in the HDF5 or NetCDF format.

HOAX could also be extended to carry out cross-validation on the best networks found, even during the NN training process.
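For readers unfamiliar with cross-validation, the sketch below shows one common variant, k-fold splitting, in plain Python. It is not part of HOAX and does not reflect how the authors would implement the extension; the function name `k_fold_indices` and its parameters are assumptions made for this example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k (training, validation) index pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, so fold sizes
    # differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((training, validation))
    return splits


splits = k_fold_indices(n_samples=10, k=5)
for training, validation in splits:
    print(len(training), "training points,", len(validation), "validation points")
```

Each candidate network would be trained once per split and scored on the held-out fold, so every data point is used for validation exactly once.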
