31st July 2023

Here we have obtained a free energy change of approximately 2.5 kcal/mol for the vacuum transformation. In order to compute a relative hydration free energy it is necessary to perform the same transformation for a merged molecule solvated in a box of water molecules, and to subtract the vacuum free energy change to the solvated free energy change. If you want to check how to write code to complete a thermodynamic cycle for a relative hydration free energy calculation consult this BSS tutorial.

Something more complicated: protein-ligand relative binding free energy calculations

A common use case for AFE calculations is to compute the relative binding free energies (RBFE) of a congeneric series (usually 10-50 compounds) of ligands for a protein. This can be done with BioSimSpace by implementing code similar to that used for the ethane to methanol case. Since this involves repeated operations on a large number of ligands, it is convenient to automate the process and create higher-level front-ends for the setup and analysis operations. Here, we highlight one possible implementation that is based on the use of two separate interactive BioSimSpace Jupyter notebook scripts (blue boxes below). The first is used to set up all the necessary calculations, and the second to analyze all the completed simulations. The intermediate tasks involving running all the required MD simulations and analyzing every trajectory to generate a set of free energy changes are implemented via a set of automated scripts that deploy compute tasks across a SLURM scheduler (orange boxes).

a schematic description of an implementation of a RBFE pipeline with BioSimSpace

The setup script can be run interactively to select desired simulation settings from a dropdown menu.

a dropdown menu to select simulation settings

After the selection of simulation settings, a network of relative binding free energy calculations that spans the entire uploaded dataset is generated using BSS.Align.generateNetwork , which provides an interface to the LOMAP software. Here, this step is illustrated with a dataset of 16 TYK2 ligands from the FEP+ JACS 2015 benchmark set.

 

a LOMAP generated network via BioSimSpace

 

The setup completes with production of input files for processing the entire dataset on a cluster configured with a SLURM scheduler.

The interactive analysis script is used to generate scatter plots of computed and measured relative binding free energies for each edge of the above network. It is called once the simulations have completed. The figure below was obtained by processing triplicate runs performed with the SOMD engine on the above network

 

a scatter plot of measured and calculated RBFEs for the Tyk2 dataset

 

This data can be processed by different network analysers to produce binding free energies for the dataset with respect to an arbitrary reference point. For instance here we have used the cinnabar tool maintained by OpenFE to perform this analysis.

back-transformed network RBFEs into absolute BFEs

The above plot shows that despite a few large outliers in the scatter plot of relative binding free energies ΔΔG, the back-transformed model that predicts ΔG values is relatively robust. For applications where the experimental data is unknown, it is sufficient to generate a rank-ordered list of compounds by calculated binding free energy.

If you want to try this out for yourself, check out our detailed tutorials that provide an introduction to alchemical free energy calculations and protein-ligand relative binding free energy calculations by logging in to our JupyterHub server at try.openbiosim.org (GitHub account required) and following the instructions in the file TUTORIALS.txt.

Written by Julien Michel.