BioSimSpace diaries: working with crystal water molecules

Crystal water

When setting up a protein simulation it may be important to preserve specific water molecules, often referred to as crystal waters. For example, many structures from the Protein Data Bank contain the coordinates of water molecules that are resolvable via X-ray crystallography. In this example, we will use BioSimSpace to parameterise and solvate a tyrosine kinase 2 protein structure containing two crystal waters within its binding site.

First, let’s load the PDB structure.
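A minimal sketch of the loading step, assuming BioSimSpace is installed. The file name tyk2_xtal.pdb is a hypothetical local path rather than the location used for this post, and the helper name load_first_molecule is ours.

```python
def load_first_molecule(pdb_path="tyk2_xtal.pdb"):
    """Read a PDB file and return the first molecule of the loaded system.

    The default path is a hypothetical local file name.
    """
    # Imported lazily so the sketch can be loaded without BioSimSpace installed.
    import BioSimSpace as BSS

    # readMolecules returns a System; index 0 is its first molecule.
    system = BSS.IO.readMolecules(pdb_path)
    return system[0]
```

Calling load_first_molecule() with a real file should give a molecule object whose atoms can then be inspected.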

This is a single molecule that contains the oxygen atoms of two crystal waters at the end of the structure.

We can use BioSimSpace to search for and extract the protein and water components of the system. In this case, the waters are part of a residue named WAT, so we can use this as our search term.
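To make the idea of selecting by residue name concrete, here is a plain-Python sketch that partitions PDB atom records on the WAT residue name. It does not use the BioSimSpace search interface. The records are illustrative, with made-up coordinates, and are not taken from the real structure. The helper name split_by_resname is hypothetical.

```python
# Illustrative records only (made-up coordinates), laid out in PDB fixed columns.
SAMPLE_PDB = """\
ATOM      1  N   MET A   1      11.104   6.134  -6.504
ATOM      2  CA  MET A   1      11.639   6.071  -5.147
HETATM 4654  O   WAT A 289       3.212  -1.874   9.630
HETATM 4655  O   WAT A 290       5.018   0.442   7.915
"""


def split_by_resname(pdb_text, resname="WAT"):
    """Partition ATOM/HETATM records into (matching, other) by residue name."""
    matching, other = [], []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # The residue name occupies columns 18-20 of a PDB atom record.
        target = matching if line[17:20].strip() == resname else other
        target.append(line)
    return matching, other


waters, protein = split_by_resname(SAMPLE_PDB)

# The atom name occupies columns 13-16; each crystal water is a lone oxygen.
water_atom_names = [line[12:16].strip() for line in waters]
print(len(protein), len(waters), water_atom_names)  # 2 2 ['O', 'O']
```

The same split by residue name is what the search term achieves, with the advantage that BioSimSpace returns molecule objects rather than raw text.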

Now that we have the components, we can parameterise them individually: first the protein, and then the water.

For the waters, we need to specify the desired water model and also disable the check that the parameterised molecule is compatible with the topology of the molecule that we passed in. This is because we are passing in only oxygen atoms, so tLEaP will add the missing hydrogens.

By default, BioSimSpace ensures that the topology of the parameterised molecule(s) matches that of the input molecule and will raise an exception if this is not the case. This is because the input system is often required as a reference by the user, e.g. they might want to preserve a specific naming convention, chain identifiers, etc.
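The two parameterisation calls might look like the sketch below. The ff14SB force field and the tip3p water model are assumptions, since the text does not name them. The water_model and ensure_compatible keyword names should be checked against the BioSimSpace documentation for your installed version.

```python
# Keyword arguments for the crystal waters. The "tip3p" value is an
# assumption; use the water model that matches your chosen force field.
WATER_KWARGS = {"water_model": "tip3p", "ensure_compatible": False}


def parameterise(protein, waters):
    """Parameterise the protein, then each oxygen-only crystal water."""
    # Imported lazily so the sketch can be loaded without BioSimSpace installed.
    import BioSimSpace as BSS

    # Default behaviour: the output topology must match the input molecule.
    protein = BSS.Parameters.ff14SB(protein).getMolecule()

    # Waters: relax the compatibility check so tLEaP can add the hydrogens.
    waters = [
        BSS.Parameters.ff14SB(water, **WATER_KWARGS).getMolecule()
        for water in waters
    ]
    return protein, waters
```

Keeping the relaxed check confined to the waters means the protein still benefits from the default topology validation.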

If we check one of the parameterised crystal waters, we can see that the coordinates of the oxygen atom are preserved.

Once both components are parameterised, we can combine them to create a new system. The next step in our setup procedure is to solvate the system in a water box. Here we will use a truncated octahedral box with a base length of 7 nanometers. Since the protein is charged, gmx genion is used during solvation to neutralise the system by adding counter ions. We can check this by inspecting the solvated system.
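A sketch of the solvation step, assuming that system is the combined protein and crystal water system and that TIP3P is the chosen solvent model (the text does not name one). The helper name solvate is ours.

```python
def solvate(system, base_length_nm=7):
    """Solvate a system in a truncated octahedral box of TIP3P water."""
    # Imported lazily so the sketch can be loaded without BioSimSpace installed.
    import BioSimSpace as BSS

    # Box dimensions and angles for a truncated octahedron of this base length.
    base_length = base_length_nm * BSS.Units.Length.nanometer
    box, angles = BSS.Box.truncatedOctahedron(base_length)

    # Solvate; as described in the text, counter ions are added in this step.
    return BSS.Solvent.tip3p(molecule=system, box=box, angles=angles)
```

The returned system can then be inspected, for example by checking its total charge or listing its non-water, non-protein molecules.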

In this case, two sodium ions have been added to neutralise the system. In doing so, gmx genion will have replaced two randomly chosen water molecules. To ensure that crystal waters are not removed, BioSimSpace temporarily tags them with a unique residue and molecule name during solvation.
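The tagging idea can be illustrated with a toy model in plain Python. This is not how gmx genion or BioSimSpace are implemented. The XTL tag, the residue list and the helper name neutralise are all hypothetical.

```python
import random


def neutralise(residue_names, n_ions, crystal_indices, seed=0):
    """Swap n_ions randomly chosen bulk waters for ions, sparing crystal waters."""
    rng = random.Random(seed)

    # Temporarily give the crystal waters a unique name ("XTL" is hypothetical).
    tagged = [
        "XTL" if i in crystal_indices else name
        for i, name in enumerate(residue_names)
    ]

    # Only untagged waters are candidates for replacement by an ion.
    candidates = [i for i, name in enumerate(tagged) if name == "WAT"]
    for i in rng.sample(candidates, n_ions):
        tagged[i] = "NA"

    # Restore the original name of the crystal waters.
    return ["WAT" if name == "XTL" else name for name in tagged]


# Toy system: one protein residue, two crystal waters, five bulk waters.
names = ["ALA"] + ["WAT"] * 7
result = neutralise(names, n_ions=2, crystal_indices={1, 2})
print(result[1:3], result.count("NA"))  # ['WAT', 'WAT'] 2
```

Because the crystal waters carry a different name while ions are placed, they can never be picked for replacement, whichever random waters are chosen.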

To check that they are preserved we can re-solvate the system, asking to preserve the water naming for the existing waters in the system. (By default, they will be updated to the default GROMACS naming convention.)

We can compare the residues in both solvated systems to see the difference.

When solvating, molecules in the original system will be centered within the solvation box, hence the coordinates of the crystal waters after solvation won’t necessarily match those from before. By setting match_water=False, it is possible to preserve the naming convention used for any existing waters in the system. However, note that some simulation engines rely on a specific naming convention for water molecules and a single solvent group.
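Re-solvating while keeping the existing water names might be sketched as follows. TIP3P is again an assumption, the helper name is ours, and box and angles are assumed to come from the same truncated octahedron setup used for the first solvation.

```python
def solvate_keep_water_names(system, box, angles):
    """Solvate without renaming waters that are already in the system."""
    # Imported lazily so the sketch can be loaded without BioSimSpace installed.
    import BioSimSpace as BSS

    # match_water=False leaves the naming of existing waters untouched.
    return BSS.Solvent.tip3p(
        molecule=system, box=box, angles=angles, match_water=False
    )
```

Comparing the residue names of this system with those of the default solvation should then show the crystal waters under their original name.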

If you want to try this out for yourself, log in to our JupyterHub server at try.openbiosim.org (GitHub account required) to reproduce and adapt this example.

Written by Julien Michel and Lester Hedges.