RDKit is a fantastic open source cheminformatics package. It provides a wealth of tools, including powerful functionality for working with smiles and drawing two-dimensional representations of molecules.
In the 2023.2 release of sire we added new sire.convert functions. These enable sire to interconvert with RDKit, thereby adding both smiles and 2D visualisation support.
For example, we can use RDKit directly to create a molecule from a smiles string and generate optimised 3D coordinates.
from rdkit import Chem
from rdkit.Chem import AllChem
rdkit_mol = Chem.MolFromSmiles(
"CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C")
rdkit_mol = Chem.AddHs(rdkit_mol)
AllChem.EmbedMolecule(rdkit_mol)
AllChem.UFFOptimizeMolecule(rdkit_mol)
rdkit_mol
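The embedding and optimisation calls above both return status codes that are easy to overlook in a notebook. As a minimal sketch (the random seed and iteration limit are illustrative choices, not values used in this post), you could check them like this:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Same penicillin G smiles string as used in this post
smiles = "CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C"
mol = Chem.AddHs(Chem.MolFromSmiles(smiles))

# EmbedMolecule returns the ID of the new conformer, or -1 if embedding failed.
# A fixed randomSeed (an arbitrary choice here) makes the coordinates reproducible.
conf_id = AllChem.EmbedMolecule(mol, randomSeed=42)
if conf_id < 0:
    raise RuntimeError("RDKit could not generate 3D coordinates")

# UFFOptimizeMolecule returns 0 if the minimisation converged,
# and 1 if more iterations would be needed.
status = AllChem.UFFOptimizeMolecule(mol, maxIters=500)
print("conformer:", conf_id, "optimisation status:", status)
```

If the status is 1, calling the optimiser again with a larger maxIters is usually enough.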
With sire.convert we can convert this RDKit molecule into a sire molecule! This allows us to use sire’s integration with NGLView to get a nice 3D view of the molecule.
import sire as sr
sr_mol = sr.convert.to(rdkit_mol, "sire")
sr_mol.view()
To make things easier, we’ve wrapped up the RDKit code, and put it behind a new function, sire.smiles.
sr_mol = sr.smiles(
"CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C")
sr_mol.view()
Conversions can go both ways. This means that we can use sire.convert to go back from the sire molecule to the RDKit molecule.
rdkit_mol = sr.convert.to(sr_mol, "rdkit")
rdkit_mol
RDKit has great functionality for generating the smiles strings from molecules, e.g. using rdkit.Chem.MolToSmiles.
Chem.MolToSmiles(rdkit_mol)
'[H]OC(=O)[C@]1([H])N2C(=O)[C@@]([H])(N([H])C(=O)C([H])([H])c3c([H])c([H])c([H])c([H])c3[H])[C@@]2([H])SC1(C([H])([H])[H])C([H])([H])[H]'
We have wrapped up this code and exposed it as a new .smiles function on all of our molecular containers.
sr_mol.smiles(include_hydrogens=True)
'[H]OC(=O)[C@]1([H])N2C(=O)[C@@]([H])(N([H])C(=O)C([H])([H])c3c([H])c([H])c([H])c([H])c3[H])[C@@]2([H])SC1(C([H])([H])[H])C([H])([H])[H]'
Here, we included hydrogens, which makes the smiles string a bit long! By default, we only include hydrogens that are needed to resolve structural ambiguities.
sr_mol.smiles()
'CC1(C)S[C@@H]2[C@H](NC(=O)Cc3ccccc3)C(=O)N2[C@H]1C(=O)O'
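To see the difference between the two forms using RDKit alone, here is a small sketch. It only illustrates the idea of explicit versus implicit hydrogens; sire's own implementation of .smiles may differ in detail.

```python
from rdkit import Chem

smiles = "CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C"

# With explicit hydrogens, every H is a real atom and is written as [H]
mol_h = Chem.AddHs(Chem.MolFromSmiles(smiles))
with_h = Chem.MolToSmiles(mol_h)

# RemoveHs folds ordinary hydrogens back into their heavy atoms, so only
# those needed for stereochemistry remain in the string (e.g. [C@@H])
mol_no_h = Chem.RemoveHs(mol_h)
without_h = Chem.MolToSmiles(mol_no_h)

print(len(with_h), "characters with hydrogens")
print(len(without_h), "characters without")
```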
In addition to creating smiles strings, RDKit can also be used to create 2D images of molecules. We’ve wrapped up this code into a new .view2d function, which is also available on all molecular containers.
sr_mol.view2d()
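If you would rather keep the picture as a file than display it in the notebook, you can also draw with RDKit directly. The following is a sketch using RDKit's SVG drawer, not a description of how view2d works internally; the image size is an arbitrary choice.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D

mol = Chem.MolFromSmiles(
    "CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C")

# Create a 400x300 SVG canvas and draw the molecule onto it.
# PrepareAndDrawMolecule computes 2D coordinates if the molecule has none.
drawer = rdMolDraw2D.MolDraw2DSVG(400, 300)
rdMolDraw2D.PrepareAndDrawMolecule(drawer, mol)
drawer.FinishDrawing()

# The drawing is returned as SVG text, which can be written to a file
svg = drawer.GetDrawingText()
print(svg[:80])
```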

The powerful thing is that these functions work even if the molecule was not originally created by RDKit or from a smiles string. For example, let’s load cholesterol from an SDF file. Cholesterol is included as a tutorial molecule, so we can download it from sire’s tutorial site.
mols = sr.load(sr.expand(sr.tutorial_url, "cholesterol.sdf"))
mols.view()
mols.smiles()
['CC(C)CCCC(C)C1CCC2C3CC=C4CC(O)CCC4(C)C3CCC12C']
mols.view2d()

And, of course, you can convert this molecule to RDKit, so you can take advantage of all of the other amazing functionality that RDKit provides!
rdkit_mol = sr.convert.to(mols, "rdkit")
rdkit_mol
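As one example of that functionality, RDKit can calculate simple descriptors. So that this sketch runs with RDKit alone, it rebuilds cholesterol from the smiles string printed above; in practice you would pass in the molecule returned by sr.convert.to instead.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, rdMolDescriptors

# Cholesterol, using the smiles string generated earlier in this post
chol = Chem.MolFromSmiles("CC(C)CCCC(C)C1CCC2C3CC=C4CC(O)CCC4(C)C3CCC12C")

# Molecular formula and average molecular weight (implicit hydrogens included)
formula = rdMolDescriptors.CalcMolFormula(chol)
mol_wt = Descriptors.MolWt(chol)

# Number of rings found by RDKit's ring perception
num_rings = rdMolDescriptors.CalcNumRings(chol)

print(formula, round(mol_wt, 2), num_rings)
```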
If you want to try this yourself, please feel free to connect to try.openbiosim.org and start a notebook. You can download the notebook used to generate this post onto the server by running this command in one of the notebook code cells.
! wget https://github.com/OpenBioSim/posts/raw/main/sire/002_rdkit/rdkit.ipynb
The notebook will appear in the file browser in the left pane. Click on it to open, and then execute away. Feel free to use this as a starting point for your own notebooks for exploring the integration of sire and RDKit.
The new sire.convert functionality can be used to convert to the native types of other popular molecular modelling packages. Look forward to our next blog post, where we will show you how conversion to OpenMM lets you run minimisation and molecular dynamics!