Fragment and Scaffold
In [1]:
%load_ext autoreload
%autoreload 2
In [2]:
import datamol as dm
dm.disable_rdkit_log()
Fragmentation¶
The fragmentation methods implemented in datamol return a set of fragments that covers a molecule, as opposed to a breakdown of the molecule into non-overlapping blocks.
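To make the distinction concrete, here is a minimal sketch using hypothetical atom-index sets rather than real datamol output. It assumes a fragment is represented simply by the set of atom indices it contains; the helper names `covers` and `is_partition` are illustrative and not part of datamol.

```python
from itertools import combinations


def covers(n_atoms, fragments):
    """True if every atom index 0..n_atoms-1 appears in at least one fragment."""
    return set().union(*fragments) == set(range(n_atoms))


def is_partition(n_atoms, fragments):
    """True if the fragments cover the molecule and no two of them share an atom."""
    disjoint = all(a.isdisjoint(b) for a, b in combinations(fragments, 2))
    return disjoint and covers(n_atoms, fragments)


# Hypothetical 6-atom molecule with atoms indexed 0..5 (illustrative only).
overlapping = [{0, 1, 2, 3}, {2, 3, 4, 5}]  # atoms 2 and 3 appear in both fragments
blocks = [{0, 1, 2}, {3, 4, 5}]             # every atom appears exactly once

print(covers(6, overlapping), is_partition(6, overlapping))  # True False
print(covers(6, blocks), is_partition(6, blocks))            # True True
```

Both fragment sets cover the toy molecule, but only the second is a breakdown into non-overlapping blocks.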
In the following, let's fragment a molecule using multiple methods.
In [3]:
smiles = "CCCOCc1cc(c2ncccc2)ccc1"
mol = dm.to_mol(smiles)
mol
Out[3]:
BRICS¶
In [4]:
frags = dm.fragment.brics(mol)
dm.viz.to_image(frags, n_cols=6)
Out[4]:
FraggleSim¶
In [5]:
frags = dm.fragment.frag(mol)
dm.viz.to_image(frags, n_cols=6)
Out[5]:
Recap¶
In [6]:
frags = dm.fragment.recap(mol)
dm.viz.to_image(frags, n_cols=6)
Out[6]:
Any break¶
This method tries BRICS first and falls back to generating all possible fragmentations if BRICS does not work.
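The try-first-then-fall-back pattern described above can be sketched generically. This is not datamol's implementation: `first_non_empty`, `split_on`, and the string-splitting rules are hypothetical stand-ins, and treating both an exception and an empty result as "did not work" is an assumption.

```python
def first_non_empty(item, primary, fallback):
    """Return primary(item) if it succeeds with a non-empty result, else fallback(item)."""
    try:
        result = list(primary(item))
    except Exception:
        # Assumption: a failure of the primary rule is treated like an empty result.
        result = []
    return result if result else list(fallback(item))


def split_on(sep):
    """Build a toy rule that splits text on `sep` and returns [] when nothing is split."""
    def rule(text):
        parts = text.split(sep)
        return parts if len(parts) > 1 else []
    return rule


primary = split_on("=")   # stricter toy rule, tried first
fallback = split_on("-")  # more permissive toy rule, used only if needed

print(first_non_empty("A=B-C", primary, fallback))  # ['A', 'B-C']
print(first_non_empty("A-B-C", primary, fallback))  # ['A', 'B', 'C']
```

The second call shows the fallback taking over when the primary rule finds nothing to split.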
In [7]:
frags = dm.fragment.anybreak(mol)
dm.viz.to_image(frags, n_cols=6)
Out[7]:
Scaffold¶
Extract the scaffolds and their attachment points from a list of molecules so that the molecules can be organized into molecular series.
In [8]:
# Load the FreeSolv dataset and convert its SMILES strings to molecules
data = dm.data.freesolv()
smiles = data["smiles"].tolist()
mols = [dm.to_mol(s) for s in smiles]
# Extract the scaffolds together with their attachment points
scaffolds, scf2infos, scf2groups = dm.scaffold.fuzzy_scaffolding(mols)
list(scaffolds)[:4]
Out[8]:
['c1cc2c([*:4])c([*:3])c([*:2])cc2cc1[*:5]', 'C(=C1c2ccccc2CCc2ccccc21)[*:1]', 'c1c([*:7])c2c(c([*:8])c1[*:9])Cc1c(c([*:1])c([*:2])c([*:3])c1[*:4])C2', 'CCc1cc([*:5])cc2cc([*:2])c([*:3])c([*:4])c12']
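In the scaffold SMILES above, each attachment point is written as a labelled dummy atom such as `[*:4]`. As a small sketch, the labels can be read straight from the string with a regular expression; the helper `attachment_labels` is a hypothetical name for illustration, not a datamol function.

```python
import re

# A labelled dummy atom looks like "[*:4]"; capture the numeric label.
ATTACHMENT = re.compile(r"\[\*:(\d+)\]")


def attachment_labels(scaffold_smiles):
    """Return the sorted attachment-point labels found in a scaffold SMILES string."""
    return sorted(int(n) for n in ATTACHMENT.findall(scaffold_smiles))


# First two scaffolds copied from the output above.
scaffolds_sample = [
    "c1cc2c([*:4])c([*:3])c([*:2])cc2cc1[*:5]",
    "C(=C1c2ccccc2CCc2ccccc21)[*:1]",
]

for s in scaffolds_sample:
    print(attachment_labels(s))
# [2, 3, 4, 5]
# [1]
```

The number of labels gives a quick count of how many positions on each scaffold can carry substituents.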
In [9]:
sfs = [dm.to_mol(s) for s in list(scaffolds)]
dm.viz.to_image(sfs, n_cols=6)
Out[9]: