Combinatorial chemistry functions

Combichem

Combichem methods use stochastic rules-based molecular changes to generate new compounds. Combichem consists of the following steps:

  1. Library generation - create the next iteration of the library
  2. Library scoring - apply a numeric score to each item in the library
  3. Library pruning - remove low scoring compounds
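The three steps above can be sketched with plain strings standing in for molecules. This is a minimal illustration, not the library's implementation: the `mutate`, `crossover`, `score` and `step` functions and the three-letter alphabet are all hypothetical.

```python
import random

rng = random.Random(0)  # seeded so the sketch is reproducible
ALPHABET = "CNO"        # hypothetical toy alphabet


def mutate(s):
    """Toy mutation: replace one randomly chosen character."""
    i = rng.randrange(len(s))
    return s[:i] + rng.choice(ALPHABET) + s[i + 1:]


def crossover(a, b):
    """Toy crossover: prefix of one parent joined to the suffix of the other."""
    cut = rng.randrange(1, min(len(a), len(b)))
    return a[:cut] + b[cut:]


def score(s):
    """Toy score: number of 'N' characters."""
    return s.count("N")


def step(library, keep=10):
    # 1. Library generation: mutation and crossover propose new candidates.
    children = [mutate(s) for s in library]
    children += [crossover(*rng.sample(library, 2)) for _ in library]
    candidates = sorted(set(library) | set(children))
    # 2. Library scoring: rank candidates by score, best first.
    ranked = sorted(candidates, key=score, reverse=True)
    # 3. Library pruning: keep only the top-scoring candidates.
    return ranked[:keep]


library = ["CCCC", "CCOC", "COCC", "CCCO"]
for _ in range(5):
    library = step(library)
```

Because the previous library is always among the candidates, the best score in this toy loop never decreases from one iteration to the next.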

Library generation consists of two main steps - mutation and crossover.

A mutation is a process that maps a single molecule to a new molecule. Custom mutations can be created by subclassing Mutator.

Crossover is a process where a pair of molecules is used to generate a new molecule that contains features of each parent molecule. Custom crossovers can be created by subclassing Crossover.

class Crossover[source]

Crossover(name='crossover')

Crossover - base class for crossover events. To create custom crossovers, subclass Crossover and implement the Crossover.crossover method

When called, Crossover is passed a list of Mol objects. The crossover operation randomly generates molecular pairs, and sends those pairs to Crossover.crossover
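The pairing-then-dispatch pattern described above might look roughly like the following. `ToyCrossover` and `SwapHalves` are hypothetical stand-ins that operate on strings rather than Mol objects, and the shuffle-and-pair strategy is an assumption about how random pairs could be formed.

```python
import random


class ToyCrossover:
    """Hypothetical stand-in for a crossover base class (not the library's code)."""

    def __init__(self, name="crossover", seed=0):
        self.name = name
        self.rng = random.Random(seed)

    def __call__(self, items):
        # Shuffle a copy, then pair neighbours: (0, 1), (2, 3), ...
        # With an odd number of items the last one is left unpaired.
        shuffled = list(items)
        self.rng.shuffle(shuffled)
        outputs = []
        for pair in zip(shuffled[0::2], shuffled[1::2]):
            outputs.extend(self.crossover(list(pair)))
        return outputs

    def crossover(self, pair):
        raise NotImplementedError  # subclasses supply the actual rule


class SwapHalves(ToyCrossover):
    """Toy rule: first half of one parent joined to the second half of the other."""

    def crossover(self, pair):
        a, b = pair
        return [a[: len(a) // 2] + b[len(b) // 2:]]


cx = SwapHalves()
children = cx(["AAAA", "BBBB", "CCCC", "DDDD", "EEEE"])
```

Subclasses only override `crossover`; the pairing logic stays in the base class.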

Crossover.crossover[source]

Crossover.crossover(mol_pair)

crossover - performs crossover operation

Inputs:

  • mol_pair list[Chem.Mol, Chem.Mol]: list of two Mol objects

class FragmentCrossover[source]

FragmentCrossover(full_crossover=False, name='fragment crossover') :: Crossover

FragmentCrossover - crossover based on molecular fragmentation.

Each Mol is fragmented into a set of (scaffold, rgroup) pairs by cutting single bonds in the molecule.

During crossover, molecular pairs are merged following scaffold1 + rgroup2

Inputs:

  • full_crossover bool: if True, all scaffold, rgroup combinations are generated

  • name str: crossover name

df = pd.read_csv('files/smiles.csv')
mols = to_mols(df.smiles.values[:10])
cx = FragmentCrossover()
out = cx(mols)
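To illustrate the scaffold1 + rgroup2 merge and the effect of full_crossover, here is a sketch that uses string labels in place of real fragments. The `merge_fragments` function is hypothetical, and the behaviour when full_crossover is False (sampling a single combination) is an assumption, since the text above only describes the True case.

```python
import itertools
import random


def merge_fragments(frags1, frags2, full_crossover=False, rng=None):
    """Combine scaffolds from parent 1 with r-groups from parent 2.

    frags1, frags2: lists of (scaffold, rgroup) label pairs.
    Joining with '-' is a placeholder for bonding at the attachment point.
    """
    if rng is None:
        rng = random.Random(0)
    scaffolds = [scaffold for scaffold, _ in frags1]
    rgroups = [rgroup for _, rgroup in frags2]
    if full_crossover:
        # Every scaffold from parent 1 paired with every r-group from parent 2.
        return [f"{s}-{r}" for s, r in itertools.product(scaffolds, rgroups)]
    # Assumption: otherwise a single scaffold/r-group combination is sampled.
    return [f"{rng.choice(scaffolds)}-{rng.choice(rgroups)}"]


frags1 = [("S1a", "R1a"), ("S1b", "R1b")]
frags2 = [("S2a", "R2a"), ("S2b", "R2b")]
all_children = merge_fragments(frags1, frags2, full_crossover=True)
one_child = merge_fragments(frags1, frags2)
```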

class Mutator[source]

Mutator(name=None)

Mutator - base class for mutations. To create custom mutations, subclass Mutator and implement Mutator.mutate

class SmartsMutator[source]

SmartsMutator(smarts, name=None) :: Mutator

SmartsMutator - SMARTS reaction based mutator.

Inputs:

  • smarts list[str]: list of SMARTS reaction strings

  • name str: mutator name

class ChangeAtom[source]

ChangeAtom(atom_types=None) :: SmartsMutator

ChangeAtom - SMARTS-based mutator that changes atom type without changing molecular structure

Inputs:

  • atom_types Optional[list[str]]: list of allowed atom types. Must be strings of atomic numbers, e.g. ['6', '7', '8']

Default: ['6', '7', '8', '9', '15', '16', '17', '35']
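One way a list of atomic numbers could be expanded into reaction SMARTS strings is shown below. This is only a sketch: `change_atom_smarts` is a hypothetical helper, and the patterns the library actually builds may carry extra constraints (for example on valence or ring membership).

```python
from itertools import permutations


def change_atom_smarts(atom_types):
    """Build one reaction SMARTS per ordered (source, target) pair of atomic numbers.

    Each pattern maps an atom with atomic number `src` to the same mapped atom
    with atomic number `dst`, leaving its bonds untouched.
    """
    return [f"[#{src}:1]>>[#{dst}:1]" for src, dst in permutations(atom_types, 2)]


patterns = change_atom_smarts(["6", "7", "8"])
```

Strings of this form could then be passed as the smarts argument of a SmartsMutator.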

class AppendAtomSingle[source]

AppendAtomSingle(atom_types=None) :: SmartsMutator

AppendAtomSingle - SMARTS-based mutator that appends an atom somewhere on the input structure with a single bond

Inputs:

  • atom_types Optional[list[str]]: list of allowed atom types. Must be strings of atomic symbols, ie ['C', 'N', 'O']

Default: ['C', 'N', 'O', 'F', 'P', 'S', 'Cl', 'Br']

class AppendAtomsDouble[source]

AppendAtomsDouble(atom_types=None) :: SmartsMutator

AppendAtomsDouble - SMARTS-based mutator that appends an atom somewhere on the input structure with a double bond

Inputs:

  • atom_types Optional[list[str]]: list of allowed atom types. Must be strings of atomic symbols, ie ['C', 'N', 'O']. Atom types must be compatible with forming a double bond

Default: ['C', 'N', 'O', 'P', 'S']

class AppendAtomsTriple[source]

AppendAtomsTriple(atom_types=None) :: SmartsMutator

AppendAtomsTriple - SMARTS-based mutator that appends an atom somewhere on the input structure with a triple bond

Inputs:

  • atom_types Optional[list[str]]: list of allowed atom types. Must be strings of atomic symbols, ie ['C', 'N']. Atom types must be compatible with forming a triple bond

Default: ['C', 'N']

class AppendAtom[source]

AppendAtom() :: SmartsMutator

AppendAtom - SMARTS-based mutator that appends an atom somewhere on the input structure.

Combines AppendAtomSingle, AppendAtomsDouble and AppendAtomsTriple

class DeleteAtom[source]

DeleteAtom() :: SmartsMutator

DeleteAtom - SMARTS-based mutator that randomly deletes an atom from the input structure

class ChangeBond[source]

ChangeBond() :: SmartsMutator

ChangeBond - SMARTS-based mutator that randomly changes a bond in the input structure

class InsertAtomSingle[source]

InsertAtomSingle(atom_types=None) :: SmartsMutator

InsertAtomSingle - SMARTS-based mutator that randomly inserts an atom into the input structure with single bonds

Inputs:

  • atom_types Optional[list[str]]: list of allowed atom types. Must be strings of atomic symbols, ie ['C', 'N', 'O']

Default: ['C', 'N', 'O', 'P', 'S']

class InsertAtomDouble[source]

InsertAtomDouble(atom_types=None) :: SmartsMutator

InsertAtomDouble - SMARTS-based mutator that randomly inserts an atom into the input structure with a double bond

Inputs:

  • atom_types Optional[list[str]]: list of allowed atom types. Must be strings of atomic symbols, ie ['C', 'N', 'O']. Atom types must be compatible with forming a double bond

Default: ['C', 'N', 'P', 'S']

class InsertAtomTriple[source]

InsertAtomTriple() :: SmartsMutator

InsertAtomTriple - SMARTS-based mutator that randomly inserts an atom into the input structure with a triple bond

class InsertAtom[source]

InsertAtom() :: SmartsMutator

InsertAtom - SMARTS-based mutator that randomly inserts an atom into the input structure.

Combines InsertAtomSingle, InsertAtomDouble and InsertAtomTriple

class AddRing[source]

AddRing() :: SmartsMutator

AddRing - SMARTS-based mutator that randomly creates rings

class AllSmarts[source]

AllSmarts() :: SmartsMutator

AllSmarts - SMARTS-based mutator that combines ChangeAtom, AppendAtom, DeleteAtom, ChangeBond, InsertAtom, and AddRing

class AppendRgroupMutator[source]

AppendRgroupMutator(rgroups, name='Rgroup') :: Mutator

AppendRgroupMutator - randomly appends r-groups to the input molecule

Inputs:

  • rgroups list[str]: list of rgroups. All rgroups should have a single wildcard (*) atom

  • name str: mutator name

class EnumerateHeterocycleMutator[source]

EnumerateHeterocycleMutator(depth=None, name='enum heteroatoms') :: Mutator

EnumerateHeterocycleMutator - mutates input molecule by enumerating nitrogens on heterocycles

Inputs:

  • depth Optional[int]: number of recursive enumerations

  • name str: mutator name

m = EnumerateHeterocycleMutator()
len(m(df.smiles.values[1]))
72

class ShuffleNitrogen[source]

ShuffleNitrogen(n_shuffles, name='shuffle nitrogen') :: Mutator

ShuffleNitrogen - mutates input molecule by shuffling the positions of carbon and nitrogen atoms in the molecule

Inputs:

  • n_shuffles int: number of shuffled variants to generate

  • name str: mutator name

class ContractAtom[source]

ContractAtom(include_rings=True, name='contract') :: Mutator

ContractAtom - mutates input molecule by removing an atom with two bonds and joining the removed atom's neighbors with a single bond.

e.g. a-b-c -> a-c

Inputs:

  • include_rings bool: if True, rings will be contracted

  • name str: mutator name
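The a-b-c -> a-c contraction can be sketched on a bare undirected graph. The `contract` function and the set-of-frozensets bond representation are illustrative assumptions; a real implementation would also have to handle bond orders, valence and the include_rings option.

```python
def contract(bonds, atom):
    """Remove `atom` (which must have exactly two neighbours) and bond its neighbours.

    bonds: set of frozenset({i, j}) pairs describing an undirected graph.
    """
    neighbours = sorted(n for bond in bonds if atom in bond for n in bond if n != atom)
    if len(neighbours) != 2:
        raise ValueError("atom must have exactly two bonds")
    # Drop every bond touching the removed atom, then join its two neighbours.
    kept = {bond for bond in bonds if atom not in bond}
    kept.add(frozenset(neighbours))
    return kept


chain = {frozenset("ab"), frozenset("bc"), frozenset("cd")}  # a-b-c-d
contracted = contract(chain, "b")                            # a-c-d
```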

class SelfiesMutator[source]

SelfiesMutator(n_augs, name='selfies') :: Mutator

SelfiesMutator - base class for SELFIES based mutation

Inputs:

  • n_augs int: number of mutated versions to generate

  • name str: mutator name

class SelfiesInsert[source]

SelfiesInsert(n_augs, name='selfies insert') :: SelfiesMutator

SelfiesInsert - SELFIES insertion mutator. Randomly inserts a SELFIES token into the input compound

Inputs:

  • n_augs int: number of mutated versions to generate

  • name str: mutator name

m = SelfiesInsert(20)
mol = to_mol('c1ccccc1')
len(m(mol))
20

class SelfiesReplace[source]

SelfiesReplace(n_augs, name='selfies replace') :: SelfiesMutator

SelfiesReplace - SELFIES replacement mutator. Randomly replaces a SELFIES token in the input compound

Inputs:

  • n_augs int: number of mutated versions to generate

  • name str: mutator name

m = SelfiesReplace(20)
mol = to_mol('c1ccccc1')
len(m(mol))
20

class SelfiesRemove[source]

SelfiesRemove(n_augs, name='selfies remove') :: SelfiesMutator

SelfiesRemove - SELFIES removal mutator. Randomly removes a SELFIES token in the input compound

Inputs:

  • n_augs int: number of mutated versions to generate

  • name str: mutator name
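The insert, replace and remove operations can be pictured as edits on a list of SELFIES tokens. The sketch below is a plain-Python illustration with a hypothetical four-token vocabulary; it does not call the selfies package and does not check that the edited string decodes to a valid molecule.

```python
import random

VOCAB = ["[C]", "[N]", "[O]", "[=C]"]  # hypothetical token vocabulary


def insert_token(tokens, rng):
    """Insert a random vocabulary token at a random position."""
    i = rng.randrange(len(tokens) + 1)
    return tokens[:i] + [rng.choice(VOCAB)] + tokens[i:]


def replace_token(tokens, rng):
    """Replace the token at a random position with a random vocabulary token."""
    i = rng.randrange(len(tokens))
    return tokens[:i] + [rng.choice(VOCAB)] + tokens[i + 1:]


def remove_token(tokens, rng):
    """Delete the token at a random position."""
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def augment(tokens, edit, n_augs, seed=0):
    """Apply `edit` independently n_augs times and return the edited strings."""
    rng = random.Random(seed)
    return ["".join(edit(tokens, rng)) for _ in range(n_augs)]


tokens = ["[C]", "[C]", "[O]"]  # SELFIES-style tokens for ethanol
inserted = augment(tokens, insert_token, 20)
replaced = augment(tokens, replace_token, 20)
removed = augment(tokens, remove_token, 20)
```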

class MutatorCollection[source]

MutatorCollection(mutators, p_mutators=None)

MutatorCollection - orchestrates a set of Mutator classes. When called, randomly selects a mutator to apply to the input mol

Inputs:

  • mutators list[Mutator]: list of mutator objects

  • p_mutators Optional[list[float]]: Optional list of probabilities for selecting each mutator. If None, a uniform distribution is applied
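The random selection step can be sketched with `random.choices` from the standard library. `ToyMutatorCollection` is a hypothetical stand-in in which any callable acts as a mutator; it is not the library's class.

```python
import random


class ToyMutatorCollection:
    """Hypothetical stand-in: picks one mutator per call, weighted by p_mutators."""

    def __init__(self, mutators, p_mutators=None, seed=0):
        self.mutators = list(mutators)
        if p_mutators is None:
            # None means a uniform distribution over the mutators.
            p_mutators = [1.0 / len(self.mutators)] * len(self.mutators)
        self.p_mutators = list(p_mutators)
        self.rng = random.Random(seed)

    def __call__(self, item):
        mutator = self.rng.choices(self.mutators, weights=self.p_mutators, k=1)[0]
        return mutator(item)


# Two toy "mutators"; all probability mass is placed on the first one.
collection = ToyMutatorCollection([str.upper, str.lower], p_mutators=[1.0, 0.0])
result = collection("cCo")
```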

class CombiChem[source]

CombiChem(mutator_collection=None, crossovers=None, template=None, rewards=None, prune_percentile=90, max_library_size=None, log=False, p_explore=0.0)

CombiChem - class for running combichem operations

Inputs:

  • mutator_collection Optional[MutatorCollection]: Collection of mutations to use

  • crossovers Optional[list[Crossover]]: list of Crossover objects

  • template Optional[Template]: Template to control chemical space

  • rewards Optional[Reward]: Rewards to score molecules

  • prune_percentile int[0,100]: Score percentile used as the pruning cutoff. Compounds scoring at or above this percentile are kept

  • max_library_size Optional[int]: Maximum library size after pruning

  • log bool: If True, compounds generated by combichem are logged

  • p_explore float[0.,1.]: Fraction of compounds below prune_percentile to keep
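How prune_percentile, max_library_size and p_explore might interact during pruning is sketched below. The `prune` function is hypothetical, and the exact semantics (rank-based cutoff, independent coin flip per low-scoring compound, truncation applied last) are assumptions rather than the library's documented behaviour.

```python
import random


def prune(scored, prune_percentile=90, max_library_size=None, p_explore=0.0, seed=0):
    """scored: list of (item, score) pairs. Returns the retained pairs, best first."""
    rng = random.Random(seed)
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    # Assumption: items ranked at or above the given percentile survive.
    n_top = max(1, round(len(ranked) * (100 - prune_percentile) / 100))
    kept = ranked[:n_top]
    # Assumption: each lower-scoring item is kept with probability p_explore.
    kept += [pair for pair in ranked[n_top:] if rng.random() < p_explore]
    # Assumption: the size cap is applied after exploration picks are added.
    if max_library_size is not None:
        kept = kept[:max_library_size]
    return kept


scored = [(f"mol{i}", float(i)) for i in range(20)]
survivors = prune(scored, prune_percentile=90)
```

With p_explore=0.0 only the top slice survives; raising it lets some lower-scoring compounds through, which keeps the library from collapsing onto a few high scorers.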