Small Molecule Linker Optimization

Linker Optimization

Often in drug discovery, we want to experiment with changing a specific part of a molecule while keeping the rest constant. One example is optimizing a single linker on a compound.

One way to approach this is to build a dataset of compounds containing the same core molecule with different linkers. This, however, can pose a problem for generative models, which might "lose track" of the desired core structure during generative screening if it is not strongly enforced.

An efficient solution to this is to only generate the linker.

This tutorial looks at using the BlockTemplate module with a pre-trained linker generative model to optimize a single linker.

Performance Notes

The workflow in this notebook is more CPU-constrained than GPU-constrained due to the need to evaluate samples on CPU. If you have a multi-core machine, it is recommended that you uncomment and run the set_global_pool cells in the notebook. This will trigger the use of multiprocessing, which will result in 2-4x speedups.

This notebook may run slowly on Colab due to CPU limitations.

If running on Colab, remember to change the runtime type to GPU.

import sys
sys.path.append('..')

from mrl.imports import *
from mrl.core import *
from mrl.chem import *
from mrl.templates.all import *

from mrl.torch_imports import *
from mrl.torch_core import *
from mrl.layers import *
from mrl.dataloaders import *
from mrl.g_models.all import *
from mrl.vocab import *
from mrl.policy_gradient import *
from mrl.train.all import *
from mrl.model_zoo import *
from mrl.combichem import *
from sklearn.metrics import r2_score
os.makedirs('untracked_files', exist_ok=True)

Reward

To start, we set up our reward function, reusing the reward from the EGFR Affinity Optimization tutorial. See that tutorial for a full description of the score function and how it was trained.

In brief, the score function predicts the affinity of a compound against the HER1 variant of EGFR.

# MLP head that maps a 2048-bit fingerprint to a single predicted affinity,
# constrained to the range [0, 15]
reward_model = MLP(2048, [1024, 512, 256, 128], 1, [0.2, 0.2, 0.2, 0.2], outrange=[0,15])

# placeholder dataset; featurizes SMILES with ECFP6 fingerprints
r_ds = Vec_Prediction_Dataset(['C'], [0], partial(failsafe_fp, fp_function=ECFP6))

r_agent = PredictiveAgent(reward_model, MSELoss(), r_ds, opt_kwargs={'lr':1e-3})

# load the pretrained EGFR affinity weights
r_agent.load_state_dict(model_from_url('egfr_affinity_mlp.pt'))

r_agent.model.eval();

# freeze the model; we only use it for inference
freeze(r_agent.model)

reward = Reward(r_agent.predict_data, weight=1.)

# score the fused (scaffold + linker) molecules rather than the bare linkers
aff_reward = RewardCallback(reward, 'affinity', sample_name='samples_fused')

Base Compound

We will look at optimizing a linker in Gefitinib, an FDA-approved EGFR inhibitor.

gefitinib = 'COc1cc2ncnc(Nc3ccc(F)c(Cl)c3)c2cc1OCCCN1CCOCC1'
to_mol(gefitinib)

In this tutorial, we will optimize the linker between the scaffold and the morpholine ring.

smile1 = '*c1cc2c(Nc3ccc(F)c(Cl)c3)ncnc2cc1OC'
smile2 = '*N1CCOCC1'
to_mol(smile1+'.'+smile2)

Gefitinib gets a predicted score of 7.84 from our reward function. For benchmarking our results, we want to see a noticeably higher score.

reward([gefitinib])
tensor(7.8399, device='cuda:0')

Chemical Space

Next we need to develop our chemical space. This is where we decide what compounds will be allowed and what compounds will be removed.

Getting the right filtering parameters makes a huge difference in compound quality. In practice, finding the right constraints is an iterative process: run a generative screen, examine the highest-scoring compounds for undesirable properties or structural features, then update the template and repeat.

For linker optimization, we will make use of the BlockTemplate class. Block templates allow us to make a series of nested templates to control different parts of the molecule.

We will create one template for just the linker, and another template for the full molecule. When we generate a linker, it will first be checked against the linker template. Then the linker will be fused to the scaffold and evaluated using the full molecule template.

This allows us to set separate constraints on structures in the linker versus the full molecule.

For the linker, we specify that it should have zero rings and a molecular weight between 20 and 200 g/mol.

linker_template = Template(
                    [MolWtFilter(20, 200),
                     RingFilter(None,0)],
                    [],
                    fail_score=-1, log=False
                    )

# SMARTS patterns for substructures we want to exclude from generated compounds
smarts = ['[#6](=[#16])(-[#7])-[#7]',
          '[#6]=[#6]=[#6]',
          '[#7][F,Cl,Br,I]',
          '[*]#[Cl,Br,I]',
          '[*]-[Cl,Br,I]-[*]',
          '[#6;!R]=[#6;!R]-[#6;!R]=[#6;!R]',
          '[#6]#[#6]',
          '[#6;!R]-[#7;!R](-[#6;!R])-[#6;!R]',
          '[#15]',
          '[#16]',
          '[*]=[#17,#9,#35]',
          '[*]=[*]=[*]',
          '[*]-[#6]=[#6H2]',
          '[#7]~[#8]',
          '[#7]~[#7]',
          '[*;R]=[*;!R]']
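
To illustrate what these patterns do (a minimal check using RDKit directly, not part of the original workflow), the ExclusionFilter used in the full template below rejects any compound matching at least one pattern:

from rdkit import Chem

# '[#6]#[#6]' matches carbon-carbon triple bonds (alkynes)
pattern = Chem.MolFromSmarts('[#6]#[#6]')
print(Chem.MolFromSmiles('CC#CC').HasSubstructMatch(pattern))  # True -> rejected
print(Chem.MolFromSmiles('CCCC').HasSubstructMatch(pattern))   # False -> passes this pattern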

full_template = Template([ValidityFilter(), 
                     SingleCompoundFilter(), 
                     RotBondFilter(None, 10),
                     HeteroatomFilter(None, 14),
                     ChargeFilter(None, 0),
                     MaxRingFilter(None, 6),
                     MinRingFilter(5, None),
                     HBDFilter(None, 5),
                     HBAFilter(None, 10),
                     MolWtFilter(None, 560),
                     LogPFilter(None, 5),
                     SAFilter(None, 7),
                     BridgeheadFilter(None,0),
                     PAINSAFilter(),
                     ExclusionFilter(smarts, criteria='any'),
                     RotChainFilter(None, 9)
                    ],
                    [], 
                    fail_score=-1., log=False, use_lookup=False)

template = LinkerBlockTemplate(smile1, smile2, linker_template,
                                    full_molecule_template=full_template)

template_cb = TemplateCallback(template, prefilter=True)
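
As a quick sanity check (hedged: the tuple layout returned by recurse_fragments is inferred from how it is indexed later in this notebook), we can pass Gefitinib's own propoxy linker, written in mapped form, through the block template:

# hypothetical check: Gefitinib's own linker in mapped form
test_linker = '[2*:1]OCCC[2*:2]'
results = template.recurse_fragments([test_linker])
# each result appears to be a tuple whose first element is a SMILES string
# and whose third element is a pass/fail flag
for res in results:
    print(res[0], res[2])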

Load Model

We load the LSTM_LM_Small_Linkers_Mapped model. This is a basic LSTM-based language model trained on linkers generated from the ZINC library. The "Mapped" refers to the fact that this particular model is trained to generate linkers of the form [2*:1]...[2*:2], where [2*:1] denotes the side of the linker attached to smile1 and [2*:2] denotes the side attached to smile2.

This mapping is required to orient the linker with respect to our two fragments.

agent = LSTM_LM_Small_Linkers_Mapped(drop_scale=0., opt_kwargs={'lr':5e-5})
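
Before fine-tuning, it can be helpful to eyeball a few raw samples to confirm they have the expected mapped form (an optional check, using the same sample_no_grad and reconstruct calls as the sampling loop below):

# draw a handful of linkers from the untuned model; each should
# have the form [2*:1]...[2*:2]
preds, _ = agent.model.sample_no_grad(5, 50)
for s in agent.reconstruct(preds):
    print(s)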

Fine-Tune Model

Our initial model will generate structures that are incompatible with our template. This can slow down training by wasting time generating, evaluating, and filtering unwanted compounds.

To prevent this, we first generate a dataset of valid linkers and fine-tune the model.

First we sample a large number of linkers from the model. Then we use our template to filter for linkers that conform to it. Finally, we fine-tune on the template-conforming subset.

set_global_pool(min(16, os.cpu_count()))
%%time
all_smiles = set()
for i in range(300):
    if i%100==0:
        print(i)
    # sample 3000 sequences of up to 50 tokens, without gradient tracking
    preds, _ = agent.model.sample_no_grad(3000, 50)
    smiles = agent.reconstruct(preds)
    mols = to_mols(smiles)
    # keep only SMILES that parse into valid molecules
    smiles = [smiles[i] for i in range(len(smiles)) if mols[i] is not None]
    all_smiles.update(set(smiles))
0
100
200
CPU times: user 32min 54s, sys: 1min 13s, total: 34min 8s
Wall time: 3min 28s
len(all_smiles)
722056
set_global_pool(min(64, os.cpu_count()))

Here we filter the sampled SMILES with the template.

all_smiles = list(all_smiles)
# recurse_fragments checks each linker against the linker template, fuses it
# with the fragments, and checks the result against the full molecule template
screened = template.recurse_fragments(all_smiles)
screened = [i for i in screened if i[2]]  # keep entries that passed
screened = [i[0] for i in screened]       # keep just the SMILES
screened = list(set(screened))
len(screened)
19030

Optional: save dataset

df2 = pd.DataFrame(screened, columns=['smiles'])
df2.to_csv('untracked_files/pretrain_data.csv', index=False)
close_global_pool()

Now we fine-tune on our new template-conforming dataset.

# screened = df2.smiles.values  # uncomment to reload the saved dataset
agent.update_dataset_from_inputs(screened)
agent.train_supervised(64, 6, 1e-4)  # batch size 64, 6 epochs, lr 1e-4
agent.base_to_model()
Epoch Train Loss Valid Loss Time
0 0.49323 0.46358 00:12
1 0.48189 0.44836 00:12
2 0.45391 0.44273 00:12
3 0.46761 0.44317 00:12
4 0.43734 0.44895 00:12
5 0.43831 0.45274 00:12

Optional: save fine-tuned weights

agent.save_weights('untracked_files/finetuned_model.pt')
agent.load_weights('untracked_files/finetuned_model.pt')

Reinforcement Learning

Now we enter the reinforcement learning stage.

Loss

We use PPO as our policy gradient loss.
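
As a reminder (standard PPO, not specific to MRL), the clipped surrogate objective is

$$L^{CLIP}(\theta) = \mathbb{E}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\right)\right]$$

where $r_t(\theta) = \pi_\theta(a_t \mid s_t) / \pi_{\theta_\text{old}}(a_t \mid s_t)$ is the probability ratio between the current and previous policy and $\hat{A}_t$ is the advantage estimate. The cliprange argument below plays the role of $\epsilon$, v_cliprange applies the same clipping to the value function update, and ent_coef weights an entropy bonus that encourages exploration.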

pg = PPO(0.99,
        0.5,
        lam=0.95,
        v_coef=0.5,
        cliprange=0.3,
        v_cliprange=0.3,
        ent_coef=0.01,
        kl_target=0.03,
        kl_horizon=3000,
        scale_rewards=True)

loss = PolicyLoss(pg, 'PPO', 
                   value_head=ValueHead(256), 
                   v_update_iter=2, 
                   vopt_kwargs={'lr':1e-3})

Samplers

We create the following samplers:

  • sampler1 ModelSampler: this samples from the main model
  • sampler2 ModelSampler: this samples from the baseline model
  • sampler3 LogSampler: this samples high scoring samples from the log
  • sampler4 CombichemSampler: this sampler runs combichem generation on the top-scoring compounds. Combining generative models with combichem greatly accelerates finding high-scoring compounds.

gen_bs = 1500

sampler1 = ModelSampler(agent.vocab, agent.model, 'live', 400, 0., gen_bs)
sampler2 = ModelSampler(agent.vocab, agent.base_model, 'base', 400, 0., gen_bs)
sampler3 = LogSampler('samples', 'rewards', 10, 98, 100)

mutators = [
    ChangeAtom(['6', '7', '8', '9']),
    AppendAtomSingle(['C', 'N', 'O', 'F', 'Cl', 'Br']),
    AppendAtomsDouble(['C', 'N', 'O']),
    DeleteAtom(),
    ChangeBond(),
    InsertAtomSingle(['C', 'N', 'O']),
    InsertAtomDouble(['C', 'N']),
    AddRing(),
    ShuffleNitrogen(10)
]

mc = MutatorCollection(mutators)

crossovers = [FragmentCrossover()]

cbc = CombiChem(mc, crossovers, template=template, rewards=[],
                prune_percentile=70, max_library_size=400, log=True, p_explore=0.2)

sampler4 = CombichemSampler(cbc, 20, 98, 0.2, 1, 'rewards', 'combichem')

samplers = [sampler1, sampler2, sampler3, sampler4]

Callbacks

Additional callbacks: SupervisedCB periodically fine-tunes the model on high-scoring samples, while PercentileCallback and MaxCallback track the 90th percentile and maximum reward of live samples.

supervised_cb = SupervisedCB(agent, 20, 0.5, 98, 1e-4, 64, epochs=2)
live_max = MaxCallback('rewards', 'live')
live_p90 = PercentileCallback('rewards', 'live', 90)

cbs = [supervised_cb, live_p90, live_max]

Environment and Train

Now we can put together our Environment and run the training process

env = Environment(agent, template_cb, samplers=samplers, rewards=[aff_reward], losses=[loss],
                 cbs=cbs)
set_global_pool(min(10, os.cpu_count()))
env.fit(128, 50, 350, 20)  # batch size 128, 350 iterations, logging every 20 (the 50 is assumed to be the max sample length)
iterations rewards rewards_final new diversity bs template valid affinity PPO rewards_live_p90 rewards_live_max
0 7.384 7.384 1.000 1.000 128 0.000 1.000 7.384 0.103 7.986 8.269
20 7.611 7.611 0.805 1.000 128 0.000 1.000 7.611 0.104 7.970 8.137
40 7.556 7.556 0.727 1.000 128 0.000 1.000 7.556 0.118 7.927 8.454
60 7.741 7.741 0.594 1.000 128 0.000 1.000 7.741 0.103 7.905 8.437
80 7.753 7.753 0.562 1.000 128 0.000 1.000 7.753 0.102 7.993 8.429
100 7.767 7.767 0.492 1.000 128 0.000 1.000 7.767 0.081 7.800 8.159
120 7.729 7.729 0.547 1.000 128 0.000 1.000 7.729 0.107 8.011 8.474
140 7.750 7.750 0.602 1.000 128 0.000 1.000 7.750 0.104 7.996 8.169
160 7.853 7.853 0.422 1.000 128 0.000 1.000 7.853 0.121 8.076 8.336
180 7.800 7.800 0.477 1.000 128 0.000 1.000 7.800 0.091 8.084 8.455
200 7.932 7.932 0.367 1.000 128 0.000 1.000 7.932 0.090 7.993 8.177
220 7.938 7.938 0.359 1.000 128 0.000 1.000 7.938 0.094 8.067 8.164
240 7.934 7.934 0.344 1.000 128 0.000 1.000 7.934 0.091 8.019 8.333
260 7.960 7.960 0.250 1.000 128 0.000 1.000 7.960 0.086 8.191 8.311
280 8.066 8.066 0.336 1.000 128 0.000 1.000 8.066 0.100 8.178 8.303
300 8.023 8.023 0.242 1.000 128 0.000 1.000 8.023 0.116 8.161 8.391
320 8.083 8.083 0.281 1.000 128 0.000 1.000 8.083 0.091 8.128 8.803
340 8.026 8.026 0.219 1.000 128 0.000 1.000 8.026 0.107 8.236 8.549
env.log.plot_metrics()

Evaluation

Our base compound had a score of 7.84. After 350 batches, we have found ~5000 compounds with a higher score, including 9 compounds with a score of 9.15 or higher. Since these scores represent -log(IC50), the new top compounds would represent a roughly 20x increase in predicted affinity (assuming you trust our score function 😉).
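
A quick check of that arithmetic: on a -log(IC50) scale, the fold-change in affinity is 10 raised to the difference in scores.

10**(9.15 - 7.84)  # ≈ 20.4x the predicted affinity of Gefitinib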

subset = env.log.df[env.log.df.affinity>9.15]
r_mols = to_mols(subset.samples.values)       # the generated linkers
mols = to_mols(subset.samples_fused.values)   # full molecules (scaffold + linker)
legends = [f'{subset.affinity.values[i]:.3f}' for i in range(subset.shape[0])]
draw_mols(mols, legends=legends)