Template Callback
The TemplateCallback class is used by the Environment to interface with a Template during training.
Templates serve two roles during training: filtering and scoring of samples.
Filtering
Templates filter all samples added to the Buffer and all samples drawn during each batch. If the argument prefilter=True is passed to the template callback, the template will remove all samples that fail the template's hard filters. If prefilter=False is passed, Template.validate will be used to remove invalid compounds, but compounds that violate the hard filters will be kept.
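The two prefilter modes can be sketched in plain Python. This is only an illustration of the rule described above, not the library's implementation: filter_samples, is_valid and passes_hard_filters are hypothetical stand-ins, and the sketch assumes that any sample passing the hard filters is also valid.

```python
from typing import Callable, List


def filter_samples(
    samples: List[str],
    is_valid: Callable[[str], bool],             # hypothetical stand-in for Template.validate
    passes_hard_filters: Callable[[str], bool],  # hypothetical stand-in for the hard filters
    prefilter: bool,
) -> List[str]:
    """Keep samples according to the prefilter rule (illustrative only)."""
    if prefilter:
        # prefilter=True: drop everything that fails the hard filters
        return [s for s in samples if passes_hard_filters(s)]
    # prefilter=False: drop only invalid samples, keep hard-filter violations
    return [s for s in samples if is_valid(s)]


# Toy checks: an empty string is "invalid", more than 3 characters "fails" the hard filter
toy_is_valid = lambda s: len(s) > 0
toy_hard = lambda s: toy_is_valid(s) and len(s) <= 3

samples = ["CCO", "", "CCCCCC"]
strict = filter_samples(samples, toy_is_valid, toy_hard, prefilter=True)
lenient = filter_samples(samples, toy_is_valid, toy_hard, prefilter=False)
```

With prefilter=False, samples that violate the hard filters stay in the batch, so the model can still be trained on them.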
Scoring
If the Template has any soft filters, those filters will be used to score compounds each batch. The aggregate soft filter score will be multiplied by TemplateCallback.weight and added to the total reward.
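As a rough sketch of this weighting, assuming per-sample rewards and soft filter scores are plain lists of floats (the function and argument names here are hypothetical, not part of the library):

```python
from typing import List


def add_template_reward(
    total_rewards: List[float],  # rewards accumulated so far, one per sample
    soft_scores: List[float],    # aggregate soft filter score, one per sample
    weight: float,               # plays the role of TemplateCallback.weight
) -> List[float]:
    """Add the weighted soft filter score to each sample's total reward."""
    return [r + weight * s for r, s in zip(total_rewards, soft_scores)]


updated = add_template_reward([1.0, 2.0], [0.5, 1.0], weight=2.0)
```

A weight of 0 would leave the total reward unchanged, so in this sketch the template would then act as a filter only.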
Contrastive Template
The ContrastiveTemplate class applies a template to tasks based around comparing input and output compounds (i.e. making relative improvements to a compound's properties).
During filtering, the contrastive template will keep samples where both input and output compounds pass the filters.
Contrastive templates also use a similarity function to impose a similarity constraint on sample pairs (i.e. the output compound must have a similarity of X to the input compound).
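The pair-level filtering can be illustrated with a small sketch. Everything here is an assumption made for illustration: keep_pair is a hypothetical helper, the similarity constraint is modeled as a lower and upper bound, and the toy similarity is a Jaccard index over character sets rather than a real chemical fingerprint similarity.

```python
from typing import Callable


def toy_similarity(a: str, b: str) -> float:
    """Jaccard index over character sets (toy stand-in for a fingerprint similarity)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def keep_pair(
    source: str,
    target: str,
    passes_filters: Callable[[str], bool],    # hypothetical stand-in for the template filters
    similarity: Callable[[str, str], float],
    min_sim: float,
    max_sim: float,
) -> bool:
    """Keep a pair only if both compounds pass and the similarity is within bounds."""
    if not (passes_filters(source) and passes_filters(target)):
        return False
    return min_sim <= similarity(source, target) <= max_sim


non_empty = lambda s: len(s) > 0

close_pair = keep_pair("CCO", "CCN", non_empty, toy_similarity, 0.3, 1.0)
distant_pair = keep_pair("CCO", "NNN", non_empty, toy_similarity, 0.3, 1.0)
failed_filter = keep_pair("CCO", "", non_empty, toy_similarity, 0.0, 1.0)
```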
One consideration in using contrastive scores is how to scale them properly. If we have a score with a maximum value of 1, a contrastive sample pair where the score goes from 0.8 to 1 should get the same reward as a sample pair where the score goes from 0.6 to 1. In both cases, the model maximized the output score to the greatest extent possible. To do this, we can scale the reward difference by the maximum possible reward difference (i.e. reward = (output_reward - input_reward)/(max_reward - input_reward)). Passing a value to max_score will cause the contrastive template to perform this scaling.
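The scaling formula can be checked with a short sketch. The function name is hypothetical, and the handling of an input score that is already at the maximum (returning 0) is an assumption of this sketch, not documented behavior.

```python
from typing import Optional


def contrastive_reward(
    input_score: float,
    output_score: float,
    max_score: Optional[float] = None,
) -> float:
    """Reward for an (input, output) pair, optionally scaled by the available headroom."""
    diff = output_score - input_score
    if max_score is None:
        return diff  # unscaled: raw improvement
    headroom = max_score - input_score
    if headroom <= 0:
        # Assumption: no improvement is possible, so give no reward
        return 0.0
    return diff / headroom


# Both pairs reach the maximum score, so both get the full scaled reward
from_08 = contrastive_reward(0.8, 1.0, max_score=1.0)
from_06 = contrastive_reward(0.6, 1.0, max_score=1.0)
# Closing half of the remaining gap gives about half the reward
half_way = contrastive_reward(0.6, 0.8, max_score=1.0)
# Without max_score the raw difference is returned
unscaled = contrastive_reward(0.6, 1.0)
```

Without scaling, the pair starting at 0.6 would receive twice the reward of the pair starting at 0.8, even though both outputs reach the maximum.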