Callbacks
The training cycle in MRL is built around the Callback system. Rather than trying to explicitly define every training cycle variant, Callbacks define a series of events (see Events) that occur during training and allow users to easily hook into those events. The result is an extremely flexible framework that can adapt to most generative design challenges.
Callbacks use the __call__ method to dispatch events. When a Callback is called, it is passed an event name, such as compute_reward. If the Callback has an attribute that matches the event name, that attribute is called.
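The dispatch pattern described above can be sketched in a few lines of plain Python. This is an illustration of the idea, not MRL's actual Callback class: the class names, the `calls` list, and the choice to silently ignore events with no matching attribute are all assumptions made for the example.

```python
class Callback:
    """Minimal stand-in for an event-dispatching callback (not MRL's class)."""

    def __init__(self, name="base"):
        self.name = name

    def __call__(self, event_name):
        # Look up an attribute whose name matches the event name.
        event = getattr(self, event_name, None)
        if event is not None:
            return event()
        # Assumption for this sketch: unmatched events are simply ignored.
        return None


class RewardLogger(Callback):
    """Hypothetical callback that only responds to the compute_reward event."""

    def __init__(self):
        super().__init__(name="reward_logger")
        self.calls = []

    def compute_reward(self):
        # Record that the event fired; a real callback would do work here.
        self.calls.append("compute_reward")


cb = RewardLogger()
cb("compute_reward")  # a matching attribute exists, so it is called
cb("compute_loss")    # no matching attribute in this sketch, so nothing happens
```

Dispatching by name means a new callback only has to define the events it cares about.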
Callbacks have access to the training environment (see Environment), including the model/agent, the training buffer, the training log, other callbacks, and all other aspects of the training state.
Batch State
The BatchState class is used by an Environment to track values generated or computed during a batch. Every batch, the old BatchState is deleted and a new BatchState is created.
Attributes in BatchState can be set or accessed either by key, as with a dictionary, or as ordinary attributes. BatchState can hold any arbitrary value during a batch. However, it was designed for the use case where every attribute is either a single value or a list/container with length equal to the current batch size.
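A minimal sketch of a container with this dual access style is shown below. It is not MRL's BatchState implementation; the attribute names (`samples`, `scores`, `step`) and values are hypothetical and chosen only to show the intended shapes.

```python
class BatchState:
    """Illustrative container supporting both key and attribute access."""

    def __getitem__(self, key):
        # state["x"] reads the same value as state.x
        return getattr(self, key)

    def __setitem__(self, key, value):
        # state["x"] = v stores the same value as state.x = v
        setattr(self, key, value)


batch_size = 4
state = BatchState()

# Per-item values: one entry for each item in the batch.
state.samples = ["CCO", "CCN", "c1ccccc1", "CC(=O)O"]
state["scores"] = [0.1, 0.7, 0.3, 0.9]

# Single value shared by the whole batch.
state.step = 12
```

In this sketch both access styles read and write the same underlying attribute, so `state["samples"]` and `state.samples` refer to the same list.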
Rewards
BatchState holds the rewards value for a batch. All reward functions should ultimately add their reward value to BatchState.rewards. See Reward for more information.
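As a rough illustration of this accumulation pattern, the sketch below has two reward functions each add their per-item values to a shared `rewards` entry. `SimpleNamespace` stands in for BatchState, and `length_reward`, `novelty_reward`, and the `seen` set are hypothetical names invented for the example, not part of MRL.

```python
from types import SimpleNamespace


def length_reward(state):
    # Hypothetical reward: one point per character in each sample.
    values = [float(len(s)) for s in state.samples]
    state.rewards = [r + v for r, v in zip(state.rewards, values)]


def novelty_reward(state, seen):
    # Hypothetical reward: one point for samples not seen before.
    values = [0.0 if s in seen else 1.0 for s in state.samples]
    state.rewards = [r + v for r, v in zip(state.rewards, values)]


# Stand-in for a fresh batch state with rewards initialised to zero.
state = SimpleNamespace(samples=["CCO", "CCN", "CCCC"], rewards=[0.0, 0.0, 0.0])

length_reward(state)
novelty_reward(state, seen={"CCO"})

total_rewards = state.rewards  # sum of both reward terms, one value per sample
```

Because each function adds to the running total rather than overwriting it, reward terms can be combined in any order.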
Loss
BatchState holds the loss value for a batch. This is the value that will be backpropagated during the optimizer update. All loss functions should ultimately add their value to BatchState.loss. See Loss for more information.
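The sketch below illustrates the per-batch lifecycle under simplifying assumptions: a fresh state is created for each batch, several loss functions add to `loss`, and the final value is what an optimizer step would consume. Plain floats stand in for a differentiable value, and `policy_loss`, `constant_penalty`, and `run_batch` are hypothetical names, not MRL functions.

```python
from types import SimpleNamespace


def policy_loss(state):
    # Hypothetical loss term: negative mean reward for the batch.
    state.loss += -sum(state.rewards) / len(state.rewards)


def constant_penalty(state):
    # Hypothetical second term, added on top of the first.
    state.loss += 0.5


def run_batch(rewards, loss_functions):
    # A new state per batch, so no loss carries over from the previous batch.
    state = SimpleNamespace(rewards=rewards, loss=0.0)
    for loss_fn in loss_functions:
        loss_fn(state)
    # In a real training loop this value would be backpropagated.
    return state.loss


batches = ([1.0, 3.0], [4.0, 4.0])
losses = [run_batch(r, [policy_loss, constant_penalty]) for r in batches]
```

Each batch starts from zero, so the second batch's loss reflects only its own terms.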