PyTorch model layers

class LinearBlock[source]

LinearBlock(d_in, d_out, act=True, bn=False, dropout=0.0, **lin_kwargs) :: Module

LinearBlock - combines linear, batchnorm, ReLU and dropout layers, executed in that order. The batchnorm, activation and dropout layers are optional.

Inputs:

  • d_in int: number of input dimensions

  • d_out int: number of output dimensions

  • act bool: if True, applies a ReLU activation

  • bn bool: if True, applies 1d batchnorm

  • dropout float: dropout probability

  • **lin_kwargs dict: keyword args passed to nn.Linear

import torch

layer = LinearBlock(128, 64, bn=True, dropout=0.5)
_ = layer(torch.randn(16, 128))

class ValueHead[source]

ValueHead(d_in, dropout=0.0) :: Module

ValueHead - used in RL algorithms to predict state values

Inputs:

  • d_in int: number of input dimensions

  • dropout float: dropout probability
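
A minimal usage sketch; the (batch, 1) output shape is an assumption, based on the head predicting one value per state:

vh = ValueHead(128, dropout=0.1)
v = vh(torch.randn(16, 128))  # assumed output shape: (16, 1)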

class Conv[source]

Conv(d_in, d_out, ks=3, stride=1, padding=None, ndim=2, act=True, bn=False, dropout=0.0, **conv_kwargs) :: Module

Conv - base module for convolutions

Inputs:

  • d_in int: number of input dimensions

  • d_out int: number of output dimensions

  • ks int: kernel size

  • stride int: stride

  • padding [int, None]: padding. If None, derived from kernel size

  • ndim int: conv dimension; 1, 2 or 3 for a 1D, 2D or 3D convolution

  • act bool: if True, applies a ReLU activation

  • bn bool: if True, applies batchnorm consistent with ndim

  • dropout float: dropout probability

  • **conv_kwargs dict: keyword args passed to the underlying nn.Conv1d/nn.Conv2d/nn.Conv3d (selected by ndim)
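
A minimal sketch, assuming d_in and d_out correspond to input and output channels:

conv = Conv(3, 16, ks=3, ndim=2, bn=True, dropout=0.1)  # a 2D conv block
_ = conv(torch.randn(8, 3, 32, 32))  # (batch, channels, height, width)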

class Conv1d[source]

Conv1d(d_in, d_out, ks=3, stride=1, padding=None, act=True, bn=False, dropout=0.0, **conv_kwargs) :: Conv

Conv1d - 1D convolution

Inputs:

  • d_in int: number of input dimensions

  • d_out int: number of output dimensions

  • ks int: kernel size

  • stride int: stride

  • padding [int, None]: padding. If None, derived from kernel size

  • act bool: if True, applies a ReLU activation

  • bn bool: if True, applies 1D batchnorm

  • dropout float: dropout probability

  • **conv_kwargs dict: keyword args passed to nn.Conv1d
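
A minimal sketch on a (batch, channels, length) input, assuming d_in and d_out are channel counts:

c1 = Conv1d(8, 16, ks=5, stride=2)
_ = c1(torch.randn(4, 8, 64))  # (batch, channels, length)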

class Conv2d[source]

Conv2d(d_in, d_out, ks=3, stride=1, padding=None, act=True, bn=False, dropout=0.0, **conv_kwargs) :: Conv

Conv2d - 2D convolution

Inputs:

  • d_in int: number of input dimensions

  • d_out int: number of output dimensions

  • ks int: kernel size

  • stride int: stride

  • padding [int, None]: padding. If None, derived from kernel size

  • act bool: if True, applies a ReLU activation

  • bn bool: if True, applies 2D batchnorm

  • dropout float: dropout probability

  • **conv_kwargs dict: keyword args passed to nn.Conv2d
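
A minimal sketch on a (batch, channels, height, width) input:

c2 = Conv2d(3, 16, ks=3, bn=True)
_ = c2(torch.randn(4, 3, 28, 28))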

class Conv3d[source]

Conv3d(d_in, d_out, ks=3, stride=1, padding=None, act=True, bn=False, dropout=0.0, **conv_kwargs) :: Conv

Conv3d - 3D convolution

Inputs:

  • d_in int: number of input dimensions

  • d_out int: number of output dimensions

  • ks int: kernel size

  • stride int: stride

  • padding [int, None]: padding. If None, derived from kernel size

  • act bool: if True, applies a ReLU activation

  • bn bool: if True, applies 3D batchnorm

  • dropout float: dropout probability

  • **conv_kwargs dict: keyword args passed to nn.Conv3d
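
A minimal sketch on a (batch, channels, depth, height, width) input:

c3 = Conv3d(1, 8, ks=3, dropout=0.1)
_ = c3(torch.randn(2, 1, 16, 16, 16))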

class SphericalDistribution[source]

SphericalDistribution(loc, scale, validate_args=False) :: Distribution

SphericalDistribution - samples points on the surface of a sphere

Inputs:

  • loc torch.Tensor: vector of means

  • scale torch.Tensor: vector of variances
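
A minimal sketch, assuming the class follows the standard torch Distribution sampling API; that samples lie on the sphere surface follows from the description above:

sd = SphericalDistribution(torch.zeros(3), torch.ones(3))
s = sd.sample()  # assumed: a point on the surface of a sphere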

class Prior[source]

Prior() :: Module

Prior - base class for trainable priors

class NormalPrior[source]

NormalPrior(loc, log_scale, trainable=True) :: Prior

NormalPrior - normal distribution prior

Inputs:

  • loc torch.Tensor: vector of means

  • log_scale torch.Tensor: vector of log-variances

  • trainable bool: if True, loc and scale are trainable

Note that log-variances are used for numerical stability. Optimizing the variance directly can cause issues with gradient descent sending variance values negative.

class SphericalPrior[source]

SphericalPrior(loc, log_scale, trainable=True) :: NormalPrior

SphericalPrior - spherical distribution prior

Inputs:

  • loc torch.Tensor: vector of means

  • log_scale torch.Tensor: vector of log-variances

  • trainable bool: if True, loc and scale are trainable

Note that log-variances are used for numerical stability. Optimizing the variance directly can cause issues with gradient descent sending variance values negative.

# rsample is differentiable when the prior is trainable; sample never carries gradients
p = NormalPrior(torch.zeros((64,)), torch.zeros((64,)), trainable=True)
assert p.rsample(5).requires_grad
assert not p.sample(5).requires_grad

# with trainable=False, neither sampling path carries gradients
p = SphericalPrior(torch.zeros((2,)), torch.zeros((2,)), trainable=False)
assert not p.rsample(5).requires_grad
assert not p.sample(5).requires_grad

class SequenceDropout[source]

SequenceDropout(p, batch_first=True) :: Module

SequenceDropout - dropout along the sequence dimension

Inputs:

  • p float: dropout probability

  • batch_first bool: if batch dimension is first in input tensors

Samples a dropout mask that is constant along the sequence dimension, so the same features are dropped at every sequence position.
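
A minimal sketch on a (batch, sequence, features) input with batch_first=True:

sd = SequenceDropout(0.25)
y = sd(torch.randn(8, 20, 64))  # same mask applied at every sequence position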

class Conditional_LSTM[source]

Conditional_LSTM(d_embedding, d_hidden, d_output, d_latent, n_layers, condition_hidden=True, condition_output=True, bidir=False, input_dropout=0.0, lstm_dropout=0.0, batch_first=True) :: Module

Conditional_LSTM - Conditional LSTM module

Inputs:

  • d_embedding int: embedding dimension

  • d_hidden int: hidden dimension

  • d_output int: output dimension

  • d_latent int: latent vector dimension

  • n_layers int: number of layers

  • condition_hidden bool: if True, latent vector is used to initialize the hidden state

  • condition_output bool: if True, latent vector is concatenated to inputs

  • bidir bool: if the LSTM should be bidirectional

  • input_dropout float: dropout probability on inputs

  • lstm_dropout float: dropout on LSTM layers

  • batch_first bool: if batch dimension is first on input tensors

d_embedding = 64
d_hidden = 128
d_latent = 32
n_layers = 2

l1 = Conditional_LSTM(d_embedding, d_hidden, d_embedding, d_latent, n_layers,
                      condition_hidden=True, condition_output=True,
                      bidir=False, batch_first=True)

l2 = Conditional_LSTM(d_embedding, d_hidden, d_embedding, d_latent, n_layers,
                      condition_hidden=True, condition_output=True,
                      bidir=True, batch_first=True)

l3 = Conditional_LSTM(d_embedding, d_hidden, d_embedding, d_latent, n_layers,
                      condition_hidden=False, condition_output=True,
                      bidir=False, batch_first=True)

l4 = Conditional_LSTM(d_embedding, d_hidden, d_embedding, d_latent, n_layers,
                      condition_hidden=True, condition_output=False,
                      bidir=False, batch_first=True)

l5 = Conditional_LSTM(d_embedding, d_hidden, d_embedding, d_latent, n_layers,
                      condition_hidden=False, condition_output=False,
                      bidir=True, batch_first=True, input_dropout=0.5, lstm_dropout=0.5)

bs = 12
x = torch.randn((bs, 21, d_embedding))
z = torch.randn((bs, d_latent))

_ = l1(x,z)
_ = l1(x,z, l1.latent_to_hidden(z))

_ = l2(x,z)
_ = l2(x,z, l2.latent_to_hidden(z))

_ = l3(x,z)
_ = l3(x,z, l3.get_new_hidden(bs))

_ = l4(x,z)
_ = l4(x,z, l4.get_new_hidden(bs))

_ = l5(x,z)
_ = l5(x,None)
_ = l5(x,z, l5.get_new_hidden(bs))

class LSTM[source]

LSTM(d_embedding, d_hidden, d_output, n_layers, bidir=False, input_dropout=0.0, lstm_dropout=0.0, batch_first=True) :: Conditional_LSTM

LSTM - LSTM module

Inputs:

  • d_embedding int: embedding dimension

  • d_hidden int: hidden dimension

  • d_output int: output dimension

  • n_layers int: number of layers

  • bidir bool: if the LSTM should be bidirectional

  • input_dropout float: dropout probability on inputs

  • lstm_dropout float: dropout on LSTM layers

  • batch_first bool: if batch dimension is first on input tensors

l1 = LSTM(d_embedding, d_hidden, d_embedding, n_layers, bidir=False, batch_first=True)

l2 = LSTM(d_embedding, d_hidden, d_embedding, n_layers, bidir=True, batch_first=True,
          input_dropout=0.5, lstm_dropout=0.5)

_ = l1(x)
_ = l1(x, l1.get_new_hidden(bs))

_ = l2(x)
_ = l2(x, l2.get_new_hidden(bs))

class Conditional_LSTM_Block[source]

Conditional_LSTM_Block(d_vocab, d_embedding, d_hidden, d_output, d_latent, n_layers, input_dropout=0.0, lstm_dropout=0.0, bidir=False, condition_hidden=True, condition_output=False, forward_rollout=False, p_force=1.0, p_force_decay=1.0) :: Module

Conditional_LSTM_Block - combines Embedding, Conditional LSTM, and output layer

Inputs:

  • d_vocab int: vocab size

  • d_embedding int: embedding dimension

  • d_hidden int: hidden dimension

  • d_output int: output dimension

  • d_latent int: latent vector dimension

  • n_layers int: number of layers

  • input_dropout float: dropout probability on inputs

  • lstm_dropout float: dropout on LSTM layers

  • bidir bool: if the LSTM should be bidirectional

  • condition_hidden bool: if True, latent vector is used to initialize the hidden state

  • condition_output bool: if True, latent vector is concatenated to inputs

  • forward_rollout bool: if True, the model generates outputs through rollout with teacher forcing

  • p_force float: teacher forcing frequency

  • p_force_decay float: teacher forcing decay rate
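
A minimal sketch, assuming the forward pass mirrors Conditional_LSTM but takes integer token IDs (embedded internally) alongside the latent vector:

block = Conditional_LSTM_Block(d_vocab=32, d_embedding=64, d_hidden=128,
                               d_output=32, d_latent=16, n_layers=2)
x = torch.randint(0, 32, (8, 21))  # token ids
z = torch.randn(8, 16)             # latent vectors
_ = block(x, z)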

class LSTM_Block[source]

LSTM_Block(d_vocab, d_embedding, d_hidden, d_output, n_layers, input_dropout=0.0, lstm_dropout=0.0, bidir=False) :: Module

LSTM_Block - combines Embedding, LSTM, and output layer

Inputs:

  • d_vocab int: vocab size

  • d_embedding int: embedding dimension

  • d_hidden int: hidden dimension

  • d_output int: output dimension

  • n_layers int: number of LSTM layers

  • input_dropout float: dropout probability on inputs

  • lstm_dropout float: dropout on LSTM layers

  • bidir bool: if the LSTM should be bidirectional
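
A minimal sketch, assuming the forward pass takes integer token IDs:

block = LSTM_Block(d_vocab=32, d_embedding=64, d_hidden=128, d_output=32, n_layers=2)
_ = block(torch.randint(0, 32, (8, 21)))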

class Encoder[source]

Encoder(d_latent) :: Module

Base encoder module. All encoders have a d_latent attribute that is referenced by other modules.

class LSTM_Encoder[source]

LSTM_Encoder(d_vocab, d_embedding, d_hidden, n_layers, d_latent, input_dropout=0.0, lstm_dropout=0.0) :: Encoder

LSTM_Encoder - LSTM-based encoder

Inputs:

  • d_vocab int: vocab size

  • d_embedding int: embedding dimension

  • d_hidden int: hidden dimension

  • n_layers int: number of LSTM layers

  • d_latent int: latent space dimension

  • input_dropout float: dropout probability on inputs

  • lstm_dropout float: dropout on LSTM layers

Generates the latent vector from the hidden states of the last LSTM layer.

d_latent = 128
l = LSTM_Encoder(32, 64, 128, 2, d_latent, input_dropout=0.5, lstm_dropout=0.5)
assert l(torch.randint(0, 31, (10, 15))).shape[-1] == d_latent

class MLP_Encoder[source]

MLP_Encoder(d_in, dims, d_latent, dropouts, bn=True) :: Encoder

MLP_Encoder - MLP-based encoder

Inputs:

  • d_in int: number of input dimensions

  • dims list[int]: list of layer sizes e.g. [1024, 512, 256]

  • d_latent int: latent vector dimension

  • dropouts list[float]: list of dropout probabilities e.g. [0.2, 0.2, 0.3]

  • bn bool: if True, includes batchnorm

m = MLP_Encoder(128, [64, 32, 16], d_latent, [0.1, 0.1, 0.1])
assert m(torch.randn(8,128)).shape[-1] == d_latent

class Conv_Encoder[source]

Conv_Encoder(d_vocab, d_embedding, d_latent, filters, kernel_sizes, strides, dropouts) :: Encoder

Conv_Encoder - 1D conv encoder

Inputs:

  • d_vocab int: vocab size

  • d_embedding int: embedding dimension

  • d_latent int: latent vector dimension

  • filters list[int]: filter sizes for conv layers e.g. [64, 128, 256]

  • kernel_sizes list[int]: kernel sizes for conv layers e.g. [5, 5, 5]

  • strides list[int]: strides for conv layers e.g. [2, 2, 2]

  • dropouts list[float]: list of dropout probabilities e.g. [0.2, 0.2, 0.3]

c = Conv_Encoder(32, 64, d_latent, [32, 16], [7,7], [2,2], [0.1, 0.1])
assert c(torch.randint(0,31, (10,15))).shape[-1] == d_latent

class ScaledEncoder[source]

ScaledEncoder(encoder, head=None, outrange=None) :: Module

ScaledEncoder - wrapper to scale outputs from encoder if desired

Inputs:

  • encoder nn.Module: encoder model

  • head Optional[nn.Module]: optional head model

  • outrange Optional[list[float]]: optional [low, high] range to scale outputs into
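
A minimal sketch wrapping the MLP_Encoder from above; the bounded output follows from the outrange scaling (as in the MLP example below):

enc = MLP_Encoder(128, [64, 32], 16, [0.1, 0.1])
scaled = ScaledEncoder(enc, outrange=[0, 1])
out = scaled(torch.randn(8, 128))
assert out.min() >= 0 and out.max() <= 1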

class MLP[source]

MLP(d_in, dims, d_out, drops, outrange=None, bn=True) :: ScaledEncoder

MLP - multi-layer perceptron

Inputs:

  • d_in int: number of input dimensions

  • dims list[int]: list of layer sizes e.g. [1024, 512, 256]

  • d_out int: number of output dimensions

  • drops list[float]: list of dropout probabilities e.g. [0.2, 0.2, 0.3]

  • outrange Optional[list[float]]: squashes the output to be between outrange[0] and outrange[1]

  • bn bool: if True, includes batchnorm

active_model = MLP(128, [64, 32], 1, [0.2, 0.2], outrange=[0,15])
p = active_model(torch.randn((32,128)))
assert p.min()>=0 and p.max()<=15