shredx.modules.transformer.SINDyLossTransformerEncoder

class shredx.modules.transformer.SINDyLossTransformerEncoder(d_model: int, n_heads: int, dim_feedforward: int, dropout: float, hidden_size: int, input_length: int, num_layers: int, dt: float, sindy_loss_threshold: float, activation: Module, bias: bool, layer_norm_eps: float, norm_first: bool, device: str = 'cpu')

Bases: SINDyLossMixin, TransformerEncoder

Transformer encoder with SINDy loss regularization.

Combines a standard transformer encoder with SINDy-based regularization that encourages the learned representations to follow sparse polynomial ODEs.
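To make the idea of "following a sparse polynomial ODE" concrete, here is a minimal sketch of a SINDy-style residual. It is not the shredx implementation: the helper names (`finite_difference`, `polynomial_library`, `sindy_residual`), the 1-D latent state, the forward-difference derivative, and the plain polynomial library are all assumptions made for illustration.

```python
# Illustrative only: these helpers are hypothetical and not part of shredx.

def finite_difference(z, dt):
    """Forward-difference estimate of dz/dt for a 1-D trajectory."""
    return [(z[i + 1] - z[i]) / dt for i in range(len(z) - 1)]


def polynomial_library(z_value, degree=2):
    """Candidate terms [1, z, z^2, ...] evaluated at one state value."""
    return [z_value ** p for p in range(degree + 1)]


def sindy_residual(z, coefficients, dt):
    """Mean squared mismatch between dz/dt and the library prediction."""
    dz = finite_difference(z, dt)
    degree = len(coefficients) - 1
    total = 0.0
    # zip pairs each derivative estimate with the state it was taken at.
    for z_value, dz_value in zip(z, dz):
        terms = polynomial_library(z_value, degree)
        predicted = sum(c * t for c, t in zip(coefficients, terms))
        total += (dz_value - predicted) ** 2
    return total / len(dz)


# Trajectory generated by forward Euler on dz/dt = -0.5 * z.
dt = 0.01
z = [1.0]
for _ in range(50):
    z.append(z[-1] + dt * (-0.5 * z[-1]))

matching = sindy_residual(z, [0.0, -0.5, 0.0], dt)    # correct sparse model
mismatched = sindy_residual(z, [0.0, 0.5, 0.0], dt)   # wrong sign
print(matching, mismatched)
```

A penalty of this kind is small when the trajectory is well described by a few polynomial terms and grows otherwise, which is the sense in which it regularizes the learned representations.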

Parameters:

d_model : int. Input dimension of the model.
n_heads : int. Number of attention heads.
dim_feedforward : int. Dimension of feedforward network.
dropout : float. Dropout probability.
hidden_size : int. Hidden dimension size.
input_length : int. Length of input sequences.
num_layers : int. Number of transformer encoder layers.
dt : float. Time step for SINDy derivatives.
sindy_loss_threshold : float. Threshold for coefficient sparsification.
activation : nn.Module. Activation function for feedforward layers.
bias : bool. Whether to use bias in linear layers.
layer_norm_eps : float. Epsilon for layer normalization.
norm_first : bool. Whether to apply layer norm before attention.
device : str, optional. Device to place the model on. Default is "cpu".
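As an illustration of what a sparsification threshold such as `sindy_loss_threshold` typically controls, the sketch below applies hard thresholding by magnitude, as in sequentially thresholded least squares. The exact rule shredx uses is not stated here, so the function name `sparsify` and the `>=` comparison are assumptions.

```python
# Illustrative only: `sparsify` is a hypothetical helper, not a shredx function.

def sparsify(coefficients, threshold):
    """Zero out every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coefficients]


# Small coefficients are dropped; the dominant terms are kept unchanged.
print(sparsify([0.8, -0.03, 0.002, -1.2], threshold=0.05))
```

A larger threshold removes more candidate terms and yields a sparser model, at the cost of possibly discarding weak but real dynamics.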

Methods

forward(src[, is_causal])

Forward pass through the transformer encoder with SINDy loss.

Notes

Class Methods:

forward(src, is_causal):

Forward pass through the transformer encoder with SINDy loss.
Parameters:
- src : Float[torch.Tensor, "batch seq_len d_model"]. Input tensor.
- is_causal : bool, optional. Whether to apply causal masking. Default is True.
Returns:
- tuple. The final output tensor of shape (batch_size, 1, seq_len, hidden_size) and a dictionary of auxiliary losses. The dictionary holds the SINDy loss under the key "sindy_loss".
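Since `forward` returns the auxiliary losses separately from the output, a training loop has to fold them into the objective itself. The sketch below shows one way to do that. The helper `combine_losses` and the weighting factor `sindy_weight` are hypothetical and not part of shredx; only the "sindy_loss" key comes from the description above.

```python
# Illustrative only: `combine_losses` and `sindy_weight` are hypothetical.

def combine_losses(task_loss, aux_losses, sindy_weight=1.0):
    """Add the weighted SINDy penalty to the main task loss.

    Works with plain floats and with any tensor type that supports
    `+` and `*`, so autograd can pass through it unchanged.
    """
    return task_loss + sindy_weight * aux_losses["sindy_loss"]


# In a training step the values would come from the encoder, for example:
#   output, aux_losses = encoder(src, is_causal=True)
# Here plain floats stand in for the loss values.
total = combine_losses(1.0, {"sindy_loss": 0.5}, sindy_weight=0.1)
print(total)
```

Keeping the weight explicit makes it easy to tune how strongly the SINDy term shapes the learned representation relative to the task loss.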