shredx.modules.transformer.TransformerEncoder
- class shredx.modules.transformer.TransformerEncoder(d_model: int, n_heads: int, num_layers: int, dim_feedforward: int, dropout: float, activation: Module, layer_norm_eps: float, norm_first: bool, bias: bool, input_length: int, hidden_size: int, device: str = 'cpu', **kwargs)
Bases: Module
Standard transformer encoder for sequence modeling.
Implements input embedding, positional encoding, and stacked encoder layers for sequence-to-sequence transformation.
- Parameters:
- d_model : int
Input dimension.
- n_heads : int
Number of attention heads.
- num_layers : int
Number of encoder layers.
- dim_feedforward : int
Dimension of the feedforward network.
- dropout : float
Dropout probability.
- activation : nn.Module
Activation function for the feedforward layers.
- layer_norm_eps : float
Epsilon for layer normalization.
- norm_first : bool
Whether to apply layer norm before attention.
- bias : bool
Whether to use bias in linear layers.
- input_length : int
Maximum input sequence length.
- hidden_size : int
Hidden dimension size.
- device : str, optional
Device to place the model on. Default is "cpu".
- **kwargs
Additional keyword arguments (ignored).
Methods
forward(src[, is_causal])
Forward pass through the transformer encoder.
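The snippet below is a minimal usage sketch based only on the signature documented above. The concrete argument values are placeholders, and the input layout of (batch, input_length, d_model), the use of an activation instance such as nn.GELU(), and the non-causal forward call are assumptions rather than documented behavior.

```python
import torch
import torch.nn as nn

from shredx.modules.transformer import TransformerEncoder

# Instantiate the encoder with the parameters listed above.
# All values here are illustrative placeholders.
encoder = TransformerEncoder(
    d_model=64,
    n_heads=4,
    num_layers=2,
    dim_feedforward=256,
    dropout=0.1,
    activation=nn.GELU(),   # passed as a module instance (assumed)
    layer_norm_eps=1e-5,
    norm_first=True,
    bias=True,
    input_length=128,       # maximum input sequence length
    hidden_size=64,
    device="cpu",
)

# Assumed input layout: (batch, sequence, d_model).
src = torch.randn(8, 128, 64)
out = encoder(src, is_causal=False)
```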