fusionlab.nn.components.PositionalEncoding

class fusionlab.nn.components.PositionalEncoding[source]

Bases: Layer, NNLearner

Injects positional information into an input tensor.

This layer adds a positional encoding to the input, allowing models like Transformers to understand the order of the sequence. It uses the standard sinusoidal encoding from the “Attention Is All You Need” paper [1].

The positional encoding \(PE\) is defined as:

\[PE_{(pos, 2i)} = \sin\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)\]
\[PE_{(pos, 2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)\]

where \(pos\) is the position in the sequence, \(i\) indexes the sine/cosine pair (so \(2i\) and \(2i+1\) are the even and odd feature dimensions), and \(d_{\text{model}}\) is the feature dimension.
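To make the definition concrete, here is a minimal pure-Python sketch that evaluates the two formulas directly. The helper name `sinusoidal_encoding` and its `base` argument are illustrative assumptions and are not part of fusionlab; the layer itself may well compute the matrix differently, for example with vectorized TensorFlow operations.

```python
import math


def sinusoidal_encoding(max_length, d_model, base=10000.0):
    """Return a max_length x d_model table of sinusoidal encodings.

    Hypothetical helper for illustration only; not the fusionlab API.
    """
    table = []
    for pos in range(max_length):
        row = []
        for dim in range(d_model):
            i = dim // 2  # dimensions 2i and 2i+1 share one frequency
            angle = pos / base ** (2 * i / d_model)
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if dim % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoidal_encoding(max_length=50, d_model=8)
print(len(pe), len(pe[0]))  # number of positions, number of feature dimensions
print(pe[0][:4])            # position 0 alternates sin(0) and cos(0)
```

Each row corresponds to one position and each column to one feature dimension, which matches the layout plotted in the Examples section.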

Parameters:
  • max_length (int, default 2048) – The maximum possible sequence length. The encoding matrix will be pre-calculated up to this length.

  • **kwargs – Standard Keras Layer keyword arguments.

Examples

>>> import tensorflow as tf
>>> from fusionlab.nn.components import PositionalEncoding
>>> batch_size = 4
>>> sequence_length = 50
>>> feature_dimension = 128
>>> # Create dummy input tensor
>>> input_tensor = tf.random.normal(
...    (batch_size, sequence_length, feature_dimension)
... )
>>> # Instantiate and apply the layer
>>> pos_encoding_layer = PositionalEncoding(max_length=5000)
>>> output_tensor = pos_encoding_layer(input_tensor)
>>> print("Input Tensor Shape:", input_tensor.shape)
>>> print("Output Tensor Shape:", output_tensor.shape)
>>> # The shape should be unchanged.
>>> assert input_tensor.shape == output_tensor.shape
>>> # You can visualize the encoding if you wish
>>> import matplotlib.pyplot as plt
>>> pe_matrix = pos_encoding_layer.positional_encoding[0, :, :].numpy()
>>> plt.figure(figsize=(10, 5))
>>> cax = plt.matshow(pe_matrix, fignum=1, aspect='auto', cmap='viridis')
>>> plt.gcf().colorbar(cax)
>>> plt.title("Sinusoidal Positional Encoding Matrix")
>>> plt.xlabel("Feature Dimension")
>>> plt.ylabel("Position in Sequence")
>>> plt.show()

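References

[1] Vaswani, A., et al. (2017). “Attention Is All You Need.” Advances in Neural Information Processing Systems 30.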

Methods

__init__([max_length])

add_loss(losses, **kwargs)

Add loss tensor(s), potentially dependent on layer inputs.

add_metric(value[, name])

Adds metric tensor to the layer.

add_update(updates)

Add update op(s), potentially dependent on layer inputs.

add_variable(*args, **kwargs)

Deprecated, do NOT use! Alias for add_weight.

add_weight([name, shape, dtype, ...])

Adds a new variable to the layer.

build(input_shape)

Pre-calculates the positional encoding matrix.

build_from_config(config)

Builds the layer's states with the supplied config dict.

call(inputs[, training])

Adds positional encoding to the input tensor.

compute_mask(inputs[, mask])

Computes an output mask tensor.

compute_output_shape(input_shape)

Computes the output shape of the layer.

compute_output_signature(input_signature)

Compute the output tensor signature of the layer based on the inputs.

count_params()

Count the total number of scalars composing the weights.

finalize_state()

Finalizes the layer's state after updating layer weights.

from_config(config)

Creates a layer from its config.

get_build_config()

Returns a dictionary with the layer's input shape.

get_config()

Returns the configuration of the layer.

get_input_at(node_index)

Retrieves the input tensor(s) of a layer at a given node.

get_input_mask_at(node_index)

Retrieves the input mask tensor(s) of a layer at a given node.

get_input_shape_at(node_index)

Retrieves the input shape(s) of a layer at a given node.

get_output_at(node_index)

Retrieves the output tensor(s) of a layer at a given node.

get_output_mask_at(node_index)

Retrieves the output mask tensor(s) of a layer at a given node.

get_output_shape_at(node_index)

Retrieves the output shape(s) of a layer at a given node.

get_params([deep])

Get the parameters for this learner.

get_weights()

Returns the current weights of the layer, as NumPy arrays.

help(**kwargs)

load(file_path[, format])

Load the learner's state from a specified file in the desired format.

load_own_variables(store)

Loads the state of the layer.

save([file_path, format, overwrite, ...])

Save the learner's state to a specified file in the desired format.

save_own_variables(store)

Saves the state of the layer.

set_params(**params)

Set the parameters of this learner.

set_weights(weights)

Sets the weights of the layer, from NumPy arrays.

summary()

Provide a summary of the learner's parameters.

with_name_scope(method)

Decorator to automatically enter the module name scope.

Attributes

activity_regularizer

Optional regularizer function for the output of this layer.

compute_dtype

The dtype of the layer's computations.

dtype

The dtype of the layer weights.

dtype_policy

The dtype policy associated with this layer.

dynamic

Whether the layer is dynamic (eager-only); set in the constructor.

inbound_nodes

Return Functional API nodes upstream of this layer.

input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape(s) of a layer.

input_spec

InputSpec instance(s) describing the input format for this layer.

losses

List of losses added using the add_loss() API.

metrics

List of metrics attached to the layer.

my_params

name

Name of the layer (string), set in the constructor.

name_scope

Returns a tf.name_scope instance for this class.

non_trainable_variables

Sequence of non-trainable variables owned by this module and its submodules.

non_trainable_weights

List of all non-trainable weights tracked by this layer.

outbound_nodes

Return Functional API nodes downstream of this layer.

output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape(s) of a layer.

stateful

submodules

Sequence of all sub-modules.

supports_masking

Whether this layer supports computing a mask using compute_mask.

trainable

trainable_variables

Sequence of trainable variables owned by this module and its submodules.

trainable_weights

List of all trainable weights tracked by this layer.

updates

variable_dtype

Alias of Layer.dtype, the dtype of the weights.

variables

Returns the list of all layer variables/weights.

weights

Returns the list of all layer variables/weights.

__init__(max_length=2048, **kwargs)[source]
Parameters:

max_length (int)

build(input_shape)[source]

Pre-calculates the positional encoding matrix.

Parameters:

input_shape (TensorShape)

call(inputs, training=False)[source]

Adds positional encoding to the input tensor.

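The training argument is accepted for compatibility with the Keras call signature but is not used.

Parameters:

  • inputs (tf.Tensor) – A 3D tensor of shape \((B, T, D)\), where \(B\) is the batch size, \(T\) is the sequence length, and \(D\) is the feature dimension.

  • training (bool, default False) – Accepted for Keras API compatibility; it has no effect on the output.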

Returns:

The input tensor with positional encodings added. Shape: \((B, T, D)\).

Return type:

tf.Tensor
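The split between build (pre-calculate the matrix) and call (add it to the input) can be sketched in plain Python as follows. This is an illustration under assumptions, not the layer's source: `build_table` and `add_encoding` are hypothetical names, and slicing the table to the first \(T\) rows is an assumed way of matching the input length.

```python
import math


def build_table(max_length, d_model):
    """Precompute the encoding table once (stand-in for what build() does)."""
    return [
        [
            math.sin(pos / 10000 ** (2 * (d // 2) / d_model)) if d % 2 == 0
            else math.cos(pos / 10000 ** (2 * (d // 2) / d_model))
            for d in range(d_model)
        ]
        for pos in range(max_length)
    ]


def add_encoding(inputs, table):
    """Add row t of `table` to time step t of every sequence in the batch."""
    return [
        [
            [x + p for x, p in zip(step, table[t])]
            for t, step in enumerate(sequence)
        ]
        for sequence in inputs
    ]


B, T, D = 2, 3, 4
table = build_table(max_length=10, d_model=D)  # computed once, reused per call
inputs = [[[0.0] * D for _ in range(T)] for _ in range(B)]
outputs = add_encoding(inputs, table)

# The (B, T, D) shape is unchanged.
print(len(outputs), len(outputs[0]), len(outputs[0][0]))
# With an all-zero input, every batch element equals the first T table rows.
print(outputs[0] == outputs[1] == table[:T])
```

In this sketch a sequence longer than the table raises an IndexError; how the actual layer handles inputs longer than max_length is not stated here.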

Notes

The positional encoding does not depend on the training mode. The sinusoidal PositionalEncoding layer performs a deterministic operation: it computes a fixed matrix of sine and cosine values from the position and feature-dimension indices and adds it to the input. The computation is identical during training and inference. Unlike layers such as Dropout or BatchNormalization, PositionalEncoding behaves the same in both modes.
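The contrast with a mode-dependent layer can be illustrated with a small pure-Python sketch. Both `encode` and `toy_dropout` are hypothetical stand-ins written for this example; they are not the fusionlab or Keras implementations.

```python
import math
import random


def positional_offset(pos, dim, d_model):
    """Sinusoidal value for one (position, dimension) pair."""
    angle = pos / 10000 ** (2 * (dim // 2) / d_model)
    return math.sin(angle) if dim % 2 == 0 else math.cos(angle)


def encode(sequence, training=False):
    """Add positional offsets; `training` is accepted but never read."""
    d_model = len(sequence[0])
    return [
        [x + positional_offset(pos, dim, d_model) for dim, x in enumerate(step)]
        for pos, step in enumerate(sequence)
    ]


def toy_dropout(sequence, rate=0.5, training=False):
    """Contrast case: a layer whose output does depend on `training`."""
    if not training:
        return [list(step) for step in sequence]
    keep = 1.0 - rate
    # Randomly zero values and rescale the survivors (inverted dropout).
    return [
        [x / keep if random.random() < keep else 0.0 for x in step]
        for step in sequence
    ]


sequence = [[1.0, 2.0, 3.0, 4.0] for _ in range(5)]

# The encoding is a pure function of its input, so both modes agree.
print(encode(sequence, training=True) == encode(sequence, training=False))

# Dropout passes the input through unchanged only when training is False.
print(toy_dropout(sequence, training=False) == sequence)
```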

get_config()[source]

Returns the configuration of the layer.

Return type:

dict
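As a rough illustration of how a configuration dict is typically used together with from_config, the sketch below round-trips a stand-in object. `EncodingConfigDemo` is hypothetical, and the assumption that the config carries a `max_length` key (alongside `name`) is not confirmed by this page.

```python
class EncodingConfigDemo:
    """Hypothetical stand-in that mimics the Keras config pattern.

    It is not fusionlab's PositionalEncoding and stores no tensors.
    """

    def __init__(self, max_length=2048, name="positional_encoding"):
        self.max_length = max_length
        self.name = name

    def get_config(self):
        # Return only plain Python values so the dict can be serialized.
        return {"name": self.name, "max_length": self.max_length}

    @classmethod
    def from_config(cls, config):
        # Rebuild an equivalent object from the saved constructor arguments.
        return cls(**config)


original = EncodingConfigDemo(max_length=5000)
config = original.get_config()
restored = EncodingConfigDemo.from_config(config)

print(config)
print(restored.max_length == original.max_length)
```

The round trip works because every constructor argument appears in the returned dict under the same name.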

help(**kwargs)
my_params = PositionalEncoding(max_length=2048)