fusionlab.nn.components.PositionalEncoding

class fusionlab.nn.components.PositionalEncoding(max_length=2048, **kwargs)

Bases: Layer, NNLearner

Injects positional information into an input tensor.

This layer adds a positional encoding to the input, allowing models like Transformers to understand the order of the sequence. It uses the standard sinusoidal encoding from the “Attention Is All You Need” paper [1].

The positional encoding \(PE\) is defined as:

\[PE_{(pos, 2i)} = \sin\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)\]

\[PE_{(pos, 2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)\]

where \(pos\) is the position in the sequence, \(i\) indexes the sine/cosine pair (so \(2i\) and \(2i+1\) are the even and odd feature dimensions), and \(d_{\text{model}}\) is the feature dimension.
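The two formulas can be checked by hand with a small pure-Python sketch. It is illustrative only and is not the fusionlab implementation; the function name `sinusoidal_encoding` is hypothetical, and the tiny sizes are chosen so the values are easy to verify.

```python
import math


def sinusoidal_encoding(max_length, d_model):
    """Return a max_length x d_model nested list of encoding values.

    Hypothetical helper for illustration; not part of fusionlab.
    """
    table = []
    for pos in range(max_length):
        row = []
        for k in range(d_model):
            i = k // 2  # pair index shared by dimensions 2i and 2i+1
            angle = pos / (10000 ** (2 * i / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoidal_encoding(max_length=3, d_model=4)
# Position 0 has angle 0 everywhere, so pe[0] is [0.0, 1.0, 0.0, 1.0].
```

With \(d_{\text{model}} = 4\), the second pair (\(i = 1\)) divides the position by \(10000^{2/4} = 100\), so `pe[1][2]` equals \(\sin(0.01)\): higher dimensions oscillate more slowly along the sequence.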

Parameters:

max_length (int, default 2048) – The maximum possible sequence length. The encoding matrix will be pre-calculated up to this length.
**kwargs – Standard Keras Layer keyword arguments.

Examples

>>> import tensorflow as tf
>>> from fusionlab.nn.components import PositionalEncoding
>>> batch_size = 4
>>> sequence_length = 50
>>> feature_dimension = 128

>>> # Create dummy input tensor
>>> input_tensor = tf.random.normal(
...    (batch_size, sequence_length, feature_dimension)
... )

>>> # Instantiate and apply the layer
>>> pos_encoding_layer = PositionalEncoding(max_length=5000)
>>> output_tensor = pos_encoding_layer(input_tensor)

>>> print("Input Tensor Shape:", input_tensor.shape)
>>> print("Output Tensor Shape:", output_tensor.shape)
>>> # The shape should be unchanged.
>>> assert input_tensor.shape == output_tensor.shape

>>> # You can visualize the encoding if you wish
>>> import matplotlib.pyplot as plt
>>> pe_matrix = pos_encoding_layer.positional_encoding[0, :, :].numpy()
>>> plt.figure(figsize=(10, 5))
>>> cax = plt.matshow(pe_matrix, fignum=1, aspect='auto', cmap='viridis')
>>> plt.gcf().colorbar(cax)
>>> plt.title("Sinusoidal Positional Encoding Matrix")
>>> plt.xlabel("Feature Dimension")
>>> plt.ylabel("Position in Sequence")
>>> plt.show()

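References

[1] Vaswani, A., et al. "Attention Is All You Need." Advances in Neural Information Processing Systems 30 (NeurIPS 2017).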

Methods

`__init__`([max_length])
`add_loss`(losses, **kwargs)	Add loss tensor(s), potentially dependent on layer inputs.
`add_metric`(value[, name])	Adds metric tensor to the layer.
`add_update`(updates)	Add update op(s), potentially dependent on layer inputs.
`add_variable`(*args, **kwargs)	Deprecated, do NOT use! Alias for add_weight.
`add_weight`([name, shape, dtype, ...])	Adds a new variable to the layer.
`build`(input_shape)	Pre-calculates the positional encoding matrix.
`build_from_config`(config)	Builds the layer's states with the supplied config dict.
`call`(inputs[, training])	Adds positional encoding to the input tensor.
`compute_mask`(inputs[, mask])	Computes an output mask tensor.
`compute_output_shape`(input_shape)	Computes the output shape of the layer.
`compute_output_signature`(input_signature)	Compute the output tensor signature of the layer based on the inputs.
`count_params`()	Count the total number of scalars composing the weights.
`finalize_state`()	Finalizes the layer's state after updating layer weights.
`from_config`(config)	Creates a layer from its config.
`get_build_config`()	Returns a dictionary with the layer's input shape.
`get_config`()	Returns the configuration of the layer.
`get_input_at`(node_index)	Retrieves the input tensor(s) of a layer at a given node.
`get_input_mask_at`(node_index)	Retrieves the input mask tensor(s) of a layer at a given node.
`get_input_shape_at`(node_index)	Retrieves the input shape(s) of a layer at a given node.
`get_output_at`(node_index)	Retrieves the output tensor(s) of a layer at a given node.
`get_output_mask_at`(node_index)	Retrieves the output mask tensor(s) of a layer at a given node.
`get_output_shape_at`(node_index)	Retrieves the output shape(s) of a layer at a given node.
`get_params`([deep])	Get the parameters for this learner.
`get_weights`()	Returns the current weights of the layer, as NumPy arrays.
`help`(**kwargs)
`load`(file_path[, format])	Load the learner's state from a specified file in the desired format.
`load_own_variables`(store)	Loads the state of the layer.
`save`([file_path, format, overwrite, ...])	Save the learner's state to a specified file in the desired format.
`save_own_variables`(store)	Saves the state of the layer.
`set_params`(**params)	Set the parameters of this learner.
`set_weights`(weights)	Sets the weights of the layer, from NumPy arrays.
`summary`()	Provide a summary of the learner's parameters.
`with_name_scope`(method)	Decorator to automatically enter the module name scope.

Attributes

`activity_regularizer`	Optional regularizer function for the output of this layer.
`compute_dtype`	The dtype of the layer's computations.
`dtype`	The dtype of the layer weights.
`dtype_policy`	The dtype policy associated with this layer.
`dynamic`	Whether the layer is dynamic (eager-only); set in the constructor.
`inbound_nodes`	Return Functional API nodes upstream of this layer.
`input`	Retrieves the input tensor(s) of a layer.
`input_mask`	Retrieves the input mask tensor(s) of a layer.
`input_shape`	Retrieves the input shape(s) of a layer.
`input_spec`	InputSpec instance(s) describing the input format for this layer.
`losses`	List of losses added using the add_loss() API.
`metrics`	List of metrics attached to the layer.
`my_params`
`name`	Name of the layer (string), set in the constructor.
`name_scope`	Returns a tf.name_scope instance for this class.
`non_trainable_variables`	Sequence of non-trainable variables owned by this module and its submodules.
`non_trainable_weights`	List of all non-trainable weights tracked by this layer.
`outbound_nodes`	Return Functional API nodes downstream of this layer.
`output`	Retrieves the output tensor(s) of a layer.
`output_mask`	Retrieves the output mask tensor(s) of a layer.
`output_shape`	Retrieves the output shape(s) of a layer.
`stateful`
`submodules`	Sequence of all sub-modules.
`supports_masking`	Whether this layer supports computing a mask using compute_mask.
`trainable`
`trainable_variables`	Sequence of trainable variables owned by this module and its submodules.
`trainable_weights`	List of all trainable weights tracked by this layer.
`updates`
`variable_dtype`	Alias of Layer.dtype, the dtype of the weights.
`variables`	Returns the list of all layer variables/weights.
`weights`	Returns the list of all layer variables/weights.

__init__(max_length=2048, **kwargs)

Parameters: max_length (int) – The maximum possible sequence length; the encoding matrix is pre-calculated up to this length.

build(input_shape)

Pre-calculates the positional encoding matrix.

Parameters: input_shape (TensorShape)

call(inputs, training=False)

Adds positional encoding to the input tensor.

The `training` argument is accepted but not used; it exists only for compatibility with the Keras layer API.

Parameters: inputs (tf.Tensor) – A 3D tensor of shape \((B, T, D)\), where \(B\) is the batch size, \(T\) is the sequence length, and \(D\) is the feature dimension.
Returns: The input tensor with positional encodings added. Shape: \((B, T, D)\).
Return type: tf.Tensor

Notes

Positional encoding does not depend on the training phase. The sinusoidal PositionalEncoding layer performs a deterministic operation: it computes a fixed matrix of sine and cosine values from position and feature dimension and adds it to the input. The computation is identical during training and inference. Unlike layers such as Dropout or BatchNormalization, PositionalEncoding behaves the same in both modes.
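The behavior described above can be sketched in NumPy: pre-compute a table once, then slice it to the input's sequence length and broadcast it over the batch. This is a minimal sketch under stated assumptions, not fusionlab's actual code; `build_table` and `add_positional_encoding` are hypothetical names, and the sketch assumes \(T\) does not exceed `max_length`.

```python
import numpy as np


def build_table(max_length, d_model):
    """Pre-compute a (max_length, d_model) sinusoidal table (hypothetical helper)."""
    pos = np.arange(max_length)[:, np.newaxis]   # shape (max_length, 1)
    k = np.arange(d_model)[np.newaxis, :]        # shape (1, d_model)
    # Dimensions 2i and 2i+1 share the same angle pos / 10000^(2i / d_model).
    angles = pos / np.power(10000.0, (2 * (k // 2)) / d_model)
    table = np.zeros((max_length, d_model))
    table[:, 0::2] = np.sin(angles[:, 0::2])     # even dimensions
    table[:, 1::2] = np.cos(angles[:, 1::2])     # odd dimensions
    return table


def add_positional_encoding(inputs, table, training=False):
    """Add the first T rows of the table to every batch element.

    `training` is accepted but ignored, mirroring the documented signature.
    """
    seq_len = inputs.shape[1]
    return inputs + table[np.newaxis, :seq_len, :]


table = build_table(max_length=16, d_model=8)
x = np.zeros((2, 5, 8))  # (B, T, D) with T <= max_length
out_train = add_positional_encoding(x, table, training=True)
out_infer = add_positional_encoding(x, table, training=False)
```

Both calls return the same `(2, 5, 8)` array, which illustrates why the `training` flag has no effect on a purely additive, parameter-free encoding.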

get_config()

Returns the configuration of the layer.

Return type: dict

help(**kwargs)

my_params = PositionalEncoding(max_length=2048)