fusionlab.nn.components.PositionalEncoding
- class fusionlab.nn.components.PositionalEncoding
Bases: Layer, NNLearner
Injects positional information into an input tensor.
This layer adds a positional encoding to the input, allowing models like Transformers to understand the order of the sequence. It uses the standard sinusoidal encoding from the “Attention Is All You Need” paper [1].
The positional encoding \(PE\) is defined as:
\[PE_{(pos, 2i)} = \sin\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)\]
\[PE_{(pos, 2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)\]
where \(pos\) is the position in the sequence, \(i\) is the dimension index, and \(d_{\text{model}}\) is the feature dimension.
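The two formulas can be evaluated directly. The following is a minimal, dependency-free sketch in plain Python, not the layer's actual implementation; the function name `sinusoidal_encoding` and its `base` parameter are hypothetical and exist only for this illustration.

```python
import math


def sinusoidal_encoding(max_length, d_model, base=10000.0):
    """Return a max_length x d_model table (nested lists) of encodings.

    Hypothetical helper for illustration; not part of fusionlab.
    """
    table = []
    for pos in range(max_length):
        row = []
        for k in range(d_model):
            i = k // 2  # columns 2i and 2i+1 share the same frequency
            angle = pos / base ** (2 * i / d_model)
            # Even columns use sine, odd columns use cosine.
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoidal_encoding(max_length=50, d_model=8)
print(pe[0])  # position 0: sin(0) = 0.0 in even columns, cos(0) = 1.0 in odd ones
```

Each row of the table corresponds to one position and each column to one feature dimension, which matches the axes of the plot in the Examples section.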
- Parameters:
max_length (int, default 2048) – The maximum possible sequence length. The encoding matrix is pre-calculated up to this length.
**kwargs – Standard Keras Layer keyword arguments.
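Pre-calculating the matrix up to max_length works because each row depends only on its own position, so the first T rows of a longer table equal the table built for length T. The sketch below checks this in plain Python; `encoding_row` is a hypothetical helper, and the idea that the layer slices its stored matrix to the input length is an assumption rather than documented behaviour.

```python
import math


def encoding_row(pos, d_model):
    """Sinusoidal encoding for one position (hypothetical helper)."""
    return [
        math.sin(pos / 10000 ** (2 * (k // 2) / d_model)) if k % 2 == 0
        else math.cos(pos / 10000 ** (2 * (k // 2) / d_model))
        for k in range(d_model)
    ]


max_length, seq_len, d_model = 2048, 50, 16

# Table computed once for the maximum length ...
full_table = [encoding_row(pos, d_model) for pos in range(max_length)]
# ... and a table computed only for the sequence actually seen.
short_table = [encoding_row(pos, d_model) for pos in range(seq_len)]

# The prefix of the long table is identical to the short table.
prefix_matches = full_table[:seq_len] == short_table
print(prefix_matches)
```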
Examples
>>> import tensorflow as tf
>>> from fusionlab.nn.components import PositionalEncoding
>>> batch_size = 4
>>> sequence_length = 50
>>> feature_dimension = 128

>>> # Create dummy input tensor
>>> input_tensor = tf.random.normal(
...     (batch_size, sequence_length, feature_dimension)
... )

>>> # Instantiate and apply the layer
>>> pos_encoding_layer = PositionalEncoding(max_length=5000)
>>> output_tensor = pos_encoding_layer(input_tensor)

>>> print("Input Tensor Shape:", input_tensor.shape)
>>> print("Output Tensor Shape:", output_tensor.shape)
>>> # The shape should be unchanged.
>>> assert input_tensor.shape == output_tensor.shape

>>> # You can visualize the encoding if you wish
>>> import matplotlib.pyplot as plt
>>> pe_matrix = pos_encoding_layer.positional_encoding[0, :, :].numpy()
>>> plt.figure(figsize=(10, 5))
>>> cax = plt.matshow(pe_matrix, fignum=1, aspect='auto', cmap='viridis')
>>> plt.gcf().colorbar(cax)
>>> plt.title("Sinusoidal Positional Encoding Matrix")
>>> plt.xlabel("Feature Dimension")
>>> plt.ylabel("Position in Sequence")
>>> plt.show()
References
Methods
__init__([max_length])
add_loss(losses, **kwargs) – Add loss tensor(s), potentially dependent on layer inputs.
add_metric(value[, name]) – Adds metric tensor to the layer.
add_update(updates) – Add update op(s), potentially dependent on layer inputs.
add_variable(*args, **kwargs) – Deprecated, do NOT use! Alias for add_weight.
add_weight([name, shape, dtype, ...]) – Adds a new variable to the layer.
build(input_shape) – Pre-calculates the positional encoding matrix.
build_from_config(config) – Builds the layer's states with the supplied config dict.
call(inputs[, training]) – Adds positional encoding to the input tensor.
compute_mask(inputs[, mask]) – Computes an output mask tensor.
compute_output_shape(input_shape) – Computes the output shape of the layer.
compute_output_signature(input_signature) – Compute the output tensor signature of the layer based on the inputs.
count_params() – Count the total number of scalars composing the weights.
finalize_state() – Finalizes the layer's state after updating layer weights.
from_config(config) – Creates a layer from its config.
get_build_config() – Returns a dictionary with the layer's input shape.
Returns the configuration of the layer.
get_input_at(node_index) – Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(node_index) – Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(node_index) – Retrieves the input shape(s) of a layer at a given node.
get_output_at(node_index) – Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(node_index) – Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(node_index) – Retrieves the output shape(s) of a layer at a given node.
get_params([deep]) – Get the parameters for this learner.
get_weights() – Returns the current weights of the layer, as NumPy arrays.
help(**kwargs)
load(file_path[, format]) – Load the learner's state from a specified file in the desired format.
load_own_variables(store) – Loads the state of the layer.
save([file_path, format, overwrite, ...]) – Save the learner's state to a specified file in the desired format.
save_own_variables(store) – Saves the state of the layer.
set_params(**params) – Set the parameters of this learner.
set_weights(weights) – Sets the weights of the layer, from NumPy arrays.
summary() – Provide a summary of the learner's parameters.
with_name_scope(method) – Decorator to automatically enter the module name scope.
Attributes
activity_regularizer – Optional regularizer function for the output of this layer.
compute_dtype – The dtype of the layer's computations.
dtype – The dtype of the layer weights.
dtype_policy – The dtype policy associated with this layer.
dynamic – Whether the layer is dynamic (eager-only); set in the constructor.
inbound_nodes – Return Functional API nodes upstream of this layer.
input – Retrieves the input tensor(s) of a layer.
input_mask – Retrieves the input mask tensor(s) of a layer.
input_shape – Retrieves the input shape(s) of a layer.
input_spec – InputSpec instance(s) describing the input format for this layer.
losses – List of losses added using the add_loss() API.
metrics – List of metrics attached to the layer.
name – Name of the layer (string), set in the constructor.
name_scope – Returns a tf.name_scope instance for this class.
non_trainable_variables – Sequence of non-trainable variables owned by this module and its submodules.
non_trainable_weights – List of all non-trainable weights tracked by this layer.
outbound_nodes – Return Functional API nodes downstream of this layer.
output – Retrieves the output tensor(s) of a layer.
output_mask – Retrieves the output mask tensor(s) of a layer.
output_shape – Retrieves the output shape(s) of a layer.
stateful
submodules – Sequence of all sub-modules.
supports_masking – Whether this layer supports computing a mask using compute_mask.
trainable
trainable_variables – Sequence of trainable variables owned by this module and its submodules.
trainable_weights – List of all trainable weights tracked by this layer.
updates
variable_dtype – Alias of Layer.dtype, the dtype of the weights.
variables – Returns the list of all layer variables/weights.
weights – Returns the list of all layer variables/weights.
- build(input_shape)
Pre-calculates the positional encoding matrix.
- Parameters:
input_shape (TensorShape)
- call(inputs, training=False)
Adds positional encoding to the input tensor.
The ‘training’ argument is accepted but not used. This ensures API compatibility with Keras.
- Parameters:
inputs (tf.Tensor) – A 3D tensor of shape \((B, T, D)\), where B is the batch size, T is the sequence length, and D is the feature dimension.
- Returns:
The input tensor with positional encodings added. Shape: \((B, T, D)\).
- Return type:
tf.Tensor
Notes
The positional encoding does not depend on training. The sinusoidal PositionalEncoding layer performs a deterministic mathematical operation: it calculates a fixed matrix of sine and cosine values based on position and feature dimension and adds it to the input. This calculation is the same whether you are training the model or running inference. Unlike layers such as Dropout or BatchNormalization, PositionalEncoding behaves identically in both modes.
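The behaviour described in these notes can be illustrated with a small plain-Python stand-in. This is only a sketch: the class `SinusoidalAddSketch` is hypothetical, works on nested lists instead of tensors, and is not how fusionlab implements the layer. It adds the first T rows of a fixed table to every sample in the batch and ignores the training flag.

```python
import math


class SinusoidalAddSketch:
    """Plain-Python stand-in for a sinusoidal positional-encoding layer.

    Hypothetical class for illustration; not part of fusionlab.
    """

    def __init__(self, max_length, d_model):
        # Pre-compute the fixed table once; nothing here is learned.
        self.table = [
            [
                math.sin(pos / 10000 ** (2 * (k // 2) / d_model)) if k % 2 == 0
                else math.cos(pos / 10000 ** (2 * (k // 2) / d_model))
                for k in range(d_model)
            ]
            for pos in range(max_length)
        ]

    def call(self, inputs, training=False):
        # `training` is accepted for interface compatibility and ignored.
        # Row t of the table is added to time step t of every sample.
        return [
            [
                [x + p for x, p in zip(step, self.table[t])]
                for t, step in enumerate(sample)
            ]
            for sample in inputs
        ]


layer = SinusoidalAddSketch(max_length=16, d_model=4)
batch = [[[0.5] * 4 for _ in range(3)] for _ in range(2)]  # shape (B=2, T=3, D=4)

out_train = layer.call(batch, training=True)
out_infer = layer.call(batch, training=False)
print(out_train == out_infer)  # both modes give the same result
```

The output keeps the \((B, T, D)\) shape of the input, and both samples in the batch receive the same offsets because the table depends only on position.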
- help(**kwargs)
- my_params = PositionalEncoding(max_length=2048)