fusionlab.nn.components.CrossAttention
- class fusionlab.nn.components.CrossAttention
Bases: Layer, NNLearner
CrossAttention that attends source1 (query) to source2 (key/value) with optional masks.
- attention_mask : Tensor, optional
Bool / 0‑1 mask broadcastable to (B, Tq, Tv). Passed directly to Keras MultiHeadAttention.
- query_mask, value_mask : Tensor, optional
1D/2D masks (B, Tq) or (B, Tv). If provided and attention_mask is None, they are combined to form (B, Tq, Tv).
- use_causal_mask : bool
Forwarded to MHA. Default False.
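The mask combination described above can be sketched in NumPy. This is only an illustration: it assumes the two padding masks are broadcast and joined with a logical AND (the convention Keras MultiHeadAttention follows for its own masks), and the helper name `combine_masks` is hypothetical, not part of fusionlab.

```python
import numpy as np

def combine_masks(query_mask, value_mask, use_causal_mask=False):
    """Hypothetical helper: build a (B, Tq, Tv) boolean attention mask."""
    q = np.asarray(query_mask, dtype=bool)[:, :, None]   # (B, Tq, 1)
    v = np.asarray(value_mask, dtype=bool)[:, None, :]   # (B, 1, Tv)
    mask = q & v                                          # broadcasts to (B, Tq, Tv)
    if use_causal_mask:
        tq, tv = mask.shape[1], mask.shape[2]
        # Lower-triangular mask: query step i may only see value steps <= i.
        causal = np.tril(np.ones((tq, tv), dtype=bool))
        mask = mask & causal[None, :, :]
    return mask

query_mask = np.array([[1, 1, 0]])      # B=1, Tq=3, last query step is padding
value_mask = np.array([[1, 1, 1, 0]])   # B=1, Tv=4, last value step is padding
mask = combine_masks(query_mask, value_mask)
print(mask.shape)  # (1, 3, 4)
```

A position is attended only when both its query step and its value step are valid, so the padded query row and the padded value column come out all False.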
Methods
__init__(units, num_heads)
add_loss(loss): Can be called inside of the call() method to add a scalar loss.
add_metric(*args, **kwargs)
add_variable(shape, initializer[, dtype, ...]): Add a weight variable to the layer.
add_weight([shape, initializer, dtype, ...]): Add a weight variable to the layer.
build(input_shape)
build_from_config(config): Builds the layer's states with the supplied config dict.
call(inputs[, training, attention_mask, ...]): Forward pass of CrossAttention.
compute_mask(inputs, previous_mask)
compute_output_shape(*args, **kwargs)
compute_output_spec(*args, **kwargs)
count_params(): Count the total number of scalars composing the weights.
from_config(config): Creates an operation from its config.
get_build_config(): Returns a dictionary with the layer's input shape.
get_config(): Returns the config of the object.
get_params([deep]): Get the parameters for this learner.
get_weights(): Return the values of layer.weights as a list of NumPy arrays.
help(**kwargs)
load(file_path[, format]): Load the learner's state from a specified file in the desired format.
load_own_variables(store): Loads the state of the layer.
quantize(mode[, type_check, config])
quantized_build(input_shape, mode)
quantized_call(*args, **kwargs)
rematerialized_call(layer_call, *args, **kwargs): Enable rematerialization dynamically for layer's call method.
save([file_path, format, overwrite, ...]): Save the learner's state to a specified file in the desired format.
save_own_variables(store): Saves the state of the layer.
set_params(**params): Set the parameters of this learner.
set_weights(weights): Sets the values of layer.weights from a list of NumPy arrays.
stateless_call(trainable_variables, ...[, ...]): Call the layer without any side effects.
summary(): Provide a summary of the learner's parameters.
symbolic_call(*args, **kwargs)
Attributes
compute_dtype: The dtype of the computations performed by the layer.
dtype: Alias of layer.variable_dtype.
dtype_policy
input: Retrieves the input tensor(s) of a symbolic operation.
input_dtype: The dtype layer inputs should be converted to.
input_spec
losses: List of scalar losses from add_loss, regularizers and sublayers.
metrics: List of all metrics.
metrics_variables: List of all metric variables.
non_trainable_variables: List of all non-trainable layer state.
non_trainable_weights: List of all non-trainable weight variables of the layer.
output: Retrieves the output tensor(s) of a layer.
path: The path of the layer.
quantization_mode: The quantization mode of this layer, None if not quantized.
supports_masking: Whether this layer supports computing a mask using compute_mask.
trainable: Settable boolean, whether this layer should be trainable or not.
trainable_variables: List of all trainable layer state.
trainable_weights: List of all trainable weight variables of the layer.
variable_dtype: The dtype of the state (weights) of the layer.
variables: List of all layer state, including random seeds.
weights: List of all weight variables of the layer.
- call(inputs, training=False, *, attention_mask=None, query_mask=None, value_mask=None, use_causal_mask=False, **kwargs)
Forward pass of CrossAttention.
- Parameters:
inputs (list of tf.Tensor) – A list [source1, source2], each of shape (batch_size, time_steps, features).
training (bool, optional) – Indicates if the layer is in training mode (for dropout, if any). Defaults to False.
attention_mask (Tensor, optional) – Bool / 0‑1 mask broadcastable to (B, Tq, Tv). Passed directly to Keras MultiHeadAttention.
query_mask (Tensor, optional) – 1D/2D mask for the query, shape (B, Tq). If provided and attention_mask is None, the query and value masks are combined to form (B, Tq, Tv).
value_mask (Tensor, optional) – 1D/2D mask for the key/value, shape (B, Tv). If provided and attention_mask is None, the query and value masks are combined to form (B, Tq, Tv).
use_causal_mask (bool) – Forwarded to MHA. Default False.
- Returns:
A tensor of shape (batch_size, time_steps, units) representing cross-attended features, where time_steps follows source1 (the query).
- Return type:
tf.Tensor
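To make the shapes in this forward pass concrete, here is a minimal single-head sketch in NumPy. It is not the fusionlab implementation: it assumes plain scaled dot-product attention with no learned projections, no multiple heads and no dropout, so the last axis stays at the input feature size instead of being projected to units. The function name `cross_attention` is hypothetical.

```python
import numpy as np

def cross_attention(source1, source2, attention_mask=None):
    """Single-head sketch: source1 supplies queries, source2 keys and values."""
    d = source1.shape[-1]
    # Similarity of every query step with every key step: (B, Tq, Tv).
    scores = source1 @ source2.transpose(0, 2, 1) / np.sqrt(d)
    if attention_mask is not None:
        # Masked-out positions get a very negative score, so softmax gives ~0.
        scores = np.where(attention_mask, scores, -1e9)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ source2, weights                      # (B, Tq, F), (B, Tq, Tv)

rng = np.random.default_rng(0)
source1 = rng.normal(size=(2, 5, 8))   # B=2, Tq=5, features=8
source2 = rng.normal(size=(2, 7, 8))   # B=2, Tv=7, features=8
mask = np.ones((2, 5, 7), dtype=bool)
mask[:, :, -2:] = False                # hide the last two value steps
out, weights = cross_attention(source1, source2, attention_mask=mask)
print(out.shape)  # (2, 5, 8)
```

The output keeps the time axis of the query (source1), while the time axis of source2 is summed out by the attention weights.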
- get_config()
Returns the config of the object.
An object config is a Python dictionary (serializable) containing the information needed to re-instantiate it.
- classmethod from_config(config)
Creates an operation from its config.
This method is the reverse of get_config, capable of instantiating the same operation from the config dictionary.
Note: If you override this method, you might receive a serialized dtype config, which is a dict. You can deserialize it as follows:
```python
from keras import dtype_policies

if "dtype" in config and isinstance(config["dtype"], dict):
    policy = dtype_policies.deserialize(config["dtype"])
```
- Parameters:
config – A Python dictionary, typically the output of get_config.
- Returns:
An operation instance.
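The get_config / from_config round trip described above can be illustrated without Keras. The class below is a hypothetical stand-in, not CrossAttention itself; it only assumes that the constructor arguments (units, num_heads) are what the config needs to carry.

```python
class TinyLayer:
    """Hypothetical stand-in for a configurable layer (not a Keras class)."""

    def __init__(self, units, num_heads):
        self.units = units
        self.num_heads = num_heads

    def get_config(self):
        # Serializable dict with everything needed to rebuild the object.
        return {"units": self.units, "num_heads": self.num_heads}

    @classmethod
    def from_config(cls, config):
        # Reverse of get_config: pass the stored values back to __init__.
        return cls(**config)

layer = TinyLayer(units=32, num_heads=4)
clone = TinyLayer.from_config(layer.get_config())
print(clone.get_config())  # {'units': 32, 'num_heads': 4}
```

The clone is a new object with the same hyperparameters; weights are not part of the config and would have to be transferred separately.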
- help(**kwargs)
- my_params = CrossAttention(units, num_heads)