fusionlab.nn.models.SuperXTFT

class fusionlab.nn.models.SuperXTFT

Bases: XTFT

An extension of XTFT that injects Variable Selection Networks (VSNs) and a Gate→Add&Norm→GRN refinement pipeline on attention and decoder outputs. It inherits all shared logic from BaseExtreme (validation, losses, hooks) and the baseline flow from XTFT, then overrides only the parts that differ.

Key Additions

VariableSelectionNetwork for:
- static features (no time axis),
- dynamic / historical features,
- future-known covariates.
These learn soft weights to highlight the most informative variables at each step.

GRN refinement blocks applied after:
- hierarchical attention output,
- cross attention output,
- memory-augmented attention output,
- decoder output (via the Gate→Add&Norm→GRN pipeline).

Optional feature-based anomaly scoring (when anomaly_detection_strategy == 'feature_based'): a dedicated attention → projection → scorer head produces per-timestep anomaly scores that feed the anomaly loss.
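
The soft weighting that a variable selection step performs can be illustrated without any deep-learning framework. The sketch below is not the library's VariableSelectionNetwork: the function names, the toy embeddings, and the hand-picked scores are all assumptions made for illustration. In a trained network the scores would come from learned layers rather than being supplied by hand.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_variables(embeddings, scores):
    """Blend per-variable embeddings using softmax weights.

    embeddings: list of equal-length vectors, one per input variable.
    scores: one unnormalized relevance score per variable (hypothetical
            here; a real VSN would compute them from the inputs).
    """
    weights = softmax(scores)
    dim = len(embeddings[0])
    blended = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]
    return weights, blended

# Three toy variables, each embedded in two dimensions.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.0, 0.0]  # illustrative scores favoring the first variable

weights, blended = select_variables(embeddings, scores)
```

The weights sum to one, so the blended vector is a convex combination of the variable embeddings; a variable with a higher score contributes proportionally more to the result.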

Design Rationale

VSNs reduce noise from high-dimensional inputs by learning which signals matter at the current step. The extra GRNs stabilize attention outputs and add representational depth without a large increase in parameter count. Keeping these pieces modular lets you toggle them in model variants without rewriting boilerplate.
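
To make the Gate→Add&Norm→GRN ordering concrete, here is a minimal pure-Python sketch on a single three-element vector. It is a simplification and not the library's layers: the helper names, the identity weights, and the fixed gate logits are assumptions chosen to keep the arithmetic easy to follow, and the layer normalization has no learned scale or shift. A real gate would compute its logits with a learned dense layer.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def elu(x):
    return x if x > 0 else math.exp(x) - 1.0

def layer_norm(vec, eps=1e-6):
    """Normalize a vector to zero mean and (nearly) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]

def dense(vec, weights, bias):
    """Affine map: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def gate(vec, gate_logits):
    """Element-wise sigmoid gate (logits are fixed here, learned in practice)."""
    return [sigmoid(g) * v for g, v in zip(gate_logits, vec)]

def grn(vec, params):
    """Simplified gated residual network: dense, ELU, dense, gate, add, norm."""
    hidden = [elu(h) for h in dense(vec, params["w1"], params["b1"])]
    projected = dense(hidden, params["w2"], params["b2"])
    gated = gate(projected, params["grn_gate_logits"])
    return layer_norm([v + g for v, g in zip(vec, gated)])

def refine(block_output, skip, params):
    """Gate the block output, add the skip path and normalize, then apply a GRN."""
    gated = gate(block_output, params["gate_logits"])
    normed = layer_norm([s + g for s, g in zip(skip, gated)])
    return grn(normed, params)

# Toy parameters: identity weights and zero logits (every gate equals 0.5).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zeros = [0.0, 0.0, 0.0]
params = {"gate_logits": zeros, "grn_gate_logits": zeros,
          "w1": identity, "b1": zeros, "w2": identity, "b2": zeros}

skip = [0.5, -1.0, 2.0]          # hypothetical input to the block
block_output = [1.0, 0.2, -0.3]  # hypothetical attention or decoder output
refined = refine(block_output, skip, params)
```

Because the last operation is a normalization, the refined vector stays on a fixed scale however large the block output is, which is one way to read the stabilizing effect described above.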

Hook Overview

This class overrides:

_build_components — builds VSNs, GRNs, and (optionally) anomaly-attention layers.
_encode_inputs — runs VSNs before the usual embedding / normalization path.
_temporal_backbone — inserts GRNs after each attention.
_aggregate_decode — applies the Gate→Add&Norm→GRN pipeline to decoder outputs.
_maybe_compute_anomaly_scores — computes scores for the feature-based strategy.

Everything else (compile logic, losses, serialization) is taken care of by the base classes.
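
The hook arrangement described above is an instance of the template-method pattern: the base class owns the overall flow and the subclass replaces individual steps. The sketch below uses hypothetical stand-in classes (BaselineModel and RefinedModel are not part of fusionlab) and records a trace instead of doing real computation, so only the call order is shown.

```python
class BaselineModel:
    """Hypothetical stand-in for a base class that defines the flow."""

    def __init__(self):
        self.trace = []
        self._build_components()

    def _build_components(self):
        self.trace.append("base: build components")

    def _encode_inputs(self, inputs):
        self.trace.append("base: embed inputs")
        return inputs

    def _temporal_backbone(self, encoded):
        self.trace.append("base: attention stack")
        return encoded

    def call(self, inputs):
        # The flow is written once here; subclasses do not re-implement it.
        return self._temporal_backbone(self._encode_inputs(inputs))


class RefinedModel(BaselineModel):
    """Overrides only the hooks whose behavior differs."""

    def _encode_inputs(self, inputs):
        # Extra step before the inherited embedding path.
        self.trace.append("refined: variable selection")
        return super()._encode_inputs(inputs)

    def _temporal_backbone(self, encoded):
        # Extra step after the inherited attention stack.
        out = super()._temporal_backbone(encoded)
        self.trace.append("refined: GRN refinement")
        return out


model = RefinedModel()
result = model.call([1.0, 2.0])
```

After the call, model.trace lists the steps in execution order, with the subclass additions wrapped around the inherited ones and the build step untouched.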

Examples

Instantiate with VSNs and GRN refinements active by default:

from fusionlab.nn.models import SuperXTFT

model = SuperXTFT(
    static_input_dim=10,
    dynamic_input_dim=32,
    future_input_dim=8,
    forecast_horizon=7,
    quantiles=[0.1, 0.5, 0.9],
    multi_scale_agg='auto',
    anomaly_detection_strategy='feature_based',
)

Then compile and fit as usual:

# X_static, X_dynamic, X_future, and y are training arrays prepared beforehand (not shown).
model.compile(optimizer='adam')
model.fit([X_static, X_dynamic, X_future], y, epochs=20)

Notes

Set the logger to DEBUG to see shapes and flow decisions.
If you do not need feature-based anomalies, omit that strategy to skip building the extra attention head and save memory.
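
For the DEBUG note above, one possible setup with the standard logging module is sketched below. The logger name "fusionlab" is an assumption, since this page does not state which logger the package uses; if the package configures logging itself or uses a different name, adjust accordingly.

```python
import logging

# Send log records to the console at DEBUG level and above.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(name)s %(levelname)s %(message)s",
)

# Assumed logger name; replace it if the package logs under another one.
logger = logging.getLogger("fusionlab")
logger.setLevel(logging.DEBUG)
```

Run this before constructing the model so that messages emitted during component building are captured as well.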

See also

XTFT: Baseline DRY implementation without VSN/extra GRNs.
BaseExtreme: Parent class providing shared plumbing, hooks, and logging.

__init__(*, static_input_dim, dynamic_input_dim, future_input_dim, embed_dim=32, forecast_horizon=1, quantiles=None, max_window_size=10, memory_size=100, num_heads=4, dropout_rate=0.1, output_dim=1, attention_units=32, hidden_units=64, lstm_units=64, scales=None, multi_scale_agg='auto', activation='relu', use_residuals=True, use_batch_norm=False, final_agg='last', anomaly_config=None, anomaly_detection_strategy=None, anomaly_loss_weight=1.0, architecture_config=None, fusion_mode=None, **kw)

Parameters:

static_input_dim (int)
dynamic_input_dim (int)
future_input_dim (int)
embed_dim (int)
forecast_horizon (int)
quantiles (str | List[float] | None)
max_window_size (int)
memory_size (int)
num_heads (int)
dropout_rate (float)
output_dim (int)
attention_units (int)
hidden_units (int)
lstm_units (int)
scales (str | List[int] | None)
multi_scale_agg (str | None)
activation (str | callable)
use_residuals (bool)
use_batch_norm (bool)
final_agg (str)
anomaly_config (Dict[str, Any] | None)
anomaly_detection_strategy (str | None)
anomaly_loss_weight (float)
architecture_config (Dict | None)
fusion_mode (str | None)
kw (Any)

Return type:

None

Methods

`__init__`(*, static_input_dim, ...[, ...])
`add_loss`(loss)	Can be called inside of the call() method to add a scalar loss.
`add_metric`(*args, **kwargs)
`add_variable`(shape, initializer[, dtype, ...])	Add a weight variable to the layer.
`add_weight`([shape, initializer, dtype, ...])	Add a weight variable to the layer.
`build`(input_shape)
`build_from_config`(config)	Builds the layer's states with the supplied config dict.
`call`(inputs[, training])
`compile`(optimizer[, loss])	Configures the model for training.
`compile_from_config`(config)	Compiles the model with the information given in config.
`compiled_loss`(y, y_pred[, sample_weight, ...])
`compute_loss`([x, y, y_pred, sample_weight, ...])	Compute the total loss, validate it, and return it.
`compute_mask`(inputs, previous_mask)
`compute_metrics`(x, y, y_pred[, sample_weight])	Update metric states and collect all metrics to be returned.
`compute_output_shape`(*args, **kwargs)
`compute_output_spec`(*args, **kwargs)
`count_params`()	Count the total number of scalars composing the weights.
`evaluate`([x, y, batch_size, verbose, ...])	Returns the loss value & metrics values for the model in test mode.
`export`(filepath[, format, verbose, ...])	Export the model as an artifact for inference.
`fit`([x, y, batch_size, epochs, verbose, ...])	Trains the model for a fixed number of epochs (dataset iterations).
`from_config`(config)	Creates an operation from its config.
`get_build_config`()	Returns a dictionary with the layer's input shape.
`get_compile_config`()	Returns a serialized config with information for compiling the model.
`get_config`()	Returns the config of the object.
`get_layer`([name, index])	Retrieves a layer based on either its name (unique) or index.
`get_metrics_result`()	Returns the model's metrics values as a dict.
`get_params`([deep])	Get the parameters for this learner.
`get_state_tree`([value_format])	Retrieves tree-like structure of model variables.
`get_weights`()	Return the values of layer.weights as a list of NumPy arrays.
`help`(**kwargs)
`load`(file_path[, format])	Load the learner's state from a specified file in the desired format.
`load_own_variables`(store)	Loads the state of the layer.
`load_weights`(filepath[, skip_mismatch])	Load the weights from a single file or sharded files.
`loss`(y, y_pred[, sample_weight])
`make_predict_function`([force])
`make_test_function`([force])
`make_train_function`([force])
`predict`(x[, batch_size, verbose, steps, ...])	Generates output predictions for the input samples.
`predict_on_batch`(x)	Returns predictions for a single batch of samples.
`predict_step`(data)
`quantize`(mode[, config])	Quantize the weights of the model.
`quantized_build`(input_shape, mode)
`quantized_call`(*args, **kwargs)
`reconfigure`(architecture_config)	Creates a new model instance with a modified architecture.
`rematerialized_call`(layer_call, *args, **kwargs)	Enable rematerialization dynamically for layer's call method.
`reset_metrics`()
`save`(filepath[, overwrite, zipped])	Saves a model as a .keras file.
`save_own_variables`(store)	Saves the state of the layer.
`save_weights`(filepath[, overwrite, ...])	Saves all weights to a single file or sharded files.
`set_params`(**params)	Set the parameters of this learner.
`set_state_tree`(state_tree)	Assigns values to variables of the model.
`set_weights`(weights)	Sets the values of layer.weights from a list of NumPy arrays.
`stateless_call`(trainable_variables, ...[, ...])	Call the layer without any side effects.
`stateless_compute_loss`(trainable_variables, ...)
`summary`([line_length, positions, print_fn, ...])	Prints a string summary of the network.
`symbolic_call`(*args, **kwargs)
`test_on_batch`(x[, y, sample_weight, return_dict])	Test the model on a single batch of samples.
`test_step`(data)
`to_json`(**kwargs)	Returns a JSON string containing the network configuration.
`train_on_batch`(x[, y, sample_weight, ...])	Runs a single gradient update on a single batch of data.
`train_step`(data)

Attributes

`compiled_metrics`
`compute_dtype`	The dtype of the computations performed by the layer.
`distribute_reduction_method`
`distribute_strategy`
`dtype`	Alias of layer.variable_dtype.
`dtype_policy`
`input`	Retrieves the input tensor(s) of a symbolic operation.
`input_dtype`	The dtype layer inputs should be converted to.
`input_spec`
`jit_compile`
`layers`
`losses`	List of scalar losses from add_loss, regularizers and sublayers.
`metrics`	List of all metrics.
`metrics_names`
`metrics_variables`	List of all metric variables.
`my_params`
`non_trainable_variables`	List of all non-trainable layer state.
`non_trainable_weights`	List of all non-trainable weight variables of the layer.
`output`	Retrieves the output tensor(s) of a layer.
`path`	The path of the layer.
`quantization_mode`	The quantization mode of this layer, None if not quantized.
`run_eagerly`
`supports_masking`	Whether this layer supports computing a mask using compute_mask.
`trainable`	Settable boolean, whether this layer should be trainable or not.
`trainable_variables`	List of all trainable layer state.
`trainable_weights`	List of all trainable weight variables of the layer.
`variable_dtype`	The dtype of the state (weights) of the layer.
`variables`	List of all layer state, including random seeds.
`weights`	List of all weight variables of the layer.

help(**kwargs)

my_params = SuperXTFT(static_input_dim, dynamic_input_dim, future_input_dim, embed_dim=32, forecast_horizon=1, quantiles=None, max_window_size=10, memory_size=100, num_heads=4, dropout_rate=0.1, output_dim=1, attention_units=32, hidden_units=64, lstm_units=64, scales=None, multi_scale_agg='auto', activation='relu', use_residuals=True, use_batch_norm=False, final_agg='last', anomaly_config=None, anomaly_detection_strategy=None, anomaly_loss_weight=1.0, architecture_config=None, fusion_mode=None)