fusionlab.nn.transformers.TemporalFusionTransformer

class fusionlab.nn.transformers.TemporalFusionTransformer

Bases: Model, NNLearner

TemporalFusionTransformer model implementation for multi-horizon forecasting, with optional static, past, and future inputs.

This class extends the Keras Model and integrates with the gofast NNLearner interface. It supports dynamic (past) inputs, optional static inputs, and optional future inputs (enabled by setting future_input_dim). With future covariates, the TemporalFusionTransformer can account for features that are known ahead of time (e.g., events or planned discount rates) in its predictions.

Parameters:
  • dynamic_input_dim (int) – Dimensionality of the dynamic (past) inputs. This is mandatory for the TFT model.

  • static_input_dim (int, optional) – Dimensionality of static inputs. If not None, the call method will expect static inputs.

  • future_input_dim (int, optional) – Dimensionality of future (known) inputs. If not None, the call method will expect future inputs to handle exogenous covariates known in the future (e.g., events, planned promotions, etc.).

  • hidden_units (int, default 32) – Number of hidden units for the layers that do not have a distinct specification (e.g., GRNs, variable selection networks).

  • num_heads (int, default 4) – Number of attention heads in the multi-head attention layer.

  • dropout_rate (float, default 0.1) – Dropout rate for various layers (GRNs, attention, etc.).

  • forecast_horizon (int, default 1) – Number of timesteps to forecast into the future.

  • quantiles (list of float, optional) – List of quantiles for probabilistic forecasting. If None, a single deterministic output is produced.

  • activation (str, default 'elu') – Activation function. Must be one of {'elu', 'relu', 'tanh', 'sigmoid', 'linear', 'gelu'}.

  • use_batch_norm (bool, default False) – Whether to apply batch normalization in various sub-layers.

  • num_lstm_layers (int, default 1) – Number of LSTM layers in the encoder.

  • lstm_units (list of int or None, default None) – If provided, each index corresponds to the number of LSTM units for that layer. If None, uses hidden_units for each layer.
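The fallback rule for lstm_units and the allowed values for activation can be sketched in plain Python. This is only an illustration of the rules stated in the parameter list: resolve_lstm_units and check_activation are hypothetical helpers, not part of fusionlab, and the error raised on a length mismatch is an assumption about reasonable behavior rather than documented library behavior.

```python
from typing import List, Optional

# Allowed activation names, copied from the `activation` parameter description.
ALLOWED_ACTIVATIONS = {"elu", "relu", "tanh", "sigmoid", "linear", "gelu"}


def resolve_lstm_units(
    num_lstm_layers: int,
    hidden_units: int = 32,
    lstm_units: Optional[List[int]] = None,
) -> List[int]:
    """Hypothetical helper: return one unit count per LSTM layer."""
    if lstm_units is None:
        # Documented fallback: every layer uses `hidden_units`.
        return [hidden_units] * num_lstm_layers
    if len(lstm_units) != num_lstm_layers:
        # Assumption: a length mismatch is treated as an error in this sketch;
        # the library itself may handle it differently.
        raise ValueError(
            f"expected {num_lstm_layers} entries in lstm_units, "
            f"got {len(lstm_units)}"
        )
    return list(lstm_units)


def check_activation(name: str) -> str:
    """Hypothetical helper: reject names outside the documented set."""
    if name not in ALLOWED_ACTIVATIONS:
        raise ValueError(f"unsupported activation: {name!r}")
    return name
```

With num_lstm_layers=2 and lstm_units=None, the sketch yields hidden_units for both layers; passing lstm_units=[64, 32] keeps the explicit per-layer sizes.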

Examples

>>> from fusionlab.nn._tensor_validation import validate_tft_inputs
>>> from fusionlab.nn.transformers import TemporalFusionTransformer
>>> model = TemporalFusionTransformer(
...     dynamic_input_dim=10,
...     static_input_dim=5,
...     future_input_dim=8,
...     hidden_units=32,
...     num_heads=4,
...     dropout_rate=0.1,
...     forecast_horizon=7,
...     quantiles=[0.1, 0.5, 0.9],
...     activation='elu',
...     use_batch_norm=True,
...     num_lstm_layers=2,
...     lstm_units=[64, 32]
... )

Notes

The newly added future_input_dim allows the model to incorporate future covariates known at forecast time. In the call method, if future_input_dim is not None, the model expects three inputs: (static_inputs, dynamic_inputs, future_inputs). Otherwise, it expects only (static_inputs, dynamic_inputs).

See also

VariableSelectionNetwork

For feature selection and embedding.

GatedResidualNetwork

A GRN used in various sub-layers.

LSTM

Keras LSTM layers for sequence processing.


__init__(dynamic_input_dim, static_input_dim=None, future_input_dim=None, hidden_units=32, num_heads=4, dropout_rate=0.1, forecast_horizon=1, quantiles=None, activation='elu', use_batch_norm=False, num_lstm_layers=1, lstm_units=None, output_dim=1, **kw)

Methods

__init__(dynamic_input_dim[, ...])

add_loss(losses, **kwargs)

Add loss tensor(s), potentially dependent on layer inputs.

add_metric(value[, name])

Adds metric tensor to the layer.

add_update(updates)

Add update op(s), potentially dependent on layer inputs.

add_variable(*args, **kwargs)

Deprecated, do NOT use! Alias for add_weight.

add_weight([name, shape, dtype, ...])

Adds a new variable to the layer.

build(input_shape)

Builds the model based on input shapes received.

build_from_config(config)

Builds the layer's states with the supplied config dict.

call(inputs[, training])

The main forward pass for TemporalFusionTransformer.

compile([optimizer, loss, metrics, ...])

Configures the model for training.

compile_from_config(config)

Compiles the model with the information given in config.

compute_loss([x, y, y_pred, sample_weight])

Compute the total loss, validate it, and return it.

compute_mask(inputs[, mask])

Computes an output mask tensor.

compute_metrics(x, y, y_pred, sample_weight)

Update metric states and collect all metrics to be returned.

compute_output_shape(input_shape)

Computes the output shape of the layer.

compute_output_signature(input_signature)

Compute the output tensor signature of the layer based on the inputs.

count_params()

Count the total number of scalars composing the weights.

evaluate([x, y, batch_size, verbose, ...])

Returns the loss value & metrics values for the model in test mode.

evaluate_generator(generator[, steps, ...])

Evaluates the model on a data generator.

export(filepath)

Create a SavedModel artifact for inference (e.g. via TF-Serving).

finalize_state()

Finalizes the layer's state after updating layer weights.

fit([x, y, batch_size, epochs, verbose, ...])

Trains the model for a fixed number of epochs (dataset iterations).

fit_generator(generator[, steps_per_epoch, ...])

Fits the model on data yielded batch-by-batch by a Python generator.

from_config(config)

Recreate TemporalFusionTransformer instance from config.

get_build_config()

Returns a dictionary with the layer's input shape.

get_compile_config()

Returns a serialized config with information for compiling the model.

get_config()

Return the model configuration for serialization.

get_input_at(node_index)

Retrieves the input tensor(s) of a layer at a given node.

get_input_mask_at(node_index)

Retrieves the input mask tensor(s) of a layer at a given node.

get_input_shape_at(node_index)

Retrieves the input shape(s) of a layer at a given node.

get_layer([name, index])

Retrieves a layer based on either its name (unique) or index.

get_metrics_result()

Returns the model's metrics values as a dict.

get_output_at(node_index)

Retrieves the output tensor(s) of a layer at a given node.

get_output_mask_at(node_index)

Retrieves the output mask tensor(s) of a layer at a given node.

get_output_shape_at(node_index)

Retrieves the output shape(s) of a layer at a given node.

get_params([deep])

Get the parameters for this learner.

get_weight_paths()

Retrieve all the variables and their paths for the model.

get_weights()

Retrieves the weights of the model.

help(**kwargs)

load(file_path[, format])

Load the learner's state from a specified file in the desired format.

load_own_variables(store)

Loads the state of the layer.

load_weights(filepath[, skip_mismatch, ...])

Loads all layer weights from a saved file.

make_predict_function([force])

Creates a function that executes one step of inference.

make_test_function([force])

Creates a function that executes one step of evaluation.

make_train_function([force])

Creates a function that executes one step of training.

predict(x[, batch_size, verbose, steps, ...])

Generates output predictions for the input samples.

predict_generator(generator[, steps, ...])

Generates predictions for the input samples from a data generator.

predict_on_batch(x)

Returns predictions for a single batch of samples.

predict_step(data)

The logic for one inference step.

reset_metrics()

Resets the state of all the metrics in the model.

reset_states()

save(filepath[, overwrite, save_format])

Saves a model as a TensorFlow SavedModel or HDF5 file.

save_own_variables(store)

Saves the state of the layer.

save_spec([dynamic_batch])

Returns the tf.TensorSpec of call args as a tuple (args, kwargs).

save_weights(filepath[, overwrite, ...])

Saves all layer weights.

set_params(**params)

Set the parameters of this learner.

set_weights(weights)

Sets the weights of the layer, from NumPy arrays.

summary([line_length, positions, print_fn, ...])

Prints a string summary of the network.

test_on_batch(x[, y, sample_weight, ...])

Test the model on a single batch of samples.

test_step(data)

The logic for one evaluation step.

to_json(**kwargs)

Returns a JSON string containing the network configuration.

to_yaml(**kwargs)

Returns a yaml string containing the network configuration.

train_on_batch(x[, y, sample_weight, ...])

Runs a single gradient update on a single batch of data.

train_step(data)

The logic for one training step.

with_name_scope(method)

Decorator to automatically enter the module name scope.

Attributes

activity_regularizer

Optional regularizer function for the output of this layer.

autotune_steps_per_execution

Settable property to enable tuning for steps_per_execution.

compute_dtype

The dtype of the layer's computations.

distribute_reduction_method

The method employed to reduce per-replica values during training.

distribute_strategy

The tf.distribute.Strategy this model was created under.

dtype

The dtype of the layer weights.

dtype_policy

The dtype policy associated with this layer.

dynamic

Whether the layer is dynamic (eager-only); set in the constructor.

inbound_nodes

Return Functional API nodes upstream of this layer.

input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape(s) of a layer.

input_spec

InputSpec instance(s) describing the input format for this layer.

jit_compile

Specify whether to compile the model with XLA.

layers

losses

List of losses added using the add_loss() API.

metrics

Return metrics added using compile() or add_metric().

metrics_names

Returns the model's display labels for all outputs.

my_params

name

Name of the layer (string), set in the constructor.

name_scope

Returns a tf.name_scope instance for this class.

non_trainable_variables

Sequence of non-trainable variables owned by this module and its submodules.

non_trainable_weights

List of all non-trainable weights tracked by this layer.

outbound_nodes

Return Functional API nodes downstream of this layer.

output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape(s) of a layer.

run_eagerly

Settable attribute indicating whether the model should run eagerly.

state_updates

Deprecated, do NOT use!

stateful

steps_per_execution

Settable steps_per_execution variable. Requires a compiled model.

submodules

Sequence of all sub-modules.

supports_masking

Whether this layer supports computing a mask using compute_mask.

trainable

trainable_variables

Sequence of trainable variables owned by this module and its submodules.

trainable_weights

List of all trainable weights tracked by this layer.

updates

variable_dtype

Alias of Layer.dtype, the dtype of the weights.

variables

Returns the list of all layer variables/weights.

weights

Returns the list of all layer variables/weights.

help(**kwargs)
my_params = TemporalFusionTransformer(
    dynamic_input_dim,
    static_input_dim=None,
    future_input_dim=None,
    hidden_units=32,
    num_heads=4,
    dropout_rate=0.1,
    forecast_horizon=1,
    quantiles=None,
    activation='elu',
    use_batch_norm=False,
    num_lstm_layers=1,
    lstm_units=None,
    output_dim=1
)
call(inputs, training=False, **kw)

The main forward pass for TemporalFusionTransformer.

  1. Validate and unpack inputs using validate_tft_inputs.

  2. Apply variable selection to static, dynamic, and future inputs.

  3. Perform positional encoding on dynamic+future sequences.

  4. Compute static context vectors if static is present.

  5. Pass through LSTM encoders.

  6. Optionally enrich dynamic with static context.

  7. Temporal attention for interpretable weighting of time steps.

  8. Position-wise feedforward (GRN).

  9. Final slicing (forecast horizon) and output (quantiles or single).
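Step 7 can be illustrated with a minimal, pure-Python sketch of scaled dot-product attention for a single query and a single head. It is not the model's implementation: the source only states that a multi-head attention layer is used, so attention_weights and attend are hypothetical helpers written to show why the resulting weights are interpretable (they are non-negative and sum to 1 across time steps).

```python
import math
from typing import List, Sequence, Tuple


def attention_weights(
    query: Sequence[float], keys: Sequence[Sequence[float]]
) -> List[float]:
    """Softmax over scaled dot products between one query and each key."""
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale for key in keys
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    query: Sequence[float],
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Return the weighted sum of values and the per-time-step weights."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    context = [
        sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)
    ]
    return context, weights


# Three time steps; the first key is most aligned with the query.
context, weights = attend(
    query=[2.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    values=[[1.0], [2.0], [3.0]],
)
```

Each entry of weights corresponds to one time step, so the largest entry marks the step that contributes most to the context vector.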

Parameters:
  • inputs (tuple) – Should contain up to three elements: (dynamic_inputs, future_inputs, static_inputs) or fewer if not all are provided.

  • training (bool, default False) – Whether in training mode (affects dropout, batch normalization, etc.).

Returns:

Final predicted sequences of shape (batch_size, forecast_horizon, num_quantiles or 1).

Return type:

tf.Tensor
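When quantiles are set, the last axis of the output holds one prediction per quantile. The source does not say which loss is paired with these outputs; a common choice for quantile forecasts is the pinball (quantile) loss, sketched below in plain Python for a single sample of shape (forecast_horizon, num_quantiles). Both pinball_loss and mean_quantile_loss are hypothetical helpers for illustration, not fusionlab functions.

```python
from typing import Sequence


def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Pinball loss for one target, one prediction, and one quantile level."""
    error = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return max(q * error, (q - 1.0) * error)


def mean_quantile_loss(
    y_true: Sequence[float],
    y_pred: Sequence[Sequence[float]],
    quantiles: Sequence[float],
) -> float:
    """Average pinball loss over horizon steps and quantile levels.

    y_true[t] is the target at horizon step t, and y_pred[t][k] is the
    prediction for step t at quantile quantiles[k].
    """
    total = 0.0
    count = 0
    for target, step_preds in zip(y_true, y_pred):
        for q, pred in zip(quantiles, step_preds):
            total += pinball_loss(target, pred, q)
            count += 1
    return total / count


# One sample with forecast_horizon=2 and three quantile outputs per step.
loss = mean_quantile_loss(
    y_true=[10.0, 20.0],
    y_pred=[[8.0, 10.0, 12.0], [18.0, 20.0, 22.0]],
    quantiles=[0.1, 0.5, 0.9],
)
```

The asymmetric weighting is what pushes each output toward its quantile: for q=0.9, predicting too low costs nine times more than predicting too high by the same amount.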

get_config()

Return the model configuration for serialization. Includes all hyperparameters that define the structure of the TemporalFusionTransformer.

classmethod from_config(config)

Recreate TemporalFusionTransformer instance from config. This classmethod is invoked by Keras to deserialize the model.
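The get_config / from_config pair follows the usual Keras serialization pattern: get_config returns the constructor arguments as a plain dict, and from_config passes that dict back to the constructor. The sketch below shows the round trip with a hypothetical stand-in class, TinyConfigurable, that borrows three parameter names from this page; it does not reproduce the model's actual config contents.

```python
from typing import Any, Dict, List, Optional


class TinyConfigurable:
    """Hypothetical stand-in used only to illustrate the config round trip."""

    def __init__(
        self,
        dynamic_input_dim: int,
        hidden_units: int = 32,
        quantiles: Optional[List[float]] = None,
    ) -> None:
        self.dynamic_input_dim = dynamic_input_dim
        self.hidden_units = hidden_units
        self.quantiles = quantiles

    def get_config(self) -> Dict[str, Any]:
        # Return every constructor argument so the object can be rebuilt.
        return {
            "dynamic_input_dim": self.dynamic_input_dim,
            "hidden_units": self.hidden_units,
            "quantiles": self.quantiles,
        }

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "TinyConfigurable":
        # Rebuild a new instance from the serialized hyperparameters.
        return cls(**config)


original = TinyConfigurable(
    dynamic_input_dim=10, hidden_units=64, quantiles=[0.1, 0.5, 0.9]
)
clone = TinyConfigurable.from_config(original.get_config())
```

The clone is a separate object with the same hyperparameters; weights are not part of the config and would have to be saved and restored separately.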