fusionlab.nn.models.XTFT

class fusionlab.nn.models.XTFT

Bases: BaseExtreme

Extreme Temporal Fusion Transformer (XTFT) model for complex time series forecasting.

XTFT is an advanced architecture for time series forecasting, particularly suited to scenarios featuring intricate temporal patterns, multiple forecast horizons, and inherent uncertainties [1]. By extending the original Temporal Fusion Transformer, XTFT incorporates additional modules and strategies that enhance its representational capacity, stability, and interpretability.

See more in User Guide.

**kw : dict

Additional keyword arguments passed to the model. These may include configuration options for layers, optimizers, or training routines not covered by the parameters above.

Examples

>>> import os
>>> import tensorflow as tf
>>> import pandas as pd
>>> import numpy as np
>>> from fusionlab.nn.models import XTFT
>>> from fusionlab.nn.losses import combined_quantile_loss
>>>
>>> # The XTFT model expects three input arrays:
>>> # - X_static:  (n_samples, static_input_dim)
>>> # - X_dynamic: (n_samples, time_steps, dynamic_input_dim)
>>> # - X_future:  (n_samples, time_steps, future_input_dim)
>>> # This example trains a dummy model, saves it, and reloads it.
>>> data_path = r'J:\test_saved_models'
>>> early_stopping = tf.keras.callbacks.EarlyStopping(
...    monitor              = 'val_loss',
...    patience             = 5,
...    restore_best_weights = True
... )
>>> model_checkpoint = tf.keras.callbacks.ModelCheckpoint(
...    os.path.join( data_path, 'dummy_model'),
...    monitor           = 'val_loss',
...    save_best_only    = True,
...    save_weights_only = False,  # Save entire model
...    verbose           = 1
... )
>>> # Create a dummy DataFrame with a date column,
>>> # two dynamic features ("feat1", "feat2"), one static feature ("stat1"),
>>> # and target "price".
>>> date_rng = pd.date_range(start="2020-01-01", periods=60, freq="D")
>>> data = {
...     "date": date_rng,
...     "feat1": np.random.rand(60),
...     "feat2": np.random.rand(60),
...     "stat1": np.random.rand(60),
...     "price": np.random.rand(60)
... }
>>> df = pd.DataFrame(data)
>>> df.head(5)
>>>
>>>
>>> # Split the DataFrame into training and test sets.
>>> # Training data: dates before 2020-02-01
>>> # Test data: dates from 2020-02-01 onward.
>>> train_df = df[df["date"] < "2020-02-01"].copy()
>>> test_df  = df[df["date"] >= "2020-02-01"].copy()
>>>
>>> # Create dummy input arrays for model fitting.
>>> # Assume time_steps = 3.
>>> X_static = train_df[["stat1"]].values      # Shape: (n_train, 1)
>>> X_dynamic = np.random.rand(len(train_df), 3, 2)
>>> X_future  = np.random.rand(len(train_df), 3, 1)
>>> # Create dummy target output from "price".
>>> y_array   = train_df["price"].values.reshape(len(train_df), 1, 1)
>>>
>>> # Instantiate a dummy XTFT model.
>>> my_model = XTFT(
...     static_input_dim=1,           # "stat1"
...     dynamic_input_dim=2,          # "feat1" and "feat2"
...     future_input_dim=1,           # For the provided future feature
...     forecast_horizon=5,           # Forecasting 5 periods ahead
...     quantiles=[0.1, 0.5, 0.9],
...     embed_dim=16,
...     max_window_size=3,
...     memory_size=50,
...     num_heads=2,
...     dropout_rate=0.1,
...     lstm_units=32,
...     attention_units=32,
...     hidden_units=16
... )
>>> # Build the model by calling it once on the input arrays.
>>> _ = my_model([X_static, X_dynamic, X_future])
>>> loss_fn = combined_quantile_loss(my_model.quantiles)
>>> my_model.compile(optimizer="adam", loss=loss_fn)
>>>
>>> # Fit the model on the training data.
>>> my_model.fit(
...     x=[X_static, X_dynamic, X_future],
...     y=y_array,
...     epochs=10,
...     batch_size=8,
...     validation_split=0.2,
...     callbacks=[early_stopping, model_checkpoint]
... )
Epoch 9/10
4/4 [==============================] - 0s 4ms/step - loss: 0.0958
Epoch 10/10
4/4 [==============================] - 0s 5ms/step - loss: 0.1009
<keras.src.callbacks.History at 0x1c7a9114c10>
>>> # Save the entire trained model in the .keras format.
>>> my_model.save(os.path.join(data_path, 'dummy_model.keras'))
>>> y_predictions = my_model.predict([X_static, X_dynamic, X_future])
1/1 [==============================] - 1s 640ms/step
>>> print(y_predictions.shape)
(31, 5, 3, 1)
>>> # Now reload the saved model 'dummy_model.keras' and check
>>> # that it was successfully reloaded.
>>> test_model = tf.keras.models.load_model(os.path.join(data_path, 'dummy_model.keras'))
>>> test_model
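
The example compiles the model with a quantile loss over `quantiles=[0.1, 0.5, 0.9]`. As a rough illustration of what such a loss measures, here is a minimal NumPy sketch of the standard pinball (quantile) loss. It is not the implementation of `combined_quantile_loss`; the function name `pinball_loss` is hypothetical, and the array layouts (targets as `(n_samples, horizon, output_dim)`, predictions as `(n_samples, horizon, n_quantiles, output_dim)`) are assumptions based on the prediction shape printed above.

```python
import numpy as np


def pinball_loss(y_true, y_pred, quantiles):
    """Mean pinball (quantile) loss; a hypothetical helper, not fusionlab code.

    Assumed layouts:
      y_true: (n_samples, horizon, output_dim)
      y_pred: (n_samples, horizon, n_quantiles, output_dim)
    """
    # Insert a quantile axis so the targets broadcast against the predictions.
    y_true = np.asarray(y_true, dtype=float)[:, :, None, :]
    y_pred = np.asarray(y_pred, dtype=float)
    q = np.asarray(quantiles, dtype=float).reshape(1, 1, -1, 1)
    error = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return float(np.mean(np.maximum(q * error, (q - 1.0) * error)))


y_true = np.zeros((2, 5, 1))
y_pred = np.zeros((2, 5, 3, 1))
y_pred[:, :, 0, :] = -1.0  # the 0.1-quantile forecast sits 1 below the target
y_pred[:, :, 2, :] = 1.0   # the 0.9-quantile forecast sits 1 above the target

loss = pinball_loss(y_true, y_pred, [0.1, 0.5, 0.9])
print(round(loss, 4))  # 0.0667
```

The lower and upper quantile forecasts each incur only a small penalty (0.1) for missing by one unit on their "expected" side, which is what lets a quantile model widen its interval without being punished as heavily as a point forecast would be.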

See also

fusionlab.nn.tft.TemporalFusionTransformer

The original TFT model for comparison.

MultiHeadAttention

Keras layer for multi-head attention.

LSTM

Keras LSTM layer for sequence modeling.

References

__init__(*, static_input_dim, dynamic_input_dim, future_input_dim, embed_dim=32, forecast_horizon=1, quantiles=None, max_window_size=10, memory_size=100, num_heads=4, dropout_rate=0.1, output_dim=1, attention_units=32, hidden_units=64, lstm_units=64, scales=None, multi_scale_agg=None, activation='relu', use_residuals=True, use_batch_norm=False, final_agg='last', anomaly_config=None, anomaly_detection_strategy=None, anomaly_loss_weight=0.1, architecture_config=None, **kw)
Parameters:
  • static_input_dim (int)

  • dynamic_input_dim (int)

  • future_input_dim (int)

  • embed_dim (int)

  • forecast_horizon (int)

  • quantiles (str | List[float] | None)

  • max_window_size (int)

  • memory_size (int)

  • num_heads (int)

  • dropout_rate (float)

  • output_dim (int)

  • attention_units (int)

  • hidden_units (int)

  • lstm_units (int)

  • scales (str | List[int] | None)

  • multi_scale_agg (str | None)

  • activation (str | callable)

  • use_residuals (bool)

  • use_batch_norm (bool)

  • final_agg (str)

  • anomaly_config (Dict[str, Any] | None)

  • anomaly_detection_strategy (str | None)

  • anomaly_loss_weight (float)

  • architecture_config (Dict | None)

  • kw (Any)

Return type:

None
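
The dimension arguments fix the trailing axis of each input array, as noted in the Examples: `X_dynamic` has shape `(n_samples, time_steps, dynamic_input_dim)`. The Examples fill these arrays with random values; the sketch below shows one way windowed arrays of that shape could be cut from a feature table. The helper `make_windows` is hypothetical and not part of fusionlab, and the target layout `(n_samples, forecast_horizon, output_dim)` is an assumption.

```python
import numpy as np


def make_windows(values, time_steps, horizon):
    """Hypothetical helper, not part of fusionlab.

    Slices a 2-D array of shape (n_rows, n_features) into overlapping
    look-back windows and the `horizon` rows that follow each window.
    """
    values = np.asarray(values, dtype=float)
    n_samples = len(values) - time_steps - horizon + 1
    if n_samples <= 0:
        raise ValueError("not enough rows for this window and horizon")
    # past[i] holds rows i .. i + time_steps - 1.
    past = np.stack([values[i:i + time_steps] for i in range(n_samples)])
    # future[i] holds the `horizon` rows immediately after window i.
    future = np.stack([
        values[i + time_steps:i + time_steps + horizon]
        for i in range(n_samples)
    ])
    return past, future


rng = np.random.default_rng(0)
features = rng.random((31, 2))  # stand-in for two dynamic features
target = rng.random((31, 1))    # stand-in for a single target column

X_dynamic, _ = make_windows(features, time_steps=3, horizon=5)
_, y = make_windows(target, time_steps=3, horizon=5)

print(X_dynamic.shape)  # (24, 3, 2)
print(y.shape)          # (24, 5, 1)
```

With 31 rows, a 3-step look-back, and a 5-step horizon, 31 - 3 - 5 + 1 = 24 complete samples are available; the last axis of each array matches `dynamic_input_dim=2` and `output_dim=1` respectively.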

Methods

__init__(*, static_input_dim, ...[, ...])

add_loss(loss)

Can be called inside of the call() method to add a scalar loss.

add_metric(*args, **kwargs)

add_variable(shape, initializer[, dtype, ...])

Add a weight variable to the layer.

add_weight([shape, initializer, dtype, ...])

Add a weight variable to the layer.

build(input_shape)

build_from_config(config)

Builds the layer's states with the supplied config dict.

call(inputs[, training])

compile(optimizer[, loss])

Configures the model for training.

compile_from_config(config)

Compiles the model with the information given in config.

compiled_loss(y, y_pred[, sample_weight, ...])

compute_loss([x, y, y_pred, sample_weight, ...])

Compute the total loss, validate it, and return it.

compute_mask(inputs, previous_mask)

compute_metrics(x, y, y_pred[, sample_weight])

Update metric states and collect all metrics to be returned.

compute_output_shape(*args, **kwargs)

compute_output_spec(*args, **kwargs)

count_params()

Count the total number of scalars composing the weights.

evaluate([x, y, batch_size, verbose, ...])

Returns the loss value & metrics values for the model in test mode.

export(filepath[, format, verbose, ...])

Export the model as an artifact for inference.

fit([x, y, batch_size, epochs, verbose, ...])

Trains the model for a fixed number of epochs (dataset iterations).

from_config(config)

Creates an operation from its config.

get_build_config()

Returns a dictionary with the layer's input shape.

get_compile_config()

Returns a serialized config with information for compiling the model.

get_config()

Returns the config of the object.

get_layer([name, index])

Retrieves a layer based on either its name (unique) or index.

get_metrics_result()

Returns the model's metrics values as a dict.

get_params([deep])

Get the parameters for this learner.

get_state_tree([value_format])

Retrieves tree-like structure of model variables.

get_weights()

Return the values of layer.weights as a list of NumPy arrays.

help(**kwargs)

load(file_path[, format])

Load the learner's state from a specified file in the desired format.

load_own_variables(store)

Loads the state of the layer.

load_weights(filepath[, skip_mismatch])

Load the weights from a single file or sharded files.

loss(y, y_pred[, sample_weight])

make_predict_function([force])

make_test_function([force])

make_train_function([force])

predict(x[, batch_size, verbose, steps, ...])

Generates output predictions for the input samples.

predict_on_batch(x)

Returns predictions for a single batch of samples.

predict_step(data)

quantize(mode[, config])

Quantize the weights of the model.

quantized_build(input_shape, mode)

quantized_call(*args, **kwargs)

reconfigure(architecture_config)

Creates a new model instance with a modified architecture.

rematerialized_call(layer_call, *args, **kwargs)

Enable rematerialization dynamically for layer's call method.

reset_metrics()

save(filepath[, overwrite, zipped])

Saves a model as a .keras file.

save_own_variables(store)

Saves the state of the layer.

save_weights(filepath[, overwrite, ...])

Saves all weights to a single file or sharded files.

set_params(**params)

Set the parameters of this learner.

set_state_tree(state_tree)

Assigns values to variables of the model.

set_weights(weights)

Sets the values of layer.weights from a list of NumPy arrays.

stateless_call(trainable_variables, ...[, ...])

Call the layer without any side effects.

stateless_compute_loss(trainable_variables, ...)

summary([line_length, positions, print_fn, ...])

Prints a string summary of the network.

symbolic_call(*args, **kwargs)

test_on_batch(x[, y, sample_weight, return_dict])

Test the model on a single batch of samples.

test_step(data)

to_json(**kwargs)

Returns a JSON string containing the network configuration.

train_on_batch(x[, y, sample_weight, ...])

Runs a single gradient update on a single batch of data.

train_step(data)

Attributes

compiled_metrics

compute_dtype

The dtype of the computations performed by the layer.

distribute_reduction_method

distribute_strategy

dtype

Alias of layer.variable_dtype.

dtype_policy

input

Retrieves the input tensor(s) of a symbolic operation.

input_dtype

The dtype layer inputs should be converted to.

input_spec

jit_compile

layers

losses

List of scalar losses from add_loss, regularizers and sublayers.

metrics

List of all metrics.

metrics_names

metrics_variables

List of all metric variables.

my_params

non_trainable_variables

List of all non-trainable layer state.

non_trainable_weights

List of all non-trainable weight variables of the layer.

output

Retrieves the output tensor(s) of a layer.

path

The path of the layer.

quantization_mode

The quantization mode of this layer, None if not quantized.

run_eagerly

supports_masking

Whether this layer supports computing a mask using compute_mask.

trainable

Settable boolean, whether this layer should be trainable or not.

trainable_variables

List of all trainable layer state.

trainable_weights

List of all trainable weight variables of the layer.

variable_dtype

The dtype of the state (weights) of the layer.

variables

List of all layer state, including random seeds.

weights

List of all weight variables of the layer.

help(**kwargs)
my_params = XTFT(
    static_input_dim,
    dynamic_input_dim,
    future_input_dim,
    embed_dim=32,
    forecast_horizon=1,
    quantiles=None,
    max_window_size=10,
    memory_size=100,
    num_heads=4,
    dropout_rate=0.1,
    output_dim=1,
    attention_units=32,
    hidden_units=64,
    lstm_units=64,
    scales=None,
    multi_scale_agg=None,
    activation='relu',
    use_residuals=True,
    use_batch_norm=False,
    final_agg='last',
    anomaly_config=None,
    anomaly_detection_strategy=None,
    anomaly_loss_weight=0.1,
    architecture_config=None
)