fusionlab.nn.utils.forecast_multi_step

fusionlab.nn.utils.forecast_multi_step(xtft_model, inputs, forecast_horizon, y=None, dt_col=None, mode='quantile', spatial_cols=None, q=None, tname=None, forecast_dt=None, apply_mask=False, mask_values=None, mask_fill_value=None, savefile=None, verbose=3, **kws)

Generate a multi-step forecast using the XTFT model.

This function generates forecasts for multiple future time steps using a pre-trained XTFT deep learning model. The model takes three inputs: X_static, X_dynamic, and X_future, and produces predictions according to the formulation:

\[\hat{y}_{t+i} = f\Bigl( X_{\text{static}},\; X_{\text{dynamic}},\; X_{\text{future}} \Bigr)\]

for \(i = 1, \dots, forecast\_horizon\), where \(f\) is the trained XTFT model.

Parameters:

xtft_model (object) – A validated Keras model instance. The model is expected to be verified via validate_keras_model.
inputs (list or tuple of numpy.ndarray) – A list containing three elements: X_static, X_dynamic, and X_future. If spatial_cols is provided, it is assumed that the first two columns of X_static correspond to the first and second spatial coordinates of the original training data.
forecast_horizon (int) – The number of future time steps to forecast. For example, if forecast_horizon is 4, the model will generate predictions for 4 steps ahead.
y (numpy.ndarray, optional) – Actual target values. If provided, evaluation metrics such as R² Score and, in quantile mode, the coverage score are computed.
dt_col (str, optional) – Name of the time column (e.g. "year"). If provided, a column with this name is added to the output DataFrame. The actual time values must be supplied externally.
mode (str, optional) – Forecast mode. Must be either "quantile" or "point". In quantile mode, predictions are generated for multiple quantiles (default: [0.1, 0.5, 0.9]); in point mode, a single prediction is generated.
spatial_cols (list of str, optional) – A list of spatial column names. If provided, it must contain at least two elements corresponding to the first and second columns of the original training data’s X_static.
time_steps (int, optional) – The number of historical time steps used as input. Default is 3.
q (list of float, optional) – List of quantile values for quantile forecasting. The default is [0.1, 0.5, 0.9] when mode is "quantile".
tname (str, optional) – Target variable name used to construct output column names. For instance, if tname is "subsidence", then output columns may be named "subsidence_q10_step1", "subsidence_q50_step2", etc. Default is "target".
forecast_dt (any, optional) – Forecast datetime information. If provided and its length matches forecast_horizon, its values are added to the output DataFrame.
apply_mask (bool, optional) – If True, applies masking via mask_by_reference to replace predictions in non-subsiding areas. Requires that both mask_values and mask_fill_value are provided.
mask_values (scalar, optional) – The reference value(s) used for masking. Must be provided if apply_mask is True.
mask_fill_value (scalar, optional) – The value used to fill masked predictions. Must be provided if apply_mask is True.
savefile (str, optional) – File path to save the forecast results as a CSV file. If not provided, a default filename is generated.
verbose (int, optional) – Verbosity level controlling printed output. Higher values produce more detailed messages.
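
The interaction between apply_mask, mask_values, and mask_fill_value can be illustrated with a minimal sketch. It does not reproduce mask_by_reference: the helper name mask_predictions, the ref_col argument, and the choice of reference column are all assumptions made for illustration. It shows only the documented requirement that both masking values be supplied, and one plausible reading of "replace predictions where the reference matches".

```python
import pandas as pd


def mask_predictions(df, ref_col, pred_cols, apply_mask=False,
                     mask_values=None, mask_fill_value=None):
    """Hypothetical stand-in for the masking step (not fusionlab's code)."""
    if not apply_mask:
        return df
    # Documented requirement: both values must be given when masking is on.
    if mask_values is None or mask_fill_value is None:
        raise ValueError(
            "mask_values and mask_fill_value are required when apply_mask=True"
        )
    # Accept a scalar or a collection of reference values.
    if isinstance(mask_values, (list, tuple, set)):
        values = list(mask_values)
    else:
        values = [mask_values]
    masked = df.copy()
    # Assumption: rows whose reference column matches are overwritten.
    hit = masked[ref_col].isin(values)
    masked.loc[hit, pred_cols] = mask_fill_value
    return masked


demo = pd.DataFrame({
    "subsidence_actual": [0.0, 1.2, 0.0],
    "subsidence_pred_step1": [0.3, 1.1, 0.2],
})
masked_demo = mask_predictions(
    demo, "subsidence_actual", ["subsidence_pred_step1"],
    apply_mask=True, mask_values=0.0, mask_fill_value=0.0,
)
print(masked_demo)
```

In this sketch, rows whose reference value equals 0.0 have their prediction replaced by the fill value, and the input frame is left unmodified.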

Returns:

A DataFrame containing the multi-step forecast results. In quantile mode, the DataFrame includes columns for each quantile and each forecast step (e.g. <tname>_q10_step1, <tname>_q50_step2, etc.); in point mode, it contains a single prediction column per forecast step (e.g. <tname>_pred_step1). If y is provided, an additional column (<tname>_actual) is included.

Return type:

pandas.DataFrame
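
The column-naming scheme described under Returns can be sketched as follows. The helper build_forecast_columns is hypothetical and not part of fusionlab; the quantile label is assumed to be the quantile expressed as an integer percentage (0.1 becomes q10), and the step-major column order is likewise an assumption.

```python
def build_forecast_columns(tname, forecast_horizon, mode="quantile",
                           q=(0.1, 0.5, 0.9)):
    """Hypothetical helper that lists the expected prediction column names."""
    if mode not in ("quantile", "point"):
        raise ValueError("mode must be 'quantile' or 'point'")
    columns = []
    for step in range(1, forecast_horizon + 1):
        if mode == "quantile":
            # One column per quantile and step, e.g. subsidence_q10_step1.
            for quantile in q:
                label = int(round(quantile * 100))
                columns.append(f"{tname}_q{label}_step{step}")
        else:
            # One column per step, e.g. subsidence_pred_step1.
            columns.append(f"{tname}_pred_step{step}")
    return columns


quantile_cols = build_forecast_columns("subsidence", 2)
point_cols = build_forecast_columns("subsidence", 2, mode="point")
print(quantile_cols)
print(point_cols)
```

A list like this can be useful for selecting the prediction columns from the returned DataFrame without hard-coding each name.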

Examples

>>> from fusionlab.nn.transformers import XTFT
>>> from fusionlab.nn.utils import forecast_multi_step
>>> from fusionlab.nn.losses import combined_quantile_loss
>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create a dummy training DataFrame with a date column,
>>> # spatial features ("longitude", "latitude"), two dynamic
>>> # features ("feat1", "feat2"), a static feature ("stat1"), and
>>> # the target variable "subsidence".
>>> date_rng = pd.date_range(start="2020-01-01", periods=60,
...                          freq="D")
>>> train_df = pd.DataFrame({
...     "date": date_rng,
...     "longitude": np.random.uniform(-180, 180, 60),
...     "latitude": np.random.uniform(-90, 90, 60),
...     "feat1": np.random.rand(60),
...     "feat2": np.random.rand(60),
...     "stat1": np.random.rand(60),
...     "subsidence": np.random.rand(60)
... })
>>>
>>> # Prepare dummy input arrays for model training.
>>> # X_static is constructed using "longitude" and "stat1".
>>> X_static = train_df[["longitude", "stat1"]].values
>>> # X_dynamic for "feat1" and "feat2" with time_steps = 3.
>>> X_dynamic = np.random.rand(60, 3, 2)
>>> # X_future is a dummy future feature array with shape (60, 3, 1).
>>> X_future = np.random.rand(60, 3, 1)
>>> # Target output from "subsidence" reshaped to
>>> # (60, 1, 1). For multi-step forecast, forecast_horizon is 4.
>>> forecast_horizon = 4
>>> y_array = train_df["subsidence"].values.reshape(60, 1, 1)
>>>
>>> # Instantiate a dummy XTFT model.
>>> my_model = XTFT(
...     static_input_dim=2,    # "longitude" and "stat1"
...     dynamic_input_dim=2,   # "feat1" and "feat2"
...     future_input_dim=1,    # One future feature
...     forecast_horizon=forecast_horizon,
...     quantiles=[0.1, 0.5, 0.9],
...     embed_dim=16,
...     max_window_size=3,
...     memory_size=50,
...     num_heads=2,
...     dropout_rate=0.1,
...     lstm_units=32,
...     attention_units=32,
...     hidden_units=16
... )
>>> my_model.compile(
...     optimizer="adam",
...     loss=combined_quantile_loss(my_model.quantiles)
... )
>>>
>>> # Fit the model on the dummy data for demonstration.
>>> my_model.fit(
...     x=[X_static, X_dynamic, X_future],
...     y=y_array,
...     epochs=1,
...     batch_size=8
... )
>>>
>>> # Generate forecast datetime values for the forecast horizon.
>>> forecast_dates = pd.date_range(start="2020-02-01",
...                                periods=forecast_horizon, freq="D")
>>>
>>> # Package inputs as expected by forecast_multi_step.
>>> inputs = [X_static, X_dynamic, X_future]
>>>
>>> # Generate a multi-step forecast in quantile mode.
>>> forecast_df_quantile = forecast_multi_step(
...     xtft_model=my_model,
...     inputs=inputs,
...     forecast_horizon=forecast_horizon,
...     y=y_array,
...     dt_col="date",
...     mode="quantile",
...     spatial_cols=["longitude", "latitude"],
...     q=[0.1, 0.5, 0.9],
...     tname="subsidence",
...     forecast_dt=forecast_dates,
...     apply_mask=False,
...     verbose=3
... )
>>> print("Quantile Forecast:")
>>> print(forecast_df_quantile.head())
>>>

For a point forecast, instantiate the model with quantiles=None and compile it with a point loss such as "mse":

>>> # Instantiate a dummy XTFT model.
>>> my_model = XTFT(
...     static_input_dim=2,    # "longitude" and "stat1"
...     dynamic_input_dim=2,   # "feat1" and "feat2"
...     future_input_dim=1,    # One future feature
...     forecast_horizon=forecast_horizon,
...     quantiles=None, # set quantiles to None
...     embed_dim=16,
...     max_window_size=3,
...     memory_size=50,
...     num_heads=2,
...     dropout_rate=0.1,
...     lstm_units=32,
...     attention_units=32,
...     hidden_units=16
... )
>>> my_model.compile(
...     optimizer="adam",
...     loss="mse"
... )
>>>
>>> # Fit the model on the dummy data for demonstration.
>>> my_model.fit(
...     x=[X_static, X_dynamic, X_future],
...     y=y_array,
...     epochs=1,
...     batch_size=8
... )

>>> # Generate a multi-step forecast in point mode.
>>> forecast_df_point = forecast_multi_step(
...     xtft_model=my_model,
...     inputs=inputs,
...     forecast_horizon=forecast_horizon,
...     y=y_array,
...     dt_col="date",
...     mode="point",
...     spatial_cols=["longitude", "latitude"],
...     tname="subsidence",
...     forecast_dt=forecast_dates,
...     apply_mask=False,
...     verbose=3
... )
>>> print("Point Forecast:")
>>> print(forecast_df_point.head())

Notes

In quantile mode, predictions are generated for each specified quantile for every forecast step, and the median (0.5) is used for evaluation.
In point mode, a single prediction is generated per forecast step.
The output prediction array is expected to have the shape \((n, forecast\_horizon, m)\), where \(n\) is the number of samples and \(m\) is the number of outputs per step (e.g., number of quantiles in quantile mode or 1 in point mode).
The provided spatial_cols must correspond to the first two columns of the original training data’s X_static.
Evaluation metrics such as R² Score and Coverage Score (in quantile mode) are computed if actual target values (y) are provided.
The DataFrame is constructed by iterating over each sample and each forecast step.
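
The notes on output shape and evaluation can be made concrete with a small NumPy sketch. It is not the library's implementation: the functions r2_score and coverage_score below are written from the textbook definitions, the coverage score is assumed to be the fraction of actual values that fall inside the lowest-to-highest quantile interval, and the prediction array is synthetic stand-in data rather than model output.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination computed on flattened arrays."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def coverage_score(y_true, y_lower, y_upper):
    """Fraction of actual values inside [y_lower, y_upper] (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_lower = np.asarray(y_lower, dtype=float)
    y_upper = np.asarray(y_upper, dtype=float)
    inside = (y_true >= y_lower) & (y_true <= y_upper)
    return float(inside.mean())


rng = np.random.default_rng(0)
n, forecast_horizon, q = 60, 4, [0.1, 0.5, 0.9]
y_true = rng.random((n, forecast_horizon))

# Stand-in predictions with shape (n, forecast_horizon, m), m = len(q).
preds = np.stack([y_true - 0.2, y_true, y_true + 0.2], axis=-1)

# The median (0.5) slice is the one used for point-style evaluation.
median = preds[:, :, q.index(0.5)]
r2 = r2_score(y_true, median)
coverage = coverage_score(y_true, preds[:, :, 0], preds[:, :, -1])
print(preds.shape, r2, coverage)
```

Because the synthetic median equals the targets and the outer quantiles bracket them by construction, both scores are 1.0 here; with real model output they would generally be lower.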