fusionlab.nn.utils.step_to_long

fusionlab.nn.utils.step_to_long(df, tname=None, dt_col=None, spatial_cols=None, mode='quantile', quantiles=None, verbose=3, sort=True)

Convert a multi-step forecast DataFrame from wide to long format.

This function transforms a DataFrame containing multi-step forecast predictions into a long-format DataFrame. In quantile mode, per-step forecast columns such as subsidence_q10_step1 and subsidence_q50_step1 are consolidated into unified columns (e.g. subsidence_q10, subsidence_q50). In point mode, a single prediction column (subsidence_pred) is produced. Additional columns, such as spatial coordinates and time, are carried over from the original DataFrame.

Parameters:
  • df (pandas.DataFrame) – The multi-step forecast DataFrame. Expected to contain forecast prediction columns (e.g. columns with _q or _pred_step in their names) along with other identifiers.

  • tname (str, optional) – The base name of the target variable (e.g. "subsidence"). If None, the function attempts to auto-detect the target name from the column names.

  • dt_col (str, optional) – The name of the time column to include in the final DataFrame. If not provided, time sorting is not performed.

  • spatial_cols (list of str, optional) – A list of spatial coordinate columns (e.g. ["longitude", "latitude"]) to be retained in the final output.

  • mode ({"quantile", "point"}, default "quantile") – The forecast mode. In "quantile" mode, multiple quantile forecast columns are merged into unified columns. In "point" mode, a single prediction column is produced.

  • quantiles (list of float, optional) – The quantile values for quantile mode (e.g. [0.1, 0.5, 0.9]). If not provided, defaults are used.

  • sort (bool, optional) – If True, sorts the final DataFrame by the column specified in dt_col (if present). Default is True.

  • verbose (int, optional) – Verbosity level for logging output. Higher values (e.g. 5 to 7) provide more detailed debug information. Default is 3.
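The naming convention behind tname and quantiles can be illustrated with a small parser. This is only a sketch under assumptions: the regular expression and the helper name parse_forecast_columns are hypothetical and are not part of fusionlab; they simply assume quantile columns follow the <target>_q<digits>_step<digits> pattern shown in the examples on this page.

```python
import re

# Hypothetical pattern (an assumption, not fusionlab's own):
# <target>_q<digits>_step<digits>, e.g. "subsidence_q10_step1".
_Q_STEP = re.compile(r"^(?P<target>.+)_q(?P<q>\d+)_step(?P<step>\d+)$")


def parse_forecast_columns(columns):
    """Return (targets, quantiles, steps) found in quantile-style column names."""
    targets, quantiles, steps = set(), set(), set()
    for name in columns:
        match = _Q_STEP.match(name)
        if match is None:
            continue  # identifier columns such as "longitude" or "year"
        targets.add(match.group("target"))
        quantiles.add(int(match.group("q")) / 100)  # "q10" -> 0.1
        steps.add(int(match.group("step")))
    return sorted(targets), sorted(quantiles), sorted(steps)


cols = [
    "longitude", "latitude", "year", "subsidence_actual",
    "subsidence_q10_step1", "subsidence_q50_step1", "subsidence_q90_step1",
    "subsidence_q10_step2", "subsidence_q50_step2", "subsidence_q90_step2",
]
targets, quantiles, steps = parse_forecast_columns(cols)
print(targets, quantiles, steps)
```

When a single target name is found this way, it is a natural candidate for tname; how the library itself resolves ambiguous names is not specified here.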

Returns:

A long-format DataFrame with the following columns:
  • Spatial columns (if provided)

  • The time column (dt_col), if provided

  • Forecast prediction columns:

    • In quantile mode: unified columns (e.g. subsidence_q10, subsidence_q50)

    • In point mode: a single column (subsidence_pred)

Return type:

pandas.DataFrame

Examples

>>> from fusionlab.nn.utils import step_to_long
>>> # Given a DataFrame `forecast_df` with columns like:
>>> # ['longitude', 'latitude', 'year', 'subsidence_actual',
>>> #  'subsidence_q10_step1', 'subsidence_q50_step1', 'subsidence_q90_step1',
>>> #  'subsidence_q10_step2', ...]
>>> long_df = step_to_long(
...     df=forecast_df,
...     tname="subsidence",
...     dt_col="year",
...     spatial_cols=["longitude", "latitude"],
...     mode="quantile",
...     quantiles=[0.1, 0.5, 0.9],
...     verbose=3,
...     sort=True
... )
>>> print(long_df.head())

Notes

Internally, this function calls:

  • check_forecast_mode() to validate the user-specified quantiles.

  • validate_consistency_q() and validate_quantiles() to ensure that the quantile values provided by the user match those auto-detected from the DataFrame.

  • _step_to_long_q() (quantile mode) or _step_to_long_pred() (point mode), depending on mode, to perform the conversion.
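The consistency check between user-supplied and auto-detected quantiles might look roughly like the sketch below. The helper check_quantiles and its tolerance argument are hypothetical; this is not the implementation of validate_consistency_q() or validate_quantiles(), whose exact behavior is not documented on this page.

```python
import math


def check_quantiles(user_q, detected_q, tol=1e-8):
    """Validate user quantiles and compare them with those found in column names.

    Hypothetical helper for illustration only; not part of fusionlab.
    """
    for q in user_q:
        if not 0.0 < q < 1.0:
            raise ValueError(f"quantile {q!r} is outside the open interval (0, 1)")

    user_sorted, detected_sorted = sorted(user_q), sorted(detected_q)
    same = len(user_sorted) == len(detected_sorted) and all(
        math.isclose(u, d, abs_tol=tol)
        for u, d in zip(user_sorted, detected_sorted)
    )
    if not same:
        raise ValueError(
            f"user quantiles {user_sorted} do not match detected {detected_sorted}"
        )
    return user_sorted


# Order of the user-supplied values does not matter in this sketch.
validated = check_quantiles([0.5, 0.1, 0.9], [0.1, 0.5, 0.9])
print(validated)
```

Comparing with a tolerance rather than strict equality avoids spurious mismatches when quantiles are recovered from column-name digits.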

Mathematically, let \(X \in \mathbb{R}^{n \times m}\) represent the wide-format DataFrame, where each row corresponds to one sample and each forecast step is stored in separate columns. The function reshapes \(X\) into a long-format DataFrame \(Y \in \mathbb{R}^{(n \cdot s) \times p}\), where \(s\) is the forecast horizon and \(p\) is the number of output columns after merging forecast step values.
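The shape change can be made concrete with plain pandas. The sketch below is not the library's implementation: it uses toy values, assumes the <target>_q<NN>_step<k> naming shown in the examples, and adds a step column purely for illustration (whether the real output contains such a column is not stated here).

```python
import pandas as pd

# Toy wide-format frame: n = 2 samples, s = 2 forecast steps, m = 8 columns.
wide = pd.DataFrame({
    "longitude": [113.0, 113.5],
    "latitude": [22.0, 22.5],
    "subsidence_q10_step1": [1.0, 2.0],
    "subsidence_q50_step1": [1.5, 2.5],
    "subsidence_q90_step1": [2.0, 3.0],
    "subsidence_q10_step2": [1.1, 2.1],
    "subsidence_q50_step2": [1.6, 2.6],
    "subsidence_q90_step2": [2.1, 3.1],
})

id_cols = ["longitude", "latitude"]
steps = [1, 2]
q_labels = ["q10", "q50", "q90"]

frames = []
for step in steps:
    # Map e.g. "subsidence_q10_step1" -> "subsidence_q10" for this step.
    rename = {f"subsidence_{q}_step{step}": f"subsidence_{q}" for q in q_labels}
    block = wide[id_cols + list(rename)].rename(columns=rename)
    # "step" is an illustrative helper column (an assumption of this sketch).
    frames.append(block.assign(step=step))

# Stack the per-step blocks: (n * s) rows, p = 6 columns.
long_df = pd.concat(frames, ignore_index=True)
print(wide.shape, "->", long_df.shape)
```

Here the 2 × 8 wide frame becomes a 4 × 6 long frame, matching \(n \cdot s = 4\) rows with \(p = 6\) columns in this toy setup.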

See also

_step_to_long_q

Converts multi-step quantile forecasts to long format.

_step_to_long_pred

Converts multi-step point forecasts to long format.

detect_digits

Extracts numeric values from strings for quantile detection.
