fusionlab.nn.utils.step_to_long
fusionlab.nn.utils.step_to_long(df, tname=None, dt_col=None, spatial_cols=None, mode='quantile', quantiles=None, verbose=3, sort=True)
Convert a multi-step forecast DataFrame from wide to long format.
This function transforms a DataFrame containing multi-step forecast predictions into a long-format DataFrame. In quantile mode, forecast columns such as subsidence_q10_step1, subsidence_q50_step1, etc. are consolidated into unified columns (e.g. subsidence_q10, subsidence_q50, etc.), while in point mode, a single prediction column (subsidence_pred) is generated. The transformation also carries over additional columns (e.g. spatial coordinates and time) from the original DataFrame.
- Parameters:
df (pandas.DataFrame) – The multi-step forecast DataFrame. Expected to contain forecast prediction columns (e.g. columns with _q or _pred_step in their names) along with other identifiers.
tname (str, optional) – The base name of the target variable (e.g. "subsidence"). If None, the function attempts to auto-detect the target name from the column names.
dt_col (str, optional) – The name of the time column to include in the final DataFrame. If not provided, time sorting is not performed.
spatial_cols (list of str, optional) – A list of spatial coordinate columns (e.g. ["longitude", "latitude"]) to be retained in the final output.
mode ({"quantile", "point"}, default "quantile") – The forecast mode. In "quantile" mode, multiple quantile forecast columns are merged into unified columns. In "point" mode, a single prediction column is produced.
quantiles (list of float, optional) – The quantile values for quantile mode (e.g. [0.1, 0.5, 0.9]). If not provided, defaults are used.
sort (bool, optional) – If True, sorts the final DataFrame by the column specified in dt_col (if present). Default is True.
verbose (int, optional) – Verbosity level for logging output. Higher values (e.g. 5 to 7) provide more detailed debug information.
- Returns:
- A long-format DataFrame with the following columns:
Spatial columns (if provided)
The time column (dt_col), if provided
Forecast prediction columns:
In quantile mode: unified columns (e.g. subsidence_q10, subsidence_q50, etc.)
In point mode: a single column (subsidence_pred)
- Return type:
pandas.DataFrame
Examples
>>> from fusionlab.nn.utils import step_to_long
>>> # Given a DataFrame `forecast_df` with columns like:
>>> # ['longitude', 'latitude', 'year', 'subsidence_actual',
>>> #  'subsidence_q10_step1', 'subsidence_q50_step1', 'subsidence_q90_step1',
>>> #  'subsidence_q10_step2', ...]
>>> long_df = step_to_long(
...     df=forecast_df,
...     tname="subsidence",
...     dt_col="year",
...     spatial_cols=["longitude", "latitude"],
...     mode="quantile",
...     quantiles=[0.1, 0.5, 0.9],
...     verbose=3,
...     sort=True
... )
>>> print(long_df.head())
Notes
Internally, this function calls:
check_forecast_mode() to validate the user-specified quantiles.
validate_consistency_q() and validate_quantiles() to ensure that the quantile values provided by the user match those auto-detected from the DataFrame.
Depending on the mode, it then calls either _step_to_long_q() (for quantile mode) or _step_to_long_pred() (for point mode) to perform the conversion.
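As a rough illustration of the auto-detection mentioned above, the sketch below parses column names with a regular expression. It is not the code fusionlab uses internally: the helper name detect_quantile_layout, the pattern, and the reading of the digits after _q as a percentage (q10 meaning 0.1) are all assumptions made for this example.

```python
import re

# Assumed naming scheme: "<target>_q<digits>_step<k>".
# This pattern is hypothetical and only mirrors the column names shown
# in the Examples section.
Q_PATTERN = re.compile(r"^(?P<target>.+)_q(?P<q>\d+)_step(?P<step>\d+)$")


def detect_quantile_layout(columns):
    """Return sorted target names, quantile levels, and steps found in columns."""
    targets, quantiles, steps = set(), set(), set()
    for name in columns:
        match = Q_PATTERN.match(name)
        if match is None:
            continue  # not a quantile forecast column
        targets.add(match["target"])
        # Assumption: the digits encode a percentage, so "q10" means 0.1.
        quantiles.add(int(match["q"]) / 100)
        steps.add(int(match["step"]))
    return sorted(targets), sorted(quantiles), sorted(steps)


columns = [
    "longitude", "latitude", "year", "subsidence_actual",
    "subsidence_q10_step1", "subsidence_q50_step1", "subsidence_q90_step1",
    "subsidence_q10_step2", "subsidence_q50_step2", "subsidence_q90_step2",
]
targets, quantiles, steps = detect_quantile_layout(columns)
```

Comparing the detected quantiles with the user-supplied list is one way a consistency check of this kind could be done; the behavior of the actual validators may differ.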
Mathematically, let \(X \in \mathbb{R}^{n \times m}\) represent the wide-format DataFrame, where each row corresponds to one sample and each forecast step is stored in separate columns. The function reshapes \(X\) into a long-format DataFrame \(Y \in \mathbb{R}^{(n \cdot s) \times p}\), where \(s\) is the forecast horizon and \(p\) is the number of output columns after merging forecast step values.
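The reshape described above can be sketched with plain pandas. This is an illustration under stated assumptions, not the implementation behind step_to_long: the toy data, the helper column names (index, column, base, value), and the explicit step column in the result are choices made for this example, and the real function's output layout may differ.

```python
import pandas as pd

# Toy wide-format forecast: n = 2 samples, s = 2 steps, 3 quantiles.
wide = pd.DataFrame({
    "longitude": [113.0, 113.5],
    "latitude": [22.0, 22.5],
    "subsidence_q10_step1": [1.0, 2.0],
    "subsidence_q50_step1": [1.5, 2.5],
    "subsidence_q90_step1": [2.0, 3.0],
    "subsidence_q10_step2": [1.1, 2.1],
    "subsidence_q50_step2": [1.6, 2.6],
    "subsidence_q90_step2": [2.1, 3.1],
})

id_cols = ["longitude", "latitude"]
value_cols = [c for c in wide.columns if "_step" in c]

# Stack every forecast column into (sample, column name, value) rows.
melted = wide.reset_index().melt(
    id_vars=["index"] + id_cols,
    value_vars=value_cols,
    var_name="column",
    value_name="value",
)

# Split e.g. "subsidence_q10_step1" into base "subsidence_q10" and step 1.
parts = melted["column"].str.extract(r"^(?P<base>.+)_step(?P<step>\d+)$")
melted["base"] = parts["base"]
melted["step"] = parts["step"].astype(int)

# One row per (sample, step); one column per unified quantile name.
long_df = (
    melted.pivot_table(
        index=["index"] + id_cols + ["step"],
        columns="base",
        values="value",
    )
    .reset_index()
    .drop(columns="index")
)
long_df.columns.name = None
```

Here \(n = 2\) and \(s = 2\), so the result has \(n \cdot s = 4\) rows, and \(p = 6\) columns: two spatial columns, the step index, and three unified quantile columns.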
See also
_step_to_long_q : Converts multi-step quantile forecasts to long format.
_step_to_long_pred : Converts multi-step point forecasts to long format.
detect_digits : Extracts numeric values from strings for quantile detection.