fusionlab.nn.utils.prepare_spatial_future_data
- fusionlab.nn.utils.prepare_spatial_future_data(final_processed_data, feature_columns, dynamic_feature_indices, sequence_length=1, dt_col='date', static_feature_names=None, forecast_horizon=None, future_years=None, encoded_cat_columns=None, scaling_params=None, spatial_cols=None, squeeze_last=False, verbosity=0)
Prepare future static and dynamic inputs for making predictions.
This function prepares the static and dynamic inputs required to forecast future values in time series data. It groups the provided dataset by location_id, extracts the last sequence_length data points for each location, and generates future inputs for prediction over the defined forecast_horizon.
The function handles both integer and datetime representations of dt_col, extracting the year from datetime columns when necessary. It also allows flexibility in specifying static features and encoded categorical variables.
Future time values are standardized as
\[\text{scaled\_time} = \frac{\text{future\_time} - \mu}{\sigma}\]
where \(\mu\) and \(\sigma\) are the mean and standard deviation of dt_col, taken from scaling_params or computed from the data when scaling_params is not provided.
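As a worked instance of the standardization of future time values, the minimal sketch below applies (future_time - mean) / std using the mean and standard deviation from the scaling_params example on this page ({'year': {'mean': 2000, 'std': 10}}). The helper scale_time is hypothetical and is not part of fusionlab; it only illustrates the arithmetic.

```python
def scale_time(future_time, mean, std):
    """Standardize a future time value: (future_time - mean) / std.

    Hypothetical helper for illustration; not a fusionlab function.
    """
    if std == 0:
        raise ValueError("std must be non-zero to scale the time value")
    return (future_time - mean) / std


# Scaling parameters copied from the example given for scaling_params.
scaling_params = {"year": {"mean": 2000, "std": 10}}
params = scaling_params["year"]

scaled_2021 = scale_time(2021, params["mean"], params["std"])
print(scaled_2021)  # 2.1
```

With these parameters, the year 2021 lies 2.1 standard deviations above the mean.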
- final_processed_data (pandas.DataFrame)
The processed DataFrame containing all features and targets. Must include the location_id column and the specified dt_col.
- feature_columns (List[str])
List of feature column names to be used for dynamic input preparation.
- dynamic_feature_indices (List[int])
Indices of dynamic features in feature_columns. These features are considered time-dependent and are used to prepare dynamic inputs.
- sequence_length (int, optional)
The number of past time steps to include in each input sequence. Default is 1.
- dt_col (str, optional)
The name of the time-related column in final_processed_data. Defaults to 'date'.
- static_feature_names (List[str], optional)
List of static feature column names. If not provided, defaults to ['longitude', 'latitude'] plus any encoded_cat_columns.
- forecast_horizon (int, optional)
The number of future time steps to predict. If set to None, the function defaults to predicting the next immediate time step.
- future_years (List[int], optional)
List of future years to predict. Its length must equal forecast_horizon if forecast_horizon is provided.
- encoded_cat_columns (List[str], optional)
List of encoded categorical column names to be treated as static features.
- scaling_params (Dict[str, Dict[str, float]], optional)
Dictionary containing scaling parameters (mean and standard deviation) for features. Example: {'year': {'mean': 2000, 'std': 10}}. If not provided, the function computes the mean and std for the dt_col.
- squeeze_last (bool, default=False)
Whether to squeeze the last axis, which corresponds to the output dimension y, when its size is 1.
- verbosity (int, optional)
Verbosity level from 0 to 7 for debugging and understanding the process. Higher values produce more detailed logs.
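To make the relationship between feature_columns, dynamic_feature_indices, static_feature_names, and encoded_cat_columns concrete, here is a minimal sketch of how the documented defaults could be resolved. The helper split_feature_names is hypothetical; it mirrors the parameter descriptions on this page and is not the library's implementation.

```python
def split_feature_names(feature_columns, dynamic_feature_indices,
                        static_feature_names=None, encoded_cat_columns=None):
    """Resolve dynamic and static feature names (illustrative helper only)."""
    # Dynamic features are picked by position from feature_columns.
    dynamic_names = [feature_columns[i] for i in dynamic_feature_indices]

    # Documented default: ['longitude', 'latitude'] plus any encoded_cat_columns.
    if static_feature_names is None:
        static_feature_names = ["longitude", "latitude"] + list(encoded_cat_columns or [])

    return dynamic_names, list(static_feature_names)


feature_cols = ["year", "temperature", "rainfall", "encoded_cat"]
dynamic_names, static_names = split_feature_names(
    feature_cols,
    dynamic_feature_indices=[0, 1, 2],
    encoded_cat_columns=["encoded_cat"],
)
print(dynamic_names)  # ['year', 'temperature', 'rainfall']
print(static_names)   # ['longitude', 'latitude', 'encoded_cat']
```

Three dynamic and three static names are consistent with the array shapes reported in the usage example on this page.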
- Tuple[np.ndarray, np.ndarray, List[int], List[int], List[float], List[float]]
A tuple containing:
- future_static_inputs (numpy.ndarray): Array of future static inputs with shape (num_samples, num_static_vars, 1).
- future_dynamic_inputs (numpy.ndarray): Array of future dynamic inputs with shape (num_samples, sequence_length, num_dynamic_vars, 1).
- future_years_list (List[int]): List of future time values corresponding to each sample.
- location_ids_list (List[int]): List of location IDs corresponding to each sample.
- longitudes (List[float]): List of longitude values corresponding to each sample.
- latitudes (List[float]): List of latitude values corresponding to each sample.
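The returned arrays carry a trailing axis of size 1, and the four lists are aligned with the sample axis. The sketch below uses placeholder arrays of the documented shapes to show what squeezing that trailing axis does and how the per-sample lists can be paired up. All values are illustrative stand-ins rather than library output, and whether squeeze_last is applied to both arrays is an assumption.

```python
import numpy as np

# Placeholder arrays with the documented shapes (2 samples, 3 static
# variables, sequences of length 2, 3 dynamic variables).
future_static = np.zeros((2, 3, 1))
future_dynamic = np.zeros((2, 2, 3, 1))

# Dropping the trailing singleton axis, as a squeeze of the last axis would.
static_2d = np.squeeze(future_static, axis=-1)
dynamic_3d = np.squeeze(future_dynamic, axis=-1)
print(static_2d.shape)   # (2, 3)
print(dynamic_3d.shape)  # (2, 2, 3)

# Stand-in metadata lists, one entry per sample.
future_years = [2021, 2021]
location_ids = [1, 2]
longitudes = [10.0, 20.0]
latitudes = [50.0, 60.0]

# Pair each sample index with its metadata so predictions can be labelled.
records = [
    {"location_id": loc, "year": year, "longitude": lon, "latitude": lat}
    for loc, year, lon, lat in zip(location_ids, future_years, longitudes, latitudes)
]
print(records[0])
```

Keeping the metadata in sample order makes it straightforward to attach a model's predictions to the right location and year.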
>>> from fusionlab.nn.utils import prepare_spatial_future_data
>>> import pandas as pd
>>> data = pd.DataFrame({
...     'location_id': [1, 1, 1, 2, 2, 2],
...     'year': [2018, 2019, 2020, 2018, 2019, 2020],
...     'longitude': [10.0, 10.0, 10.0, 20.0, 20.0, 20.0],
...     'latitude': [50.0, 50.0, 50.0, 60.0, 60.0, 60.0],
...     'temperature': [15, 16, 15.5, 20, 21, 20.5],
...     'rainfall': [100, 110, 105, 200, 210, 205],
...     'encoded_cat': [1, 1, 1, 2, 2, 2]
... })
>>> feature_cols = ['year', 'temperature', 'rainfall', 'encoded_cat']
>>> dynamic_indices = [0, 1, 2]
>>> future_static, future_dynamic, future_years, loc_ids, longs, lats = prepare_spatial_future_data(
...     final_processed_data=data,
...     feature_columns=feature_cols,
...     dynamic_feature_indices=dynamic_indices,
...     sequence_length=2,
...     forecast_horizon=1,
...     future_years=[2021],
...     encoded_cat_columns=['encoded_cat'],
...     verbosity=5,
...     dt_col='year'
... )
>>> print(future_static.shape)
(2, 3, 1)
>>> print(future_dynamic.shape)
(2, 2, 3, 1)
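The dynamic shape (2, 2, 3, 1) in the example follows from taking the last sequence_length rows of each location. The sketch below reproduces only that grouping step with pandas and NumPy on the same toy data. It is an approximation under stated assumptions: it does not scale the time column or inject future time values, both of which the real function may do.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "location_id": [1, 1, 1, 2, 2, 2],
    "year": [2018, 2019, 2020, 2018, 2019, 2020],
    "temperature": [15, 16, 15.5, 20, 21, 20.5],
    "rainfall": [100, 110, 105, 200, 210, 205],
})
dynamic_cols = ["year", "temperature", "rainfall"]
sequence_length = 2

sequences = []
# Sort by time so that tail() returns the most recent rows of each location.
for loc_id, group in data.sort_values("year").groupby("location_id"):
    last_rows = group[dynamic_cols].tail(sequence_length)
    sequences.append(last_rows.to_numpy(dtype=float))

# Stack to (num_samples, sequence_length, num_dynamic_vars), then add the
# trailing axis of size 1 described in the return value.
future_dynamic = np.stack(sequences)[..., np.newaxis]
print(future_dynamic.shape)  # (2, 2, 3, 1)
```

Each location contributes one sample, so two locations with sequence_length=2 and three dynamic columns give the shape printed above.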
The function handles both integer and datetime representations of the dt_col. If dt_col is a datetime type, the year is extracted for scaling purposes.
If forecast_horizon is set to None, the function defaults to generating data for the next immediate time step based on the last entry in the time column.
Ensure that the length of future_years matches forecast_horizon if forecast_horizon is provided.
The static_feature_names parameter allows for flexibility in specifying which static features to include. If not provided, it defaults to ['longitude', 'latitude'] plus any encoded_cat_columns.
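The notes on datetime handling and on forecast_horizon can be sketched as follows. Both helpers are hypothetical and not part of fusionlab. Two points are assumptions rather than documented behavior: the next immediate time step is taken to be the last year plus one, and consecutive years are generated when forecast_horizon is given without future_years.

```python
import pandas as pd


def time_values_as_years(series):
    """Return years from a time column that is either datetime or integer."""
    if pd.api.types.is_datetime64_any_dtype(series):
        return series.dt.year
    return series


def resolve_future_years(last_year, forecast_horizon=None, future_years=None):
    """Choose the future years to predict (illustrative logic only)."""
    if forecast_horizon is None:
        # Assumption: the next immediate step is one year after the last entry.
        return [last_year + 1]
    if future_years is None:
        # Assumption: consecutive years when no explicit list is supplied.
        return [last_year + step for step in range(1, forecast_horizon + 1)]
    if len(future_years) != forecast_horizon:
        raise ValueError("len(future_years) must equal forecast_horizon")
    return list(future_years)


dates = pd.Series(pd.to_datetime(["2019-06-01", "2020-06-01"]))
years = time_values_as_years(dates)
last_year = int(years.iloc[-1])

print(resolve_future_years(last_year))                  # [2021]
print(resolve_future_years(last_year, 2, [2021, 2022])) # [2021, 2022]
```

Checking the length up front surfaces a mismatch between future_years and forecast_horizon before any arrays are built.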
prepare_future_data : Main function for preparing future data inputs.
- Parameters:
final_processed_data (DataFrame)
feature_columns (List[str])
dynamic_feature_indices (List[int])
sequence_length (int)
dt_col (str)
static_feature_names (List[str] | None)
forecast_horizon (int | None)
future_years (List[int] | None)
encoded_cat_columns (List[str] | None)
scaling_params (Dict[str, Dict[str, float]] | None)
spatial_cols (Tuple[str, str])
squeeze_last (bool)
verbosity (int)
- Return type:
Tuple[ndarray, ndarray, List[int], List[int], List[float], List[float]]