fusionlab.nn.utils.prepare_spatial_future_data
- fusionlab.nn.utils.prepare_spatial_future_data(final_processed_data, feature_columns, dynamic_feature_indices, sequence_length=1, dt_col='date', static_feature_names=None, forecast_horizon=None, future_years=None, encoded_cat_columns=None, scaling_params=None, spatial_cols=None, squeeze_last=False, verbosity=0)
Prepare future static and dynamic inputs for making predictions.
This function prepares the static and dynamic inputs required to forecast future values in time series data. It groups the provided dataset by location_id, extracts the last sequence of data points according to the specified sequence_length, and generates future inputs for prediction over the defined forecast_horizon.
The function handles both integer and datetime representations of dt_col, extracting the year from datetime columns when necessary. It also allows flexibility in specifying static features and encoded categorical variables.
Future time values are standardized with the mean \(\mu\) and standard deviation \(\sigma\) of dt_col:
\[\text{scaled\_time} = \frac{\text{future\_time} - \mu}{\sigma}\]
- Parameters:
final_processed_data (pandas.DataFrame) – The processed DataFrame containing all features and targets. Must include the location_id column and the specified dt_col.
feature_columns (List[str]) – List of feature column names to be used for dynamic input preparation.
dynamic_feature_indices (List[int]) – Indices of dynamic features in feature_columns. These features are considered time-dependent and are used to prepare dynamic inputs.
sequence_length (int, optional) – The number of past time steps to include in each input sequence. Default is 1.
dt_col (str, optional) – The name of the time-related column in final_processed_data. Defaults to 'date'.
static_feature_names (List[str], optional) – List of static feature column names. If not provided, defaults to ['longitude', 'latitude'] plus any encoded_cat_columns.
forecast_horizon (int, optional) – The number of future time steps to predict. If set to None, the function defaults to predicting the next immediate time step.
future_years (List[int], optional) – List of future years to predict. Its length must equal forecast_horizon if forecast_horizon is provided.
encoded_cat_columns (List[str], optional) – List of encoded categorical column names to be treated as static features.
scaling_params (Dict[str, Dict[str, float]], optional) – Dictionary containing scaling parameters (mean and standard deviation) for features. Example: {'year': {'mean': 2000, 'std': 10}}. If not provided, the function computes the mean and std for the dt_col.
squeeze_last (bool, default False) – Squeeze the last axis, which corresponds to the output dimension y, if it equals 1.
verbosity (int, optional) – Verbosity level from 0 to 7 for debugging and understanding the process. Higher values produce more detailed logs.
spatial_cols (Tuple[str, str])
- Returns:
A tuple containing:
future_static_inputs (numpy.ndarray) – Array of future static inputs with shape (num_samples, num_static_vars, 1).
future_dynamic_inputs (numpy.ndarray) – Array of future dynamic inputs with shape (num_samples, sequence_length, num_dynamic_vars, 1).
future_years_list (List[int]) – List of future time values corresponding to each sample.
location_ids_list (List[int]) – List of location IDs corresponding to each sample.
longitudes (List[float]) – List of longitude values corresponding to each sample.
latitudes (List[float]) – List of latitude values corresponding to each sample.
- Return type:
Tuple[np.ndarray, np.ndarray, List[int], List[int], List[float], List[float]]
Examples
>>> from fusionlab.nn.utils import prepare_spatial_future_data
>>> import pandas as pd
>>> data = pd.DataFrame({
...     'location_id': [1, 1, 1, 2, 2, 2],
...     'year': [2018, 2019, 2020, 2018, 2019, 2020],
...     'longitude': [10.0, 10.0, 10.0, 20.0, 20.0, 20.0],
...     'latitude': [50.0, 50.0, 50.0, 60.0, 60.0, 60.0],
...     'temperature': [15, 16, 15.5, 20, 21, 20.5],
...     'rainfall': [100, 110, 105, 200, 210, 205],
...     'encoded_cat': [1, 1, 1, 2, 2, 2]
... })
>>> feature_cols = ['year', 'temperature', 'rainfall', 'encoded_cat']
>>> dynamic_indices = [0, 1, 2]
>>> future_static, future_dynamic, future_years, loc_ids, longs,\
... lats = prepare_spatial_future_data(
...     final_processed_data=data,
...     feature_columns=feature_cols,
...     dynamic_feature_indices=dynamic_indices,
...     sequence_length=2,
...     forecast_horizon=1,
...     future_years=[2021],
...     encoded_cat_columns=['encoded_cat'],
...     verbosity=5,
...     dt_col='year'
... )
>>> print(future_static.shape)
(2, 3, 1)
>>> print(future_dynamic.shape)
(2, 2, 3, 1)
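To make the "group by location_id, keep the last sequence_length rows" step more concrete, the following is a minimal sketch written with plain pandas. It is not fusionlab's implementation: the helper name `last_sequences` is hypothetical, and sorting by the time column and skipping locations with too little history are assumptions made for illustration.

```python
import pandas as pd

def last_sequences(df, feature_columns, sequence_length, dt_col="year"):
    """Hypothetical helper: last `sequence_length` rows of each location."""
    sequences = {}
    for loc_id, group in df.groupby("location_id"):
        # Assumption: rows are ordered by the time column before slicing.
        group = group.sort_values(dt_col)
        if len(group) < sequence_length:
            continue  # Assumption: locations with too little history are skipped.
        window = group[feature_columns].tail(sequence_length)
        sequences[loc_id] = window.to_numpy(dtype=float)
    return sequences

data = pd.DataFrame({
    "location_id": [1, 1, 1, 2, 2, 2],
    "year": [2018, 2019, 2020, 2018, 2019, 2020],
    "temperature": [15, 16, 15.5, 20, 21, 20.5],
    "rainfall": [100, 110, 105, 200, 210, 205],
})

# One (sequence_length, num_features) array per location.
seqs = last_sequences(data, ["year", "temperature", "rainfall"], sequence_length=2)
```

With sequence_length=2, each location contributes its 2019 and 2020 rows, which is consistent with the sequence_length axis of the future_dynamic shape printed above.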
Notes
The function handles both integer and datetime representations of the dt_col. If dt_col is a datetime type, the year is extracted for scaling purposes.
If forecast_horizon is set to None, the function defaults to generating data for the next immediate time step based on the last entry in the time column.
Ensure that the length of future_years matches forecast_horizon if forecast_horizon is provided.
The static_feature_names parameter allows for flexibility in specifying which static features to include. If not provided, it defaults to ['longitude', 'latitude'] plus any encoded_cat_columns.
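The time scaling mentioned in these notes can be sketched as follows. This is an illustration only, not the library's code: `scale_future_time` is a hypothetical helper, and it assumes the scaling_params layout shown in the parameter description ({'year': {'mean': ..., 'std': ...}}) and that datetime-like values are reduced to their year before scaling.

```python
from datetime import date

def scale_future_time(future_time, scaling_params, dt_col="year"):
    """Hypothetical helper: standardize a future time value."""
    # Assumption: datetime-like values are reduced to their year first.
    year = future_time.year if hasattr(future_time, "year") else future_time
    params = scaling_params[dt_col]
    # scaled_time = (future_time - mean) / std
    return (year - params["mean"]) / params["std"]

# Scaling parameters in the documented layout.
scaling_params = {"year": {"mean": 2000, "std": 10}}

scaled_int = scale_future_time(2021, scaling_params)               # integer year
scaled_date = scale_future_time(date(2021, 6, 1), scaling_params)  # datetime input
```

Both calls give the same scaled value, (2021 - 2000) / 10 = 2.1, because only the year of the datetime input is used.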
See also
prepare_future_data: Main function for preparing future data inputs.