fusionlab.nn.utils.compute_forecast_horizon

fusionlab.nn.utils.compute_forecast_horizon(data=None, dt_col=None, start_pred=None, end_pred=None, error='raise', verbose=1)

Compute the forecast horizon for time series forecasting models.

This function calculates the number of future time steps (forecast_horizon) a model should predict, based on the provided data or the specified prediction dates. It infers the frequency of the data and computes the horizon at that frequency. The prediction bounds start_pred and end_pred may be given as date strings, datetime objects, or integer years.

Parameters:

data (pandas.DataFrame, pandas.Series, list, or numpy.ndarray, optional) – The dataset containing datetime information. If a pandas.DataFrame is provided, the dt_col parameter must be specified to indicate which column contains the datetime data. For pandas.Series, list, or numpy.ndarray, the function attempts to infer the frequency directly.
dt_col (str, optional) – The name of the column in data that contains datetime information. This parameter is required if data is a pandas.DataFrame. Example: dt_col='timestamp'
start_pred (str, int, or datetime-like) – The starting point for forecasting. This can be a date string (e.g., ‘2023-04-10’), a datetime object, or an integer representing a year (e.g., 2024). If an integer is provided, it is interpreted as a year, and a warning is issued to inform the user of this interpretation.
end_pred (str, int, or datetime-like) – The ending point for forecasting. Like start_pred, this can be a date string, a datetime object, or an integer representing a year. The function calculates the forecast horizon from the difference between start_pred and end_pred.
error ({'raise', 'warn', 'ignore'}, default 'raise') –
Defines the error handling behavior when encountering issues such as invalid input types, missing date columns, or unparseable dates.
- 'raise': Raises a ValueError when an error is encountered.
- 'warn': Emits a warning and attempts to proceed with default behavior.
verbose (int, default 1) –
Controls the level of verbosity for debug information.
- 0: No output.
- 1: Minimal output (e.g., starting message).
- 2: Intermediate output (e.g., detected dates, computed horizons).
- 3: Detailed output (e.g., types of predictions, inferred frequencies).

Returns:

The computed forecast_horizon representing the number of steps ahead the model should predict. Returns None if an error occurs and error is set to ‘warn’.

Return type:

int or None

Raises:

ValueError – If invalid parameters are provided and error is set to ‘raise’.
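
The error parameter describes a common raise/warn policy. The block below is a minimal, standalone sketch of such a policy, not the library's implementation: parse_or_report is a hypothetical helper, and the silent behavior shown for 'ignore' is an assumption, since this page does not describe that mode.

```python
import warnings
from datetime import datetime


def parse_or_report(value, error="raise"):
    """Hypothetical helper illustrating a raise/warn/ignore error policy."""
    if error not in ("raise", "warn", "ignore"):
        raise ValueError(f"Unknown error mode: {error!r}")
    try:
        # Accepts ISO-formatted date strings such as '2023-04-10'.
        return datetime.fromisoformat(value)
    except (TypeError, ValueError) as exc:
        msg = f"Could not parse {value!r} as a date: {exc}"
        if error == "raise":
            raise ValueError(msg) from exc
        if error == "warn":
            warnings.warn(msg)
        # Assumption: 'ignore' stays silent. Both non-raising modes return None.
        return None
```

With this sketch, parse_or_report('not-a-date', error='warn') emits a warning and returns None, which mirrors the documented int or None return type.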

Examples

>>> from fusionlab.nn.utils import compute_forecast_horizon
>>> import pandas as pd
>>> import numpy as np
>>> from datetime import datetime, timedelta
>>>
>>> # Example 1: Using a DataFrame with a Date Column
>>> df = pd.DataFrame({
...     'date': pd.date_range(start='2023-01-01', periods=100, freq='D'),
...     'value': np.random.randn(100)
... })
>>> horizon = compute_forecast_horizon(
...     data=df,
...     dt_col='date',
...     start_pred='2023-04-10',
...     end_pred='2023-04-20',
...     error='raise',
...     verbose=3
... )
>>> print(f"Forecast Horizon: {horizon}")
Forecast Horizon: 11

>>> # Example 2: Using a List of Datetimes
>>> dates = [datetime(2023, 1, 1) + timedelta(days=i) for i in range(100)]
>>> horizon = compute_forecast_horizon(
...     data=dates,
...     start_pred='2023-04-10',
...     end_pred='2023-04-20',
...     error='warn',
...     verbose=2
... )
>>> print(f"Forecast Horizon: {horizon}")
Forecast Horizon: 11

>>> # Example 3: Handling Integer Years
>>> horizon = compute_forecast_horizon(
...     start_pred=2024,
...     end_pred=2030,
...     error='raise',
...     verbose=1
... )
>>> print(f"Forecast Horizon: {horizon}")
Forecast Horizon: 7

>>> # Example 4: Without Providing Data (Assuming Frequency Based on Prediction Dates)
>>> horizon = compute_forecast_horizon(
...     start_pred='2023-04-10',
...     end_pred='2023-04-20',
...     error='raise',
...     verbose=1
... )
>>> print(f"Forecast Horizon: {horizon}")
Forecast Horizon: 11
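
The outputs shown above (11 steps from 2023-04-10 to 2023-04-20, and 7 steps from 2024 to 2030) are consistent with counting both endpoints. The following standard-library sketch reproduces that arithmetic under this assumption. The helper names are hypothetical and are not part of fusionlab.

```python
from datetime import date


def inclusive_day_horizon(start: str, end: str) -> int:
    """Hypothetical helper: number of daily steps from start to end, inclusive."""
    start_d = date.fromisoformat(start)
    end_d = date.fromisoformat(end)
    if start_d > end_d:
        return 0  # a start after the end yields an empty horizon
    return (end_d - start_d).days + 1


def inclusive_year_horizon(start_year: int, end_year: int) -> int:
    """Hypothetical helper: number of yearly steps, inclusive of both years."""
    return max(0, end_year - start_year + 1)


print(inclusive_day_horizon("2023-04-10", "2023-04-20"))  # 11
print(inclusive_year_horizon(2024, 2030))                 # 7
```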

Notes

When data is not provided, the function relies solely on the difference between start_pred and end_pred to compute the forecast horizon. In such cases, if the frequency cannot be inferred, the horizon is calculated based on the largest possible time unit (years, months, weeks, days).
If start_pred is after end_pred, the function either returns 0 with a warning or raises an error, depending on the error parameter.
The function attempts to infer the frequency of the data using pandas utilities. If the frequency cannot be inferred, it defaults to calculating the horizon based on the time difference in the most significant unit.
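
The notes above mention frequency inference with pandas utilities without naming them. As a minimal sketch, assuming pandas.infer_freq and pandas.date_range are suitable stand-ins (the function's actual internals may differ), the horizon can be counted at the inferred frequency like this:

```python
import pandas as pd

# Daily timestamps, mirroring the 'date' column from Example 1.
dates = pd.date_range(start="2023-01-01", periods=100, freq="D")

# infer_freq returns a frequency string such as 'D', or None when no
# regular frequency can be determined from the timestamps.
freq = pd.infer_freq(dates)

# Generate the steps between the prediction bounds at that frequency;
# date_range includes both endpoints by default.
steps = pd.date_range(start="2023-04-10", end="2023-04-20", freq=freq)
horizon = len(steps)

print(freq, horizon)  # D 11
```

When infer_freq returns None, a fallback is needed. The notes describe falling back to the time difference expressed in its most significant unit.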