fusionlab.utils.ts_utils.decompose_ts

fusionlab.utils.ts_utils.decompose_ts(df, value_col, dt_col=None, method='additive', strategy='STL', seasonal_period=12, robust=True)

Decompose a time series into trend, seasonal, and residual components while keeping the other features intact.

In practice, the time series \(Y_t\) is broken down into three main components [1] [2]:

\[Y_t = T_t + S_t + R_t\]

where \(T_t\) is the trend, \(S_t\) is the seasonal component, and \(R_t\) is the residual or irregular term. If a multiplicative method is used, the decomposition can be modeled as:

\[Y_t = T_t \times S_t \times R_t,\]

or equivalently, when all components are strictly positive, in logarithms:

\[\log(Y_t) = \log(T_t) + \log(S_t) + \log(R_t).\]
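
The equivalence between the multiplicative form and its logarithmic counterpart can be checked numerically. The snippet below is a minimal sketch using NumPy only; the synthetic trend, seasonal, and residual arrays are illustrative assumptions and are not produced by decompose_ts.

```python
import numpy as np

# Hypothetical, strictly positive components over 48 time steps.
t = np.arange(48)
trend = 10.0 + 0.5 * t                              # T_t
seasonal = 1.0 + 0.2 * np.sin(2 * np.pi * t / 12)   # S_t, oscillates around 1
residual = np.full(t.shape, 1.01)                   # R_t, close to 1

# Multiplicative model: Y_t = T_t * S_t * R_t
y = trend * seasonal * residual

# Taking logarithms turns the product into a sum.
log_sum = np.log(trend) + np.log(seasonal) + np.log(residual)
print(np.allclose(np.log(y), log_sum))
```

The identity only holds when every component is strictly positive, which is why a multiplicative decomposition is not appropriate for series containing zero or negative values.
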
Parameters:
  • df (pandas.DataFrame) – The input DataFrame containing the time series data along with potential additional features.

  • value_col (str) – The name of the column holding the primary time series to be decomposed. This column is used to derive \(T_t, S_t, R_t\).

  • dt_col (str, optional) – The column holding datetime information, if needed for validations or indexing. If None, the function assumes the time series is already aligned or validated.

  • method ({'additive', 'multiplicative'}, optional) –

    The type of decomposition model:

    • 'additive': Assumes the data can be expressed as the sum of its components.

    • 'multiplicative': Assumes the data is the product of its components. Useful when the amplitude of the seasonality scales with the level of the series.

  • strategy ({'STL', 'SDT'}, optional) –

    Determines how the decomposition is performed:

    • 'STL': Uses statsmodels.tsa.seasonal.STL (Seasonal-Trend decomposition using LOESS).

    • 'SDT': Uses classic statsmodels.tsa.seasonal.seasonal_decompose().

  • seasonal_period (int, optional) – The periodicity of the seasonal component, in number of observations per cycle. For example, 12 for monthly data exhibiting yearly seasonality. With strategy='STL', the value must be an odd integer >= 3.

  • robust (bool, optional) – Whether to use the robust variant of STL, which down-weights outliers during fitting. Only applies when strategy='STL'.

Returns:

decomposed_df – A new DataFrame containing columns for trend, seasonal, and residual, along with the original time series column and any other existing features in df. This allows further analysis without losing context of the other data.

Return type:

pandas.DataFrame

Notes

STL decomposition (strategy='STL') is typically more flexible than the classical approach, particularly for handling complex seasonal patterns or outliers. The seasonal period must be an odd integer >= 3 in STL.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from fusionlab.utils.ts_utils import decompose_ts
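>>> # Generate 100 days of strictly positive synthetic data
>>> # (the multiplicative model below cannot handle zero or negative values)
>>> np.random.seed(0)
>>> df = pd.DataFrame({
...     'time': pd.date_range(start='2020-01-01',
...                           periods=100,
...                           freq='D'),
...     'value': np.abs(np.random.randn(100).cumsum()) + 5
... })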
>>> df.set_index('time', inplace=True)
>>> # Decompose using STL (Seasonal-Trend decomposition)
>>> decomposed_df = decompose_ts(
...     df,
...     value_col='value',
...     method='additive',
...     strategy='STL',
...     seasonal_period=7  # weekly cycle for daily data; odd, as STL requires
... )
>>> print(decomposed_df.head())
>>> # Decompose using SDT (Seasonal Decomposition of Time Series)
>>> decomposed_df_sdt = decompose_ts(
...     df,
...     value_col='value',
...     method='multiplicative',
...     strategy='SDT',
...     seasonal_period=12
... )
>>> print(decomposed_df_sdt.head())

See also

STL

Seasonal and Trend decomposition using LOESS, from statsmodels.tsa.seasonal.

seasonal_decompose

Classic decomposition method, from statsmodels.tsa.seasonal.

References