fusionlab.utils.ts_utils.infer_decomposition_method
- fusionlab.utils.ts_utils.infer_decomposition_method(df, dt_col, period=12, return_components=False, view=False, figsize=(10, 8), method='heuristic', verbose=0)
Determine the best decomposition approach for a time series, offering two modes:
- method='heuristic': Checks whether all data points are strictly positive and chooses multiplicative if they are, or additive otherwise. This approach does not evaluate the fit.
- method='variance_comparison': Performs both additive and multiplicative decompositions, compares residual variances, and chooses the method with the smaller residual variance.
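The heuristic mode described above amounts to a single positivity check. The following is a minimal sketch of that rule, not fusionlab's actual implementation; the helper name `heuristic_model` and its `value_col` argument are hypothetical and chosen only for illustration.

```python
import pandas as pd


def heuristic_model(df, value_col):
    """Pick a decomposition model from the sign of the data alone.

    Hypothetical helper: returns 'multiplicative' when every value in
    `value_col` is strictly positive, and 'additive' otherwise.
    """
    all_positive = bool((df[value_col] > 0).all())
    return "multiplicative" if all_positive else "additive"


df = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=5, freq="MS"),
    "Sales": [100, 120, 140, 135, 150],
})
print(heuristic_model(df, "Sales"))
```

Because the check never fits a model, a strictly positive series with constant-amplitude seasonality would still be labelled multiplicative under this rule.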
\[\text{Additive: } Y_t = T_t + S_t + \epsilon_t\]
\[\text{Multiplicative: } Y_t = T_t \times S_t \times \epsilon_t \quad\text{or}\quad \log(Y_t) = \log(T_t) + \log(S_t) + \log(\epsilon_t).\]
- Parameters:
  - df (pandas.DataFrame) – The DataFrame containing time series data. Must include the datetime column dt_col and at least one column of values to decompose.
  - dt_col (str) – The column name representing datetime. This column is set as the index for decomposition.
  - period (int, optional) – The seasonal period (frequency) for decomposition. Commonly 12 for monthly data showing yearly seasonality.
  - return_components (bool, optional) – If True, returns a dictionary of decomposition components (trend, seasonal, residual). Otherwise, returns only the chosen model.
  - view (bool, optional) – If True, displays histograms of residuals in the variance_comparison mode to facilitate comparison.
  - figsize (tuple of (float, float), optional) – Figure dimensions for residual plots.
  - method ({'heuristic', 'variance_comparison'}, optional) – Strategy for deciding on the decomposition approach:
    - 'heuristic': If all data points are positive, uses 'multiplicative'; else 'additive'.
    - 'variance_comparison': Tries both models, compares the variance of residuals, and picks the one with smaller residual variance.
  - verbose ({0, 1, 2, 3}, optional) – Control the amount of logging:
    - 0 : No messages printed.
    - 1 : Basic info about chosen model and decomposition.
    - 2 : Additional details about data checks.
    - 3 : Very detailed logs, including internal states and partial results.
- Returns:
  - best_method (str) – The chosen decomposition type: 'additive' or 'multiplicative'.
  - components (dict, optional) – Returned only if return_components=True. Contains the keys 'trend', 'seasonal', and 'residual' mapped to pandas.Series objects from the best decomposition.
Notes
Selecting an appropriate decomposition model matters for capturing both trend and seasonality accurately [1]. The variance comparison approach is the more data-driven of the two modes, because it bases the choice on the residuals of both fitted decompositions rather than on the sign of the values alone [2].
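The variance comparison idea can be sketched without fusionlab. The code below is an illustration under stated assumptions, not the library's implementation: it uses a hand-written classical moving-average decomposition in NumPy (rather than any specific library routine), and the names `classical_decompose` and `choose_by_residual_variance` are hypothetical. Whether fusionlab rescales the residuals before comparing them is not stated in the source; this sketch compares them as-is.

```python
import numpy as np


def classical_decompose(values, period, model="additive"):
    """Classical moving-average decomposition (illustrative helper)."""
    y = np.asarray(values, dtype=float)
    n = len(y)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    # Centered moving average; an even period uses half weights at both ends.
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(n, np.nan)  # edges stay NaN, where the window does not fit
    trend[half:n - half] = np.convolve(y, weights, mode="valid")

    multiplicative = model == "multiplicative"
    detrended = y / trend if multiplicative else y - trend
    # Average the detrended values at each position within the period.
    phase_means = np.array(
        [np.nanmean(detrended[i::period]) for i in range(period)]
    )
    if multiplicative:
        phase_means /= phase_means.mean()  # seasonal factors average to 1
    else:
        phase_means -= phase_means.mean()  # seasonal offsets average to 0
    seasonal = np.tile(phase_means, n // period + 1)[:n]
    resid = y / (trend * seasonal) if multiplicative else y - trend - seasonal
    return trend, seasonal, resid


def choose_by_residual_variance(values, period):
    """Return the model with the smaller residual variance, plus both variances."""
    y = np.asarray(values, dtype=float)
    variances = {
        "additive": np.nanvar(classical_decompose(y, period, "additive")[2])
    }
    # A multiplicative fit is only attempted for strictly positive data.
    if (y > 0).all():
        variances["multiplicative"] = np.nanvar(
            classical_decompose(y, period, "multiplicative")[2]
        )
    return min(variances, key=variances.get), variances


t = np.arange(48)
season = np.sin(2 * np.pi * t / 12)
growing_amplitude = (10 + t) * (1 + 0.3 * season)  # seasonality scales with level
constant_amplitude = 50 + t + 5 * season           # seasonality independent of level
print(choose_by_residual_variance(growing_amplitude, 12)[0])
print(choose_by_residual_variance(constant_amplitude, 12)[0])
```

Note that additive residuals are in the units of the data while multiplicative residuals are ratios near 1, so a raw variance comparison depends on the scale of the series.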
Examples
>>> import pandas as pd
>>> from fusionlab.utils.ts_utils import infer_decomposition_method
>>> data = {
...     'Date': [
...         '2020-01-01', '2020-02-01', '2020-03-01',
...         '2020-04-01', '2020-05-01'
...     ],
...     'Sales': [100, 120, 140, 135, 150]
... }
>>> df = pd.DataFrame(data)
>>> df['Date'] = pd.to_datetime(df['Date'])
>>> best_model = infer_decomposition_method(
...     df, dt_col='Date', period=12,
...     method='heuristic', verbose=2
... )
Checking positivity for heuristic method...
All values are > 0. Using 'multiplicative' model.
>>> best_model
'multiplicative'
See also
seasonal_decompose: Decompose a time series into trend, seasonal, and residual components.
References