fusionlab.utils.ts_utils.ts_corr_analysis

fusionlab.utils.ts_utils.ts_corr_analysis(df, dt_col, value_col, lags=2, features=None, view_acf_pacf=True, view_cross_corr=True, fig_size=(14, 6), show_grid=True, cross_corr_on_sep=False, verbose=0)

Perform correlation analysis on a time series dataset, including autocorrelation (ACF), partial autocorrelation (PACF), and cross-correlation between the target and external features.

The autocorrelation at lag \(h\) is defined as

\[\rho(h) = \frac{E\big[(X_t - \mu)(X_{t+h} - \mu)\big]} {\sigma^2},\]

where \(h\) denotes the lag, \(\mu\) the mean, and \(\sigma^2\) the variance of the time series [1].
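As a rough illustration of the formula above, the sketch below estimates \(\rho(h)\) from a finite sample with NumPy, replacing the expectation with a sum over the available pairs and \(\sigma^2\) with the total sum of squared deviations. The helper name `sample_acf` is hypothetical, and this page does not state whether `ts_corr_analysis` uses the same normalization internally.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Estimate rho(h) for h = 0..max_lag (hypothetical helper, not part of fusionlab)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mu = x.mean()
    # Sum of squared deviations; proportional to the sample variance sigma^2.
    denom = np.sum((x - mu) ** 2)
    # For each lag h, pair X_t with X_{t+h} and sum the products of deviations.
    return np.array([
        np.sum((x[: n - h] - mu) * (x[h:] - mu)) / denom
        for h in range(max_lag + 1)
    ])


sales = [10, 12, 14, 13, 15]
acf = sample_acf(sales, max_lag=2)
print(acf)  # acf[0] is always 1.0 because a series is perfectly correlated with itself
```

With only five observations the estimates at higher lags rest on very few pairs, so they are best read as illustrative rather than statistically meaningful.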

Parameters:
  • df (pandas.DataFrame) – The input DataFrame containing time series data. Must contain at least one time-like column or index.

  • dt_col (str) – Column name representing the datetime dimension (e.g. “DateTime” or “timestamp”).

  • value_col (str) – Name of the primary target variable column (e.g. “sales”).

  • lags (int, optional) – Number of time lags for ACF/PACF analysis. Default is 2.

  • features (list of str, optional) – List of external feature columns to analyze for cross-correlation with value_col. If None, uses all non-target, non-datetime columns in df.

  • view_acf_pacf (bool, optional) – Whether to generate and display ACF and PACF plots. Default is True.

  • view_cross_corr (bool, optional) – Whether to visualize cross-correlations for the selected external features. Default is True.

  • fig_size (tuple of (float, float), optional) – Figure size as (width, height) for the ACF/PACF plots and, optionally, the cross-correlation bars. Default is (14, 6).

  • show_grid (bool, optional) – Whether to display gridlines in the plots. Default is True.

  • cross_corr_on_sep (bool, optional) – If True, plots the cross-correlation results in a separate figure. If False and view_cross_corr=True, the cross-correlation plot is appended to the figure containing the ACF/PACF plots (if feasible). Default is False.

  • verbose (int, optional) – Verbosity level. Default is 0.

    • 0: No console messages.

    • 1: Basic info messages.

    • 2: More detailed logs.

Returns:

results – Dictionary of correlation metrics:

  • 'acf_values': ACF values up to lags.

  • 'pacf_values': PACF values up to lags.

  • 'cross_corr': Cross-correlation coefficients (and p-values) for external features.

Return type:

dict

Notes

This function supports both univariate and multivariate time series analysis. The ACF and PACF describe the autocorrelation structure of the target and help choose ARIMA terms: a PACF that cuts off after lag p suggests an AR(p) component, while an ACF that cuts off after lag q suggests an MA(q) component. Cross-correlation helps identify external predictors that are correlated with the target [2].
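To make the idea of screening external predictors concrete, here is a minimal sketch that correlates a target with a lagged copy of a feature using NumPy only. It is not the implementation behind `ts_corr_analysis`; the helper `lagged_corr` and the synthetic `promo` and `sales` series are assumptions made for illustration, with the feature constructed to lead the target by two steps.

```python
import numpy as np


def lagged_corr(target, feature, max_lag):
    """Pearson correlation between target[t] and feature[t - k] for k = 0..max_lag.

    Hypothetical helper for illustration; not part of fusionlab.
    """
    target = np.asarray(target, dtype=float)
    feature = np.asarray(feature, dtype=float)
    out = {}
    for k in range(max_lag + 1):
        a = target[k:]                      # target from time k onward
        b = feature[: len(feature) - k]     # feature shifted back by k steps
        out[k] = float(np.corrcoef(a, b)[0, 1])
    return out


# Synthetic data: sales follows promo with a delay of two time steps plus noise.
rng = np.random.default_rng(0)
promo = rng.normal(size=200)
sales = np.empty(200)
sales[:2] = rng.normal(size=2)
sales[2:] = promo[:-2] + 0.1 * rng.normal(size=198)

corr_by_lag = lagged_corr(sales, promo, max_lag=4)
best_lag = max(corr_by_lag, key=lambda k: abs(corr_by_lag[k]))
print(best_lag, corr_by_lag[best_lag])
```

A feature whose correlation with the target peaks at a positive lag is a candidate leading indicator, which is the kind of relationship the cross-correlation output is meant to surface.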

Examples

>>> import pandas as pd
>>> from fusionlab.utils.ts_utils import ts_corr_analysis
>>> data = {
...     'Date': [
...         '2021-01-01','2021-01-02','2021-01-03',
...         '2021-01-04','2021-01-05'
...     ],
...     'Sales': [10, 12, 14, 13, 15],
...     'Promo': [0, 1, 0, 1, 1]
... }
>>> df = pd.DataFrame(data)
>>> results = ts_corr_analysis(
...     df,
...     dt_col='Date',
...     value_col='Sales',
...     lags=1,
...     features=['Promo'],
...     view_acf_pacf=True,
...     view_cross_corr=True,
...     verbose=1
... )
Performing ACF and PACF analysis...
Target variable: Sales
Datetime column: Date
Cross-correlation features: ['Promo']
Performing cross-correlation analysis...
CrossCorrResults > item 1: correlation=0.2890, p_value=0.6367

See also

statsmodels.graphics.tsaplots.plot_acf

Plot the autocorrelation function.

statsmodels.graphics.tsaplots.plot_pacf

Plot the partial autocorrelation function.

References