fusionlab.metrics.time_weighted_interval_score

fusionlab.metrics.time_weighted_interval_score(y_true, y_median, y_lower, y_upper, alphas, time_weights='inverse_time', sample_weight=None, nan_policy='propagate', multioutput='uniform_average', warn_invalid_bounds=True, eps=1e-08, verbose=0)

Compute the Time-Weighted Interval Score (TWIS).

TWIS evaluates probabilistic forecasts (median and prediction intervals) over a time horizon, applying time-dependent weights. It extends the Weighted Interval Score (WIS) by incorporating temporal emphasis. Lower scores are better.

The WIS for a single observation \(y\), median \(m\), and \(K\) prediction intervals \(\{(l_k, u_k, \alpha_k)\}_{k=1}^K\) (where \(\alpha_k\) is the nominal miscoverage level of the k-th interval, so the central interval \([q_{\alpha_k/2}, q_{1-\alpha_k/2}]\) has nominal coverage \(1-\alpha_k\)) is given by:

\[\mathrm{WIS}(y, m, \text{intervals}) = \frac{1}{K+1} \left( |y-m| + \sum_{k=1}^K \mathrm{IS}_{\alpha_k}(y, l_k, u_k) \right)\]

where \(\mathrm{IS}_{\alpha_k}\) is the interval score for the k-th interval, commonly defined as:

\[\mathrm{IS}_{\alpha_k}(y, l_k, u_k) = (u_k - l_k) + \frac{2}{\alpha_k}(l_k - y)\mathbf{1}\{y < l_k\} + \frac{2}{\alpha_k}(y - u_k)\mathbf{1}\{y > u_k\}\]
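The interval score above can be sketched in a few lines of NumPy. This is a minimal illustration of the formula, not the library's implementation; the helper name `interval_score` is hypothetical.

```python
import numpy as np

def interval_score(y, lower, upper, alpha):
    """Interval score of a central (1 - alpha) prediction interval.

    Hypothetical helper: width plus a 2/alpha penalty for each unit
    by which y falls outside [lower, upper].
    """
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    width = upper - lower
    below = (2.0 / alpha) * (lower - y) * (y < lower)  # active only when y < lower
    above = (2.0 / alpha) * (y - upper) * (y > upper)  # active only when y > upper
    return width + below + above

# An 80% interval [9, 11] (alpha = 0.2) scored against three observations.
inside = interval_score(10.0, 9.0, 11.0, 0.2)   # only the width counts
below = interval_score(8.0, 9.0, 11.0, 0.2)     # width + (2/0.2) * 1
above = interval_score(12.5, 9.0, 11.0, 0.2)    # width + (2/0.2) * 1.5
print(inside, below, above)
```

An observation inside the interval is charged only the width, so narrow intervals are rewarded as long as they still cover the observation.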

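Alternatively, the sum term in WIS can be written using direct WIS components for each interval, in which each interval score is scaled by \(\alpha_k/2\):

\[\sum_{k=1}^K \frac{\alpha_k}{2}\,\mathrm{IS}_{\alpha_k}(y, l_k, u_k) = \sum_{k=1}^K \left[ \frac{\alpha_k}{2}(u_k - l_k) + (l_k - y)\mathbf{1}\{y < l_k\} + (y - u_k)\mathbf{1}\{y > u_k\} \right]\]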
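As a quick numerical check, the sketch below confirms that the direct component form equals the interval score multiplied by \(\alpha_k/2\). Both helper names are hypothetical and exist only for this illustration.

```python
import numpy as np

def interval_score(y, lower, upper, alpha):
    # Interval score: width plus 2/alpha penalties outside the interval.
    return ((upper - lower)
            + (2.0 / alpha) * (lower - y) * (y < lower)
            + (2.0 / alpha) * (y - upper) * (y > upper))

def wis_component(y, lower, upper, alpha):
    # Direct form: (alpha/2) * width plus unscaled miss distances.
    return ((alpha / 2.0) * (upper - lower)
            + (lower - y) * (y < lower)
            + (y - upper) * (y > upper))

y = np.array([8.0, 10.0, 12.5])   # below, inside, and above the interval
lower, upper, alpha = 9.0, 11.0, 0.2

scaled = (alpha / 2.0) * interval_score(y, lower, upper, alpha)
direct = wis_component(y, lower, upper, alpha)
print(np.allclose(scaled, direct))
```

The \(2/\alpha_k\) penalty factor cancels against the \(\alpha_k/2\) scaling, which is why the miss distances appear unscaled in the direct form.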

This function calculates \(\mathrm{WIS}_{iot}\) for each sample \(i\), output \(o\), and time step \(t\). Then, the Time-Weighted Interval Score for sample \(i\), output \(o\) is:

\[\mathrm{TWIS}_{io} = \sum_{t=1}^{T_{steps}} w_t \cdot \mathrm{WIS}_{iot}\]

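where \(w_t\) are time weights normalized to sum to 1. The final score averages \(\mathrm{TWIS}_{io}\) over samples (weighted by sample_weight when given) and, depending on multioutput, over outputs.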
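The time-weighting step can be sketched as follows for single-output inputs of shape (n_samples, n_timesteps). This is a rough illustration under stated assumptions, not the library's code: the function name `twis_sketch` is hypothetical, the interval terms use the \(\alpha_k/2\)-scaled components with the \(1/(K+1)\) normalization, and the `1/t` weights below are only one plausible reading of 'inverse_time'.

```python
import numpy as np

def twis_sketch(y_true, y_median, y_lower, y_upper, alphas, time_weights=None):
    """Hypothetical TWIS sketch for y_true (N, T) and bounds (N, K, T)."""
    y = np.asarray(y_true, dtype=float)                  # (N, T)
    m = np.asarray(y_median, dtype=float)                # (N, T)
    lo = np.asarray(y_lower, dtype=float)                # (N, K, T)
    up = np.asarray(y_upper, dtype=float)                # (N, K, T)
    a = np.asarray(alphas, dtype=float)[None, :, None]   # (1, K, 1)
    n_intervals, n_steps = lo.shape[1], y.shape[-1]

    y_k = y[:, None, :]                                  # broadcast over K
    # Assumed weighting: alpha/2-scaled width plus unscaled miss distances.
    comp = (a / 2.0) * (up - lo) + (lo - y_k) * (y_k < lo) + (y_k - up) * (y_k > up)
    wis = (np.abs(y - m) + comp.sum(axis=1)) / (n_intervals + 1)   # (N, T)

    if time_weights is None:
        w = np.full(n_steps, 1.0 / n_steps)              # uniform weights
    else:
        w = np.asarray(time_weights, dtype=float)
        w = w / w.sum()                                  # normalize to sum to 1
    return (wis * w).sum(axis=1).mean()                  # weight over time, average samples

y_true = [[10.0, 12.0, 15.0]]
y_median = [[10.0, 11.0, 13.0]]
y_lower = [[[9.0, 10.0, 11.0]]]
y_upper = [[[11.0, 12.0, 14.0]]]
alphas = [0.2]

uniform_score = twis_sketch(y_true, y_median, y_lower, y_upper, alphas)
# Assumed 'inverse_time'-style weights proportional to 1/t for t = 1, 2, 3.
inverse_score = twis_sketch(y_true, y_median, y_lower, y_upper, alphas,
                            time_weights=1.0 / np.arange(1, 4))
print(f"uniform: {uniform_score:.4f}, inverse-time: {inverse_score:.4f}")
```

In this toy series the largest error occurs at the last step, so weights that decay with the horizon yield a lower score than uniform weights.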

Parameters:
  • y_true (array-like) – True target values. Expected shapes:
      - (n_timesteps,)
      - (n_samples, n_timesteps)
      - (n_samples, n_outputs, n_timesteps)

  • y_median (array-like) – Median forecasts, matching y_true’s shape.

  • y_lower (array-like) – Lower bounds of K prediction intervals. Expected shapes:
      - If y_true is (n_timesteps,): (K_intervals, n_timesteps)
      - If y_true is (n_samples, n_timesteps): (n_samples, K_intervals, n_timesteps)
      - If y_true is (n_samples, n_outputs, n_timesteps): (n_samples, n_outputs, K_intervals, n_timesteps)

  • y_upper (array-like) – Upper bounds, matching y_lower’s shape.

  • alphas (array-like of shape (K_intervals,)) – Nominal miscoverage levels of the central prediction intervals (e.g., 0.1 for a 90% PI). Each alpha must be in (0, 1). These define the \(\alpha_k\) values used in the IS and WIS component weighting.

  • time_weights (array-like of shape (n_timesteps,), str, or None, default 'inverse_time') – Weights for each time step. Normalized to sum to 1. See time_weighted_accuracy_score for details.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights. Sum must be > eps.

  • nan_policy ({'omit', 'propagate', 'raise'}, default 'propagate') – How to handle NaNs in inputs.

  • multioutput ({'raw_values', 'uniform_average'}, default 'uniform_average') – Aggregation for multi-output data.

  • warn_invalid_bounds (bool, default True) – If True, warn when any y_lower exceeds y_upper. Such bounds produce negative interval widths.

  • eps (float, default 1e-8) – Epsilon for safe division (e.g., sum of weights).

  • verbose (int, default 0) – Verbosity level.
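The shape conventions above can be illustrated by assembling inputs for the (n_samples, n_timesteps) case. This sketch only builds and checks arrays and does not call the library; the half-widths are made-up values chosen so that intervals widen as alpha shrinks.

```python
import numpy as np

n_samples, n_timesteps = 4, 6
rng = np.random.default_rng(0)

y_median = rng.normal(size=(n_samples, n_timesteps))   # (N, T)
alphas = np.array([0.5, 0.2, 0.1])                     # 50%, 80%, 90% intervals
half_widths = np.array([0.7, 1.3, 1.6])                # hypothetical, one per alpha

# Stack the K intervals along a new axis 1 to get (N, K, T).
y_lower = np.stack([y_median - h for h in half_widths], axis=1)
y_upper = np.stack([y_median + h for h in half_widths], axis=1)

# Sanity checks mirroring the documented constraints.
shapes_ok = y_lower.shape == y_upper.shape == (n_samples, len(alphas), n_timesteps)
alphas_ok = bool(np.all((alphas > 0) & (alphas < 1)))
n_invalid = int((y_lower > y_upper).sum())             # bounds that would give negative widths
print(shapes_ok, alphas_ok, n_invalid)
```

Checking for crossed bounds before scoring makes it easier to trace a warn_invalid_bounds warning back to the forecasts that caused it.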

Returns:

score – Mean TWIS. Lower values are better.

Return type:

float or ndarray of floats

Examples

>>> import numpy as np
>>> from fusionlab.metrics import time_weighted_interval_score
>>> y_t = np.array([[10, 11], [20, 22]])  # (2 samples, 2 timesteps)
>>> y_m = np.array([[10, 11.5], [19, 21.5]])
>>> # For K=1 interval, bounds have shape (n_samples, K_intervals, n_timesteps)
>>> y_l = np.array([[[9, 10]], [[18, 20]]])   # (2, 1, 2)
>>> y_u = np.array([[[11, 12]], [[20, 23]]])  # (2, 1, 2)
>>> alphas_ex = np.array([0.2])  # Single 80% PI
>>> # For simplicity, use uniform time weights [0.5, 0.5]
>>> score = time_weighted_interval_score(
...     y_t, y_m, y_l, y_u, alphas_ex,
...     time_weights=None, verbose=0
... )
>>> print(f"TWIS: {score:.4f}")  # Illustrative output
TWIS: 0.8750

See also

weighted_interval_score

Non-time-weighted version.

time_weighted_accuracy_score

Time-weighted accuracy for classification.