fusionlab.metrics.time_weighted_interval_score
- fusionlab.metrics.time_weighted_interval_score(y_true, y_median, y_lower, y_upper, alphas, time_weights='inverse_time', sample_weight=None, nan_policy='propagate', multioutput='uniform_average', warn_invalid_bounds=True, eps=1e-08, verbose=0)
Compute the Time-Weighted Interval Score (TWIS).
TWIS evaluates probabilistic forecasts (median and prediction intervals) over a time horizon, applying time-dependent weights. It extends the Weighted Interval Score (WIS) by incorporating temporal emphasis. Lower scores are better.
The WIS for a single observation \(y\), median \(m\), and \(K\) prediction intervals \(\{(l_k, u_k, \alpha_k)\}_{k=1}^K\) (where \(\alpha_k\) is the nominal miscoverage level of the k-th interval, so that \([l_k, u_k] = [q_{\alpha_k/2}, q_{1-\alpha_k/2}]\) is the central interval with nominal coverage \(1-\alpha_k\)) is given by:
\[\mathrm{WIS}(y, m, \text{intervals}) = \frac{1}{K+1} \left( |y-m| + \sum_{k=1}^K \frac{\alpha_k}{2}\, \mathrm{IS}_{\alpha_k}(y, l_k, u_k) \right)\]
where \(\mathrm{IS}_{\alpha_k}\) is the interval score for the k-th interval, commonly defined as:
\[\mathrm{IS}_{\alpha_k}(y, l_k, u_k) = (u_k - l_k) + \frac{2}{\alpha_k}(l_k - y)\mathbf{1}\{y < l_k\} + \frac{2}{\alpha_k}(y - u_k)\mathbf{1}\{y > u_k\}\]
Expanding \(\frac{\alpha_k}{2}\,\mathrm{IS}_{\alpha_k}\), the sum term in WIS can be written directly in terms of the per-interval WIS components:
\[\sum_{k=1}^K \left[ \frac{\alpha_k}{2}(u_k - l_k) + (l_k - y)\mathbf{1}\{y < l_k\} + (y - u_k)\mathbf{1}\{y > u_k\} \right]\]
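The formulas above can be checked numerically with a few lines of NumPy. The following is a minimal sketch for a single observation, not the library's implementation: the helper names `interval_score` and `wis_single` are hypothetical, and the sum term uses the per-interval component form \(\frac{\alpha_k}{2}(u_k - l_k)\) plus the two out-of-interval penalties.

```python
import numpy as np


def interval_score(y, lower, upper, alpha):
    """Interval score IS_alpha for a central (1 - alpha) prediction interval."""
    width = upper - lower
    below = (lower - y) * (y < lower)   # penalty when y falls under the interval
    above = (y - upper) * (y > upper)   # penalty when y falls over the interval
    return width + (2.0 / alpha) * (below + above)


def wis_single(y, median, lowers, uppers, alphas):
    """WIS for one observation (hypothetical helper, alpha_k/2-weighted terms)."""
    lowers = np.asarray(lowers, dtype=float)
    uppers = np.asarray(uppers, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # (alpha_k / 2) * IS_k equals the per-interval component in the text.
    components = (alphas / 2.0) * interval_score(y, lowers, uppers, alphas)
    return (abs(y - median) + components.sum()) / (alphas.size + 1)


# y = 12 lies inside the 80% interval [8, 12.5] but above the 50% interval [9, 11].
score = wis_single(12.0, 10.0, lowers=[8.0, 9.0], uppers=[12.5, 11.0],
                   alphas=[0.2, 0.5])
print(f"WIS: {score:.4f}")
```

Here the absolute error contributes 2, the 80% interval contributes 0.1 × 4.5 = 0.45, and the 50% interval contributes 0.25 × 2 + 1 = 1.5, so the score is 3.95 / 3.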
This function calculates \(\mathrm{WIS}_{iot}\) for each sample \(i\), output \(o\), and time step \(t\). The Time-Weighted Interval Score for sample \(i\) and output \(o\) is then:
\[\mathrm{TWIS}_{io} = \sum_{t=1}^{T_{steps}} w_t \cdot \mathrm{WIS}_{iot}\]
where \(w_t\) are the time weights, normalized to sum to 1. The final score averages \(\mathrm{TWIS}_{io}\) over samples (weighted by sample_weight if given) and, depending on multioutput, over outputs.
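The time-weighting step can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `normalize_weights` and `twis_from_wis` are hypothetical helpers, `time_weights=None` is treated as uniform weighting, and the inverse-time scheme is assumed to mean \(w_t \propto 1/t\) (see time_weighted_accuracy_score for the actual definition).

```python
import numpy as np


def normalize_weights(raw, eps=1e-8):
    """Scale non-negative weights so that they sum to 1."""
    raw = np.asarray(raw, dtype=float)
    total = raw.sum()
    if total <= eps:
        raise ValueError("time weights must sum to a value greater than eps")
    return raw / total


def twis_from_wis(wis, time_weights=None):
    """Collapse WIS of shape (n_samples, n_outputs, n_timesteps) over time."""
    wis = np.asarray(wis, dtype=float)
    n_steps = wis.shape[-1]
    # Assumption: None means every time step counts equally.
    raw = np.ones(n_steps) if time_weights is None else time_weights
    w = normalize_weights(raw)
    return (wis * w).sum(axis=-1)       # shape (n_samples, n_outputs)


# Assumed inverse-time weights for three steps: proportional to 1, 1/2, 1/3.
inv = 1.0 / np.arange(1, 4)
w = normalize_weights(inv)              # [6/11, 3/11, 2/11]

wis = np.array([[[0.6, 0.3, 0.9]]])     # one sample, one output, three steps
twis = twis_from_wis(wis, inv)
print(w, twis)
```

With these weights the early steps dominate: the result is (0.6·6 + 0.3·3 + 0.9·2) / 11 = 6.3 / 11, compared with 0.6 under uniform weights.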
- Parameters:
  - y_true (array-like) – True target values. Expected shapes:
    - (n_timesteps,)
    - (n_samples, n_timesteps)
    - (n_samples, n_outputs, n_timesteps)
  - y_median (array-like) – Median forecasts, matching y_true’s shape.
  - y_lower (array-like) – Lower bounds of the K prediction intervals. Expected shapes:
    - If y_true is (n_timesteps,): (K_intervals, n_timesteps)
    - If y_true is (n_samples, n_timesteps): (n_samples, K_intervals, n_timesteps)
    - If y_true is (n_samples, n_outputs, n_timesteps): (n_samples, n_outputs, K_intervals, n_timesteps)
  - y_upper (array-like) – Upper bounds, matching y_lower’s shape.
  - alphas (array-like of shape (K_intervals,)) – Nominal miscoverage levels of the central intervals (e.g., 0.1 for a 90% PI). Each alpha must be in (0, 1). These define the \(\alpha_k\) values used in the IS and WIS component weighting.
  - time_weights (array-like of shape (n_timesteps,), str, or None, default 'inverse_time') – Weights for each time step. Normalized to sum to 1. See time_weighted_accuracy_score for details.
  - sample_weight (array-like of shape (n_samples,), optional) – Sample weights. Their sum must be greater than eps.
  - nan_policy ({'omit', 'propagate', 'raise'}, default 'propagate') – How to handle NaNs in inputs.
  - multioutput ({'raw_values', 'uniform_average'}, default 'uniform_average') – Aggregation for multi-output data.
  - warn_invalid_bounds (bool, default True) – If True, warn when any y_lower > y_upper; in that case the corresponding interval widths are negative.
  - eps (float, default 1e-8) – Epsilon for safe division (e.g., by the sum of weights).
  - verbose (int, default 0) – Verbosity level.
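As an illustration of the expected shapes, the sketch below derives y_median, y_lower, and y_upper from a set of forecast draws for the (n_samples, n_timesteps) case. The ensemble array `draws` and its layout are assumptions made for this example only; any source of quantile forecasts would do.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_timesteps, n_draws = 4, 3, 500

# Hypothetical ensemble forecast: (n_draws, n_samples, n_timesteps).
draws = rng.normal(loc=10.0, scale=2.0,
                   size=(n_draws, n_samples, n_timesteps))

# Miscoverage levels: 0.1 -> 90% interval, 0.5 -> 50% interval.
alphas = np.array([0.1, 0.5])

# Median forecast, shape (n_samples, n_timesteps).
y_median = np.quantile(draws, 0.5, axis=0)

# np.quantile puts the quantile axis first, giving (K, n_samples, n_timesteps);
# move it to position 1 to obtain (n_samples, K_intervals, n_timesteps).
y_lower = np.moveaxis(np.quantile(draws, alphas / 2, axis=0), 0, 1)
y_upper = np.moveaxis(np.quantile(draws, 1 - alphas / 2, axis=0), 0, 1)

print(y_median.shape, y_lower.shape, y_upper.shape)
```

Deriving both bounds from the same draws guarantees y_lower <= y_upper, so the invalid-bounds warning would not be triggered by inputs built this way.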
- Returns:
score – Mean TWIS. Lower values are better.
- Return type:
float or ndarray of floats
Examples
>>> import numpy as np
>>> from fusionlab.metrics import time_weighted_interval_score
>>> # y_true and y_median: (n_samples=2, n_timesteps=2)
>>> y_t = np.array([[10, 11], [20, 22]])
>>> y_m = np.array([[10, 11.5], [19, 21.5]])
>>> # Bounds for K=1 interval: (n_samples=2, K_intervals=1, n_timesteps=2)
>>> y_l = np.array([[[9, 10]], [[18, 20]]])
>>> y_u = np.array([[[11, 12]], [[20, 23]]])
>>> alphas_ex = np.array([0.2])  # Single 80% PI
>>> # For simplicity, use uniform time weights [0.5, 0.5]
>>> score = time_weighted_interval_score(
...     y_t, y_m, y_l, y_u, alphas_ex,
...     time_weights=None, verbose=0
... )
>>> print(f"TWIS: {score:.4f}")
See also
weighted_interval_score – Non-time-weighted version.
time_weighted_accuracy_score – Time-weighted accuracy for classification.