fusionlab.metrics.coverage_score

fusionlab.metrics.coverage_score(y_true, y_lower, y_upper, sample_weight=None, nan_policy='propagate', multioutput='uniform_average', warn_invalid_bounds=True, eps=1e-08, verbose=0)

Compute the coverage score of prediction intervals.

Measures the fraction of samples whose true value lies within the supplied lower and upper bounds, endpoints included. This metric is useful for evaluating uncertainty estimates in probabilistic forecasts.

Formally, given observed true values \(y = \{y_1, \ldots, y_n\}\) (which can be multi-output), and corresponding interval bounds \(\{l_1, \ldots, l_n\}\) and \(\{u_1, \ldots, u_n\}\), the coverage score is defined for each output (if applicable) as:

\[\text{coverage} = \frac{1}{N_{valid}}\sum_{i=1}^{N_{valid}} \mathbf{1}\{ l_i \leq y_i \leq u_i \},\]

where \(\mathbf{1}\{\cdot\}\) is the indicator function and \(N_{valid}\) is the number of samples that remain after NaN handling (so \(N_{valid} = n\) when no sample is dropped).
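The definition above can be sketched in a few lines of dependency-free Python. This is only an illustration of the formula, not the library's implementation; the name `coverage_sketch` is hypothetical, and dropping any sample that contains a NaN is an assumption that mirrors the 'omit' policy described below.

```python
import math

def coverage_sketch(y_true, y_lower, y_upper):
    """Fraction of valid samples with lower <= y <= upper (NaN samples dropped)."""
    covered = 0
    n_valid = 0
    for y, lo, hi in zip(y_true, y_lower, y_upper):
        # A sample is valid only if none of its three values is NaN.
        if math.isnan(y) or math.isnan(lo) or math.isnan(hi):
            continue
        n_valid += 1
        covered += lo <= y <= hi  # indicator 1{l_i <= y_i <= u_i}
    # With no valid samples the score is undefined.
    return covered / n_valid if n_valid else math.nan

# The second true value (14) falls outside its interval [11, 13].
print(coverage_sketch([10, 14, 11, 9], [9, 11, 10, 8], [11, 13, 12, 10]))  # 0.75
```

Here three of the four intervals contain their true value, so the score is 3/4.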

Parameters:
  • y_true (array-like of shape (n_samples,) or (n_samples, n_outputs)) – The true observed values. Must be numeric.

  • y_lower (array-like of shape (n_samples,) or (n_samples, n_outputs)) – The lower bound predictions, matching y_true in shape.

  • y_upper (array-like of shape (n_samples,) or (n_samples, n_outputs)) – The upper bound predictions, matching y_true in shape.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights. If None, then samples are equally weighted.

  • nan_policy ({'omit', 'propagate', 'raise'}, default 'propagate') – Defines how to handle NaN values:

    • 'propagate': NaNs present in the inputs propagate to the output. With multioutput='raw_values', an output column that contains NaNs gets a NaN score for that output. With multioutput='uniform_average', the per-output scores are combined with a standard mean (not np.nanmean), so a NaN in any per-output score makes the final average NaN.

    • 'omit': NaNs in any of y_true, y_lower, or y_upper for a given sample (row) will lead to the omission of that entire sample from the coverage calculation.

    • 'raise': Encountering NaNs raises a ValueError.

  • multioutput ({'raw_values', 'uniform_average'}, default 'uniform_average') – Defines how scores for multiple outputs are aggregated. An array-like value defines the weights used to average the scores.

    • 'raw_values': Returns a full set of scores in case of multi-output input.

    • 'uniform_average': Scores of all outputs are averaged with uniform weight.

  • warn_invalid_bounds (bool, default True) – If True, issues a UserWarning if any y_lower[i] > y_upper[i]. These samples will always count as uncovered.

  • eps (float, default 1e-8) – Small epsilon value to prevent division by zero or issues with very small sum of weights when sample_weight is used.

  • verbose (int, default 0) – Controls the level of verbosity for internal logging (prints to console):

    • 0: No output.

    • 1: Basic info (e.g., final coverage).

    • >=2: More details (e.g., NaN handling, shapes).
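To make the roles of sample_weight, warn_invalid_bounds, and eps more concrete, here is a minimal sketch. It rests on assumptions the text above does not confirm: that the weighted score is the covered weight divided by the total weight, and that eps acts as a threshold below which the total weight is treated as zero. The name `weighted_coverage_sketch` is hypothetical, and the library may handle these details differently.

```python
import math
import warnings

def weighted_coverage_sketch(y_true, y_lower, y_upper, sample_weight=None,
                             warn_invalid_bounds=True, eps=1e-8):
    """Weighted fraction of samples whose true value lies inside its interval."""
    n = len(y_true)
    weights = [1.0] * n if sample_weight is None else list(sample_weight)

    # An inverted interval (lower > upper) can never contain a value,
    # so such samples always count as uncovered.
    if warn_invalid_bounds and any(lo > hi for lo, hi in zip(y_lower, y_upper)):
        warnings.warn(
            "Some lower bounds exceed their upper bounds; "
            "those samples count as uncovered.",
            UserWarning,
        )

    total = sum(weights)
    # Assumption: a total weight at or below eps is treated as "no usable data".
    if total <= eps:
        return math.nan

    covered = sum(
        w for w, y, lo, hi in zip(weights, y_true, y_lower, y_upper)
        if lo <= y <= hi
    )
    return covered / total

# The covered sample carries weight 3 and the uncovered one weight 1.
print(weighted_coverage_sketch([10, 14], [9, 11], [11, 13], sample_weight=[3, 1]))  # 0.75
```

With equal weights the same data would score 0.5; the weights shift the score toward the samples that matter most.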

Returns:

score – Coverage score. If multioutput='raw_values', an array of scores is returned, one for each output. Otherwise, a single float average score is returned. Returns np.nan if the calculation is not possible (e.g., all samples omitted due to NaNs).

Return type:

float or ndarray of floats

Notes

If y_true, y_lower, and y_upper are 1D arrays, the inputs are treated as a single output and both multioutput options yield the same scalar result. With 'raw_values', the score is technically computed as a 1-element array, which is then squeezed to a scalar because the input was 1D.
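The interaction between per-output scores, NaN propagation, and the 1D squeeze can be illustrated with a small pure-Python sketch. It assumes the 'propagate' policy and a plain mean for 'uniform_average', as described in the parameter list; the name `multioutput_coverage_sketch` is hypothetical and the code is not taken from the library.

```python
import math

def multioutput_coverage_sketch(y_true, y_lower, y_upper,
                                multioutput="uniform_average"):
    """Per-output coverage with NaN propagation, then optional averaging."""
    is_1d = not isinstance(y_true[0], (list, tuple))

    def as_rows(values):
        # Promote 1D input to a single column so both cases share one code path.
        return [[v] for v in values] if is_1d else [list(row) for row in values]

    yt, yl, yu = as_rows(y_true), as_rows(y_lower), as_rows(y_upper)
    scores = []
    for j in range(len(yt[0])):
        column = [(t[j], lo[j], hi[j]) for t, lo, hi in zip(yt, yl, yu)]
        if any(math.isnan(v) for sample in column for v in sample):
            scores.append(math.nan)  # a NaN anywhere in the column propagates
        else:
            scores.append(sum(lo <= t <= hi for t, lo, hi in column) / len(column))

    if multioutput == "raw_values":
        # For 1D input the one-element result is squeezed to a scalar.
        return scores[0] if is_1d else scores
    return sum(scores) / len(scores)  # plain mean, so a NaN score propagates

y_true_mo = [[10, 20], [12, 22], [11, math.nan]]
y_lower_mo = [[9, 19], [11, 21], [10, 20]]
y_upper_mo = [[11, 21], [13, 23], [12, 22]]
print(multioutput_coverage_sketch(y_true_mo, y_lower_mo, y_upper_mo, "raw_values"))
print(multioutput_coverage_sketch([10, 12], [9, 11], [11, 13], "raw_values"))
```

The first call yields one score per column, with NaN for the column that contains a missing value; the second returns a scalar because the input is 1D.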

Examples

>>> from fusionlab.metrics import coverage_score
>>> import numpy as np
>>> y_true = np.array([10, 12, 11, 9, np.nan])
>>> y_lower = np.array([9, 11, 10, 8, 9])
>>> y_upper = np.array([11, 13, 12, 10, 11])
>>> # Default: nan_policy='propagate'
>>> coverage_score(y_true, y_lower, y_upper) # Propagates NaN
nan
>>> # Omitting NaNs
>>> coverage_score(y_true, y_lower, y_upper, nan_policy='omit')
1.0
>>> # Multi-output example
>>> y_true_mo = np.array([[10, 20], [12, 22], [11, np.nan]])
>>> y_lower_mo = np.array([[9, 19], [11, 21], [10, 20]])
>>> y_upper_mo = np.array([[11, 21], [13, 23], [12, 22]])
>>> coverage_score(y_true_mo, y_lower_mo, y_upper_mo, nan_policy='omit',
...                multioutput='raw_values')
array([1., 1.])
>>> coverage_score(y_true_mo, y_lower_mo, y_upper_mo, nan_policy='omit',
...                multioutput='uniform_average')
1.0
>>> coverage_score(y_true_mo, y_lower_mo, y_upper_mo,
...                nan_policy='propagate', multioutput='raw_values')
array([ 1., nan])

See also

sklearn.utils.validation.check_array

Utility for input validation.

numpy.average

Compute weighted average, used with sample_weight.

References