fusionlab.metrics.continuous_ranked_probability_score

fusionlab.metrics.continuous_ranked_probability_score(y_true, y_pred, sample_weight=None, nan_policy='propagate', multioutput='uniform_average', verbose=0)

Compute the sample-based Continuous Ranked Probability Score (CRPS).

This proper scoring rule measures both calibration and sharpness of ensemble forecasts by comparing predictive samples to true observations [1]. The sample approximation is:

\[\mathrm{CRPS} = \frac{1}{m}\sum_{j=1}^{m} |x_j - y| - \frac{1}{2m^2}\sum_{i=1}^{m}\sum_{j=1}^{m} |x_i - x_j|,\]

where \(x_1,\dots,x_m\) are the ensemble members for a single observation \(y\). The per-observation scores are then averaged over all n_samples observations, weighted by sample_weight when it is provided.
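The formula above can be evaluated directly with NumPy. The following is a minimal sketch for a single observation, not the library's implementation; the helper name crps_ensemble is hypothetical and is used here only for illustration.

```python
import numpy as np

def crps_ensemble(y, members):
    """Sample-based CRPS for one observation y (hypothetical helper, not fusionlab code)."""
    x = np.asarray(members, dtype=float)
    m = x.size
    # First term: mean absolute difference between each member and the observation.
    term1 = np.abs(x - y).mean()
    # Second term: sum of |x_i - x_j| over all m*m ordered pairs, scaled by 1 / (2 m^2).
    term2 = np.abs(x[:, None] - x[None, :]).sum() / (2.0 * m**2)
    return term1 - term2

# Ensemble {0.0, 0.5, 1.0} for an observation y = 0.5:
# term1 = (0.5 + 0.0 + 0.5) / 3 = 1/3, term2 = 4 / 18 = 2/9, so CRPS = 1/9.
value = crps_ensemble(0.5, [0.0, 0.5, 1.0])
print(round(value, 4))  # 0.1111
```

The pairwise term is built by broadcasting, which needs O(m^2) memory per observation; that is usually acceptable for small ensembles.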

Parameters:
  • y_true (array-like of shape (n_samples,) or (n_samples, n_outputs)) – Observed true values.

  • y_pred (array-like) –

    Ensemble forecast samples.
    • If y_true is 1D (n_samples,), y_pred must be 2D (n_samples, n_ensemble_members).

    • If y_true is 2D (n_samples, n_outputs), y_pred must be 3D (n_samples, n_outputs, n_ensemble_members).

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights. If None, samples are equally weighted.

  • nan_policy ({'omit', 'propagate', 'raise'}, default 'propagate') –

    How to handle NaNs:
    • 'raise': Raise an error if NaNs are present in inputs.

    • 'omit': Remove samples (rows) containing NaNs in y_true or y_pred before computation.

    • 'propagate': NaNs in inputs will propagate to the CRPS score for the affected sample(s)/output(s).

  • multioutput ({'raw_values', 'uniform_average'}, default 'uniform_average') –

    Defines aggregation for multi-output y_true.
    • 'raw_values': Returns a full set of scores, one for each output.

    • 'uniform_average': Scores of all outputs are averaged with uniform weight.

  • verbose (int, default 0) – Verbosity level: 0 (silent), 1 (summary), >=2 (debug details).
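To make the interaction between sample_weight and nan_policy concrete, here is a simplified sketch of how per-observation CRPS values might be combined into one score. It is an assumption about the behaviour described above, not the library's code: the helper aggregate_crps is hypothetical, and it detects NaNs on the per-observation scores rather than on the raw y_true and y_pred inputs.

```python
import numpy as np

def aggregate_crps(per_sample, sample_weight=None, nan_policy="propagate"):
    """Hypothetical helper: combine 1D per-observation CRPS values into one score."""
    scores = np.asarray(per_sample, dtype=float)
    weights = None if sample_weight is None else np.asarray(sample_weight, dtype=float)
    has_nan = np.isnan(scores)
    if nan_policy == "raise" and has_nan.any():
        raise ValueError("NaNs are present in the inputs.")
    if nan_policy == "omit":
        # Drop the affected observations together with their weights.
        scores = scores[~has_nan]
        if weights is not None:
            weights = weights[~has_nan]
    # Under 'propagate', any NaN still in scores makes the average NaN.
    return np.average(scores, weights=weights)

per_sample = [0.2, np.nan, 0.4]
omitted = aggregate_crps(per_sample, nan_policy="omit")                # (0.2 + 0.4) / 2
weighted = aggregate_crps(per_sample, [1.0, 1.0, 3.0], "omit")         # (0.2 + 3 * 0.4) / 4
propagated = aggregate_crps(per_sample, nan_policy="propagate")        # nan
```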

Returns:

score – Average CRPS. A scalar if multioutput='uniform_average' or if y_true is 1D. An array of shape (n_outputs,) if multioutput='raw_values' and y_true is 2D. Lower values are better.

Return type:

float or ndarray of floats

Examples

>>> import numpy as np
>>> from fusionlab.metrics import continuous_ranked_probability_score
>>> y_true_1d = np.array([0.5, 0.0, 1.0, np.nan])
>>> y_pred_1d = np.array([
...     [0.0, 0.5, 1.0],  # For 0.5
...     [0.0, 0.1, 0.2],  # For 0.0
...     [0.9, 1.1, 1.0],  # For 1.0
...     [0.0, 0.5, np.nan] # For np.nan y_true
... ])
>>> score = continuous_ranked_probability_score(y_true_1d, y_pred_1d, nan_policy='omit', verbose=1)
CRPS computed: 0.0333
>>> print(f"CRPS (1D, omit NaNs): {score:.4f}")
CRPS (1D, omit NaNs): 0.0333
>>> score_prop = continuous_ranked_probability_score(y_true_1d, y_pred_1d, nan_policy='propagate')
>>> print(f"CRPS (1D, propagate NaNs): {score_prop}") # Will be nan
CRPS (1D, propagate NaNs): nan
>>> y_true_2d = np.array([[0.5, 2.5], [0.0, np.nan], [1.0, 3.0]])
>>> y_pred_2d = np.array([
...     [[0.0, 0.5, 1.0], [2.0, 2.5, 3.0]], # For [0.5, 2.5]
...     [[0.0, 0.1, 0.2], [np.nan, 3.1, 3.2]], # For [0.0, np.nan]
...     [[0.9, 1.1, 1.0], [2.8, 3.0, 3.2]]  # For [1.0, 3.0]
... ])
>>> raw_scores = continuous_ranked_probability_score(y_true_2d, y_pred_2d,
...                         nan_policy='propagate',
...                         multioutput='raw_values', verbose=1)
CRPS computed: [0.0333 nan   ]
>>> print(f"CRPS (2D, raw, propagate): {raw_scores}")
CRPS (2D, raw, propagate): [0.03333333        nan]
>>> avg_score = continuous_ranked_probability_score(y_true_2d, y_pred_2d,
...                        nan_policy='omit',
...                        multioutput='uniform_average', verbose=1)
CRPS computed: 0.0500
>>> print(f"CRPS (2D, omit, average): {avg_score:.4f}")
CRPS (2D, omit, average): 0.0500

Notes

  • This function calculates the CRPS based on ensemble samples.

  • It is suitable for evaluating probabilistic forecasts like Monte Carlo simulations or bagged ensembles.

  • CRPS is a strictly proper scoring rule: in expectation it is minimized only when the forecast distribution matches the distribution of the observations, which encourages honest probabilistic forecasts.

  • Lower CRPS values indicate better forecast performance.
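The notes above can be illustrated with a small numerical comparison. This sketch evaluates the sample-based formula directly with a hypothetical helper (crps_ensemble, not part of fusionlab) on made-up ensembles. It shows that a tight ensemble centred on the observation scores lower than a wide or biased one, and that a single-member ensemble reduces to the absolute error.

```python
import numpy as np

def crps_ensemble(y, members):
    """Sample-based CRPS for one observation (hypothetical helper for illustration)."""
    x = np.asarray(members, dtype=float)
    term1 = np.abs(x - y).mean()
    # Mean over all m*m ordered pairs, halved, equals the 1 / (2 m^2) double sum.
    term2 = np.abs(x[:, None] - x[None, :]).mean() / 2.0
    return term1 - term2

y = 1.0
sharp_centered = crps_ensemble(y, [0.9, 1.0, 1.1])  # tight spread around the observation
wide_centered = crps_ensemble(y, [0.0, 1.0, 2.0])   # same centre, much wider spread
sharp_biased = crps_ensemble(y, [1.9, 2.0, 2.1])    # tight spread, wrong location
single_member = crps_ensemble(y, [1.3])             # no spread term: equals |1.3 - 1.0|

print(sharp_centered < wide_centered < sharp_biased)  # True
```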

See also

coverage_score

Metric for prediction interval coverage.

sklearn.metrics.mean_squared_error

A common deterministic metric.

References