fusionlab.datasets.make_quantile_prediction_data

fusionlab.datasets.make_quantile_prediction_data(n_samples=100, n_horizons=6, quantiles=[0.1, 0.5, 0.9], target_mean=50.0, target_stddev=10.0, pred_bias=1.0, pred_spread_factor=1.5, add_coords=True, coord_scale=10.0, as_frame=False, seed=None)

Generate synthetic actuals and corresponding quantile predictions.

Creates a dataset simulating the output of a multi-horizon quantile forecasting model. It includes actual target values and predicted values for specified quantiles across multiple forecast horizons for a set of samples (e.g., locations).

This data is useful for demonstrating and testing functions that evaluate or visualize probabilistic forecasts, such as those comparing prediction intervals to actual outcomes.
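As a rough illustration of the kind of evaluation this data supports, the sketch below computes the empirical coverage of a prediction interval, i.e. the fraction of actual values that fall between a lower and an upper quantile prediction. It works on plain Python lists rather than on the generated DataFrame, and `interval_coverage` is a hypothetical helper written for this example, not a fusionlab function.

```python
def interval_coverage(actuals, lower, upper):
    """Return the fraction of actual values inside [lower, upper]."""
    if not (len(actuals) == len(lower) == len(upper)):
        raise ValueError("all inputs must have the same length")
    if not actuals:
        raise ValueError("inputs must not be empty")
    # Count the samples whose actual value lies within its interval.
    hits = sum(lo <= y <= hi for y, lo, hi in zip(actuals, lower, upper))
    return hits / len(actuals)


# Toy values standing in for one horizon's target, q10 and q90 columns.
actuals = [48.0, 52.5, 61.0, 39.0]
q10 = [40.0, 45.0, 50.0, 41.0]
q90 = [60.0, 58.0, 60.0, 55.0]

coverage = interval_coverage(actuals, q10, q90)
print(f"Empirical coverage of the q10 to q90 interval: {coverage:.2f}")
```

For a well-calibrated forecast, the interval between the 0.1 and 0.9 quantiles should cover roughly 80% of the actual values.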

Parameters:
  • n_samples (int, default 100) – Number of independent samples (e.g., locations) to generate.

  • n_horizons (int, default 6) – Number of future time steps (forecast horizon) per sample.

  • quantiles (list of float, default [0.1, 0.5, 0.9]) – List of quantile levels (between 0 and 1) for which to generate predictions.

  • target_mean (float, default 50.0) – Mean value around which the ‘actual’ target values are generated.

  • target_stddev (float, default 10.0) – Standard deviation for generating the ‘actual’ target values (using a normal distribution).

  • pred_bias (float, default 1.0) – Systematic bias added to the median (0.5 quantile) prediction relative to the generated actual value.

  • pred_spread_factor (float, default 1.5) – Factor controlling the width of the prediction intervals. A higher value creates wider intervals between quantiles. Specifically, it scales the offsets added/subtracted from the biased median.

  • add_coords (bool, default True) – If True, add ‘longitude’ and ‘latitude’ columns with random coordinates.

  • coord_scale (float, default 10.0) – Scaling factor for the random coordinates if add_coords is True.

  • as_frame (bool, default False) – Determines the return type. If False (default), returns a Bunch object; if True, returns only the pandas DataFrame.

  • seed (int, optional) – Seed for NumPy’s random number generator for reproducibility. Default is None.
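To make the roles of pred_bias and pred_spread_factor more concrete, here is a minimal sketch of one way quantile predictions could be derived from an actual value. The exact formula used inside fusionlab is not given in this page, so the offset rule below (standard normal quantile times target_stddev times pred_spread_factor) is an assumption, and `sketch_quantile_predictions` is a hypothetical name used only for illustration.

```python
import random
from statistics import NormalDist


def sketch_quantile_predictions(actual, quantiles, target_stddev=10.0,
                                pred_bias=1.0, pred_spread_factor=1.5):
    """Illustrative only: build quantile predictions around a biased median."""
    # The median (q = 0.5) prediction is the actual value plus a systematic bias.
    median = actual + pred_bias
    std_normal = NormalDist()  # standard normal, mean 0 and stddev 1
    preds = {}
    for q in quantiles:
        # ASSUMPTION: offsets follow normal quantiles scaled by the spread factor.
        offset = std_normal.inv_cdf(q) * target_stddev * pred_spread_factor
        preds[q] = median + offset
    return preds


rng = random.Random(0)           # seeded for reproducibility
actual = rng.gauss(50.0, 10.0)   # one 'actual' value: mean 50, stddev 10
preds = sketch_quantile_predictions(actual, [0.1, 0.5, 0.9])
for q, value in preds.items():
    print(f"q={q}: {value:.2f}")
```

Under this assumed rule, doubling pred_spread_factor doubles the distance between the outer quantiles and the median, while pred_bias shifts all quantiles by the same amount.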

Returns:

data – If as_frame=False (default), a Bunch object with attributes such as frame (DataFrame), quantiles (list), horizons (list), target_cols, prediction_cols (nested dict), longitude and latitude (if generated), and DESCR. If as_frame=True, only the generated pandas DataFrame, in wide format (e.g., columns ‘target_h1’, ‘pred_q10_h1’, ‘pred_q50_h1’, …).

Return type:

Bunch or pandas.DataFrame
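The wide-format naming can be pictured with the short sketch below, which rebuilds column names of the form ‘target_h1’ and ‘pred_q10_h1’ from a list of quantiles and a horizon count. The ‘q0.1’ key format follows the example in this page; the shape of the mapping (one list of column names per quantile) is an assumption, and the actual nesting of prediction_cols may differ. `sketch_column_layout` is a hypothetical helper, not part of fusionlab.

```python
def sketch_column_layout(quantiles, n_horizons):
    """Illustrative only: derive wide-format column names."""
    horizons = list(range(1, n_horizons + 1))
    # One target column per forecast horizon, e.g. 'target_h1'.
    target_cols = [f"target_h{h}" for h in horizons]
    # ASSUMPTION: one list of prediction columns per quantile key.
    prediction_cols = {
        f"q{q}": [f"pred_q{round(q * 100)}_h{h}" for h in horizons]
        for q in quantiles
    }
    return target_cols, prediction_cols


target_cols, prediction_cols = sketch_column_layout([0.1, 0.5, 0.9], n_horizons=2)
print(target_cols)
print(prediction_cols["q0.1"])
```

Using round() rather than int() for the percentage avoids truncation from floating-point error in products such as 0.29 * 100.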

Examples

>>> from fusionlab.datasets import make_quantile_prediction_data
>>> # Generate data as Bunch
>>> pred_bunch = make_quantile_prediction_data(n_samples=5, n_horizons=3, seed=1)
>>> print(pred_bunch.frame.head())
>>> print("Quantile columns for q=0.1:", pred_bunch.prediction_cols['q0.1'])
>>> # Generate data as DataFrame
>>> pred_df = make_quantile_prediction_data(as_frame=True, seed=2)
>>> pred_df.info()  # info() prints its summary directly and returns None