fusionlab.datasets.make_quantile_prediction_data
- fusionlab.datasets.make_quantile_prediction_data(n_samples=100, n_horizons=6, quantiles=[0.1, 0.5, 0.9], target_mean=50.0, target_stddev=10.0, pred_bias=1.0, pred_spread_factor=1.5, add_coords=True, coord_scale=10.0, as_frame=False, seed=None)
Generate synthetic actuals and corresponding quantile predictions.
Creates a dataset simulating the output of a multi-horizon quantile forecasting model. It includes actual target values and predicted values for specified quantiles across multiple forecast horizons for a set of samples (e.g., locations).
This data is useful for demonstrating and testing functions that evaluate or visualize probabilistic forecasts, such as those comparing prediction intervals to actual outcomes.
- Parameters:
  - n_samples (int, default 100) – Number of independent samples (e.g., locations) to generate.
  - n_horizons (int, default 6) – Number of future time steps (forecast horizon) per sample.
  - quantiles (list of float, default [0.1, 0.5, 0.9]) – List of quantile levels (between 0 and 1) for which to generate predictions.
  - target_mean (float, default 50.0) – Mean value around which the ‘actual’ target values are generated.
  - target_stddev (float, default 10.0) – Standard deviation for generating the ‘actual’ target values (using a normal distribution).
  - pred_bias (float, default 1.0) – Systematic bias added to the median (0.5 quantile) prediction relative to the generated actual value.
  - pred_spread_factor (float, default 1.5) – Factor controlling the width of the prediction intervals. A higher value creates wider intervals between quantiles. Specifically, it scales the offsets added to or subtracted from the biased median.
  - add_coords (bool, default True) – If True, add ‘longitude’ and ‘latitude’ columns with random coordinates.
  - coord_scale (float, default 10.0) – Scaling factor for the random coordinates if add_coords is True.
  - as_frame (bool, default False) – Determines the return type. If False (default), returns a Bunch object. If True, returns only the pandas DataFrame.
  - seed (int, optional) – Seed for NumPy’s random number generator for reproducibility. Default is None.
- Returns:
  data – If as_frame=False (default): a Bunch object with attributes such as frame (DataFrame), quantiles (list), horizons (list), target_cols, prediction_cols (nested dict), longitude and latitude (if generated), and DESCR. If as_frame=True: the generated data solely as a pandas DataFrame in wide format (e.g., columns ‘target_h1’, ‘pred_q10_h1’, ‘pred_q50_h1’, …).
- Return type:
  Bunch or pandas.DataFrame
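The parameter descriptions above suggest a simple generation recipe: draw an actual value from a normal distribution, shift it by pred_bias to get the median prediction, then move away from that median by an offset scaled by pred_spread_factor. The sketch below illustrates that idea with the standard library only. It is not fusionlab's implementation: the function name sketch_quantile_rows is hypothetical, and the offset formula (standard normal quantile times the spread factor) is an assumption chosen for illustration. Only the column naming follows the wide format described under Returns.

```python
import random
from statistics import NormalDist


def sketch_quantile_rows(n_samples=5, n_horizons=3, quantiles=(0.1, 0.5, 0.9),
                         target_mean=50.0, target_stddev=10.0,
                         pred_bias=1.0, pred_spread_factor=1.5, seed=None):
    """Hypothetical illustration of the documented behaviour (not fusionlab code)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rows = []
    for _ in range(n_samples):
        row = {}
        for h in range(1, n_horizons + 1):
            # 'Actual' value drawn from a normal distribution.
            actual = rng.gauss(target_mean, target_stddev)
            # Median prediction is the actual value plus a systematic bias.
            median = actual + pred_bias
            row[f"target_h{h}"] = actual
            for q in quantiles:
                # ASSUMPTION: offset = spread factor * standard normal quantile.
                offset = pred_spread_factor * std_normal.inv_cdf(q)
                row[f"pred_q{int(round(q * 100))}_h{h}"] = median + offset
        rows.append(row)
    return rows


rows = sketch_quantile_rows(seed=1)
first = rows[0]
print(sorted(first))
```

Under this assumed formula the 0.5 quantile gets a zero offset, so the median prediction differs from the actual value by exactly pred_bias, and a larger pred_spread_factor pushes the outer quantiles further apart.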
Examples
>>> from fusionlab.datasets import make_quantile_prediction_data
>>> # Generate data as Bunch
>>> pred_bunch = make_quantile_prediction_data(n_samples=5, n_horizons=3, seed=1)
>>> print(pred_bunch.frame.head())
>>> print("Quantile columns for q=0.1:", pred_bunch.prediction_cols['q0.1'])
>>> # Generate data as DataFrame
>>> pred_df = make_quantile_prediction_data(as_frame=True, seed=2)
>>> pred_df.info()  # info() prints its summary directly and returns None
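Since the data is meant for comparing prediction intervals with actual outcomes, one common check is empirical coverage: the share of actual values that fall between a lower and an upper quantile prediction. The sketch below shows one way to compute it for a single horizon. The helper interval_coverage is hypothetical (not part of fusionlab), it assumes the wide-format column names described under Returns, and the toy values are hand-made rather than output from the function.

```python
def interval_coverage(columns, horizon, lower_q=10, upper_q=90):
    """Fraction of actuals inside the [lower, upper] quantile predictions.

    `columns` is any mapping from column name to a sequence of values,
    using names such as 'target_h1' and 'pred_q10_h1'.
    """
    actual = columns[f"target_h{horizon}"]
    lower = columns[f"pred_q{lower_q}_h{horizon}"]
    upper = columns[f"pred_q{upper_q}_h{horizon}"]
    hits = [lo <= a <= hi for a, lo, hi in zip(actual, lower, upper)]
    return sum(hits) / len(hits)


# Hand-made toy values for illustration only (not fusionlab output).
toy = {
    "target_h1": [48.0, 52.0, 61.0, 39.0],
    "pred_q10_h1": [45.0, 50.0, 50.0, 40.0],
    "pred_q90_h1": [55.0, 58.0, 60.0, 47.0],
}
coverage = interval_coverage(toy, horizon=1)
print(coverage)  # 0.5: two of the four actuals lie inside their interval
```

Because the helper only indexes by column name and iterates over the values, it should also accept a pandas DataFrame such as the one returned with as_frame=True. For a well-calibrated 10% to 90% interval, coverage would be expected to sit near 0.8.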