Evaluating and Visualizing Forecasts

Evaluating forecasts and visualizing them clearly are key to understanding how your forecasting models perform and to communicating their results. fusionlab-learn provides plotting utilities in fusionlab.plot.evaluation for this purpose. They operate on forecast DataFrames structured by format_predictions_to_dataframe().

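This guide demonstrates how to use the primary plotting functions:

  • plot_forecast_comparison(): compare predictions against actual values, over time or across space.

  • plot_metric_over_horizon(): track one or more metrics across the steps of the forecast horizon.

  • plot_metric_radar(): compare a single metric across categorical segments.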

Prerequisites

Ensure you have fusionlab-learn and its common dependencies installed. matplotlib is required for the visualizations, and the setup code below also imports TensorFlow and scikit-learn.

pip install fusionlab-learn matplotlib scikit-learn

Common Setup for Examples

The following imports and basic data generation will be used across the examples. We’ll simulate a forecast DataFrame that might be produced after running a model and formatting its output.

import numpy as np
import pandas as pd
import tensorflow as tf # For Tensor type hint if needed
import matplotlib.pyplot as plt
import os
import warnings

# FusionLab imports
from fusionlab.nn.utils import format_predictions_to_dataframe
from fusionlab.plot.evaluation import (
    plot_forecast_comparison,
    plot_metric_over_horizon,
    plot_metric_radar
)
# For dummy scaler and metrics if needed by plot functions
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error

# Suppress warnings and TF logs for cleaner output
warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.get_logger().setLevel('ERROR')
if hasattr(tf, 'autograph'):
    tf.autograph.set_verbosity(0)

# Directory for saving any output images from this guide
evaluation_plot_dir = "./evaluation_plots_output"
os.makedirs(evaluation_plot_dir, exist_ok=True)

print("Libraries imported and setup complete for evaluation plotting.")

# --- Generate Base Dummy Forecast Data ---
# This data will be used as input to format_predictions_to_dataframe
# to create the forecast_df for our plotting functions.
B, H, O_SINGLE, O_MULTI = 10, 6, 1, 2 # Batch, Horizon, OutputDims
Q_LIST_VIZ = [0.1, 0.5, 0.9]
N_Q_VIZ = len(Q_LIST_VIZ)
SAMPLES_VIZ = B # Number of sequences

np.random.seed(42) # For reproducibility
base_y_true_single = 50 + np.cumsum(
    np.random.randn(SAMPLES_VIZ, H, O_SINGLE) * 2, axis=1)
base_preds_point_single = base_y_true_single * \
    np.random.uniform(0.9, 1.1, size=base_y_true_single.shape) + \
    np.random.normal(0, 2, size=base_y_true_single.shape)

# For quantile, ensure median is somewhat centered, and bounds span it
base_preds_q_median = base_y_true_single * \
    np.random.uniform(0.95, 1.05, size=base_y_true_single.shape) + \
    np.random.normal(0, 1, size=base_y_true_single.shape)
interval_spread = np.abs(np.random.normal(2, 1, size=base_y_true_single.shape))
base_preds_q_lower = base_preds_q_median - interval_spread
base_preds_q_upper = base_preds_q_median + interval_spread

# Stack quantiles for single output: (Samples, Horizon, NumQuantiles)
base_preds_quant_single = np.stack([
    base_preds_q_lower, base_preds_q_median, base_preds_q_upper
], axis=-1).reshape(SAMPLES_VIZ, H, N_Q_VIZ)


# Create a sample forecast_df for point forecasts
forecast_df_point_viz = format_predictions_to_dataframe(
    predictions=base_preds_point_single.astype(np.float32),
    y_true_sequences=base_y_true_single.astype(np.float32),
    target_name="value",
    forecast_horizon=H,
    output_dim=O_SINGLE
)
# Add a segment column for radar plot example
forecast_df_point_viz['category'] = np.random.choice(
    ['CatA', 'CatB', 'CatC'], size=len(forecast_df_point_viz)
)
# Add spatial columns for spatial plot example
forecast_df_point_viz['longitude'] = np.tile(
    np.linspace(110, 111, SAMPLES_VIZ), H)
forecast_df_point_viz['latitude'] = np.tile(
    np.linspace(22, 23, SAMPLES_VIZ), H)


# Create a sample forecast_df for quantile forecasts
forecast_df_quant_viz = format_predictions_to_dataframe(
    predictions=base_preds_quant_single.astype(np.float32),
    y_true_sequences=base_y_true_single.astype(np.float32),
    target_name="value",
    quantiles=Q_LIST_VIZ,
    forecast_horizon=H,
    output_dim=O_SINGLE
)
forecast_df_quant_viz['category'] = np.random.choice(
    ['CatX', 'CatY', 'CatZ'], size=len(forecast_df_quant_viz)
)
forecast_df_quant_viz['longitude'] = np.tile(
    np.linspace(110, 111, SAMPLES_VIZ), H)
forecast_df_quant_viz['latitude'] = np.tile(
    np.linspace(22, 23, SAMPLES_VIZ), H)


print("Base data and sample DataFrames prepared for plotting examples.")

Expected Output (Common Setup):

Libraries imported and setup complete for evaluation plotting.
Base data and sample DataFrames prepared for plotting examples.
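
The plotting functions work on a long-format DataFrame with one row per sample and forecast step. The sketch below builds such a frame by hand with pandas so the layout is easy to inspect without running a model. The column names (sample_idx, forecast_step, value_pred, value_actual) are assumptions made for illustration, not a statement of the fusionlab-learn API; check the columns of your own forecast_df (for example with forecast_df_point_viz.columns) before relying on them.

```python
import numpy as np
import pandas as pd

# Hypothetical shapes: 3 samples, a horizon of 4 steps, one output dimension.
n_samples, horizon = 3, 4
rng = np.random.default_rng(0)
y_true = 50 + rng.normal(0, 2, size=(n_samples, horizon))
y_pred = y_true + rng.normal(0, 1, size=(n_samples, horizon))

# Long format: one row per (sample, step). Column names are assumed
# for this example, not taken from the fusionlab-learn API.
toy_df = pd.DataFrame({
    "sample_idx": np.repeat(np.arange(n_samples), horizon),
    "forecast_step": np.tile(np.arange(1, horizon + 1), n_samples),
    "value_pred": y_pred.ravel(),      # row-major: sample 0 first
    "value_actual": y_true.ravel(),
})

print(toy_df.head(horizon))  # all steps of the first sample
```

Keeping the frame in long format means extra columns such as a category or coordinates can be attached per row, which is what the setup above does for the radar and spatial examples.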

1. Visualizing Forecast Comparisons (plot_forecast_comparison)

API Reference:

plot_forecast_comparison()

This function is your primary tool for visually comparing model predictions against actual values. It supports both temporal line plots (showing forecasts over the horizon for specific samples) and spatial scatter plots (showing forecast values across geographical coordinates for a specific horizon step).

Key Use Cases:

  • Temporal Point Forecasts: Plot actual vs. predicted lines for selected time series samples.

  • Temporal Quantile Forecasts: Plot actuals, the median prediction, and the uncertainty interval (e.g., between 10th and 90th quantiles).

  • Spatial Forecasts: Visualize predicted values (e.g., median for quantiles) on a map for a specific forecast step.

Example 1.1: Temporal Point Forecast Visualization

print("\nPlotting Temporal Point Forecast Comparison...")
plot_forecast_comparison(
    forecast_df=forecast_df_point_viz,
    target_name="value",
    kind="temporal",
    sample_ids="first_n", # Plot for the first N samples
    num_samples=2,        # Plot for 2 samples
    max_cols=1,           # Each sample in its own row
    figsize_per_subplot=(10, 4),
    verbose=0
)
# To save:
# fig_path = os.path.join(evaluation_plot_dir, "eval_temporal_point.png")
# plt.savefig(fig_path) # Call before plt.show() if saving

Expected Plot 1.1:

Temporal Point Forecast Comparison

Line plot showing actual vs. predicted values over the forecast horizon for selected samples (point forecast).

Example 1.2: Temporal Quantile Forecast Visualization

print("\nPlotting Temporal Quantile Forecast Comparison...")
plot_forecast_comparison(
    forecast_df=forecast_df_quant_viz,
    target_name="value",
    quantiles=Q_LIST_VIZ,
    kind="temporal",
    sample_ids=[0, 1], # Plot for specific sample_idx 0 and 1
    max_cols=2,
    figsize_per_subplot=(9, 4.5),
    verbose=0
)
# To save:
# fig_path = os.path.join(evaluation_plot_dir, "eval_temporal_quantile.png")
# plt.savefig(fig_path)

Expected Plot 1.2:

Temporal Quantile Forecast Comparison

Line plot showing actual values, median prediction, and the prediction interval for selected samples (quantile forecast).
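
An interval plot is only meaningful when the quantile predictions are ordered, that is, the lower bound does not exceed the median and the median does not exceed the upper bound. A trained model can violate this ("quantile crossing"). The NumPy sketch below checks the ordering and applies one simple repair, sorting along the quantile axis. The array and its values are hypothetical, and this check is offered as a pre-plotting sanity step, not as something plot_forecast_comparison() is known to do.

```python
import numpy as np

# Hypothetical predictions with shape (n_samples, horizon, n_quantiles);
# the last axis is ordered as quantiles [0.1, 0.5, 0.9].
preds = np.array([
    [[ 8.0, 10.0, 12.0], [9.0, 11.0, 13.0]],   # well ordered
    [[10.5, 10.0, 12.0], [7.0,  9.0,  8.5]],   # both steps cross
])

# A step is well ordered if values never decrease along the quantile axis.
is_ordered = np.all(np.diff(preds, axis=-1) >= 0, axis=-1)
print(is_ordered)

# Simple repair: sort each step's quantile values in ascending order.
fixed = np.sort(preds, axis=-1)
print(fixed[1])
```

Sorting keeps the same set of values at every step and only reassigns them to quantile levels, so it never widens or narrows the overall range.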

Example 1.3: Spatial Quantile Forecast Visualization

This requires the columns named in spatial_cols (e.g., 'longitude' and 'latitude') to be present in forecast_df.

print("\nPlotting Spatial Quantile Forecast Comparison...")
plot_forecast_comparison(
    forecast_df=forecast_df_quant_viz,
    target_name="value",
    quantiles=Q_LIST_VIZ,
    kind="spatial",
    horizon_steps=1, # Visualize the first step of the horizon
    spatial_cols=['longitude', 'latitude'],
    figsize_per_subplot=(7, 6), # Single plot, so this is figure size
    verbose=0
)
# To save:
# fig_path = os.path.join(evaluation_plot_dir, "eval_spatial_quantile.png")
# plt.savefig(fig_path)

Expected Plot 1.3:

Spatial Quantile Forecast Comparison

Scatter plot showing the median predicted values across spatial coordinates for a specific forecast horizon step.

2. Visualizing Metrics Over the Forecast Horizon (plot_metric_over_horizon)

API Reference:

plot_metric_over_horizon()

This function helps understand how a model’s performance, measured by one or more metrics, changes as the forecast lead time increases. It’s useful for identifying if a model’s accuracy degrades significantly for longer horizons.

Key Use Cases:

  • Plotting MAE, RMSE, MAPE, etc., for each step of the horizon.

  • For quantile forecasts, plotting coverage or pinball loss over the horizon.

  • Comparing horizon-wise metrics across different segments if group_by_cols is used.
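
As a minimal sketch of what a horizon-wise metric means, the following NumPy snippet computes MAE separately for each forecast step by averaging absolute errors over samples only. It illustrates the idea and is not the internal implementation of plot_metric_over_horizon(); the array names and the toy error model (noise that grows with lead time) are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, horizon = 200, 6

# Toy data: the noise scale grows with lead time, so later steps are harder.
y_true = 50 + np.cumsum(rng.normal(0, 2, size=(n_samples, horizon)), axis=1)
noise_scale = np.linspace(0.5, 3.0, horizon)   # one scale per step
y_pred = y_true + rng.normal(0, 1, size=y_true.shape) * noise_scale

# Average over the sample axis (axis=0) to keep one value per step.
abs_err = np.abs(y_true - y_pred)
mae_per_step = abs_err.mean(axis=0)            # shape: (horizon,)
overall_mae = abs_err.mean()                   # single number

for step, mae in enumerate(mae_per_step, start=1):
    print(f"step {step}: MAE = {mae:.3f}")
```

A single overall MAE hides this profile. Because every step has the same number of samples here, the overall value is just the mean of the per-step values.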

Example 2.1: MAE of Point Forecasts Over Horizon

print("\nPlotting MAE of Point Forecast Over Horizon...")
plot_metric_over_horizon(
    forecast_df=forecast_df_point_viz,
    target_name="value",
    metrics='mae', # Calculate Mean Absolute Error
    plot_kind='bar', # Display as a bar chart
    figsize_per_subplot=(8, 5),
    verbose=0
)
# To save:
# fig_path = os.path.join(evaluation_plot_dir, "eval_moh_mae_point.png")
# plt.savefig(fig_path)

Expected Plot 2.1:

MAE of Point Forecast Over Horizon

Bar chart showing Mean Absolute Error for each step of the forecast horizon.

Example 2.2: Coverage of Quantile Forecasts Over Horizon (Grouped)

# Ensure coverage_score is available for this example

print("\nPlotting Coverage of Quantile Forecast Over Horizon (Grouped)...")
plot_metric_over_horizon(
    forecast_df=forecast_df_quant_viz,
    target_name="value",
    metrics='coverage',
    quantiles=Q_LIST_VIZ, # Required for coverage
    group_by_cols=['category'], # Show coverage per category
    plot_kind='line',
    figsize_per_subplot=(9, 5),
    verbose=0
)
# To save:
# fig_path = os.path.join(evaluation_plot_dir, "eval_moh_coverage_quant.png")
# plt.savefig(fig_path)

Expected Plot 2.2:

Coverage of Quantile Forecast Over Horizon

Line plot showing prediction interval coverage for each forecast step, potentially with separate lines for different categories.
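
Coverage is the fraction of actual values that fall inside the prediction interval. The NumPy sketch below computes it per forecast step, assuming the interval runs from the lowest to the highest quantile (0.1 and 0.9 in this guide, a nominal coverage of 80%). The function name empirical_coverage is hypothetical and not part of fusionlab-learn, and the library's own coverage metric may differ in detail.

```python
import numpy as np

def empirical_coverage(y_true, lower, upper):
    """Fraction of actuals inside [lower, upper] for each forecast step.

    All inputs have shape (n_samples, horizon).
    """
    inside = (y_true >= lower) & (y_true <= upper)
    return inside.mean(axis=0)

# Tiny hand-checkable example: 4 samples, 2 forecast steps.
y_true = np.array([[10.0, 20.0],
                   [11.0, 25.0],
                   [ 9.0, 18.0],
                   [12.0, 30.0]])
lower = np.tile([9.5, 19.0], (4, 1))    # same bounds for every sample
upper = np.tile([12.5, 26.0], (4, 1))

coverage = empirical_coverage(y_true, lower, upper)
print(coverage)  # step 1: 3 of 4 inside; step 2: 2 of 4 inside
```

Empirical coverage well below the nominal level suggests intervals that are too narrow, while coverage well above it suggests intervals that are wider than needed.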

3. Visualizing Metrics Across Segments with Radar Plots (plot_metric_radar)

API Reference:

plot_metric_radar()

Radar charts provide a way to compare a single performance metric across different categorical segments (e.g., item types, regions, months). Each segment forms an axis on the radar, and the metric’s value for that segment is plotted along it.

Key Use Cases:

  • Comparing model performance (e.g., MAE, RMSE) across different product categories.

  • Identifying if a model performs consistently across different days of the week or months.
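
The quantity behind a radar chart is one metric value per segment. The pandas sketch below computes MAE per category and the evenly spaced angles at which each category's axis would sit. The column names and data are hypothetical, and this is not necessarily how plot_metric_radar() is implemented.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format forecasts with a segment column.
df = pd.DataFrame({
    "category":     ["CatA", "CatA", "CatB", "CatB", "CatC", "CatC"],
    "value_actual": [10.0, 12.0, 20.0, 22.0, 30.0, 33.0],
    "value_pred":   [11.0, 11.0, 23.0, 21.0, 30.0, 31.0],
})

# One MAE value per segment.
abs_err = (df["value_actual"] - df["value_pred"]).abs()
mae_by_segment = abs_err.groupby(df["category"]).mean()

# One axis per segment, spread evenly around the circle (in radians).
angles = np.linspace(0, 2 * np.pi, len(mae_by_segment), endpoint=False)

for name, angle, mae in zip(mae_by_segment.index, angles, mae_by_segment):
    print(f"{name}: angle = {np.degrees(angle):.0f} deg, MAE = {mae:.2f}")
```

With only two or three segments a radar chart degenerates into a line or a triangle, so a bar chart of the same per-segment values is often easier to read in that case.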

Example 3.1: MAE of Median Forecast by Category (Radar Plot)

print("\nPlotting MAE by Category (Radar Plot)...")
# Using forecast_df_quant_viz which has a 'category' column
plot_metric_radar(
    forecast_df=forecast_df_quant_viz,
    segment_col='category', # Column defining the radar axes
    metric='mae',
    target_name="value",
    quantiles=Q_LIST_VIZ, # MAE will be on the median
    figsize=(7, 7),
    verbose=0
)
# To save:
# fig_path = os.path.join(evaluation_plot_dir, "eval_radar_mae_category.png")
# plt.savefig(fig_path)

Expected Plot 3.1:

MAE by Category Radar Plot

Radar chart showing the Mean Absolute Error (of the median forecast) for different categories.

Further Exploration

These examples provide a starting point for visualizing your fusionlab-learn model outputs. For a detailed understanding of the metrics themselves, including their mathematical formulations and calculation examples, please refer to the Metrics for Forecasting Evaluation page.

Experiment with different parameters of these plotting functions to customize the visualizations for your specific analysis needs.