Exercise: Hybrid Forecasting with PIHALNet

Welcome to this hands-on exercise for the Physics-Informed Hybrid Attentive LSTM Network, PIHALNet. This tutorial will guide you through the end-to-end process of training a hybrid model that learns from both time series data and the governing laws of physics.

We will tackle a multi-step forecasting problem for land subsidence and groundwater levels. The primary goal is to demonstrate how to prepare the specialized input data required by PIHALNet and how to configure its unique, composite loss function.

Learning Objectives:

  • Generate a synthetic dataset with features, spatio-temporal coordinates, and two physically-linked target variables.

  • Structure the inputs and targets into the required nested dictionary format.

  • Instantiate PIHALNet and use architecture_config to customize its internal structure.

  • Configure the model to treat a physical coefficient as a learnable parameter, to be discovered during training.

  • Compile the model with both a data-fidelity loss and a weight for the physics-based loss (\(\lambda_{physics}\), passed to compile as lambda_pde in this exercise).

  • Train the model and interpret the multi-component loss from the training logs.

  • Visualize both the training history and the final forecast results.

Let’s get started!

Prerequisites

Ensure you have fusionlab-learn and its common dependencies installed.

pip install fusionlab-learn matplotlib scikit-learn

Step 1: Imports and Setup

First, we import all necessary libraries and set up our environment for reproducibility and clean output.

import os
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# FusionLab imports
from fusionlab.nn.pinn import PIHALNet
from fusionlab.params import LearnableC
from fusionlab.nn.models.utils import plot_history_in

# Suppress warnings and TF logs for cleaner output
import warnings
warnings.filterwarnings('ignore')
tf.get_logger().setLevel('ERROR')

# Directory for saving any output images
EXERCISE_OUTPUT_DIR = "./pihalnet_exercise_outputs"
os.makedirs(EXERCISE_OUTPUT_DIR, exist_ok=True)

print("Libraries imported and setup complete for PIHALNet exercise.")

Expected Output:

Libraries imported and setup complete for PIHALNet exercise.

Step 2: Generate Synthetic Hybrid Data

This is the most critical step. PIHALNet requires a dataset that contains both standard time series features and spatio-temporal coordinates (t, x, y). We will generate a dataset where the two targets, groundwater level h and subsidence s, are derived from known analytical functions of those coordinates, so they are smooth and physically plausible.

# Configuration
N_SAMPLES   = 1000
PAST_STEPS  = 15
HORIZON     = 7
SEED        = 42
np.random.seed(SEED)
tf.random.set_seed(SEED)

# --- 1. Generate Spatio-Temporal Coordinates ---
# Separate past vs. future time grids
t_past   = tf.random.uniform((N_SAMPLES, PAST_STEPS, 1), 0, 5)    # (1000, 15, 1)
t_future = tf.random.uniform((N_SAMPLES, HORIZON,    1), 0, 5)    # (1000,  7, 1)
x_past   = tf.random.uniform((N_SAMPLES, PAST_STEPS, 1), -1, 1)   # (1000, 15, 1)
x_future = tf.random.uniform((N_SAMPLES, HORIZON,    1), -1, 1)   # (1000,  7, 1)
y_past   = tf.random.uniform((N_SAMPLES, PAST_STEPS, 1), -1, 1)   # (1000, 15, 1)
y_future = tf.random.uniform((N_SAMPLES, HORIZON,    1), -1, 1)   # (1000,  7, 1)

coords_past   = tf.concat([t_past,   x_past,   y_past],   axis=-1)  # (1000, 15, 3)
coords_future = tf.concat([t_future, x_future, y_future], axis=-1)  # (1000,  7, 3)

# --- 2. Generate Physically-Plausible Targets ---
# Use future coords for targets
# h_true shape: (1000, 7, 1)
h_true = tf.sin(np.pi * x_future) * tf.cos(np.pi * y_future) * tf.exp(-0.2 * t_future)
# s_true shape: (1000, 7, 1)
s_true = (
    (1 - tf.exp(-0.2 * t_future)) * tf.cos(np.pi * x_future)**2
    + h_true * 0.1
    + tf.random.normal(h_true.shape, stddev=0.05)
)

# --- 3. Generate Correlated Time Series Features ---
static_features = tf.random.normal([N_SAMPLES, 2])               # (1000, 2)

# Dynamic features correlated with physics (using past window)
dyn_physics = tf.sin(t_past * 2)                                 # (1000, 15, 1)
dyn_noise   = tf.random.normal([N_SAMPLES, PAST_STEPS, 4])       # (1000, 15, 4)
dynamic_features = tf.concat([dyn_physics, dyn_noise], axis=-1)  # (1000, 15, 5)

# Future features (e.g., pumping schedule + noise)
fut_schedule = tf.cast(t_future > 2.5, tf.float32)               # (1000,  7, 1)
fut_noise    = tf.random.normal([N_SAMPLES, HORIZON, 2])         # (1000,  7, 2)
future_features = tf.concat([fut_schedule, fut_noise], axis=-1)  # (1000,  7, 3)

print(f"Generated data with {N_SAMPLES} samples.")

Expected Output:

Generated data with 1000 samples.
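
As a quick sanity check on the analytical targets, the noise-free formulas can be evaluated at a single point in plain Python. This is only a sketch: the helper names h_analytic and s_analytic are hypothetical and are not part of fusionlab. They mirror the expressions used for h_true and s_true above, without the random noise term.

```python
import math

def h_analytic(t, x, y):
    """Noise-free groundwater level, same expression as h_true."""
    return math.sin(math.pi * x) * math.cos(math.pi * y) * math.exp(-0.2 * t)

def s_analytic(t, x, y):
    """Noise-free subsidence: s_true without the Gaussian noise term."""
    growth = (1.0 - math.exp(-0.2 * t)) * math.cos(math.pi * x) ** 2
    return growth + 0.1 * h_analytic(t, x, y)

# At t = 0 the time-dependent growth term vanishes, so s reduces to 0.1 * h.
h0 = h_analytic(0.0, 0.5, 0.0)
s0 = s_analytic(0.0, 0.5, 0.0)
print(h0, s0)
```

At t = 0, x = 0.5, y = 0 the groundwater term is at its maximum of 1 and subsidence is one tenth of it. This confirms that h feeds into s only through the small 0.1 coupling, while the larger part of s grows with time.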

Step 3: Structure Inputs and Targets

We now assemble the generated data into the nested dictionary format required by PIHALNet for both its inputs and targets, and then we create a training and validation split.

# Input dictionary for the model
inputs = {
    "static_features": static_features,
    "dynamic_features": dynamic_features,
    "future_features": future_features,
    # The crucial PINN component. Step 2 defines coords_past and coords_future
    # (there is no plain `coords`); the targets were generated from the future
    # coordinates, so those are passed here.
    "coords": coords_future,  # (1000, 7, 3)
}

# Target dictionary for the model
targets = {
    "subs_pred": s_true,
    "gwl_pred": h_true,
}

# Create a validation split (80% train, 20% validation)
val_split = int(N_SAMPLES * 0.8)
train_inputs = {k: v[:val_split] for k, v in inputs.items()}
val_inputs = {k: v[val_split:] for k, v in inputs.items()}
train_targets = {k: v[:val_split] for k, v in targets.items()}
val_targets = {k: v[val_split:] for k, v in targets.items()}

print("Data structured into training and validation sets.")
print(f"Number of training samples: {len(train_inputs['static_features'])}")
print(f"Number of validation samples: {len(val_inputs['static_features'])}")

Expected Output:

Data structured into training and validation sets.
Number of training samples: 800
Number of validation samples: 200
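
The slicing above assumes that every entry in the dictionaries has the same number of samples along its first axis. A small guard like the following can catch mismatches before training starts. It is a sketch, not part of fusionlab: split_dict is a hypothetical helper, and the toy lists stand in for the real tensors.

```python
def split_dict(data, train_fraction=0.8):
    """Split every entry of a dict along its first axis.

    Works with anything that supports len() and slicing
    (lists, NumPy arrays, eager TensorFlow tensors).
    """
    sizes = {key: len(value) for key, value in data.items()}
    if len(set(sizes.values())) != 1:
        raise ValueError(f"Inconsistent sample counts: {sizes}")
    n_samples = next(iter(sizes.values()))
    cut = int(n_samples * train_fraction)
    train = {key: value[:cut] for key, value in data.items()}
    val = {key: value[cut:] for key, value in data.items()}
    return train, val

# Toy stand-ins for the real inputs: 10 samples each.
toy_inputs = {
    "static_features": [[0.0, 1.0]] * 10,
    "coords": [[[0.0, 0.0, 0.0]]] * 10,
}
toy_train, toy_val = split_dict(toy_inputs)
print(len(toy_train["coords"]), len(toy_val["coords"]))
```

The same helper could be applied to both the inputs and the targets dictionaries, so that a shape mistake surfaces as a clear error rather than a failure inside fit.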

Step 4: Define, Compile, and Train PIHALNet

We will now instantiate PIHALNet. We will use the architecture_config to define a custom internal structure and configure the model to treat the physical coefficient \(C\) as a learnable parameter. The compilation step is key, as we must provide both the data losses and the weight for the physics loss, \(\lambda_{physics}\), which is passed to compile as lambda_pde.
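
To build intuition for what a "learnable coefficient" means, here is a toy sketch in plain Python. It is not how LearnableC or PIHALNet is implemented. The function fit_coefficient and the linear relation y = C·x are assumptions made purely for illustration. The idea is only that a scalar starts from an initial guess (0.01, echoing the initial_value used in this exercise) and is adjusted by gradient descent until a residual is small.

```python
def fit_coefficient(xs, ys, initial_value=0.01, lr=0.1, steps=200):
    """Recover c in y = c * x by gradient descent on the mean squared residual."""
    c = initial_value
    n = len(xs)
    for _ in range(steps):
        # d/dc mean((c*x - y)^2) = 2 * mean((c*x - y) * x)
        grad = 2.0 * sum((c * x - y) * x for x, y in zip(xs, ys)) / n
        c -= lr * grad
    return c

# Toy data generated with a "true" coefficient of 0.25 (hypothetical value).
xs = [0.5, 1.0, 1.5, 2.0]
ys = [0.25 * x for x in xs]
c_hat = fit_coefficient(xs, ys)
print(f"recovered coefficient: {c_hat:.4f}")
```

In the actual model, the coefficient is updated together with the network weights, driven by the physics term of the loss. The PIHALNet setup for this exercise follows.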

# Define a custom architecture for the data-driven core
pinn_architecture = {
    'encoder_type': 'transformer',
    'feature_processing': 'dense',
    'decoder_attention_stack': ['cross', 'hierarchical']
}

# Instantiate the model
model = PIHALNet(
    static_input_dim=static_features.shape[-1],
    dynamic_input_dim=dynamic_features.shape[-1],
    future_input_dim=future_features.shape[-1],
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    mode='pihal_like',
    architecture_config=pinn_architecture,
    # Ask the model to discover the consolidation coefficient
    pinn_coefficient_C=LearnableC(initial_value=0.01)
)

# Compile the model with the composite loss
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss={'subs_pred': 'mse', 'gwl_pred': 'mse'},  # Data losses
    lambda_pde=0.2  # Weight for the consolidation physics
)

# Train the model
print("\nStarting PIHALNet training...")
history = model.fit(
    train_inputs,
    train_targets,
    validation_data=(val_inputs, val_targets),
    epochs=10,
    batch_size=64,
    verbose=1
)
print("Training complete.")

Expected Output:

Starting PIHALNet training...
Epoch 1/10
13/13 [==============================] - 7s 72ms/step - loss: 3.9772 - gwl_pred_loss: 1.0878 - subs_pred_loss: 2.8894 - total_loss: 367.5720 - data_loss: 3.9803 - physics_loss: 1817.9586 - val_loss: 0.7371 - val_gwl_pred_loss: 0.7371 - val_subs_pred_loss: 0.0000e+00
Epoch 2/10
13/13 [==============================] - 0s 16ms/step - loss: 3.1833 - gwl_pred_loss: 0.7940 - subs_pred_loss: 2.3893 - total_loss: 1825.8313 - data_loss: 3.0913 - physics_loss: 9113.6998 - val_loss: 1.1084 - val_gwl_pred_loss: 1.1084 - val_subs_pred_loss: 0.0000e+00
...
Epoch 10/10
13/13 [==============================] - 0s 15ms/step - loss: 0.8129 - gwl_pred_loss: 0.6270 - subs_pred_loss: 0.1859 - total_loss: 10.7931 - data_loss: 0.8096 - physics_loss: 49.9174 - val_loss: 0.6465 - val_gwl_pred_loss: 0.6465 - val_subs_pred_loss: 0.0000e+00
Training complete.
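
The logged values are consistent with a total loss of the form total_loss = data_loss + lambda_pde × physics_loss, with lambda_pde = 0.2 as set in compile. The short check below rebuilds the logged total_loss for epochs 1, 2, and 10 from the printed components. The formula is inferred from these numbers rather than taken from the library's source, so treat it as an assumption.

```python
LAMBDA_PDE = 0.2  # weight passed to model.compile in this exercise

# (epoch, data_loss, physics_loss, logged total_loss), copied from the log above
logged = [
    (1, 3.9803, 1817.9586, 367.5720),
    (2, 3.0913, 9113.6998, 1825.8313),
    (10, 0.8096, 49.9174, 10.7931),
]

def composite_loss(data_loss, physics_loss, weight=LAMBDA_PDE):
    """Assumed form of the total loss: data term plus weighted physics term."""
    return data_loss + weight * physics_loss

for epoch, data_loss, physics_loss, total in logged:
    rebuilt = composite_loss(data_loss, physics_loss)
    # Fraction of the total that comes from the weighted physics term.
    physics_share = LAMBDA_PDE * physics_loss / rebuilt
    print(f"epoch {epoch:>2}: rebuilt={rebuilt:.4f} logged={total:.4f} "
          f"physics share={physics_share:.1%}")
```

In this run the weighted physics term accounts for more than 90% of the total at all three logged epochs. Keep that in mind when tuning lambda_pde: a smaller weight would shift the optimizer's attention toward the data losses.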

Step 5: Visualize Training History

We can use the plot_history_in utility to view the different components of our composite loss, which helps in diagnosing how the model balanced the data and physics objectives during training.

metrics_to_plot = {
    "Loss Breakdown": ["total_loss", "data_loss", "physics_loss"],
    "Subsidence Loss": ["subs_pred_loss"],
    "GWL Loss": ["gwl_pred_loss"],
}
plot_history_in(
    history,
    metrics=metrics_to_plot,
    title="PIHALNet Loss Components During Training",
    max_cols=3
)

Expected Plot:

PIHALNet Training History

The plot shows three subplots: one for the composite loss breakdown, and two for the individual data losses for subsidence and groundwater level predictions.

Step 6: Visualize the Forecast

Finally, we’ll make predictions on the validation set and plot the forecasted subsidence against the actual values for a few samples.

# Make predictions on the validation set
val_predictions = model.predict(val_inputs)
# Predictions are a dict; get the one for subsidence
s_preds = val_predictions['subs_pred']
s_actuals = val_targets['subs_pred']

# --- Visualization ---
plt.figure(figsize=(14, 7))
# Plot the forecast for the first 4 validation samples
for i in range(4):
    plt.plot(s_actuals[i, :, 0],
             label=f'Actual Sample {i+1}', linestyle='--', marker='o')
    plt.plot(s_preds[i, :, 0],
             label=f'Predicted Sample {i+1}', linestyle='-', marker='x')

plt.title('Subsidence Forecast vs. Actuals (Validation Set)')
plt.xlabel(f'Forecast Step (Horizon = {HORIZON} steps)')
plt.ylabel('Normalized Subsidence')
plt.legend(ncol=2)
plt.grid(True, linestyle=':')
plt.tight_layout()
plt.show()

Expected Plot:

PIHALNet Forecast Results

A plot comparing the model’s multi-step forecasts for land subsidence against the true values for several validation samples.
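
Alongside the plot, a per-step error summary can make it easier to see whether accuracy degrades toward the end of the horizon. The sketch below assumes predictions and targets are arrays of shape (samples, horizon, 1), as in this exercise. The helper per_step_mae is hypothetical, and the toy arrays stand in for s_actuals and s_preds.

```python
import numpy as np

def per_step_mae(actual, predicted):
    """Mean absolute error per forecast step for (samples, horizon, 1) arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError(f"Shape mismatch: {actual.shape} vs {predicted.shape}")
    # Average over samples (axis 0) and output channels (axis 2).
    return np.mean(np.abs(predicted - actual), axis=(0, 2))

# Toy data: 4 samples, horizon of 7, one output channel.
rng = np.random.default_rng(0)
toy_actual = rng.normal(size=(4, 7, 1))
toy_pred = toy_actual + 0.1  # constant offset, so every step has an MAE of 0.1
mae = per_step_mae(toy_actual, toy_pred)
print(mae.shape)
```

With the real outputs, the call would be per_step_mae(s_actuals, s_preds), assuming np.asarray can convert the eager tensors, which is normally the case.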

Discussion of Exercise

Congratulations! You have successfully trained a sophisticated hybrid physics-data model. In this exercise, you have learned how to:

  • Create a complex dataset with both time series features and spatio-temporal coordinates.

  • Structure data into the dictionary format required by PIHALNet.

  • Use the architecture_config to customize the model’s powerful data-driven core.

  • Compile and train the model with a composite loss function, effectively balancing data accuracy and physical consistency.

This workflow combines data-driven forecasting with physical constraints, an approach from scientific machine learning that can help models stay reliable even in data-scarce environments.