Exercise: Hybrid Forecasting with PIHALNet
Welcome to this hands-on exercise for the Physics-Informed Hybrid
Attentive LSTM Network, PIHALNet. This
tutorial will guide you through the end-to-end process of training
a hybrid model that learns from both time series data and the
governing laws of physics.
We will tackle a multi-step forecasting problem for land subsidence
and groundwater levels. The primary goal is to demonstrate how to
prepare the specialized input data required by PIHALNet and how
to configure its unique, composite loss function.
Learning Objectives:
Generate a synthetic dataset with features, spatio-temporal coordinates, and two physically-linked target variables.
Structure the inputs and targets into the required nested dictionary format.
Instantiate PIHALNet using architecture_config to customize its internal structure.
Configure the model to treat a physical coefficient as a learnable parameter, to be discovered during training.
Compile the model with both a data-fidelity loss and a weight for the physics-based loss (\(\lambda_{physics}\)).
Train the model and interpret the multi-component loss from the training logs.
Visualize both the training history and the final forecast results.
Let’s get started!
Prerequisites
Ensure you have fusionlab-learn and its common dependencies
installed.
pip install fusionlab-learn matplotlib scikit-learn
Step 1: Imports and Setup
First, we import all necessary libraries and set up our environment for reproducibility and clean output.
import os
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# FusionLab imports
from fusionlab.nn.pinn import PIHALNet
from fusionlab.params import LearnableC
from fusionlab.nn.models.utils import plot_history_in

# Suppress warnings and TF logs for cleaner output
import warnings
warnings.filterwarnings('ignore')
tf.get_logger().setLevel('ERROR')

# Directory for saving any output images
EXERCISE_OUTPUT_DIR = "./pihalnet_exercise_outputs"
os.makedirs(EXERCISE_OUTPUT_DIR, exist_ok=True)

print("Libraries imported and setup complete for PIHALNet exercise.")
Expected Output:
Libraries imported and setup complete for PIHALNet exercise.
Step 2: Generate Synthetic Hybrid Data
This is the most critical step. PIHALNet requires a dataset that
contains both standard time series features and spatio-temporal
coordinates. We will generate a dataset where the targets (h and s)
are based on a known analytical function, ensuring they are physically
plausible.
# Configuration
N_SAMPLES = 1000
PAST_STEPS = 15
HORIZON = 7
SEED = 42
np.random.seed(SEED)
tf.random.set_seed(SEED)

# --- 1. Generate Spatio-Temporal Coordinates ---
# Separate past vs. future time grids
t_past = tf.random.uniform((N_SAMPLES, PAST_STEPS, 1), 0, 5)    # (1000, 15, 1)
t_future = tf.random.uniform((N_SAMPLES, HORIZON, 1), 0, 5)     # (1000, 7, 1)
x_past = tf.random.uniform((N_SAMPLES, PAST_STEPS, 1), -1, 1)   # (1000, 15, 1)
x_future = tf.random.uniform((N_SAMPLES, HORIZON, 1), -1, 1)    # (1000, 7, 1)
y_past = tf.random.uniform((N_SAMPLES, PAST_STEPS, 1), -1, 1)   # (1000, 15, 1)
y_future = tf.random.uniform((N_SAMPLES, HORIZON, 1), -1, 1)    # (1000, 7, 1)

coords_past = tf.concat([t_past, x_past, y_past], axis=-1)          # (1000, 15, 3)
coords_future = tf.concat([t_future, x_future, y_future], axis=-1)  # (1000, 7, 3)

# --- 2. Generate Physically-Plausible Targets ---
# Use future coords for targets; h_true and s_true both have shape (1000, 7, 1)
h_true = tf.sin(np.pi * x_future) * tf.cos(np.pi * y_future) * tf.exp(-0.2 * t_future)
s_true = (
    (1 - tf.exp(-0.2 * t_future)) * tf.cos(np.pi * x_future)**2
    + h_true * 0.1
    + tf.random.normal(h_true.shape, stddev=0.05)  # observation noise
)

# --- 3. Generate Correlated Time Series Features ---
static_features = tf.random.normal([N_SAMPLES, 2])  # (1000, 2)

# Dynamic features correlated with physics (using past window)
dyn_physics = tf.sin(t_past * 2)                                  # (1000, 15, 1)
dyn_noise = tf.random.normal([N_SAMPLES, PAST_STEPS, 4])          # (1000, 15, 4)
dynamic_features = tf.concat([dyn_physics, dyn_noise], axis=-1)   # (1000, 15, 5)

# Future features (e.g., pumping schedule + noise)
fut_schedule = tf.cast(t_future > 2.5, tf.float32)                # (1000, 7, 1)
fut_noise = tf.random.normal([N_SAMPLES, HORIZON, 2])             # (1000, 7, 2)
future_features = tf.concat([fut_schedule, fut_noise], axis=-1)   # (1000, 7, 3)

print(f"Generated data with {N_SAMPLES} samples.")
Expected Output:
Generated data with 1000 samples.
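As a quick sanity check on the synthetic targets, the sketch below re-evaluates the two analytical expressions from Step 2 with NumPy only and without the noise term. The helper names `gwl_field` and `subsidence_field` are illustrative and not part of fusionlab, and the random draws differ from the TensorFlow ones above (same distributions, different generator).

```python
import numpy as np

def gwl_field(t, x, y):
    """Noise-free groundwater level h(t, x, y) from Step 2."""
    return np.sin(np.pi * x) * np.cos(np.pi * y) * np.exp(-0.2 * t)

def subsidence_field(t, x, y):
    """Noise-free subsidence s(t, x, y): the Step 2 formula minus the noise term."""
    return (1 - np.exp(-0.2 * t)) * np.cos(np.pi * x) ** 2 + 0.1 * gwl_field(t, x, y)

# Same sampling ranges and shapes as the future coordinates in Step 2
rng = np.random.default_rng(42)
n_samples, horizon = 1000, 7
t = rng.uniform(0, 5, size=(n_samples, horizon, 1))
x = rng.uniform(-1, 1, size=(n_samples, horizon, 1))
y = rng.uniform(-1, 1, size=(n_samples, horizon, 1))

h = gwl_field(t, x, y)
s = subsidence_field(t, x, y)

print("h shape:", h.shape, "range:", float(h.min()), float(h.max()))
print("s shape:", s.shape, "range:", float(s.min()), float(s.max()))
```

Because t lies in [0, 5], the noise-free h stays within [-1, 1] and the noise-free s within roughly [-0.1, 0.74], so both targets are already on a comparable scale before the 0.05 standard-deviation noise is added to s.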
Step 3: Structure Inputs and Targets
We now assemble the generated data into the nested dictionary format required by PIHALNet for both its inputs and targets, and then we create a training and validation split.
# Input dictionary for the model
inputs = {
    "static_features": static_features,
    "dynamic_features": dynamic_features,
    "future_features": future_features,
    # The crucial PINN component: the future coordinates from Step 2,
    # which are the points at which the targets were evaluated
    "coords": coords_future,
}

# Target dictionary for the model
targets = {
    "subs_pred": s_true,
    "gwl_pred": h_true,
}

# Create a validation split (80% train, 20% validation)
val_split = int(N_SAMPLES * 0.8)
train_inputs = {k: v[:val_split] for k, v in inputs.items()}
val_inputs = {k: v[val_split:] for k, v in inputs.items()}
train_targets = {k: v[:val_split] for k, v in targets.items()}
val_targets = {k: v[val_split:] for k, v in targets.items()}

print("Data structured into training and validation sets.")
print(f"Number of training samples: {len(train_inputs['static_features'])}")
print(f"Number of validation samples: {len(val_inputs['static_features'])}")
Expected Output:
Data structured into training and validation sets.
Number of training samples: 800
Number of validation samples: 200
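Since every entry of the input and target dictionaries is sliced at the same index, a mismatch in the leading (batch) dimension would silently misalign samples. The following is a minimal sketch of a guard for that, using NumPy stand-in arrays with the shapes from this exercise. The helpers `check_batch_dim` and `split_dict` are hypothetical names introduced here, not fusionlab utilities.

```python
import numpy as np

def check_batch_dim(arrays):
    """Return the shared leading (batch) size, or raise if the arrays disagree."""
    sizes = {name: arr.shape[0] for name, arr in arrays.items()}
    if len(set(sizes.values())) != 1:
        raise ValueError(f"Inconsistent batch sizes: {sizes}")
    return next(iter(sizes.values()))

def split_dict(arrays, n_train):
    """Slice every array in the dict at the same index along axis 0."""
    train = {k: v[:n_train] for k, v in arrays.items()}
    val = {k: v[n_train:] for k, v in arrays.items()}
    return train, val

# Stand-in arrays with the shapes used in this exercise
demo_inputs = {
    "static_features": np.zeros((1000, 2)),
    "dynamic_features": np.zeros((1000, 15, 5)),
    "future_features": np.zeros((1000, 7, 3)),
    "coords": np.zeros((1000, 7, 3)),
}

n = check_batch_dim(demo_inputs)
demo_train, demo_val = split_dict(demo_inputs, int(n * 0.8))
print({k: v.shape for k, v in demo_train.items()})
print({k: v.shape for k, v in demo_val.items()})
```

The split here is sequential rather than shuffled. That is harmless for this exercise because the samples are independent random draws, but it is a choice worth revisiting for real, time-ordered data.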
Step 4: Define, Compile, and Train PIHALNet
We will now instantiate PIHALNet. We use architecture_config to define a custom internal structure and configure the model to treat the physical coefficient \(C\) as a learnable parameter. The compilation step is key: we must provide both the data losses and the weight for the physics loss, which the compile call below passes as lambda_pde.
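To make the idea of a "learnable" physical coefficient concrete before building the model, here is a toy sketch that recovers a coefficient from data by minimizing a PDE residual. It assumes a 1D diffusion equation \(u_t = C\,u_{xx}\) with a known analytical solution. This is an illustrative stand-in chosen for simplicity, not the consolidation equation or the training procedure PIHALNet uses, and the least-squares solve replaces the gradient-based updates a neural network would perform.

```python
import numpy as np

C_TRUE = 0.05  # coefficient we pretend not to know

# Analytical solution of u_t = C * u_xx on a regular (t, x) grid
t = np.linspace(0.0, 1.0, 101)
x = np.linspace(-1.0, 1.0, 201)
T, X = np.meshgrid(t, x, indexing="ij")
u = np.exp(-C_TRUE * np.pi**2 * T) * np.sin(np.pi * X)

# Finite-difference derivatives (a PINN would use automatic differentiation)
u_t = np.gradient(u, t, axis=0)
u_x = np.gradient(u, x, axis=1)
u_xx = np.gradient(u_x, x, axis=1)

# Drop the grid borders, where the one-sided differences are less accurate
inner = (slice(2, -2), slice(2, -2))
a = u_xx[inner].ravel()
b = u_t[inner].ravel()

# Minimizing sum((u_t - C * u_xx)**2) over C has a closed-form solution
c_est = float(a @ b / (a @ a))
residual = float(np.mean((b - c_est * a) ** 2))

print(f"estimated C = {c_est:.5f} (true value {C_TRUE})")
print(f"mean squared PDE residual = {residual:.2e}")
```

The same principle applies in the exercise: the coefficient is a free parameter, and the physics residual penalizes values that are inconsistent with the predicted fields. In PIHALNet the coefficient is adjusted during training rather than solved for in closed form.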
# Define a custom architecture for the data-driven core
pinn_architecture = {
    'encoder_type': 'transformer',
    'feature_processing': 'dense',
    'decoder_attention_stack': ['cross', 'hierarchical']
}

# Instantiate the model
model = PIHALNet(
    static_input_dim=static_features.shape[-1],
    dynamic_input_dim=dynamic_features.shape[-1],
    future_input_dim=future_features.shape[-1],
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    mode='pihal_like',
    architecture_config=pinn_architecture,
    # Ask the model to discover the consolidation coefficient
    pinn_coefficient_C=LearnableC(initial_value=0.01)
)

# Compile the model with the composite loss
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss={'subs_pred': 'mse', 'gwl_pred': 'mse'},  # Data losses
    lambda_pde=0.2  # Weight for the consolidation physics
)

# Train the model
print("\nStarting PIHALNet training...")
history = model.fit(
    train_inputs,
    train_targets,
    validation_data=(val_inputs, val_targets),
    epochs=10,
    batch_size=64,
    verbose=1
)
print("Training complete.")
Expected Output:
Starting PIHALNet training...
Epoch 1/10
13/13 [==============================] - 7s 72ms/step - loss: 3.9772 - gwl_pred_loss: 1.0878 - subs_pred_loss: 2.8894 - total_loss: 367.5720 - data_loss: 3.9803 - physics_loss: 1817.9586 - val_loss: 0.7371 - val_gwl_pred_loss: 0.7371 - val_subs_pred_loss: 0.0000e+00
Epoch 2/10
13/13 [==============================] - 0s 16ms/step - loss: 3.1833 - gwl_pred_loss: 0.7940 - subs_pred_loss: 2.3893 - total_loss: 1825.8313 - data_loss: 3.0913 - physics_loss: 9113.6998 - val_loss: 1.1084 - val_gwl_pred_loss: 1.1084 - val_subs_pred_loss: 0.0000e+00
...
Epoch 10/10
13/13 [==============================] - 0s 15ms/step - loss: 0.8129 - gwl_pred_loss: 0.6270 - subs_pred_loss: 0.1859 - total_loss: 10.7931 - data_loss: 0.8096 - physics_loss: 49.9174 - val_loss: 0.6465 - val_gwl_pred_loss: 0.6465 - val_subs_pred_loss: 0.0000e+00
Training complete.
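One way to read the multi-component log is to check how the reported terms relate to each other. The sketch below tests the assumption that total_loss equals data_loss plus lambda_pde times physics_loss, using only the three epochs printed above. That relation is inferred from the logged numbers, not taken from the library documentation.

```python
LAMBDA_PDE = 0.2  # value passed to model.compile in Step 4

# (epoch, data_loss, physics_loss, total_loss) copied from the log above
logged = [
    (1, 3.9803, 1817.9586, 367.5720),
    (2, 3.0913, 9113.6998, 1825.8313),
    (10, 0.8096, 49.9174, 10.7931),
]

results = []
for epoch, data_loss, physics_loss, total_loss in logged:
    weighted_physics = LAMBDA_PDE * physics_loss
    recomputed = data_loss + weighted_physics
    physics_share = weighted_physics / recomputed  # fraction of the total
    results.append((epoch, recomputed, total_loss, physics_share))
    print(
        f"epoch {epoch:>2}: recomputed={recomputed:.4f} "
        f"logged={total_loss:.4f} physics share={physics_share:.1%}"
    )
```

For all three logged epochs the recomputed value matches the logged total_loss to four decimal places, and the weighted physics term makes up more than 90% of the total. If the physics term dominates this strongly in your own runs, it may be worth experimenting with a smaller lambda_pde or with rescaled coordinates.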
Step 5: Visualize Training History
We can use the plot_history_in utility to view the different components of our composite loss, which helps in diagnosing how the model balanced the data and physics objectives during training.
metrics_to_plot = {
    "Loss Breakdown": ["total_loss", "data_loss", "physics_loss"],
    "Subsidence Loss": ["subs_pred_loss"],
    "GWL Loss": ["gwl_pred_loss"],
}
plot_history_in(
    history,
    metrics=metrics_to_plot,
    title="PIHALNet Loss Components During Training",
    max_cols=3
)
Expected Plot:
The plot shows three subplots: one for the composite loss breakdown, and two for the individual data losses for subsidence and groundwater level predictions.
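Because physics_loss and data_loss differ by several orders of magnitude in the Step 4 log, a linear axis can hide the smaller curve. As a complement to plot_history_in, the following is a minimal matplotlib sketch that draws the same three components on a logarithmic y-axis. The function name `plot_loss_breakdown` is hypothetical, and the demo values are the three epochs printed in the Step 4 log (the elided epochs are not filled in).

```python
import matplotlib.pyplot as plt

def plot_loss_breakdown(hist, epochs=None):
    """Plot total, data and physics losses on a logarithmic y-axis."""
    fig, ax = plt.subplots(figsize=(7, 4))
    for name in ("total_loss", "data_loss", "physics_loss"):
        values = list(hist[name])
        # Default to 1-based epoch numbers when none are supplied
        xs = list(epochs) if epochs is not None else list(range(1, len(values) + 1))
        ax.plot(xs, values, marker="o", label=name)
    ax.set_yscale("log")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Loss (log scale)")
    ax.set_title("Loss breakdown")
    ax.legend()
    ax.grid(True, which="both", linestyle=":")
    fig.tight_layout()
    return fig

# Only epochs 1, 2 and 10 appear in the Step 4 log, so only those are used here
logged_history = {
    "total_loss": [367.5720, 1825.8313, 10.7931],
    "data_loss": [3.9803, 3.0913, 0.8096],
    "physics_loss": [1817.9586, 9113.6998, 49.9174],
}
fig = plot_loss_breakdown(logged_history, epochs=[1, 2, 10])
fig.savefig("loss_breakdown_log.png")
```

With a real run, the standard Keras history dictionary can be passed instead, as in `plot_loss_breakdown(history.history)`, assuming it contains the three keys shown in the training log.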
Step 6: Visualize the Forecast
Finally, we’ll make predictions on the validation set and plot the forecasted subsidence against the actual values for a few samples.
# Make predictions on the validation set
val_predictions = model.predict(val_inputs)
# Predictions are a dict; get the one for subsidence
s_preds = val_predictions['subs_pred']
s_actuals = val_targets['subs_pred']

# --- Visualization ---
plt.figure(figsize=(14, 7))
# Plot the forecast for the first 4 validation samples
for i in range(4):
    plt.plot(s_actuals[i, :, 0],
             label=f'Actual Sample {i+1}', linestyle='--', marker='o')
    plt.plot(s_preds[i, :, 0],
             label=f'Predicted Sample {i+1}', linestyle='-', marker='x')

plt.title('Subsidence Forecast vs. Actuals (Validation Set)')
plt.xlabel(f'Forecast Step (Horizon = {HORIZON} steps)')
plt.ylabel('Normalized Subsidence')
plt.legend(ncol=2)
plt.grid(True, linestyle=':')
plt.tight_layout()
plt.show()
Expected Plot:
A plot comparing the model’s multi-step forecasts for land subsidence against the true values for several validation samples.
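A plot of four samples gives a qualitative impression only. To quantify the forecast over the whole validation set, one option is to compute the error separately for each forecast step. The sketch below does this with NumPy. The helper `per_step_errors` is a hypothetical name, and the arrays are stand-ins with a known constant offset so the result can be checked by hand. They are not model outputs.

```python
import numpy as np

def per_step_errors(y_true, y_pred):
    """Return (MAE, RMSE) per forecast step for arrays shaped (samples, horizon, 1)."""
    err = np.asarray(y_pred) - np.asarray(y_true)
    mae = np.mean(np.abs(err), axis=(0, 2))        # average over samples and outputs
    rmse = np.sqrt(np.mean(err ** 2, axis=(0, 2)))
    return mae, rmse

# Stand-in data: 200 validation samples, horizon of 7, one output variable
rng = np.random.default_rng(0)
demo_true = rng.normal(size=(200, 7, 1))
demo_pred = demo_true + 0.1  # constant offset, so MAE and RMSE are both 0.1

mae, rmse = per_step_errors(demo_true, demo_pred)
for step, (m, r) in enumerate(zip(mae, rmse), start=1):
    print(f"step {step}: MAE={m:.3f} RMSE={r:.3f}")
```

In the exercise itself the call would be `per_step_errors(s_actuals, s_preds)`. Errors that grow with the step index indicate that the forecast degrades toward the end of the horizon.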
Discussion of Exercise
Congratulations! You have successfully trained a sophisticated hybrid physics-data model. In this exercise, you have learned how to:
Create a complex dataset with both time series features and spatio-temporal coordinates.
Structure data into the dictionary format required by PIHALNet.
Use the architecture_config to customize the model’s data-driven core.
Compile and train the model with a composite loss function, effectively balancing data accuracy and physical consistency.
This workflow combines data-driven forecasting with physical constraints, an approach from scientific machine learning that can help keep predictions physically plausible when observations are scarce.