Exercise: Hybrid Forecasting with TransFlowSubsNet

Welcome to this exercise on using TransFlowSubsNet, a hybrid physics-informed neural network from the fusionlab-learn library. The model learns from two sources of information simultaneously: structured time series data and the governing physical laws, which enter training as physics-based loss terms.

We will perform a multi-step forecast for both groundwater level and land subsidence. This exercise will walk you through the specific data preparation needed to satisfy both the data-driven and physics-informed components of the model.

Learning Objectives:

  • Generate a synthetic dataset with features, coordinates, and two target variables (subsidence and groundwater level).

  • Structure the inputs into the required dictionary format, separating feature inputs from coordinate inputs.

  • Instantiate TransFlowSubsNet with both data-driven and physics-informed configurations.

  • Compile the model with a composite data loss and physics-loss weights (\(\lambda_{gw}\), \(\lambda_{c}\)).

  • Train the model and interpret the multi-component loss in the training log.

  • Visualize the forecast results against the true values.

Let’s begin!

Prerequisites

Ensure you have fusionlab-learn and its common dependencies installed.

pip install fusionlab-learn matplotlib

Step 1: Imports and Setup

First, we import all necessary libraries and set up our environment.

import os
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import warnings

# FusionLab imports
from fusionlab.nn.pinn import TransFlowSubsNet
from fusionlab.nn.models.utils import plot_history_in

# Suppress warnings and TF logs for cleaner output
warnings.filterwarnings('ignore')
tf.get_logger().setLevel('ERROR')

# Directory for saving any output images
EXERCISE_OUTPUT_DIR = "./transflow_exercise_outputs"
os.makedirs(EXERCISE_OUTPUT_DIR, exist_ok=True)

print("Libraries imported and setup complete for TransFlowSubsNet exercise.")

Expected Output:

Libraries imported and setup complete for TransFlowSubsNet exercise.

Step 2: Generate Synthetic Hybrid Data

This is the most critical step. We need to create a dataset that includes:

  1. Standard time series features (static, dynamic, future).

  2. Spatio-temporal coordinates (t, x, y).

  3. Two target variables (groundwater_level, subsidence) that are logically linked to the coordinates and features.

# Configuration
N_SAMPLES = 500
PAST_STEPS = 12
HORIZON = 6
SEED = 42
np.random.seed(SEED)
tf.random.set_seed(SEED)

# --- Generate Coordinates ---
t = tf.random.uniform((N_SAMPLES, HORIZON, 1), 0, 10)
t_past = tf.random.uniform((N_SAMPLES, PAST_STEPS, 1), 0, 10)
x = tf.random.uniform((N_SAMPLES, HORIZON, 1), -1, 1)
y = tf.random.uniform((N_SAMPLES, HORIZON, 1), -1, 1)
coords = tf.concat([t, x, y], axis=-1)

# --- Generate Physically-Plausible Targets ---
# Groundwater level (h) based on a simple analytical solution
h_true = tf.sin(np.pi * x) * tf.cos(np.pi * y) * tf.exp(-0.1 * t)
# Subsidence (s) as an integrated function of head decline
s_true = (1 - tf.exp(-0.1 * t)) * (tf.cos(np.pi * x))**2 + h_true * 0.1

# --- Generate Correlated Features ---
# These features will be used by the data-driven part of the model
static_features = tf.random.normal([N_SAMPLES, 3])
# Dynamic features correlated with the physics
dynamic_features = tf.concat([
    tf.sin(t_past[:, :PAST_STEPS, :]),
    tf.random.normal([N_SAMPLES, PAST_STEPS, 7])
], axis=-1)
# Future features
future_features = tf.random.normal([N_SAMPLES, HORIZON, 4])

print("Generated data shapes:")
print(f"  Static Features:  {static_features.shape}")
print(f"  Dynamic Features: {dynamic_features.shape}")
print(f"  Future Features:  {future_features.shape}")
print(f"  Coordinates:      {coords.shape}")
print(f"  True GWL Target:    {h_true.shape}")
print(f"  True Subsidence Target: {s_true.shape}")

Expected Output:

Generated data shapes:
  Static Features:  (500, 3)
  Dynamic Features: (500, 12, 8)
  Future Features:  (500, 6, 4)
  Coordinates:      (500, 6, 3)
  True GWL Target:    (500, 6, 1)
  True Subsidence Target: (500, 6, 1)
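
As a sanity check on the synthetic groundwater target, the closed form used for h_true can be shown to satisfy a two-dimensional diffusion equation \(\partial h / \partial t = D\,\nabla^2 h\) with \(D = 0.1 / (2\pi^2)\). The NumPy sketch below confirms this numerically with central finite differences. It is only an illustration: the function h, the constant D, and the step eps are names introduced here, and the exact residual that TransFlowSubsNet evaluates internally is not specified in this exercise, so it may differ from this simplified equation.

```python
import numpy as np

def h(t, x, y):
    """Same closed form as h_true in the exercise, written with NumPy."""
    return np.sin(np.pi * x) * np.cos(np.pi * y) * np.exp(-0.1 * t)

# Diffusivity for which the closed form solves dh/dt = D * (h_xx + h_yy).
# (Assumption for this check only; not a parameter of the model.)
D = 0.1 / (2.0 * np.pi**2)

# Random collocation points in the same ranges as the exercise.
rng = np.random.default_rng(0)
t = rng.uniform(0, 10, 1000)
x = rng.uniform(-1, 1, 1000)
y = rng.uniform(-1, 1, 1000)

eps = 1e-3  # finite-difference step

# Central differences for the first time derivative and second space derivatives.
dh_dt = (h(t + eps, x, y) - h(t - eps, x, y)) / (2 * eps)
d2h_dx2 = (h(t, x + eps, y) - 2 * h(t, x, y) + h(t, x - eps, y)) / eps**2
d2h_dy2 = (h(t, x, y + eps) - 2 * h(t, x, y) + h(t, x, y - eps)) / eps**2

residual = dh_dt - D * (d2h_dx2 + d2h_dy2)
max_abs_residual = np.max(np.abs(residual))
print(f"D = {D:.6f}, max |residual| = {max_abs_residual:.2e}")
```

The residual is zero up to finite-difference error, which is why a diffusion-type physics loss evaluated on these targets can be expected to be small.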

Step 3: Structure Inputs and Targets for Training

TransFlowSubsNet expects a dictionary of inputs and a dictionary of targets for its .fit() method. We now assemble the data we generated into this required format.

# Input dictionary for the model
inputs = {
    "static_features": static_features,
    "dynamic_features": dynamic_features,
    "future_features": future_features,
    "coords": coords,  # The crucial PINN component
}

# Target dictionary for the model
targets = {
    "subs_pred": s_true,
    "gwl_pred": h_true,
}

# Create a validation split
val_split = int(N_SAMPLES * 0.8)
train_inputs = {k: v[:val_split] for k, v in inputs.items()}
val_inputs = {k: v[val_split:] for k, v in inputs.items()}
train_targets = {k: v[:val_split] for k, v in targets.items()}
val_targets = {k: v[val_split:] for k, v in targets.items()}

print("Data structured into training and validation sets.")
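
Because every entry of both dictionaries is sliced along the first axis, all arrays must share the same number of samples, or inputs and targets fall out of alignment. The sketch below shows one way to check this before splitting, using NumPy arrays with the same shapes as in Step 2. The helpers check_sample_alignment and split_dict are hypothetical names introduced for illustration; they are not part of fusionlab-learn.

```python
import numpy as np

def check_sample_alignment(*dicts):
    """Return the common sample count, or raise if any array disagrees."""
    sizes = {name: arr.shape[0] for d in dicts for name, arr in d.items()}
    if len(set(sizes.values())) != 1:
        raise ValueError(f"Mismatched sample counts: {sizes}")
    return next(iter(sizes.values()))

def split_dict(arrays, n_train):
    """Split every array at index n_train along the sample axis."""
    train = {k: v[:n_train] for k, v in arrays.items()}
    val = {k: v[n_train:] for k, v in arrays.items()}
    return train, val

# Placeholder arrays with the shapes printed in Step 2.
n_samples, past_steps, horizon = 500, 12, 6
inputs = {
    "static_features": np.zeros((n_samples, 3)),
    "dynamic_features": np.zeros((n_samples, past_steps, 8)),
    "future_features": np.zeros((n_samples, horizon, 4)),
    "coords": np.zeros((n_samples, horizon, 3)),
}
targets = {
    "subs_pred": np.zeros((n_samples, horizon, 1)),
    "gwl_pred": np.zeros((n_samples, horizon, 1)),
}

n = check_sample_alignment(inputs, targets)
n_train = int(n * 0.8)
train_inputs, val_inputs = split_dict(inputs, n_train)
train_targets, val_targets = split_dict(targets, n_train)

print({k: v.shape for k, v in train_inputs.items()})
print({k: v.shape for k, v in val_targets.items()})
```

Like the split in the exercise, this takes the first 80% of samples without shuffling, which is acceptable here because the synthetic samples are drawn independently.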

Step 4: Define, Compile, and Train TransFlowSubsNet

We now instantiate the model. The most important step is .compile(), where we provide both the standard data loss functions (one for each target) and the weights for the physics-based losses: \(\lambda_{gw}\) for the groundwater-flow term (passed as lambda_gw) and \(\lambda_{c}\) for the consolidation term (passed as lambda_cons).

# Instantiate the model
model = TransFlowSubsNet(
    static_input_dim=static_features.shape[-1],
    dynamic_input_dim=dynamic_features.shape[-1],
    future_input_dim=future_features.shape[-1],
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    mode='pihal_like',
    pde_mode='both',         # Use both physics loss terms
    K='learnable',           # Ask the model to infer K
    pinn_coefficient_C=0.01  # Use a fixed C
)

# Compile the model with composite loss
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss={'subs_pred': 'mse', 'gwl_pred': 'mse'},  # Data losses
    lambda_gw=1.0,      # Weight for groundwater physics
    lambda_cons=0.5     # Weight for consolidation physics
)

# Train the model
print("\nStarting TransFlowSubsNet training...")
history = model.fit(
    train_inputs,
    train_targets,
    validation_data=(val_inputs, val_targets),
    epochs=5,
    batch_size=32,
    verbose=1
)
print("Training complete.")

Expected Output:

Starting TransFlowSubsNet training...
Epoch 1/5
13/13 [==============================] - 37s 237ms/step - loss: 1.0443 - gwl_pred_loss: 0.6407 - subs_pred_loss: 0.4036 - total_loss: 0.9620 - data_loss: 0.9616 - consolidation_loss: 8.8173e-04 - gw_flow_loss: 1.6358e-07 - val_loss: 0.1779 - val_gwl_pred_loss: 0.1779 - val_subs_pred_loss: 0.0000e+00
Epoch 2/5
13/13 [==============================] - 0s 24ms/step - loss: 0.2164 - gwl_pred_loss: 0.1499 - subs_pred_loss: 0.0665 - total_loss: 0.2129 - data_loss: 0.2128 - consolidation_loss: 2.9616e-04 - gw_flow_loss: 4.7181e-08 - val_loss: 0.1271 - val_gwl_pred_loss: 0.1271 - val_subs_pred_loss: 0.0000e+00
...
Epoch 5/5
13/13 [==============================] - 0s 24ms/step - loss: 0.1502 - gwl_pred_loss: 0.1120 - subs_pred_loss: 0.0382 - total_loss: 0.1496 - data_loss: 0.1496 - consolidation_loss: 2.7363e-05 - gw_flow_loss: 7.7033e-10 - val_loss: 0.1170 - val_gwl_pred_loss: 0.1170 - val_subs_pred_loss: 0.0000e+00
Training complete.
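
One way to read this log is to check how the reported terms relate to each other. The arithmetic below assumes that total_loss is the data loss plus the two physics terms scaled by the weights passed to compile() (lambda_gw=1.0, lambda_cons=0.5). This form is an assumption: it agrees with the logged values to within their four-digit rounding, but it is not confirmed by anything else in this exercise. The dictionary keys and the function assumed_total are illustrative names only.

```python
# Values copied from the training log above (epochs 1, 2 and 5).
log = {
    1: dict(loss=1.0443, gwl=0.6407, subs=0.4036, total=0.9620,
            data=0.9616, cons=8.8173e-04, gw=1.6358e-07),
    2: dict(loss=0.2164, gwl=0.1499, subs=0.0665, total=0.2129,
            data=0.2128, cons=2.9616e-04, gw=4.7181e-08),
    5: dict(loss=0.1502, gwl=0.1120, subs=0.0382, total=0.1496,
            data=0.1496, cons=2.7363e-05, gw=7.7033e-10),
}

LAMBDA_GW = 1.0    # weight passed as lambda_gw
LAMBDA_CONS = 0.5  # weight passed as lambda_cons

def assumed_total(row):
    """Assumed composite objective: data term plus weighted physics terms."""
    return row["data"] + LAMBDA_GW * row["gw"] + LAMBDA_CONS * row["cons"]

for epoch, row in log.items():
    reconstructed = assumed_total(row)
    per_output_sum = row["gwl"] + row["subs"]
    physics_share = (reconstructed - row["data"]) / reconstructed
    print(
        f"epoch {epoch}: total logged={row['total']:.4f} "
        f"reconstructed={reconstructed:.4f} | "
        f"loss logged={row['loss']:.4f} per-output sum={per_output_sum:.4f} | "
        f"physics share={physics_share:.3%}"
    )
```

Two things stand out. The logged loss equals the sum of gwl_pred_loss and subs_pred_loss in every epoch shown, and under the assumed form the physics terms contribute less than 0.1% of total_loss, so training on this synthetic data is dominated by the data loss.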

Step 5: Visualize Training History

We can use the plot_history_in utility to view all the components of our composite loss function, which helps in understanding how the model balanced the data and physics objectives.

metrics_to_plot = {
    "Loss Breakdown": ["total_loss", "data_loss"],
    "Physics Residuals": ["consolidation_loss", "gw_flow_loss"]
}
plot_history_in(
    history,
    metrics=metrics_to_plot,
    title="TransFlowSubsNet Loss Components"
)

Expected Plot:

[Figure: TransFlowSubsNet Training History]

The plot shows two subplots: one comparing the total loss to the data-fidelity loss, and another showing the evolution of the two physics-based loss components.
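
To complement the plot with numbers, the per-epoch values can be read from history.history, the dictionary of metric lists carried by a standard Keras History object (assuming that is what model.fit returns here). The sketch below uses a stand-in dictionary filled with the first- and last-epoch values from the Step 4 log (the intermediate epochs are elided there, so they are omitted here as well) and reports how much each component shrank. The function reduction_factor is a hypothetical helper.

```python
# Stand-in for history.history: first and last epoch values from the Step 4 log.
history_dict = {
    "data_loss": [0.9616, 0.1496],
    "consolidation_loss": [8.8173e-04, 2.7363e-05],
    "gw_flow_loss": [1.6358e-07, 7.7033e-10],
}

def reduction_factor(values):
    """Ratio of the first recorded value to the last one."""
    return values[0] / values[-1]

factors = {name: reduction_factor(vals) for name, vals in history_dict.items()}

for name, factor in factors.items():
    print(f"{name}: reduced by a factor of {factor:.1f}")
```

Both physics residuals fall by a larger factor than the data loss over these five epochs, although they start from much smaller absolute values.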

Step 6: Visualize the Forecast

Finally, let’s make predictions on the validation set and compare the forecasted groundwater level against the actual values for the first five validation samples.

# Make predictions on the validation set
val_predictions = model.predict(val_inputs)
# Predictions are a dict, get the one for groundwater level
gwl_preds = val_predictions['gwl_pred']
gwl_actuals = val_targets['gwl_pred']

# Plot the forecast for the first 5 validation samples
plt.figure(figsize=(14, 7))
for i in range(5):
    plt.plot(gwl_actuals[i, :, 0],
             label=f'Actual Sample {i+1}', linestyle='--')
    plt.plot(gwl_preds[i, :, 0],
             label=f'Predicted Sample {i+1}', linestyle='-')

plt.title('Groundwater Level Forecast vs. Actuals (Validation Set)')
plt.xlabel(f'Forecast Step (Horizon = {HORIZON} steps)')
plt.ylabel('Normalized Groundwater Level')
plt.legend(ncol=2)
plt.grid(True, linestyle=':')
plt.tight_layout()
plt.show()

Expected Plot:

[Figure: TransFlowSubsNet Forecast Results]

A comparison plot showing the model’s multi-step forecasts for the groundwater level against the true values for several validation samples.
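
A plot of a few samples gives a qualitative picture; a per-step error summarises the whole validation set. The sketch below computes the RMSE at each forecast step for arrays shaped (samples, horizon, 1), the same layout as gwl_preds and gwl_actuals. It runs on made-up NumPy arrays rather than real model output, and per_step_rmse is a hypothetical helper, not a fusionlab-learn function.

```python
import numpy as np

def per_step_rmse(y_true, y_pred):
    """RMSE per forecast step for arrays shaped (samples, horizon, outputs)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    squared_error = (y_pred - y_true) ** 2
    # Average over samples (axis 0) and outputs (axis 2), keep the horizon axis.
    return np.sqrt(squared_error.mean(axis=(0, 2)))

# Made-up data: the error grows linearly with the forecast step.
rng = np.random.default_rng(42)
actuals = rng.uniform(-1, 1, size=(100, 6, 1))
step_offset = 0.05 * np.arange(1, 7).reshape(1, 6, 1)
preds = actuals + step_offset

rmse = per_step_rmse(actuals, preds)
for step, value in enumerate(rmse, start=1):
    print(f"step {step}: RMSE = {value:.3f}")
```

With the variables from this exercise the call would be per_step_rmse(gwl_actuals, gwl_preds); np.asarray converts an eager TensorFlow tensor to a NumPy array. An error that rises with the step index indicates that the forecast degrades toward the end of the horizon.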

Discussion of Exercise

Congratulations! You have successfully trained a hybrid physics-data model. In this exercise, you have learned to:

  • Create a complex dataset suitable for a hybrid model that requires both feature and coordinate inputs.

  • Structure the data into the dictionary format required by TransFlowSubsNet.

  • Compile the model with a composite loss function, balancing data fidelity and physical consistency using loss weights.

  • Train the model and interpret its multi-component loss log.

This workflow is a powerful paradigm for building more robust and generalizable scientific machine-learning models, especially in data-scarce or noisy environments where physics can provide a strong inductive bias.