Hybrid Physics-Data Models: PiHALNet & PIHALNet

This guide covers the PiHALNet family of models, a set of hybrid architectures for complex, coupled geophysical forecasting. These models combine a data-driven deep learning engine with the physical laws of soil mechanics and groundwater flow.

The primary goal of these models is to produce forecasts for land subsidence and groundwater levels that are not only accurate with respect to observational data but are also physically consistent. This page details the two evolutionary versions of this concept, highlighting their shared principles and key differences.

Common Key Features

Both versions of PiHALNet are built on the same core principles and share the following features, which make them well suited to complex geophysical forecasting tasks.

  • Hybrid Data-Physics Architecture: The models integrate a powerful data-driven forecasting core (the Hybrid Attentive LSTM Network, or HALNet, engine) with a physics-informed module. This means they learn from both observational data and the governing physical laws, with the PDE residual being a key component of the loss function.

  • Coupled Multi-Target Prediction: They are designed to simultaneously forecast multiple, physically linked variables. The primary use case is predicting land subsidence (\(s\)) and groundwater levels (\(h\)) in a coupled manner.

  • Advanced Input Handling: The architecture natively processes three distinct types of inputs:

    • Static features: Time-invariant metadata (e.g., sensor location, soil type).

    • Dynamic past features: Time-varying data observed up to the present (e.g., historical rainfall, past measurements).

    • Known future features: Time-varying data known in advance (e.g., day of the week, scheduled pumping).

    This is enhanced by an optional VariableSelectionNetwork (VSN) for intelligent, learnable feature selection.

  • Sophisticated Temporal Processing: To capture complex time-dependencies, the models employ:

    • A MultiScaleLSTM that processes the input sequence at various user-defined temporal resolutions, capturing both short-term and long-term patterns.

    • A rich suite of attention mechanisms that work together to fuse information from all sources, including CrossAttention (for encoder-decoder interaction) and various self-attention layers like HierarchicalAttention.

  • Flexible Physics-Informed Constraints: The physics module is highly configurable:

    • The specific physical law to enforce can be selected via the pde_mode parameter, with the primary focus being on 'consolidation'.

    • Key physical coefficients in the PDEs (like the consolidation coefficient \(C\) or hydraulic conductivity \(K\)) can be either fixed as known constants or made learnable. This allows the model to perform parameter inversion, that is, to discover the values of physical constants directly from the observational data.

  • Probabilistic Forecasting for Uncertainty: The models can produce probabilistic forecasts to quantify prediction uncertainty. By specifying a list of quantiles, the QuantileDistributionModeling head is activated, generating prediction intervals alongside the point forecast.

  • Multi-Horizon Output Structure: Using a MultiDecoder, the models generate predictions for each step in the forecast horizon in a sequence-to-sequence manner, making them true multi-step-ahead forecasters.
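Quantile forecasts of the kind described in the list above are commonly trained with the pinball (quantile) loss. The following is a minimal, library-independent sketch in plain Python. The function name `pinball_loss` and the toy numbers are illustrative assumptions, and the package's own loss implementation may differ.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss of predictions `y_pred` for quantile level `q` in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for yt, yp in zip(y_true, y_pred):
        err = yt - yp
        # Under-prediction (err > 0) costs q * err;
        # over-prediction (err < 0) costs (1 - q) * |err|.
        total += max(q * err, (q - 1.0) * err)
    return total / len(y_true)


# Toy data (hypothetical): one over-prediction, one exact hit, one under-prediction.
observed = [1.0, 2.0, 3.0]
forecast = [1.5, 2.0, 2.0]
for q in (0.1, 0.5, 0.9):
    print(q, pinball_loss(observed, forecast, q))
```

Low quantile levels penalize over-prediction more heavily and high levels penalize under-prediction more heavily, which is what pushes each output toward its own quantile and yields a prediction interval.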

Physical Formulation and Hybrid Loss

The power of the PIHALNet family lies in its hybrid formulation, which forces the data-driven predictions to conform to physical laws. This is achieved by integrating the governing equations of groundwater hydrology and soil mechanics directly into the model’s training objective.

Governing Equations

The models are designed to understand and simulate two coupled physical processes. The specific equations activated during training depend on the pde_mode setting.

1. Transient Groundwater Flow

This equation, a form of the diffusion equation, enforces the conservation of mass for groundwater moving through a porous medium. It describes how the hydraulic head \(h\) changes in time and space. The 2D residual form used by the model is:

\[\mathcal{R}_{gw} = S_s \frac{\partial h}{\partial t} - K \left( \frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2} \right) - Q\]

Here, \(K\) is the hydraulic conductivity, \(S_s\) is the specific storage, and \(Q\) is a source/sink term.
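As a sanity check of this residual, consider a single decaying mode on a unit square. This worked example is added for illustration only and assumes \(Q = 0\) with constant \(K\) and \(S_s\); the decay rate \(\alpha\) is a symbol introduced here.

```latex
\begin{aligned}
h(t,x,y) &= e^{-\alpha t}\,\sin(\pi x)\,\sin(\pi y), \\
\frac{\partial h}{\partial t} &= -\alpha\,h, \qquad
\frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2} = -2\pi^2 h, \\
\mathcal{R}_{gw} &= S_s(-\alpha h) - K(-2\pi^2 h) - 0 = \left(2\pi^2 K - \alpha S_s\right) h .
\end{aligned}
```

The residual vanishes everywhere exactly when \(\alpha = 2\pi^2 K / S_s\), so a larger conductivity or a smaller specific storage makes the head perturbation decay faster.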

2. Aquifer-System Consolidation

This principle links the rate of land subsidence (\(s\)) to changes in the hydraulic head field. As the head \(h\) declines, pressure within the aquifer system changes, causing fine-grained clay layers to compact, which results in subsidence at the surface. The residual form of this relationship is:

\[\mathcal{R}_{c} = \frac{\partial s}{\partial t} - C \left( \frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2} \right)\]

Here, \(C\) is the consolidation coefficient, a parameter that encapsulates the mechanical properties of the aquifer system.
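When both residuals vanish, the Laplacian of \(h\) can be eliminated between the two equations. The short derivation below is an illustration that assumes \(K \neq 0\); it is not taken from the library.

```latex
\begin{aligned}
\mathcal{R}_{gw} = 0 \;&\Rightarrow\;
\frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2}
= \frac{1}{K}\left( S_s \frac{\partial h}{\partial t} - Q \right), \\
\mathcal{R}_{c} = 0 \;&\Rightarrow\;
\frac{\partial s}{\partial t}
= C \left( \frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2} \right)
= \frac{C}{K}\left( S_s \frac{\partial h}{\partial t} - Q \right).
\end{aligned}
```

With \(Q = 0\), the subsidence rate is therefore proportional to the rate of head change, with factor \(C S_s / K\).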

Operational Workflow: From Data to Physics

The model’s custom train_step seamlessly integrates the data-driven and physics-informed components in a three-stage process:

1. Data-Driven Prediction

First, the model acts as a data-driven forecaster. It uses its internal forecasting engine (BaseAttentive in the modern version) to process the static, dynamic, and future features. This stage produces the “mean” predictions for the target variables, denoted as \(\bar{s}_{net}\) and \(\bar{h}_{net}\). These predictions are used to calculate the data-fidelity portion of the loss.

2. Physics Residual Calculation

Next, the physics module is activated. The model takes the mean predictions (\(\bar{s}_{net}\), \(\bar{h}_{net}\)) and their corresponding spatio-temporal coordinates (\(t, x, y\)). Using TensorFlow’s GradientTape for automatic differentiation, it computes all the necessary derivatives (e.g., \(\frac{\partial \bar{s}_{net}}{\partial t}\), \(\frac{\partial^2 \bar{h}_{net}}{\partial x^2}\)). These derivatives are then plugged into the governing equations to calculate the physics residuals, \(\mathcal{R}_{gw}\) and/or \(\mathcal{R}_{c}\).
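The derivative step can be sketched with nested `tf.GradientTape` contexts. This is a minimal illustration, not the library's `train_step`: closed-form toy fields stand in for the network outputs, \(Q\) is assumed to be zero, and the constants and function names (`head`, `subsidence`, `pde_residuals`) are hypothetical. The toy fields are chosen so that both residuals should come out numerically close to zero.

```python
import math

import tensorflow as tf

# Hypothetical constants chosen only for this illustration.
K, S_S, C = 1.0, 2.0, 0.5
# Decay rate for which the toy head field satisfies the flow equation with Q = 0.
ALPHA = 2.0 * math.pi ** 2 * K / S_S


def head(t, x, y):
    """Toy stand-in for the predicted head field."""
    return tf.exp(-ALPHA * t) * tf.sin(math.pi * x) * tf.sin(math.pi * y)


def subsidence(t, x, y):
    """Toy stand-in for the predicted subsidence, consistent with the head field."""
    return (C * S_S / K) * head(t, x, y)


def pde_residuals(t, x, y):
    """Return (R_gw, R_c) at the given coordinates, assuming Q = 0."""
    with tf.GradientTape(persistent=True) as outer:
        outer.watch([x, y])
        with tf.GradientTape(persistent=True) as inner:
            inner.watch([t, x, y])
            h = head(t, x, y)
            s = subsidence(t, x, y)
        # First derivatives are computed inside the outer tape so that
        # the spatial ones can be differentiated a second time.
        dh_dt = inner.gradient(h, t)
        dh_dx = inner.gradient(h, x)
        dh_dy = inner.gradient(h, y)
        ds_dt = inner.gradient(s, t)
    laplacian = outer.gradient(dh_dx, x) + outer.gradient(dh_dy, y)
    r_gw = S_S * dh_dt - K * laplacian
    r_c = ds_dt - C * laplacian
    return r_gw, r_c


# Three sample points, one (t, x, y) triple per row.
t = tf.constant([[0.1], [0.4], [0.7]], dtype=tf.float64)
x = tf.constant([[0.2], [0.5], [0.9]], dtype=tf.float64)
y = tf.constant([[0.3], [0.6], [0.8]], dtype=tf.float64)
r_gw, r_c = pde_residuals(t, x, y)
print("max |R_gw| =", float(tf.reduce_max(tf.abs(r_gw))))
print("max |R_c|  =", float(tf.reduce_max(tf.abs(r_c))))
```

In the actual models the fields come from the network rather than from closed-form expressions, so the residuals are generally non-zero and feed into the physics loss.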

3. Composite Loss Function

Finally, the total loss function, \(\mathcal{L}_{total}\), is assembled as a weighted sum of the data and physics components.

\[\mathcal{L}_{total} = \mathcal{L}_{data} + \sum_{i \in \{gw, c\}} \lambda_{i} \mathcal{L}_{physics, i}\]

  • \(\mathcal{L}_{data}\): This is the supervised loss (e.g., Mean Squared Error or a Quantile Loss) calculated between the model’s final forecast and the true observational data.

  • \(\mathcal{L}_{physics, i}\): This is the Mean Squared Error of a specific PDE residual (e.g., \(\text{mean}(\mathcal{R}_c^2)\)). It quantifies how much the predictions violate that physical law.

  • \(\lambda_{i}\): These are user-defined hyperparameters passed to .compile() that control the influence of each physical constraint on the total loss (e.g., lambda_gw, lambda_cons; the usage examples on this page pass a single lambda_physics weight).

This composite loss is then used to update all trainable parameters in the model, ensuring that the network learns to be accurate to both the data and the underlying physics simultaneously.
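The arithmetic of the composite loss can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions: the data term is taken to be a mean squared error, and the helper names (`mean_square`, `total_loss`), the dictionary keys, and the numbers are all hypothetical.

```python
def mean_square(values):
    """Mean of the squared entries of a non-empty sequence."""
    return sum(v * v for v in values) / len(values)


def total_loss(data_errors, residuals, weights):
    """L_total = L_data + sum_i lambda_i * mean(R_i^2).

    data_errors: prediction minus observation, one entry per sample.
    residuals:   maps a PDE name to its residual values.
    weights:     maps a PDE name to lambda_i (a missing name counts as 0).
    """
    loss = mean_square(data_errors)
    for name, values in residuals.items():
        loss += weights.get(name, 0.0) * mean_square(values)
    return loss


# Hypothetical numbers, for illustration only.
data_errors = [0.1, -0.2, 0.3]
residuals = {"gw": [0.5, -0.5], "c": [1.0, 0.0]}
weights = {"gw": 0.1, "c": 0.2}

print("data only     :", total_loss(data_errors, residuals, {}))
print("with physics  :", total_loss(data_errors, residuals, weights))
```

Setting every weight to zero recovers a purely data-driven objective, while larger weights trade data fit for physical consistency.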

Architectural & Feature Differences

While both models in the PIHALNet family aim to solve the same problem, they represent a significant evolution in software design and capability. Understanding their differences is key to leveraging the full power of the library.

The Legacy PiHALNet

The original PiHALNet is a monolithic, self-contained class that inherits directly from tf.keras.Model. Its data-driven components, such as the LSTMs and attention layers, were implemented specifically for its own use case. While effective, this design meant that the architecture was relatively rigid. Configuration was handled via a long list of parameters in the __init__ method, and the sequence of internal operations (like the application of attention) was largely fixed.

The Modern PIHALNet (BaseAttentive-based)

The modern PIHALNet represents a paradigm shift towards modularity and flexibility.

  • Inheritance from BaseAttentive: Its most important feature is that it inherits from the BaseAttentive class. It does not reinvent the data-driven forecasting engine; instead, it inherits a powerful, tested, and highly configurable one. This means any improvements to BaseAttentive are immediately available to PIHALNet.

  • Smart Configuration: Architectural choices are no longer controlled by numerous, disconnected parameters. Instead, they are defined in a single, clean architecture_config dictionary. This allows for clear and explicit control over key components like the encoder_type (‘hybrid’ vs. ‘transformer’) or feature_processing (‘vsn’ vs. ‘dense’). This makes experimenting with different architectures trivial.

  • Modular Attention Stack: The sequence of attention mechanisms in the decoder is no longer hardcoded. It is now controlled by the decoder_attention_stack key in the configuration dictionary, allowing the user to easily add, remove, or reorder attention layers (e.g., [‘cross’, ‘hierarchical’]) to tailor the model to a specific problem.

In essence, the modern design separates the “what” (the physics of subsidence and groundwater flow, handled by PIHALNet) from the “how” (the data-driven sequence processing, handled by BaseAttentive).
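One way to picture a configuration dictionary of this kind is as a set of defaults that user choices override and that is validated before the model is built. The sketch below is purely illustrative: the function `resolve_architecture_config`, the default values, and the restriction to the option names mentioned on this page are assumptions, and the real BaseAttentive logic may accept more options and behave differently.

```python
# Option values named on this page; the real library may accept more.
ALLOWED_CHOICES = {
    "encoder_type": {"hybrid", "transformer"},
    "feature_processing": {"vsn", "dense"},
}
ALLOWED_ATTENTION = {"cross", "hierarchical"}

# Hypothetical defaults, used only for this sketch.
DEFAULTS = {
    "encoder_type": "hybrid",
    "feature_processing": "vsn",
    "decoder_attention_stack": ["cross", "hierarchical"],
}


def resolve_architecture_config(user_config=None):
    """Merge user choices over the defaults and validate the result."""
    config = dict(DEFAULTS)
    config.update(user_config or {})
    unknown = sorted(set(config) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown configuration keys: {unknown}")
    for key, allowed in ALLOWED_CHOICES.items():
        if config[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
    bad = [name for name in config["decoder_attention_stack"]
           if name not in ALLOWED_ATTENTION]
    if bad:
        raise ValueError(f"unknown attention layers: {bad}")
    return config


# Only the overridden keys change; everything else keeps its default.
print(resolve_architecture_config({"encoder_type": "transformer",
                                   "decoder_attention_stack": ["cross"]}))
```

Keeping all structural choices in one dictionary means a new variant is a small change to the dictionary rather than a new set of constructor arguments.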

Comparison Summary

Comparison of PiHALNet Model Versions

| Feature | PiHALNet (Legacy) | PIHALNet (Modern, BaseAttentive-based) |
| --- | --- | --- |
| Base Class | Inherits directly from tf.keras.Model. | Inherits from the powerful and modular BaseAttentive class. |
| Core Architecture | Data-driven components are implemented internally and are specific to this class. | Leverages the full, tested, and highly-configurable BaseAttentive engine. |
| Configuration | Primarily configured via a long list of individual __init__ parameters. | Uses the modern architecture_config dictionary for clear, flexible control over internal structure. |
| Attention Mechanism | The sequence of attention layers is largely hardcoded within the call method. | The decoder’s attention stack is fully configurable via the decoder_attention_stack key in the config. |
| Feature Selection | Control over VSNs is a simple boolean flag (use_vsn). | Controlled via the feature_processing key, allowing easy switching between ‘vsn’ and ‘dense’. |

For all new projects, the modern, BaseAttentive-based PIHALNet is the recommended choice due to its modularity, configurability, and alignment with the latest architectural patterns in the library. The legacy version is maintained for backward compatibility.


PIHALNet (Modern, BaseAttentive-based)

API Reference:

PIHALNet

The modern PIHALNet is a powerful and flexible implementation built upon the modular BaseAttentive architecture. It combines a state-of-the-art data-driven forecasting engine with physics-based regularization, making it the recommended choice for all new projects.

This version inherits all the advanced features of its parent class, including the smart configuration system, which allows for precise control over the model’s internal structure.

Usage Example: Standard Hybrid Model

This example demonstrates a typical use case for PIHALNet, where we use the default hybrid architecture (LSTM + Attention) and configure it to learn the physical consolidation coefficient \(C\) from data.

import tensorflow as tf
from fusionlab.nn.pinn import PIHALNet
from fusionlab.params import LearnableC

# 1. Define Model & Data Dimensions
BATCH_SIZE = 16
PAST_STEPS = 10
HORIZON = 5
STATIC_DIM, DYNAMIC_DIM, FUTURE_DIM = 4, 6, 3

# 2. Prepare Dummy Input Data
# Feature-based inputs for the data-driven core
static_features = tf.random.normal([BATCH_SIZE, STATIC_DIM])
dynamic_features = tf.random.normal([BATCH_SIZE, PAST_STEPS, DYNAMIC_DIM])
# For 'pihal_like' mode, future features span the horizon
future_features = tf.random.normal([BATCH_SIZE, HORIZON, FUTURE_DIM])

# Coordinate inputs for the PINN module
coords = tf.random.normal([BATCH_SIZE, HORIZON, 3]) # (t, x, y)

# Assemble the full input dictionary
inputs = {
    "static_features": static_features,
    "dynamic_features": dynamic_features,
    "future_features": future_features,
    "coords": coords,
}

# Prepare dummy target data
true_subsidence = tf.random.normal([BATCH_SIZE, HORIZON, 1])
true_gwl = tf.random.normal([BATCH_SIZE, HORIZON, 1])
targets = {
    "subs_pred": true_subsidence,
    "gwl_pred": true_gwl
}

# 3. Instantiate the Model
model = PIHALNet(
    static_input_dim=STATIC_DIM,
    dynamic_input_dim=DYNAMIC_DIM,
    future_input_dim=FUTURE_DIM,
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    mode='pihal_like',
    # Ask the model to discover the consolidation coefficient
    pinn_coefficient_C=LearnableC(initial_value=0.01),
)

# 4. Compile the model with data losses and a physics weight
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss={'subs_pred': 'mse', 'gwl_pred': 'mse'},
    lambda_physics=0.1 # Weight for the consolidation loss
)

# 5. Display the model summary
model.summary(line_length=110)
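The idea behind a learnable \(C\) can be illustrated independently of the library. In the sketch below, synthetic values of the head Laplacian and the subsidence rate are generated from a known coefficient, and plain gradient descent on the mean squared consolidation residual recovers it. Everything here is an assumption made for illustration (the synthetic data, `C_TRUE`, the step size, the function name); in the real model the derivatives come from the network and \(C\) is trained jointly with the network weights.

```python
import random

random.seed(0)

C_TRUE = 0.05  # hypothetical "true" coefficient used to generate the data

# Synthetic Laplacian values of the head field and the matching subsidence rates.
laplacian = [random.uniform(-1.0, 1.0) for _ in range(200)]
ds_dt = [C_TRUE * lap for lap in laplacian]


def physics_loss_and_grad(c):
    """Mean squared consolidation residual and its derivative with respect to c."""
    n = len(laplacian)
    residuals = [rate - c * lap for rate, lap in zip(ds_dt, laplacian)]
    loss = sum(r * r for r in residuals) / n
    grad = sum(-2.0 * r * lap for r, lap in zip(residuals, laplacian)) / n
    return loss, grad


c = 0.01  # initial guess, the same starting value as initial_value in the example above
for _ in range(500):
    _, grad = physics_loss_and_grad(c)
    c -= 0.5 * grad  # fixed step size, adequate for this toy problem

final_loss, _ = physics_loss_and_grad(c)
print("estimated C:", c, "final physics loss:", final_loss)
```

The estimate converges to the value used to generate the data, which is the sense in which a trainable coefficient performs parameter inversion.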

Advanced Configuration Example

This example demonstrates the power and flexibility of the smart configuration system. We will create a PIHALNet variant that uses a pure transformer encoder and a simplified attention stack in the decoder, showcasing how easily the internal architecture can be modified.

# Reuses PIHALNet, the dimension constants, `inputs`, and `targets`
# defined in the previous example.

# 1. Define a custom architecture using the config dictionary
transformer_pinn_config = {
    'encoder_type': 'transformer',
    'decoder_attention_stack': ['cross', 'hierarchical'], # Simpler stack
    'feature_processing': 'dense' # Use dense layers instead of VSN
}

# 2. Instantiate the model with the custom architecture
tfmr_pinn_model = PIHALNet(
    static_input_dim=STATIC_DIM,
    dynamic_input_dim=DYNAMIC_DIM,
    future_input_dim=FUTURE_DIM,
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    mode='pihal_like',
    pinn_coefficient_C=0.05, # Use a fixed physical constant
    architecture_config=transformer_pinn_config # Pass the config
)

# 3. Compile the model as before
tfmr_pinn_model.compile(
    optimizer='adam',
    loss='mae', # Use a different data loss
    lambda_physics=0.2
)

# 4. Train for a single epoch to demonstrate it works
print("\nTraining a Transformer-based PIHALNet for one epoch...")
history = tfmr_pinn_model.fit(
    inputs, targets, epochs=1, verbose=1
)
print("\nTraining complete.")

PiHALNet (Legacy Version)

API Reference:

PiHALNet

This section documents the original, legacy version of PiHALNet. It is maintained primarily for backward compatibility. For all new projects, the modern PIHALNet (which inherits from BaseAttentive) is strongly recommended due to its superior flexibility and modularity.

The legacy PiHALNet is a self-contained, monolithic class that implements its data-driven components (LSTMs, attention) internally. Its architecture is configured via a long list of individual parameters in its constructor, making it less flexible than the modern version’s smart configuration system.

Usage Example

The instantiation and compilation process is similar to the modern version, but it relies on direct keyword arguments like objective and attention_levels instead of the architecture_config dictionary.

import tensorflow as tf
from fusionlab.nn.pinn.models.legacy import PiHALNet

# 1. Define Model & Data Dimensions
BATCH_SIZE = 16
PAST_STEPS = 10
HORIZON = 5
STATIC_DIM, DYNAMIC_DIM, FUTURE_DIM = 4, 6, 3

# 2. Prepare Dummy Input Data (same as modern version)
inputs = {
    "static_features": tf.random.normal([BATCH_SIZE, STATIC_DIM]),
    "dynamic_features": tf.random.normal([BATCH_SIZE, PAST_STEPS, DYNAMIC_DIM]),
    "future_features": tf.random.normal([BATCH_SIZE, HORIZON, FUTURE_DIM]),
    "coords": tf.random.normal([BATCH_SIZE, HORIZON, 3]),
}
targets = {
    "subs_pred": tf.random.normal([BATCH_SIZE, HORIZON, 1]),
    "gwl_pred": tf.random.normal([BATCH_SIZE, HORIZON, 1])
}

# 3. Instantiate the Legacy Model
# Note the direct use of parameters like `objective`
legacy_model = PiHALNet(
    static_input_dim=STATIC_DIM,
    dynamic_input_dim=DYNAMIC_DIM,
    future_input_dim=FUTURE_DIM,
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    objective='hybrid', # Configured directly
    pinn_coefficient_C='learnable'
)

# 4. Compile as usual (training then proceeds with fit(), as in the modern example)
legacy_model.compile(
    optimizer='adam',
    loss='mse',
    lambda_physics=0.1
)
print("Successfully instantiated and compiled the legacy PiHALNet model.")

Next Steps

Note

Now that you are familiar with the architecture and features of the PIHALNet models, you can put them into practice.

Proceed to the exercises for a hands-on guide: Exercise: Hybrid Forecasting with PIHALNet