Hybrid Physics-Data Models: PiHALNet & PIHALNet
This guide delves into the PiHALNet family of models, a suite
of sophisticated hybrid architectures designed for complex, coupled
geophysical forecasting. These models uniquely combine a powerful,
data-driven deep learning engine with the physical laws of soil
mechanics and groundwater flow.
The primary goal of these models is to produce forecasts for land subsidence and groundwater levels that are not only accurate with respect to observational data but are also physically consistent. This page details the two evolutionary versions of this concept, highlighting their shared principles and key differences.
Common Key Features
Both versions of PiHALNet are built upon the same core
principles, offering a powerful set of common features that make
them uniquely suited for complex geophysical forecasting tasks.
Hybrid Data-Physics Architecture: The models integrate a powerful data-driven forecasting core (the Hybrid Attentive LSTM Network, or HALNet, engine) with a physics-informed module. This means they learn from both observational data and the governing physical laws, with the PDE residual being a key component of the loss function.
Coupled Multi-Target Prediction: They are designed to simultaneously forecast multiple, physically linked variables. The primary use case is predicting land subsidence (\(s\)) and groundwater levels (\(h\)) in a coupled manner.
Advanced Input Handling: The architecture natively processes three distinct types of time series inputs:
- Static features: Time-invariant metadata (e.g., sensor location, soil type).
- Dynamic past features: Time-varying data observed up to the present (e.g., historical rainfall, past measurements).
- Known future features: Time-varying data known in advance (e.g., day of the week, scheduled pumping).
This is enhanced by an optional VariableSelectionNetwork (VSN) for intelligent, learnable feature selection.
Sophisticated Temporal Processing: To capture complex time dependencies, the models employ:
- A MultiScaleLSTM that processes the input sequence at various user-defined temporal resolutions, capturing both short-term and long-term patterns.
- A rich suite of attention mechanisms that work together to fuse information from all sources, including CrossAttention (for encoder-decoder interaction) and various self-attention layers like HierarchicalAttention.
Flexible Physics-Informed Constraints: The physics module is highly configurable:
- The specific physical law to enforce can be selected via the pde_mode parameter, with the primary focus being on 'consolidation'.
- Key physical coefficients in the PDEs (like the consolidation coefficient \(C\) or hydraulic conductivity \(K\)) can be either fixed as known constants or made learnable. This allows the model to perform parameter inversion: discovering the values of physical constants directly from the observational data.
Probabilistic Forecasting for Uncertainty: The models can produce probabilistic forecasts to quantify prediction uncertainty. By specifying a list of quantiles, the QuantileDistributionModeling head is activated, generating prediction intervals alongside the point forecast.
Multi-Horizon Output Structure: Using a MultiDecoder, the models generate predictions for each step in the forecast horizon in a sequence-to-sequence manner, making them true multi-step-ahead forecasters.
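To make the probabilistic forecasting feature more concrete, the sketch below implements the standard quantile (pinball) loss in NumPy. It is a minimal illustration of how a quantile head can be scored, not the library's own implementation; the function name `pinball_loss` and the array layout are assumptions made for this example.

```python
import numpy as np

def pinball_loss(y_true, y_pred, quantiles):
    """Average quantile (pinball) loss.

    y_true:    array of shape (batch, horizon)
    y_pred:    array of shape (batch, horizon, n_quantiles)
    quantiles: sequence of quantile levels in (0, 1)
    """
    q = np.asarray(quantiles, dtype=float)   # shape (n_quantiles,)
    error = y_true[..., None] - y_pred       # broadcast targets over quantiles
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    loss = np.maximum(q * error, (q - 1.0) * error)
    return float(loss.mean())

# One sample, two forecast steps, three quantiles (10%, 50%, 90%).
y_true = np.array([[1.0, 2.0]])
y_pred = np.array([[[0.5, 1.0, 1.5], [1.5, 2.0, 2.5]]])
loss_value = pinball_loss(y_true, y_pred, [0.1, 0.5, 0.9])
```

Because the median forecast matches the targets exactly here, only the outer quantiles contribute to the loss, and they are penalised lightly since the targets lie inside the predicted interval.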
Physical Formulation and Hybrid Loss
The power of the PIHALNet family lies in its hybrid formulation,
which forces the data-driven predictions to conform to physical laws.
This is achieved by integrating the governing equations of groundwater
hydrology and soil mechanics directly into the model’s training
objective.
Governing Equations
The models are designed to understand and simulate two coupled
physical processes. The specific equations activated during training
depend on the pde_mode setting.
1. Transient Groundwater Flow
This equation, a form of the diffusion equation, enforces the conservation of mass for groundwater moving through a porous medium. It describes how the hydraulic head \(h\) changes in time and space. The 2D residual form used by the model is:
\[\mathcal{R}_{gw} = S_s \frac{\partial h}{\partial t} - K \left( \frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2} \right) - Q = 0\]
Here, \(K\) is the hydraulic conductivity, \(S_s\) is the specific storage, and \(Q\) is a source/sink term.
2. Aquifer-System Consolidation
This principle links the rate of land subsidence (\(s\)) to changes in the hydraulic head field. As the head \(h\) declines, pressure within the aquifer system changes, causing fine-grained clay layers to compact, which results in subsidence at the surface. The residual form of this relationship is:
\[\mathcal{R}_{c} = \frac{\partial s}{\partial t} - C \left( \frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2} \right) = 0\]
Here, \(C\) is the consolidation coefficient, a parameter that encapsulates the mechanical properties of the aquifer system.
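The idea of recovering \(C\) from data can be illustrated with a small NumPy sketch. It assumes the consolidation residual has the form \(\mathcal{R}_c = \partial s / \partial t - C \, (\partial^2 h / \partial x^2 + \partial^2 h / \partial y^2)\), and it swaps the model's gradient-based learning for a closed-form least-squares estimate, purely to show that the residual carries enough information to identify \(C\). All values are synthetic and the names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic derivative samples (illustrative values, not real measurements).
C_TRUE = 0.01
laplacian_h = rng.normal(size=500)   # stands in for d2h/dx2 + d2h/dy2
ds_dt = C_TRUE * laplacian_h + rng.normal(scale=1e-4, size=500)

def consolidation_residual(ds_dt, laplacian_h, C):
    """Assumed residual form: R_c = ds/dt - C * laplacian(h)."""
    return ds_dt - C * laplacian_h

# Closed-form minimiser of mean(R_c**2) with respect to C.
C_hat = float(np.dot(ds_dt, laplacian_h) / np.dot(laplacian_h, laplacian_h))

mse_at_estimate = float(np.mean(consolidation_residual(ds_dt, laplacian_h, C_hat) ** 2))
mse_at_zero = float(np.mean(consolidation_residual(ds_dt, laplacian_h, 0.0) ** 2))
```

In the actual models a learnable coefficient would be updated by the optimizer together with the network weights; the least-squares shortcut only works here because the derivatives are given directly.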
Operational Workflow: From Data to Physics
The model’s custom train_step seamlessly integrates the data-driven
and physics-informed components in a three-stage process:
1. Data-Driven Prediction
First, the model acts as a powerful data-driven forecaster. It uses
its internal BaseAttentive engine to process the rich set of
static, dynamic, and future features. This stage produces initial,
highly accurate “mean” predictions for the target variables, denoted
as \(\bar{s}_{net}\) and \(\bar{h}_{net}\). These predictions
are used to calculate the data-fidelity portion of the loss.
2. Physics Residual Calculation
Next, the physics module is activated. The model takes the mean
predictions (\(\bar{s}_{net}\), \(\bar{h}_{net}\)) and their
corresponding spatio-temporal coordinates (\(t, x, y\)). Using
TensorFlow’s GradientTape for automatic differentiation, it
computes all the necessary derivatives (e.g.,
\(\frac{\partial \bar{s}_{net}}{\partial t}\),
\(\frac{\partial^2 \bar{h}_{net}}{\partial x^2}\)). These derivatives
are then plugged into the governing equations to calculate the
physics residuals, \(\mathcal{R}_{gw}\) and/or \(\mathcal{R}_{c}\).
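The residual calculation can be sketched without TensorFlow. The example below replaces automatic differentiation with central finite differences, which is a deliberate substitution so that it runs with NumPy alone, and it assumes the groundwater residual has the form \(\mathcal{R}_{gw} = S_s \, \partial h / \partial t - K (\partial^2 h / \partial x^2 + \partial^2 h / \partial y^2) - Q\). The head field is an analytic solution of that equation, so the residual should be close to zero. Coefficient values and function names are illustrative.

```python
import numpy as np

K, S_S, Q = 2.0, 0.5, 0.0   # illustrative coefficients
WAVE = 1.0                  # spatial wavenumber of the test field

def head(t, x, y):
    """Analytic solution of S_s * dh/dt = K * laplacian(h) with Q = 0."""
    decay = 2.0 * K * WAVE**2 / S_S
    return np.exp(-decay * t) * np.sin(WAVE * x) * np.sin(WAVE * y)

def groundwater_residual(f, t, x, y, eps=1e-3):
    """R_gw = S_s*dh/dt - K*(d2h/dx2 + d2h/dy2) - Q via central differences."""
    dh_dt = (f(t + eps, x, y) - f(t - eps, x, y)) / (2 * eps)
    d2h_dx2 = (f(t, x + eps, y) - 2 * f(t, x, y) + f(t, x - eps, y)) / eps**2
    d2h_dy2 = (f(t, x, y + eps) - 2 * f(t, x, y) + f(t, x, y - eps)) / eps**2
    return S_S * dh_dt - K * (d2h_dx2 + d2h_dy2) - Q

# Evaluate the residual at random collocation points (t, x, y).
rng = np.random.default_rng(0)
t, x, y = rng.uniform(0.0, 1.0, size=(3, 64))
residual = groundwater_residual(head, t, x, y)
max_abs_residual = float(np.max(np.abs(residual)))
```

A network prediction that violates the equation would produce a visibly non-zero residual at the same points, and that is the quantity the physics loss penalises.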
3. Composite Loss Function
Finally, the total loss function, \(\mathcal{L}_{total}\), is assembled as a weighted sum of the data and physics components:
\[\mathcal{L}_{total} = \mathcal{L}_{data} + \sum_{i} \lambda_{i} \, \mathcal{L}_{physics, i}\]
- \(\mathcal{L}_{data}\): This is the supervised loss (e.g., Mean Squared Error or a Quantile Loss) calculated between the model's final forecast and the true observational data.
- \(\mathcal{L}_{physics, i}\): This is the Mean Squared Error of a specific PDE residual (e.g., \(\text{mean}(\mathcal{R}_c^2)\)). It quantifies how much the predictions violate that physical law.
- \(\lambda_{i}\): These are user-defined hyperparameters (e.g., lambda_gw, lambda_cons) passed to .compile() that control the influence of each physical constraint on the total loss.
This composite loss is then used to update all trainable parameters in the model, ensuring that the network learns to be accurate to both the data and the underlying physics simultaneously.
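As a minimal sketch of this weighted sum, the NumPy function below combines a mean-squared data loss with one mean-squared term per PDE residual. The function name `composite_loss` and the dictionary keys are hypothetical; the weights stand in for hyperparameters such as lambda_gw and lambda_cons.

```python
import numpy as np

def composite_loss(y_true, y_pred, residuals, lambdas):
    """L_total = L_data + sum_i lambda_i * mean(R_i**2)."""
    data_loss = float(np.mean((y_true - y_pred) ** 2))
    physics_losses = {
        name: float(np.mean(np.square(r))) for name, r in residuals.items()
    }
    total = data_loss + sum(
        lambdas[name] * value for name, value in physics_losses.items()
    )
    return total, data_loss, physics_losses

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])
residuals = {"gw": np.array([0.0, 2.0]), "cons": np.array([1.0, 1.0])}
lambdas = {"gw": 0.5, "cons": 0.1}   # stand-ins for lambda_gw and lambda_cons
total, data_loss, physics_losses = composite_loss(y_true, y_pred, residuals, lambdas)
```

Raising a weight pushes the optimizer to favour physical consistency over fit to the observations, so these values are usually tuned alongside the learning rate.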
Architectural & Feature Differences
While both models in the PIHALNet family aim to solve the same
problem, they represent a significant evolution in software design
and capability. Understanding their differences is key to leveraging
the full power of the library.
The Legacy PiHALNet
The original PiHALNet is a monolithic, self-contained class that
inherits directly from tf.keras.Model. Its data-driven components,
such as the LSTMs and attention layers, were implemented specifically
for its own use case. While effective, this design meant that the
architecture was relatively rigid. Configuration was handled via a
long list of parameters in the __init__ method, and the sequence
of internal operations (like the application of attention) was largely
fixed.
The Modern PIHALNet (BaseAttentive-based)
The modern PIHALNet represents a paradigm shift towards
modularity and flexibility.
- Inheritance from BaseAttentive: Its most important feature is that it inherits from the BaseAttentive class. It does not reinvent the data-driven forecasting engine; instead, it inherits a powerful, tested, and highly configurable one. This means any improvements to BaseAttentive are immediately available to PIHALNet.
- Smart Configuration: Architectural choices are no longer controlled by numerous, disconnected parameters. Instead, they are defined in a single, clean architecture_config dictionary. This allows for clear and explicit control over key components like the encoder_type ('hybrid' vs. 'transformer') or feature_processing ('vsn' vs. 'dense'). This makes experimenting with different architectures trivial.
- Modular Attention Stack: The sequence of attention mechanisms in the decoder is no longer hardcoded. It is now controlled by the decoder_attention_stack key in the configuration dictionary, allowing the user to easily add, remove, or reorder attention layers (e.g., ['cross', 'hierarchical']) to tailor the model to a specific problem.
In essence, the modern design separates the “what” (the physics
of subsidence and groundwater flow, handled by PIHALNet) from the
“how” (the data-driven sequence processing, handled by
BaseAttentive).
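The configuration-dictionary pattern can be sketched in plain Python. The helper below, `resolve_architecture_config`, is hypothetical and not part of the library: it merges user overrides into a set of defaults and rejects unknown keys or values. The allowed values are the ones named in this guide; the default values are placeholders chosen for illustration and may differ from the library's real defaults.

```python
ALLOWED = {
    "encoder_type": {"hybrid", "transformer"},
    "feature_processing": {"vsn", "dense"},
}

# Placeholder defaults for illustration; the library's real defaults may differ.
DEFAULTS = {
    "encoder_type": "hybrid",
    "feature_processing": "vsn",
    "decoder_attention_stack": ["cross", "hierarchical"],
}

def resolve_architecture_config(user_config=None):
    """Merge user overrides into the defaults and reject unknown keys or values."""
    # Copy list values so callers cannot mutate the shared defaults.
    config = {k: (list(v) if isinstance(v, list) else v) for k, v in DEFAULTS.items()}
    for key, value in (user_config or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"Unknown architecture key: {key!r}")
        if key in ALLOWED and value not in ALLOWED[key]:
            raise ValueError(f"{key}={value!r} is not one of {sorted(ALLOWED[key])}")
        config[key] = value
    return config

config = resolve_architecture_config(
    {"encoder_type": "transformer", "feature_processing": "dense"}
)
```

Keeping every architectural switch in one dictionary means a typo in a key fails loudly at construction time instead of being silently ignored.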
Comparison Summary
| Feature | PiHALNet (Legacy) | PIHALNet (Modern, BaseAttentive-based) |
|---|---|---|
| Base Class | Inherits directly from tf.keras.Model. | Inherits from the powerful and modular BaseAttentive class. |
| Core Architecture | Data-driven components are implemented internally and are specific to this class. | Leverages the full, tested, and highly configurable BaseAttentive engine. |
| Configuration | Primarily configured via a long list of individual __init__ parameters. | Uses the modern architecture_config dictionary. |
| Attention Mechanism | The sequence of attention layers is largely hardcoded within the call method. | The decoder's attention stack is fully configurable via the decoder_attention_stack key. |
| Feature Selection | Control over VSNs is a simple boolean flag (use_vsn). | Controlled via the feature_processing key ('vsn' or 'dense'). |
For all new projects, the modern, BaseAttentive-based PIHALNet
is the recommended choice due to its modularity,
configurability, and alignment with the latest architectural patterns
in the library. The legacy version is maintained for backward
compatibility.
PIHALNet (Modern, BaseAttentive-based)
- API Reference: PIHALNet
The modern PIHALNet is a powerful and flexible implementation built
upon the modular BaseAttentive
architecture. It combines a state-of-the-art data-driven forecasting
engine with physics-based regularization, making it the recommended
choice for all new projects.
This version inherits all the advanced features of its parent class, including the smart configuration system, which allows for precise control over the model’s internal structure.
Usage Example: Standard Hybrid Model
This example demonstrates a typical use case for PIHALNet, where
we use the default hybrid architecture (LSTM + Attention) and configure
it to learn the physical consolidation coefficient \(C\) from data.
import tensorflow as tf
from fusionlab.nn.pinn import PIHALNet
from fusionlab.params import LearnableC

# 1. Define Model & Data Dimensions
BATCH_SIZE = 16
PAST_STEPS = 10
HORIZON = 5
STATIC_DIM, DYNAMIC_DIM, FUTURE_DIM = 4, 6, 3

# 2. Prepare Dummy Input Data
# Feature-based inputs for the data-driven core
static_features = tf.random.normal([BATCH_SIZE, STATIC_DIM])
dynamic_features = tf.random.normal([BATCH_SIZE, PAST_STEPS, DYNAMIC_DIM])
# For 'pihal_like' mode, future features span the horizon
future_features = tf.random.normal([BATCH_SIZE, HORIZON, FUTURE_DIM])

# Coordinate inputs for the PINN module
coords = tf.random.normal([BATCH_SIZE, HORIZON, 3])  # (t, x, y)

# Assemble the full input dictionary
inputs = {
    "static_features": static_features,
    "dynamic_features": dynamic_features,
    "future_features": future_features,
    "coords": coords,
}

# Prepare dummy target data
true_subsidence = tf.random.normal([BATCH_SIZE, HORIZON, 1])
true_gwl = tf.random.normal([BATCH_SIZE, HORIZON, 1])
targets = {
    "subs_pred": true_subsidence,
    "gwl_pred": true_gwl,
}

# 3. Instantiate the Model
model = PIHALNet(
    static_input_dim=STATIC_DIM,
    dynamic_input_dim=DYNAMIC_DIM,
    future_input_dim=FUTURE_DIM,
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    mode='pihal_like',
    # Ask the model to discover the consolidation coefficient
    pinn_coefficient_C=LearnableC(initial_value=0.01),
)

# 4. Compile the model with data losses and a physics weight
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss={'subs_pred': 'mse', 'gwl_pred': 'mse'},
    lambda_physics=0.1,  # Weight for the consolidation loss
)

# 5. Display the model summary
model.summary(line_length=110)
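The input dictionary in this example follows a fixed shape contract: static features are 2D, while dynamic features, future features, and coordinates are 3D and share the batch dimension. The helper below is a hypothetical NumPy sketch that checks this contract before training; it is not part of the library, and it assumes, as in this example, that future features and coordinates both span the forecast horizon.

```python
import numpy as np

def check_input_shapes(inputs, horizon):
    """Validate the shape contract of the input dictionary (hypothetical helper)."""
    batch = inputs["static_features"].shape[0]
    expected_rank = {
        "static_features": 2,
        "dynamic_features": 3,
        "future_features": 3,
        "coords": 3,
    }
    for name, rank in expected_rank.items():
        array = inputs[name]
        if array.ndim != rank:
            raise ValueError(f"{name}: expected rank {rank}, got {array.ndim}")
        if array.shape[0] != batch:
            raise ValueError(f"{name}: batch size {array.shape[0]} != {batch}")
    # In this example, future features and coordinates both span the horizon.
    for name in ("future_features", "coords"):
        if inputs[name].shape[1] != horizon:
            raise ValueError(f"{name}: expected {horizon} time steps")
    if inputs["coords"].shape[-1] != 3:
        raise ValueError("coords: last axis must hold (t, x, y)")
    return True

rng = np.random.default_rng(0)
example_inputs = {
    "static_features": rng.normal(size=(16, 4)),
    "dynamic_features": rng.normal(size=(16, 10, 6)),
    "future_features": rng.normal(size=(16, 5, 3)),
    "coords": rng.normal(size=(16, 5, 3)),
}
shapes_ok = check_input_shapes(example_inputs, horizon=5)
```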
Advanced Configuration Example
This example demonstrates the power and flexibility of the smart configuration system. We will create a PIHALNet variant that uses a pure transformer encoder and a simplified attention stack in the decoder, showcasing how easily the internal architecture can be modified.
# 1. Define a custom architecture using the config dictionary
transformer_pinn_config = {
    'encoder_type': 'transformer',
    'decoder_attention_stack': ['cross', 'hierarchical'],  # Simpler stack
    'feature_processing': 'dense',  # Use dense layers instead of VSN
}

# 2. Instantiate the model with the custom architecture
tfmr_pinn_model = PIHALNet(
    static_input_dim=STATIC_DIM,
    dynamic_input_dim=DYNAMIC_DIM,
    future_input_dim=FUTURE_DIM,
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    mode='pihal_like',
    pinn_coefficient_C=0.05,  # Use a fixed physical constant
    architecture_config=transformer_pinn_config,  # Pass the config
)

# 3. Compile the model as before
tfmr_pinn_model.compile(
    optimizer='adam',
    loss='mae',  # Use a different data loss
    lambda_physics=0.2,
)

# 4. Train for a single step to demonstrate it works
print("\nTraining a Transformer-based PIHALNet for one step...")
history = tfmr_pinn_model.fit(
    inputs, targets, epochs=1, verbose=1
)
print("\nTraining step complete.")
PiHALNet (Legacy Version)
- API Reference: PiHALNet
This section documents the original, legacy version of PiHALNet. It
is maintained primarily for backward compatibility. For all new
projects, using the modern PIHALNet
(which inherits from BaseAttentive) is strongly recommended due to
its superior flexibility and modularity.
The legacy PiHALNet is a self-contained, monolithic class that
implements its data-driven components (LSTMs, attention) internally.
Its architecture is configured via a long list of individual parameters
in its constructor, making it less flexible than the modern version’s
smart configuration system.
Usage Example
The instantiation and compilation process is similar to the modern
version, but it relies on direct keyword arguments like objective
and attention_levels instead of the architecture_config
dictionary.
import tensorflow as tf
from fusionlab.nn.pinn.models.legacy import PiHALNet

# 1. Define Model & Data Dimensions
BATCH_SIZE = 16
PAST_STEPS = 10
HORIZON = 5
STATIC_DIM, DYNAMIC_DIM, FUTURE_DIM = 4, 6, 3

# 2. Prepare Dummy Input Data (same as modern version)
inputs = {
    "static_features": tf.random.normal([BATCH_SIZE, STATIC_DIM]),
    "dynamic_features": tf.random.normal([BATCH_SIZE, PAST_STEPS, DYNAMIC_DIM]),
    "future_features": tf.random.normal([BATCH_SIZE, HORIZON, FUTURE_DIM]),
    "coords": tf.random.normal([BATCH_SIZE, HORIZON, 3]),
}
targets = {
    "subs_pred": tf.random.normal([BATCH_SIZE, HORIZON, 1]),
    "gwl_pred": tf.random.normal([BATCH_SIZE, HORIZON, 1]),
}

# 3. Instantiate the Legacy Model
# Note the direct use of parameters like `objective`
legacy_model = PiHALNet(
    static_input_dim=STATIC_DIM,
    dynamic_input_dim=DYNAMIC_DIM,
    future_input_dim=FUTURE_DIM,
    output_subsidence_dim=1,
    output_gwl_dim=1,
    forecast_horizon=HORIZON,
    max_window_size=PAST_STEPS,
    objective='hybrid',  # Configured directly
    pinn_coefficient_C='learnable',
)

# 4. Compile and train as usual
legacy_model.compile(
    optimizer='adam',
    loss='mse',
    lambda_physics=0.1,
)
print("Successfully instantiated and compiled the legacy PiHALNet model.")
Next Steps
Note
Now that you are familiar with the architecture and features of
the PIHALNet models, you can put them into practice.
Proceed to the exercises for a hands-on guide: Exercise: Hybrid Forecasting with PIHALNet