fusionlab.nn.pinn.PiHALNet
- class fusionlab.nn.pinn.PiHALNet
Bases: Model, NNLearner
Physics-Informed Hybrid Attentive LSTM Network (PiHALNet).
This model integrates a data-driven forecasting architecture, based on LSTMs and multiple attention mechanisms, with physics-informed constraints derived from the governing equations of land subsidence. It is designed to produce physically consistent, multi-horizon probabilistic forecasts for both subsidence and groundwater levels, while also offering the capability to discover physical parameters from observational data.
The architecture can operate in two modes for its physical coefficients, such as the consolidation coefficient “C”:
1. Parameter Specification: use a predefined, user-supplied constant value.
2. Parameter Discovery: treat the coefficient as a trainable variable (default), learning log(C) during training to ensure positivity.
PIHALNet’s total loss is a weighted sum of the data fidelity loss on subsidence/GWL predictions and a physics residual loss (PDE loss).
Formulation
Given inputs \(\mathbf{x}_{\text{static}}\), \(\mathbf{x}_{\text{dyn}}\), and (optionally) \(\mathbf{x}_{\text{fut}}\), PIHALNet produces multi-horizon predictions:
\[\hat{s}[t+h],\; \hat{h}[t+h] \quad (h=1,\dots,H),\]
for subsidence \(s\) and GWL \(h\). The data loss (\(L_{\text{data}}\)) is:
\[L_{\text{data}} = \sum_{h=1}^H \bigl\{ \ell\bigl( \hat{s}[t+h], s[t+h]\bigr) + \ell\bigl( \hat{h}[t+h], h[t+h]\bigr) \bigr\},\]
where \(\ell\) is typically MSE or MAE. The PDE residual loss (\(L_{\text{pde}}\)) for Terzaghi’s consolidation equation is:
\[L_{\text{pde}} = \frac{1}{H} \sum_{h=1}^H \bigl\| C \, \Delta s[t+h] - \frac{\partial h}{\partial t}[t+h] \bigr\|^2,\]
computed via finite differences on the sequence of mean predictions. The total loss is:
\[L_{\text{total}} = L_{\text{data}} + \lambda_{\text{pde}} \, L_{\text{pde}},\]
where \(\lambda_{\text{pde}}\) is a user-defined weight.
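As a rough illustration of how the three terms combine, the sketch below evaluates them with NumPy on toy arrays. It assumes MSE for \(\ell\), forward differences along the horizon axis (unit time step) for both \(\Delta s\) and \(\partial h / \partial t\), zero-padding of the last step, and a fixed \(C\). The function names `data_loss`, `pde_residual`, and `total_loss` are hypothetical and are not part of fusionlab.

```python
import numpy as np

def data_loss(s_hat, s, h_hat, h):
    """Sum over horizon steps of the per-step MSE for subsidence and GWL."""
    per_step = ((s_hat - s) ** 2).mean(axis=(0, 2)) + ((h_hat - h) ** 2).mean(axis=(0, 2))
    return float(per_step.sum())

def pde_residual(s_hat, h_hat, C, dt=1.0):
    """Residual C * delta_s - dh/dt from forward differences, zero-padded to H steps."""
    delta_s = np.diff(s_hat, axis=1)        # step-to-step change in subsidence
    dh_dt = np.diff(h_hat, axis=1) / dt     # forward-difference time derivative of GWL
    pad = np.zeros_like(s_hat[:, :1, :])    # last step has no forward neighbour (assumption)
    return np.concatenate([C * delta_s - dh_dt, pad], axis=1)

def total_loss(s_hat, s, h_hat, h, C, lambda_pde):
    """Return (L_data, L_pde, L_total) for arrays of shape (B, H, 1)."""
    l_data = data_loss(s_hat, s, h_hat, h)
    # Mean over batch and horizon gives the (1/H) sum, averaged over the batch.
    l_pde = float((pde_residual(s_hat, h_hat, C) ** 2).mean())
    return l_data, l_pde, l_data + lambda_pde * l_pde

# Toy batch: B=1, H=3, one target dimension each.
s_hat = np.array([[[0.0], [1.0], [3.0]]])
h_hat = np.array([[[0.0], [2.0], [2.0]]])
s_true = np.zeros_like(s_hat)
h_true = np.zeros_like(h_hat)
l_data, l_pde, l_total = total_loss(s_hat, s_true, h_hat, h_true, C=2.0, lambda_pde=0.5)
```

With a single horizon step the difference arrays are empty, so the padded residual is all zeros and only the data term contributes.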
- param static_input_dim:
Dimensionality of static (time-invariant) feature vector. Must be \(\geq 0\). If zero, static inputs are omitted.
- type static_input_dim:
int
- param dynamic_input_dim:
Dimensionality of the historical dynamic feature vector at each time step. Must be \(\geq 1\).
- type dynamic_input_dim:
int
- param future_input_dim:
Dimensionality of known future covariates at each forecast step. Must be \(\geq 0\). If zero, no future covariates are used.
- type future_input_dim:
int
- param output_subsidence_dim:
Number of simultaneous subsidence targets (usually 1). Must be \(\geq 1\).
- type output_subsidence_dim:
int, default 1
- param output_gwl_dim:
Number of simultaneous groundwater-level targets (usually 1). Must be \(\geq 1\).
- type output_gwl_dim:
int, default 1
- param embed_dim:
Size of the embedding after initial feature processing (VSN/GRN). Controls hidden dimension for attention and LSTM inputs.
- type embed_dim:
int, default 32
- param hidden_units:
Number of hidden units in the Gated Residual Networks (GRNs) and Dense layers. Must be \(\geq 1\).
- type hidden_units:
int, default 64
- param lstm_units:
Number of units in each LSTM cell. Must be \(\geq 1\). For multi-scale LSTM, this is the base size at each scale.
- type lstm_units:
int, default 64
- param attention_units:
Number of units in multi-head attention and Hierarchical Attention layers. Must be \(\geq 1\).
- type attention_units:
int, default 32
- param num_heads:
Number of attention heads in all multi-head attention modules. Must be \(\geq 1\).
- type num_heads:
int, default 4
- param dropout_rate:
Dropout rate applied in various layers (VSN GRNs, attention heads, final MLP). Must be in \([0.0, 1.0]\).
- type dropout_rate:
float, default 0.1
- param forecast_horizon:
Number of future time steps to predict. Must be \(\geq 1\). Multi-horizon predictions are produced for \(h=1,\dots,\text{forecast_horizon}\).
- type forecast_horizon:
int, default 1
- param quantiles:
If provided, PIHALNet will output quantile forecasts at each horizon. Each quantile dimension produces an additional branch. If None, only mean predictions are used for PDE residual (physics) loss.
- type quantiles:
list of float, optional
- param max_window_size:
Maximum time-window length for DynamicTimeWindow layer. Must be \(\geq 1\). Controls the longest subsequence of historical dynamic features used at each decoding step.
- type max_window_size:
int, default 10
- param memory_size:
Size of the memory bank for MemoryAugmentedAttention. Must be \(\geq 1\).
- type memory_size:
int, default 100
- param scales:
Scales used in MultiScaleLSTM. If “auto”, scales are chosen automatically based on forecast_horizon. Otherwise, each scale must divide the forecast_horizon. Example: \([1, 3, 6]\) for a 6-step horizon.
- type scales:
list of int or “auto”, optional
- param multi_scale_agg:
Aggregation method for multi-scale outputs:
  - “last”: take last time-step output from each scale.
  - “average”: average outputs over time.
  - “flatten”: concatenate outputs over time.
  - “auto”: choose “last” by default.
- type multi_scale_agg:
{“last”, “average”, “flatten”, “auto”}, default “last”
- param final_agg:
Aggregation method after DynamicTimeWindow:
  - “last”: use final time-step.
  - “average”: average over windows.
  - “flatten”: flatten all window outputs.
- type final_agg:
{“last”, “average”, “flatten”}, default “last”
- param activation:
Activation function for all GRNs, Dense layers, and VSNs. If a string, must be one of Keras built-ins (e.g. “relu”, “gelu”).
- type activation:
str or callable, default “relu”
- param use_residuals:
If True, apply a residual connection via a Dense layer to embeddings.
- type use_residuals:
bool, default True
- param use_batch_norm:
If True, apply LayerNormalization after each Dense/GRN block.
- type use_batch_norm:
bool, default False
- param pde_mode:
Determines which PDE component(s) to include in the physics loss:
  - “consolidation”: solve Terzaghi’s consolidation only (1-D vertical).
  - “gw_flow”: solve coupled groundwater flow (reserved for future release).
  - “both”: include both consolidation and gw_flow (reserved).
  - “none”: disable physics loss entirely.
If a list is provided, only those modes are active. “consolidation” is enforced by default for this release.
- type pde_mode:
str or list of str or None, default “consolidation”
- param pinn_coefficient_C:
Configuration for consolidation coefficient \(C\):
  - “learnable”: estimate \(\log(C)\) as a trainable scalar.
  - float (\(>0\)): use this fixed constant.
  - None: disable physics entirely (\(C=1\) but unused).
- type pinn_coefficient_C:
str, float, or None, default “learnable”
- param gw_flow_coeffs:
Dictionary of groundwater-flow coefficients:
  - “K”: hydraulic conductivity (\(>0\)), default “learnable”.
  - “Ss”: specific storage (\(>0\)), default “learnable”.
  - “Q”: source/sink term, default 0.0.
Only used if “gw_flow” is in pde_mode.
- type gw_flow_coeffs:
dict or None, default None
- param use_vsn:
If True, apply VariableSelectionNetwork blocks to static, dynamic, and future inputs. If False, skip VSN and use simple dense projections.
- type use_vsn:
bool, default True
- param vsn_units:
Output dimension of each VSN block. If None, defaults to hidden_units.
- type vsn_units:
int or None, default None
- param name:
Keras model name.
- type name:
str, default “PIHALNet”
- param **kwargs:
Additional keyword arguments forwarded to tf.keras.Model initializer.
- ivar static_vsn:
VSN block for static features (if use_vsn=True and static_input_dim>0). Otherwise None.
- vartype static_vsn:
VariableSelectionNetwork or None
- ivar dynamic_vsn:
VSN block for dynamic features (if use_vsn=True).
- vartype dynamic_vsn:
VariableSelectionNetwork or None
- ivar future_vsn:
VSN block for future features (if use_vsn=True and future_input_dim>0).
- vartype future_vsn:
VariableSelectionNetwork or None
- ivar multi_modal_embedding:
Fallback embedding layer if VSNs are disabled or incomplete.
- vartype multi_modal_embedding:
MultiModalEmbedding or None
- ivar multi_scale_lstm:
LSTM module that operates at multiple temporal scales.
- vartype multi_scale_lstm:
MultiScaleLSTM
- ivar hierarchical_attention:
Performs hierarchical attention over dynamic/future features.
- vartype hierarchical_attention:
HierarchicalAttention
- ivar cross_attention:
Cross-attends dynamic features with fused embeddings.
- vartype cross_attention:
CrossAttention
- ivar memory_augmented_attention:
Attention mechanism with an external memory bank.
- vartype memory_augmented_attention:
MemoryAugmentedAttention
- ivar dynamic_time_window:
Splits attention-fused features into overlapping windows.
- vartype dynamic_time_window:
DynamicTimeWindow
- ivar multi_decoder:
Produces final multi-horizon mean forecasts for combined targets.
- vartype multi_decoder:
MultiDecoder
- ivar quantile_distribution_modeling:
If quantiles is not None, applies quantile modeling over decoder outputs.
- vartype quantile_distribution_modeling:
QuantileDistributionModeling or None
- ivar positional_encoding:
Adds positional embeddings to fused features.
- vartype positional_encoding:
PositionalEncoding
- ivar learned_normalization:
Optional normalization layer applied to raw inputs or VSN outputs.
- vartype learned_normalization:
LearnedNormalization
- ivar log_C_coefficient:
If pinn_coefficient_C == “learnable”, stores \(\log(C)\). Otherwise None.
- vartype log_C_coefficient:
tf.Variable or None
- ivar log_K_gwflow_var:
Trainable log(K) for groundwater flow, if enabled.
- vartype log_K_gwflow_var:
tf.Variable or None
- ivar log_Ss_gwflow_var:
Trainable log(Ss) for groundwater flow, if enabled.
- vartype log_Ss_gwflow_var:
tf.Variable or None
- ivar Q_gwflow_var:
Trainable or fixed Q term for groundwater flow, if enabled.
- vartype Q_gwflow_var:
tf.Variable or None
- call(inputs, training=False) -> dict[str, tf.Tensor]
Forward pass computing predictions and PDE residual:
1. Process inputs via process_pinn_inputs.
2. Validate tensor shapes (via validate_model_inputs).
3. Extract features with build_halnet_layers and attention/LSTM.
4. Decode multi-horizon mean predictions via multi_decoder.
5. If quantiles is provided, produce quantile outputs.
6. Split outputs into subs_pred, gwl_pred, plus mean for PDE.
7. Compute PDE residual via compute_consolidation_residual.
Returns a dict containing:
- “subs_pred”: subsidence forecasts (with quantiles if requested).
- “gwl_pred”: GWL forecasts (with quantiles if requested).
- “pde_residual”: tensor of physics residuals.
- Parameters:
inputs (Dict[str, Tensor | None])
training (bool)
- Return type:
Dict[str, Tensor]
- compile(optimizer, loss, metrics=None, loss_weights=None, lambda_pde=1.0, **kwargs)
Configures the model for training with a composite PINN loss.
Args:
optimizer : Keras optimizer instance (e.g. Adam).
loss : dict mapping output names (“subs_pred”, “gwl_pred”) to Keras loss functions or string identifiers (e.g. “mse”).
metrics : dict mapping output names to lists of metrics to track.
loss_weights : dict mapping output names to scalar weights for data loss.
lambda_pde : float, weight for physics residual loss. Defaults to 1.0.
**kwargs : Additional args for tf.keras.Model.compile.
- Parameters:
lambda_pde (float)
- train_step(data: tuple) -> dict[str, tf.Tensor]
Custom training step to handle the composite PINN loss.
1. Unpack (inputs_dict, targets_dict) from data.
2. Forward pass with self(inputs, training=True).
3. Extract subs_pred, gwl_pred for data loss.
4. Compute data loss via self.compute_loss(x, y, y_pred).
5. Compute physics loss: \(\mathrm{MSE}(\text{pde_residual})\).
6. Total loss = data_loss + lambda_pde * physics_loss.
7. Compute/apply gradients via optimizer.
8. Update compiled metrics.
Returns a dict of results: “loss” (total PINN loss), “data_loss”, “physics_loss”, plus any compiled metrics (e.g. “subs_mae”, “gwl_mae”).
- Parameters:
data (Tuple[Dict, Dict])
- Return type:
Dict[str, Tensor]
- get_pinn_coefficient_C() -> tf.Tensor or None
Returns the positive consolidation coefficient \(C\):
- If “learnable”, returns \(\exp(\text{log_C_coefficient})\).
- If fixed float was provided, returns that constant tensor.
- If disabled, returns \(1.0\) if only consolidation is active, else None.
- Return type:
Tensor
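The log-parameterisation behind the “learnable” mode can be sketched in a few lines of plain Python. This is only an illustration of why storing \(\log(C)\) keeps \(C\) positive; the helper `coefficient_from_log`, the learning rate, and the gradient value are hypothetical, and the starting value is the one shown for `log_pinn_coefficient_C` in the `__init__` signature on this page.

```python
import math

def coefficient_from_log(log_c: float) -> float:
    """Map an unconstrained log-value to a strictly positive coefficient."""
    return math.exp(log_c)

# Initial log-value taken from the default shown in the __init__ signature.
log_c = -4.6051702
c = coefficient_from_log(log_c)

# One plain gradient step on log_c (hypothetical learning rate 0.5, gradient 3.0).
# However large the step, exp() keeps the resulting coefficient above zero.
log_c_updated = log_c - 0.5 * 3.0
c_updated = coefficient_from_log(log_c_updated)
```

Because the optimizer only ever sees `log_c`, no clipping or projection step is needed to keep the coefficient physically meaningful.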
- get_K_gwflow(), get_Ss_gwflow(), get_Q_gwflow() -> tf.Tensor or None
Return the hydraulic conductivity \(K\) and specific storage \(S_s\) (both positive) and the source/sink term \(Q\) of the groundwater flow PDE if “gw_flow” mode is active. Otherwise, return None.
- get_config() -> dict
Returns a dict of all initialization arguments (static, dynamic, and future dims; PINN coefficients; architectural hyperparameters). Enables the model to be re-created from its configuration, for example via from_config or tf.keras.models.clone_model.
- from_config(config: dict, custom_objects=None) -> PIHALNet
Reconstructs a PIHALNet instance from get_config() output.
- run_halnet_core(static_input, dynamic_input, future_input, training) -> tf.Tensor
Executes the core data-driven feature pipeline:
- Applies VSNs (if enabled) or Dense to each input block.
- Aligns future features via align_temporal_dimensions.
- Concatenates dynamic + future embeddings (or uses MME if no VSN).
- Applies positional encoding and optional residual connection.
- Runs MultiScaleLSTM, hierarchical/cross/memory-attention, and fusion.
- Returns final features to feed into MultiDecoder.
- Parameters:
static_input (Tensor)
dynamic_input (Tensor)
future_input (Tensor)
training (bool)
- Return type:
Tensor
- split_outputs(predictions_combined, decoded_outputs_for_mean) -> tuple
Separates combined predictions into subsidence and GWL components:
- predictions_combined: may include a quantile dimension (Rank 4) or be Rank 3 if only mean forecasts. Splits along last axis into subsidence and GWL for data loss.
- decoded_outputs_for_mean: always Rank 3 (Batch, Horizon, CombinedDim) before quantile modeling. Splits into s_pred_mean_for_pde and gwl_pred_mean_for_pde for physics residual calculation.
- Parameters:
predictions_combined (Tensor)
decoded_outputs_for_mean (Tensor)
- Return type:
Tuple[Tensor, Tensor, Tensor, Tensor]
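The last-axis split described above works the same way for Rank 3 and Rank 4 tensors. The NumPy sketch below illustrates the idea only; `split_last_axis` is a hypothetical helper, and the assumption that subsidence channels come before GWL channels on the combined axis is not confirmed by this page.

```python
import numpy as np

def split_last_axis(combined: np.ndarray, subs_dim: int, gwl_dim: int):
    """Split a (..., subs_dim + gwl_dim) array into subsidence and GWL parts."""
    if combined.shape[-1] != subs_dim + gwl_dim:
        raise ValueError("last axis must equal subs_dim + gwl_dim")
    # Assumed channel order: subsidence first, then GWL.
    return combined[..., :subs_dim], combined[..., subs_dim:]

B, H, Q = 2, 3, 3
mean_combined = np.zeros((B, H, 2), dtype="float32")         # Rank 3, mean only
quantile_combined = np.zeros((B, H, Q, 2), dtype="float32")  # Rank 4, with quantiles

s_mean, gwl_mean = split_last_axis(mean_combined, 1, 1)
s_q, gwl_q = split_last_axis(quantile_combined, 1, 1)
```

Using an ellipsis index keeps the helper independent of whether the quantile axis is present.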
Notes
If quantiles is provided, final outputs shape is (batch_size, horizon, num_quantiles, combined_output_dim). Otherwise shape is (batch_size, horizon, combined_output_dim).
Consolidation residual uses finite differences along horizon steps to approximate \(\partial h / \partial t\) and \(\Delta s\).
Groundwater-flow PDE is reserved for a future release (“gw_flow” mode).
VariableSelectionNetworks (VSNs) refine feature selection. If use_vsn=False, simple Dense projections are used instead.
MultiScaleLSTM can process temporal patterns at different resolutions. Scales must divide the forecast horizon if not “auto”.
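The divisibility rule for `scales` can be checked before the model is built. The function below is a minimal sketch of such a check under the rule stated in these notes; `check_scales` is a hypothetical helper, not a fusionlab function, and the library's own validation may differ.

```python
def check_scales(scales, forecast_horizon):
    """Return the scales unchanged if each one divides forecast_horizon."""
    if scales == "auto":
        return scales  # automatic selection is left to the model
    bad = [s for s in scales if s < 1 or forecast_horizon % s != 0]
    if bad:
        raise ValueError(
            f"scales {bad} do not divide forecast_horizon={forecast_horizon}"
        )
    return list(scales)

valid = check_scales([1, 3, 6], forecast_horizon=6)
auto = check_scales("auto", forecast_horizon=6)
```

For a 6-step horizon, `[1, 3, 6]` passes, while a scale of 4 would raise a `ValueError`.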
Examples
# 1) Instantiate PIHALNet for point forecasts (no quantiles):
>>> from fusionlab.nn.pinn.models import PiHALNet
>>> model = PiHALNet(
...     static_input_dim=5,
...     dynamic_input_dim=3,
...     future_input_dim=2,
...     output_subsidence_dim=1,
...     output_gwl_dim=1,
...     embed_dim=32,
...     hidden_units=64,
...     lstm_units=64,
...     attention_units=32,
...     num_heads=4,
...     dropout_rate=0.1,
...     forecast_horizon=1,
...     quantiles=None,
...     max_window_size=10,
...     memory_size=100,
...     scales='auto',
...     multi_scale_agg='last',
...     final_agg='last',
...     activation='relu',
...     use_residuals=True,
...     use_batch_norm=False,
...     pde_mode='consolidation',
...     pinn_coefficient_C='learnable',
...     gw_flow_coeffs=None,
...     use_vsn=True,
...     vsn_units=None,
... )
>>> model.compile(
...     optimizer='adam',
...     loss={'subs_pred': 'mse', 'gwl_pred': 'mse'},
...     metrics={'subs_pred': ['mae'], 'gwl_pred': ['mae']},
...     loss_weights={'subs_pred': 1.0, 'gwl_pred': 0.8},
...     lambda_pde=0.5,
... )
>>> import numpy as np
>>> inputs = {
...     'coords': np.zeros((2, 2), dtype='float32'),
...     'static_features': np.zeros((2, 5), dtype='float32'),
...     'dynamic_features': np.zeros((2, 1, 3), dtype='float32'),
...     'future_features': np.zeros((2, 1, 2), dtype='float32'),
... }
>>> targets = {
...     'subs_pred': np.zeros((2, 1, 1), dtype='float32'),
...     'gwl_pred': np.zeros((2, 1, 1), dtype='float32'),
... }
>>> outputs = model(inputs, training=False)
>>> print(outputs['subs_pred'].shape, outputs['gwl_pred'].shape)
(2, 1, 1) (2, 1, 1)
# 2) Instantiate with quantile forecasting (e.g., [0.1, 0.5, 0.9]):
>>> model_q = PiHALNet(
...     static_input_dim=5,
...     dynamic_input_dim=3,
...     future_input_dim=2,
...     output_subsidence_dim=1,
...     output_gwl_dim=1,
...     embed_dim=32,
...     hidden_units=64,
...     lstm_units=64,
...     attention_units=32,
...     num_heads=4,
...     dropout_rate=0.1,
...     forecast_horizon=3,
...     quantiles=[0.1, 0.5, 0.9],
...     max_window_size=10,
...     memory_size=100,
...     scales=[1, 3],
...     multi_scale_agg='average',
...     final_agg='flatten',
...     activation='gelu',
...     use_residuals=True,
...     use_batch_norm=True,
...     pde_mode='consolidation',
...     pinn_coefficient_C=0.02,  # fixed C = 0.02
...     gw_flow_coeffs={'K': 1e-4, 'Ss': 1e-5, 'Q': 0.0},
...     use_vsn=False,
...     vsn_units=None,
... )
>>> model_q.compile(
...     optimizer='adam',
...     loss={'subs_pred': 'mse', 'gwl_pred': 'mse'},
...     metrics={'subs_pred': ['mae'], 'gwl_pred': ['mae']},
...     loss_weights={'subs_pred': 1.0, 'gwl_pred': 0.8},
...     lambda_pde=1.0,
... )
>>> outputs_q = model_q(inputs, training=False)
>>> # With 3 quantiles, the combined output has shape (2, 3, 3, 2);
>>> # after the split, each target has shape (2, 3, 3, 1):
>>> print(outputs_q['subs_pred'].shape, outputs_q['gwl_pred'].shape)
(2, 3, 3, 1) (2, 3, 3, 1)
See also
fusionlab.nn.pinn.tuning.PIHALTuner: Hyperparameter tuner specifically built for PIHALNet.
fusionlab.nn.pinn.utils.process_pinn_inputs: Preprocessing of nested input dict to tensors.
fusionlab.nn.pinn._tensor_validation.validate_model_inputs: Ensures dynamic/static/future tensors match declared dims.
- __init__(static_input_dim, dynamic_input_dim, future_input_dim, output_subsidence_dim=1, output_gwl_dim=1, embed_dim=32, hidden_units=64, lstm_units=64, attention_units=32, num_heads=4, dropout_rate=0.1, forecast_horizon=1, quantiles=None, max_window_size=10, memory_size=100, scales=None, multi_scale_agg='last', final_agg='last', activation='relu', use_residuals=True, use_batch_norm=False, pde_mode='consolidation', pinn_coefficient_C=<LearnableC trainable=True, value=<tf.Variable 'log_pinn_coefficient_C:0' shape=() dtype=float32, numpy=-4.6051702>>, gw_flow_coeffs=None, use_vsn=True, vsn_units=None, mode=None, name='PiHALNet', **kwargs)
- Parameters:
static_input_dim (int)
dynamic_input_dim (int)
future_input_dim (int)
output_subsidence_dim (int)
output_gwl_dim (int)
embed_dim (int)
hidden_units (int)
lstm_units (int)
attention_units (int)
num_heads (int)
dropout_rate (float)
forecast_horizon (int)
quantiles (List[float] | None)
max_window_size (int)
memory_size (int)
scales (List[int] | None)
multi_scale_agg (str)
final_agg (str)
activation (str)
use_residuals (bool)
use_batch_norm (bool)
pde_mode (str | List[str] | None)
pinn_coefficient_C (LearnableC | FixedC | DisabledC | str | float | None)
gw_flow_coeffs (Dict[str, str | float | None] | None)
use_vsn (bool)
vsn_units (int | None)
mode (str | None)
name (str)
Methods
__init__(static_input_dim, ...[, ...])
add_loss(losses, **kwargs): Add loss tensor(s), potentially dependent on layer inputs.
add_metric(value[, name]): Adds metric tensor to the layer.
add_update(updates): Add update op(s), potentially dependent on layer inputs.
add_variable(*args, **kwargs): Deprecated, do NOT use! Alias for add_weight.
add_weight([name, shape, dtype, ...]): Adds a new variable to the layer.
build(input_shape): Builds the model based on input shapes received.
build_from_config(config): Builds the layer's states with the supplied config dict.
call(inputs[, training]): Forward pass for PIHALNet.
compile(optimizer, loss[, metrics, ...]): Configure PIHALNet for training with a composite PINN loss.
compile_from_config(config): Compiles the model with the information given in config.
compute_loss([x, y, y_pred, sample_weight]): Compute the total loss, validate it, and return it.
compute_mask(inputs[, mask]): Computes an output mask tensor.
compute_metrics(x, y, y_pred, sample_weight): Update metric states and collect all metrics to be returned.
compute_output_shape(input_shape): Computes the output shape of the layer.
compute_output_signature(input_signature): Compute the output tensor signature of the layer based on the inputs.
count_params(): Count the total number of scalars composing the weights.
evaluate([x, y, batch_size, verbose, ...]): Returns the loss value & metrics values for the model in test mode.
evaluate_generator(generator[, steps, ...]): Evaluates the model on a data generator.
export(filepath): Create a SavedModel artifact for inference (e.g. via TF-Serving).
finalize_state(): Finalizes the layers state after updating layer weights.
fit([x, y, batch_size, epochs, verbose, ...]): Trains the model for a fixed number of epochs (dataset iterations).
fit_generator(generator[, steps_per_epoch, ...]): Fits the model on data yielded batch-by-batch by a Python generator.
from_config(config[, custom_objects]): Creates a layer from its config.
get_build_config(): Returns a dictionary with the layer's input shape.
get_compile_config(): Returns a serialized config with information for compiling the model.
get_config(): Returns the config of the Model.
get_input_at(node_index): Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(node_index): Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(node_index): Retrieves the input shape(s) of a layer at a given node.
get_layer([name, index]): Retrieves a layer based on either its name (unique) or index.
get_metrics_result(): Returns the model's metrics values as a dict.
get_output_at(node_index): Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(node_index): Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(node_index): Retrieves the output shape(s) of a layer at a given node.
get_params([deep]): Get the parameters for this learner.
get_pinn_coefficient_C(): Returns the physical coefficient C.
get_weight_paths(): Retrieve all the variables and their paths for the model.
get_weights(): Retrieves the weights of the model.
help(**kwargs)
load(file_path[, format]): Load the learner's state from a specified file in the desired format.
load_own_variables(store): Loads the state of the layer.
load_weights(filepath[, skip_mismatch, ...]): Loads all layer weights from a saved file.
make_predict_function([force]): Creates a function that executes one step of inference.
make_test_function([force]): Creates a function that executes one step of evaluation.
make_train_function([force]): Creates a function that executes one step of training.
predict(x[, batch_size, verbose, steps, ...]): Generates output predictions for the input samples.
predict_generator(generator[, steps, ...]): Generates predictions for the input samples from a data generator.
predict_on_batch(x): Returns predictions for a single batch of samples.
predict_step(data): The logic for one inference step.
reset_metrics(): Resets the state of all the metrics in the model.
reset_states()
run_halnet_core(static_input, dynamic_input, ...): Execute the core encoder-decoder data-driven pipeline of PIHALNet.
save(filepath[, overwrite, save_format]): Saves a model as a TensorFlow SavedModel or HDF5 file.
save_own_variables(store): Saves the state of the layer.
save_spec([dynamic_batch]): Returns the tf.TensorSpec of call args as a tuple (args, kwargs).
save_weights(filepath[, overwrite, ...]): Saves all layer weights.
set_params(**params): Set the parameters of this learner.
set_weights(weights): Sets the weights of the layer, from NumPy arrays.
split_outputs(predictions_combined, ...): Separate the combined output tensor into individual subsidence and groundwater-level (GWL) components and return both the final and mean predictions needed for the two loss terms used in PIHALNet (data loss and physics/PDE loss).
summary([line_length, positions, print_fn, ...]): Prints a string summary of the network.
test_on_batch(x[, y, sample_weight, ...]): Test the model on a single batch of samples.
test_step(data): The logic for one evaluation step.
to_json(**kwargs): Returns a JSON string containing the network configuration.
to_yaml(**kwargs): Returns a yaml string containing the network configuration.
train_on_batch(x[, y, sample_weight, ...]): Runs a single gradient update on a single batch of data.
train_step(data): Single optimisation step implementing the composite loss.
with_name_scope(method): Decorator to automatically enter the module name scope.
Attributes
activity_regularizer: Optional regularizer function for the output of this layer.
autotune_steps_per_execution: Settable property to enable tuning for steps_per_execution.
compute_dtype: The dtype of the layer's computations.
distribute_reduction_method: The method employed to reduce per-replica values during training.
distribute_strategy: The tf.distribute.Strategy this model was created under.
dtype: The dtype of the layer weights.
dtype_policy: The dtype policy associated with this layer.
dynamic: Whether the layer is dynamic (eager-only); set in the constructor.
inbound_nodes: Return Functional API nodes upstream of this layer.
input: Retrieves the input tensor(s) of a layer.
input_mask: Retrieves the input mask tensor(s) of a layer.
input_shape: Retrieves the input shape(s) of a layer.
input_spec: InputSpec instance(s) describing the input format for this layer.
jit_compile: Specify whether to compile the model with XLA.
layers
losses: List of losses added using the add_loss() API.
metrics: Return metrics added using compile() or add_metric().
metrics_names: Returns the model's display labels for all outputs.
name: Name of the layer (string), set in the constructor.
name_scope: Returns a tf.name_scope instance for this class.
non_trainable_variables: Sequence of non-trainable variables owned by this module and its submodules.
non_trainable_weights: List of all non-trainable weights tracked by this layer.
outbound_nodes: Return Functional API nodes downstream of this layer.
output: Retrieves the output tensor(s) of a layer.
output_mask: Retrieves the output mask tensor(s) of a layer.
output_shape: Retrieves the output shape(s) of a layer.
run_eagerly: Settable attribute indicating whether the model should run eagerly.
state_updates: Deprecated, do NOT use!
stateful
steps_per_execution: Settable steps_per_execution variable. Requires a compiled model.
submodules: Sequence of all sub-modules.
supports_masking: Whether this layer supports computing a mask using compute_mask.
trainable
trainable_variables: Sequence of trainable variables owned by this module and its submodules.
trainable_weights: List of all trainable weights tracked by this layer.
updates
variable_dtype: Alias of Layer.dtype, the dtype of the weights.
variables: Returns the list of all layer variables/weights.
weights: Returns the list of all layer variables/weights.
- run_halnet_core(static_input, dynamic_input, future_input, training)
Execute the core encoder-decoder data-driven pipeline of PIHALNet, using flexible encoder-decoder logic.
The method ingests static, dynamic (past), and future covariates, passes them through Variable Selection Networks (VSNs) or dense projections, and then processes them with a multi‑scale LSTM encoder and a hierarchical attention‑augmented decoder to produce a single latent vector per sample. This vector is subsequently fed to the model’s task‑specific output head (not shown here).
- Parameters:
static_input (Tensor) – Tensor of shape (B, S) containing time-invariant features such as lithology or well depth, where B is the batch size and S is static_input_dim.
dynamic_input (Tensor) – Past time-series of length \(T_\mathrm{past}\) with shape (B, T_past, D_in). Typical examples are historical groundwater levels or precipitation.
future_input (Tensor) – Known future covariates of length forecast_horizon with shape (B, T_future, F_in).
training (bool) – Flag forwarded to Keras layers to enable dropout and other training-only behaviour.
- Returns:
A 2-D tensor of shape (B, A), where A is attention_units. This latent representation encodes the fused historical context, static descriptors, and known future information.
- Return type:
Tensor
Notes
If use_vsn is True, each input type is first passed through a Variable Selection Network that outputs both feature-wise importance weights and transformed features.
Duplicate temporal resolutions produced by fusionlab.layers.MultiScaleLSTM are aggregated with fusionlab.ops.aggregate_multiscale_on_3d.
Duplicate residual connections follow the original TFT design but employ GRN-based \(\mathrm{Add}\!\!+\!\! \mathrm{LayerNorm}\) blocks for improved stability.
- split_outputs(predictions_combined, decoded_outputs_for_mean)
Separate the combined output tensor into individual subsidence and groundwater-level (GWL) components and return both the final and mean predictions needed for the two loss terms used in PIHALNet (data loss and physics/PDE loss).
The method supports two output shapes:
Quantile mode: (B, H, Q, C), where Q is the number of quantiles and C = output_subsidence_dim + output_gwl_dim.
Deterministic mode: (B, H, C), when quantiles are disabled.
- Parameters:
predictions_combined (Tensor) – Network output after the QuantileDistributionModeling stage. Shape is (B, H, C) or (B, H, Q, C).
decoded_outputs_for_mean (Tensor) – Decoder output before quantile distribution, used to compute the PDE residual. Shape is (B, H, C).
training (bool, optional) – Inherited from the calling context. Present only in TensorFlow graph mode; not used explicitly here.
- Returns:
s_pred_final (Tensor) – Subsidence predictions ready for the data-fidelity loss. Shape matches predictions_combined minus the C split.
gwl_pred_final (Tensor) – GWL predictions ready for the data-fidelity loss.
s_pred_mean_for_pde (Tensor) – Mean (deterministic) subsidence predictions used when computing physics-based derivatives.
gwl_pred_mean_for_pde (Tensor) – Mean GWL predictions for the PDE residual term.
- Return type:
Tuple[Tensor, Tensor, Tensor, Tensor]
Notes
Mean predictions are extracted only from decoded_outputs_for_mean because applying the quantile mapping first would break the differentiability required for spatial-temporal derivatives.
When TensorFlow executes in graph mode and the rank of predictions_combined is dynamic, the function falls back to tf.rank for shape inspection.
Examples
>>> outputs = model(...)  # forward pass
>>> s_final, gwl_final, s_mean, gwl_mean = (
...     model.split_outputs(
...         predictions_combined=outputs["pred"],
...         decoded_outputs_for_mean=outputs["dec_mean"],
...     )
... )
>>> s_final.shape
TensorShape([32, 24, 3])  # e.g. B=32, H=24, Q=3
>>> gwl_mean.shape
TensorShape([32, 24, 1])  # deterministic mean
See also
fusionlab.nn.pinn.QuantileDistributionModeling – Layer that adds the quantile dimension.
fusionlab.nn.pinn.PIHALNet.run_halnet_core – Produces decoded_outputs_for_mean.
- call(inputs, training=False)[source]¶
Forward pass for PIHALNet.
This method orchestrates the full physics‑informed workflow:
Validate and split the input dictionary into static, dynamic, future, and coordinate tensors.
Run the HALNet encoder–decoder core to obtain a latent representation.
Produce mean forecasts with the multi‑horizon decoder.
Optionally expand those means into quantile predictions.
Separate combined outputs into subsidence and GWL streams for both data‑loss and physics‑loss branches.
Compute the consolidation PDE residual on the mean series.
Return a dictionary ready for the model’s composite loss.
- Parameters:
inputs (dict[str, Tensor]) – Dictionary containing at least the following keys (created by fusionlab.nn.pinn.process_pinn_inputs):
'coords' – tensor (B, H, 3) with (t, x, y) coordinates.
'static_features' – tensor (B, S).
'dynamic_features' – tensor (B, T_past, D).
'future_features' – tensor (B, H, F).
training (bool, default False) – Standard Keras flag indicating training or inference mode.
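To make the expected input layout concrete, the following sketch builds a dictionary with the documented keys and shapes and runs a basic sanity check. It uses NumPy arrays as stand-ins for tensors. The feature counts and the `check_inputs` helper are illustrative assumptions, not part of fusionlab.

```python
import numpy as np

B, T_past, H = 32, 12, 24   # batch size, look-back window, forecast horizon
S, D, F = 5, 4, 2           # static, dynamic, future feature counts (made up)

rng = np.random.default_rng(0)
inputs = {
    "coords": rng.normal(size=(B, H, 3)).astype("float32"),            # (t, x, y)
    "static_features": rng.normal(size=(B, S)).astype("float32"),
    "dynamic_features": rng.normal(size=(B, T_past, D)).astype("float32"),
    "future_features": rng.normal(size=(B, H, F)).astype("float32"),
}


def check_inputs(inputs, horizon):
    """Hypothetical check mirroring the documented keys and shapes."""
    required = ("coords", "static_features", "dynamic_features", "future_features")
    missing = [key for key in required if key not in inputs]
    if missing:
        raise KeyError(f"missing input keys: {missing}")
    if inputs["coords"].shape[1:] != (horizon, 3):
        raise ValueError("'coords' must have shape (B, H, 3)")
    if inputs["future_features"].shape[1] != horizon:
        raise ValueError("'future_features' must have shape (B, H, F)")
    # All tensors must agree on the batch dimension.
    if len({value.shape[0] for value in inputs.values()}) != 1:
        raise ValueError("inconsistent batch sizes")
    return True


ok = check_inputs(inputs, horizon=H)
```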
- Returns:
'subs_pred' – subsidence predictions, shape (B, H, Q, O_s) or (B, H, O_s).
'gwl_pred' – GWL predictions, same layout for O_g.
'pde_residual' – physics residual, shape (B, H, 1) (all zeros if H = 1).
- Return type:
dict[str, Tensor]
Notes
Quantile outputs are produced only when the model’s quantiles attribute is not None.
The PDE residual is based on a discrete‑time consolidation equation evaluated with finite differences; therefore a forecast horizon > 1 is required.
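The finite-difference note can be illustrated with a rough NumPy sketch of the residual from the Formulation section, r = C Δs − ∂h/∂t. Several choices here are assumptions rather than documented behaviour: Δs is read as a forward difference along the horizon axis, the time coordinate is taken from a separate array `t`, and the last step is zero-padded to keep the documented (B, H, 1) shape. `consolidation_residual` is a hypothetical name.

```python
import numpy as np


def consolidation_residual(s_mean, gwl_mean, t, C):
    """Sketch of a discrete consolidation residual on mean predictions.

    s_mean, gwl_mean, t: arrays of shape (B, H, 1).
    Assumption: forward differences along axis 1; the final step, which
    has no forward neighbour, is zero-padded.
    """
    batch, horizon, _ = s_mean.shape
    if horizon < 2:
        # No finite difference can be formed for a single step.
        return np.zeros_like(s_mean)
    ds = np.diff(s_mean, axis=1)       # (B, H-1, 1)
    dh = np.diff(gwl_mean, axis=1)     # (B, H-1, 1)
    dt = np.diff(t, axis=1)            # (B, H-1, 1), assumed non-zero
    residual = C * ds - dh / dt
    pad = np.zeros((batch, 1, 1), dtype=residual.dtype)
    return np.concatenate([residual, pad], axis=1)


t = np.tile(np.arange(4.0).reshape(1, 4, 1), (2, 1, 1))
s = t.copy()                 # subsidence grows by 1 per step
gwl_consistent = 0.5 * t     # dh/dt = 0.5 = C * ds when C = 0.5
r_zero = consolidation_residual(s, gwl_consistent, t, C=0.5)
r_nonzero = consolidation_residual(s, np.zeros_like(t), t, C=0.5)
r_single = consolidation_residual(s[:, :1], gwl_consistent[:, :1], t[:, :1], C=0.5)
```

With a single-step horizon the function returns zeros, matching the documented behaviour for H = 1.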
Examples
>>> out = pihalnet(
...     {
...         "coords": coords,
...         "static_features": s,
...         "dynamic_features": d,
...         "future_features": f,
...     },
...     training=True,
... )
>>> out["subs_pred"].shape
TensorShape([32, 24, 3, 1])  # B=32, H=24, Q=3, O_s=1
>>> out["pde_residual"].shape
TensorShape([32, 24, 1])
See also
fusionlab.nn.pinn.PIHALNet.run_halnet_core – Feature‑extraction backbone called internally.
fusionlab.nn.pinn.PIHALNet.split_outputs – Helper that separates subsidence and GWL channels.
- compile(optimizer, loss, metrics=None, loss_weights=None, lambda_pde=1.0, **kwargs)[source]¶
Configure PIHALNet for training with a composite PINN loss.
The total loss optimised during train_step is
\[L_\text{total} \;=\; \sum_i w_i\,L_{\text{data},i} + \lambda_\text{pde}\,L_\text{pde}\]
where i indexes the outputs (subsidence, GWL).
- Parameters:
optimizer (tf.keras.optimizers.Optimizer | str) – Optimiser instance or Keras alias (e.g., "adam").
loss (dict) – Mapping {"subs_pred": loss_fn, "gwl_pred": loss_fn}. Each value can be a Keras loss object or a string such as "mse".
metrics (dict, optional) – Mapping from output keys to a list of Keras metric objects (or their aliases) that will be tracked during training and evaluation.
loss_weights (dict, optional) – Scalar weights \(w_i\) for each data loss term. Defaults to 1 for every output.
lambda_pde (float, default 1.0) – Weight applied to \(L_\text{pde}\) (physics residual).
**kwargs – Additional keywords forwarded to tf.keras.Model.compile.
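As an illustration of how these arguments fit together, the sketch below assembles a keyword dictionary that follows the documented signature. `build_compile_kwargs` is a hypothetical convenience function. The metric alias "mae" and the chosen weights are example values only.

```python
def build_compile_kwargs(lambda_pde=1.0, subs_weight=1.0, gwl_weight=1.0):
    """Hypothetical helper: collect arguments for PIHALNet.compile."""
    return {
        "optimizer": "adam",                                    # Keras alias
        "loss": {"subs_pred": "mse", "gwl_pred": "mse"},        # one loss per output
        "metrics": {"subs_pred": ["mae"], "gwl_pred": ["mae"]},
        "loss_weights": {"subs_pred": subs_weight, "gwl_pred": gwl_weight},
        "lambda_pde": lambda_pde,                               # weight of the PDE term
    }


compile_kwargs = build_compile_kwargs(lambda_pde=0.5, gwl_weight=0.8)
# With an instantiated model, this would be passed as:
#     model.compile(**compile_kwargs)
```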
Notes
The physics‑residual term is added manually in train_step; therefore lambda_pde is stored as an attribute rather than passed to loss_weights.
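A small worked example may help to show how the weights enter the total loss. The function below is a plain-Python sketch of the weighted sum with made-up loss values. `total_loss` is a hypothetical name, and missing weights default to 1 as documented for loss_weights.

```python
def total_loss(data_losses, pde_loss, lambda_pde=1.0, loss_weights=None):
    """Weighted sum of per-output data losses plus the weighted PDE loss."""
    weights = loss_weights or {}
    data_term = sum(
        weights.get(name, 1.0) * value for name, value in data_losses.items()
    )
    return data_term + lambda_pde * pde_loss


# 0.5 * 2.0 (subsidence) + 1.0 * 4.0 (GWL) + 0.25 * 8.0 (PDE)
value = total_loss(
    {"subs_pred": 2.0, "gwl_pred": 4.0},
    pde_loss=8.0,
    lambda_pde=0.25,
    loss_weights={"subs_pred": 0.5},
)
```

Keeping `lambda_pde` separate from `loss_weights` mirrors the note above: the physics term is not attached to any model output, so it cannot be keyed by an output name.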
- train_step(data)[source]¶
Single optimisation step implementing the composite loss.
The procedure is:
Forward pass → self.call to obtain {"subs_pred", "gwl_pred", "pde_residual"}.
Compute data‑fidelity loss via tf.keras.Model.compute_loss.
Compute physics residual loss \(L_\text{pde} = \langle r^2\rangle\) where r = outputs["pde_residual"].
Form \(L_\text{total}=L_\text{data} + \lambda_\text{pde}\,L_\text{pde}\) and back‑propagate.
Update Keras metrics and return a results dictionary.
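The loss-related steps of this procedure can be sketched in NumPy, leaving out the forward pass and the gradient update. `assemble_losses` is a hypothetical helper. The data loss is passed in as an already computed scalar, and the result keys follow the names listed under Returns.

```python
import numpy as np


def assemble_losses(pde_residual, data_loss, lambda_pde=1.0):
    """Sketch of the physics-loss and total-loss steps of the procedure."""
    # Physics loss: mean of the squared residual over all elements.
    physics_loss = float(np.mean(np.square(pde_residual)))
    # Total loss: data loss plus the weighted physics loss.
    total = data_loss + lambda_pde * physics_loss
    return {
        "total_loss": total,
        "data_loss": data_loss,
        "physics_loss": physics_loss,
    }


residual = np.full((2, 4, 1), 0.5)     # constant residual of 0.5
logs = assemble_losses(residual, data_loss=1.0, lambda_pde=2.0)
```

In an actual training step the same quantities would be computed on tensors inside a gradient tape so that the total loss can be back-propagated.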
- Parameters:
data (tuple(dict, dict)) – Tuple (inputs, targets) produced by the data pipeline.
- Returns:
Includes user‑defined metrics plus "total_loss", "data_loss", and "physics_loss".
- Return type:
dict[str, Tensor]
- Raises:
ValueError – If data is not a two‑element tuple of dictionaries.
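The documented error condition could be checked along the following lines. This is a sketch of one possible validation, with `unpack_batch` as a hypothetical name. The library's actual check and error message may differ.

```python
def unpack_batch(data):
    """Return (inputs, targets) or raise ValueError for malformed data."""
    is_pair = isinstance(data, tuple) and len(data) == 2
    if not (is_pair and all(isinstance(part, dict) for part in data)):
        raise ValueError(
            "expected a tuple (inputs, targets) of two dictionaries"
        )
    inputs, targets = data
    return inputs, targets


batch_inputs, batch_targets = unpack_batch(
    ({"coords": None}, {"subs_pred": None, "gwl_pred": None})
)
```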
- get_config()[source]¶
Returns the config of the Model.
Config is a Python dictionary (serializable) containing the configuration of an object, which in this case is a Model. This allows the Model to be reinstantiated later (without its trained weights) from this configuration.
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
Developers of subclassed Model are advised to override this method and to keep updating the dict from super(MyModel, self).get_config() so that it provides the proper configuration of this Model. The default config returns a config dict for init parameters if they are basic types. Raises NotImplementedError in cases where a custom get_config() implementation is required for the subclassed model.
- Returns:
Python dictionary containing the configuration of this Model.
- classmethod from_config(config, custom_objects=None)[source]¶
Creates a layer from its config.
This method is the reverse of get_config, capable of instantiating the same layer from the config dictionary. It does not handle layer connectivity (handled by Network), nor weights (handled by set_weights).
- Parameters:
config – A Python dictionary, typically the output of get_config.
- Returns:
A layer instance.
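The get_config / from_config round trip can be illustrated with plain Python classes. These are stand-ins written for this example, not Keras classes. They only mirror the pattern of extending the parent's config and copying it before modification.

```python
import copy


class BaseBlock:
    """Stand-in for a parent class that exposes a config dictionary."""

    def __init__(self, name="block"):
        self.name = name

    def get_config(self):
        return {"name": self.name}

    @classmethod
    def from_config(cls, config):
        # Rebuild an instance from constructor arguments only (no weights).
        return cls(**config)


class ScaledBlock(BaseBlock):
    """Subclass that adds its own init parameter to the parent's config."""

    def __init__(self, scale=1.0, name="scaled_block"):
        super().__init__(name=name)
        self.scale = scale

    def get_config(self):
        config = super().get_config()
        config.update({"scale": self.scale})
        return config


original = ScaledBlock(scale=0.5)
config = copy.deepcopy(original.get_config())   # copy before modifying
config["scale"] = 2.0
clone = ScaledBlock.from_config(config)
```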
- help(**kwargs)¶
- my_params = PiHALNet( static_input_dim, dynamic_input_dim, future_input_dim, output_subsidence_dim=1, output_gwl_dim=1, embed_dim=32, hidden_units=64, lstm_units=64, attention_units=32, num_heads=4, dropout_rate=0.1, forecast_horizon=1, quantiles=None, max_window_size=10, memory_size=100, scales=None, multi_scale_agg='last', final_agg='last', activation='relu', use_residuals=True, use_batch_norm=False, pde_mode='consolidation', pinn_coefficient_C=<LearnableC trainable=True, value=<tf.Variable 'log_pinn_coefficient_C:0' shape=() dtype=float32, numpy=-4.6051702>>, gw_flow_coeffs=None, use_vsn=True, vsn_units=None, mode=None, name='PiHALNet' )¶
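The default shown for pinn_coefficient_C stores log(C) rather than C itself. The short sketch below recovers C from the logged value in the signature above and shows why the log parameterisation keeps the coefficient positive. `coefficient` is a hypothetical helper, not a fusionlab function.

```python
import math

log_c = -4.6051702            # value of 'log_pinn_coefficient_C' shown above
c = math.exp(log_c)           # approximately 0.01


def coefficient(log_value):
    """Map an unconstrained trainable value to a strictly positive C."""
    return math.exp(log_value)


# Whatever real value the optimiser reaches, exp() keeps C above zero.
samples = [coefficient(v) for v in (-50.0, 0.0, 5.0)]
```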