fusionlab.nn.pinn.TransFlowSubsNet
- class fusionlab.nn.pinn.TransFlowSubsNet
Bases: BaseAttentive
Transient Ground-Water–Driven Subsidence Network
TransFlowSubsNet fuses a deep-learning encoder–decoder with two physics losses so that the network learns to forecast and honours the governing PDEs at the same time.
Consolidation loss forces the settlement rate \(\partial_t s\) to balance the Laplacian of hydraulic head \(h\), scaled by the coefficient \(C\).
Transient ground-water flow loss constrains the head to obey the diffusivity equation with a source/sink term.
Both terms vanish when pde_mode switches them off.
See User Guide for a walkthrough.
- Parameters:
static_input_dim (int) – Dimensionality of the static (time-invariant) input features. These are features that do not change over time for a given sample, such as a sensor’s location ID, soil type, or a product category. If 0, no static features are used.
dynamic_input_dim (int) – Dimensionality of the dynamic (time-varying) input features that are known in the past (the “lookback” window). This is a required parameter and typically includes the target variable itself (lagged) and other historical drivers like rainfall, temperature, or sales figures.
future_input_dim (int) – Dimensionality of the time-varying features for which values are known in advance for the forecast period. Examples include holidays, scheduled promotions, or day-of-week indicators. If 0, no future features are used.
output_subsidence_dim (int, default 1) – How many subsidence series are produced at each horizon step. A multi-well scenario with n Digital Leveling benchmarks would use n.
output_gwl_dim (int, default 1) – How many head series are produced. Use >1 for multi-aquifer or multi-well settings.
forecast_horizon (int, default 1) – Horizon length \(H\). The decoder emits \(H\) steps; the physics terms are evaluated for every emitted step.
quantiles (list[float] | None, default None) – Optional list of quantile levels; enables the Quantile-Distribution head.
embed_dim (int, default 32) – The base dimensionality for the internal feature space of the model. Various input features (static, dynamic, future) are projected into this common dimension to allow for meaningful interactions within downstream layers like LSTMs and attention mechanisms. It’s a key parameter for controlling model capacity.
hidden_units (int, default 64) – The number of units in the hidden layers of the Gated Residual Networks (GRNs). GRNs are core components used for non-linear transformations throughout the architecture. A larger value increases the model’s capacity to learn complex patterns.
lstm_units (int, default 64) – The number of hidden units in each LSTM layer within the MultiScaleLSTM block. This parameter determines the memory capacity of the recurrent cells processing the historical sequence data.
attention_units (int, default 32) – The dimensionality of the output space for the various attention mechanisms (e.g., CrossAttention, HierarchicalAttention). This is also often referred to as the model’s dimension, \(d_{model}\). It must be divisible by num_heads.
num_heads (int, default 4) – The number of attention heads in each MultiHeadAttention sub-layer. Using multiple heads allows the model to jointly attend to information from different representation subspaces at different positions, which can improve learning.
dropout_rate (float, default 0.1) – The dropout rate applied within various components like Gated Residual Networks (GRNs) and after some attention layers to prevent overfitting. It must be a float between 0.0 and 1.0.
max_window_size (int, default 10) – The number of past time steps (the lookback window) that the model considers. This should directly correspond to the time_steps parameter used during data preparation and is used by components like DynamicTimeWindow.
memory_size (int, default 100) – The number of memory slots in the MemoryAugmentedAttention layer. This external memory allows the model to learn and access patterns over very long-range dependencies that might be missed by standard LSTMs or attention.
scales (list of int, optional) – A list of scale factors for the MultiScaleLSTM. Each scale s creates an LSTM that processes the input sequence by taking every s-th time step. For example, scales=[1, 3] would process the sequence at its original resolution and at a coarser, every-third-timestep resolution. If None or 'auto', defaults to [1].
multi_scale_agg ({'last', 'average', 'concat', ...}, default 'last') – The strategy used by the aggregation function to combine the outputs from the different LSTMs in MultiScaleLSTM.
* 'concat': (For 3D output) Pads sequences from different scales to the same length and concatenates them along the feature axis. This is the primary mode for creating a rich sequence representation for downstream attention layers in an encoder-decoder setup.
* 'last' or 'auto': (For 2D output) Creates a context vector by taking the last hidden state from each LSTM scale and concatenating them.
* 'average' or 'sum': Create a 2D context vector by averaging or summing over the time dimension for each scale.
final_agg ({'last', 'average', 'flatten'}, default 'last') – The aggregation strategy used to collapse the final temporal feature map (which has a time dimension equal to forecast_horizon) into a single feature vector before the final decoding step.
activation (str, default 'relu') – The name of the activation function to use in Dense layers and Gated Residual Networks (GRNs) throughout the model. Common choices include 'relu', 'gelu', 'swish', and 'tanh'.
use_residuals (bool, default True) – If True, enables residual “add & norm” connections after key sub-layers (like attention and GRNs). These shortcut connections are crucial for training very deep networks as they help prevent vanishing gradients and ease the optimization process.
use_batch_norm (bool, default False) – If True, BatchNormalization is used within Gated Residual Networks (GRNs). If False (default), LayerNormalization is used instead. LayerNormalization is often more stable and effective for time series data with varying sequence lengths.
use_vsn (bool, default True) – If True, the model uses VariableSelectionNetwork (VSN) layers at the input stage. VSNs perform intelligent, learnable feature selection, allowing the model to up-weight or down-weight the importance of each input variable. This can improve performance and provide insights into which features are most impactful. If False, simpler Dense layers are used for initial projection.
vsn_units (int, optional) – The number of units in the internal Gated Residual Networks (GRNs) of the Variable Selection Networks. This parameter controls the capacity of the feature selection sub-networks. If None, it often defaults to a value based on hidden_units.
pde_mode ({'consolidation', 'gw_flow', 'both', 'none'}, default 'both') – Select which PDE residuals participate in the loss:
┌─────────────────┬────────────────────────────────────────┐
│ 'consolidation' │ only \(s\)–balance term                │
│ 'gw_flow'       │ only flow equation for \(h\)           │
│ 'both'          │ both residuals (recommended)           │
│ 'none'          │ pure data-driven; behaves like HAL-Net │
└─────────────────┴────────────────────────────────────────┘
K, Ss, Q (float | str | Learnable*, defaults 1e-4, 1e-5, 0.) – Hydraulic conductivity \(K\), specific storage \(S_s\), and volumetric source/sink \(Q\). Accepted forms:
* float / int → fixed numeric.
* 'learnable' → wrap into the corresponding LearnableK / Ss / Q. Initial seed is taken from the numeric value given in the same call or falls back to 1e-4 / 1e-5 / 0.
* 'fixed' → force numeric even if param_status='learnable' in resolve_gw_coeffs().
* Learnable* instance → forwarded unchanged.
pinn_coefficient_C (float | str | LearnableC | FixedC | DisabledC, default LearnableC(0.01)) – Coefficient in the consolidation PDE:
\[\partial_t s - C\,\nabla^2 h = 0\]
* float – fixed.
* 'learnable' or LearnableC – optimised in log-space to keep \(C>0\).
* DisabledC – disables consolidation regardless of pde_mode.
gw_flow_coeffs (dict | None, default None) – Convenience container overriding K/Ss/Q in one go, e.g.
gw_flow_coeffs = {'K': 'learnable', 'Ss': 1e-6, 'Q': 'fixed'}
Dict entries win over the individual keyword arguments.
mode ({'pihal_like', 'tft_like'}, default None) – Routing for future_features:
* pihal_like – decoder gets all \(H\) rows, encoder none.
* tft_like – first max_window_size rows to encoder, next \(H\) rows to decoder, matching the original Temporal Fusion Transformer.
None inherits the BaseAttentive default ('tft_like').
objective ({'hybrid', 'transformer'}, default 'hybrid') – Selects the backbone architecture that processes dynamic-past and (optionally) known-future covariates before the decoding stage.
* 'hybrid' – Multi-scale LSTM -> Transformer. The encoder first extracts multi-resolution temporal features with a stack of LSTMs (one per scale), then refines these features with hierarchical/cross attention blocks. This configuration balances the strong sequence-memory capability of recurrent networks with the global-context modelling power of Transformers and is recommended for most tabular time-series data.
* 'transformer' – Pure Transformer. Bypasses the LSTM stack and feeds the embeddings directly into the attention encoder, resulting in a lightweight, fully self-attention model. Choose this if your data exhibit long-range dependencies for which an LSTM adds little benefit, or when you need faster training/inference at the cost of some short-term pattern capture.
In a future release: a shortcut for common loss presets. The following values should be recognised:
* 'nse' – Nash–Sutcliffe model-efficiency score.
* 'rmse' – root-mean-square error.
When None, losses are supplied via compile().
attention_levels (str | list[str] | None) – Which hierarchical attention outputs are returned when the model is called with training=False. Use 'all' or a subset such as ['scale', 'cross'] for interpretability.
name (str, default "TransFlowSubsNet") – Model scope as registered in Keras.
**kwargs – Forwarded verbatim to tf.keras.Model.
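The log-space treatment of pinn_coefficient_C mentioned above can be illustrated with a minimal sketch. It assumes the coefficient is stored as \(\log C\) and recovered with an exponential, which agrees with the default log_pinn_coefficient_C value of about -4.605 (that is, \(\ln 0.01\)) shown in the constructor signature. The class LogSpaceCoefficient and its sgd_step method are hypothetical illustrations and are not part of fusionlab.

```python
import math


class LogSpaceCoefficient:
    """Toy positive coefficient stored as log(C). Hypothetical, not fusionlab API."""

    def __init__(self, initial_value: float = 0.01) -> None:
        if initial_value <= 0.0:
            raise ValueError("initial_value must be strictly positive")
        self.log_value = math.log(initial_value)

    @property
    def value(self) -> float:
        # exp() maps any real number to a strictly positive one.
        return math.exp(self.log_value)

    def sgd_step(self, grad_wrt_c: float, lr: float = 0.1) -> None:
        # Chain rule: dL/d(log C) = dL/dC * C.
        self.log_value -= lr * grad_wrt_c * self.value


coeff = LogSpaceCoefficient(0.01)
initial_log = coeff.log_value     # about -4.60517, i.e. ln(0.01)
coeff.sgd_step(grad_wrt_c=500.0)  # the same step on a raw C would give 0.01 - 50 < 0
print(initial_log, coeff.value)
```

Because the update acts on \(\log C\), even a very large gradient only shrinks the coefficient multiplicatively and never flips its sign.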
Notes
The physics losses are added inside train_step, outside the Keras loss container; pass lambda_cons and lambda_gw to compile() to scale them.
When any parameter is learnable, its tf.Variable automatically appears in model.trainable_variables.
See also
fusionlab.nn.models.HALNet – Purely data-driven encoder–decoder (no physics terms).
fusionlab.nn.pinn.models.PIHALNet – Physics-informed HAL-Net that couples consolidation PDEs and adds an anomaly module.
fusionlab.nn.pinn.models.PiTGWFlow – Stand-alone PINN that solves 2-D / 3-D transient groundwater-flow equations without subsidence coupling.
Examples
>>> import tensorflow as tf
>>> from fusionlab.nn.pinn import TransFlowSubsNet
>>> model = TransFlowSubsNet(
...     static_input_dim=3, dynamic_input_dim=8, future_input_dim=4,
...     output_subsidence_dim=1, output_gwl_dim=1,
...     K='learnable', Ss=1e-5, Q='fixed',
...     pde_mode='both', scales=[1, 3], multi_scale_agg='concat'
... )
>>> batch = {
...     "static_features": tf.zeros([8, 3]),
...     "dynamic_features": tf.zeros([8, 12, 8]),
...     "future_features": tf.zeros([8, 6, 4]),
...     "coords": tf.zeros([8, 6, 3]),
... }
>>> pred = model(batch, training=False)
>>> list(pred)
['subs_pred', 'gwl_pred', 'subs_pred_mean', 'gwl_pred_mean']
- __init__(static_input_dim, dynamic_input_dim, future_input_dim, output_subsidence_dim=1, output_gwl_dim=1, embed_dim=32, hidden_units=64, lstm_units=64, attention_units=32, num_heads=4, dropout_rate=0.1, forecast_horizon=1, quantiles=None, max_window_size=10, memory_size=100, scales=None, multi_scale_agg='last', final_agg='last', activation='relu', use_residuals=True, use_batch_norm=False, pde_mode='both', K=LearnableK(initial_value=0.0001, trainable=True, name=learnable_K), Ss=LearnableSs(initial_value=1e-05, trainable=True, name=learnable_Ss), Q=LearnableQ(initial_value=0.0, trainable=True, name=learnable_Q), pinn_coefficient_C=<LearnableC trainable=True, value=<tf.Variable 'log_pinn_coefficient_C:0' shape=() dtype=float32, numpy=-4.6051702>>, gw_flow_coeffs=None, use_vsn=True, vsn_units=None, mode=None, objective=None, attention_levels=None, architecture_config=None, name='TransFlowSubsNet', **kwargs)[source]¶
- Parameters:
static_input_dim (int)
dynamic_input_dim (int)
future_input_dim (int)
output_subsidence_dim (int)
output_gwl_dim (int)
embed_dim (int)
hidden_units (int)
lstm_units (int)
attention_units (int)
num_heads (int)
dropout_rate (float)
forecast_horizon (int)
quantiles (List[float] | None)
max_window_size (int)
memory_size (int)
scales (List[int] | None)
multi_scale_agg (str)
final_agg (str)
activation (str)
use_residuals (bool)
use_batch_norm (bool)
pde_mode (str | List[str])
K (str | float | LearnableK)
Ss (float | LearnableSs | str)
Q (float | LearnableQ | str)
pinn_coefficient_C (LearnableC | FixedC | DisabledC | str | float | None)
gw_flow_coeffs (Dict[str, str | float | None] | None)
use_vsn (bool)
vsn_units (int | None)
mode (str | None)
objective (str | None)
attention_levels (str | List[str] | None)
architecture_config (Dict | None)
name (str)
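A few of the constraints stated in the parameter descriptions can be checked before constructing the model: attention_units must be divisible by num_heads, dropout_rate must lie between 0.0 and 1.0, and pde_mode must be one of the four documented options. The sketch below shows one way to do that. The helper check_hparams is hypothetical and is not part of fusionlab, and the library's own validation may differ.

```python
VALID_PDE_MODES = {"consolidation", "gw_flow", "both", "none"}


def check_hparams(attention_units=32, num_heads=4, dropout_rate=0.1, pde_mode="both"):
    """Return a list of violated constraints (empty when all hold).

    Hypothetical helper: it mirrors the constraints stated in the parameter
    descriptions and is not part of fusionlab.
    """
    problems = []
    if attention_units % num_heads != 0:
        problems.append(
            f"attention_units={attention_units} is not divisible by num_heads={num_heads}"
        )
    if not 0.0 <= dropout_rate <= 1.0:
        problems.append(f"dropout_rate={dropout_rate} is outside [0.0, 1.0]")
    # pde_mode is typed as str | List[str]; normalise to a list before checking.
    modes = [pde_mode] if isinstance(pde_mode, str) else list(pde_mode)
    unknown = [m for m in modes if m not in VALID_PDE_MODES]
    if unknown:
        problems.append(f"unknown pde_mode value(s): {unknown}")
    return problems


print(check_hparams())                                 # defaults satisfy every check
print(check_hparams(attention_units=30, num_heads=4))  # 30 % 4 != 0
```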
Methods
__init__(static_input_dim, ...[, ...])
add_loss(losses, **kwargs) – Add loss tensor(s), potentially dependent on layer inputs.
add_metric(value[, name]) – Adds metric tensor to the layer.
add_update(updates) – Add update op(s), potentially dependent on layer inputs.
add_variable(*args, **kwargs) – Deprecated, do NOT use! Alias for add_weight.
add_weight([name, shape, dtype, ...]) – Adds a new variable to the layer.
apply_attention_levels(...) – Applies attention mechanisms in the order specified by att_levels, using the provided attention methods such as cross attention, hierarchical attention, and memory-augmented attention.
build(input_shape) – Builds the model based on input shapes received.
build_from_config(config) – Builds the layer's states with the supplied config dict.
call(inputs[, training]) – Single forward sweep mixing data and physics paths.
compile([lambda_cons, lambda_gw]) – Compiles the model with composite loss weights.
compile_from_config(config) – Compiles the model with the information given in config.
compute_loss([x, y, y_pred, sample_weight]) – Compute the total loss, validate it, and return it.
compute_mask(inputs[, mask]) – Computes an output mask tensor.
compute_metrics(x, y, y_pred, sample_weight) – Update metric states and collect all metrics to be returned.
compute_output_shape(input_shape) – Computes the output shape of the layer.
compute_output_signature(input_signature) – Compute the output tensor signature of the layer based on the inputs.
compute_physics_loss(inputs) – Computes the physics-based loss terms for both consolidation and groundwater flow.
count_params() – Count the total number of scalars composing the weights.
evaluate([x, y, batch_size, verbose, ...]) – Returns the loss value & metrics values for the model in test mode.
evaluate_generator(generator[, steps, ...]) – Evaluates the model on a data generator.
export(filepath) – Create a SavedModel artifact for inference (e.g. via TF-Serving).
finalize_state() – Finalizes the layers state after updating layer weights.
fit([x, y, batch_size, epochs, verbose, ...]) – Trains the model for a fixed number of epochs (dataset iterations).
fit_generator(generator[, steps_per_epoch, ...]) – Fits the model on data yielded batch-by-batch by a Python generator.
from_config(config[, custom_objects]) – Reconstructs a model instance from its configuration.
get_build_config() – Returns a dictionary with the layer's input shape.
get_compile_config() – Returns a serialized config with information for compiling the model.
get_config() – Returns the full configuration of the model.
get_input_at(node_index) – Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(node_index) – Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(node_index) – Retrieves the input shape(s) of a layer at a given node.
get_layer([name, index]) – Retrieves a layer based on either its name (unique) or index.
get_metrics_result() – Returns the model's metrics values as a dict.
get_output_at(node_index) – Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(node_index) – Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(node_index) – Retrieves the output shape(s) of a layer at a given node.
get_params([deep]) – Get the parameters for this learner.
Returns the physical coefficient C.
get_weight_paths() – Retrieve all the variables and their paths for the model.
get_weights() – Retrieves the weights of the model.
help(**kwargs)
load(file_path[, format]) – Load the learner's state from a specified file in the desired format.
load_own_variables(store) – Loads the state of the layer.
load_weights(filepath[, skip_mismatch, ...]) – Loads all layer weights from a saved files.
make_predict_function([force]) – Creates a function that executes one step of inference.
make_test_function([force]) – Creates a function that executes one step of evaluation.
make_train_function([force]) – Creates a function that executes one step of training.
predict(x[, batch_size, verbose, steps, ...]) – Generates output predictions for the input samples.
predict_generator(generator[, steps, ...]) – Generates predictions for the input samples from a data generator.
predict_on_batch(x) – Returns predictions for a single batch of samples.
predict_step(data) – The logic for one inference step.
reconfigure(architecture_config) – Creates a new model instance with a modified architecture.
reset_metrics() – Resets the state of all the metrics in the model.
reset_states()
run_encoder_decoder_core(static_input, ...) – Executes the data-driven pipeline with a selectable encoder architecture, processing static, dynamic, and future inputs through the encoder-decoder interaction.
save(filepath[, overwrite, save_format]) – Saves a model as a TensorFlow SavedModel or HDF5 file.
save_own_variables(store) – Saves the state of the layer.
save_spec([dynamic_batch]) – Returns the tf.TensorSpec of call args as a tuple (args, kwargs).
save_weights(filepath[, overwrite, ...]) – Saves all layer weights.
set_params(**params) – Set the parameters of this learner.
set_weights(weights) – Sets the weights of the layer, from NumPy arrays.
split_outputs(predictions_combined, ...) – Separate the combined output tensor into individual subsidence and groundwater‑level (GWL) components and return both the final and mean predictions needed for the two loss terms used in TransFlowSubsNet (data loss and physics/PDE loss).
summary([line_length, positions, print_fn, ...]) – Prints a string summary of the network.
test_on_batch(x[, y, sample_weight, ...]) – Test the model on a single batch of samples.
test_step(data) – The logic for one evaluation step.
to_json(**kwargs) – Returns a JSON string containing the network configuration.
to_yaml(**kwargs) – Returns a yaml string containing the network configuration.
train_on_batch(x[, y, sample_weight, ...]) – Runs a single gradient update on a single batch of data.
train_step(data) – One optimization step uniting data and physics terms.
with_name_scope(method) – Decorator to automatically enter the module name scope.
Attributes
activity_regularizer – Optional regularizer function for the output of this layer.
autotune_steps_per_execution – Settable property to enable tuning for steps_per_execution
compute_dtype – The dtype of the layer's computations.
distribute_reduction_method – The method employed to reduce per-replica values during training.
distribute_strategy – The tf.distribute.Strategy this model was created under.
dtype – The dtype of the layer weights.
dtype_policy – The dtype policy associated with this layer.
dynamic – Whether the layer is dynamic (eager-only); set in the constructor.
inbound_nodes – Return Functional API nodes upstream of this layer.
input – Retrieves the input tensor(s) of a layer.
input_mask – Retrieves the input mask tensor(s) of a layer.
input_shape – Retrieves the input shape(s) of a layer.
input_spec – InputSpec instance(s) describing the input format for this layer.
jit_compile – Specify whether to compile the model with XLA.
layers
losses – List of losses added using the add_loss() API.
metrics – Return metrics added using compile() or add_metric().
metrics_names – Returns the model's display labels for all outputs.
name – Name of the layer (string), set in the constructor.
name_scope – Returns a tf.name_scope instance for this class.
non_trainable_variables – Sequence of non-trainable variables owned by this module and its submodules.
non_trainable_weights – List of all non-trainable weights tracked by this layer.
outbound_nodes – Return Functional API nodes downstream of this layer.
output – Retrieves the output tensor(s) of a layer.
output_mask – Retrieves the output mask tensor(s) of a layer.
output_shape – Retrieves the output shape(s) of a layer.
run_eagerly – Settable attribute indicating whether the model should run eagerly.
state_updates – Deprecated, do NOT use!
stateful
steps_per_execution – Settable steps_per_execution variable. Requires a compiled model.
submodules – Sequence of all sub-modules.
supports_masking – Whether this layer supports computing a mask using compute_mask.
trainable
trainable_variables – Sequence of trainable variables owned by this module and its submodules.
trainable_weights – List of all trainable weights tracked by this layer.
updates
variable_dtype – Alias of Layer.dtype, the dtype of the weights.
variables – Returns the list of all layer variables/weights.
weights – Returns the list of all layer variables/weights.
- call(inputs, training=False)
Single forward sweep mixing data and physics paths.
The routine
extracts \(t, x, y\) and covariate tensors from inputs;
runs sanity checks on dimensionality;
feeds the validated features through the inherited encoder–decoder stack; and
splits the decoder output into mean and final predictions that will later enter the data‐ and PDE‐loss terms.
- Returns:
{"subs_pred": s, "gwl_pred": h, "subs_pred_mean": \bar s, "gwl_pred_mean": \bar h} with shapes
\[\begin{split}s,\;h &\in \mathbb R^{B\times H\times d},\\ \bar s,\;\bar h &\in \mathbb R^{B\times H\times d}.\end{split}\]
Here B is the batch size, H the forecast horizon and d each target’s width.
- Return type:
dict
- Parameters:
inputs (Dict[str, Tensor | None])
training (bool)
Notes
The coordinate tensors are not differentiated here; their gradients are taken in train_step().
The method remains side‐effect free: no weights are updated, no losses are added. It purely produces tensors needed by the custom training loop that follows.
- compute_physics_loss(inputs)
Computes the physics-based loss terms for both consolidation and groundwater flow.
- Parameters:
inputs (Dict[str, Tensor])
- Return type:
Tuple[Tensor, Tensor]
- train_step(data)
One optimization step uniting data and physics terms.
Uses a single tf.GradientTape to obtain all first- and second-order coordinate derivatives required by the groundwater-flow and consolidation PDEs.
Data loss: any Keras loss supplied in compile().
PDE losses:
\[\begin{split}S_s\,\partial_t h - K\,(\partial_{xx} h + \partial_{yy} h) - Q &= 0 \\ \partial_t s - C\,(\partial_{xx} h + \partial_{yy} h) &= 0\end{split}\]
The final objective is \(\mathcal L = L_\text{data} + \lambda_c L_\text{cons} + \lambda_g L_\text{gw}\).
On return the method updates built-in metrics and provides a dict with total, data and PDE losses.
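The two PDE residuals can be sanity-checked on an analytic field. The sketch below deliberately swaps automatic differentiation with tf.GradientTape for plain central finite differences, so that it runs without TensorFlow. The functions head and subsidence are hand-picked closed-form solutions (an assumption made purely for illustration), and none of the helper names belong to fusionlab.

```python
import math

# Coefficient values: the documented fallbacks for K, Ss, Q and the default C.
K, SS, Q, C = 1e-4, 1e-5, 0.0, 0.01
DECAY = 2.0 * K / SS  # decay rate that makes `head` an exact solution for Q = 0


def head(t, x, y):
    """Analytic hydraulic head h = exp(-a t) sin(x) sin(y)."""
    return math.exp(-DECAY * t) * math.sin(x) * math.sin(y)


def subsidence(t, x, y):
    """Settlement s = (2 C / a) h, which satisfies d_t s = C * laplacian(h)."""
    return (2.0 * C / DECAY) * head(t, x, y)


def d_dt(f, t, x, y, eps=1e-5):
    """Central finite difference for the time derivative."""
    return (f(t + eps, x, y) - f(t - eps, x, y)) / (2.0 * eps)


def laplacian_xy(f, t, x, y, eps=1e-3):
    """Central finite differences for d_xx f + d_yy f."""
    centre = f(t, x, y)
    d_xx = (f(t, x + eps, y) - 2.0 * centre + f(t, x - eps, y)) / eps**2
    d_yy = (f(t, x, y + eps) - 2.0 * centre + f(t, x, y - eps)) / eps**2
    return d_xx + d_yy


def gw_residual(t, x, y):
    """Ss * d_t h - K * (d_xx h + d_yy h) - Q."""
    return SS * d_dt(head, t, x, y) - K * laplacian_xy(head, t, x, y) - Q


def cons_residual(t, x, y):
    """d_t s - C * (d_xx h + d_yy h)."""
    return d_dt(subsidence, t, x, y) - C * laplacian_xy(head, t, x, y)


# Mean squared residuals over a few collocation points (t, x, y).
points = [(0.05, 0.7, 1.1), (0.10, 0.3, 0.4), (0.02, 1.5, 0.9)]
loss_gw = sum(gw_residual(*p) ** 2 for p in points) / len(points)
loss_cons = sum(cons_residual(*p) ** 2 for p in points) / len(points)
print(loss_gw, loss_cons)  # both tiny: only finite-difference error remains
```

Because the chosen fields satisfy both equations exactly, any residual left over comes from the finite-difference approximation. A network prediction would generally leave a larger residual, which is what the PDE loss terms penalise.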
- split_outputs(predictions_combined, decoded_outputs_for_mean)
Separate the combined output tensor into individual subsidence and groundwater‑level (GWL) components and return both the final and mean predictions needed for the two loss terms used in TransFlowSubsNet (data loss and physics/PDE loss).
The method supports two output shapes:
Quantile mode: (B, H, Q, C), where Q is the number of quantiles and C = output_subsidence_dim + output_gwl_dim.
Deterministic mode: (B, H, C), when quantiles are disabled.
- Parameters:
predictions_combined (Tensor) – Network output after the QuantileDistributionModeling stage. Shape is (B, H, C) or (B, H, Q, C).
decoded_outputs_for_mean (Tensor) – Decoder output before quantile distribution, used to compute the PDE residual. Shape is (B, H, C).
training (bool, optional) – Inherited from the calling context. Present only in TensorFlow graph mode; not used explicitly here.
- Returns:
s_pred_final (Tensor) – Subsidence predictions ready for the data‑fidelity loss. Shape matches predictions_combined minus the C split.
gwl_pred_final (Tensor) – GWL predictions ready for the data‑fidelity loss.
s_pred_mean_for_pde (Tensor) – Mean (deterministic) subsidence predictions used when computing physics‑based derivatives.
gwl_pred_mean_for_pde (Tensor) – Mean GWL predictions for the PDE residual term.
- Return type:
Tuple[Tensor, Tensor, Tensor, Tensor]
Notes
Mean predictions are extracted only from decoded_outputs_for_mean because applying the quantile mapping first would break the differentiability required for spatial–temporal derivatives.
When TensorFlow executes in graph mode and the rank of predictions_combined is dynamic, the function falls back to tf.rank for shape inspection.
Examples
>>> outputs = model(...)  # forward pass
>>> s_final, gwl_final, s_mean, gwl_mean = (
...     model.split_outputs(
...         predictions_combined=outputs["pred"],
...         decoded_outputs_for_mean=outputs["dec_mean"],
...     )
... )
>>> s_final.shape
TensorShape([32, 24, 3])  # e.g. B=32, H=24, Q=3
>>> gwl_mean.shape
TensorShape([32, 24, 1])  # deterministic mean
See also
fusionlab.nn.pinn.QuantileDistributionModeling – Layer that adds the quantile dimension.
fusionlab.nn.pinn.PiHALNet.run_halnet_core – Produces decoded_outputs_for_mean.
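The channel split described for split_outputs can be pictured with nested Python lists standing in for tensors. This is a minimal sketch, assuming the subsidence channels come first along the last axis and the GWL channels follow; that ordering is an assumption here, and the helpers split_last_axis, shape_of and zeros are hypothetical, not fusionlab functions.

```python
def split_last_axis(nested, n_first):
    """Split every innermost list of `nested` into [:n_first] and [n_first:]."""
    if not isinstance(nested[0], list):
        return nested[:n_first], nested[n_first:]
    pairs = [split_last_axis(item, n_first) for item in nested]
    return [first for first, _ in pairs], [rest for _, rest in pairs]


def shape_of(nested):
    """Shape of a rectangular nested list, e.g. (B, H, C)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(dims)


def zeros(*dims):
    """Rectangular nested list of zeros with the given shape."""
    if len(dims) == 1:
        return [0.0] * dims[0]
    return [zeros(*dims[1:]) for _ in range(dims[0])]


output_subsidence_dim, output_gwl_dim = 2, 1
B, H, Q = 4, 3, 5
C = output_subsidence_dim + output_gwl_dim

# Deterministic mode: (B, H, C)
subs_det, gwl_det = split_last_axis(zeros(B, H, C), output_subsidence_dim)

# Quantile mode: (B, H, Q, C)
subs_q, gwl_q = split_last_axis(zeros(B, H, Q, C), output_subsidence_dim)

print(shape_of(subs_det), shape_of(gwl_det))  # (4, 3, 2) (4, 3, 1)
print(shape_of(subs_q), shape_of(gwl_q))      # (4, 3, 5, 2) (4, 3, 5, 1)
```

The same recursion handles both modes because only the last axis is split; every leading axis, including the optional quantile axis, passes through unchanged.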
- compile(lambda_cons=1.0, lambda_gw=1.0, **kwargs)
Compiles the model with composite loss weights.
This method extends the default Keras compile method to accept weights for the physics-based loss components.
- Parameters:
lambda_cons (float, optional) – The weight for the consolidation PDE residual loss. Defaults to 1.0.
lambda_gw (float, optional) – The weight for the groundwater flow PDE residual loss. Defaults to 1.0.
**kwargs – Standard arguments for tf.keras.Model.compile, such as optimizer, loss, and metrics.
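How the two weights enter the objective can be sketched in a few lines. The function below follows the documented form \(\mathcal L = L_\text{data} + \lambda_c L_\text{cons} + \lambda_g L_\text{gw}\) and drops a physics term when pde_mode switches it off. composite_loss is a hypothetical stand-in for illustration; the actual combination happens inside train_step and may differ in detail.

```python
def composite_loss(data_loss, cons_loss, gw_loss,
                   lambda_cons=1.0, lambda_gw=1.0, pde_mode="both"):
    """Weighted sum of data and physics losses. Hypothetical, not fusionlab API."""
    total = data_loss
    if pde_mode in ("consolidation", "both"):
        total += lambda_cons * cons_loss
    if pde_mode in ("gw_flow", "both"):
        total += lambda_gw * gw_loss
    return total


print(composite_loss(1.0, 0.5, 0.25))                                # 1.75
print(composite_loss(1.0, 0.5, 0.25, lambda_cons=2.0, lambda_gw=4.0))  # 3.0
print(composite_loss(1.0, 0.5, 0.25, pde_mode="none"))               # 1.0
```

Raising lambda_cons or lambda_gw makes the optimiser trade data fit for PDE consistency, so the weights are usually tuned together with the data loss scale.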
- get_config()
Returns the full configuration of the model.
This method serializes the model’s configuration by combining the configuration of the parent BaseAttentive class with the parameters specific to this PINN subclass.
- Returns:
A dictionary containing all the necessary parameters to reconstruct the model.
- Return type:
dict
- classmethod from_config(config, custom_objects=None)
Reconstructs a model instance from its configuration.
- Parameters:
config (dict) – Configuration dictionary from get_config.
custom_objects (dict, optional) – Unused here, as custom objects are expected to be registered with Keras.
- Returns:
A new instance of the model.
- Return type:
TransFlowSubsNet
- help(**kwargs)
- my_params = TransFlowSubsNet( static_input_dim, dynamic_input_dim, future_input_dim, output_subsidence_dim=1, output_gwl_dim=1, embed_dim=32, hidden_units=64, lstm_units=64, attention_units=32, num_heads=4, dropout_rate=0.1, forecast_horizon=1, quantiles=None, max_window_size=10, memory_size=100, scales=None, multi_scale_agg='last', final_agg='last', activation='relu', use_residuals=True, use_batch_norm=False, pde_mode='both', K=LearnableK(initial_value=0.0001, trainable=True, name=learnable_K), Ss=LearnableSs(initial_value=1e-05, trainable=True, name=learnable_Ss), Q=LearnableQ(initial_value=0.0, trainable=True, name=learnable_Q), pinn_coefficient_C=<LearnableC trainable=True, value=<tf.Variable 'log_pinn_coefficient_C:0' shape=() dtype=float32, numpy=-4.6051702>>, gw_flow_coeffs=None, use_vsn=True, vsn_units=None, mode=None, objective=None, attention_levels=None, architecture_config=None, name='TransFlowSubsNet' )¶