fusionlab.nn.pinn.TransFlowSubsNet

class fusionlab.nn.pinn.TransFlowSubsNet

Bases: BaseAttentive

Transient Ground-Water–Driven Subsidence Network

TransFlowSubsNet fuses a deep encoder–decoder with two physics residuals so the network learns a forecast and honours the governing dynamics:

Consolidation (rate form) couples settlement rate and head rate: \(\partial_t s + C\,\partial_t h = 0\). For \(C>0\), a head decline (\(\partial_t h<0\)) implies a positive settlement rate (\(\partial_t s>0\)).
Transient ground-water flow (diffusivity) enforces \(K\nabla^2 h + Q - S_s\,\partial_t h = 0\).

Both terms can be disabled via pde_mode (or DisabledC for the consolidation branch). Physics terms are computed on the mean path emitted by the decoder to keep a unique differentiable trajectory.
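
As a rough numerical illustration of the two residuals (a sketch under stated assumptions, not the library's implementation): in one spatial dimension with \(Q = 0\), the head field \(h(x,t) = e^{-(K/S_s)k^2 t}\sin(kx)\) satisfies the diffusivity equation, and \(s = -C\,h\) satisfies the rate-form consolidation equation. The helper names, step sizes, and coefficient values below are illustrative choices, and finite differences stand in for the automatic differentiation a PINN would use.

```python
import math

# Illustrative coefficients (assumed values chosen for this sketch).
K, SS, Q, C = 1e-4, 1e-5, 0.0, 0.01
WAVE_K = 0.5            # spatial wavenumber of the test solution
D = K / SS              # hydraulic diffusivity K / S_s


def head(x, t):
    """Analytical head: solves K*h_xx + Q - S_s*h_t = 0 when Q = 0."""
    return math.exp(-D * WAVE_K**2 * t) * math.sin(WAVE_K * x)


def settlement(x, t):
    """Settlement chosen so that d_t s + C * d_t h = 0."""
    return -C * head(x, t)


def d_dt(f, x, t, eps=1e-5):
    """Central finite difference in time."""
    return (f(x, t + eps) - f(x, t - eps)) / (2 * eps)


def d2_dx2(f, x, t, eps=1e-3):
    """Central second difference in space."""
    return (f(x + eps, t) - 2 * f(x, t) + f(x - eps, t)) / eps**2


def consolidation_residual(x, t):
    return d_dt(settlement, x, t) + C * d_dt(head, x, t)


def gw_flow_residual(x, t):
    return K * d2_dx2(head, x, t) + Q - SS * d_dt(head, x, t)


r_cons = consolidation_residual(1.0, 0.1)
r_gw = gw_flow_residual(1.0, 0.1)
print(abs(r_cons) < 1e-9, abs(r_gw) < 1e-9)
```

At the sampled point the head is declining while the settlement rate is positive, which matches the sign statement for \(C>0\) given above.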

See User Guide for a walkthrough.

Parameters:

static_input_dim (int) – Dimensionality of the static (time-invariant) input features. These are features that do not change over time for a given sample, such as a sensor’s location ID, soil type, or a product category. If 0, no static features are used.
dynamic_input_dim (int) – Dimensionality of the dynamic (time-varying) input features that are known in the past (the “lookback” window). This is a required parameter and typically includes the target variable itself (lagged) and other historical drivers like rainfall, temperature, or sales figures.
future_input_dim (int) – Dimensionality of the time-varying features for which values are known in advance for the forecast period. Examples include holidays, scheduled promotions, or day-of-week indicators. If 0, no future features are used.
output_subsidence_dim (int, default 1) – Number of subsidence series per horizon step. In a multi-well setting with n benchmarks, use n.
output_gwl_dim (int, default 1) – Number of hydraulic-head series. Use >1 for multi-aquifer or multi-well configurations.
forecast_horizon (int, default 1) – Horizon length \(H\). The decoder emits \(H\) steps; physics residuals are evaluated at each emitted step.
quantiles (list[float] | None, default None) – Optional list of quantile levels; enables the Quantile-Distribution head for calibrated uncertainty.
embed_dim (int, default 32) – The base dimensionality for the internal feature space of the model. Various input features (static, dynamic, future) are projected into this common dimension to allow for meaningful interactions within downstream layers like LSTMs and attention mechanisms. It’s a key parameter for controlling model capacity.
hidden_units (int, default 64) – The number of units in the hidden layers of the Gated Residual Networks (GRNs). GRNs are core components used for non-linear transformations throughout the architecture. A larger value increases the model’s capacity to learn complex patterns.
lstm_units (int, default 64) – The number of hidden units in each LSTM layer within the MultiScaleLSTM block. This parameter determines the memory capacity of the recurrent cells processing the historical sequence data.
attention_units (int, default 32) – The dimensionality of the output space for the various attention mechanisms (e.g., CrossAttention, HierarchicalAttention). This is also often referred to as the model’s dimension, \(d_{model}\). It must be divisible by num_heads.
num_heads (int, default 4) – The number of attention heads in each MultiHeadAttention sub-layer. Using multiple heads allows the model to jointly attend to information from different representation subspaces at different positions, which can improve learning.
dropout_rate (float, default 0.1) – The dropout rate applied within various components like Gated Residual Networks (GRNs) and after some attention layers to prevent overfitting. It must be a float between 0.0 and 1.0.
max_window_size (int, default 10) – The number of past time steps (the lookback window) that the model considers. This should directly correspond to the time_steps parameter used during data preparation and is used by components like DynamicTimeWindow.
memory_size (int, default 100) – The number of memory slots in the MemoryAugmentedAttention layer. This external memory allows the model to learn and access patterns over very long-range dependencies that might be missed by standard LSTMs or attention.
scales (list of int, optional) – A list of scale factors for the MultiScaleLSTM. Each scale s creates an LSTM that processes the input sequence by taking every s-th time step. For example, scales=[1, 3] would process the sequence at its original resolution and at a coarser, every-third-timestep resolution. If None or ‘auto’, defaults to [1].
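
The "every s-th time step" subsampling described for scales can be pictured with plain slicing. This is only an illustration of the idea, not the MultiScaleLSTM code: `subsample` is a hypothetical helper, and starting the stride at the first time step is an assumption.

```python
def subsample(sequence, scale):
    """Keep every `scale`-th time step, starting from the first (assumed offset)."""
    return sequence[::scale]


lookback = list(range(12))          # 12 time steps: t0 ... t11
views = {s: subsample(lookback, s) for s in [1, 3]}

for s, view in views.items():
    # Each scale would feed its own LSTM with a sequence of this length.
    print(s, len(view), view)
```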
multi_scale_agg ({'last', 'average', 'concat', ...}, default 'last') –
The strategy used by the aggregation function to combine the outputs from the different LSTMs in MultiScaleLSTM.
- 'concat': (For 3D output) Pads sequences from different scales to the same length and concatenates them along the feature axis. This is the primary mode for creating a rich sequence representation for downstream attention layers in an encoder-decoder setup.
- 'last' or 'auto': (For 2D output) Creates a context vector by taking the last hidden state from each LSTM scale and concatenating them.
- 'average' or 'sum': Creates a 2D context vector by averaging or summing over the time dimension for each scale.
final_agg ({'last', 'average', 'flatten'}, default 'last') – The aggregation strategy used to collapse the final temporal feature map (which has a time dimension equal to forecast_horizon) into a single feature vector before the final decoding step.
activation (str, default 'relu') – The name of the activation function to use in Dense layers and Gated Residual Networks (GRNs) throughout the model. Common choices include ‘relu’, ‘gelu’, ‘swish’, and ‘tanh’.
use_residuals (bool, default True) – If True, enables residual “add & norm” connections after key sub-layers (like attention and GRNs). These shortcut connections are crucial for training very deep networks as they help prevent vanishing gradients and ease the optimization process.
use_batch_norm (bool, default False) – If True, BatchNormalization is used within Gated Residual Networks (GRNs). If False (default), LayerNormalization is used instead. LayerNormalization is often more stable and effective for time series data with varying sequence lengths.
use_vsn (bool, default True) – If True, the model uses VariableSelectionNetwork (VSN) layers at the input stage. VSNs perform intelligent, learnable feature selection, allowing the model to up-weight or down-weight the importance of each input variable. This can improve performance and provide insights into which features are most impactful. If False, simpler Dense layers are used for initial projection.
vsn_units (int, optional) – The number of units in the internal Gated Residual Networks (GRNs) of the Variable Selection Networks. This parameter controls the capacity of the feature selection sub-networks. If None, it often defaults to a value based on hidden_units.
pde_mode ({'consolidation', 'gw_flow', 'both', 'none'}, default 'both') –
Select which residuals participate in the physics loss:

┌─────────────────┬─────────────────────────────────────────────┐
│ 'consolidation' │ only the rate-coupled term                  │
│                 │ \(\partial_t s + C\,\partial_t h\)          │
│ 'gw_flow'       │ only the diffusivity term                   │
│                 │ \(K\nabla^2 h + Q - S_s\,\partial_t h\)     │
│ 'both'          │ both residuals (recommended)                │
│ 'none'          │ pure data-driven; behaves like HAL-Net      │
└─────────────────┴─────────────────────────────────────────────┘
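
The mapping from pde_mode to active residuals can be sketched as a small lookup. `active_residuals` is a hypothetical helper written for this page; accepting a list of names is an assumption based on the `str | List[str]` type shown in the signature, and the library's own validation may differ.

```python
def active_residuals(pde_mode):
    """Map a pde_mode value to the set of residual names it enables."""
    known = {"consolidation", "gw_flow"}
    if isinstance(pde_mode, str):
        if pde_mode == "both":
            return set(known)
        if pde_mode == "none":
            return set()
        pde_mode = [pde_mode]          # single name, treat as a one-item list
    selected = set(pde_mode)
    unknown = selected - known
    if unknown:
        raise ValueError(f"Unknown pde_mode entries: {sorted(unknown)}")
    return selected


print(sorted(active_residuals("both")))
print(sorted(active_residuals("gw_flow")))
```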
K, Ss, Q (float | str | Learnable*, defaults 1e-4, 1e-5, 0) –
Hydraulic conductivity \(K\) [L/T], specific storage \(S_s\) [1/L], and volumetric source/sink \(Q\) [L/T]. Accepted forms:
- float / int → fixed numeric.
- 'learnable' → wrap into LearnableK / LearnableSs / LearnableQ, seeding from the numeric value in the same call (or the default).
- 'fixed' → force numeric even if a global 'learnable' setting is present in resolve_gw_coeffs().
- Learnable* instance → forwarded unchanged.
pinn_coefficient_C (float | str | LearnableC | FixedC | DisabledC, default LearnableC(0.01)) –
Consolidation-coupling coefficient in

\[\partial_t s + C\,\partial_t h = 0\]
- float – fixed.
- 'learnable' or LearnableC – optimized in log-space to keep \(C>0\).
- DisabledC – disables the consolidation residual regardless of pde_mode.
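
The log-space parameterization can be illustrated in a few lines: the trainable quantity is \(\log C\), and \(C = \exp(\log C)\) is positive for any real value of it. This is a plain-Python sketch of the idea rather than the LearnableC class itself, and `c_from_log` is a hypothetical name. The initial value \(\ln 0.01 \approx -4.605\) agrees with the `log_pinn_coefficient_C` variable shown in the `__init__` signature.

```python
import math


def c_from_log(log_c):
    """Recover the physical coefficient from its log-space parameter."""
    return math.exp(log_c)


log_c0 = math.log(0.01)             # initial log-space value for C = 0.01
print(round(log_c0, 6))

# However far an optimizer moves log_c, the recovered C stays positive.
candidates = [c_from_log(log_c0 + step) for step in (-50.0, -1.0, 0.0, 1.0, 50.0)]
print(all(c > 0.0 for c in candidates))
```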
gw_flow_coeffs (dict | None, default None) –
Convenience container to override K/Ss/Q jointly, e.g.
```
gw_flow_coeffs = {'K': 'learnable', 'Ss': 1e-6, 'Q': 'fixed'}
```
Dict entries win over the individual keyword arguments.
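
The precedence rule can be sketched as a dictionary merge. `merge_gw_coeffs` is a hypothetical helper used only to show the override order; it does not reproduce resolve_gw_coeffs(), whose signature is not documented on this page.

```python
def merge_gw_coeffs(K=1e-4, Ss=1e-5, Q=0.0, gw_flow_coeffs=None):
    """Hypothetical helper: dict entries override the keyword values."""
    merged = {"K": K, "Ss": Ss, "Q": Q}
    merged.update(gw_flow_coeffs or {})   # dict entries win
    return merged


resolved = merge_gw_coeffs(K=5e-4, gw_flow_coeffs={"K": "learnable", "Ss": 1e-6})
print(resolved)
```

Here the keyword `K=5e-4` is replaced by the dict entry `'learnable'`, while `Q` keeps its keyword default because the dict does not mention it.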

scale_pde_residuals (bool, default True) –

If True, non-dimensionalize physics residuals with simple data-driven scales so the loss terms are \(\mathcal{O}(1)\). By default:

Ground-water scale \(\text{gw\_scale} \approx \overline{S_s}\, \overline{|h|}/\overline{\Delta t}\).
Consolidation scale \(\text{cons\_scale} \approx \overline{|s|}/\overline{\Delta t}\).

These references are computed per batch on the mean paths and used as divisors in the residuals (see default_scales(), scale_residual()).
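
A worked example of the two default scales, using the formulas above on a tiny hand-made batch. This is a sketch only: `batch_scales_sketch` is a hypothetical stand-in, not default_scales() or scale_residual(), and the small `eps` floor that guards against division by zero is an assumption.

```python
def mean(values):
    return sum(values) / len(values)


def batch_scales_sketch(h, s, dt, ss, eps=1e-12):
    """Return (gw_scale, cons_scale) from batch means, floored at eps."""
    gw_scale = ss * mean([abs(v) for v in h]) / mean(dt)
    cons_scale = mean([abs(v) for v in s]) / mean(dt)
    return max(gw_scale, eps), max(cons_scale, eps)


h = [10.0, 9.5, 9.0, 8.5]        # mean-path heads
s = [0.00, 0.01, 0.02, 0.03]     # mean-path subsidence
dt = [1.0, 1.0, 1.0]             # time increments between steps

gw_scale, cons_scale = batch_scales_sketch(h, s, dt, ss=1e-5)

raw_gw_residual = 2.0e-5
scaled_gw_residual = raw_gw_residual / gw_scale   # scale used as a divisor
print(gw_scale, cons_scale, scaled_gw_residual)
```

With these numbers the raw ground-water residual of order 1e-5 becomes a dimensionless value of order 0.1, which is the intended effect of the scaling.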

scaling_kwargs (dict | None, default None) –

Extra keyword arguments forwarded to the internal scaling routine. Reserved for future extensions (custom reference magnitudes or alternative scalings). When None or empty, the default batch statistics are used. Unrecognized keys are currently ignored.

mode ({'pihal_like', 'tft_like'}, default None) –

Routing for future_features:

pihal_like – decoder gets all \(H\) rows, encoder none.
tft_like – first max_window_size rows to encoder, next \(H\) rows to decoder, matching the original Temporal Fusion Transformer.

None inherits the BaseAttentive default ('tft_like').
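
The two routing rules can be sketched with list slicing along the time axis. `route_future` is a hypothetical helper for illustration; it assumes the future tensor holds \(H\) rows in pihal_like mode and max_window_size + \(H\) rows in tft_like mode.

```python
def route_future(future_rows, mode, max_window_size, horizon):
    """Hypothetical helper returning (encoder_rows, decoder_rows)."""
    if mode == "pihal_like":
        # Encoder receives nothing; decoder receives every row.
        return [], list(future_rows)
    if mode == "tft_like":
        encoder = future_rows[:max_window_size]
        decoder = future_rows[max_window_size:max_window_size + horizon]
        return encoder, decoder
    raise ValueError(f"Unsupported mode: {mode!r}")


T, H = 10, 3
enc_tft, dec_tft = route_future(list(range(T + H)), "tft_like", T, H)
enc_pihal, dec_pihal = route_future(list(range(H)), "pihal_like", T, H)
print(len(enc_tft), len(dec_tft), len(enc_pihal), len(dec_pihal))
```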

objective ({'hybrid', 'transformer'}, default 'hybrid') –

Selects the backbone architecture that processes dynamic-past and (optionally) known-future covariates before the decoding stage.

'hybrid' – Multi-scale LSTM -> Transformer. The encoder first extracts multi-resolution temporal features with a stack of LSTMs (one per scale), then refines these features with hierarchical/cross attention blocks. This configuration balances the strong sequence-memory capability of recurrent networks with the global-context modelling power of Transformers and is recommended for most tabular time-series data.
'transformer' – Pure Transformer. Bypasses the LSTM stack and feeds the embeddings directly into the attention encoder, resulting in a lightweight, fully self-attention model. Choose this if your data exhibit long-range dependencies for which an LSTM adds little benefit, or when you need faster training/inference at the cost of some short-term pattern capture.

In a future release:

Shortcut for common loss presets. Values that should be recognised:
- 'nse' – Nash–Sutcliffe model-efficiency score.
- 'rmse' – root-mean-square error.
When None, losses are supplied via compile().

attention_levels (str | list[str] | None, default None) –

Which hierarchical attention outputs are returned when the model is called with training=False. Use 'all' or a subset such as ['scale', 'cross'] for interpretability.

name (str, default "TransFlowSubsNet") –

Model scope as registered in Keras.

**kwargs

Forwarded verbatim to tf.keras.Model.

Notes

Physics loss is added outside the Keras loss container inside train_step(); compile with lambda_cons and lambda_gw to scale the two physics terms.
Residuals are evaluated on the mean decoder outputs (pre-quantile) to keep a single differentiable reference trajectory. Quantile heads are used for data-fidelity with calibrated uncertainty.

See also

fusionlab.nn.models.HALNet: Purely data-driven encoder–decoder (no physics terms).
fusionlab.nn.pinn.models.PIHALNet: Physics-informed HAL-Net with consolidation and anomaly modules.
fusionlab.nn.pinn.models.PiTGWFlow: PINN for transient ground-water flow without subsidence coupling.
fusionlab.nn.pinn.op.default_scales: Batchwise scale construction for residual normalization.
fusionlab.nn.pinn.op.scale_residual: Safe residual normalization helper.

Examples

>>> import tensorflow as tf
>>> from fusionlab.nn.pinn import TransFlowSubsNet
>>> model = TransFlowSubsNet(
...     static_input_dim=3, dynamic_input_dim=8, future_input_dim=4,
...     output_subsidence_dim=1, output_gwl_dim=1,
...     K='learnable', Ss=1e-5, Q='fixed',
...     pde_mode='both', scales=[1, 3], multi_scale_agg='concat',
...     scale_pde_residuals=True
... )
>>> batch = {
...     "static_features":  tf.zeros([8, 3]),
...     "dynamic_features": tf.zeros([8, 12, 8]),
...     "future_features":  tf.zeros([8, 6, 4]),
...     "coords":           tf.zeros([8, 6, 3]),
... }
>>> out = model(batch, training=False)
>>> sorted(out.keys())
['gwl_pred', 'gwl_pred_mean', 'subs_pred', 'subs_pred_mean']

__init__(static_input_dim, dynamic_input_dim, future_input_dim, output_subsidence_dim=1, output_gwl_dim=1, embed_dim=32, hidden_units=64, lstm_units=64, attention_units=32, num_heads=4, dropout_rate=0.1, forecast_horizon=1, quantiles=None, max_window_size=10, memory_size=100, scales=None, multi_scale_agg='last', final_agg='last', activation='relu', use_residuals=True, use_batch_norm=False, pde_mode='both', K=LearnableK(initial_value=0.0001, trainable=True, name=learnable_K), Ss=LearnableSs(initial_value=1e-05, trainable=True, name=learnable_Ss), Q=LearnableQ(initial_value=0.0, trainable=True, name=learnable_Q), pinn_coefficient_C=<LearnableC trainable=True, value=<tf.Variable 'log_pinn_coefficient_C:0' shape=() dtype=float32, numpy=-4.605170249938965>>, gw_flow_coeffs=None, use_vsn=True, vsn_units=None, mode=None, objective=None, attention_levels=None, architecture_config=None, scale_pde_residuals=True, scaling_kwargs=None, name='TransFlowSubsNet', **kwargs)

Parameters:

static_input_dim (int)
dynamic_input_dim (int)
future_input_dim (int)
output_subsidence_dim (int)
output_gwl_dim (int)
embed_dim (int)
hidden_units (int)
lstm_units (int)
attention_units (int)
num_heads (int)
dropout_rate (float)
forecast_horizon (int)
quantiles (List[float] | None)
max_window_size (int)
memory_size (int)
scales (List[int] | None)
multi_scale_agg (str)
final_agg (str)
activation (str)
use_residuals (bool)
use_batch_norm (bool)
pde_mode (str | List[str])
K (str | float | LearnableK)
Ss (float | LearnableSs | str)
Q (float | LearnableQ | str)
pinn_coefficient_C (LearnableC | FixedC | DisabledC | str | float | None)
gw_flow_coeffs (Dict[str, str | float | None] | None)
use_vsn (bool)
vsn_units (int | None)
mode (str | None)
objective (str | None)
attention_levels (str | List[str] | None)
architecture_config (Dict | None)
scale_pde_residuals (bool)
scaling_kwargs (Dict[str, Any] | None)
name (str)

Methods

`__init__`(static_input_dim, ...[, ...])
`add_loss`(loss)	Can be called inside of the call() method to add a scalar loss.
`add_metric`(*args, **kwargs)
`add_variable`(shape, initializer[, dtype, ...])	Add a weight variable to the layer.
`add_weight`([shape, initializer, dtype, ...])	Add a weight variable to the layer.
`apply_attention_levels`(...)	Applies attention mechanisms in the order specified by att_levels, using the provided attention methods such as cross attention, hierarchical attention, and memory-augmented attention.
`build`(input_shape)
`build_from_config`(config)	Builds the layer's states with the supplied config dict.
`call`(inputs[, training])	Single forward sweep mixing data and physics paths.
`compile`([lambda_cons, lambda_gw])	Compiles the model with composite loss weights.
`compile_from_config`(config)	Compiles the model with the information given in config.
`compiled_loss`(y, y_pred[, sample_weight, ...])
`compute_loss`([x, y, y_pred, sample_weight, ...])	Compute the total loss, validate it, and return it.
`compute_mask`(inputs, previous_mask)
`compute_metrics`(x, y, y_pred[, sample_weight])	Update metric states and collect all metrics to be returned.
`compute_output_shape`(*args, **kwargs)
`compute_output_spec`(*args, **kwargs)
`count_params`()	Count the total number of scalars composing the weights.
`evaluate`([x, y, batch_size, verbose, ...])	Returns the loss value & metrics values for the model in test mode.
`export`(filepath[, format, verbose, ...])	Export the model as an artifact for inference.
`fit`([x, y, batch_size, epochs, verbose, ...])	Trains the model for a fixed number of epochs (dataset iterations).
`from_config`(config[, custom_objects])	Reconstructs a model instance from its configuration.
`get_build_config`()	Returns a dictionary with the layer's input shape.
`get_compile_config`()	Returns a serialized config with information for compiling the model.
`get_config`()	Returns the full configuration of the model.
`get_layer`([name, index])	Retrieves a layer based on either its name (unique) or index.
`get_metrics_result`()	Returns the model's metrics values as a dict.
`get_params`([deep])	Get the parameters for this learner.
`get_pinn_coefficient_C`()	Returns the physical coefficient C.
`get_state_tree`([value_format])	Retrieves tree-like structure of model variables.
`get_weights`()	Return the values of layer.weights as a list of NumPy arrays.
`help`(**kwargs)
`load`(file_path[, format])	Load the learner's state from a specified file in the desired format.
`load_own_variables`(store)	Loads the state of the layer.
`load_weights`(filepath[, skip_mismatch])	Load the weights from a single file or sharded files.
`loss`(y, y_pred[, sample_weight])
`make_predict_function`([force])
`make_test_function`([force])
`make_train_function`([force])
`predict`(x[, batch_size, verbose, steps, ...])	Generates output predictions for the input samples.
`predict_on_batch`(x)	Returns predictions for a single batch of samples.
`predict_step`(data)
`quantize`(mode[, config])	Quantize the weights of the model.
`quantized_build`(input_shape, mode)
`quantized_call`(*args, **kwargs)
`reconfigure`(architecture_config)	Creates a new model instance with a modified architecture.
`rematerialized_call`(layer_call, *args, **kwargs)	Enable rematerialization dynamically for layer's call method.
`reset_metrics`()
`run_encoder_decoder_core`(static_input, ...)	Executes the data-driven pipeline with a selectable encoder architecture, processing static, dynamic, and future inputs through the encoder-decoder interaction.
`save`(filepath[, overwrite, zipped])	Saves a model as a .keras file.
`save_own_variables`(store)	Saves the state of the layer.
`save_weights`(filepath[, overwrite, ...])	Saves all weights to a single file or sharded files.
`set_params`(**params)	Set the parameters of this learner.
`set_state_tree`(state_tree)	Assigns values to variables of the model.
`set_weights`(weights)	Sets the values of layer.weights from a list of NumPy arrays.
`split_outputs`(predictions_combined, ...)	Separate the combined output tensor into individual subsidence and groundwater‑level (GWL) components and return both the final and mean predictions needed for the two loss terms used in TransFlowSubsNet (data loss and physics/PDE loss).
`stateless_call`(trainable_variables, ...[, ...])	Call the layer without any side effects.
`stateless_compute_loss`(trainable_variables, ...)
`summary`([line_length, positions, print_fn, ...])	Prints a string summary of the network.
`symbolic_call`(*args, **kwargs)
`test_on_batch`(x[, y, sample_weight, return_dict])	Test the model on a single batch of samples.
`test_step`(data)
`to_json`(**kwargs)	Returns a JSON string containing the network configuration.
`train_on_batch`(x[, y, sample_weight, ...])	Runs a single gradient update on a single batch of data.
`train_step`(data)	One optimization step uniting data and physics terms.

Attributes

`compiled_metrics`
`compute_dtype`	The dtype of the computations performed by the layer.
`distribute_reduction_method`
`distribute_strategy`
`dtype`	Alias of layer.variable_dtype.
`dtype_policy`
`input`	Retrieves the input tensor(s) of a symbolic operation.
`input_dtype`	The dtype layer inputs should be converted to.
`input_spec`
`jit_compile`
`layers`
`losses`	List of scalar losses from add_loss, regularizers and sublayers.
`metrics`	List of all metrics.
`metrics_names`
`metrics_variables`	List of all metric variables.
`my_params`
`non_trainable_variables`	List of all non-trainable layer state.
`non_trainable_weights`	List of all non-trainable weight variables of the layer.
`output`	Retrieves the output tensor(s) of a layer.
`path`	The path of the layer.
`quantization_mode`	The quantization mode of this layer, None if not quantized.
`run_eagerly`
`supports_masking`	Whether this layer supports computing a mask using compute_mask.
`trainable`	Settable boolean, whether this layer should be trainable or not.
`trainable_variables`	List of all trainable layer state.
`trainable_weights`	List of all trainable weight variables of the layer.
`variable_dtype`	The dtype of the state (weights) of the layer.
`variables`	List of all layer state, including random seeds.
`weights`	List of all weight variables of the layer.

get_pinn_coefficient_C()

Returns the physical coefficient C.

Return type:

Tensor

call(inputs, training=False)

Single forward sweep mixing data and physics paths.

The routine

extracts \(t, x, y\) and covariate tensors from inputs;
runs sanity checks on dimensionality;
feeds the validated features through the inherited encoder–decoder stack; and
splits the decoder output into mean and final predictions that will later enter the data‐ and PDE‐loss terms.

Returns:

{"subs_pred": \(s\), "gwl_pred": \(h\), "subs_pred_mean": \(\bar s\), "gwl_pred_mean": \(\bar h\)} with shapes

\[\begin{split}s,\;h &\in \mathbb R^{B\times H\times d},\\ \bar s,\;\bar h &\in \mathbb R^{B\times H\times d}.\end{split}\]

Here B is the batch size, H the forecast horizon and d each target’s width.

Return type:

dict

Parameters:

inputs (Dict[str, Tensor | None])
training (bool)

Notes

Coordinate tensors are not differentiated here; their gradients are taken in train_step().
The method remains side‐effect free: no weights are updated, no losses are added. It purely produces tensors needed by the custom training loop that follows.

train_step(data)

One optimization step uniting data and physics terms.

Uses a single tf.GradientTape to obtain all first- and second-order coordinate derivatives required by the groundwater-flow and consolidation PDEs.

Data loss: any Keras loss supplied in compile().

PDE losses:

\[\begin{split}R_c &= \partial_t s + C\,\partial_t h,\\ R_{gw} &= K(\partial_{xx} h + \partial_{yy} h) + Q - S_s\,\partial_t h.\end{split}\]

The final objective is \(\mathcal L = L_\text{data} + \lambda_c L_\text{cons} + \lambda_g L_\text{gw}\).

On return the method updates built-in metrics and provides a dict with total, data and PDE losses.
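
The composite objective can be sketched numerically as follows. This is an illustration, not the body of train_step(): taking each PDE loss as the mean squared residual is an assumption, and the dictionary keys are hypothetical names rather than the keys the method returns.

```python
def mean_square(values):
    return sum(v * v for v in values) / len(values)


def total_loss(data_loss, cons_residuals, gw_residuals,
               lambda_cons=1.0, lambda_gw=1.0):
    """Combine data and physics terms: L = L_data + lc * L_cons + lg * L_gw."""
    loss_cons = mean_square(cons_residuals)   # assumed form of L_cons
    loss_gw = mean_square(gw_residuals)       # assumed form of L_gw
    total = data_loss + lambda_cons * loss_cons + lambda_gw * loss_gw
    return {
        "total_loss": total,
        "data_loss": data_loss,
        "consolidation_loss": loss_cons,
        "gw_flow_loss": loss_gw,
    }


losses = total_loss(
    data_loss=0.5,
    cons_residuals=[0.1, -0.1],
    gw_residuals=[0.2, 0.0],
    lambda_cons=2.0,
    lambda_gw=0.5,
)
print(losses)
```

The weights play the role of the lambda_cons and lambda_gw arguments passed to compile(): raising one of them makes the corresponding residual count more heavily against the data term.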

split_outputs(predictions_combined, decoded_outputs_for_mean)

Separate the combined output tensor into individual subsidence and groundwater‑level (GWL) components and return both the final and mean predictions needed for the two loss terms used in TransFlowSubsNet (data loss and physics/PDE loss).

The method supports two output shapes:

Quantile mode (B, H, Q, C) where Q is the number of quantiles and C = output_subsidence_dim + output_gwl_dim.
Deterministic mode (B, H, C) when quantiles are disabled.
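
The channel split works the same way in both modes because it only touches the last axis. The sketch below uses nested lists in place of tensors; `split_last_axis` is a hypothetical helper, and placing the subsidence channels before the GWL channels on that axis is an assumption.

```python
def split_last_axis(tensor, subs_dim):
    """Split the innermost axis of a nested list at index `subs_dim`."""
    if not isinstance(tensor[0], list):
        # Innermost axis reached: first subs_dim entries are subsidence.
        return tensor[:subs_dim], tensor[subs_dim:]
    pairs = [split_last_axis(item, subs_dim) for item in tensor]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Deterministic mode, shape (B=1, H=2, C=3) with 2 subsidence + 1 GWL channel.
det = [[[1, 2, 3], [4, 5, 6]]]
subs_det, gwl_det = split_last_axis(det, subs_dim=2)

# Quantile mode, shape (B=1, H=1, Q=2, C=2) with 1 subsidence + 1 GWL channel.
quant = [[[[1, 2], [3, 4]]]]
subs_q, gwl_q = split_last_axis(quant, subs_dim=1)

print(subs_det, gwl_det)
print(subs_q, gwl_q)
```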

Parameters:

predictions_combined (Tensor) – Network output after the QuantileDistributionModeling stage. Shape is (B, H, C) or (B, H, Q, C).
decoded_outputs_for_mean (Tensor) – Decoder output before quantile distribution, used to compute the PDE residual. Shape is (B, H, C).
training (bool, optional) – Inherited from the calling context. Present only in TensorFlow graph mode; not used explicitly here.

Returns:

s_pred_final (Tensor) – Subsidence predictions ready for the data‑fidelity loss. Shape matches predictions_combined, with the last axis narrowed from C to output_subsidence_dim.
gwl_pred_final (Tensor) – GWL predictions ready for the data‑fidelity loss.
s_pred_mean_for_pde (Tensor) – Mean (deterministic) subsidence predictions used when computing physics‑based derivatives.
gwl_pred_mean_for_pde (Tensor) – Mean GWL predictions for the PDE residual term.

Return type:

Tuple[Tensor, Tensor, Tensor, Tensor]

Notes

Mean predictions are extracted only from decoded_outputs_for_mean because applying the quantile mapping first would break the differentiability required for spatial–temporal derivatives.
When TensorFlow executes in graph mode and the rank of predictions_combined is dynamic, the function falls back to tf.rank for shape inspection.

Examples

>>> outputs = model(...)                       # forward pass
>>> s_final, gwl_final, s_mean, gwl_mean = (
...     model.split_outputs(
...         predictions_combined=outputs["pred"],
...         decoded_outputs_for_mean=outputs["dec_mean"],
...     )
... )
>>> s_final.shape
TensorShape([32, 24, 3])          # e.g. B=32, H=24, Q=3
>>> gwl_mean.shape
TensorShape([32, 24, 1])          # deterministic mean