fusionlab.nn.utils.format_pinn_predictions
- fusionlab.nn.utils.format_pinn_predictions(predictions=None, model=None, model_inputs=None, y_true_dict=None, target_mapping=None, include_gwl=True, include_coords=True, quantiles=None, forecast_horizon=None, output_dims=None, ids_data_array=None, ids_cols=None, ids_cols_indices=None, scaler_info=None, coord_scaler=None, evaluate_coverage=False, coverage_quantile_indices=(0, -1), savefile=None, _logger=None, verbose=0, **kwargs)
Formats PINN model predictions into a structured pandas DataFrame.
This is a general-purpose utility for transforming raw model outputs (from models like PIHALNet or TransFlowSubsNet) into a long-format DataFrame suitable for analysis, visualization, or export. It handles multi-target outputs (e.g., subsidence and GWL), point or quantile forecasts, and can optionally include true values, coordinate information, and other metadata. It also supports inverse scaling of predictions and evaluation of quantile coverage.
- Parameters:
predictions (dict of Tensors, optional) – The dictionary of prediction tensors, typically returned by a model's .predict() method. Keys should match the model's output layer names (e.g., 'subs_pred', 'gwl_pred'). If None, predictions are generated internally using the model and model_inputs arguments. Default is None.
model (keras.Model, optional) – A compiled Keras model instance used to generate predictions if the predictions dictionary is not provided. Default is None.
model_inputs (dict of Tensors, optional) – A dictionary of input tensors matching the model's signature, required only if predictions is None. Default is None.
y_true_dict (dict, optional) – A dictionary containing the ground-truth target arrays, keyed by their base names (e.g., 'subsidence', 'gwl'). If provided, an <target>_actual column is added to the output DataFrame for comparison. Default is None.
target_mapping (dict, optional) – A custom mapping from model output keys to desired base names in the DataFrame columns. For example: {'subs_pred': 'subsidence_mm', 'gwl_pred': 'head_m'}. Default is None.
include_gwl (bool, default True) – Toggles the inclusion of groundwater level (GWL) predictions in the final DataFrame.
include_coords (bool, default True) – Toggles the inclusion of the spatio-temporal coordinate columns (coord_t, coord_x, coord_y) in the final DataFrame.
quantiles (list of float, optional) – The list of quantile levels (e.g., [0.1, 0.5, 0.9]) that the model predicted. This is required to parse probabilistic forecasts correctly. Default is None.
forecast_horizon (int, optional) – The length of the forecast horizon. If None, it is inferred from the shape of the prediction tensors. Default is None.
output_dims (dict of str, optional) – A dictionary specifying the feature dimension of each target, e.g., {'subs_pred': 1, 'gwl_pred': 1}. If None, it is inferred from the tensor shapes. Default is None.
ids_data_array (np.ndarray or pd.DataFrame, optional) – An array or DataFrame containing static identifiers (e.g., well IDs, site categories) for each sample. Its length must match the number of samples in the prediction. Default is None.
ids_cols (list of str, optional) – A list of column names for the ids_data_array. Required if ids_data_array is a NumPy array. Default is None.
ids_cols_indices (list of int, optional) – A list of column indices to select from ids_data_array if it is a NumPy array. Default is None.
scaler_info (dict, optional) – A dictionary providing the information needed to perform inverse scaling on a per-target basis. Each key should be a target name (e.g., 'subsidence') and its value a dictionary containing {'scaler': obj, 'all_features': list, 'idx': int}. Default is None.
coord_scaler (object, optional) – A fitted scikit-learn-like scaler object used to perform an inverse transform on the coordinate columns. Default is None.
evaluate_coverage (bool, default False) – If True and quantile predictions are present, calculates the unconditional coverage of the prediction interval.
coverage_quantile_indices (tuple of (int, int), default (0, -1)) – The indices of the lower and upper quantiles in the sorted quantiles list to use for the coverage calculation. Default is (0, -1), which corresponds to the full range.
savefile (str, optional) – If a file path is provided, the final DataFrame is saved to a CSV file at this location. Default is None.
verbose (int, default 0) – The verbosity level, from 0 (silent) to 5 (trace every step).
**kwargs (dict) – Additional keyword arguments for future extensions.
_logger (Logger | Callable[[str], None] | None)
- Returns:
A long-format DataFrame where each row represents a single forecast step for a single sample. Columns include sample and step identifiers, coordinates, predictions, and optionally actuals and metadata.
- Return type:
pd.DataFrame
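To make the "one row per sample per forecast step" layout concrete, the following is a minimal sketch of how a prediction array of shape (samples, horizon, outputs) could be flattened into such a frame. It is not the library's implementation: the helper name to_long_format and the identifier columns sample_idx and forecast_step are assumptions chosen for illustration, and the real column names may differ.

```python
import numpy as np
import pandas as pd


def to_long_format(preds, quantiles=None, target="subsidence"):
    """Flatten a (samples, horizon, outputs) array to one row per sample and step.

    Hypothetical helper for illustration only; not part of fusionlab.
    """
    preds = np.asarray(preds)
    n_samples, horizon, n_out = preds.shape

    # Assumed identifier columns; the library may name these differently.
    frame = pd.DataFrame({
        "sample_idx": np.repeat(np.arange(n_samples), horizon),
        "forecast_step": np.tile(np.arange(1, horizon + 1), n_samples),
    })

    # Row order after reshape matches the repeat/tile order above (C order).
    flat = preds.reshape(n_samples * horizon, n_out)

    if quantiles is None:
        # Point forecast: a single <target>_pred column.
        frame[f"{target}_pred"] = flat[:, 0]
    else:
        # Quantile forecast: one column per quantile level.
        for j, q in enumerate(quantiles):
            frame[f"{target}_q{int(round(q * 100))}"] = flat[:, j]
    return frame


# Two samples, two forecast steps, three quantiles.
preds = np.arange(12, dtype=float).reshape(2, 2, 3)
df = to_long_format(preds, quantiles=[0.1, 0.5, 0.9])
```

Here df has four rows (two samples times two steps) and one column per quantile, which is the shape the Returns description refers to.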
Notes
The function returns a column-aligned DataFrame, which simplifies subsequent analysis and plotting.
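As one example of such analysis, the unconditional coverage mentioned under evaluate_coverage is commonly defined as the fraction of actual values that fall inside the prediction interval. The sketch below computes it from three aligned sequences, such as an actual column and the lower and upper quantile columns. The function name interval_coverage and the choice of inclusive bounds are assumptions; the library's own calculation may differ in detail.

```python
def interval_coverage(actual, lower, upper):
    """Return the fraction of actual values inside [lower, upper].

    Hypothetical helper for illustration; bounds are treated as inclusive.
    """
    if not (len(actual) == len(lower) == len(upper)):
        raise ValueError("all inputs must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must not be empty")
    # Count rows where the actual value lies within the interval.
    hits = sum(lo <= y <= hi for y, lo, hi in zip(actual, lower, upper))
    return hits / len(actual)


# Toy values: three of the four actuals fall inside their interval.
actual = [1.0, 2.5, 4.0, 0.5]
lower = [0.5, 2.0, 4.5, 0.0]
upper = [1.5, 3.0, 5.5, 1.0]
coverage = interval_coverage(actual, lower, upper)
```

With quantiles [0.1, 0.5, 0.9] and the default coverage_quantile_indices of (0, -1), the lower and upper bounds would be the 0.1 and 0.9 quantiles, so a well-calibrated model would be expected to give a coverage near 0.8.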
For quantile forecasts, prediction columns are named using the pattern <target_name>_q<quantile*100>, e.g., subsidence_q5, subsidence_q50, subsidence_q95.
For point forecasts, the column is named <target_name>_pred.
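The naming pattern can be reproduced with a small helper, which is handy when selecting columns from the returned DataFrame programmatically. This is a sketch under the assumption that the quantile level is multiplied by 100 and rounded to an integer; prediction_column is a hypothetical name, and how the library labels levels such as 0.025 is not stated here.

```python
def prediction_column(target, quantile=None):
    """Build a prediction column name following the documented pattern.

    Hypothetical helper; rounding to an integer percent is an assumption.
    """
    if quantile is None:
        # Point forecast column.
        return f"{target}_pred"
    # Quantile forecast column, e.g. 0.05 -> "q5", 0.5 -> "q50".
    return f"{target}_q{int(round(quantile * 100))}"


names = [prediction_column("subsidence", q) for q in (0.05, 0.5, 0.95)]
point_name = prediction_column("gwl")
```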
See also
fusionlab.plot.forecast.plot_forecasts – A utility for visualizing the DataFrame produced by this function.