fusionlab.nn.utils.format_pihalnet_predictions
- fusionlab.nn.utils.format_pihalnet_predictions(pihalnet_outputs=None, model=None, model_inputs=None, y_true_dict=None, target_mapping=None, include_gwl=True, include_coords=True, quantiles=None, forecast_horizon=None, output_dims=None, ids_data_array=None, ids_cols=None, ids_cols_indices=None, scaler_info=None, coord_scaler=None, evaluate_coverage=False, coverage_quantile_indices=(0, -1), savefile=None, name=None, model_name=None, apply_mask=False, mask_values=None, mask_fill_value=None, verbose=0, _logger=None, stop_check=None, **kwargs)
Formats PIHALNet/GeoPriorSubsNet predictions into a structured pandas DataFrame, handling inversion, quantiles, and coordinates.
This function is the core formatter. It:
1. Gets model outputs by calling the model on model_inputs, or uses pihalnet_outputs when provided.
2. Unpacks 'data_final' if model_name is 'geoprior' or 'geopriorsubsnet'.
3. Inverse-transforms all prediction and actual arrays using scaler_info.
4. Builds a long-format DataFrame with sample_idx and forecast_step.
5. Appends inverted quantile/point predictions.
6. Appends inverted actual values (requires y_true_dict).
7. Appends inverted coordinates (when include_coords is True).
8. Appends static/ID columns (when ids_data_array is provided).
9. Evaluates coverage on the inverted data (when evaluate_coverage is True).
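The long-format layout from steps 4 and 5 can be pictured with a small plain-Python sketch. This is not the library's implementation: the helper `to_long_rows`, the `_pred` suffix for point forecasts, and the 1-based `forecast_step` are assumptions made for illustration. Only the `sample_idx`, `forecast_step`, and `subsidence_q10`-style names come from the description on this page.

```python
def to_long_rows(preds, base_name, quantiles=None):
    """Flatten nested predictions into one row per (sample, step).

    preds is a nested list shaped (B, H, Q) when quantiles are given,
    or (B, H) for point forecasts. Hypothetical helper, for illustration.
    """
    rows = []
    for sample_idx, per_sample in enumerate(preds):
        for step_idx, value in enumerate(per_sample):
            # Assumption: forecast steps are numbered from 1.
            row = {"sample_idx": sample_idx, "forecast_step": step_idx + 1}
            if quantiles is None:
                # Assumption: point forecasts get a "_pred" suffix.
                row[f"{base_name}_pred"] = value
            else:
                # One column per quantile, e.g. 0.1 -> "subsidence_q10".
                for q, v in zip(quantiles, value):
                    row[f"{base_name}_q{int(round(q * 100))}"] = v
            rows.append(row)
    return rows


# Two samples (B=2), two forecast steps (H=2), three quantiles (Q=3).
preds = [
    [[0.1, 0.2, 0.3], [0.2, 0.3, 0.4]],
    [[1.0, 1.1, 1.2], [1.1, 1.2, 1.3]],
]
rows = to_long_rows(preds, "subsidence", quantiles=[0.1, 0.5, 0.9])
for row in rows:
    print(row)
```

A list of dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` to obtain a table with one row per sample and forecast step.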
- Parameters:
pihalnet_outputs (dict, optional) – Raw output from model.predict(). If None, model and model_inputs must be provided.
model (tf.keras.Model, optional) – Trained model instance (if pihalnet_outputs is None).
model_inputs (dict, optional) – Inputs for the model to generate predictions (if pihalnet_outputs is None).
y_true_dict (dict, optional) – Dictionary of true target arrays (e.g., {'subs_pred': y_true_s}). Required for including actuals and evaluating coverage.
target_mapping (dict, optional) – Maps prediction keys to base names for DataFrame columns. Default: {'subs_pred': 'subsidence', 'gwl_pred': 'gwl'}.
include_gwl (bool, default True) – Whether to include 'gwl_pred' in the final DataFrame.
include_coords (bool, default True) – Whether to include 'coord_t', 'coord_x', 'coord_y' columns.
quantiles (list[float], optional) – List of quantiles (e.g., [0.1, 0.5, 0.9]). If provided, quantile columns (e.g., 'subsidence_q10') are created.
forecast_horizon (int, optional) – The forecast horizon length (H). If not provided, it is inferred from the prediction array's shape.
output_dims (dict, optional) – Maps prediction keys to their output dimension (O), e.g., {'subs_pred': 1, 'gwl_pred': 1}. Crucial for correctly splitting GeoPrior outputs and reshaping.
ids_data_array (np.ndarray or pd.DataFrame, optional) – Static/ID data (e.g., original coordinates) to merge. Must have the same number of samples (B) as predictions.
ids_cols (list[str], optional) – Column names if ids_data_array is a DataFrame.
ids_cols_indices (list[int], optional) – Column indices if ids_data_array is a NumPy array.
scaler_info (dict, optional) – Dictionary for inverse scaling, structured as:
{'subsidence': {'scaler': scaler_obj, 'idx': 0, 'all_features': […]},
 'gwl': {'scaler': scaler_obj, 'idx': 1, 'all_features': […]}}
coord_scaler (sklearn.preprocessing.Scaler, optional) – A fitted scaler object for inverse transforming the 'coords' tensor.
evaluate_coverage (bool, default False) – If True, calculates coverage percentage for quantiles.
coverage_quantile_indices (tuple[int, int], default (0, -1)) – Indices of the low and high quantiles in the quantiles list to use for coverage (e.g., 0 and -1 for the 10th and 90th).
savefile (str, optional) – If provided, saves the final DataFrame to this path.
model_name (str, optional) – Specifies the model type. If 'geoprior' or 'geopriorsubsnet', triggers unpacking of the 'data_final' output.
apply_mask (bool, default False) – If True, masks predictions based on mask_values in the first target's _actual column.
mask_values (float or int, optional) – The value in the _actual column to trigger masking.
mask_fill_value (float, optional) – The value to replace masked predictions with (e.g., np.nan).
verbose (int, default 0) – Logging verbosity.
_logger (logging.Logger or callable, optional) – Logger object.
stop_check (callable, optional) – Function to check for early stopping.
name (str | None)
- Returns:
A long-format DataFrame with predictions, actuals, and coordinates.
- Return type:
pd.DataFrame
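To make the evaluate_coverage, coverage_quantile_indices, apply_mask, mask_values, and mask_fill_value parameters more concrete, here is a minimal plain-Python sketch of what interval coverage and masking mean on long-format rows. It is an illustration under stated assumptions, not the function's actual code: the helpers `interval_coverage` and `mask_rows` are hypothetical, coverage is returned as a fraction rather than a percentage, and the interval is treated as closed at both ends.

```python
import math


def interval_coverage(rows, base_name, quantiles, indices=(0, -1)):
    """Fraction of rows whose actual value lies inside the chosen interval."""
    # Pick the low/high quantiles by position, mirroring
    # coverage_quantile_indices=(0, -1).
    lo_col = f"{base_name}_q{int(round(quantiles[indices[0]] * 100))}"
    hi_col = f"{base_name}_q{int(round(quantiles[indices[1]] * 100))}"
    actual_col = f"{base_name}_actual"
    # Assumption: bounds are inclusive.
    hits = [r[lo_col] <= r[actual_col] <= r[hi_col] for r in rows]
    return sum(hits) / len(hits)


def mask_rows(rows, actual_col, pred_cols, mask_value, fill_value=math.nan):
    """Replace predictions with fill_value where the actual equals mask_value."""
    masked = []
    for r in rows:
        new_row = dict(r)  # copy so the input rows stay untouched
        if r[actual_col] == mask_value:
            for col in pred_cols:
                new_row[col] = fill_value
        masked.append(new_row)
    return masked


# Made-up values, purely for illustration.
rows = [
    {"subsidence_q10": 1.0, "subsidence_q90": 3.0, "subsidence_actual": 2.0},
    {"subsidence_q10": 1.0, "subsidence_q90": 3.0, "subsidence_actual": 3.5},
    {"subsidence_q10": 0.0, "subsidence_q90": 1.0, "subsidence_actual": 0.0},
    {"subsidence_q10": 2.0, "subsidence_q90": 4.0, "subsidence_actual": 2.5},
]

coverage = interval_coverage(rows, "subsidence", quantiles=[0.1, 0.5, 0.9])
masked = mask_rows(
    rows,
    actual_col="subsidence_actual",
    pred_cols=["subsidence_q10", "subsidence_q90"],
    mask_value=0.0,
)
print(coverage)
print(masked[2])
```

Multiply the returned fraction by 100 to express it as a coverage percentage. Whether the library computes coverage before or after masking is not stated on this page, so the two helpers are kept independent here.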