postprocessing

This sub-package consists of modules that handle the output of a Model after it has been trained, i.e. after its .fit method has been called.

Please note that the SeqMetrics sub-module has been deprecated. Please use the SeqMetrics library instead.

ProcessPredictions

class ai4water.postprocessing.ProcessPredictions(mode: str, forecast_len: Optional[int] = None, output_features: Optional[Union[str, list]] = None, is_multiclass: Optional[bool] = None, is_binary: Optional[bool] = None, is_multilabel: Optional[bool] = None, wandb_config: Optional[dict] = None, path: Optional[str] = None, dpi: int = 300, show=1, save: bool = True, plots: Optional[Union[str, list]] = None)[source]

Bases: Plot

Post-processing of results after training.

__init__(mode: str, forecast_len: Optional[int] = None, output_features: Optional[Union[str, list]] = None, is_multiclass: Optional[bool] = None, is_binary: Optional[bool] = None, is_multilabel: Optional[bool] = None, wandb_config: Optional[dict] = None, path: Optional[str] = None, dpi: int = 300, show=1, save: bool = True, plots: Optional[Union[str, list]] = None)[source]
Parameters
  • mode (str) – either “regression” or “classification”

  • forecast_len (int) – forecast length, only valid when mode is regression

  • output_features (str or list, optional) – names of output features

  • is_binary (bool, optional (default=None)) – whether the results correspond to a binary classification problem.

  • is_multiclass (bool) – whether the results correspond to a multiclass classification problem. Only valid if mode is classification.

  • is_multilabel (bool, optional (default=None)) – whether the results correspond to a multilabel classification problem. Only valid if mode is classification.

  • plots (str, list) –

    the names of plots to draw. The following plots are available:

    residual, regression, prediction, errors, fdc, murphy, edf

  • path (str) – folder in which to save the results/plots

  • show (bool) – whether to show the plots or not

  • save (bool) – whether to save the plots or not

  • wandb_config (dict, optional) – Weights & Biases (wandb) configuration dictionary

  • dpi (int) – determines resolution of saved figure

Examples

>>> import numpy as np
>>> from ai4water.postprocessing import ProcessPredictions
>>> true = np.random.random(100)
>>> predicted = np.random.random(100)
>>> processor = ProcessPredictions("regression", plots=['prediction', 'regression', 'residual', 'murphy'])
>>> processor(true, predicted)

# for classification

>>> true = np.random.randint(0, 2, (100, 1))
>>> predicted = np.random.randint(0, 2, (100, 1))
>>> processor = ProcessPredictions("classification", is_binary=True)
>>> processor(true, predicted)
__call__(true_outputs, predicted, metrics='minimal', prefix='test', index=None, inputs=None)[source]

Call self as a function.

available_plots = ['regression', 'prediction', 'residual', 'murphy', 'fdc', 'errors', 'edf']
average_target_across_feature(true, predicted, feature)[source]
classes(array)[source]
confusion_matrx(true, predicted, **kwargs)[source]
edf_plot(true, predicted, prefix, where, **kwargs)[source]

Cumulative distribution function of the absolute error between true and predicted.
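
The idea behind such a plot can be sketched in plain Python. This is only an illustration of an empirical distribution function of absolute errors, not the code used by edf_plot; the helper name empirical_cdf and the sample values are assumptions made for this example.

```python
def empirical_cdf(true, predicted):
    """Return (sorted absolute errors, cumulative probabilities)."""
    if len(true) != len(predicted):
        raise ValueError("true and predicted must have the same length")
    # absolute error for every example, sorted in ascending order
    abs_err = sorted(abs(t - p) for t, p in zip(true, predicted))
    n = len(abs_err)
    # the i-th smallest error (1-based) has cumulative probability i / n
    probs = [(i + 1) / n for i in range(n)]
    return abs_err, probs


# hypothetical sample values
true = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.25]
errors, probs = empirical_cdf(true, predicted)
print(errors)  # [0.0, 0.25, 0.5, 1.0]
print(probs)   # [0.25, 0.5, 0.75, 1.0]
```

Plotting errors on the x-axis against probs on the y-axis gives the curve: a point (e, p) reads as "a fraction p of the predictions have an absolute error of at most e".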

errors_plot(true, predicted, prefix, where, **kwargs)[source]
fdc_plot(true, predicted, prefix, where, **kwargs)[source]
horizon_plots(errors: dict, fname='', save=True)[source]
maybe_not_3d_data(true, predicted)[source]
murphy_plot(true, predicted, prefix, where, inputs, **kwargs)[source]
n_classes(array)[source]
plot_all_qs(true_outputs, predicted, save=False)[source]
plot_loss(history: dict, name='loss_curve')[source]

Assuming history is a dictionary of arrays, possibly including training and validation loss arrays, this method plots those arrays.
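
As a rough illustration of the kind of dictionary meant here, the sketch below builds a history shaped like the dict Keras stores in History.history and pairs each training series with its validation counterpart. The key names ('loss', 'val_loss'), the values, and the helper pair_train_val are assumptions for this example; plot_loss itself may organise the arrays differently.

```python
def pair_train_val(history):
    """Group each training series with its 'val_' counterpart, if present."""
    pairs = {}
    for key, values in history.items():
        if key.startswith("val_"):
            continue  # handled together with its training series below
        pairs[key] = {"train": list(values)}
        val_key = "val_" + key
        if val_key in history:
            pairs[key]["val"] = list(history[val_key])
    return pairs


# hypothetical history: one value per epoch for each series
history = {
    "loss": [0.9, 0.5, 0.3],
    "val_loss": [1.0, 0.6, 0.45],
}
pairs = pair_train_val(history)
epochs = list(range(1, len(history["loss"]) + 1))
print(epochs)         # [1, 2, 3]
print(pairs["loss"])  # {'train': [0.9, 0.5, 0.3], 'val': [1.0, 0.6, 0.45]}
```

Each entry of pairs would then correspond to one set of curves drawn against epochs.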

plot_quantile(true_outputs, predicted, min_q: int, max_q, st=0, en=None, save=False)[source]
plot_quantiles1(true_outputs, predicted, st=0, en=None, save=True)[source]
plot_quantiles2(true_outputs, predicted, st=0, en=None, save=True)[source]
plot_results(true, predicted: DataFrame, prefix, where, inputs=None)[source]
kwargs can be any or all of the following:

  • fillstyle
  • marker
  • linestyle
  • markersize
  • color

precision_recall_curve(estimator, x, y)[source]
prediction_distribution_across_feature(true, predicted, feature)[source]
prediction_plot(true, predicted, prefix, where)[source]
process_binary(true, predicted, metrics, prefix, index)[source]
process_cls_results(true: ndarray, predicted: ndarray, metrics='minimal', prefix=None, index=None, inputs=None)[source]

Post-processes classification results.

process_multiclass(true, predicted, metrics, prefix, index)[source]
process_multilabel(true, predicted, metrics, prefix, index)[source]
process_rgr_results(true: ndarray, predicted: ndarray, metrics='minimal', prefix=None, index=None, remove_nans=True, inputs=None)[source]

predicted and true are arrays of shape (examples, outs, forecast_len).
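
The layout (examples, outs, forecast_len) can be pictured with a short NumPy sketch. The sizes and random data below are assumptions chosen for illustration, and the loop only shows how one series per output feature and forecast step can be sliced out; it does not reproduce what process_rgr_results does internally.

```python
import numpy as np

examples, outs, forecast_len = 100, 2, 3  # illustrative sizes only

rng = np.random.default_rng(0)
true = rng.random((examples, outs, forecast_len))
predicted = rng.random((examples, outs, forecast_len))

# one (true, predicted) pair of 1D series per output feature and forecast step
for out in range(outs):
    for step in range(forecast_len):
        t = true[:, out, step]       # shape (examples,)
        p = predicted[:, out, step]  # shape (examples,)
        assert t.shape == p.shape == (examples,)

# a single-output, single-step series expressed in the same layout
flat = rng.random(examples)
as_3d = flat.reshape(-1, 1, 1)
print(as_3d.shape)  # (100, 1, 1)
```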

property quantiles
regression_plot(true, predicted, target_name, where, annotate_with='r2')[source]
residual_plot(true, predicted, prefix, where, **kwargs)[source]
roc_curve(estimator, x, y)[source]
save_or_show(show=None, **kwargs)[source]