Performance Metrics

Note: the SeqMetrics sub-module is deprecated. Use the standalone SeqMetrics library instead.

SeqMetrics

class ai4water.postprocessing.SeqMetrics.Metrics(true: Union[ndarray, list], predicted: Union[ndarray, list], replace_nan: Optional[Union[int, float]] = None, replace_inf: Optional[Union[int, float]] = None, remove_zero: bool = False, remove_neg: bool = False, metric_type: str = 'regression')[source]

Bases: object

__init__(true: Union[ndarray, list], predicted: Union[ndarray, list], replace_nan: Optional[Union[int, float]] = None, replace_inf: Optional[Union[int, float]] = None, remove_zero: bool = False, remove_neg: bool = False, metric_type: str = 'regression')[source]

Parameters:

true – array like, true/observed/actual/target values
predicted – array like, simulated values
replace_nan – default None. If not None, NaNs in true and predicted are replaced by this value.
replace_inf – default None. If not None, inf values in true and predicted are replaced by this value.
remove_zero – default False. If True, zero values in the true or predicted arrays are removed. If a zero is found in one array, the corresponding value in the other array is also removed.
remove_neg – default False. If True, negative values in the true or predicted arrays are removed.
metric_type – type of metric, 'regression' by default.

property assert_greater_than_one

calculate_all(statistics=False, verbose=False, write=False, name=None) → dict[source]: Calculates errors using all available methods except brier_score. write: bool, if True, the calculated errors are written to a file. name: str, if not None, it must be the path of the file to write to.

calculate_minimal() → dict[source]

Calculates some basic metrics.

Returns:: Dictionary with all metrics
Return type:: dict

calculate_scale_dependent_metrics() → dict[source]

Calculates scale dependent metrics

Returns:: Dictionary with all metrics
Return type:: dict

calculate_scale_independent_metrics() → dict[source]

Calculates scale independent metrics

Returns:: Dictionary with all metrics
Return type:: dict

composite_metrics()[source]

mse(weights=None) → float[source]: mean square error

percentage_metrics()[source]

relative_metrics()[source]

property remove_neg

property remove_zero

property replace_inf

property replace_nan

scale_dependent_metrics()[source]

stats(verbose: bool = False) → dict[source]: returns some important stats about true and predicted values.

treat_values()[source]: This function is applied by default at the start/at the time of initiating the class. However, it can used any time after that. This can be handy if we want to calculate error first by ignoring nan and then by no ignoring nan. Adopting from https://github.com/BYU-Hydroinformatics/HydroErr/blob/master/HydroErr/HydroErr.py#L6210 Removes the nan, negative, and inf values in two numpy arrays

RegressionMetrics

class ai4water.postprocessing.SeqMetrics.RegressionMetrics(*args, **kwargs)[source]

Bases: Metrics

Calculates more than 100 regression performance metrics related to sequence data.

Example

>>> import numpy as np
>>> from ai4water.postprocessing.SeqMetrics import RegressionMetrics
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> errors = RegressionMetrics(t, p)
>>> all_errors = errors.calculate_all()

__init__(*args, **kwargs)[source]

Initializes Metrics.

args and kwargs go to the parent class Metrics (ai4water.postprocessing.SeqMetrics.Metrics).

JS() → float[source]: Jensen-Shannon divergence

abs_pbias() → float[source]: Absolute Percent bias

acc() → float[source]: Anomaly correction coefficient. Reference:

[Langland et al., 2012](https://doi.org/10.3402/tellusa.v64i0.17531). Miyakoda et al., 1972. Murphy et al., 1989.

adjusted_r2() → float[source]: Adjusted R squared.

agreement_index() → float[source]

Agreement Index (d) developed by [Willmott, 1981](https://doi.org/10.1080/02723646.1981.10642213).

It detects additive and proportional differences in the observed and simulated means and variances [Moriasi et al., 2015](https://doi.org/10.13031/trans.58.10715). It is overly sensitive to extreme values due to the squared differences [2]. It can also be used as a substitute for R2 to identify the degree to which model predictions are error-free [2].

\[
d = 1 - \frac{\sum_{i=1}^{N}(e_{i} - s_{i})^2}{\sum_{i=1}^{N}\left(\left| s_{i} - \bar{e} \right| + \left| e_{i} - \bar{e} \right|\right)^2}
\]

[2] Legates and McCabe, 199
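As a rough illustration of the agreement index formula, the sketch below evaluates d in plain Python. It assumes e denotes the true values and s the predicted values; it is not the library's code.

```python
def agreement_index(true, predicted):
    """Index of agreement d: 1 minus squared error over 'potential' error."""
    mean_true = sum(true) / len(true)
    squared_error = sum((t - p) ** 2 for t, p in zip(true, predicted))
    potential_error = sum(
        (abs(p - mean_true) + abs(t - mean_true)) ** 2
        for t, p in zip(true, predicted)
    )
    return 1 - squared_error / potential_error


observed = [1.0, 2.0, 3.0, 4.0]
perfect = agreement_index(observed, observed)
imperfect = agreement_index(observed, [2.0, 2.0, 3.0, 3.0])
print(perfect, imperfect)
```

A perfect prediction gives d = 1, and the value drops as the squared differences grow.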

aic(p=1) → float[source]: [Akaike’s Information Criterion](https://doi.org/10.1007/978-1-4612-1694-0_15) Modifying from https://github.com/UBC-MDS/RegscorePy/blob/master/RegscorePy/aic.py

aitchison(center='mean') → float[source]: Aitchison distance. used in [Zhang et al., 2020](https://doi.org/10.5194/hess-24-2505-2020)

amemiya_adj_r2() → float[source]: Amemiya’s Adjusted R-squared

amemiya_pred_criterion() → float[source]: Amemiya’s Prediction Criterion

bias() → float[source]: Bias as shown in https://doi.org/10.1029/97WR03495 and given by [Gupta et al., 1998](https://doi.org/10.1080/02626667.2018.1552002

\[
Bias = \frac{1}{N}\sum_{i=1}^{N}(e_{i} - s_{i})
\]

bic(p=1) → float[source]

Bayesian Information Criterion

Minimising the BIC is intended to give the best model. The model chosen by the BIC is either the same as that chosen by the AIC, or one with fewer terms. This is because the BIC penalises the number of parameters more heavily than the AIC [1]. Modified after https://github.com/UBC-MDS/RegscorePy/blob/master/RegscorePy/bic.py [1]: https://otexts.com/fpp2/selecting-predictors.html#schwarzs-bayesian-information-criterion

brier_score() → float[source]

Adopted from https://github.com/PeterRochford/SkillMetrics/blob/master/skill_metrics/brier_score.py Calculates the Brier score (BS), a measure of the mean-square error of probability forecasts for a dichotomous (two-category) event, such as the occurrence/non-occurrence of precipitation. The score is calculated using the formula: BS = sum_(n=1)^N (f_n - o_n)^2/N

where f is the forecast probabilities, o is the observed probabilities (0 or 1), and N is the total number of values in f & o. Note that f & o must have the same number of values, and those values must be in the range [0,1]. https://data.library.virginia.edu/a-brief-on-brier-scores/

Output: BS : Brier score

Reference: Glenn W. Brier, 1950: Verification of forecasts expressed in terms of probabilities. Mon. Wea. Rev., 78, 1-23. D. S. Wilks, 1995: Statistical Methods in the Atmospheric Sciences. Cambridge Press. 547 pp.
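The Brier score formula can be sketched in a few lines of plain Python. The function below is illustrative only and mirrors the stated requirements (equal lengths, values in [0, 1]); the sample data are made up.

```python
def brier_score(forecast, observed):
    """Mean squared difference between forecast probabilities and outcomes."""
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    if any(not 0.0 <= v <= 1.0 for v in list(forecast) + list(observed)):
        raise ValueError("all values must lie in [0, 1]")
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(forecast)


rain_forecast = [0.9, 0.1, 0.8, 0.3]  # forecast probabilities of rain
rain_observed = [1, 0, 1, 0]          # 1 = rain occurred, 0 = no rain
score = brier_score(rain_forecast, rain_observed)
print(score)
```

Lower is better: a score of 0 means every forecast probability matched the outcome exactly.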

calculate_hydro_metrics()[source]

Calculates all metrics for hydrological data.

Returns:: Dictionary with all metrics
Return type:: dict

centered_rms_dev() → float[source]

Modified after https://github.com/PeterRochford/SkillMetrics/blob/master/skill_metrics/centered_rms_dev.py Calculates the centered root-mean-square (RMS) difference between true and predicted using the formula: (E’)^2 = sum_(n=1)^N [(p_n - mean(p)) - (r_n - mean(r))]^2/N, where p is the predicted values, r is the true values, and N is the total number of values in p & r.

Output: CRMSDIFF : centered root-mean-square (RMS) difference (E’)^2
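A minimal sketch of the centered RMS difference, assuming the usual definition in which each series has its own mean removed before the difference is taken. It is written in plain Python and is not the library's code.

```python
import math


def centered_rms_dev(true, predicted):
    """RMS difference after removing each series' own mean."""
    mean_t = sum(true) / len(true)
    mean_p = sum(predicted) / len(predicted)
    total = sum(
        ((p - mean_p) - (t - mean_t)) ** 2 for t, p in zip(true, predicted)
    )
    return math.sqrt(total / len(true))


def rmse(true, predicted):
    """Plain root-mean-square error, for comparison."""
    return math.sqrt(
        sum((p - t) ** 2 for t, p in zip(true, predicted)) / len(true)
    )


true = [1.0, 2.0, 3.0, 4.0]
shifted = [t + 3.0 for t in true]  # same pattern, constant offset of 3
centered = centered_rms_dev(true, shifted)
plain = rmse(true, shifted)
print(centered, plain)
```

A constant offset leaves the centered value at zero while the plain RMSE reports the full offset, so the centered measure ignores mean bias.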

corr_coeff() → float[source]

Pearson correlation coefficient. It measures linear correlation between true and predicted arrays. It is sensitive to outliers. Reference: Pearson, K 1895.

\[
r = \frac{\sum^n_{i=1}(e_i - \bar{e})(s_i - \bar{s})}{\sqrt{\sum^n_{i=1}(e_i - \bar{e})^2} \sqrt{\sum^n_{i=1}(s_i - \bar{s})^2}}
\]

cosine_similarity() → float[source]

It is a judgment of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors oriented at 90° relative to each other have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude.
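The three cases mentioned for cosine similarity (same orientation, right angle, opposite) can be checked with a small plain-Python sketch; the vectors are arbitrary examples.

```python
import math


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([3.0, 4.0], [6.0, 8.0])   # different magnitude
right_angle = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([3.0, 4.0], [-3.0, -4.0])
print(same_direction, right_angle, opposite)
```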

covariance() → float[source]

Covariance:

\[
Covariance = \frac{1}{N} \sum_{i=1}^{N}\left((e_{i} - \bar{e}) (s_{i} - \bar{s})\right)
\]

cronbach_alpha() → float[source]: It is a measure of internal consistency of data. https://stats.idre.ucla.edu/spss/faq/what-does-cronbachs-alpha-mean/ https://stackoverflow.com/a/20799687/5982232

decomposed_mse() → float[source]

Decomposed MSE developed by Kobayashi and Salam (2000):

\[
dMSE = \left(\frac{1}{N}\sum_{i=1}^{N}(e_{i} - s_{i})\right)^2 + SDSD + LCS
\]

\[
SDSD = (\sigma(e) - \sigma(s))^2
\]

\[
LCS = 2 \sigma(e) \sigma(s) \left(1 - \frac{\sum^n_{i=1}(e_i - \bar{e})(s_i - \bar{s})}{\sqrt{\sum^n_{i=1}(e_i - \bar{e})^2} \sqrt{\sum^n_{i=1}(s_i - \bar{s})^2}}\right)
\]
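The three terms of the decomposition can be computed separately. The sketch below assumes population standard deviations (division by N), under which the squared bias, SDSD and LCS add up to the ordinary MSE. It is an illustration, not the library's implementation.

```python
import math


def decomposed_mse(true, predicted):
    """Return (squared bias, SDSD, LCS) using population statistics."""
    n = len(true)
    mean_e = sum(true) / n
    mean_s = sum(predicted) / n
    sd_e = math.sqrt(sum((e - mean_e) ** 2 for e in true) / n)
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in predicted) / n)
    cov = sum((e - mean_e) * (s - mean_s) for e, s in zip(true, predicted)) / n
    r = cov / (sd_e * sd_s)  # Pearson correlation coefficient

    squared_bias = (sum(e - s for e, s in zip(true, predicted)) / n) ** 2
    sdsd = (sd_e - sd_s) ** 2
    lcs = 2 * sd_e * sd_s * (1 - r)
    return squared_bias, sdsd, lcs


true = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 4.0, 6.0]
parts = decomposed_mse(true, predicted)
mse = sum((e - s) ** 2 for e, s in zip(true, predicted)) / len(true)
print(parts, sum(parts), mse)
```

Splitting the error this way shows how much of the MSE comes from mean bias, from a mismatch in variability, and from lack of correlation.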

euclid_distance() → float[source]

Euclidean distance

References: Kennard et al., 2010

exp_var_score(weights=None) → Optional[float][source]: Explained variance score. The best value is 1; lower values are less accurate. See https://stackoverflow.com/questions/24378176/python-sci-kit-learn-metrics-difference-between-r2-score-and-explained-varian

expanded_uncertainty(cov_fact=1.96) → float[source]

By default it calculates uncertainty with a 95% confidence interval. 1.96 is the coverage factor corresponding to a 95% confidence level [2]. This indicator is used in order to show more information about the model deviation [2].

Uses the formula from [1] and [2]. [1] https://doi.org/10.1016/j.enconman.2015.03.067 [2] https://doi.org/10.1016/j.rser.2014.07.117

fdc_fhv(h: float = 0.02) → float[source]

Peak flow bias of the flow duration curve (Yilmaz 2008). Modified after: https://github.com/kratzert/ealstm_regional_modeling/blob/64a446e9012ecd601e0a9680246d3bbf3f002f6d/papercode/metrics.py#L190 Used in Kratzert et al., 2018.

Returns:: Bias of the peak flows
Return type:: float

Raises:: RuntimeError – If h is not in range(0,1)

fdc_flv(low_flow: float = 0.3) → float[source]

Bias of the bottom 30 % low flows. Modified after: https://github.com/kratzert/ealstm_regional_modeling/blob/64a446e9012ecd601e0a9680246d3bbf3f002f6d/papercode/metrics.py#L237 Used in Kratzert et al., 2018.

Parameters:

low_flow – upper limit of the flow duration curve. E.g. 0.3 means the bottom 30% of the flows are considered as low flows, by default 0.3

Returns:: Bias of the low flows.
Return type:: float
Raises:: RuntimeError – If low_flow is not in the range(0,1)

gmae() → float[source]: Geometric Mean Absolute Error

gmean_diff() → float[source]: Geometric mean difference. First geometric mean is calculated for each of two samples and their difference is calculated.

gmrae(benchmark: Optional[ndarray] = None) → float[source]: Geometric Mean Relative Absolute Error

inrse() → float[source]: Integral Normalized Root Squared Error

irmse() → float[source]: Inertial RMSE. RMSE divided by standard deviation of the gradient of true.

kendaull_tau(return_p=False) → Union[float, tuple][source]: Kendall’s tau https://machinelearningmastery.com/how-to-calculate-nonparametric-rank-correlation-in-python/ used in https://www.jmlr.org/papers/volume20/18-444/18-444.pdf

kge(return_all=False)[source]

Kling-Gupta Efficiency. Reference: Gupta, Kling, Yilmaz, Martinez, 2009, Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling.

output:: kge: Kling-Gupta Efficiency, cc: correlation, alpha: ratio of the standard deviation, beta: ratio of the mean
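The four outputs listed for kge can be related through the commonly cited 2009 formulation, KGE = 1 - sqrt((cc - 1)^2 + (alpha - 1)^2 + (beta - 1)^2). The sketch below assumes that formulation, with the ratios taken as predicted over true and population standard deviations. The function name `kge_components` is hypothetical and this is not the library's code.

```python
import math


def kge_components(true, predicted):
    """Return (kge, cc, alpha, beta) for two equally long sequences."""
    n = len(true)
    mean_e = sum(true) / n
    mean_s = sum(predicted) / n
    sd_e = math.sqrt(sum((e - mean_e) ** 2 for e in true) / n)
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in predicted) / n)
    cov = sum((e - mean_e) * (s - mean_s) for e, s in zip(true, predicted)) / n

    cc = cov / (sd_e * sd_s)  # linear correlation
    alpha = sd_s / sd_e       # ratio of the standard deviations
    beta = mean_s / mean_e    # ratio of the means
    kge = 1 - math.sqrt((cc - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return kge, cc, alpha, beta


true = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0 * t for t in true]  # perfectly correlated, but twice too large
kge, cc, alpha, beta = kge_components(true, doubled)
print(kge, cc, alpha, beta)
```

The doubled prediction keeps cc at 1 while alpha and beta both equal 2, so the poor score is traced to variability and mean rather than to timing.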

kge_bound() → float[source]: Bounded Version of the Original Kling-Gupta Efficiency https://iahs.info/uploads/dms/13614.21–211-219-41-MATHEVET.pdf

kge_mod(return_all=False)[source]: Modified Kling-Gupta Efficiency (Kling et al. 2012 - https://doi.org/10.1016/j.jhydrol.2012.01.011)

kge_np(return_all=False)[source]

Non-parametric Kling-Gupta Efficiency. Corresponding paper: Pool, Vis, and Seibert, 2018, Evaluating model performance: towards a non-parametric variant of the Kling-Gupta efficiency, Hydrological Sciences Journal. https://doi.org/10.1080/02626667.2018.1552002

output:: kge: Kling-Gupta Efficiency, cc: correlation, alpha: ratio of the standard deviation, beta: ratio of the mean

kgenp_bound()[source]: Bounded Version of the Non-Parametric Kling-Gupta Efficiency

kgeprime_c2m() → float[source]

Bounded Version of the Modified Kling-Gupta Efficiency. https://iahs.info/uploads/dms/13614.21–211-219-41-MATHEVET.pdf

kl_sym() → Optional[float][source]: Symmetric Kullback-Leibler divergence

lm_index(obs_bar_p=None) → float[source]: Legates-McCabe Efficiency Index. Less sensitive to outliers in the data. obs_bar_p: float, seasonal or other selected average. If None, the mean of the observed array will be used.

log_nse(epsilon=0.0) → float[source]

log Nash-Sutcliffe model efficiency:

\[
NSE = 1 - \frac{\sum_{i=1}^{N}(\log(e_{i}) - \log(s_{i}))^2}{\sum_{i=1}^{N}(\log(e_{i}) - \log(\bar{e}))^2}
\]

log_prob() → float[source]: Logarithmic probability distribution

maape() → float[source]: Mean Arctangent Absolute Percentage Error Note: result is NOT multiplied by 100

mae(true=None, predicted=None) → float[source]: Mean Absolute Error

mapd() → float[source]: Mean absolute percentage deviation.

mape() → float[source]

Mean Absolute Percentage Error. The MAPE is often used when the quantity to predict is known to remain well above zero [1]. It is useful when the size of a prediction variable is significant in evaluating the accuracy of a prediction [2]. It has the advantages of scale-independency and interpretability [3]. However, it has the significant disadvantage that it produces infinite or undefined values for zero or close-to-zero actual values [3].

[1] https://doi.org/10.1016/j.neucom.2015.12.114 [2] https://doi.org/10.1088/1742-6596/930/1/012002 [3] https://doi.org/10.1016/j.ijforecast.2015.12.003
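The close-to-zero problem noted for MAPE is easy to reproduce. The sketch below uses the common definition (mean of |true - predicted| / |true|, times 100) in plain Python with made-up numbers; it is not the library's code.

```python
def mape(true, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(true, predicted)) / len(true)


well_above_zero = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
near_zero = mape([0.001], [1.0])  # absolute error is below 1, yet MAPE explodes
print(well_above_zero, near_zero)
```

With an actual value of exactly zero this plain-Python version raises ZeroDivisionError, which corresponds to the undefined case described above.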

mare() → float[source]: Mean Absolute Relative Error. When expressed as a percentage, it is also known as MAPE. [1] https://doi.org/10.1016/j.rser.2015.08.035

mase(seasonality: int = 1)[source]: Mean Absolute Scaled Error. The baseline (benchmark) is computed with naive forecasting (shifted by seasonality). Modified after https://gist.github.com/bshishov/5dc237f59f019b26145648e2124ca1c9 Reference: Hyndman, R. J. (2006). Another look at forecast-accuracy metrics for intermittent demand. Foresight: The International Journal of Applied Forecasting, 4(4), 43-46.
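A rough sketch of the scaling idea behind MASE: the model's MAE is divided by the MAE of a naive forecast that repeats the value observed `seasonality` steps earlier. This plain-Python version is an assumption about the general technique, not a copy of the library's code.

```python
def mae(a, b):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mase(true, predicted, seasonality=1):
    """Model MAE scaled by the MAE of a naive shifted forecast."""
    naive_forecast = true[:-seasonality]  # value from `seasonality` steps back
    return mae(true, predicted) / mae(true[seasonality:], naive_forecast)


true = [1.0, 3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 3.0, 6.0, 7.0, 10.0]
score = mase(true, predicted)
print(score)
```

Values below 1 mean the model beats the naive baseline; values above 1 mean it does worse.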

max_error() → float[source]: maximum error

mb_r() → float[source]: Mielke-Berry R value. Berry and Mielke, 1988. Mielke, P. W., & Berry, K. J. (2007). Permutation methods: a distance function approach. Springer Science & Business Media.

mbe() → float[source]

Mean bias error. This indicator expresses a tendency of model to underestimate (negative value) or overestimate (positive value) global radiation, while the MBE values closest to zero are desirable. The drawback of this test is that it does not show the correct performance when the model presents overestimated and underestimated values at the same time, since overestimation and underestimation values cancel each other. [1]

[1] https://doi.org/10.1016/j.rser.2015.08.035

mbrae(benchmark: Optional[ndarray] = None) → float[source]: Mean Bounded Relative Absolute Error

mda() → float[source]: Mean Directional Accuracy modified after https://gist.github.com/bshishov/5dc237f59f019b26145648e2124ca1c9

mdape() → float[source]: Median Absolute Percentage Error

mde() → float[source]: Median Error

mdrae(benchmark: Optional[ndarray] = None) → float[source]: Median Relative Absolute Error

me()[source]: Mean error

mean_bias_error() → float[source]

Mean Bias Error. It represents overall bias error or systematic error. It shows average interpolation bias, i.e. average over- or underestimation [1][2]. This indicator expresses a tendency of model to underestimate (negative value) or overestimate (positive value) global radiation, while the MBE values closest to zero are desirable. The drawback of this test is that it does not show the correct performance when the model presents overestimated and underestimated values at the same time, since overestimation and underestimation values cancel each other.

[1] Valipour, M. (2015). Retracted: Comparative Evaluation of Radiation-Based Methods for Estimation of Potential Evapotranspiration. Journal of Hydrologic Engineering, 20(5), 04014068. http://dx.doi.org/10.1061/(ASCE)HE.1943-5584.0001066

[2] Willmott, C. J., & Matsuura, K. (2006). On the use of dimensioned measures of error to evaluate the performance of spatial interpolators. International Journal of Geographical Information Science, 20(1), 89-102. https://doi.org/10.1080/1365881050028697

[3] https://doi.org/10.1016/j.rser.2015.08.035

mean_gamma_deviance(weights=None) → float[source]: mean gamma deviance

mean_poisson_deviance(weights=None) → float[source]: mean poisson deviance

mean_var() → float[source]: Mean variance

med_seq_error() → float[source]: Median Squared Error Same as mse but it takes median which reduces the impact of outliers.

median_abs_error() → float[source]: median absolute error

mle() → float[source]: Mean log error

mod_agreement_index(j=1) → float[source]: Modified index of agreement. j: int, when j==1, this is same as agreement_index. Higher j means more impact of outliers.

mpe() → float[source]: Mean Percentage Error

mrae(benchmark: Optional[ndarray] = None)[source]: Mean Relative Absolute Error

msle(weights=None) → float[source]: mean square logarithmic error

norm_ae() → float[source]: Normalized Absolute Error

norm_ape() → float[source]: Normalized Absolute Percentage Error

norm_euclid_distance() → float[source]: Normalized Euclidean distance

nrmse() → float[source]: Normalized Root Mean Squared Error

nrmse_ipercentile(q1=25, q2=75) → float[source]: RMSE normalized by inter percentile range of true. This is least sensitive to outliers. q1: any integer between 1 and 99. q2: any integer between 2 and 100. Should be greater than q1. Reference: Pontius et al., 2008.

nrmse_mean() → float[source]

Mean Normalized RMSE. RMSE normalized by mean of true values. This allows comparison between datasets with different scales.

Reference: Pontius et al., 2008

nrmse_range() → float[source]

Range Normalized Root Mean Squared Error. RMSE normalized by the range of true values. This allows comparison between data sets with different scales. It is more sensitive to outliers.

Reference: Pontius et al., 2008

nse() → float[source]

Nash-Sutcliffe Efficiency.

It determines how well the model simulates trends for the output response of concern. However, it cannot help identify model bias and cannot be used to identify differences in timing and magnitude of peak flows and shape of recession curves; in other words, it cannot be used for single-event simulations. It is sensitive to extreme values due to the squared differences [1]. To make it less sensitive to outliers, [2] proposed log and relative NSE.

[1] Moriasi, D. N., Gitau, M. W., Pai, N., & Daggupati, P. (2015). Hydrologic and water quality models: Performance measures and evaluation criteria. Transactions of the ASABE, 58(6), 1763-1785.

[2] Krause, P., Boyle, D., & Bäse, F. (2005). Comparison of different efficiency criteria for hydrological model assessment. Adv. Geosci., 5, 89-97. http://dx.doi.org/10.5194/adgeo-5-89-2005.
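For orientation, the standard Nash-Sutcliffe formula (one minus the ratio of squared errors to the squared deviations of the true values from their mean) can be sketched as below. This is plain Python for illustration, not the library's code.

```python
def nse(true, predicted):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the mean benchmark."""
    mean_true = sum(true) / len(true)
    squared_error = sum((t - p) ** 2 for t, p in zip(true, predicted))
    variance_term = sum((t - mean_true) ** 2 for t in true)
    return 1 - squared_error / variance_term


true = [1.0, 2.0, 3.0, 4.0]
perfect = nse(true, true)
mean_only = nse(true, [2.5, 2.5, 2.5, 2.5])   # always predict the mean
worse = nse(true, [2.0, 2.0, 4.0, 6.0])       # worse than the mean benchmark
print(perfect, mean_only, worse)
```

A negative value therefore means the mean of the true values would have been a better predictor than the model.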

nse_alpha() → float[source]: Alpha decomposition of the NSE, see [Gupta et al. 2009](https://doi.org/10.1029/97WR03495). Used in Kratzert et al., 2018.

Returns:: Alpha decomposition of the NSE
Return type:: float

nse_beta() → float[source]: Beta decomposition of NSE. See [Gupta et al. 2009](https://doi.org/10.1016/j.jhydrol.2009.08.003). Used in Kratzert et al., 2018.

Returns:: Beta decomposition of the NSE
Return type:: float

nse_bound() → float[source]: Bounded Version of the Nash-Sutcliffe Efficiency https://iahs.info/uploads/dms/13614.21–211-219-41-MATHEVET.pdf

nse_mod(j=1) → float[source]: Gives less weight to outliers if j=1 and more weight to outliers if j>1. Reference: Krause et al., 2005

nse_rel() → float[source]: Relative NSE.

pbias() → float[source]: Percent Bias. It determine how well the model simulates the average magnitudes for the output response of interest. It can also determine over and under-prediction. It cannot be used (1) for single-event simula-tions to identify differences in timing and magnitude of peak flows and the shape of recession curves nor (2) to determine how well the model simulates residual variations and/or trends for the output response of interest. It can give a deceiving rating of model performance if the model overpredicts as much as it underpredicts, in which case PBIAS will be close to zero even though the model simulation is poor. [1] [1] Moriasi et al., 2015

r2() → float[source]: Quantifies the percent of variation in the response that the ‘model’ explains. The ‘model’ here is anything from which we obtained predicted array. It is also called coefficient of determination or square of pearson correlation coefficient. More heavily affected by outliers than pearson correlatin r. https://data.library.virginia.edu/is-r-squared-useless/

r2_score(weights=None)[source]: This is not a symmetric function. Unlike most other scores, R^2 score may be negative (it need not actually be the square of a quantity R). This metric is not well-defined for single samples and will return a NaN value if n_samples is less than two.

rae() → float[source]: Relative Absolute Error (aka Approximation Error)

ref_agreement_index() → float[source]: Refined Index of Agreement. From -1 to 1. Larger the better. Reference: Willmott et al., 2012

rel_agreement_index() → float[source]: Relative index of agreement. From 0 to 1. Larger the better.

relative_rmse() → float[source]

Relative Root Mean Squared Error:

\[
RRMSE = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}(e_{i} - s_{i})^2}}{\bar{e}}
\]

rmdspe() → float[source]: Root Median Squared Percentage Error

rmse(weights=None) → float[source]: root mean square error

rmsle() → float[source]

Root mean square log error.

This error is less sensitive to [outliers](https://stats.stackexchange.com/q/56658/314919). Compared to RMSE, RMSLE only considers the relative error between predicted and actual values, and the scale of the error is nullified by the log-transformation. Furthermore, RMSLE penalizes underestimation more than overestimation. This is especially useful in those studies where the underestimation of the target variable is not acceptable but overestimation can be tolerated. [1]

[1] https://doi.org/10.1016/j.scitotenv.2020.137894
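The asymmetry between under- and overestimation can be seen with a short sketch. It assumes the common log(1 + x) form of RMSLE and uses made-up numbers; it is not the library's code.

```python
import math


def rmsle(true, predicted):
    """Root mean square log error using log(1 + x)."""
    total = sum(
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(true, predicted)
    )
    return math.sqrt(total / len(true))


under = rmsle([100.0], [50.0])    # underestimate by 50
over = rmsle([100.0], [150.0])    # overestimate by 50
print(under, over)
```

Both predictions miss by the same absolute amount, yet the underestimate receives the larger error.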

rmspe() → float[source]: Root Mean Square Percentage Error https://stackoverflow.com/a/53166790/5982232

rmsse(seasonality: int = 1) → float[source]: Root Mean Squared Scaled Error

rrse() → float[source]: Root Relative Squared Error

rse() → float[source]: Relative Squared Error

rsr() → float[source]: Moriasi et al., 2007. It incorporates the benefits of error index statistics andincludes a scaling/normalization factor, so that the resulting statistic and reported values can apply to various constitu-ents.

sa() → float[source]: Spectral angle. From -pi/2 to pi/2. Closer to 0 is better. It measures angle between two vectors in hyperspace indicating how well the shape of two arrays match instead of their magnitude. Reference: Robila and Gershman, 2005.

sc() → float[source]: Spectral correlation. From -pi/2 to pi/2. Closer to 0 is better.

sga() → float[source]: Spectral gradient angle. From -pi/2 to pi/2. Closer to 0 is better.

sid() → float[source]: Spectral Information Divergence. From -pi/2 to pi/2. Closer to 0 is better.

skill_score_murphy() → float[source]

Adopted from https://github.com/PeterRochford/SkillMetrics/blob/278b2f58c7d73566f25f10c9c16a15dc204f5869/skill_metrics/skill_score_murphy.py Calculate non-dimensional skill score (SS) between two variables using definition of Murphy (1988) using the formula:

SS = 1 - RMSE^2/SDEV^2

SDEV is the standard deviation of the true values

SDEV^2 = sum_(n=1)^N [r_n - mean(r)]^2/(N-1)

where p is the predicted values, r is the reference values, and N is the total number of values in p & r. Note that p & r must have the same number of values. A positive skill score can be interpreted as the percentage of improvement of the new model forecast in comparison to the reference. On the other hand, a negative skill score denotes that the forecast of interest is worse than the referencing forecast. Consequently, a value of zero denotes that both forecasts perform equally [MLAir, 2020].

Output: SS : skill score

Reference: Allan H. Murphy, 1988: Skill Scores Based on the Mean Square Error and Their Relationships to the Correlation Coefficient. Mon. Wea. Rev., 116, 2417-2424. doi: http://dx.doi.org/10.1175/1520-0493(1988)<2417:SSBOTM>2.0.CO;2
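The skill score formula can be evaluated directly. The sketch below follows SS = 1 - RMSE^2/SDEV^2 with the (N-1) denominator for SDEV^2 as written in the description; it is plain Python with made-up data, not the library's implementation.

```python
def skill_score_murphy(true, predicted):
    """SS = 1 - MSE / variance of the true values (N-1 denominator)."""
    n = len(true)
    mse = sum((p - r) ** 2 for r, p in zip(true, predicted)) / n  # RMSE^2
    mean_r = sum(true) / n
    sdev_squared = sum((r - mean_r) ** 2 for r in true) / (n - 1)
    return 1 - mse / sdev_squared


true = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 4.0, 6.0]
ss = skill_score_murphy(true, predicted)
print(ss)
```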

smape() → float[source]: Symmetric Mean Absolute Percentage Error https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error https://stackoverflow.com/a/51440114/5982232

smdape() → float[source]: Symmetric Median Absolute Percentage Error Note: result is NOT multiplied by 100

spearmann_corr() → float[source]: Spearman correlation coefficient. This is a nonparametric metric and assesses how well the relationship between the true and predicted data can be described using a monotonic function. https://hess.copernicus.org/articles/24/2505/2020/hess-24-2505-2020.pdf

sse() → float[source]

Sum of squared errors (model vs actual). It is a measure of how far off our model’s predictions are from the observed values. A value of 0 indicates that all predictions are spot on. A non-zero value indicates errors.

This is also called residual sum of squares (RSS) or sum of squared residuals. See https://dziganto.github.io/data%20science/linear%20regression/machine%20learning/python/Linear-Regression-101-Metrics/ and https://www.tutorialspoint.com/statistics/residual_sum_of_squares.htm

std_ratio(**kwargs) → float[source]: Ratio of the standard deviations of predictions and trues. Also known as standard ratio, it varies from 0.0 to infinity, with 1.0 being the perfect value.

umbrae(benchmark: Optional[ndarray] = None)[source]: Unscaled Mean Bounded Relative Absolute Error

ve() → float[source]: Volumetric efficiency. from 0 to 1. Smaller the better. Reference: Criss and Winston 2008.

volume_error() → float[source]: Returns the Volume Error (Ve). It is an indicator of the agreement between the averages of the simulated and observed runoff (i.e. long-term water balance). Used in: Reynolds, J.E., S. Halldin, C.Y. Xu, J. Seibert, and A. Kauffeldt. 2017. “Sub-Daily Runoff Predictions Using Parameters Calibrated on the Basis of Data with a Daily Temporal Resolution.” Journal of Hydrology 550 (July): 399–411. https://doi.org/10.1016/j.jhydrol.2017.05.012.

\[
V_e = \frac{\sum_{i=1}^{N}(s_{i} - e_{i})}{\sum_{i=1}^{N} s_{i}}
\]

where s_i are the predicted values and e_i are the true values.

wape() → float[source]

[weighted absolute percentage error](https://mattdyor.wordpress.com/2018/05/23/calculating-wape/)

It is a variation of mape but [more suitable for intermittent and low-volume data](https://arxiv.org/pdf/2103.12057v1.pdf).
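To illustrate why WAPE suits intermittent data, the sketch below uses the common definition (sum of absolute errors divided by the sum of absolute true values). The demand series is invented, and the definition is an assumption about the general metric rather than a copy of the library's code.

```python
def wape(true, predicted):
    """Sum of absolute errors divided by the sum of absolute true values."""
    total_error = sum(abs(t - p) for t, p in zip(true, predicted))
    return total_error / sum(abs(t) for t in true)


# Intermittent demand: most periods have zero sales
true = [0.0, 0.0, 10.0, 0.0]
predicted = [1.0, 0.0, 8.0, 1.0]
score = wape(true, predicted)
print(score)
```

A per-sample percentage error would divide by zero for three of these four periods, while WAPE only needs the total of the true values to be non-zero.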

watt_m() → float[source]: Watterson’s M. Reference: Watterson, 1996

wmape() → float[source]: Weighted Mean Absolute Percent Error https://stackoverflow.com/a/54833202/5982232

ClassificationMetrics

class ai4water.postprocessing.SeqMetrics.ClassificationMetrics(*args, multiclass=False, **kwargs)[source]

Bases: Metrics

Calculates classification metrics.

__init__(*args, multiclass=False, **kwargs)[source]

Parameters:

true – array like, true/observed/actual/target values
predicted – array like, simulated values
replace_nan – default None. If not None, NaNs in true and predicted are replaced by this value.
replace_inf – default None. If not None, inf values in true and predicted are replaced by this value.
remove_zero – default False. If True, zero values in the true or predicted arrays are removed. If a zero is found in one array, the corresponding value in the other array is also removed.
remove_neg – default False. If True, negative values in the true or predicted arrays are removed.
metric_type – type of metric.

accuracy(normalize=True)[source]

balanced_accuracy_score()[source]

cross_entropy(epsilon=1e-12)[source]

Computes cross entropy between targets (encoded as one-hot vectors) and predictions.

Input: predictions (N, k) ndarray, targets (N, k) ndarray

Returns: scalar
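A minimal sketch of cross entropy for one-hot targets, using nested lists in place of (N, k) arrays. Clipping predictions to [epsilon, 1 - epsilon] is an assumption about how epsilon is used; this is not the library's implementation.

```python
import math


def cross_entropy(predictions, targets, epsilon=1e-12):
    """Average negative log-probability assigned to the true class."""
    n = len(predictions)
    total = 0.0
    for pred_row, target_row in zip(predictions, targets):
        for p, t in zip(pred_row, target_row):
            # Clip so that log(0) can never occur
            clipped = min(max(p, epsilon), 1.0 - epsilon)
            total += t * math.log(clipped)
    return -total / n


targets = [[1, 0], [0, 1]]                # one-hot encoded classes
predictions = [[0.5, 0.5], [0.25, 0.75]]  # predicted class probabilities
loss = cross_entropy(predictions, targets)
print(loss)
```

The clipping keeps the loss finite even when a prediction assigns probability 0 to the true class.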

Utils

class ai4water.postprocessing.SeqMetrics.utils.plot_metrics(metrics: dict, ranges: tuple = ((0.0, 1.0), (1.0, 10), (10, 1000)), exclude: Optional[list] = None, plot_type: str = 'bar', max_metrics_per_fig: int = 15, show: bool = True, save: bool = False, save_path: Optional[str] = None, **kwargs)[source]

Bases:

Plots the metrics given as dictionary as radial or bar plot between specified ranges.

Parameters:

metrics – dictionary whose keys are names of errors and values are error values.
ranges – tuple of tuples defining range of errors to plot in one plot
exclude – List of metrics to be excluded from plotting.
max_metrics_per_fig – maximum number of metrics to show in one figure.
plot_type – either of radial or bar.
show – if True, the figure will be shown/drawn.
save – if True, the figure will be saved.
save_path – if given, the figure will be saved at this location.
kwargs – keyword arguments for plotting

Examples

>>> import numpy as np
>>> from ai4water.postprocessing.SeqMetrics import RegressionMetrics
>>> from ai4water.postprocessing.SeqMetrics import plot_metrics
>>> t = np.random.random((20, 1))
>>> p = np.random.random((20, 1))
>>> er = RegressionMetrics(t, p)
>>> all_errors = er.calculate_all()
>>> plot_metrics(all_errors, plot_type='bar', max_metrics_per_fig=50)
>>> # or draw the radial plot
>>> plot_metrics(all_errors, plot_type='radial', max_metrics_per_fig=50)


__init__(**kwargs)