HyperParameter Optimization

This module is for optimization of hyperparameters. The HyperOpt class performs optimization by minimizing a user-defined objective function. The space of hyperparameters can be defined using the Categorical, Integer and Real classes.

For a tutorial on using this class, see the tutorials.

Categorical

class ai4water.hyperopt.Categorical(categories, prior=None, transform=None, name=None)[source]

This class is used when a parameter has a distinct group/class of values, such as [1, 2, 3] or ['a', 'b', 'c']. It overrides skopt's Categorical class and can be converted to optuna's distribution or hyperopt's choice. It takes the same input arguments as skopt's Categorical class.

- as_hp
- to_optuna
- suggest
- serialize

Example

>>> from ai4water.hyperopt import Categorical
>>> activations = Categorical(categories=['relu', 'tanh', 'sigmoid'], name='activations')
__init__(categories, prior=None, transform=None, name=None)[source]
serialize()[source]

Serializes the Categorical object so that it can be saved in json
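
For illustration, a minimal sketch of saving a serialized space, assuming serialize() returns a json-compatible object as described above (the file name space.json is hypothetical):

>>> import json
>>> from ai4water.hyperopt import Categorical
>>> activations = Categorical(categories=['relu', 'tanh', 'sigmoid'], name='activations')
>>> serialized = activations.serialize()  # assumed to return a json-compatible object
>>> with open('space.json', 'w') as fp:   # hypothetical file name
...     json.dump(serialized, fp)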

Real

class ai4water.hyperopt.Real(low: Optional[float] = None, high: Optional[float] = None, num_samples: Optional[int] = None, step: Optional[int] = None, grid: Optional[Union[list, ndarray]] = None, *args, **kwargs)[source]

This class is used for parameters which have fractional values, such as real values from 1.0 to 3.5. It extends skopt's Real class with a grid attribute, which can be fed to an optimization algorithm to create a grid space, and adds several further methods.

grid
- as_hp
- to_optuna
- suggest
- serialize

Example

>>> from ai4water.hyperopt import Real
>>> lr = Real(low=0.0005, high=0.01, prior='log-uniform', name='lr')
__init__(low: Optional[float] = None, high: Optional[float] = None, num_samples: Optional[int] = None, step: Optional[int] = None, grid: Optional[Union[list, ndarray]] = None, *args, **kwargs)[source]
Parameters:
  • low – lower limit of parameter

  • high – upper limit of parameter

  • step – used to define the grid in conjunction with low and high. This argument is only used when the grid search algorithm is used.

  • grid – array-like; if given, low, high, step and num_samples become redundant.

  • num_samples – if given, it will be used to create the grid space using the formula np.linspace(low, high, num_samples)
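
As an illustrative sketch of the grid attribute described above; the exact contents of grid are an assumption based on the stated np.linspace formula:

>>> from ai4water.hyperopt import Real
>>> x = Real(low=1.0, high=3.5, num_samples=5, name='x')
>>> x.grid  # assumed equivalent to np.linspace(1.0, 3.5, 5)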

serialize()[source]

Serializes the Real object so that it can be saved in json

to_optuna()[source]

returns an equivalent optuna space

Integer

class ai4water.hyperopt.Integer(low: int = None, high: int = None, num_samples: int = None, step: int = None, grid: Union[np.ndarray, list] = None, *args, **kwargs)[source]

This class is used when the parameter is an integer, such as integer values from 1 to 10. It extends skopt's Integer class with a grid attribute, which can be fed to an optimization algorithm to create a grid space. Moreover, it also generates optuna and hyperopt compatible/equivalent instances.

grid
- as_hp
- to_optuna
- suggest
- serialize
Example

>>> from ai4water.hyperopt import Integer
>>> units = Integer(low=16, high=128, name='units')
__init__(low: int = None, high: int = None, num_samples: int = None, step: int = None, grid: Union[np.ndarray, list] = None, *args, **kwargs)[source]
Parameters:
  • low – lower limit of parameter

  • high – upper limit of parameter

  • grid (list/array) – if given, low and high should not be given as they will be calculated from this grid.

  • step (int) – if given, it will be used to calculate the grid using the formula np.arange(low, high, step).

  • num_samples (int) – if given, it will be used to create the grid space using the formula np.linspace(low, high, num_samples)
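
A sketch of the two grid-creation paths described above; the grid contents are assumptions based on the stated formulas:

>>> from ai4water.hyperopt import Integer
>>> units = Integer(low=16, high=128, step=16, name='units')        # grid from np.arange(16, 128, 16)
>>> units = Integer(low=16, high=128, num_samples=8, name='units')  # grid from np.linspace(16, 128, 8)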

serialize()[source]

Serializes the Integer object so that it can be saved in json

to_optuna()[source]

returns an equivalent optuna space

HyperOpt

class ai4water.hyperopt.HyperOpt(algorithm: str, *, param_space, objective_fn, eval_on_best: bool = False, backend: Optional[str] = None, opt_path: Optional[str] = None, process_results: bool = True, verbosity: int = 1, **kwargs)[source]

Bases: object

The purpose of this class is to provide a uniform and simplified interface to hyperopt, optuna, scikit-optimize and scikit-learn based hyperparameter optimization methods. Ideally this class should provide all the functionalities of the aforementioned libraries through a uniform interface. It also complements these libraries by combining their functionalities and adding some additional ones. On the other hand, this class should not limit or complicate the use of its underlying libraries; this means all the functionalities of the underlying libraries are available in this class as well. Moreover, you can use this class just as you would use one of its underlying libraries.

The purpose here is to make a class which allows application of any of the available optimization methods to any type of model/classifier/regressor. If the classifier/regressor is sklearn-based, then for random search we use RandomizedSearchCV, for grid search we use GridSearchCV, and for Bayesian optimization we use BayesSearchCV. On the other hand, if the model is not sklearn-based, you will still be able to use any of the three methods. In that case, the Bayesian method will be implemented using gp_minimize. Random search and grid search will be done by simply iterating over the sample space, generated as in the sklearn-based samplers. However, the post-processing of the results is (supposed to be) done the same way as in RandomizedSearchCV and GridSearchCV.

The class is expected to pass all the tests written in sklearn or skopt for the corresponding classes.

For detailed use of this class, see the hpo_tutorial.

- results dict
- gpmin_results dict
- skopt_results
- hp_space
- space
- skopt_space
- space dict
- title str: name of the folder in which all results will be saved. By default this is the same as the name of the algorithm. For AI4Water based models, this is more detailed, containing problem type etc.
- eval_with_best: evaluates the objective_fn on best parameters
- best_paras(): returns the best parameters from optimization.
The following examples illustrate how we can uniformly apply different optimization algorithms.

Examples

>>> from ai4water import Model
>>> from ai4water.hyperopt import HyperOpt, Categorical, Integer, Real
>>> from ai4water.datasets import busan_beach
>>> from SeqMetrics import RegressionMetrics
>>> data = busan_beach()
>>> input_features = ['tide_cm', 'wat_temp_c', 'sal_psu', 'air_temp_c', 'pcp_mm', 'pcp3_mm']
>>> output_features = ['tetx_coppml']

We have to define an objective function which will take keyword arguments and return a scalar value as output. This scalar value will be minimized during optimization.

>>> def objective_fn(**suggestion)->float:
...   # the objective function must receive new parameters as keyword arguments
...    model = Model(
...        input_features=input_features,
...        output_features=output_features,
...        model={"XGBRegressor": suggestion},
...        verbosity=0)
...
...    model.fit(data=data)
...
...    t, p = model.predict(return_true=True)
...    mse = RegressionMetrics(t, p).mse()
...    # the objective function must return a scalar value which needs to be minimized
...    return mse

Define the search space. The search space determines the pool from which parameter values are chosen during optimization.

>>> num_samples=5   # only relevant for random and grid search
>>> search_space = [
...    Categorical(['gbtree', 'dart'], name='booster'),
...    Integer(low=1000, high=2000, name='n_estimators', num_samples=num_samples),
...    Real(low=1.0e-5, high=0.1, name='learning_rate', num_samples=num_samples)
... ]
... # using Bayesian optimization with Gaussian processes
>>> num_iterations = 10
>>> optimizer = HyperOpt('bayes', objective_fn=objective_fn, param_space=search_space,
...                     num_iterations=num_iterations)
>>> optimizer.fit()

Using TPE with optuna

>>> optimizer = HyperOpt('tpe', objective_fn=objective_fn, param_space=search_space,
...                     backend='optuna',
...                     num_iterations=num_iterations )
>>> optimizer.fit()

Using cmaes with optuna

>>> optimizer = HyperOpt('cmaes', objective_fn=objective_fn, param_space=search_space,
...                     backend='optuna',
...                     num_iterations=num_iterations )
>>> optimizer.fit()

Using random search with optuna; we can also use hyperopt or sklearn as the backend for the random algorithm

>>> optimizer = HyperOpt('random', objective_fn=objective_fn, param_space=search_space,
...                     backend='optuna',
...                     num_iterations=num_iterations )
>>> optimizer.fit()

Using TPE of hyperopt

>>> optimizer = HyperOpt('tpe', objective_fn=objective_fn, param_space=search_space,
...                     backend='hyperopt',
...                     num_iterations=num_iterations )
>>> optimizer.fit()

Using grid with sklearn

>>> optimizer = HyperOpt('grid', objective_fn=objective_fn, param_space=search_space,
...                     backend='sklearn',
...                     num_iterations=num_iterations )
>>> optimizer.fit()
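
After any of the above fit calls, the optimized parameters can be inspected. A minimal sketch using the methods documented below:

>>> optimizer.best_paras()  # best hyperparameters as a dictionary
>>> optimizer.best_xy()     # dictionary with keys x (best parameters) and y (objective function value)
>>> optimizer.best_iter()   # 0-based index of the iteration which produced the best parameters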
__init__(algorithm: str, *, param_space, objective_fn, eval_on_best: bool = False, backend: Optional[str] = None, opt_path: Optional[str] = None, process_results: bool = True, verbosity: int = 1, **kwargs)[source]

Initializes the class

Parameters:
  • algorithm (str) – must be one of random, grid, bayes, bayes_rf, and tpe, defining which optimization algorithm to use.

  • objective_fn (callable) – Any callable function whose returned value is to be minimized. It can also be an sklearn/xgboost based regressor/classifier.

  • param_space (list, dict) – the search space of parameters to be optimized. We recommend the use of the Real, Integer and Categorical classes from ai4water.hyperopt (not from skopt.space). These classes allow a uniform way of defining the parameter space for all the underlying libraries. However, to make this class work exactly like its underlying libraries, the user can also define the parameter space as it is defined in the underlying library. For example, for a hyperopt based method like 'tpe', the parameter space can be specified as in the examples of the hyperopt library. In case the code breaks, please report.

  • eval_on_best (bool, optional) – if True, then after optimization, the objective_fn will be evaluated on best parameters and the results will be stored in the folder named “best” inside title folder.

  • opt_path – path to save the results

  • backend (str, optional) – Defines which backend library to use for the algorithm. For example, the user can specify whether to use optuna, hyperopt or sklearn for the grid algorithm.

  • verbosity (int, optional) – determines the amount of information being printed

  • **kwargs – Any additional keyword arguments for the underlying optimization algorithm. In case of using an AI4Water model, these must be arguments which are passed to AI4Water's Model class.

__getattr__(item)[source]
add_previous_results(iterations: Optional[Union[dict, str]] = None, x: Optional[list] = None, y: Optional[list] = None)[source]

adds results from previous iterations.

If you have run the optimization previously, you can make use of those results by appending them.

Parameters:
  • iterations – It can be either a dictionary whose keys are y values and values are x, or it can be a path to a file which contains these xy values as a dictionary.

  • x – a list of lists, where each sub-list contains the hyperparameter values at one iteration. The x and y arguments are optional and will only be used if iterations is not provided.

  • y – a list of float values where each value in y is the output of objective_fn with corresponding x. The length of x and y must be equal.
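
A sketch of appending earlier results via x and y; the values below are hypothetical and must follow the order of the search space:

>>> x = [['gbtree', 1200, 0.001], ['dart', 1800, 0.05]]  # hypothetical hyperparameter values
>>> y = [0.53, 0.61]                                     # corresponding objective_fn outputs
>>> optimizer.add_previous_results(x=x, y=y)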

property backend
best_iter() int[source]

returns the iteration on which best/optimized parameters are obtained. The indexing starts from 0.

best_paras(as_list=False) Union[list, dict][source]
best_xy() dict[source]

Returns best (optimized) parameters as dictionary. The dictionary has two keys x and y. x is the best hyperparameters while y is the corresponding objective function value.

check_args(**kwargs)[source]
static dict_to_xy(iterations: dict)[source]
dims()[source]
eval_sequence(params, **kwargs)[source]

Parameters:

**kwargs – any additional keyword arguments for objective_fn

eval_with_best()[source]

Finds the best parameters and evaluates the objective_fn with them.

Parameters:

return_model (bool) – If True, then the built objective_fn will be returned

fit(*args, **kwargs)[source]

Makes and calls the underlying fit method

Parameters:

**kwargs – any keyword arguments for the user-defined objective function

Example

>>> def objective_fn(a=2, b=5, **suggestions)->float:
...     # do something e.g. calculate validation score
...     val_score = 2.0
...     return val_score
fmin(**kwargs)[source]
classmethod from_gp_parameters(fpath: str, objective_fn)[source]

loads results saved from bayesian optimization

func_vals() ndarray[source]

returns the value of objective function at each iteration.

hp_space() dict[source]

returns a dictionary whose values are hyperopt equivalent space instances.

load_results(fname: str)[source]

loads the previously computed results. It should not be used after .fit()

Parameters:

fname (str) – complete path of hpo_results.bin file e.g. path/to/hpo_results.bin
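
A minimal sketch, assuming a fresh HyperOpt instance is constructed with the same objective function and search space before loading (the path below is hypothetical):

>>> optimizer = HyperOpt('bayes', objective_fn=objective_fn, param_space=search_space)
>>> optimizer.load_results('path/to/hpo_results.bin')  # must be called before, not after, .fit()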

model_for_gpmin(**kws)[source]
This function can be called in two cases:
  • The user has provided their own objective_fn.

  • We make the objective_fn using AI4Water and return the error.

In the first case, we just return what the user has provided.

property num_iterations
property objective_fn_is_dl
property opt_path
optuna_objective(**kwargs)[source]

objective function that will be used during the random search method.

Parameters:

kwargs – keyword arguments in the user-defined objective function

optuna_study()[source]

Attempts to create an optuna Study instance so that optuna based plots can be generated.

Returns None if not possible, otherwise the Study instance.

original_para_order()[source]
own_fit(**kws)[source]

kws are the keyword arguments passed by the user to the objective function

property param_space
plot_importance(save=True, show: bool = False, plot_type='box', with_optuna: bool = False, **tree_kws) Axes[source]

plots hyperparameter importance using fANOVA

plot_parallel_coords(save=True, show=False, **kwargs)[source]

parallel coordinates of hyperparameters

Parameters:
  • save (bool, default=True) –

  • show (bool, default=False) –

  • **kwargs – any keyword arguments for easy_mpl.parallel_coordinates
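
A sketch of both plotting helpers on a fitted optimizer; the keyword values are illustrative:

>>> optimizer.plot_importance(show=True)       # hyperparameter importance via fANOVA
>>> optimizer.plot_parallel_coords(show=True)  # parallel coordinates of the iterations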

pre_calculated_results(resutls, from_gp_parameters=True)[source]

Loads the pre-calculated results i.e. x and y values which have been already evaluated.

process_results(show=False)[source]

post processing of results

property random_state
save_iterations_as_xy()[source]
save_results(results, path: Optional[str] = None)[source]

saves the hpo results so that they can be loaded using load_results method.

Parameters:
  • results – hpo results i.e. output of optimizer.fit()

  • path – path where to save the results
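
A minimal sketch pairing save_results with load_results, using the results returned by fit as described above:

>>> results = optimizer.fit()
>>> optimizer.save_results(results)  # can later be restored with load_results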

serialize()[source]
skopt_results()[source]
skopt_space()[source]

Tries to make a skopt-compatible Space object. If unsuccessful, returns None

space() dict[source]

Returns a skopt-compatible space, but as a dictionary

property title
to_kw(x)[source]
property use_named_args
property use_own
property use_sklearn
property use_skopt_bayes
property use_skopt_gpmin
property use_tpe
xy_of_iterations() Dict[int, Dict[str, Union[str, dict]]][source]

returns a dictionary whose keys are iteration numbers and values are xy pairs at those iterations.

Returns Dict[int, Dict[str, Union[dict, float]]]

fANOVA

class ai4water.hyperopt.fANOVA(X: Union[ndarray, DataFrame], Y: ndarray, dtypes: List[str], bounds: List[Optional[tuple]], parameter_names=None, cutoffs=(-inf, inf), n_estimators=64, max_depth=64, random_state=313, **rf_kws)[source]

Calculation of parameter importance using FANOVA (Hutter et al., 2014).

Parameters:
  • X – input data of shape (n_iterations, n_parameters). For hyperparameter optimization, iterations represents the number of optimization iterations and parameters represents the number of hyperparameters

  • Y – objective value corresponding to X. Its length should be the same as that of X

  • dtypes (list) – list of strings determining the type of hyperparameter. Allowed values are only categorical and numerical.

  • bounds (list) – list of tuples, where each tuple defines the upper and lower limit of corresponding parameter

  • parameter_names (list) – names of features/parameters/hyperparameters

  • cutoffs (tuple) –

  • n_estimators (int) – number of trees

  • max_depth (int (default=64)) – maximum depth of trees

  • **rf_kws – keyword arguments to sklearn.ensemble.RandomForestRegressor

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from ai4water.hyperopt import fANOVA
>>> x = np.arange(20).reshape(10, 2).astype(float)
>>> y = np.linspace(1, 30, 10).astype(float)
... # X are hyperparameters and Y are objective function values at corresponding iterations
>>> f = fANOVA(X=x, Y=y,
...            bounds=[(-2, 20), (-5, 50)],
...            dtypes=["numerical", "numerical"],
...            random_state=313, max_depth=3)
... # calculate importance
>>> imp = f.feature_importance()

for categorical parameters

>>> x = pd.DataFrame(['2', '2', '3', '1', '1', '2', '2', '1', '3', '3', '3'], columns=['a'])
>>> x['b'] = ['3', '3', '1', '3', '1', '2', '4', '4', '3', '3', '4']
>>> y = np.linspace(-1., 1.0, len(x))
>>> f = fANOVA(X=x, Y=y, bounds=[None, None], dtypes=['categorical', 'categorical'],
...            random_state=313, max_depth=3, n_estimators=1)
... # calculate importance
>>> imp = f.feature_importance()

for mix types

>>> x = pd.DataFrame(['2', '2', '3', '1', '1', '2', '2', '1', '3', '3', '3'], columns=['a'])
>>> x['b'] = np.arange(100, 100+len(x))
>>> y = np.linspace(-1., 2.0, len(x))
>>> f = fANOVA(X=x, Y=y, bounds=[None, (10, 150)], dtypes=['categorical', 'numerical'],
...           random_state=313, max_depth=5, n_estimators=5)
... # calculate importance
>>> imp = f.feature_importance()
__init__(X: Union[ndarray, DataFrame], Y: ndarray, dtypes: List[str], bounds: List[Optional[tuple]], parameter_names=None, cutoffs=(-inf, inf), n_estimators=64, max_depth=64, random_state=313, **rf_kws)[source]
get_trees_total_variances() tuple[source]

get variance of all trees

set_cutoffs(cutoffs=(-inf, inf), quantile=None)[source]

Setting the cutoffs to constrain the input space

To properly do things like 'improvement over default', the fANOVA now supports cutoffs on the y values. These will exclude parts of the parameter space where the prediction is not within the provided cutoffs. This is a specialization of "Generalized Functional ANOVA Diagnostics for High Dimensional Functions of Dependent Variables" by Hooker.
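
A sketch of constraining the analysis, reusing the f instance from the examples above; the cutoff values are illustrative:

>>> f.set_cutoffs(cutoffs=(-1.0, 1.0))  # exclude regions where the predicted y is outside [-1, 1]
>>> imp = f.feature_importance()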