HyperParameter Optimization
This module optimizes hyperparameters. The HyperOpt class performs the optimization by minimizing the value returned by a user-defined objective function. The hyperparameter space is defined using the Categorical, Integer and Real classes.
For a tutorial on using this class, see the tutorials.
Categorical
- class ai4water.hyperopt.Categorical(categories, prior=None, transform=None, name=None)[source]
This class is used when the parameter takes a distinct group/class of values such as [1,2,3] or [‘a’, ‘b’, ‘c’]. It overrides skopt’s Categorical class and can be converted to optuna’s distribution or hyperopt’s choice. It takes the same input arguments as skopt’s Categorical class.
- - as_hp
- - to_optuna
- - suggest
- - to_optuna
- - serialize
Example
>>> from ai4water.hyperopt import Categorical
>>> activations = Categorical(categories=['relu', 'tanh', 'sigmoid'], name='activations')
Real
- class ai4water.hyperopt.Real(low: Optional[float] = None, high: Optional[float] = None, num_samples: Optional[int] = None, step: Optional[int] = None, grid: Optional[Union[list, ndarray]] = None, *args, **kwargs)[source]
This class is used for parameters which have fractional values, such as real values from 1.0 to 3.5. It extends the Real class of skopt so that it has a grid attribute, which can then be fed to the optimization algorithm to create the grid space. It also adds several further methods.
- grid
- - as_hp
- - to_optuna
- - suggest
- - to_optuna
- - serialize
Example
>>> from ai4water.hyperopt import Real
>>> lr = Real(low=0.0005, high=0.01, prior='log-uniform', name='lr')
- __init__(low: Optional[float] = None, high: Optional[float] = None, num_samples: Optional[int] = None, step: Optional[int] = None, grid: Optional[Union[list, ndarray]] = None, *args, **kwargs)[source]
- Parameters:
low – lower limit of parameter
high – upper limit of parameter
step – used to define the grid in conjunction with low and high. This argument is only used when the grid search algorithm is used.
grid – array like. If given, low, high, step and num_samples will be redundant.
num_samples – if given, it will be used to create the grid space using np.linspace.
Integer
- class ai4water.hyperopt.Integer(low: int = None, high: int = None, num_samples: int = None, step: int = None, grid: np.ndarray, list = None, *args, **kwargs)[source]
This class is used when the parameter is an integer, such as integer values from 1 to 10. It extends the Integer class of skopt so that it has a grid attribute, which can then be fed to the optimization algorithm to create the grid space. It also generates optuna- and hyperopt-compatible instances.
- grid
- - as_hp
- - to_optuna
- - suggest
- - to_optuna
- - serialize
- Example:
>>> from ai4water.hyperopt import Integer
>>> units = Integer(low=16, high=128, name='units')
- __init__(low: int = None, high: int = None, num_samples: int = None, step: int = None, grid: np.ndarray, list = None, *args, **kwargs)[source]
- Parameters:
low – lower limit of parameter
high – upper limit of parameter
grid (list/array) – If given, low and high should not be given, as they will be calculated from this grid.
step (int) – if given, it will be used to calculate the grid using the formula np.arange(low, high, step)
num_samples (int) – if given, it will be used to create the grid space using the formula np.linspace(low, high, num_samples)
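The two formulas above treat the upper limit differently: np.linspace includes high, while np.arange stops before it. The following is a minimal sketch in plain Python, assuming the grid is built exactly as the formulas state. The helpers linspace and arange are hypothetical stand-ins that reproduce the values of the NumPy calls for these inputs; they are not ai4water code.

```python
def linspace(low, high, num_samples):
    """Evenly spaced values from low to high, both ends included.

    Mirrors the values of np.linspace(low, high, num_samples).
    """
    if num_samples == 1:
        return [float(low)]
    width = (high - low) / (num_samples - 1)
    return [low + i * width for i in range(num_samples)]


def arange(low, high, step):
    """Values from low up to, but excluding, high in increments of step.

    Mirrors the integer case of np.arange(low, high, step).
    """
    return list(range(low, high, step))


# Grid defined through num_samples: 5 points, upper limit included.
grid_from_num_samples = linspace(16, 128, 5)

# Grid defined through step: increments of 16, upper limit excluded.
grid_from_step = arange(16, 128, 16)

print(grid_from_num_samples)
print(grid_from_step)
```

If the upper limit must be part of a step-based grid, one option is to pass the grid explicitly through the grid argument instead.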
HyperOpt
- class ai4water.hyperopt.HyperOpt(algorithm: str, *, param_space, objective_fn, eval_on_best: bool = False, backend: Optional[str] = None, opt_path: Optional[str] = None, process_results: bool = True, verbosity: int = 1, **kwargs)[source]
Bases: object
The purpose of this class is to provide a uniform and simplified interface to hyperopt, optuna, scikit-optimize and scikit-learn based hyperparameter optimization methods. Ideally this class should provide all the functionalities of the aforementioned libraries through a uniform interface. It also complements these libraries by combining their functionalities and adding some additional ones. On the other hand, this class should not limit or complicate the use of its underlying libraries. This means all the functionalities of the underlying libraries are available in this class as well. Moreover, you can use this class just as you would use one of its underlying libraries.
The purpose here is to make a class which allows application of any of the available optimization methods to any type of model/classifier/regressor. If the classifier/regressor is sklearn-based, then for random search we use RandomizedSearchCV, for grid search we use GridSearchCV and for Bayesian we use BayesSearchCV. On the other hand, if the model is not sklearn-based, you will still be able to use any of the three methods. In such a case, the Bayesian method will be implemented using gp_minimize. Random search and grid search will be done by simply iterating over the sample space generated as in sklearn-based samplers. However, the post-processing of the results is (supposed to be) done in the same way as in RandomizedSearchCV and GridSearchCV.
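The idea of random and grid search as a plain iteration over the sample space can be pictured as follows. This is a minimal sketch, not the HyperOpt implementation: grid_search, random_search and toy_objective are hypothetical names, and the sample space is assumed to be a dictionary of per-parameter value lists.

```python
import itertools
import random


def grid_search(objective_fn, grids):
    """Evaluate objective_fn on every combination of the per-parameter grids
    and return the combination with the lowest objective value."""
    names = list(grids)
    best_params, best_value = None, float("inf")
    for combo in itertools.product(*(grids[name] for name in names)):
        params = dict(zip(names, combo))
        value = objective_fn(**params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value


def random_search(objective_fn, grids, num_iterations, seed=0):
    """Evaluate objective_fn on randomly drawn combinations and return
    the best one found within num_iterations draws."""
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(num_iterations):
        params = {name: rng.choice(values) for name, values in grids.items()}
        value = objective_fn(**params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value


def toy_objective(x, y):
    """Hypothetical objective with its minimum at x=3, y=-1."""
    return (x - 3) ** 2 + (y + 1) ** 2


grids = {"x": [0, 1, 2, 3, 4], "y": [-2, -1, 0]}
grid_params, grid_value = grid_search(toy_objective, grids)
rand_params, rand_value = random_search(toy_objective, grids, num_iterations=5)
print(grid_params, grid_value)
print(rand_params, rand_value)
```

Grid search visits all 15 combinations here, so it cannot do worse than random search on the same grids; random search trades that guarantee for a fixed evaluation budget.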
The class is expected to pass all the tests written in sklearn or skopt for corresponding classes.
For detailed use of this class see this hpo_tutorial
- - results dict
- - gpmin_results dict
- - skopt_results
- - hp_space
- - space
- - skopt_space
- - space dict
- - title str: name of the folder in which all results will be saved. By default this is the same as the name of the algorithm. For AI4Water based models, this is more detailed, containing problem type etc.
- - eval_with_best: evaluates the objective_fn on best parameters
- - best_paras(): returns the best parameters from optimization.
- The following examples illustrate how we can uniformly apply different optimization algorithms.
Examples
>>> from ai4water import Model
>>> from ai4water.hyperopt import HyperOpt, Categorical, Integer, Real
>>> from ai4water.datasets import busan_beach
>>> from SeqMetrics import RegressionMetrics
>>> data = busan_beach()
>>> input_features = ['tide_cm', 'wat_temp_c', 'sal_psu', 'air_temp_c', 'pcp_mm', 'pcp3_mm']
>>> output_features = ['tetx_coppml']
We have to define an objective function which takes keyword arguments and returns a scalar value as output. This scalar value will be minimized during optimization.
>>> def objective_fn(**suggestion)->float:
...     # the objective function must receive new parameters as keyword arguments
...     model = Model(
...         input_features=input_features,
...         output_features=output_features,
...         model={"XGBRegressor": suggestion},
...         verbosity=0)
...
...     model.fit(data=data)
...
...     t, p = model.predict(return_true=True)
...     mse = RegressionMetrics(t, p).mse()
...     # the objective function must return a scalar value which needs to be minimized
...     return mse
Define the search space. The search space determines the pool from which parameters are chosen during optimization.
>>> num_samples = 5  # only relevant for random and grid search
>>> num_iterations = 10
>>> search_space = [
...     Categorical(['gbtree', 'dart'], name='booster'),
...     Integer(low=1000, high=2000, name='n_estimators', num_samples=num_samples),
...     Real(low=1.0e-5, high=0.1, name='learning_rate', num_samples=num_samples)
... ]
>>> # Using Bayesian optimization with Gaussian processes
>>> optimizer = HyperOpt('bayes', objective_fn=objective_fn, param_space=search_space,
...                      num_iterations=num_iterations)
>>> optimizer.fit()
Using TPE with optuna
>>> num_iterations = 10
>>> optimizer = HyperOpt('tpe', objective_fn=objective_fn, param_space=search_space,
...                      backend='optuna',
...                      num_iterations=num_iterations)
>>> optimizer.fit()
Using cmaes with optuna
>>> optimizer = HyperOpt('cmaes', objective_fn=objective_fn, param_space=search_space,
...                      backend='optuna',
...                      num_iterations=num_iterations)
>>> optimizer.fit()
Using random search with optuna. We can also try hyperopt and sklearn as the backend for the random algorithm.
>>> optimizer = HyperOpt('random', objective_fn=objective_fn, param_space=search_space,
...                      backend='optuna',
...                      num_iterations=num_iterations)
>>> optimizer.fit()
Using TPE of hyperopt
>>> optimizer = HyperOpt('tpe', objective_fn=objective_fn, param_space=search_space,
...                      backend='hyperopt',
...                      num_iterations=num_iterations)
>>> optimizer.fit()
Using grid with sklearn
>>> optimizer = HyperOpt('grid', objective_fn=objective_fn, param_space=search_space,
...                      backend='sklearn',
...                      num_iterations=num_iterations)
>>> optimizer.fit()
- __init__(algorithm: str, *, param_space, objective_fn, eval_on_best: bool = False, backend: Optional[str] = None, opt_path: Optional[str] = None, process_results: bool = True, verbosity: int = 1, **kwargs)[source]
Initializes the class
- Parameters:
algorithm (str) – must be one of random, grid, bayes, bayes_rf, and tpe, defining which optimization algorithm to use.
objective_fn (callable) – Any callable function whose returned value is to be minimized. It can also be either an sklearn/xgboost based regressor/classifier.
param_space (list, dict) – the search space of parameters to be optimized. We recommend the use of the Real, Integer and Categorical classes from ai4water.hyperopt (not from skopt.space). These classes allow a uniform way of defining the parameter space for all the underlying libraries. However, to make this class work exactly like its underlying libraries, the user can also define the parameter space as it is defined in those libraries. For example, for a hyperopt based method like ‘tpe’ the parameter space can be specified as in the examples of the hyperopt library. In case the code breaks, please report.
eval_on_best (bool, optional) – if True, then after optimization, the objective_fn will be evaluated on best parameters and the results will be stored in the folder named “best” inside title folder.
opt_path – path to save the results
backend (str, optional) – Defines which backend library to use for the algorithm. For example, the user can specify whether to use optuna, hyperopt or sklearn for the grid algorithm.
verbosity (int, optional) – determines the amount of information being printed
**kwargs – Any additional keyword arguments for the underlying optimization algorithm. In case of using an AI4Water model, these must be arguments which are passed to AI4Water’s Model class.
- add_previous_results(iterations: Optional[Union[dict, str]] = None, x: Optional[list] = None, y: Optional[list] = None)[source]
adds results from previous iterations.
If you have run the optimization previously, you can make use of those results by appending them.
- Parameters:
iterations – It can be either a dictionary whose keys are y values and whose values are x, or a path to a file which contains these xy values as a dictionary.
x – a list of lists where each sub-list holds the values of the hyperparameters at one iteration. The x and y arguments are optional and will only be used if iterations is not provided.
y – a list of float values where each value in y is the output of objective_fn with corresponding x. The length of x and y must be equal.
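The two ways of passing previous results described above carry the same information. The following sketch shows how a dictionary keyed by y values relates to the parallel x and y lists. It assumes the in-memory dictionary form only; to_xy and the values in previous are hypothetical and not part of ai4water.

```python
def to_xy(iterations):
    """Split a {y: x} mapping into parallel x and y lists.

    Each key is an objective value and each value is the list of
    hyperparameter values that produced it.
    """
    y = [float(key) for key in iterations]
    x = [list(value) for value in iterations.values()]
    return x, y


# Hypothetical results of two earlier iterations: objective value -> hyperparameters
previous = {
    0.52: ["gbtree", 1200, 0.01],
    0.47: ["dart", 1500, 0.05],
}

x, y = to_xy(previous)
print(x)
print(y)
```

Because a dictionary cannot hold duplicate keys, the dictionary form cannot represent two iterations with exactly the same objective value, whereas the x and y lists can.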
- property backend
- best_iter() int [source]
returns the iteration on which best/optimized parameters are obtained. The indexing starts from 0.
- best_xy() dict [source]
Returns the best (optimized) parameters as a dictionary. The dictionary has two keys, x and y. x is the best hyperparameters while y is the corresponding objective function value.
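Since the objective is minimized, the best iteration is the one with the lowest objective value. The following sketch shows that relationship on a plain list of results. find_best_iter and find_best_xy are hypothetical stand-alone helpers, not the methods of HyperOpt, and the history values are made up for illustration.

```python
def find_best_iter(y_values):
    """Return the 0-based index of the lowest objective value."""
    return min(range(len(y_values)), key=y_values.__getitem__)


def find_best_xy(x_values, y_values):
    """Return the best hyperparameters and their objective value
    as a dictionary with the keys 'x' and 'y'."""
    i = find_best_iter(y_values)
    return {"x": x_values[i], "y": y_values[i]}


# Hypothetical optimization history of three iterations
x_history = [
    {"booster": "gbtree", "n_estimators": 1000},
    {"booster": "dart", "n_estimators": 1500},
    {"booster": "gbtree", "n_estimators": 2000},
]
y_history = [0.9, 0.4, 0.7]

best_index = find_best_iter(y_history)
best = find_best_xy(x_history, y_history)
print(best_index, best)
```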
- eval_sequence(params, **kwargs)[source]
kwargs: any additional keyword arguments for objective_fn
- eval_with_best()[source]
Find the best parameters and evaluate the objective_fn with them.
- Parameters:
return_model (bool) – If True, then the built objective_fn will be returned
- fit(*args, **kwargs)[source]
Makes and calls the underlying fit method
- Parameters:
**kwargs – any keyword arguments for the user-defined objective function
Example
>>> def objective_fn(a=2, b=5, **suggestions)->float:
...     # do something e.g. calculate validation score
...     val_score = 2.0
...     return val_score
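One way to picture how fixed keyword arguments and suggested hyperparameters can reach the same objective function is to bind the fixed ones first. This sketch is an assumption about the mechanism, not a description of what fit does internally; evaluate is a hypothetical helper and the objective body is made up for illustration.

```python
from functools import partial


def objective_fn(a=2, b=5, **suggestions):
    """Toy objective: a and b are fixed settings, x is a suggested hyperparameter."""
    return a * suggestions["x"] ** 2 + b


def evaluate(objective, suggestion, **fixed_kwargs):
    """Bind the fixed keyword arguments, then call the objective
    with the suggested hyperparameters."""
    bound_objective = partial(objective, **fixed_kwargs)
    return bound_objective(**suggestion)


# With the defaults a=2, b=5
default_score = evaluate(objective_fn, {"x": 3})

# With the fixed keyword arguments overridden
custom_score = evaluate(objective_fn, {"x": 3}, a=1, b=0)

print(default_score, custom_score)
```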
- classmethod from_gp_parameters(fpath: str, objective_fn)[source]
loads results saved from bayesian optimization
- hp_space() dict [source]
returns a dictionary whose values are hyperopt equivalent space instances.
- load_results(fname: str)[source]
loads the previously computed results. It should not be used after .fit()
- Parameters:
fname (str) – complete path of hpo_results.bin file e.g. path/to/hpo_results.bin
- model_for_gpmin(**kws)[source]
- This function can be called in two cases:
1. The user has made their own objective_fn.
2. We make the objective_fn using AI4Water and return the error.
In the first case, we just return what the user has provided.
- property num_iterations
- property objective_fn_is_dl
- property opt_path
- optuna_objective(**kwargs)[source]
objective function that will be used during the random search method.
- Parameters:
kwargs – keyword arguments in the user-defined objective function.
- optuna_study()[source]
Attempts to create an optuna Study instance so that optuna based plots can be generated.
Returns a Study if possible, otherwise None.
- property param_space
- plot_importance(save=True, show: bool = False, plot_type='box', with_optuna: bool = False, **tree_kws) Axes [source]
plots hyperparameter importance using fANOVA
- plot_parallel_coords(save=True, show=False, **kwargs)[source]
parallel coordinates of hyperparameters
- pre_calculated_results(resutls, from_gp_parameters=True)[source]
Loads the pre-calculated results i.e. x and y values which have been already evaluated.
- random_search(**kwargs)[source]
objective function that will be used during the random search method.
- Parameters:
kwargs – keyword arguments in the user-defined objective function.
- property random_state
- save_results(results, path: Optional[str] = None)[source]
saves the hpo results so that they can be loaded using load_results method.
- Parameters:
results – hpo results i.e. output of optimizer.fit()
path – path where to save the results
- property title
- property use_named_args
- property use_own
- property use_sklearn
- property use_skopt_bayes
- property use_skopt_gpmin
- property use_tpe
fANOVA
- class ai4water.hyperopt.fANOVA(X: Union[ndarray, DataFrame], Y: ndarray, dtypes: List[str], bounds: List[Optional[tuple]], parameter_names=None, cutoffs=(-inf, inf), n_estimators=64, max_depth=64, random_state=313, **rf_kws)[source]
Calculation of parameter importance using fANOVA (Hutter et al., 2014).
- Parameters:
X – input data of shape (n_iterations, n_parameters). For hyperparameter optimization, iterations represent the number of optimization iterations and parameters represent the number of hyperparameters
Y – objective value corresponding to X. Its length should be the same as that of X
dtypes (list) – list of strings determining the type of each hyperparameter. Allowed values are only categorical and numerical.
bounds (list) – list of tuples, where each tuple defines the lower and upper limit of the corresponding parameter
parameter_names (list) – names of features/parameters/hyperparameters
cutoffs (tuple) –
n_estimators (int) – number of trees
max_depth (int (default=64)) – maximum depth of trees
**rf_kws – keyword arguments to sklearn.ensemble.RandomForestRegressor
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from ai4water.hyperopt import fANOVA
>>> x = np.arange(20).reshape(10, 2).astype(float)
>>> y = np.linspace(1, 30, 10).astype(float)
>>> # X are hyperparameters and Y are objective function values at corresponding iterations
>>> f = fANOVA(X=x, Y=y,
...            bounds=[(-2, 20), (-5, 50)],
...            dtypes=["numerical", "numerical"],
...            random_state=313, max_depth=3)
>>> # calculate importance
>>> imp = f.feature_importance()
for categorical parameters
>>> x = pd.DataFrame(['2', '2', '3', '1', '1', '2', '2', '1', '3', '3', '3'], columns=['a'])
>>> x['b'] = ['3', '3', '1', '3', '1', '2', '4', '4', '3', '3', '4']
>>> y = np.linspace(-1., 1.0, len(x))
>>> f = fANOVA(X=x, Y=y, bounds=[None, None], dtypes=['categorical', 'categorical'],
...            random_state=313, max_depth=3, n_estimators=1)
>>> # calculate importance
>>> imp = f.feature_importance()
for mixed types
>>> x = pd.DataFrame(['2', '2', '3', '1', '1', '2', '2', '1', '3', '3', '3'], columns=['a'])
>>> x['b'] = np.arange(100, 100+len(x))
>>> y = np.linspace(-1., 2.0, len(x))
>>> f = fANOVA(X=x, Y=y, bounds=[None, (10, 150)], dtypes=['categorical', 'numerical'],
...            random_state=313, max_depth=5, n_estimators=5)
>>> # calculate importance
>>> imp = f.feature_importance()
- __init__(X: Union[ndarray, DataFrame], Y: ndarray, dtypes: List[str], bounds: List[Optional[tuple]], parameter_names=None, cutoffs=(-inf, inf), n_estimators=64, max_depth=64, random_state=313, **rf_kws)[source]
- set_cutoffs(cutoffs=(-inf, inf), quantile=None)[source]
Setting the cutoffs to constrain the input space
To properly do things like ‘improvement over default’, fANOVA supports cutoffs on the y values. These exclude the parts of the parameter space where the prediction is not within the provided cutoffs. This is a specialization of “Generalized Functional ANOVA Diagnostics for High Dimensional Functions of Dependent Variables” by Hooker.
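The effect of a cutoff can be pictured as a mask over predicted objective values. The sketch below only illustrates the idea of keeping predictions inside the cutoffs; it is not how fANOVA applies them internally. keep_within_cutoffs is a hypothetical helper, the prediction values are made up, and treating the limits as inclusive is an assumption.

```python
import math


def keep_within_cutoffs(predictions, cutoffs=(-math.inf, math.inf)):
    """Return a boolean mask that is True where a prediction lies
    inside the (lower, upper) cutoffs. Limits are treated as inclusive."""
    lower, upper = cutoffs
    return [lower <= p <= upper for p in predictions]


# Hypothetical predicted objective values for four regions of the parameter space
predictions = [0.2, 0.5, 0.9, 1.4]

# Default cutoffs keep everything
mask_all = keep_within_cutoffs(predictions)

# 'Improvement over default': keep only regions predicted to be no worse
# than a default configuration that scored 0.5 (lower is better)
mask_improved = keep_within_cutoffs(predictions, cutoffs=(-math.inf, 0.5))

print(mask_all)
print(mask_improved)
```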