models
DualAttentionModel
- class ai4water.tf_models.DualAttentionModel(enc_config: Optional[dict] = None, dec_config: Optional[dict] = None, teacher_forcing: bool = True, **kwargs)[source]
Bases:
ai4water.functional.Model
This is the Dual-Attention LSTM model of Qin et al., 2017. The code is adapted from this repository.
Example
>>> from ai4water import DualAttentionModel
>>> from ai4water.datasets import busan_beach
>>> data = busan_beach()
>>> model = DualAttentionModel(lookback=5,
...                            input_features=data.columns.tolist()[0:-1],
...                            output_features=data.columns.tolist()[-1:])
If you do not wish to feed the previous output as input to the model, you can set teacher_forcing to False. The drop_remainder argument must be set to True in such a case.
>>> model = DualAttentionModel(teacher_forcing=False, batch_size=4,
...                            drop_remainder=True, ts_args={'lookback':5})
>>> model.fit(data=data)
- __init__(enc_config: Optional[dict] = None, dec_config: Optional[dict] = None, teacher_forcing: bool = True, **kwargs)[source]
- Parameters
enc_config –
dictionary defining the configuration of the encoder/input attention. It must have the following keys (default values shown below); see the construction sketch after this parameter list:
n_h: 20
n_s: 20
m: 20
enc_lstm1_act: None
enc_lstm2_act: None
dec_config –
dictionary defining the configuration of the decoder/output attention. It must have the following three keys (default values shown below):
p: 30
n_hde0: None
n_sde0: None
teacher_forcing – Whether to use the previous target/observation as input or not. If yes, then the model will require 2 inputs: the first of shape (num_examples, lookback, num_inputs) and the second of shape (num_examples, lookback-1, 1). The second input is the target variable observed at the previous time step.
kwargs – The keyword arguments for ai4water's Model class
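A minimal construction sketch using the default key values documented above; the ts_args usage mirrors the example at the top of this class and is illustrative, not a recommended configuration:
>>> enc_config = {'n_h': 20, 'n_s': 20, 'm': 20,   # defaults documented above
...               'enc_lstm1_act': None, 'enc_lstm2_act': None}
>>> model = DualAttentionModel(enc_config=enc_config, ts_args={'lookback': 5})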
- get_attention_weights(layer_name: Optional[str] = None, data='training') numpy.ndarray [source]
- Parameters
layer_name (str, optional) – the name of attention layer. If not given, the final attention layer will be used.
data (str, optional) –
the data on which to make a forward pass to get attention weights. Possible values are
training
validation
test
- Return type
a numpy array of shape (num_examples, lookback, num_ins)
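A usage sketch, assuming the model has already been trained as in the example above:
>>> w = model.get_attention_weights(data='training')
>>> w.shape   # (num_examples, lookback, num_ins)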
- interpret(data='training', **kwargs)[source]
Interprets the underlying model. Call it after training.
- Returns
An instance of
ai4water.postprocessing.interpret.Interpret
class
Example
>>> from ai4water import Model
>>> from ai4water.datasets import busan_beach
>>> model = Model(model=...)
>>> model.fit(data=busan_beach())
>>> model.interpret()
- property mode: str
- one_decoder_attention_step(_h_de_prev, _s_de_prev, _h_en_all, t)[source]
- Parameters
_h_de_prev – previous hidden state
_s_de_prev – previous cell state
_h_en_all – tensor of shape (None, T, m); n is the length of the input series at time t, T is the length of the time series
t – int, timestep
- Returns
attention weights of x_t; there are n of them in total and they sum to 1
- one_encoder_attention_step(h_prev, s_prev, x, t, suf: str = '1')[source]
- Parameters
h_prev – previous hidden state
s_prev – previous cell state
x – tensor of shape (T, n); n is the length of the input series at time t, T is the length of the time series
t – time-step
suf – str, Suffix to be attached to names
- Returns
attention weights of x_t; there are n of them in total and they sum to 1
- plot_act_along_inputs(layer_name: str, name: Optional[str] = None, vmin=None, vmax=None, data='training', show=False)[source]
TemporalFusionTransformer
- class ai4water.models.tensorflow.TemporalFusionTransformer(*args, **kwargs)[source]
Bases:
keras.engine.base_layer.Layer
Implements the model of https://arxiv.org/pdf/1912.09363.pdf. This layer applies variable selection three times: first on static inputs, then on encoder inputs and then on decoder inputs. The corresponding weights are called static_weights, historical_weights and future_weights respectively.
1, 11, 21, 31, a
2, 12, 22, 32, b
3, 13, 23, 33, c
4, 14, 24, 34, d
- Parameters
hidden_units (int) – determines the depth/weight matrices size in TemporalFusionTransformer.
num_encoder_steps (int) – lookback steps used in the model.
num_heads (int) – must be >= 1; number of attention heads to be used in MultiheadAttention layer.
num_inputs (int) – number of input features
total_time_steps (int) – must be greater than num_encoder_steps. This is the sum of lookback steps and forecast length, where forecast length is the number of horizons to be predicted.
known_categorical_inputs (list) – a,b,c
input_obs_loc –
unknown_inputs –
static_inputs –
static_input_loc (None/list) – location of static inputs
category_counts (list) – Number of categories per categorical variable
use_cnn (bool) – whether to use a CNN or not. If False, an LSTM will be used; otherwise a 1D CNN with "causal" padding will be used.
kernel_size (int) – kernel size for 1D CNN. Only valid if use_cnn is True.
use_cudnn (bool) – default False, Whether to use Keras CuDNNLSTM or standard LSTM layers
dropout_rate (float) – default 0.1. Amount of dropout (>= 0 and <= 1) to be used in GRNs.
future_inputs (bool) – whether the given data contains future known observations or not.
return_attention_components (bool) – If True, then this layer (upon its call) will return outputs plus attention components. Attention components are a dictionary whose values are numpy arrays.
return_sequences (bool) – if True, then the output and attention weights will cover both encoder_length/lookback and decoder_length/forecast_len steps. Otherwise predictions for only decoder_length will be returned.
Example
>>> import tensorflow as tf
>>> params = {'num_inputs': 3, 'total_time_steps': 192, 'known_regular_inputs': [0, 1, 2]}
>>> output_size = 1
>>> quantiles = [0.25, 0.5, 0.75]
>>> layers = {
>>>     "Input": {"config": {"shape": (params['total_time_steps'], params['num_inputs']), 'name': "Model_Input"}},
>>>     "TemporalFusionTransformer": {"config": params},
>>>     "lambda": {"config": tf.keras.layers.Lambda(lambda _x: _x[Ellipsis, -1, :])},
>>>     "Dense": {"config": {"units": output_size * len(quantiles)}},
>>>     'Reshape': {'target_shape': (3, 1)}}
- __init__(hidden_units: int, num_encoder_steps: int, num_heads: int, num_inputs: int, total_time_steps: int, known_categorical_inputs, static_input_loc, category_counts, known_regular_inputs, input_obs_loc, use_cnn: bool = False, kernel_size: Optional[int] = None, use_cudnn: bool = False, dropout_rate: float = 0.1, future_inputs: bool = False, return_attention_components: bool = False, return_sequences: bool = False, **kwargs)[source]
- get_tft_embeddings(all_inputs)[source]
Transforms raw inputs to embeddings.
Applies linear transformation onto continuous variables and uses embeddings for categorical variables.
- Parameters
all_inputs – Inputs to transform, of shape [batch_size, time_steps, input_features], where time_steps includes both lookback and forecast. The input_features dimension of all_inputs can contain the following inputs: static_inputs, obs_inputs, categorical_inputs, regular_inputs.
- Returns
Tensors for transformed inputs:
unknown_inputs
known_combined_layer: contains regular inputs and categorical inputs (all known)
obs_inputs: target values to be used as inputs
static_inputs
NBeats
- class ai4water.models.tensorflow.NBeats(*args, **kwargs)[source]
Bases:
keras.engine.base_layer.Layer
This implementation is the same as that of Philippe Remy with a few modifications. Here NBeats can be used as a layer. The output shape will be (batch_size, forecast_length, input_dim). Some other changes have also been made to make this layer compatible with ai4water.
Example
>>> import numpy as np
>>> from ai4water import Model
>>> x = np.random.random((100, 10, 3))
>>> y = np.random.random((100, 1))
...
>>> model = Model(model={"layers":
>>>     {"Input": {"shape": (10, 3)},
>>>      "NBeats": {"lookback": 10, "forecast_length": 1, "num_exo_inputs": 2},
>>>      "Flatten": {},
>>>      "Reshape": {"target_shape": (1,1)}}},
>>>     ts_args={'lookback':10})
...
>>> model.fit(x=x, y=y.reshape(-1,1,1))
- __init__(units: int = 256, lookback: int = 10, forecast_len: int = 2, stack_types=('trend', 'seasonality'), nb_blocks_per_stack=3, thetas_dim=(4, 8), share_weights_in_stack=False, nb_harmonics=None, num_inputs=1, num_exo_inputs=0, **kwargs)[source]
Initializes the NBeats layer
- Parameters
units – Number of units in NBeats layer. It determines the size of NBeats.
lookback – Number of historical time-steps used to predict next value
forecast_len –
stack_types –
nb_blocks_per_stack –
thetas_dim –
share_weights_in_stack –
nb_harmonics –
num_inputs –
num_exo_inputs –
kwargs –
- GENERIC_BLOCK = 'generic'
- SEASONALITY_BLOCK = 'seasonality'
- TREND_BLOCK = 'trend'
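These block-type constants correspond to the strings accepted by the stack_types argument. A minimal sketch of constructing the layer with a custom stack configuration (values are illustrative, not recommended settings):
>>> nbeats = NBeats(units=64, lookback=10, forecast_len=1,
...                 stack_types=('generic', 'trend'), thetas_dim=(4, 4))   # illustrative values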
HARHNModel
- class ai4water.pytorch_models.HARHNModel(*args, **kwargs)[source]
Bases:
ai4water.main.Model
- __init__(use_cuda=True, teacher_forcing=True, **kwargs)[source]
Initializes the layers of the NN model using the initialize_layers method. All other input arguments go to BaseModel.
- forward(*inputs: Any, **kwargs: Any)[source]
Implements the forward pass for pytorch based NN models.
- initialize_layers(layers_config: dict, inputs=None)[source]
Initializes the layers/weights/variables which are to be used in forward or call method.
- Parameters
layers_config (dict) – python dictionary to define the neural network. For details see https://ai4water.readthedocs.io/en/latest/build_dl_models.html
inputs – if None, it is assumed that an Input layer either exists in layers_config or will be created within this method before adding any other layer. If not None, then it must be an Input layer and the remaining NN architecture will be built as defined in layers_config. This can be handy when we want to use this method several times to build a complex or parallel NN structure. Avoid 'Input' in layer names. A sketch of a layers_config dictionary follows below.
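A minimal, hypothetical sketch of a layers_config dictionary; the layer names follow the declarative style used elsewhere in this documentation, and the exact names accepted by the pytorch backend may differ (see the linked documentation):
>>> layers_config = {          # hypothetical layer names, for illustration only
...     "Input": {"shape": (5, 10)},
...     "LSTM": {"units": 32},
...     "Dense": {"units": 1}}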
IMVModel
- class ai4water.pytorch_models.IMVModel(*args, **kwargs)[source]
Bases:
ai4water.pytorch_models.HARHNModel
- __init__(*args, teacher_forcing=False, **kwargs)[source]
Initializes the layers of the NN model using the initialize_layers method. All other input arguments go to BaseModel.
- forward(*inputs: Any, **kwargs: Any)[source]
Implements the forward pass for pytorch based NN models.
- initialize_layers(layers_config: dict, inputs=None)[source]
Initializes the layers/weights/variables which are to be used in forward or call method.
- Parameters
layers_config (dict) – python dictionary to define the neural network. For details see https://ai4water.readthedocs.io/en/latest/build_dl_models.html
inputs – if None, it is assumed that an Input layer either exists in layers_config or will be created within this method before adding any other layer. If not None, then it must be an Input layer and the remaining NN architecture will be built as defined in layers_config. This can be handy when we want to use this method several times to build a complex or parallel NN structure. Avoid 'Input' in layer names.
- interpret(data='training', x=None, annotate=True, vmin=None, vmax=None, **bar_kws)[source]
Interprets the underlying model. Call it after training.
- Returns
An instance of
ai4water.postprocessing.interpret.Interpret
class
Example
>>> from ai4water import Model
>>> from ai4water.datasets import busan_beach
>>> model = Model(model=...)
>>> model.fit(data=busan_beach())
>>> model.interpret()
MLP
- ai4water.models.MLP(units: Union[int, list] = 32, num_layers: int = 1, input_shape: Optional[tuple] = None, output_features: int = 1, activation: Optional[Union[str, list]] = None, dropout: Optional[Union[float, list]] = None, mode: str = 'regression', output_activation: Optional[str] = None, **kwargs) dict [source]
helper function to make a multi layer perceptron model. This model consists of stacked Dense layers. The number of Dense layers is defined by num_layers. Each layer can be optionally followed by a Dropout layer.
- Parameters
units (Union[int, list], default=32) – number of units in Dense layer
num_layers (int, optional (default=1)) – number of Dense layers to use, excluding the output layer.
input_shape (tuple, optional (default=None)) – shape of input tensor to the model. If specified, it should exclude batch_size for example if model takes inputs (num_examples, num_features) then we should define the shape as (num_features,). The batch_size dimension is always None.
output_features (int, optional) – number of output features from the network
activation (Union[str, list], optional) – activation function to use.
dropout (Union[float, list], optional) – dropout to use in Dense layer
mode (str, optional) – either regression or classification
output_activation (str, optional (default=None)) – activation of the output layer. If not given and the mode is classification, then the activation of the output layer is decided based upon the output_features argument. In such a case, for binary classification, sigmoid with 1 output neuron is preferred. Therefore, even if output_features is 2, the last layer will have 1 neuron with sigmoid activation. The user can set softmax for 2 output_features as well (binary classification), but this is superfluous and slightly more expensive. For multiclass, the last layer will have neurons equal to output_features and softmax as activation.
**kwargs – any additional keyword arguments for the Dense layer
- Returns
a dictionary with ‘layers’ as key which can be fed to ai4water’s Model
- Return type
dict
Examples
>>> from ai4water import Model
>>> from ai4water.models import MLP
>>> from ai4water.datasets import busan_beach
>>> data = busan_beach()
>>> input_features = data.columns.tolist()[0:-1]
>>> output_features = data.columns.tolist()[-1:]
... # build a basic MLP
>>> MLP(32)
... # MLP with 3 Dense layers
>>> MLP(32, 3)
... # we can specify input shape as 3d (first dimension is always None)
>>> MLP(32, 3, (5, 10))
... # we can also specify number of units for each layer
>>> MLP([32, 16, 8], 3, (10, 1))
... # we can feed any argument which is accepted by Dense layer
>>> mlp = MLP(32, 3, (10, ), use_bias=True, activation="relu")
... # we can feed the output of MLP to ai4water's Model
>>> model = Model(model=mlp, input_features=input_features,
>>>               output_features=output_features)
>>> model.fit(data=data)
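For classification, a hedged sketch using the mode and output_features arguments described above (the number of classes here is arbitrary):
>>> MLP(32, 2, (10,), mode="classification", output_features=4)   # 4 classes, illustrative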
LSTM
- ai4water.models.LSTM(units: Union[int, list] = 32, num_layers: int = 1, input_shape: Optional[tuple] = None, output_features: int = 1, activation: Optional[Union[str, list]] = None, dropout: Optional[Union[float, list]] = None, mode: str = 'regression', output_activation: Optional[str] = None, **kwargs)[source]
helper function to make LSTM Model
- Parameters
units (Union[int, list], optional (default 32)) – number of units in LSTM layer
num_layers – number of lstm layers to use
input_shape (tuple, optional (default=None)) – shape of input tensor to the model. If specified, it should exclude batch_size for example if model takes inputs (num_examples, lookback, num_features) then we should define the shape as (lookback, num_features). The batch_size dimension is always None.
output_features (int, optional (default=1)) – number of output features. If mode is classification, this refers to the number of classes.
activation (Union[str, list], optional) – activation function to use in LSTM
dropout – if > 0.0, a dropout layer is added after each LSTM layer
mode (str, optional) – either regression or classification
output_activation (str, optional (default=None)) – activation of the output layer. If not given and the mode is classification, then the activation of the output layer is decided based upon the output_features argument. In such a case, for binary classification, sigmoid with 1 output neuron is preferred. Therefore, even if output_features is 2, the last layer will have 1 neuron with sigmoid activation. The user can set softmax for 2 output_features as well (binary classification), but this is superfluous and slightly more expensive. For multiclass, the last layer will have neurons equal to output_features and softmax as activation.
**kwargs – any keyword argument for the LSTM layer
- Returns
a dictionary with ‘layers’ as key
- Return type
dict
Examples
>>> from ai4water import Model
>>> from ai4water.models import LSTM
>>> from ai4water.datasets import busan_beach
>>> data = busan_beach()
>>> input_features = data.columns.tolist()[0:-1]
>>> output_features = data.columns.tolist()[-1:]
... # a simple LSTM model with 32 neurons/units
>>> LSTM(32)
... # to build a model with stacking of LSTM layers
>>> LSTM(32, num_layers=2)
... # we can build ai4water's model and train it
>>> lstm = LSTM(32)
>>> model = Model(model=lstm, input_features=input_features,
>>>               output_features=output_features, ts_args={"lookback": 5})
>>> model.fit(data=data)
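A further sketch combining per-layer units with the dropout and activation arguments described above (values are illustrative):
>>> LSTM([64, 32], num_layers=2, dropout=0.3, activation="tanh")   # illustrative values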
CNN
- ai4water.models.CNN(filters: Union[int, list] = 32, kernel_size: Union[int, tuple, list] = 3, convolution_type: str = '1D', num_layers: int = 1, padding: Union[str, list] = 'same', strides: Union[int, list] = 1, pooling_type: Optional[Union[str, list]] = None, pool_size: Union[int, list] = 2, batch_normalization: Optional[Union[bool, list]] = None, activation: Optional[Union[str, list]] = None, dropout: Optional[Union[float, list]] = None, input_shape: Optional[tuple] = None, output_features: int = 1, mode: str = 'regression', output_activation: Optional[str] = None, **kwargs) dict [source]
helper function to make a convolutional neural network based model.
- Parameters
filters (Union[int, list], optional) – number of filters in the convolution layer. If given as a list, its length should be equal to num_layers.
kernel_size (Union[int, tuple, list], optional) – kernel size in (each) convolution layer
convolution_type (str, optional (default="1D")) – either 1D, 2D or 3D
num_layers (int, optional) – number of convolution layers to use. Should be > 0.
padding (Union[str, list], optional) – padding to use in (each) convolution layer
strides (Union[int, list], optional) – strides to use in (each) convolution layer
pooling_type (str, optional) – either “MaxPool” or “AveragePooling”
pool_size (Union[int, list], optional) – only valid if pooling_type is not None
batch_normalization – whether to use batch_normalization after each convolution or convolution+pooling layer. If true, a batch_norm layer is added.
activation (Union[str, list], optional) – activation function to use in convolution layer
dropout (Union[float, list], optional) – if > 0.0, a dropout layer is added after each convolution (or convolution+pooling) layer
input_shape (tuple, optional (default=None)) – shape of input tensor to the model. If specified, it should exclude batch_size for example if model takes inputs (num_examples, lookback, num_features) then we should define the shape as (lookback, num_features). The batch_size dimension is always None.
output_features (int, optional (default=1)) – number of output features. If mode is classification, this refers to the number of classes.
mode (str, optional) – either regression or classification
output_activation (str, optional (default=None)) – activation of the output layer. If not given and the mode is classification, then the activation of the output layer is decided based upon the output_features argument. In such a case, for binary classification, sigmoid with 1 output neuron is preferred. Therefore, even if output_features is 2, the last layer will have 1 neuron with sigmoid activation. The user can set softmax for 2 output_features as well (binary classification), but this is superfluous and slightly more expensive. For multiclass, the last layer will have neurons equal to output_features and softmax as activation.
**kwargs – any keyword argument for the Convolution layer
- Returns
a dictionary with ‘layers’ as key
- Return type
dict
Examples
>>> CNN(32, 2, "1D", input_shape=(5, 10))
>>> CNN(32, 2, "1D", pooling_type="MaxPool", input_shape=(5, 10))
CNNLSTM
- ai4water.models.CNNLSTM(input_shape: tuple, sub_sequences=3, cnn_layers: int = 2, lstm_layers: int = 1, filters: Union[int, list] = 32, kernel_size: Union[int, tuple, list] = 3, max_pool: bool = False, units: Union[int, tuple, list] = 32, output_features: int = 1, mode: str = 'regression', output_activation: Optional[str] = None) dict [source]
helper function to make CNNLSTM model. It adds one or more 1D convolutional layers before one or more LSTM layers.
- Parameters
input_shape (tuple) – shape of input tensor to the model. If specified, it should exclude batch_size for example if model takes inputs (num_examples, lookback, num_features) then we should define the shape as (lookback, num_features). The batch_size dimension is always None.
sub_sequences (int) – number of sub_sequences in which to divide the input before applying Conv1D on it.
cnn_layers (int , optional (default=2)) – number of cnn layers
lstm_layers – number of lstm layers
filters (Union[int, list], optional) – number of filters in (each) cnn layer
kernel_size (Union[int, tuple, list], optional) – kernel size in (each) cnn layer
max_pool (bool, optional (default=False)) – whether to use max_pool after every cnn layer or not
units (Union[int, list], optional (default=32)) – number of units in (each) lstm layer
output_features (int, optional (default=1)) – number of output features. If mode is classification, this refers to the number of classes.
mode (str, optional (default="regression")) – either regression or classification
output_activation (str, optional (default=None)) – activation of the output layer. If not given and the mode is classification, then the activation of the output layer is decided based upon the output_features argument. In such a case, for binary classification, sigmoid with 1 output neuron is preferred. Therefore, even if output_features is 2, the last layer will have 1 neuron with sigmoid activation. The user can set softmax for 2 output_features as well (binary classification), but this is superfluous and slightly more expensive. For multiclass, the last layer will have neurons equal to output_features and softmax as activation.
- Returns
a dictionary with layers as key
- Return type
dict
Examples
>>> from ai4water.models import CNNLSTM
>>> model = CNNLSTM(input_shape=(9, 13), sub_sequences=3)
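The returned dictionary can likewise be passed to ai4water's Model; a hedged sketch, assuming busan_beach provides 13 input features and one output:
>>> from ai4water import Model
>>> from ai4water.datasets import busan_beach
>>> data = busan_beach()
>>> model = Model(model=CNNLSTM(input_shape=(9, 13), sub_sequences=3),
>>>               input_features=data.columns.tolist()[0:-1],
>>>               output_features=data.columns.tolist()[-1:],
>>>               ts_args={"lookback": 9})
>>> model.fit(data=data)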
TCN
- ai4water.models.TCN(input_shape, filters: int = 32, kernel_size: int = 2, nb_stacks: int = 1, dilations=[1, 2, 4, 8, 16, 32], output_features: int = 1, mode='regression', output_activation: Optional[str] = None, **kwargs) dict [source]
helper function for building temporal convolution network
- Parameters
input_shape (tuple) – shape of input tensor to the model. This shape should exclude batch_size; for example, if the model takes inputs of shape (num_examples, lookback, num_features) then we should define the shape as (lookback, num_features). The batch_size dimension is always None.
filters (int, optional (default=32)) – number of filters
kernel_size (int, optional (default=2)) – kernel size
nb_stacks (int, optional (default=1)) – number of stacks of the tcn layer
dilations – dilation rate
output_features (int, optional) – number of output features. If mode is classification, this refers to the number of classes.
mode (str, optional (default="regression")) – either regression or classification
output_activation (str, optional (default=None)) – activation of the output layer. If not given and the mode is classification, then the activation of the output layer is decided based upon the output_features argument. In such a case, for binary classification, sigmoid with 1 output neuron is preferred. Therefore, even if output_features is 2, the last layer will have 1 neuron with sigmoid activation. The user can set softmax for 2 output_features as well (binary classification), but this is superfluous and slightly more expensive. For multiclass, the last layer will have neurons equal to output_features and softmax as activation.
**kwargs – any additional keyword argument
- Returns
a dictionary with layers as key
- Return type
dict
Examples
>>> TCN((5, 10), 32)
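A minimal sketch of feeding the returned configuration to ai4water's Model (shapes are illustrative):
>>> from ai4water import Model
>>> tcn = TCN((5, 10), 32)   # illustrative shape
>>> model = Model(model=tcn, ts_args={"lookback": 5})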
LSTMAutoEncoder
- ai4water.models.LSTMAutoEncoder(input_shape: tuple, encoder_layers: int = 1, decoder_layers: int = 1, encoder_units: Union[int, list] = 32, decoder_units: Union[int, list] = 32, output_features: int = 1, prediction_mode: bool = True, mode: str = 'regression', output_activation: Optional[str] = None, **kwargs) dict [source]
helper function to make LSTM based AutoEncoder model.
- Parameters
input_shape (tuple) – shape of input tensor to the model. This shape should exclude batch_size; for example, if the model takes inputs of shape (num_examples, lookback, num_features) then we should define the shape as (lookback, num_features). The batch_size dimension is always None.
encoder_layers (int, optional (default=1)) – number of encoder LSTM layers
decoder_layers (int, optional (default=1)) – number of decoder LSTM layers
encoder_units (Union[int, list], optional, (default=32)) – number of units in (each) encoder LSTM
decoder_units (Union[int, list], optional, (default=32)) – number of units in (each) decoder LSTM
prediction_mode (bool, optional (default=True)) – if True, the autoencoder is built for prediction, otherwise for reconstruction
output_features (int, optional) – number of output features. If mode is classification, this refers to the number of classes.
mode (str, optional (default="regression")) – either regression or classification
output_activation (str, optional (default=None)) – activation of the output layer. If not given and the mode is classification, then the activation of the output layer is decided based upon the output_features argument. In such a case, for binary classification, sigmoid with 1 output neuron is preferred. Therefore, even if output_features is 2, the last layer will have 1 neuron with sigmoid activation. The user can set softmax for 2 output_features as well (binary classification), but this is superfluous and slightly more expensive. For multiclass, the last layer will have neurons equal to output_features and softmax as activation.
**kwargs –
- Returns
a dictionary with layers as key
- Return type
dict
Examples
>>> LSTMAutoEncoder((5, 10), 2, 2, 32, 32)
>>> LSTMAutoEncoder((5, 10), 2, 2, [64, 32], [32, 64])
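A further sketch with prediction_mode set to False, i.e. building the autoencoder for reconstruction rather than prediction (values are illustrative):
>>> LSTMAutoEncoder((5, 10), 1, 1, 32, 32, prediction_mode=False)   # illustrative values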
TFT
- ai4water.models.TFT(input_shape, hidden_units: int = 32, num_heads: int = 3, dropout: float = 0.1, output_features: int = 1, use_cudnn: bool = False, mode: str = 'regression', output_activation: Optional[str] = None) dict [source]
helper function for temporal fusion transformer based model
- Parameters
input_shape (tuple) – shape of input tensor to the model. This shape should exclude batch_size; for example, if the model takes inputs of shape (num_examples, lookback, num_features) then we should define the shape as (lookback, num_features). The batch_size dimension is always None.
hidden_units (int, optional (default=32)) – number of hidden units
num_heads (int, optional (default=3)) – number of attention heads
dropout (float, optional (default=0.1)) – dropout rate
output_features (int, optional (default=1)) – number of output features. If mode is classification, this refers to the number of classes.
use_cudnn (bool, optional (default=False)) – whether to use the CuDNN implementation of LSTM or not
mode (str, optional (default="regression")) – either regression or classification
output_activation (str, optional (default=None)) – activation of the output layer. If not given and the mode is classification, then the activation of the output layer is decided based upon the output_features argument. In such a case, for binary classification, sigmoid with 1 output neuron is preferred. Therefore, even if output_features is 2, the last layer will have 1 neuron with sigmoid activation. The user can set softmax for 2 output_features as well (binary classification), but this is superfluous and slightly more expensive. For multiclass, the last layer will have neurons equal to output_features and softmax as activation.
- Returns
a dictionary with layers as key
- Return type
dict
Examples
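A minimal sketch, assuming 14 lookback steps and 13 input features (these values are illustrative, not from the original documentation):
>>> TFT(input_shape=(14, 13), hidden_units=32)   # illustrative shape and units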