TensorFlow Layers

MCLSTM

class ai4water.models._tensorflow.MCLSTM(*args, **kwargs)[source]

Bases: Layer

Mass-Conserving LSTM model from Hoedt et al. [1].

This implementation follows NeuralHydrology’s implementation of MCLSTM, with some changes: 1) the reduced sum over the units is not performed, 2) a time_major argument is added, 3) there is no Embedding implementation.

Examples

>>> from ai4water.models._tensorflow import MCLSTM
>>> import tensorflow as tf
>>> inputs = tf.range(150, dtype=tf.float32)
>>> inputs = tf.reshape(inputs, (10, 5, 3))
>>> mc = MCLSTM(1, 2, 8, 1)
>>> h = mc(inputs)  # (batch, units)
...
>>> mc = MCLSTM(1, 2, 8, 1, return_sequences=True)
>>> h = mc(inputs)  # (batch, lookback, units)
...
>>> mc = MCLSTM(1, 2, 8, 1, return_state=True)
>>> _h, _o, _c = mc(inputs)  # (batch, lookback, units)
...
>>> mc = MCLSTM(1, 2, 8, 1, return_state=True, return_sequences=True)
>>> _h, _o, _c = mc(inputs)  # (batch, lookback, units)
...
... # with time_major as True
>>> inputs = tf.range(150, dtype=tf.float32)
>>> inputs = tf.reshape(inputs, (5, 10, 3))
>>> mc = MCLSTM(1, 2, 8, 1, time_major=True)
>>> _h = mc(inputs)  # (batch, units)
...
>>> mc = MCLSTM(1, 2, 8, 1, time_major=True, return_sequences=True)
>>> _h = mc(inputs)  # (lookback, batch, units)
...
>>> mc = MCLSTM(1, 2, 8, 1, time_major=True, return_state=True)
>>> _h, _o, _c = mc(inputs)  # (batch, units), ..., (lookback, batch, units)
...
... # end to end keras Model
>>> from tensorflow.keras.layers import Dense, Input
>>> from tensorflow.keras.models import Model
>>> import numpy as np
...
>>> inp = Input(batch_shape=(32, 10, 3))
>>> lstm = MCLSTM(1, 2, 8)(inp)
>>> out = Dense(1)(lstm)
...
>>> model = Model(inputs=inp, outputs=out)
>>> model.compile(loss='mse')
...
>>> x = np.random.random((320, 10, 3))
>>> y = np.random.random((320, 1))
>>> h = model.fit(x=x, y=y)

References

__init__(num_mass_inputs, dynamic_inputs, units, num_targets=1, time_major: bool = False, return_sequences: bool = False, return_state: bool = False, name='MCLSTM', **kwargs)[source]
Parameters:
  • num_targets (int) – number of inputs for which the mass balance is to be conserved.

  • dynamic_inputs – number of inputs other than mass_targets

  • units – hidden size, determines the size of weight matrix

  • time_major (bool, optional (default=False)) – if True, the data is expected to be of shape (lookback, batch_size, input_features); otherwise, the data is expected to be of shape (batch_size, lookback, input_features)

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the keras API: in the keras API you can pass masking support to layers as additional arguments, whereas tf.keras provides the compute_mask() method to support masking.

Parameters:
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns:

A tensor or list/tuple of tensors.

Conditionalize

class ai4water.models._tensorflow.Conditionalize(*args, **kwargs)[source]

Bases: Layer

Mimics the behaviour of cond_rnn by Philipperemy, but puts the conditioning logic in a separate layer so that it is easier to use.

Example

>>> from ai4water.models._tensorflow import Conditionalize
>>> from tensorflow.keras.layers import Input, LSTM
>>> i = Input(shape=(10, 3))
>>> raw_conditions = Input(shape=(14,))
>>> processed_conds = Conditionalize(32)([raw_conditions, raw_conditions, raw_conditions])
>>> rnn = LSTM(32)(i, initial_state=[processed_conds, processed_conds])

This layer can also be used in an ai4water model when defining the model using the declarative model definition style:

>>> from ai4water import Model
>>> import numpy as np
>>> model = Model(model={"layers": {
...    "Input": {"shape": (10, 3)},
...    "Input_cat": {"shape": (10,)},
...    "Conditionalize": {"config": {"units": 32, "name": "h_state"},
...                       "inputs": "Input_cat"},
...    "LSTM": {"config": {"units": 32},
...             "inputs": "Input",
...                   'call_args': {'initial_state': ['h_state', 'h_state']}},
...    "Dense": {"units": 1}}},
...    ts_args={"lookback": 10}, verbosity=0, epochs=1)
... # define the input and call the .fit method
>>> x1 = np.random.random((100, 10, 3))
>>> x2 = np.random.random((100, 10))
>>> y = np.random.random(100)
>>> h = model.fit(x=[x1, x2], y=y)
__init__(units, max_num_cond=10, use_bias: bool = True, **kwargs)[source]

EALSTM

class ai4water.models._tensorflow.EALSTM(*args, **kwargs)[source]

Bases: Layer

Entity Aware LSTM as proposed by Kratzert et al., 2019 [1]

The difference here is that a Dense layer is not applied on the cell state, as is done in the original implementation in NeuralHydrology [2]. This is left to the user’s discretion.

Examples

>>> from ai4water.models._tensorflow import EALSTM
>>> import tensorflow as tf
>>> batch_size, lookback, num_dyn_inputs, num_static_inputs, units = 10, 5, 3, 2, 8
>>> inputs = tf.range(batch_size*lookback*num_dyn_inputs, dtype=tf.float32)
>>> inputs = tf.reshape(inputs, (batch_size, lookback, num_dyn_inputs))
>>> stat_inputs = tf.range(batch_size*num_static_inputs, dtype=tf.float32)
>>> stat_inputs = tf.reshape(stat_inputs, (batch_size, num_static_inputs))
>>> lstm = EALSTM(units, num_static_inputs)
>>> h_n = lstm(inputs, stat_inputs)  # -> (batch_size, units)
...
... # with return sequences
>>> lstm = EALSTM(units, num_static_inputs, return_sequences=True)
>>> h_n = lstm(inputs, stat_inputs)  # -> (batch, lookback, units)
...
... # with return sequences and return_state
>>> lstm = EALSTM(units, num_static_inputs, return_sequences=True, return_state=True)
>>> h_n, [c_n, y_hat] = lstm(inputs, stat_inputs)  # -> (batch, lookback, units), [(), ()]
...
... # end to end Keras model
>>> from tensorflow.keras.models import Model
>>> from tensorflow.keras.layers import Input, Dense
>>> import numpy as np
>>> inp_dyn = Input(batch_shape=(batch_size, lookback, num_dyn_inputs))
>>> inp_static = Input(batch_shape=(batch_size, num_static_inputs))
>>> lstm = EALSTM(units, num_static_inputs)(inp_dyn, inp_static)
>>> out = Dense(1)(lstm)
>>> model = Model(inputs=[inp_dyn, inp_static], outputs=out)
>>> model.compile(loss='mse')
>>> print(model.summary())
... # generate hypothetical data and train it
>>> dyn_x = np.random.random((100, lookback, num_dyn_inputs))
>>> static_x = np.random.random((100, num_static_inputs))
>>> y = np.random.random((100, 1))
>>> h = model.fit(x=[dyn_x, static_x], y=y, batch_size=batch_size)

References

__init__(units: int, num_static_inputs: int, use_bias: bool = True, activation='tanh', recurrent_activation='sigmoid', static_activation='sigmoid', kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', static_initializer='glorot_uniform', kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, static_constraint=None, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, static_regularizer=None, return_state=False, return_sequences=False, time_major=False, **kwargs)[source]
Parameters:
  • units (int) – number of units

  • num_static_inputs (int) – number of static features

  • static_activation – activation function for static input gate

  • static_regularizer – regularizer for static input gate

  • static_constraint – constraint for static input gate

  • static_initializer – initializer for static input gate

build(input_shape)[source]

kernel, recurrent_kernel and bias are initialized for 3 gates instead of the 4 gates of the original LSTM
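
For intuition on the 3-gate layout, below is a conceptual sketch of one EA-LSTM step following Kratzert et al., 2019. It is not this layer’s actual implementation, and all variable names are illustrative only: the input gate is computed from the static inputs alone, so the kernel, recurrent_kernel and bias only need to cover the forget, candidate and output gates.

>>> import tensorflow as tf
>>> units, num_dyn, num_static, batch = 8, 3, 2, 4
>>> x_t = tf.random.normal((batch, num_dyn))       # dynamic input at one time step
>>> x_s = tf.random.normal((batch, num_static))    # static input
>>> h_tm1 = tf.zeros((batch, units))               # previous hidden state
>>> c_tm1 = tf.zeros((batch, units))               # previous cell state
>>> W_s = tf.random.normal((num_static, units))    # static (input gate) kernel
>>> W = tf.random.normal((num_dyn, 3 * units))     # kernel for f, g, o gates
>>> U = tf.random.normal((units, 3 * units))       # recurrent kernel for f, g, o gates
>>> i = tf.sigmoid(x_s @ W_s)                      # input gate from static features only
>>> f, g, o = tf.split(x_t @ W + h_tm1 @ U, 3, axis=-1)
>>> f, g, o = tf.sigmoid(f), tf.tanh(g), tf.sigmoid(o)
>>> c_t = f * c_tm1 + i * g                        # new cell state
>>> h_t = o * tf.tanh(c_t)                         # new hidden state, (batch, units)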

call(inputs, static_inputs, initial_state=None, **kwargs)[source]
Parameters:
  • static_inputs – tensor of shape (batch, num_static_inputs)

cell(inputs, i, states)[source]

TransformerBlocks

class ai4water.models._tensorflow.TransformerBlocks(*args, **kwargs)[source]

Bases: Layer

This layer stacks Transformers on top of each other.

Example

>>> import numpy as np
>>> from tensorflow.keras.models import Model
>>> from tensorflow.keras.layers import Input, Dense
>>> from ai4water.models._tensorflow import TransformerBlocks
>>> inp = Input(shape=(10, 32))
>>> out, _ = TransformerBlocks(4, 4, 32)(inp)
>>> out = Dense(1)(out)
>>> model = Model(inputs=inp, outputs=out)
>>> model.compile(optimizer="Adam", loss="mse")
>>> x = np.random.random((100, 10, 32))
>>> y = np.random.random(100)
>>> h = model.fit(x,y)
__init__(num_blocks: int, num_heads: int, embed_dim: int, name: str = 'TransformerBlocks', **kwargs)[source]
Parameters:
  • num_blocks (int) – number of Transformer blocks to stack on top of each other

  • num_heads (int) – number of attention heads

  • embed_dim (int) – embedding dimension

  • **kwargs – additional keyword arguments for ai4water.models.tensorflow.Transformer

get_config() dict[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.
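
As a minimal usage sketch (using only the constructor arguments documented above), the layer can be rebuilt from its config via the standard Keras from_config mechanism; the rebuilt layer has the same configuration but fresh, untrained weights:

>>> blocks = TransformerBlocks(num_blocks=4, num_heads=4, embed_dim=32)
>>> config = blocks.get_config()                     # serializable Python dict
>>> rebuilt = TransformerBlocks.from_config(config)  # same config, fresh weights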

Transformer

class ai4water.models._tensorflow.Transformer(*args, **kwargs)[source]

Bases: Layer

A basic transformer block consisting of LayerNormalization -> Add -> MultiheadAttention -> MLP.

Example

>>> import numpy as np
>>> from tensorflow.keras.models import Model
>>> from tensorflow.keras.layers import Input, Dense
>>> from ai4water.models._tensorflow import Transformer
>>> inp = Input(shape=(10, 32))
>>> out, _ = Transformer(4, 32)(inp)
>>> out = Dense(1)(out)
>>> model = Model(inputs=inp, outputs=out)
>>> model.compile(optimizer="Adam", loss="mse")
>>> x = np.random.random((100, 10, 32))
>>> y = np.random.random(100)
>>> h = model.fit(x,y)
__init__(num_heads: int = 4, embed_dim: int = 32, dropout=0.1, post_norm: bool = True, prenorm_mlp: bool = False, num_dense_lyrs: int = 1, seed: int = 313, *args, **kwargs)[source]
Parameters:
  • num_heads (int) – number of attention heads

  • embed_dim (int) – embedding dimension. This value is also used for the units/neurons in the MLP block

  • dropout (float) – dropout rate in the MLP block

  • post_norm (bool (default=True)) – whether to apply LayerNormalization on the outputs or not.

  • prenorm_mlp (bool) – whether to apply LayerNormalization on inputs of MLP or not

  • num_dense_lyrs (int) – number of Dense layers in MLP block.

get_config() dict[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

TabTransformer

class ai4water.models._tensorflow.private_layers.TabTransformer(*args, **kwargs)[source]

Bases: Layer

A tensorflow/keras layer which implements the logic of the TabTransformer model.

The TabTransformer layer converts categorical features into contextual embeddings by passing them through a Transformer block. The output of the Transformer block is concatenated with the numerical features and passed through an MLP to get the final model output.

It is available only in tensorflow >= 2.6.
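
Example

A minimal instantiation sketch. The feature names and vocabulary values below are hypothetical, and only the constructor arguments documented under __init__ are used; how the layer is then called on numeric and categorical tensors depends on the surrounding model and is not shown here.

>>> from ai4water.models._tensorflow.private_layers import TabTransformer
>>> # hypothetical vocabulary: keys are categorical feature names, values are
>>> # the unique values of each feature (see cat_vocabulary below)
>>> cat_vocabulary = {"soil_type": ["sandy", "loam", "clay"],
...                   "land_use": ["urban", "forest", "crops"]}
>>> tab = TabTransformer(num_numeric_features=5, cat_vocabulary=cat_vocabulary,
...                      hidden_units=32, num_heads=4, depth=4)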

__init__(num_numeric_features: int, cat_vocabulary: dict, hidden_units=32, lookup_kws: Optional[dict] = None, num_heads: int = 4, depth: int = 4, dropout: float = 0.1, num_dense_lyrs: int = 2, prenorm_mlp: bool = True, post_norm: bool = True, final_mlp_units=16, final_mpl_activation: str = 'selu', seed: int = 313, *args, **kwargs)[source]
Parameters:
  • num_numeric_features (int) – number of numeric features to be used as input.

  • cat_vocabulary (dict) – a dictionary whose keys are names of categorical features and whose values are lists of the unique values of each categorical feature. You can use the function ai4water.models.utils.gen_cat_vocab() to create this for your own data. The length of the dictionary should be equal to the number of categorical features. If it is None, then this layer expects only numeric features

  • hidden_units (int, optional (default=32)) – number of hidden units

  • num_heads (int, optional (default=4)) – number of attention heads

  • depth (int (default=4)) – number of transformer blocks to be stacked on top of each other

  • dropout (float, optional (default=0.1)) – dropout rate in the transformer

  • post_norm (bool (default=True)) – whether to apply LayerNormalization on the outputs or not

  • prenorm_mlp (bool (default=True)) – whether to apply LayerNormalization on the inputs of the MLP or not

  • num_dense_lyrs (int (default=2)) – number of dense layers in MLP block inside the Transformer

  • final_mlp_units (int (default=16)) – number of units/neurons in the final MLP layer, i.e. the MLP layer after the Transformer block

create_mlp(activation, normalization_layer, name=None)[source]

FTTransformer

class ai4water.models._tensorflow.private_layers.FTTransformer(*args, **kwargs)[source]

Bases: Layer

A tensorflow/keras layer which implements the logic of the FTTransformer model.

In FTTransformer, both categorical and numerical features are passed through a transformer block and then through an MLP layer to get the final model prediction.
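
Example

A minimal instantiation sketch using only the constructor arguments documented under __init__. The numeric-only configuration below (cat_vocabulary=None) is hypothetical, and calling the layer on data is not shown since it depends on how the surrounding model feeds the inputs.

>>> from ai4water.models._tensorflow.private_layers import FTTransformer
>>> # with cat_vocabulary=None the layer expects only numeric features
>>> ft = FTTransformer(num_numeric_features=7, cat_vocabulary=None,
...                    hidden_units=32, num_heads=4, depth=4, with_cls_token=False)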

__init__(num_numeric_features: int, cat_vocabulary: Optional[dict] = None, hidden_units=32, num_heads: int = 4, depth: int = 4, dropout: float = 0.1, lookup_kws: Optional[dict] = None, num_dense_lyrs: int = 2, post_norm: bool = True, final_mlp_units: int = 16, with_cls_token: bool = False, seed: int = 313, *args, **kwargs)[source]
Parameters:
  • num_numeric_features (int) – number of numeric features to be used as input.

  • cat_vocabulary (dict/None) – a dictionary whose keys are names of categorical features and whose values are lists of the unique values of each categorical feature. You can use the function ai4water.models.utils.gen_cat_vocab() to create this for your own data. The length of the dictionary should be equal to the number of categorical features. If it is None, then this layer expects only numeric features

  • hidden_units (int, optional (default=32)) – number of hidden units

  • num_heads (int, optional (default=4)) – number of attention heads

  • depth (int (default=4)) – number of transformer blocks to be stacked on top of each other

  • dropout (float, optional (default=0.1)) – dropout rate in the transformer

  • lookup_kws (dict) – keyword arguments for lookup layer

  • post_norm (bool (default=True)) – whether to apply LayerNormalization on the outputs or not

  • num_dense_lyrs (int (default=2)) – number of dense layers in MLP block inside the Transformer

  • final_mlp_units (int (default=16)) – number of units/neurons in the final MLP layer, i.e. the MLP layer after the Transformer block

  • with_cls_token (bool (default=False)) – whether to use a CLS token or not

  • seed (int) – seed for reproducibility

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).