fusionlab.nn.anomaly_detection.LSTMAutoencoderAnomaly

class fusionlab.nn.anomaly_detection.LSTMAutoencoderAnomaly[source]

Bases: Model, NNLearner

LSTM Autoencoder for time series reconstruction-based anomaly detection.

This layer implements a configurable LSTM autoencoder architecture. It encodes an input sequence into a lower-dimensional latent representation and then decodes this representation back into a sequence, attempting to reconstruct the original input. Training typically involves minimizing the reconstruction error on normal data.

The core idea is that anomalous sequences, deviating from patterns learned on normal data, will result in higher reconstruction errors, which can serve as anomaly scores. This layer offers flexibility in the number of encoder/decoder layers, bidirectionality, bottleneck configuration, output feature dimension specification, and the length of the reconstructed sequence.

Parameters:
  • latent_dim (int) – Dimensionality of the latent space (bottleneck). This controls the degree of information compression. If use_bottleneck_dense is True, this defines the output size of the bottleneck Dense layer applied to the final encoder hidden state. If False, this parameter might not be directly used (effective latent dim depends on lstm_units and use_bidirectional_encoder).

  • lstm_units (int) – Number of hidden units in each LSTM layer for both the encoder and decoder. Determines the capacity of the LSTMs.

  • n_features (int, optional, default None) –

    Allows pre-specifying the number of output features (last dimension) for the reconstructed sequence.

    • If an integer is provided, the final TimeDistributed(Dense) layer is created during initialization with this many units. An error will be raised during the build step if the actual input feature dimension doesn’t match this value.

    • If None (default), the number of output features is inferred from the input data’s feature dimension during the build step.

  • n_repeats (int, optional, default None) –

    Specifies a fixed number of time steps for the output sequence generated by the decoder.

    • If an integer is provided, the latent vector from the encoder is repeated n_repeats times before being fed into the decoder LSTM stack. The output reconstruction will have this many time steps, regardless of the input sequence length.

    • If None (default), the latent vector is repeated a number of times equal to the number of time steps in the input sequence, aiming to reconstruct the input fully.

  • num_encoder_layers (int, default 1) – Number of LSTM layers stacked in the encoder. Must be >= 1.

  • num_decoder_layers (int, default 1) – Number of LSTM layers stacked in the decoder. Must be >= 1.

  • activation (str, default 'tanh') – Activation function applied to the final TimeDistributed Dense output layer of the decoder, reconstructing the features. Examples: ‘tanh’, ‘sigmoid’, ‘linear’. Choose based on the expected range or normalization of the input data.

  • intermediate_activation (str, default 'relu') – Activation function used in the optional bottleneck Dense layers (if use_bottleneck_dense=True).

  • dropout_rate (float, default 0.0) – Dropout rate applied to the non-recurrent connections (inputs and outputs) of the LSTM layers. Value between 0 and 1.

  • recurrent_dropout_rate (float, default 0.0) – Dropout rate applied to the recurrent connections within the LSTM layers. Value between 0 and 1. Note: Using recurrent dropout may require disabling GPU acceleration (CuDNN) for LSTMs.

  • use_bidirectional_encoder (bool, default False) – If True, wraps the encoder LSTM layers with a Bidirectional wrapper, processing the input sequence in both forward and backward directions. The final hidden states are typically concatenated.

  • use_bottleneck_dense (bool, default False) – If True, adds Dense layers after the final encoder LSTM layer to explicitly project the final hidden state (state_h) and cell state (state_c) to the specified latent_dim. If False, the final encoder states are used directly.

  • **kwargs – Additional keyword arguments passed to the parent Keras Layer.
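How latent_dim, lstm_units, use_bidirectional_encoder and use_bottleneck_dense interact can be summarized in a small helper. This is only a sketch: effective_latent_dim is a hypothetical function that is not part of fusionlab, and it assumes that a bidirectional encoder concatenates its forward and backward states (doubling their size), which the description above calls typical but does not guarantee.

```python
def effective_latent_dim(latent_dim, lstm_units,
                         use_bidirectional_encoder=False,
                         use_bottleneck_dense=False):
    """Size of the vector handed to the decoder (hypothetical helper)."""
    if use_bottleneck_dense:
        # The bottleneck Dense layer projects the final state to latent_dim.
        return latent_dim
    # Without a bottleneck the final encoder state is used directly.
    # Assumption: bidirectional states are concatenated, doubling the size.
    return 2 * lstm_units if use_bidirectional_encoder else lstm_units


print(effective_latent_dim(8, 16, use_bottleneck_dense=True))
print(effective_latent_dim(8, 16, use_bidirectional_encoder=True))
```

Under these assumptions latent_dim only affects the model when use_bottleneck_dense=True, which matches the caveat in the latent_dim description.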

Notes

This layer expects input data with the shape (Batch, TimeSteps, Features). The output shape will be (Batch, OutputTimeSteps, OutputFeatures), where OutputTimeSteps is determined by n_repeats (or input TimeSteps if n_repeats is None) and OutputFeatures is determined by n_features (or input Features if n_features is None).
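The shape rule above can be written as a few lines of plain Python. The helper below is a hypothetical illustration, not a fusionlab function; it also mirrors the documented behavior that a pre-specified n_features must match the input feature dimension.

```python
def expected_output_shape(input_shape, n_repeats=None, n_features=None):
    """Return (Batch, OutputTimeSteps, OutputFeatures) for a given input shape."""
    batch, time_steps, features = input_shape
    if n_features is not None and n_features != features:
        # Mirrors the documented error raised during the build step.
        raise ValueError(
            f"n_features={n_features} does not match input features={features}"
        )
    out_steps = time_steps if n_repeats is None else n_repeats
    out_features = features if n_features is None else n_features
    return (batch, out_steps, out_features)


print(expected_output_shape((32, 20, 5)))
print(expected_output_shape((32, 20, 5), n_repeats=10))
```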

Use Case and Importance

This component is primarily used for unsupervised anomaly detection in sequential data. Trained mostly on normal data, the autoencoder learns the patterns and structure of normal behavior. New sequences that conform to these patterns are reconstructed accurately (low error), while sequences containing anomalies or novel patterns are reconstructed poorly (high error). The reconstruction error therefore serves as a data-driven anomaly score, which is particularly useful when labeled anomaly data is scarce or unavailable. The n_features and n_repeats options additionally allow sequence-to-sequence tasks beyond pure reconstruction, or cases where output dimensions differ from the input.
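One common way to turn reconstruction errors into anomaly flags is to set a threshold from the errors observed on normal data. The sketch below assumes NumPy and uses synthetic numbers in place of real model output; flag_anomalies and the 0.99 quantile are illustrative choices, not part of fusionlab.

```python
import numpy as np


def flag_anomalies(train_errors, new_errors, quantile=0.99):
    """Flag samples whose error exceeds a quantile of the training errors."""
    threshold = float(np.quantile(train_errors, quantile))
    flags = np.asarray(new_errors) > threshold
    return threshold, flags


rng = np.random.default_rng(0)
# Stand-in for per-sample errors obtained on normal training data.
train_errors = rng.gamma(shape=2.0, scale=0.05, size=1000)
# Stand-in for errors on new data; the last value is deliberately large.
new_errors = np.array([0.05, 0.08, 5.0])

threshold, flags = flag_anomalies(train_errors, new_errors)
print(f"threshold={threshold:.3f}, flags={flags.tolist()}")
```

The quantile trades false alarms against missed anomalies and would normally be tuned on held-out data.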

Mathematical Formulation

The enhanced LSTM autoencoder involves:

  1. Encoder: A stack of num_encoder_layers LSTMs (optionally bidirectional) processes the input sequence \(\mathbf{X} \in \mathbb{R}^{T \times F}\). The final layer outputs the last hidden state \(h_T\) and cell state \(c_T\).

    \[[h_T, c_T] = \text{Encoder}_{LSTM\_Stack}(\mathbf{X})\]
  2. Bottleneck (Optional): If use_bottleneck_dense=True, the final states are projected to latent_dim: \(h'_T = \text{Dense}_{h}(h_T)\), \(c'_T = \text{Dense}_{c}(c_T)\). The latent vector used for decoding is \(\mathbf{z} = h'_T\). The decoder initial state is \([h'_T, c'_T]\). If False, \(\mathbf{z} = h_T\) and the initial state is \([h_T, c_T]\).

  3. Decoder Input Repetition: The latent vector \(\mathbf{z}\) is repeated \(T'\) times using RepeatVector, where \(T' = \text{n\_repeats}\) if specified, otherwise \(T' = T\) (input time steps).

    \[\mathbf{Z}_{repeated} = \text{Repeat}(\mathbf{z}) \in \mathbb{R}^{T' \times \text{dim}(\mathbf{z})}\]
  4. Decoder: A stack of num_decoder_layers LSTMs processes \(\mathbf{Z}_{repeated}\), initialized with the final (potentially bottlenecked) state from the encoder.

    \[\mathbf{H}_{dec} = \text{Decoder}_{LSTM\_Stack}(\mathbf{Z}_{repeated}, \text{initial\_state}) \in \mathbb{R}^{T' \times \text{lstm\_units}}\]
  5. Reconstruction: A TimeDistributed Dense layer maps the decoder’s output sequence \(\mathbf{H}_{dec}\) to the target feature dimension \(F'\) (where \(F' = \text{n\_features}\) if specified, otherwise \(F' = F\)).

    \[\mathbf{\hat{X}} = \text{TimeDistributed}(\text{Dense}(\mathbf{H}_{dec})) \in \mathbb{R}^{T' \times F'}\]
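The tensor shapes in steps 3 and 5 can be traced with plain NumPy. This is a shape illustration only, with made-up sizes and random weights: the decoder LSTM stack of step 4 is replaced by a random stand-in array, and nothing here calls fusionlab or Keras.

```python
import numpy as np

rng = np.random.default_rng(0)
B, T_out, d, units, F_out = 4, 20, 8, 16, 5  # illustrative sizes

# Step 3: repeat the latent vector z along a new time axis (RepeatVector).
z = rng.normal(size=(B, d))
z_repeated = np.repeat(z[:, None, :], T_out, axis=1)  # (B, T', dim(z))

# Step 4 stand-in: a real decoder LSTM stack would map z_repeated to H_dec.
h_dec = rng.normal(size=(B, T_out, units))  # (B, T', lstm_units)

# Step 5: the same Dense weights are applied at every time step
# (TimeDistributed), followed by the output activation (tanh by default).
W = rng.normal(size=(units, F_out))
b = np.zeros(F_out)
x_hat = np.tanh(h_dec @ W + b)  # (B, T', F')

print(z_repeated.shape, h_dec.shape, x_hat.shape)
```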

The anomaly score is typically the reconstruction error, e.g., \(\text{Error} = \|\mathbf{X}_{[:T'', :F'']} - \mathbf{\hat{X}}_{[:T'', :F'']}\|^2\), where \(T'' = \min(T, T')\) and \(F'' = \min(F, F')\) limit the comparison to the overlapping dimensions when \(T' \neq T\) or \(F' \neq F\). The compute_reconstruction_error method handles comparison over potentially differing time steps.
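A per-sample error restricted to overlapping dimensions might look like the NumPy sketch below. It is not the library's compute_reconstruction_error: reconstruction_mse is a hypothetical helper, and it assumes the overlap is taken from the leading time steps and features and that the squared error is averaged over both axes.

```python
import numpy as np


def reconstruction_mse(inputs, reconstructions):
    """Mean squared error per sample over the overlapping region."""
    inputs = np.asarray(inputs, dtype=float)
    reconstructions = np.asarray(reconstructions, dtype=float)
    # Assumption: compare only the leading, shared time steps and features.
    steps = min(inputs.shape[1], reconstructions.shape[1])
    feats = min(inputs.shape[2], reconstructions.shape[2])
    diff = inputs[:, :steps, :feats] - reconstructions[:, :steps, :feats]
    return np.mean(diff ** 2, axis=(1, 2))  # shape (Batch,)


x = np.zeros((4, 20, 5))      # inputs with T = 20
x_hat = np.ones((4, 15, 5))   # reconstructions with T' = 15
errors = reconstruction_mse(x, x_hat)
print(errors.shape)
```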

call(inputs, training=False)[source]

Performs the forward pass (encoding and decoding). Output shape depends on n_repeats and n_features.

compute_reconstruction_error(inputs, reconstructions=None)[source]

Calculates the mean squared error per sample, potentially only over overlapping time steps if input/output lengths differ due to n_repeats.

Parameters:
  • inputs (ndarray | Tensor)

  • reconstructions (ndarray | Tensor | None)

Return type:

Tensor

Examples

>>> from fusionlab.nn.anomaly_detection import LSTMAutoencoderAnomaly
>>> import tensorflow as tf
>>> B, T, F = 32, 20, 5 # Batch, TimeSteps, Features
>>> inputs = tf.random.normal((B, T, F))
>>> # Instantiate with specific output features and repeats
>>> lstm_ae = LSTMAutoencoderAnomaly(
...     latent_dim=8,
...     lstm_units=16,
...     n_features=F,  # Explicitly state output features
...     n_repeats=T,   # Explicitly state output time steps
...     num_encoder_layers=2,
...     num_decoder_layers=2,
... )
>>> # Get reconstructions
>>> reconstructions = lstm_ae(inputs)
>>> print(f"Reconstruction shape: {reconstructions.shape}") # Should be (32, 20, 5)
Reconstruction shape: (32, 20, 5)
>>> # Compute error
>>> error = lstm_ae.compute_reconstruction_error(inputs)
>>> print(f"Error shape: {error.shape}") # Should be (32,)
Error shape: (32,)

See also

tensorflow.keras.layers.Layer

Base class for Keras layers.

tensorflow.keras.layers.LSTM

The recurrent layer used internally.

tensorflow.keras.layers.RepeatVector

Used to feed decoder.

tensorflow.keras.layers.TimeDistributed

Wraps the final Dense layer.

tensorflow.keras.layers.Bidirectional

Wrapper for bidirectional RNNs.

fusionlab.nn.transformers.XTFT

Can potentially incorporate anomaly scores derived from reconstruction errors.

fusionlab.nn.losses.anomaly_loss

Can be used with anomaly scores derived from this layer’s error.

SequenceAnomalyScoreLayer

Alternative anomaly detection component.

__init__(latent_dim, lstm_units, n_features=None, n_repeats=None, num_encoder_layers=1, num_decoder_layers=1, activation='tanh', intermediate_activation='relu', dropout_rate=0.0, recurrent_dropout_rate=0.0, use_bidirectional_encoder=False, use_bottleneck_dense=False, **kwargs)[source]
Parameters:
  • latent_dim (int)

  • lstm_units (int)

  • n_features (int | None)

  • n_repeats (int | None)

  • num_encoder_layers (int)

  • num_decoder_layers (int)

  • activation (str)

  • intermediate_activation (str)

  • dropout_rate (float)

  • recurrent_dropout_rate (float)

  • use_bidirectional_encoder (bool)

  • use_bottleneck_dense (bool)

Methods

__init__(latent_dim, lstm_units[, ...])

add_loss(losses, **kwargs)

Add loss tensor(s), potentially dependent on layer inputs.

add_metric(value[, name])

Adds metric tensor to the layer.

add_update(updates)

Add update op(s), potentially dependent on layer inputs.

add_variable(*args, **kwargs)

Deprecated, do NOT use! Alias for add_weight.

add_weight([name, shape, dtype, ...])

Adds a new variable to the layer.

build(input_shape)

Configure layers whose dimensions depend on input shape.

build_from_config(config)

Builds the layer's states with the supplied config dict.

call(inputs[, training])

Forward pass: Encode -> [Bottleneck] -> Repeat -> Decode.

compile([optimizer, loss, metrics, ...])

Configures the model for training.

compile_from_config(config)

Compiles the model with the information given in config.

compute_loss([x, y, y_pred, sample_weight])

Compute the total loss, validate it, and return it.

compute_mask(inputs[, mask])

Computes an output mask tensor.

compute_metrics(x, y, y_pred, sample_weight)

Update metric states and collect all metrics to be returned.

compute_output_shape(input_shape)

Computes the output shape of the layer.

compute_output_signature(input_signature)

Compute the output tensor signature of the layer based on the inputs.

compute_reconstruction_error(inputs[, ...])

Computes Mean Squared Error per sample.

count_params()

Count the total number of scalars composing the weights.

evaluate([x, y, batch_size, verbose, ...])

Returns the loss value & metrics values for the model in test mode.

evaluate_generator(generator[, steps, ...])

Evaluates the model on a data generator.

export(filepath)

Create a SavedModel artifact for inference (e.g. via TF-Serving).

finalize_state()

Finalizes the layer's state after updating layer weights.

fit([x, y, batch_size, epochs, verbose, ...])

Trains the model for a fixed number of epochs (dataset iterations).

fit_generator(generator[, steps_per_epoch, ...])

Fits the model on data yielded batch-by-batch by a Python generator.

from_config(config)

Creates layer from its config.

get_build_config()

Returns a dictionary with the layer's input shape.

get_compile_config()

Returns a serialized config with information for compiling the model.

get_config()

Returns the layer configuration.

get_input_at(node_index)

Retrieves the input tensor(s) of a layer at a given node.

get_input_mask_at(node_index)

Retrieves the input mask tensor(s) of a layer at a given node.

get_input_shape_at(node_index)

Retrieves the input shape(s) of a layer at a given node.

get_layer([name, index])

Retrieves a layer based on either its name (unique) or index.

get_metrics_result()

Returns the model's metrics values as a dict.

get_output_at(node_index)

Retrieves the output tensor(s) of a layer at a given node.

get_output_mask_at(node_index)

Retrieves the output mask tensor(s) of a layer at a given node.

get_output_shape_at(node_index)

Retrieves the output shape(s) of a layer at a given node.

get_params([deep])

Get the parameters for this learner.

get_weight_paths()

Retrieve all the variables and their paths for the model.

get_weights()

Retrieves the weights of the model.

help(**kwargs)

load(file_path[, format])

Load the learner's state from a specified file in the desired format.

load_own_variables(store)

Loads the state of the layer.

load_weights(filepath[, skip_mismatch, ...])

Loads all layer weights from saved files.

make_predict_function([force])

Creates a function that executes one step of inference.

make_test_function([force])

Creates a function that executes one step of evaluation.

make_train_function([force])

Creates a function that executes one step of training.

predict(x[, batch_size, verbose, steps, ...])

Generates output predictions for the input samples.

predict_generator(generator[, steps, ...])

Generates predictions for the input samples from a data generator.

predict_on_batch(x)

Returns predictions for a single batch of samples.

predict_step(data)

The logic for one inference step.

reset_metrics()

Resets the state of all the metrics in the model.

reset_states()

save(filepath[, overwrite, save_format])

Saves a model as a TensorFlow SavedModel or HDF5 file.

save_own_variables(store)

Saves the state of the layer.

save_spec([dynamic_batch])

Returns the tf.TensorSpec of call args as a tuple (args, kwargs).

save_weights(filepath[, overwrite, ...])

Saves all layer weights.

set_params(**params)

Set the parameters of this learner.

set_weights(weights)

Sets the weights of the layer, from NumPy arrays.

summary([line_length, positions, print_fn, ...])

Prints a string summary of the network.

test_on_batch(x[, y, sample_weight, ...])

Test the model on a single batch of samples.

test_step(data)

The logic for one evaluation step.

to_json(**kwargs)

Returns a JSON string containing the network configuration.

to_yaml(**kwargs)

Returns a yaml string containing the network configuration.

train_on_batch(x[, y, sample_weight, ...])

Runs a single gradient update on a single batch of data.

train_step(data)

The logic for one training step.

with_name_scope(method)

Decorator to automatically enter the module name scope.

Attributes

activity_regularizer

Optional regularizer function for the output of this layer.

autotune_steps_per_execution

Settable property to enable tuning for steps_per_execution.

compute_dtype

The dtype of the layer's computations.

distribute_reduction_method

The method employed to reduce per-replica values during training.

distribute_strategy

The tf.distribute.Strategy this model was created under.

dtype

The dtype of the layer weights.

dtype_policy

The dtype policy associated with this layer.

dynamic

Whether the layer is dynamic (eager-only); set in the constructor.

inbound_nodes

Return Functional API nodes upstream of this layer.

input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape(s) of a layer.

input_spec

InputSpec instance(s) describing the input format for this layer.

jit_compile

Specify whether to compile the model with XLA.

layers

losses

List of losses added using the add_loss() API.

metrics

Return metrics added using compile() or add_metric().

metrics_names

Returns the model's display labels for all outputs.

my_params

name

Name of the layer (string), set in the constructor.

name_scope

Returns a tf.name_scope instance for this class.

non_trainable_variables

Sequence of non-trainable variables owned by this module and its submodules.

non_trainable_weights

List of all non-trainable weights tracked by this layer.

outbound_nodes

Return Functional API nodes downstream of this layer.

output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape(s) of a layer.

run_eagerly

Settable attribute indicating whether the model should run eagerly.

state_updates

Deprecated, do NOT use!

stateful

steps_per_execution

Settable steps_per_execution variable. Requires a compiled model.

submodules

Sequence of all sub-modules.

supports_masking

Whether this layer supports computing a mask using compute_mask.

trainable

trainable_variables

Sequence of trainable variables owned by this module and its submodules.

trainable_weights

List of all trainable weights tracked by this layer.

updates

variable_dtype

Alias of Layer.dtype, the dtype of the weights.

variables

Returns the list of all layer variables/weights.

weights

Returns the list of all layer variables/weights.

build(input_shape)[source]

Configure layers whose dimensions depend on input shape.

call(inputs, training=False)[source]

Forward pass: Encode -> [Bottleneck] -> Repeat -> Decode.

compute_reconstruction_error(inputs, reconstructions=None)[source]

Computes Mean Squared Error per sample.

Parameters:
  • inputs (ndarray | Tensor)

  • reconstructions (ndarray | Tensor | None)

Return type:

Tensor

get_config()[source]

Returns the layer configuration.

classmethod from_config(config)[source]

Creates layer from its config.

help(**kwargs)

my_params = LSTMAutoencoderAnomaly(
    latent_dim,
    lstm_units,
    n_features=None,
    n_repeats=None,
    num_encoder_layers=1,
    num_decoder_layers=1,
    activation='tanh',
    intermediate_activation='relu',
    dropout_rate=0.0,
    recurrent_dropout_rate=0.0,
    use_bidirectional_encoder=False,
    use_bottleneck_dense=False
)