Anomaly Detection
Anomaly detection involves identifying data points, events, or observations that deviate significantly from the expected or normal behavior within a dataset. In the context of time series, this could mean detecting sudden spikes or drops, unusual patterns, or periods where the data behaves differently from the norm.
Incorporating anomaly detection into forecasting workflows can:
Improve model robustness by identifying or down-weighting unusual data points during training.
Provide insights into data quality issues or real-world events impacting the time series.
Help understand when and why a forecasting model might be struggling (e.g., high prediction errors coinciding with detected anomalies).
fusionlab provides components and integrates strategies (especially
within XTFT) to leverage anomaly information.
Anomaly Detection Components
These are neural network layers designed specifically for anomaly
detection tasks (available in fusionlab.nn.anomaly_detection). They are
often intended to be used within or alongside forecasting models.
LSTMAutoencoderAnomaly
- API Reference:
Concept: Reconstruction-Based Anomaly Detection
This layer implements an LSTM-based autoencoder. The core idea is to train the model to reconstruct “normal” time series sequences accurately. It learns a compressed representation (encoding) of typical patterns and then attempts to rebuild the original sequence (decoding) from that representation.
Anomalous sequences, which do not conform to the patterns learned from normal data, are expected to have a higher reconstruction error (the difference between the original input \(\mathbf{X}\) and the reconstructed output \(\mathbf{\hat{X}}\)).
How it Works:
Takes an input sequence (Batch, TimeSteps, Features).
The encoder LSTM processes the sequence and produces a latent vector (typically the final hidden state).
The decoder LSTM takes this latent vector (repeated across time) and generates the reconstructed sequence.
Returns the reconstructed sequence \(\mathbf{\hat{X}}\). The output shape depends on the n_repeats and n_features parameters (see API reference for details).
Usage:
Training: Train the autoencoder, typically on data assumed to be normal, by minimizing a reconstruction loss such as Mean Squared Error (MSE) between the input and the output. This is an unsupervised approach, as it doesn’t require anomaly labels.
Scoring: After training, feed new (or training/validation) sequences into the autoencoder and calculate the reconstruction error for each sequence (e.g., using the layer’s .compute_reconstruction_error() method, which calculates MSE per sample).
Detection: Use the reconstruction error as an anomaly score. Sequences with errors exceeding a predefined threshold (determined based on validation data or domain knowledge) can be flagged as anomalous.
Integration: The anomaly scores derived from the reconstruction error
can be pre-calculated and then potentially supplied as input to the
‘from_config’ strategy in XTFT.
Code Example:
import tensorflow as tf
# Assuming LSTMAutoencoderAnomaly is importable
from fusionlab.nn.anomaly_detection import LSTMAutoencoderAnomaly

# Config
batch_size = 4
time_steps = 20
features = 5
latent_dim = 8   # Size of internal compressed representation
lstm_units = 16  # Units in LSTM layers

# Dummy input sequence
dummy_input = tf.random.normal((batch_size, time_steps, features))

# Instantiate the layer (using enhanced version parameters)
lstm_ae_layer = LSTMAutoencoderAnomaly(
    latent_dim=latent_dim,
    lstm_units=lstm_units,
    num_encoder_layers=1,  # Example: 1 encoder layer
    num_decoder_layers=1,  # Example: 1 decoder layer
    n_features=features,   # Reconstruct original feature count
    n_repeats=time_steps,  # Reconstruct original time step count
    activation='tanh'
)

# Apply the layer to get reconstructions
reconstructions = lstm_ae_layer(dummy_input)

# Compute reconstruction error (MSE per sample)
recon_error = lstm_ae_layer.compute_reconstruction_error(
    dummy_input, reconstructions
)

print(f"Input shape: {dummy_input.shape}")
print(f"Reconstruction shape: {reconstructions.shape}")
print(f"Reconstruction Error shape (per sample): {recon_error.shape}")
# Expected shapes: (4, 20, 5), (4, 20, 5), (4,)
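The scoring and detection steps described under Usage can also be sketched without the layer itself. The snippet below is a minimal illustration in plain NumPy, not fusionlab's implementation: the helper per_sample_mse, the synthetic "reconstructions", and the choice of the 99th percentile of validation errors as the threshold are all assumptions made for the example.

```python
import numpy as np

def per_sample_mse(x, x_hat):
    """MSE per sequence, averaged over time steps and features.

    Hypothetical helper; it mirrors the 'MSE per sample' idea only.
    """
    return np.mean((x - x_hat) ** 2, axis=(1, 2))

rng = np.random.default_rng(0)

# Stand-ins for validation inputs and their reconstructions (assumed data).
x_val = rng.normal(size=(200, 20, 5))
x_val_hat = x_val + rng.normal(scale=0.1, size=x_val.shape)

# New sequences; sequence 2 is deliberately reconstructed poorly.
x_new = rng.normal(size=(4, 20, 5))
x_new_hat = x_new + rng.normal(scale=0.1, size=x_new.shape)
x_new_hat[2] += 2.0

# Threshold chosen from validation errors (the 99th percentile is an assumption).
val_errors = per_sample_mse(x_val, x_val_hat)
threshold = np.quantile(val_errors, 0.99)

# Score the new sequences and flag those above the threshold.
new_errors = per_sample_mse(x_new, x_new_hat)
is_anomaly = new_errors > threshold

print("threshold:", threshold)
print("errors:", new_errors)
print("flagged:", is_anomaly)
```

In practice the percentile (or a fixed threshold from domain knowledge) trades off missed anomalies against false alarms, so it is worth checking against held-out data.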
SequenceAnomalyScoreLayer
- API Reference:
Concept: Feature-Based Anomaly Scoring
This layer learns to directly predict an anomaly score from a set of input features. These input features are typically learned representations extracted from a time series by preceding layers in a larger model (e.g., the final hidden state of an LSTM, the output of attention layers, or an aggregated feature vector).
How it Works:
Takes input features (typically Batch, Features).
Passes these features through one or more internal Dense layers with non-linear activations and optional dropout/normalization.
A final Dense layer with a single output neuron produces the scalar anomaly score for each input sample. The activation of this final layer (e.g., ‘linear’ for unbounded score, ‘sigmoid’ for 0-1 score) determines the score’s range.
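The steps above amount to a small feed-forward scoring head. The following is a rough NumPy sketch of that idea under stated assumptions, not the layer's actual code: the function score_head, the random weights, and the layer sizes are hypothetical, and dropout and normalization are left out for brevity.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score_head(features, weights, biases, final_activation="linear"):
    """Hidden dense layers with ReLU, then a single-unit output layer.

    Hypothetical helper illustrating the scoring idea only.
    """
    h = features
    for w, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ w + b)
    out = h @ weights[-1] + biases[-1]  # shape (batch, 1)
    if final_activation == "sigmoid":
        out = sigmoid(out)              # squashes scores into (0, 1)
    return out

rng = np.random.default_rng(0)

# 32 input features -> 16 -> 8 -> 1 score (sizes are illustrative).
layer_sizes = [32, 16, 8, 1]
weights = [
    rng.normal(scale=0.1, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

features = rng.normal(size=(4, 32))
linear_scores = score_head(features, weights, biases, "linear")
bounded_scores = score_head(features, weights, biases, "sigmoid")

print(linear_scores.shape)   # one unbounded score per sample
print(bounded_scores.shape)  # one score in (0, 1) per sample
```

With untrained weights the scores carry no meaning; the point is only the shape of the computation and how the final activation sets the score range.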
Usage:
Integration: Add this layer near the end of a larger neural network architecture (like a modified XTFT or a custom model). It takes informative features from the network as input.
Training: Training requires a loss function that incorporates this anomaly score output. This could involve supervised training with anomaly labels or unsupervised/semi-supervised integration with a primary task loss (e.g., forecasting).
Detection: Use the output score directly. Higher scores indicate a higher likelihood of the input features representing an anomaly, as interpreted by the trained layer. Apply thresholding as needed.
Integration: This type of layer aligns conceptually with the
‘feature_based’ anomaly detection strategy mentioned in relation to
XTFT, where anomaly scores are computed internally
from learned features.
Code Example:
import tensorflow as tf
from fusionlab.nn.anomaly_detection import SequenceAnomalyScoreLayer

# Config
batch_size = 4
feature_dim = 32  # Dimension of features input to this layer

# Dummy input features (e.g., output from previous layers)
learned_features = tf.random.normal((batch_size, feature_dim))

# Instantiate the layer
anomaly_scorer = SequenceAnomalyScoreLayer(
    hidden_units=[16, 8],      # Example: 2 hidden layers
    activation='relu',
    dropout_rate=0.1,
    final_activation='linear'  # Output unbounded score
)

# Apply the layer
anomaly_scores = anomaly_scorer(learned_features, training=False)

print(f"Input features shape: {learned_features.shape}")
print(f"Output anomaly scores shape: {anomaly_scores.shape}")
# Expected: (4, 32), (4, 1)
PredictionErrorAnomalyScore
- API Reference:
Concept: Prediction-Error-Based Anomaly Scoring
This layer quantifies the discrepancy between ground truth (y_true) and model predictions (y_pred) for time series, aggregating the error across time and features to produce a single anomaly score per sequence.
Functionality:
Takes input as a list [y_true, y_pred], where both tensors typically have shape \((B, T, F)\).
Calculates the element-wise error based on the specified error_metric (‘mae’ or ‘mse’) and averages it over the feature dimension \(F\) to obtain a per-step error:
\[\text{MAE}_t = \frac{1}{F} \sum_{f=1}^F |y_{\text{true};\, t,f} - y_{\text{pred};\, t,f}| \quad \text{or} \quad \text{MSE}_t = \frac{1}{F} \sum_{f=1}^F (y_{\text{true};\, t,f} - y_{\text{pred};\, t,f})^2\]
Aggregates these per-step errors across the time dimension \(T\) using the specified aggregation method (‘mean’ or ‘max’).
Returns a scalar anomaly score for each sequence in the batch (shape \((B, 1)\)).
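To make the arithmetic concrete, the steps above can be reproduced in a few lines of NumPy. This is a sketch of the described computation only, not the layer's source; the function name prediction_error_score and the toy tensors are hypothetical.

```python
import numpy as np

def prediction_error_score(y_true, y_pred, error_metric="mae", aggregation="mean"):
    """Per-sequence score: feature-averaged error per step, aggregated over time.

    Hypothetical helper following the description in the text.
    """
    diff = y_true - y_pred
    if error_metric == "mae":
        elementwise = np.abs(diff)
    elif error_metric == "mse":
        elementwise = diff ** 2
    else:
        raise ValueError("error_metric must be 'mae' or 'mse'")

    per_step = elementwise.mean(axis=-1)  # (B, T): average over features

    if aggregation == "mean":
        return per_step.mean(axis=1, keepdims=True)  # (B, 1)
    if aggregation == "max":
        return per_step.max(axis=1, keepdims=True)   # (B, 1)
    raise ValueError("aggregation must be 'mean' or 'max'")

# Two sequences, three time steps, two features.
y_true = np.zeros((2, 3, 2))
y_pred = np.zeros((2, 3, 2))
y_pred[1, 2] = [1.0, 3.0]  # one badly predicted step in the second sequence

mae_max = prediction_error_score(y_true, y_pred, "mae", "max")
mae_mean = prediction_error_score(y_true, y_pred, "mae", "mean")
mse_max = prediction_error_score(y_true, y_pred, "mse", "max")

print(mae_max.ravel())   # [0. 2.]
print(mae_mean.ravel())  # second entry is 2/3
print(mse_max.ravel())   # [0. 5.]
```

The toy case shows why the aggregation choice matters: ‘max’ keeps the single bad step at full strength (2.0), whereas ‘mean’ dilutes it across the sequence (2/3).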
Usage Context: Designed to be used when paired ground truth and
predictions are available. It directly links the anomaly score to the
model’s predictive performance on a sequence. The output score can be
used in a custom loss function or training step (similar to the logic
in prediction_based_loss()) to penalize
large prediction deviations, thereby implicitly identifying anomalies.
Code Example:
import tensorflow as tf
from fusionlab.nn.anomaly_detection import PredictionErrorAnomalyScore

# Config
batch_size = 4
time_steps = 10
features = 1

# Dummy true and predicted sequences
y_true = tf.random.normal((batch_size, time_steps, features))
# Simulate predictions with some noise
y_pred = y_true + tf.random.normal(tf.shape(y_true), stddev=0.5)

# Instantiate the layer (MAE, max aggregation)
error_scorer = PredictionErrorAnomalyScore(
    error_metric='mae',
    aggregation='max'
)

# Calculate scores
anomaly_scores = error_scorer([y_true, y_pred])

print(f"Input y_true shape: {y_true.shape}")
print(f"Input y_pred shape: {y_pred.shape}")
print(f"Output anomaly scores shape: {anomaly_scores.shape}")
# Expected: (4, 10, 1), (4, 10, 1), (4, 1)
Using Anomaly Detection with XTFT
The XTFT model provides specific parameters to
integrate anomaly detection during training:
anomaly_detection_strategy: Can be set to 'prediction_based' (derives scores from prediction errors using prediction_based_loss()) or, potentially, 'feature_based' (using internal layers like SequenceAnomalyScoreLayer). 'from_config' logic is implied when the model is used with specific combined losses such as combined_total_loss().
anomaly_loss_weight: Controls the relative importance of the anomaly objective compared to the main forecasting objective in the loss function.
anomaly_config: A dictionary potentially used to pass pre-computed scores (for 'from_config' logic) or to configure internal anomaly components.
Refer to the /user_guide/examples/xtft_with_anomaly_detection example for practical implementations of the ‘from_config’ (via combined loss) and ‘prediction_based’ strategies.
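As a rough intuition for what anomaly_loss_weight does, one can picture the total loss as a weighted sum of a forecasting term and an anomaly term. The sketch below assumes exactly that form, with MSE as the forecasting loss and the mean squared anomaly score as the anomaly loss. These choices, and the function name weighted_total_loss, are illustrative assumptions rather than a description of how XTFT or combined_total_loss() is implemented.

```python
import numpy as np

def weighted_total_loss(y_true, y_pred, anomaly_scores, anomaly_loss_weight=1.0):
    """Assumed form: forecast loss + weight * anomaly loss (illustrative only)."""
    forecast_loss = np.mean((y_true - y_pred) ** 2)  # MSE forecasting term
    anomaly_loss = np.mean(anomaly_scores ** 2)      # assumed anomaly term
    return forecast_loss + anomaly_loss_weight * anomaly_loss

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])   # forecast MSE = 4/3
anomaly_scores = np.array([0.0, 2.0])  # mean squared score = 2

loss_no_anomaly = weighted_total_loss(y_true, y_pred, anomaly_scores, 0.0)
loss_half = weighted_total_loss(y_true, y_pred, anomaly_scores, 0.5)

print(loss_no_anomaly)  # forecasting term only
print(loss_half)        # forecasting term plus half the anomaly term
```

Under this assumed form, a weight of 0 recovers the plain forecasting objective, and larger weights let the anomaly term dominate training.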