Welcome to fABBA’s documentation!
fABBA — Fast and Accurate Symbolic Representation for Time Series
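fABBA (fast ABBA) is an optimized symbolic aggregate approximation method for univariate and multivariate time series. It can reach very high compression ratios (often > 100–1000×, depending on the data and the chosen tolerance). The compression is lossy, so the symbolic string does not reproduce the series exactly, but it can be converted back into a reconstruction whose error is controlled by the tolerance parameters.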
The method consists of two core steps:
Lossy piecewise linear compression (tolerance-driven polygonal chain approximation)
Mean-based clustering of segments -> symbolic representation (fully automated, no need to pre-specify alphabet size)
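The two steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's implementation: the function names `compress` and `digitize`, the per-coordinate scaling, and the greedy distance-threshold grouping are choices made for this sketch. Only the overall shape (tolerance-driven pieces, then grouping with mean centers and no preset alphabet size) comes from the description above.

```python
import math

def compress(ts, tol=0.1):
    """Greedy piecewise linear compression into (length, increment) pieces."""
    pieces = []
    start, end = 0, 1
    last_inc = 0.0
    while end < len(ts):
        length = end - start
        inc = ts[end] - ts[start]
        # Squared error between the straight line start -> end and the data.
        err = sum((ts[start] + inc * k / length - ts[start + k]) ** 2
                  for k in range(length + 1))
        if err <= (length - 1) * tol ** 2 + 1e-12:
            last_inc = inc      # the line still fits: try a longer piece
            end += 1
        else:
            # Close the last piece that fitted and restart from its end point.
            pieces.append((end - 1 - start, last_inc))
            start = end - 1
    pieces.append((end - 1 - start, last_inc))
    return pieces

def _std(values):
    mean = sum(values) / len(values)
    spread = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return spread or 1.0        # avoid dividing by zero for constant columns

def digitize(pieces, alpha=0.5):
    """Group similar pieces; each group becomes one symbol with a mean center."""
    s_len = _std([p[0] for p in pieces])
    s_inc = _std([p[1] for p in pieces])
    scaled = [(length / s_len, inc / s_inc) for length, inc in pieces]
    # Visit pieces from smallest to largest norm (assumed ordering for the sketch).
    order = sorted(range(len(scaled)), key=lambda j: math.hypot(*scaled[j]))
    labels = [-1] * len(pieces)
    n_groups = 0
    for j in order:
        if labels[j] != -1:
            continue
        # Piece j seeds a new group and absorbs unassigned pieces within alpha.
        for k in order:
            if labels[k] == -1 and math.dist(scaled[j], scaled[k]) <= alpha:
                labels[k] = n_groups
        n_groups += 1
    centers = []
    for g in range(n_groups):
        members = [pieces[j] for j in range(len(pieces)) if labels[j] == g]
        centers.append((sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members)))
    # Symbols are consecutive characters from "a"; fine for small alphabets.
    string = "".join(chr(ord("a") + g) for g in labels)
    return string, centers

ts = [math.sin(0.05 * i) for i in range(1000)]
pieces = compress(ts, tol=0.1)
string, centers = digitize(pieces, alpha=0.5)
print(len(ts), "points ->", len(pieces), "pieces ->", len(centers), "symbols")
```

The number of symbols is never specified: it falls out of how many groups the threshold `alpha` produces, which is what "no need to pre-specify alphabet size" means in practice.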
Because the resulting representation is symbolic, it naturally leads to:
Strong noise smoothing
Drastic dimensionality reduction
Ultra-fast distance computations (via lookup tables)
Seamless integration with classic data mining algorithms (motif discovery, anomaly detection, classification, clustering, indexing, etc.)
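As an illustration of why a symbolic representation makes distance computations cheap, the sketch below precomputes the distance between every pair of symbol centers once and then compares two equal-length strings purely by table lookup. The distance definition, the function names, and the example centers are assumptions made for this sketch; they are not taken from the fABBA API.

```python
import math

def build_lookup(centers):
    """Precompute the distance between every pair of symbol centers."""
    return {(a, b): math.dist(centers[a], centers[b])
            for a in centers for b in centers}

def symbolic_distance(s1, s2, table):
    """Euclidean-style distance between two equal-length symbolic strings."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    # No arithmetic on the raw series: one table lookup per symbol pair.
    return math.sqrt(sum(table[a, b] ** 2 for a, b in zip(s1, s2)))

# Hypothetical centers: symbol -> (segment length, segment increment).
centers = {"a": (10.0, 0.5), "b": (4.0, -1.2), "c": (7.0, 0.1)}
table = build_lookup(centers)

d_same = symbolic_distance("abc", "abc", table)
d_diff = symbolic_distance("abca", "abcc", table)
print(d_same, d_diff)
```

The table has one entry per pair of symbols, so its size depends on the alphabet size rather than on the length of the original series.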
fABBA significantly outperforms the original ABBA [1] in speed (often 10–100× faster) while producing nearly identical or even better symbolic sequences.
Visualization of the fABBA transformation process (source: Stefan Güttel, Turing–Manchester presentation, 2021).
Key Advantages of fABBA
Core Methods & Variants
fABBA.fABBA -> Original fast single-series implementation (pure Python + Cython)
fABBA.JABBA -> Next-generation engine supporting:
- Univariate & multivariate series
- Custom clustering backends (k-means, hierarchical, GPU, etc.)
- Memory-optimized streaming aggregation
fABBA.image_compress / image_decompress -> Turn any 2D array/image into a short string and back
Applications
fABBA has demonstrated superior performance in numerous domains:
Time-series classification & clustering (UCR/UEA archives)
Extreme compression of sensor data (IoT, wearables, finance)
Motif & discord discovery at massive scale
Anomaly detection with symbolic distance measures
Lossy but reconstructible storage of medical signals (ECG, EEG)
Image and video frame compression via block-wise symbolization
Quick Example
from fABBA import fABBA
import numpy as np
import matplotlib.pyplot as plt

# Load a one-dimensional time series (replace the path with your own data)
ts = np.load("example_series.npy")

# tol is the compression tolerance; alpha controls the clustering step
fabba = fABBA(tol=0.1, alpha=0.01, method='agg')
string, centers = fabba.fit_transform(ts)

print(f"Original length : {len(ts)}")
print(f"Compressed to : {len(string)} symbols -> compression ratio {(len(ts)/len(string)):.1f}×")
print(f"Symbolic string : {string}")

# Convert the symbols back into an approximate numerical series
reconstructed = fabba.inverse_transform(string, centers)

# Overlay the original series and its reconstruction
plt.plot(ts, label="Original")
plt.plot(reconstructed, "--", label="Reconstructed")
plt.legend()
plt.show()
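To put numbers on the comparison plotted in the example, a small helper like the one below can report the compression ratio together with the reconstruction error. The helper name `reconstruction_report` and its return format are hypothetical, and the toy inputs stand in for `ts`, `reconstructed`, and `len(string)`; the function only assumes two numeric sequences.

```python
import math

def reconstruction_report(original, reconstructed, n_symbols):
    """Summarise how much was compressed and how closely the series is recovered."""
    # A reconstruction may differ in length by a few points, so compare the overlap.
    n = min(len(original), len(reconstructed))
    if n == 0 or n_symbols == 0:
        raise ValueError("need non-empty series and at least one symbol")
    squared = sum((float(a) - float(b)) ** 2
                  for a, b in zip(original[:n], reconstructed[:n]))
    return {
        "compression_ratio": len(original) / n_symbols,
        "rmse": math.sqrt(squared / n),
        "compared_points": n,
    }

# Toy stand-ins for the original series and its reconstruction.
report = reconstruction_report([0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 6], n_symbols=2)
print(report)
```

Reporting the error next to the ratio matters because a larger tolerance raises the ratio at the cost of a less faithful reconstruction.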
References
Getting Started
pip install fABBA # includes pre-compiled wheels for Linux/macOS/Windows
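After installing, one way to confirm that the package is visible to the current interpreter is to query its installed metadata with the standard library. The helper name `installed_version` is hypothetical, and the sketch assumes the distribution is registered under the name `fABBA` used in the pip command above.

```python
from importlib import metadata

def installed_version(package="fABBA"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

version = installed_version()
if version is None:
    print("fABBA is not installed in this environment")
else:
    print("fABBA version:", version)
```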
Full documentation: https://fabba.readthedocs.io
We welcome contributions! Whether it’s new clustering backends, performance improvements, or better documentation — feel free to open issues or pull requests.
Enjoy ultra-fast symbolic time-series analysis with fABBA!
Guide
- Get started
- Main components
- Multi-channel symbolization
- Full Usage Examples
- 1. Multiple Univariate or Multivariate Time Series (Most Common)
- 2. True Multivariate Time Series (Shared Symbols Across Channels)
- 3. High-Dimensional Arrays (Video, Spectrograms, Images over Time)
- 4. Out-of-Sample (Test Set) Symbolization
- 5. Fixed vs Adaptive Vocabulary
- 6. GPU-Accelerated Digitization (Large Datasets)
- Parameter Guide
- Saving and loading
- Inspecting the Learned Dictionary
- Saving the Model (Recommended Methods)
- Loading a Trained Dictionary for Inference / Deployment
- Saving the Entire Model (including normalization & shape info)
- Production Deployment Example (FastAPI)
- Visualizing the Learned Prototypes
- Complete Persistent Package (Most Robust)
- Summary – What You Really Need to Save
- Extensible ABBA
API Reference
Others