tsseg.algorithms.vsax package
VSAX — Variable-length SAX state detection.
Description
VSAX converts each channel to Symbolic Aggregate approXimation (SAX) symbols, then finds the variable-length segmentation that minimises PAA reconstruction error plus an additive penalty per segment. The pipeline:
1. Z-normalisation: optionally standardise each channel.
2. PAA: reduce each candidate segment to `paa_segments` frames.
3. SAX: discretise PAA values into `alphabet_size` symbols using Gaussian breakpoints (or adaptive empirical quantiles).
4. DP segmentation: dynamic programming over `num_lengths` candidate segment lengths minimises reconstruction error + `penalty` per segment.
5. Symbol merging: per-channel SAX symbol tuples are clustered via agglomerative clustering on Hamming distance (threshold `symbol_merge_threshold`).
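The first three pipeline steps can be sketched for a single channel in plain Python. This is an illustration only, not the tsseg implementation: the helper names `znorm`, `paa`, `gaussian_breakpoints` and `sax` are hypothetical, and the way frames are cut (integer division of the segment length) is an assumption of the sketch.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def znorm(values):
    """Standardise a sequence to zero mean and unit variance."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0.0:
        return [0.0 for _ in values]  # constant input: nothing to scale
    return [(v - mu) / sigma for v in values]


def paa(values, n_frames):
    """Average `values` over contiguous frames of near-equal size."""
    n = len(values)
    n_frames = min(n_frames, n)  # every frame keeps at least one sample
    # Assumed frame boundaries: integer division of the segment length.
    bounds = [i * n // n_frames for i in range(n_frames + 1)]
    return [fmean(values[a:b]) for a, b in zip(bounds, bounds[1:])]


def gaussian_breakpoints(alphabet_size):
    """Cut points that split N(0, 1) into equiprobable bins."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax(values, n_frames, alphabet_size):
    """Map each PAA frame to the index of the bin it falls into."""
    breakpoints = gaussian_breakpoints(alphabet_size)
    return [bisect_right(breakpoints, v) for v in paa(values, n_frames)]


# Standardise the whole channel first, then symbolise one segment of it.
channel = znorm([0.0, 0.1, 0.2, 0.1, 5.0, 5.2, 5.1, 4.9])
symbols = sax(channel, n_frames=2, alphabet_size=4)
```

Here the low half of the channel maps to the lowest symbol and the high half to the highest one, which is the coarse shape information the later steps operate on.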
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| `alphabet_size` | int | `6` | Number of SAX symbols per channel. |
| `paa_segments` | int | `8` | Number of PAA frames per segment. |
| `min_segment_length` | int | `20` | Minimum admissible segment length. |
| `max_segment_length` | int | `180` | Maximum admissible segment length. |
| `num_lengths` | int | `6` | Number of candidate lengths (linearly spaced min..max). |
| `penalty` | float | `0.8` | Cost per new segment. Larger values produce longer segments. |
| `symbol_merge_threshold` | float | `0.2` | Normalised Hamming distance threshold for merging symbols. |
| `zscore` | bool | `True` | Apply per-channel z-normalisation. |
| `adaptive_breakpoints` | bool | `True` | Learn SAX breakpoints from empirical quantiles. |
| `axis` | int | `0` | Time axis. |
Usage
from tsseg.algorithms import VSAXDetector

detector = VSAXDetector(alphabet_size=8, penalty=1.0, min_segment_length=30)
states = detector.fit_predict(X)
Implementation: Origin: new code.
Reference: —
Submodules
tsseg.algorithms.vsax.detector module
Variable-length SAX baseline detector.
Segmentation via dynamic programming over per-channel SAX symbols with agglomerative symbol clustering. Reconstruction error is computed in O(1) per candidate via prefix sums.
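A minimal sketch of how prefix sums give an O(1) error per candidate, and how a penalised dynamic program can use that cost. It assumes the simplest case: one channel, with each segment approximated by its mean (a single PAA frame). The function names are hypothetical, and raising an error when the series cannot be tiled by the candidate lengths is a choice of this sketch, not documented library behaviour.

```python
from itertools import accumulate
from math import inf


def prefix_sums(values):
    """Running sums of x and x**2, each with a leading zero."""
    s1 = [0.0, *accumulate(values)]
    s2 = [0.0, *accumulate(v * v for v in values)]
    return s1, s2


def segment_sse(s1, s2, start, stop):
    """Squared error of replacing values[start:stop] by its mean, in O(1)."""
    n = stop - start
    total = s1[stop] - s1[start]
    return (s2[stop] - s2[start]) - total * total / n


def segment(values, lengths, penalty):
    """Return segment end indices minimising SSE plus `penalty` per segment."""
    n = len(values)
    s1, s2 = prefix_sums(values)
    best = [inf] * (n + 1)   # best[t]: minimal cost of segmenting values[:t]
    prev = [-1] * (n + 1)    # prev[t]: start of the last segment ending at t
    best[0] = 0.0
    for stop in range(1, n + 1):
        for length in lengths:
            start = stop - length
            if start < 0 or best[start] == inf:
                continue
            cost = best[start] + segment_sse(s1, s2, start, stop) + penalty
            if cost < best[stop]:
                best[stop], prev[stop] = cost, start
    if best[n] == inf:
        raise ValueError("series length cannot be tiled by the candidate lengths")
    ends, stop = [], n
    while stop > 0:          # walk the back-pointers from the end
        ends.append(stop)
        stop = prev[stop]
    return ends[::-1]


series = [0.0] * 4 + [10.0] * 4
ends = segment(series, lengths=[2, 4], penalty=1.0)
```

With a penalty of 1.0 the two-segment solution wins over four length-2 segments, since both have zero reconstruction error and the penalty is paid once per segment.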
- class tsseg.algorithms.vsax.detector.VSAXDetector(*, axis=0, alphabet_size=6, paa_segments=8, min_segment_length=20, max_segment_length=180, num_lengths=6, penalty=0.8, symbol_merge_threshold=0.2, zscore=True, adaptive_breakpoints=True, random_state=0)
Bases: `BaseSegmenter`
Baseline for state detection using variable-length SAX symbols.
The detector uses dynamic programming over variable-length Symbolic Aggregate approXimation (SAX) representations to find the segmentation that minimises PAA reconstruction error with an additive penalty controlling fragmentation.
SAX symbols are computed per channel, preserving multivariate structure. Similar symbols are merged into the same state via agglomerative clustering on Hamming distance, avoiding the brittleness of exact symbol matching.
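The merging step can be illustrated with a small sketch that groups symbol tuples whose normalised Hamming distance is within a threshold. The linkage is not stated here, so single linkage is assumed (two symbols share a state if a chain of close-enough symbols connects them), and `hamming` and `merge_symbols` are hypothetical names rather than tsseg functions.

```python
def hamming(a, b):
    """Fraction of positions at which two equal-length symbol tuples differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def merge_symbols(symbols, threshold):
    """Assign one state label per segment symbol (assumed single linkage)."""
    unique = sorted(set(symbols))
    parent = list(range(len(unique)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Union every pair of distinct symbols that is close enough.
    for i in range(len(unique)):
        for j in range(i + 1, len(unique)):
            if hamming(unique[i], unique[j]) <= threshold:
                parent[find(j)] = find(i)

    # Relabel cluster roots as consecutive integers.
    roots, label_of = {}, {}
    for i, sym in enumerate(unique):
        label_of[sym] = roots.setdefault(find(i), len(roots))
    return [label_of[sym] for sym in symbols]


segment_symbols = [(0, 0, 1, 1), (0, 0, 1, 2), (3, 3, 3, 3), (0, 0, 1, 1)]
states = merge_symbols(segment_symbols, threshold=0.25)
```

In this example the first two tuples differ in one position out of four, so they fall into the same state, while a threshold of 0 would keep them apart.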
- Parameters:
  - axis (`int`) – Time axis. `axis=0` assumes `(n_timepoints, n_channels)` input.
  - alphabet_size (`int`) – Number of SAX symbols per channel. Values >= 1 are supported.
  - paa_segments (`int`) – Number of PAA frames per segment. Short segments automatically reduce the number of frames so that every frame contains at least one sample; the resulting symbol is zero-padded (by repeating the last frame) to a fixed length of `paa_segments * n_channels`.
  - min_segment_length (`int`) – Minimum admissible segment length (in samples).
  - max_segment_length (`int`) – Maximum admissible segment length.
  - num_lengths (`int`) – Number of candidate lengths linearly spaced between `min` and `max`. Increasing this value improves flexibility at the cost of runtime.
  - penalty (`float`) – Cost added for every new segment. Use larger values to favour longer segments; reduce to obtain more change points.
  - symbol_merge_threshold (`float`) – Normalised distance threshold below which two SAX symbols are merged into the same state. Distance is measured as mean absolute difference of symbol indices divided by `alphabet_size` (so it lies in [0, 1]). `0` gives exact matching (original behaviour), `1` collapses everything into a single state.
  - zscore (`bool`) – Apply per-channel z-normalisation before computing scores.
  - adaptive_breakpoints (`bool`) – When `True`, learn SAX breakpoints from empirical quantiles of the training data instead of using Gaussian breakpoints.
  - random_state (`int | None`) – Accepted for API compatibility but unused (deterministic).
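Two of the parameters above lend themselves to a short illustration: the candidate lengths implied by `min_segment_length`, `max_segment_length` and `num_lengths`, and breakpoints learned from empirical quantiles. The sketch below uses hypothetical helper names; the rounding of candidate lengths and the quantile interpolation method are assumptions, not documented behaviour.

```python
from statistics import quantiles


def candidate_lengths(min_len, max_len, num_lengths):
    """Integer lengths spaced linearly from min_len to max_len, deduplicated."""
    if num_lengths <= 1:
        return [min_len]
    step = (max_len - min_len) / (num_lengths - 1)
    # Rounding to the nearest integer is an assumption of this sketch.
    return sorted({round(min_len + k * step) for k in range(num_lengths)})


def empirical_breakpoints(values, alphabet_size):
    """Cut points at the k / alphabet_size quantiles of the observed values."""
    if alphabet_size < 2:
        return []  # a single symbol needs no cut points
    return quantiles(values, n=alphabet_size, method="inclusive")


lengths = candidate_lengths(20, 180, 6)          # the default settings
cuts = empirical_breakpoints(range(1, 10), 4)    # quartiles of 1..9
```

Unlike Gaussian breakpoints, quantile-based cut points place roughly the same number of observed values in every bin even when the data are far from normally distributed.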
- set_fit_request(*, axis: bool | None | str = '$UNCHANGED$') -> VSAXDetector
Configure whether metadata should be requested to be passed to the `fit` method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with `enable_metadata_routing=True` (see `sklearn.set_config()`). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
  - `True`: metadata is requested, and passed to `fit` if provided. The request is ignored if metadata is not provided.
  - `False`: metadata is not requested and the meta-estimator will not pass it to `fit`.
  - `None`: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
  - `str`: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (`sklearn.utils.metadata_routing.UNCHANGED`) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- set_predict_request(*, axis: bool | None | str = '$UNCHANGED$') -> VSAXDetector
Configure whether metadata should be requested to be passed to the `predict` method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with `enable_metadata_routing=True` (see `sklearn.set_config()`). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
  - `True`: metadata is requested, and passed to `predict` if provided. The request is ignored if metadata is not provided.
  - `False`: metadata is not requested and the meta-estimator will not pass it to `predict`.
  - `None`: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
  - `str`: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (`sklearn.utils.metadata_routing.UNCHANGED`) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Module contents
Variable-length SAX baseline detector.