tsseg.algorithms.time2state package

Time2State — unsupervised latent state inference.

Description

Time2State infers latent states in time series data using a Causal CNN-based encoder and a novel unsupervised loss function (LSE-Loss). A sliding window extracts subsequences; the encoder maps them to a low-dimensional latent space where a Dirichlet Process Gaussian Mixture Model (DPGMM) clusters the embeddings into states without requiring the number of states a priori.

Because clustering operates on these low-dimensional embeddings rather than on the raw subsequences, the framework reduces computational cost relative to clustering the raw time series directly.
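The windowing step can be sketched in plain Python. This is a minimal illustration rather than the package's own code: `window_starts` and `sliding_windows` are hypothetical helper names, and the sketch assumes that only full windows are kept.

```python
def window_starts(n_timepoints, window_size, step):
    """Start indices of every full sliding window over a series."""
    if window_size > n_timepoints:
        return []
    return list(range(0, n_timepoints - window_size + 1, step))


def sliding_windows(series, window_size, step):
    """Extract each full window as a slice of `series` (a list of timepoints)."""
    return [series[s:s + window_size]
            for s in window_starts(len(series), window_size, step)]


# With the documented defaults, a series of 1000 timepoints yields 15 windows,
# the last one starting at index 700.
starts = window_starts(1000, window_size=256, step=50)
print(len(starts), starts[-1])  # 15 700
```

Each window would then be passed to the encoder, so the clustering stage sees one embedding per window instead of `window_size` raw values.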

Type: state detection
Supervision: unsupervised or semi-supervised
Scope: univariate and multivariate
Requires: PyTorch

Parameters

Name           Type         Default   Description
window_size    int          256       Sliding window size.
step           int          50        Step size of the sliding window.
n_states       int          20        Maximum number of states for DPGMM.
alpha          float        1e3       DPGMM concentration parameter.
batch_size     int          1         Training batch size.
nb_steps       int          20        Training optimisation steps.
lr             float        0.003     Learning rate.
depth          int          10        Depth of the Causal CNN.
out_channels   int          4         Encoder output channels.
reduced_size   int          80        CNN output dimension before the linear layer.
kernel_size    int          3         Convolution kernel size.
use_gpu        bool / None  None      Force GPU (None = auto-detect).
random_state   int / None   None      Random seed.
axis           int          0         Time axis.

Usage

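import numpy as np
from tsseg.algorithms import Time2StateDetector

# Placeholder input: 1000 timepoints, 3 channels (axis=0 layout).
X = np.random.default_rng(0).standard_normal((1000, 3))

detector = Time2StateDetector(window_size=128, n_states=10)
states = detector.fit_predict(X)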
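If the returned `states` holds one integer label per timepoint (an assumption; the output format of `fit_predict` is not stated here), segment boundaries can be recovered with a small helper. `change_points` is a hypothetical name used only for this sketch.

```python
def change_points(labels):
    """Indices at which the state label differs from the previous timepoint."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Toy label sequence standing in for a per-timepoint prediction.
toy_states = [0, 0, 1, 1, 1, 2, 0]
print(change_points(toy_states))  # [2, 5, 6]
```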

Implementation: Adapted from the original Time2State code.

Reference: Wang, Wu, Zhou & Cai (2023), Time2State: An Unsupervised Framework for Inferring the Latent States in Time Series Data, SIGMOD.

Submodules

tsseg.algorithms.time2state.detector module

This module provides an aeon-compatible wrapper for the Time2State algorithm.

class tsseg.algorithms.time2state.detector.Time2StateDetector(axis=0, window_size=256, step=50, n_states=20, alpha=1000.0, batch_size=1, nb_steps=20, lr=0.003, depth=10, out_channels=4, reduced_size=80, kernel_size=3, use_gpu=None, random_state=None)

Bases: BaseSegmenter

A wrapper for the Time2State algorithm for time series segmentation, compatible with the aeon library.

Time2State uses a Causal CNN based encoder to learn representations of time series windows, which are then clustered to identify states.

Parameters:
  • axis (int, default=0) – Time axis of a multivariate (2D) input. With axis=0, each row is a timepoint and each column is a channel, so the data has shape (n_timepoints, n_channels). With axis=1, each row is a channel, so the data has shape (n_channels, n_timepoints).

  • window_size (int, default=256) – The size of the sliding window.

  • step (int, default=50) – The step size of the sliding window.

  • n_states (int, default=20) – The maximum number of states for the DPGMM clustering.

  • alpha (float, default=1e3) – The concentration parameter for the DPGMM clustering.

  • batch_size (int, default=1) – Batch size for training the neural network.

  • nb_steps (int, default=20) – Number of optimization steps for training.

  • lr (float, default=0.003) – Learning rate for the optimizer.

  • depth (int, default=10) – Depth of the Causal CNN.

  • out_channels (int, default=4) – Number of output channels of the encoder.

  • reduced_size (int, default=80) – Size of the output of the Causal CNN before the final linear layer.

  • kernel_size (int, default=3) – Kernel size for the convolutions in the Causal CNN.

  • use_gpu (bool, optional) – Whether to use GPU if available. If None, it will be auto-detected.

  • random_state (int, optional) – Random state for reproducibility.
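The `axis` convention in the list above can be illustrated with nested lists. This is only a sketch of the two layouts; `to_time_major` is a hypothetical helper, not part of the package.

```python
def to_time_major(X, axis=0):
    """Return X as a list of timepoints, i.e. shape (n_timepoints, n_channels)."""
    if axis == 0:
        # Already one row per timepoint.
        return [list(row) for row in X]
    if axis == 1:
        # One row per channel: transpose so that rows become timepoints.
        return [list(col) for col in zip(*X)]
    raise ValueError("axis must be 0 or 1")


# Two channels, three timepoints, stored channel-major (axis=1).
channel_major = [[1, 2, 3],
                 [4, 5, 6]]
print(to_time_major(channel_major, axis=1))  # [[1, 4], [2, 5], [3, 6]]
```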

set_fit_request(*, axis: bool | None | str = '$UNCHANGED$') → Time2StateDetector

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide on metadata routing for how the mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:

axis (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for axis parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, axis: bool | None | str = '$UNCHANGED$') → Time2StateDetector

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide on metadata routing for how the mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:

axis (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for axis parameter in predict.

Returns:

self – The updated object.

Return type:

object

tsseg.algorithms.time2state.time2state module

class tsseg.algorithms.time2state.time2state.BasicClusteringClass(params)

Bases: object

abstractmethod fit(X)

class tsseg.algorithms.time2state.time2state.BasicEncoder

Bases: object

encode(X)
load(X)
save(X)

class tsseg.algorithms.time2state.time2state.BasicEncoderClass(params)

Bases: object

abstractmethod encode(X, win_size, step)
abstractmethod fit(X)

class tsseg.algorithms.time2state.time2state.CausalCNN(*args: Any, **kwargs: Any)

Bases: Module

Causal CNN, composed of a sequence of causal convolution blocks.

Takes as input a three-dimensional tensor (B, C, L) where B is the batch size, C is the number of input channels, and L is the length of the input. Outputs a three-dimensional tensor (B, C_out, L).

Parameters:
  • in_channels (int) – Number of input channels.

  • channels (int) – Number of channels processed in the network and of output channels.

  • depth (int) – Depth of the network.

  • out_channels (int) – Number of output channels.

  • kernel_size (int) – Kernel size of the applied non-residual convolutions.
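How far back in time such a network can see follows from the kernel size and the dilation of each block. The sketch below computes that receptive field for an arbitrary dilation schedule. It relies on the statement elsewhere in this reference that each block contains two causal convolutions; the doubling dilation schedule in the example is an assumption about the architecture, not something stated here, and `receptive_field` is a hypothetical helper.

```python
def receptive_field(kernel_size, dilations, convs_per_block=2):
    """Number of input timepoints that can influence one output timepoint.

    Each causal convolution with kernel size k and dilation d extends the
    receptive field by (k - 1) * d timepoints.
    """
    field = 1
    for dilation in dilations:
        field += convs_per_block * (kernel_size - 1) * dilation
    return field


# ASSUMPTION: dilation doubles from block to block, with `depth` blocks
# followed by one final block.
depth = 10
dilations = [2 ** i for i in range(depth + 1)]
print(receptive_field(3, dilations))  # 8189
```

Under that assumed schedule, the default depth of 10 gives a receptive field far larger than the default window of 256 timepoints, so every output position could see the whole window.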

forward(x)

class tsseg.algorithms.time2state.time2state.CausalCNNEncoder(*args: Any, **kwargs: Any)

Bases: Module

Encoder of a time series using a causal CNN. The representation is computed in three stages: the causal CNN, an adaptive max pooling layer that reduces the length of the time series to a fixed size, and a fully connected layer applied to the pooled output.

Takes as input a three-dimensional tensor (B, C, L) where B is the batch size, C is the number of input channels, and L is the length of the input. Outputs a two-dimensional tensor (B, C_out), where C_out is the number of output channels.

Parameters:
  • in_channels (int) – Number of input channels.

  • channels (int) – Number of channels manipulated in the causal CNN.

  • depth (int) – Depth of the causal CNN.

  • reduced_size (int) – Fixed length to which the output time series is reduced by the adaptive pooling layer.

  • out_channels (int) – Number of output channels.

  • kernel_size (int) – Kernel size of the applied non-residual convolutions.
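The pooling and linear stages can be sketched without PyTorch. The pooling function below follows the binning rule used by PyTorch's adaptive pooling layers; `encode_sample` is a hypothetical helper that, as an assumption made for this sketch only, pools each channel to a single value before the linear map.

```python
def adaptive_max_pool1d(x, output_size):
    """Max-pool a sequence of length L down to exactly `output_size` values.

    Bin i covers indices [floor(i * L / out), ceil((i + 1) * L / out)).
    """
    length = len(x)
    pooled = []
    for i in range(output_size):
        start = (i * length) // output_size
        end = ((i + 1) * length + output_size - 1) // output_size
        pooled.append(max(x[start:end]))
    return pooled


def encode_sample(channels, weights, bias):
    """Pool each channel over time, then apply a fully connected layer.

    ASSUMPTION: pooled length is 1, so the pooled vector has one value per channel.
    """
    pooled = [adaptive_max_pool1d(ch, 1)[0] for ch in channels]
    return [sum(w * p for w, p in zip(row, pooled)) + b
            for row, b in zip(weights, bias)]


print(adaptive_max_pool1d([1, 5, 2, 4, 3, 0], 3))  # [5, 4, 3]
```

Because the pooled length is fixed, the output size does not depend on the input length L, which is what lets windows of any size map to embeddings of the same dimension.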

forward(x)

class tsseg.algorithms.time2state.time2state.CausalConv_LSE(win_size, batch_size, nb_steps, lr, channels, depth, reduced_size, out_channels, kernel_size, in_channels, cuda, gpu, M, N, win_type)

Bases: BasicEncoder

encode(X, batch_size=500)

Outputs the encoder's representations of the input.

Parameters:
  • X (numpy.ndarray) – Testing set.

  • batch_size (int, default=500) – Size of batches used for splitting the test data to avoid out of memory errors when using CUDA. Ignored if the testing set contains time series of unequal lengths.

encode_window(X, win_size=128, batch_size=500, window_batch_size=10000, step=10)

Outputs the encoder's representation of each subseries of the input of the given size (sliding window representations).

Parameters:
  • X (numpy.ndarray) – Testing set.

  • win_size (int, default=128) – Size of the sliding window.

  • batch_size (int, default=500) – Size of batches used for splitting the test data to avoid out-of-memory errors when using CUDA.

  • window_batch_size (int, default=10000) – Number of windows processed per batch when calling encode to save RAM.

  • step (int, default=10) – Step length of the sliding window.
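The role of `window_batch_size` can be sketched as follows: windows are materialised and encoded in chunks so that only one chunk is held in memory at a time. This is an illustration under assumptions, not the method's actual code; `encode_windows_in_batches` and the toy `mean_encoder` are hypothetical stand-ins.

```python
def encode_windows_in_batches(series, win_size, step, window_batch_size, encode_batch):
    """Encode every full sliding window, building at most
    `window_batch_size` windows in memory at once."""
    starts = range(0, len(series) - win_size + 1, step)
    embeddings = []
    for b in range(0, len(starts), window_batch_size):
        batch = [series[s:s + win_size] for s in starts[b:b + window_batch_size]]
        embeddings.extend(encode_batch(batch))
    return embeddings


def mean_encoder(batch):
    """Toy stand-in for the trained encoder: one number per window."""
    return [sum(window) / len(window) for window in batch]


series = list(range(10))
print(encode_windows_in_batches(series, win_size=4, step=2,
                                window_batch_size=3, encode_batch=mean_encoder))
# [1.5, 3.5, 5.5, 7.5]
```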

fit(X, y=None, save_memory=False, verbose=False)
set_params(compared_length, batch_size, nb_steps, lr, channels, depth, reduced_size, out_channels, kernel_size, in_channels, cuda, gpu)

class tsseg.algorithms.time2state.time2state.CausalConv_LSE_Adaper(params)

Bases: BasicEncoderClass

encode(X, win_size, step)
fit(X)

class tsseg.algorithms.time2state.time2state.CausalConvolutionBlock(*args: Any, **kwargs: Any)

Bases: Module

Causal convolution block, composed sequentially of two causal convolutions (with leaky ReLU activation functions), and a parallel residual connection.

Takes as input a three-dimensional tensor (B, C, L) where B is the batch size, C is the number of input channels, and L is the length of the input. Outputs a three-dimensional tensor (B, C, L).

Parameters:
  • in_channels (int) – Number of input channels.

  • out_channels (int) – Number of output channels.

  • kernel_size (int) – Kernel size of the applied non-residual convolutions.

  • dilation (int) – Dilation parameter of non-residual convolutions.

  • final (bool, default=False) – If True, disables the last activation function.
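The structure described above (two causal convolutions with leaky ReLU activations plus a residual connection) can be sketched for a single channel in plain Python. This is a simplified illustration: it omits biases, multiple channels, and the `final` flag, and it assumes an identity residual path. All function names are hypothetical.

```python
def causal_conv1d(x, weights, dilation=1):
    """Single-channel causal convolution: output[t] depends only on
    x[t], x[t - dilation], x[t - 2 * dilation], ... (missing values are zero)."""
    k = len(weights)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - (k - 1 - j) * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def leaky_relu(values, slope=0.01):
    """Leaky ReLU with PyTorch's default negative slope."""
    return [v if v > 0 else slope * v for v in values]


def causal_block(x, w1, w2, dilation=1):
    """Two causal convolutions with activations, plus an identity residual."""
    h = leaky_relu(causal_conv1d(x, w1, dilation))
    h = leaky_relu(causal_conv1d(h, w2, dilation))
    return [a + b for a, b in zip(h, x)]


# Changing inputs at t >= 3 leaves outputs at t < 3 untouched (causality).
a = causal_block([1, 2, 3, 4, 5], [0.5, -1.0], [1.0, 2.0], dilation=2)
b = causal_block([1, 2, 3, 99, -7], [0.5, -1.0], [1.0, 2.0], dilation=2)
print(a[:3] == b[:3])  # True
```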

forward(x)

class tsseg.algorithms.time2state.time2state.Chomp1d(*args: Any, **kwargs: Any)

Bases: Module

Removes the last elements of a time series.

Takes as input a three-dimensional tensor (B, C, L) where B is the batch size, C is the number of input channels, and L is the length of the input. Outputs a three-dimensional tensor (B, C, L - s) where s is the number of elements to remove.

Parameters:

chomp_size (int) – Number of elements to remove.
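A common use for such a layer, and presumably its role here (an assumption), is to turn a symmetrically padded convolution into a causal one by discarding the trailing outputs that would depend on future padding. The single-channel sketch below uses hypothetical helper names.

```python
def chomp(x, chomp_size):
    """Drop the last `chomp_size` elements: length L becomes L - chomp_size."""
    return x[:len(x) - chomp_size]


def padded_conv1d(x, weights, dilation=1):
    """Convolution with (k - 1) * dilation zeros of padding on BOTH sides.

    The output has length L + (k - 1) * dilation.
    """
    pad = (len(weights) - 1) * dilation
    xp = [0.0] * pad + list(x) + [0.0] * pad
    out_len = len(xp) - pad
    return [sum(w * xp[t + j * dilation] for j, w in enumerate(weights))
            for t in range(out_len)]


x = [1, 2, 3, 4]
full = padded_conv1d(x, [1, 1])      # length 5: last value mixes in right padding
causal = chomp(full, chomp_size=1)   # length 4: each output uses only past inputs
print(causal)  # [1.0, 3.0, 5.0, 7.0]
```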

forward(x)

class tsseg.algorithms.time2state.time2state.DPGMM(n_states, alpha=1000.0)

Bases: BasicClusteringClass

fit(X)

class tsseg.algorithms.time2state.time2state.Dataset(*args: Any, **kwargs: Any)

Bases: Dataset

PyTorch wrapper for a numpy dataset.

Parameters:

dataset (numpy.ndarray) – Array representing the dataset.

class tsseg.algorithms.time2state.time2state.LSELoss(*args: Any, **kwargs: Any)

Bases: _Loss

LSE loss for representations of time series.

Parameters:
  • win_size (int, even) – Size of the sliding window.

  • M (int) – Number of inter-state samples.

  • N (int) – Number of intra-state samples.

  • win_type ({'rect', 'hanning'}) – Window function.
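The two `win_type` options differ in how samples inside a window are weighted. The sketch below shows the standard Hann weights (the same formula as `numpy.hanning`) next to the rectangular case. How the loss actually applies the window is not documented here, so `apply_window` is only a hypothetical illustration.

```python
import math


def hann_window(size):
    """Hann weights w[n] = 0.5 - 0.5 * cos(2 * pi * n / (size - 1))."""
    if size == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1))
            for n in range(size)]


def apply_window(values, win_type="rect"):
    """Weight the samples of one window: 'rect' keeps them unchanged,
    'hanning' tapers them towards zero at both ends."""
    if win_type == "rect":
        return list(values)
    if win_type == "hanning":
        return [v * w for v, w in zip(values, hann_window(len(values)))]
    raise ValueError("win_type must be 'rect' or 'hanning'")


print([round(w, 3) for w in hann_window(5)])  # [0.0, 0.5, 1.0, 0.5, 0.0]
```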

forward(batch, encoder, save_memory=False)

class tsseg.algorithms.time2state.time2state.SqueezeChannels(*args: Any, **kwargs: Any)

Bases: Module

Squeezes the third dimension of a three-dimensional tensor.

forward(x)

class tsseg.algorithms.time2state.time2state.Time2State(win_size, step, encoder, clustering_component, verbose=False)

Bases: object

property change_points
property embedding_label
property embeddings
fit(X, win_size, step)

Fit Time2State.

Parameters:
  • X (ndarray of shape (n_samples, n_features)) – Input time series.

  • win_size (int, even) – Size of the sliding window.

  • step (int) – Step size of the sliding window.

Returns:

self – Fitted Time2State.

Return type:

object

fit_encoder(X)
load_result(path)
plot(path)

predict(X, win_size, step)

Find state sequence for X.

Parameters:
  • X (ndarray of shape (n_samples, n_features)) – Input time series.

  • win_size (int, even) – Size of the sliding window.

  • step (int) – Step size of the sliding window.

Returns:

self – Fitted Time2State.

Return type:

object
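Clustering produces one label per window, while a state sequence needs one label per timepoint. One plausible way to bridge the two is majority voting over overlapping windows. The sketch below is an assumption about the general approach, not a description of this method's internals, and `vote_state_sequence` is a hypothetical helper.

```python
def vote_state_sequence(window_labels, n_timepoints, win_size, step):
    """Assign each timepoint the label voted for by most windows covering it.

    Ties, and timepoints covered by no window, fall back to the lowest label.
    """
    n_states = max(window_labels) + 1
    votes = [[0] * n_states for _ in range(n_timepoints)]
    for w, label in enumerate(window_labels):
        start = w * step
        for t in range(start, min(start + win_size, n_timepoints)):
            votes[t][label] += 1
    return [row.index(max(row)) for row in votes]


# Four windows of size 4 with step 2 over 10 timepoints.
print(vote_state_sequence([0, 0, 1, 1], n_timepoints=10, win_size=4, step=2))
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
```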

predict_without_encode(X, win_size, step)
set_clustering_component(clustering_obj)
set_step(step)
property state_seq
property velocity

tsseg.algorithms.time2state.time2state.hanning_numpy(X)
tsseg.algorithms.time2state.time2state.hanning_tensor(X)
