tsseg.algorithms.patss.segmentation package

Submodules

tsseg.algorithms.patss.segmentation.LogisticRegressionSegmentor module

class tsseg.algorithms.patss.segmentation.LogisticRegressionSegmentor.LogisticRegressionSegmentor(n_segments=None, regularization=0.1, k_means_kwargs=None, n_jobs=1)[source]

Bases: Segmentor

Segments a time series based on an embedding matrix in two steps.

First, a KMeans clustering model is fitted on the embedding, which will provide a discrete clustering (i.e., every observation in the time series will be assigned a discrete cluster label). The number of clusters K is decided based on the silhouette method. The discrete clustering serves as an initial model of where the different semantic segments occur.

Second, the discrete clustering is fed to a logistic regression model, which learns to which segment each observation of the pattern-based embedding belongs. Because this is a probabilistic model, we can also retrieve the probability of a given observation belonging to each segment, thereby obtaining a probabilistic segmentation.
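The two steps can be illustrated with plain scikit-learn. This is a minimal sketch, not the package's implementation: it assumes the embedding can be treated as a NumPy array with one row per time step, fixes K to 2 rather than selecting it, and uses synthetic data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Stand-in for a pattern-based embedding (assumption): one row per time step.
rng = np.random.default_rng(0)
embedding = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(100, 4)),
    rng.normal(loc=3.0, scale=0.3, size=(100, 4)),
])

# Step 1: discrete clustering; every observation gets exactly one cluster label.
k_means = KMeans(n_clusters=2, n_init=10, init="k-means++", random_state=0)
labels = k_means.fit_predict(embedding)

# Step 2: logistic regression trained with the cluster labels as targets.
logistic_regression = LogisticRegression(max_iter=1000)
logistic_regression.fit(embedding, labels)

# One probability per (time step, segment); each row sums to 1.
probabilities = logistic_regression.predict_proba(embedding)
```

The hard cluster labels only serve as training targets; the probabilistic segmentation comes from predict_proba.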

Parameters:

n_segments (Union[List[int], int]) – The number of segments that should be computed. If only an integer is given, then this is the exact number of segments. Otherwise, if a list of integers is given, the number of segments with the largest silhouette score will be used.
regularization (float) – The regularization factor used for fitting Logistic Regression.
k_means_kwargs (dict, default={'n_init': 'auto', 'init': 'k-means++'}) – Additional arguments to pass to the sklearn k-means clustering.
n_jobs (int) – The number of jobs that may run in parallel. This is used to compute the K-Means clusterings for several candidate numbers of segments at the same time.

k_means_

The fitted K-Means clustering model.

Type: sklearn.cluster.KMeans

logistic_regression_

The fitted logistic regression model.

Type: sklearn.linear_model.LogisticRegression

fit(pattern_based_embedding, y=None)[source]

Fits this segmentor on the given pattern-based embedding by training a logistic regression model on labels produced by K-Means clustering. (1) K-Means clustering is performed for every K in self.n_segments. If there are at least two values and self.n_jobs > 1, these clusterings are computed in parallel. (2) The best clustering is selected using the silhouette method, which gives an unsupervised estimate of the quality of a clustering. The selected clustering gives a general idea of the different segments in the data. (3) A logistic regression model is trained using the cluster assignments as labels, which yields a more fine-grained semantic segmentation as well as a probability distribution over the semantic segments.

Parameters:

pattern_based_embedding (PatternBasedEmbedding) – The pattern-based embedding used for training this segmentor.
y (Ignored) – Not used, present here for API consistency by convention.

Returns:

self – Returns the instance itself

Return type:

LogisticRegressionSegmentor
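Steps (1) and (2) of fit can be sketched as follows. The helper names fit_k_means and select_k_means are hypothetical and not part of this package, and the embedding is again assumed to be a NumPy array with one row per time step.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def fit_k_means(embedding, n_clusters):
    """Fit K-Means for one candidate K; return the model and its silhouette score."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(embedding)
    return model, silhouette_score(embedding, model.labels_)


def select_k_means(embedding, candidate_ks, n_jobs=1):
    """Fit every candidate K and keep the model with the largest silhouette score."""
    # With n_jobs > 1, joblib dispatches the independent fits in parallel.
    results = Parallel(n_jobs=n_jobs)(
        delayed(fit_k_means)(embedding, k) for k in candidate_ks
    )
    return max(results, key=lambda result: result[1])[0]


# Synthetic data: three well-separated blobs of 60 observations each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
embedding = np.vstack([rng.normal(c, 0.3, size=(60, 2)) for c in centers])

best = select_k_means(embedding, candidate_ks=[2, 3, 4, 5], n_jobs=1)
```

The labels of the selected model (best.labels_) would then be the training targets for the logistic regression in step (3).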

predict(pattern_based_embedding)[source]

Predicts the segment probabilities for the given pattern-based embedding. This is done by obtaining the predicted probabilities of the fitted logistic regression instance.

Parameters: pattern_based_embedding (PatternBasedEmbedding) – The pattern-based embedding to predict the segmentation for.
Returns: segmentation – The segmentation based on the given pattern-based embedding, which consists of n_segments different semantic segments for a time series with n_samples observations. The value segmentation[s, t] equals the probability of being in semantic segment s at time step t.
Return type: ndarray
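The indexing convention segmentation[s, t] puts segments on the first axis, whereas scikit-learn's predict_proba returns one row per sample. The sketch below shows one way to obtain that layout; the transpose is an assumption about how such an array could be produced, not a statement about this package's internals, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic embedding (assumption): 100 time steps, two clearly separated regimes.
rng = np.random.default_rng(0)
embedding = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 3)),
    rng.normal(3.0, 0.3, size=(50, 3)),
])
labels = np.repeat([0, 1], 50)

model = LogisticRegression(max_iter=1000).fit(embedding, labels)

# predict_proba has shape (n_samples, n_segments); transpose so that
# segmentation[s, t] is the probability of segment s at time step t.
segmentation = model.predict_proba(embedding).T

# A hard segmentation: the most likely segment at every time step.
hard_labels = segmentation.argmax(axis=0)
```

Each column of segmentation sums to 1, since it is a probability distribution over the segments at one time step.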

tsseg.algorithms.patss.segmentation.Segmentor module

class tsseg.algorithms.patss.segmentation.Segmentor.Segmentor[source]

Bases: ABC

abstractmethod fit(pattern_based_embedding, y=None)[source]

Fit this segmentor to the given embedding.

Parameters:

pattern_based_embedding (PatternBasedEmbedding) – The pattern-based embedding used for training this segmentor.
y (Ignored) – Not used, present here for API consistency by convention.

Returns:

self – Returns the instance itself

Return type:

Segmentor

fit_predict(pattern_based_embedding, y=None)[source]

Fit this segmentor on the given pattern-based embedding and predict the semantic segmentation.

Parameters:

pattern_based_embedding (PatternBasedEmbedding) – The pattern-based embedding used for training this segmentor and predicting the semantic segmentation.
y (Ignored) – Not used, present here for API consistency by convention.

Returns:

segmentation – The segmentation based on the given pattern-based embedding, which consists of n_segments different semantic segments for a time series with n_samples observations. The value segmentation[s, t] equals the probability of being in semantic segment s at time step t.

Return type:

ndarray
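Because fit returns the instance itself, fit_predict can be expressed as fit followed by predict on the same embedding. The sketch below illustrates that pattern with hypothetical classes (SketchSegmentor and UniformSegmentor are not part of this package); whether the real base class is implemented this way is an assumption.

```python
from abc import ABC, abstractmethod

import numpy as np


class SketchSegmentor(ABC):
    """Hypothetical stand-in for the Segmentor base class."""

    @abstractmethod
    def fit(self, pattern_based_embedding, y=None):
        ...

    @abstractmethod
    def predict(self, pattern_based_embedding):
        ...

    def fit_predict(self, pattern_based_embedding, y=None):
        # fit returns self, so the two calls can be chained.
        return self.fit(pattern_based_embedding, y).predict(pattern_based_embedding)


class UniformSegmentor(SketchSegmentor):
    """Toy subclass that assigns equal probability to every segment."""

    def __init__(self, n_segments):
        self.n_segments = n_segments

    def fit(self, pattern_based_embedding, y=None):
        return self

    def predict(self, pattern_based_embedding):
        n_samples = len(pattern_based_embedding)
        return np.full((self.n_segments, n_samples), 1.0 / self.n_segments)


# Ten time steps, four segments: the result has shape (4, 10).
segmentation = UniformSegmentor(n_segments=4).fit_predict(np.zeros((10, 3)))
```

A concrete subclass only needs to implement fit and predict; fit_predict is inherited.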

abstractmethod predict(pattern_based_embedding)[source]

Predicts the segment probabilities for the given pattern-based embedding.

Parameters: pattern_based_embedding (PatternBasedEmbedding) – The pattern-based embedding to predict the segmentation for.
Returns: segmentation – The segmentation based on the given pattern-based embedding, which consists of n_segments different semantic segments for a time series with n_samples observations. The value segmentation[s, t] equals the probability of being in semantic segment s at time step t.
Return type: ndarray

Module contents

This module performs a semantic segmentation given a pattern-based embedding: you can provide the PatternBasedEmbedding of a time series to a Segmentor in order to predict the segment probabilities.