tsseg.algorithms.igts package

IGTS — Information Gain based Temporal Segmentation.

Description

IGTS is a top-down greedy algorithm that locates the change point maximising the information gain at each step, then repeats on the resulting sub-signals. It works best on multivariate series where distribution shifts across channels provide discriminative evidence.

Warning

IGTS does not perform well on univariate series without augmentation.

Type: change point detection
Supervision: unsupervised or semi-supervised
Scope: multivariate (primarily)

Parameters

Name	Type	Default	Description
`k_max`	int	`10`	Maximum number of change points.
`step`	int	`5`	Stride for candidate locations.

Usage

from tsseg.algorithms import InformationGainDetector

detector = InformationGainDetector(k_max=8, step=5)
labels = detector.fit_predict(X)

Implementation: Adapted from aeon. BSD 3-Clause.

Reference: Sadri, Ren & Salim (2017), Information Gain-based Metric for Recognizing Transitions in Human Activities, Pervasive and Mobile Computing.

Submodules

tsseg.algorithms.igts.detector module

Information Gain-based Temporal Segmenter.

Information Gain Temporal Segmentation (IGTS) is a method for segmenting multivariate time series by reducing the entropy within each segment.

The entropy removed by a segmentation is called its Information Gain (IG). The aim is to find, for each number of segments, the segmentation with the maximum information gain.
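The definition can be written out as follows. This is a sketch that assumes the formulation of Sadri, Ren & Salim (2017) with non-negative channel values; the symbols are introduced here for illustration: S is the whole series, s_0, ..., s_k are the segments, m is the number of channels, and x_{t,c} is the value of channel c at time t.

```latex
H(s) = -\sum_{c=1}^{m} p_c(s)\,\log p_c(s),
\qquad
p_c(s) = \frac{\sum_{t \in s} x_{t,c}}{\sum_{c'=1}^{m} \sum_{t \in s} x_{t,c'}}

\mathrm{IG}(S; s_0, \dots, s_k) = H(S) - \sum_{i=0}^{k} \frac{|s_i|}{|S|}\, H(s_i)
```

Under this reading, a segment whose mass sits in a single channel has zero entropy, so a split that separates channel-dominated regimes yields a large gain.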

class tsseg.algorithms.igts.detector.InformationGainDetector(k_max=10, step=5)[source]

Bases: BaseSegmenter

Information Gain based Temporal Segmentation (IGTS) Estimator.

IGTS is an unsupervised method for segmenting time series into non-overlapping segments by locating change points that maximise the information gain.

Information gain (IG) is defined as the amount of entropy lost by the segmentation. The aim is to find the segmentation that has the maximum information gain for a specified number of segments.

IGTS uses a top-down search to greedily find the next change point that creates the maximum information gain. Once found, the process repeats until k_max splits have been made.
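The greedy search described above can be sketched in NumPy. This is a minimal illustration, not the library's implementation: the helper names entropy, information_gain and greedy_segmentation are hypothetical, the entropy is assumed to be the Shannon entropy of the per-channel mass within a segment, and the input is assumed to be non-negative with shape (n_timepoints, n_channels).

```python
import numpy as np

def entropy(segment):
    """Shannon entropy of the per-channel mass distribution of a segment."""
    totals = segment.sum(axis=0)
    p = totals / totals.sum()
    p = p[p > 0]  # skip empty channels to avoid log(0)
    return float(-(p * np.log(p)).sum())

def information_gain(X, change_points):
    """Entropy of the whole series minus length-weighted segment entropies."""
    bounds = sorted(set(change_points) | {0, len(X)})
    weighted = sum(
        (end - start) / len(X) * entropy(X[start:end])
        for start, end in zip(bounds[:-1], bounds[1:])
    )
    return entropy(X) - weighted

def greedy_segmentation(X, k_max=3, step=1):
    """Add one change point at a time, keeping the candidate with the highest gain."""
    found = []
    for _ in range(k_max):
        candidates = [c for c in range(step, len(X), step) if c not in found]
        if not candidates:
            break
        best = max(candidates, key=lambda c: information_gain(X, found + [c]))
        found.append(best)
    return sorted(found)

# Toy series: the mass moves from channel 0 to channel 1 at index 6.
X = np.array([[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 4)
print(greedy_segmentation(X, k_max=1))  # [6]
```

Each round re-evaluates every remaining candidate against the change points already chosen, so the cost grows with both k_max and the number of candidates; a larger step trades localisation accuracy for speed.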

For univariate input the series is automatically augmented with its normalised complement channel (Eq. 12-13 of [1]) so that entropy can vary across segments.
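With a single channel, the per-channel mass distribution of every segment is trivially [1], so its entropy is always zero and no split can gain information. One plausible reading of the augmentation is sketched below; the min-max normalisation and the helper name augment_univariate are assumptions made for illustration, and the exact form of Eq. 12-13 in [1] or in the library may differ.

```python
import numpy as np

def augment_univariate(x):
    """Stack a series scaled to [0, 1] with its complement (1 - scaled)."""
    x = np.asarray(x, dtype=float).reshape(-1)
    span = x.max() - x.min()
    # A constant series has no range; map it to zeros rather than divide by 0.
    scaled = (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return np.column_stack([scaled, 1.0 - scaled])

augmented = augment_univariate([0, 5, 10])
print(augmented.shape)  # (3, 2)
```

The two channels sum to one at every time point, so a shift in the level of the original series becomes a shift of mass between channels, which the entropy can register.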

Parameters:

k_max (int) – Maximum number of change points to find. The number of segments is therefore at most k_max + 1.
step (int) – Step size, or stride, for selecting candidate change point locations. For example, step=5 produces the candidates [0, 5, 10, …]. Has the same meaning as step in the built-in range function.
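The effect of step on the candidate set can be illustrated with range directly. The series length n_timepoints is a hypothetical value chosen for the example.

```python
n_timepoints = 23  # hypothetical series length
step = 5

# Candidate change point locations, spaced `step` apart.
candidates = list(range(0, n_timepoints, step))
print(candidates)  # [0, 5, 10, 15, 20]
```

A larger step shrinks the candidate set and speeds up the search, at the cost of placing change points only on the coarser grid.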

change_points_

Locations of change points as integer indexes. By convention, change points include the identity segmentation, i.e. the first index and the last index + 1.

Type: list of int
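Because the list includes both boundaries, consecutive entries delimit half-open segments. The sketch below expands such a list into one label per time point; change_points_to_labels is a hypothetical helper, not part of the package, and it assumes the list is sorted and starts at 0.

```python
import numpy as np

def change_points_to_labels(change_points):
    """Expand [0, cp1, ..., n] into one integer segment label per time point."""
    lengths = np.diff(change_points)  # length of each half-open segment
    return np.repeat(np.arange(len(lengths)), lengths)

labels = change_points_to_labels([0, 3, 7, 10])
print(labels.tolist())  # [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
```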

intermediate_results_

Intermediate segmentation results for each k value, where k = 1, 2, …, k_max.

Type: list of ChangePointResult

Notes

Based on the work from [1].

- Alternative Python implementation: https://github.com/cruiseresearchgroup/IGTS-python
- MATLAB version: https://github.com/cruiseresearchgroup/IGTS-matlab
- paper available at:

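References

[1] Sadri, Ren & Salim (2017), Information Gain-based Metric for Recognizing Transitions in Human Activities, Pervasive and Mobile Computing.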

Examples

>>> from aeon.testing.data_generation import make_example_dataframe_series
>>> from sklearn.preprocessing import MinMaxScaler
>>> from tsseg.algorithms import InformationGainDetector
>>> X = make_example_dataframe_series(n_channels=2, random_state=10)
>>> X_scaled = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)
>>> igts = InformationGainDetector(k_max=3, step=2)
>>> y = igts.fit_predict(X_scaled, axis=0)

set_fit_request(*, axis: bool | None | str = '$UNCHANGED$') → InformationGainDetector

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters: axis (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for axis parameter in fit.
Returns: self – The updated object.
Return type: object

set_predict_request(*, axis: bool | None | str = '$UNCHANGED$') → InformationGainDetector

Configure whether metadata should be requested to be passed to the predict method.