tsseg.algorithms.bocd package

BOCD (Bayesian Online Change-point Detection) — posterior thresholding.

Description

This offline Bayesian change-point detector integrates out the mean and variance of each segment under a conjugate Normal-Gamma prior, constructs a run-length posterior via dynamic programming, and selects change points by thresholding the posterior probability of a boundary. It extends the classical framework of Fearnhead (2006) and Adams & MacKay (2007) to multivariate time series.
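To make "integrating out the mean and variance" concrete, the sketch below evaluates the standard closed-form log marginal likelihood of a single segment under a Normal-Gamma prior. The function name `log_marginal_likelihood` is hypothetical and not part of tsseg. Its arguments mirror the `mu`, `kappa`, `alpha`, and `beta` parameters documented on this page, and the formula is the textbook conjugate result, which may differ in detail from the expression used internally.

```python
import math

def log_marginal_likelihood(segment, mu=0.0, kappa=1.0, alpha=1.0, beta=1.0):
    """Log evidence of `segment` with mean and variance integrated out.

    Hypothetical helper (not part of tsseg): standard Normal-Gamma result.
    """
    n = len(segment)
    mean = sum(segment) / n
    sq_dev = sum((x - mean) ** 2 for x in segment)  # within-segment scatter

    # Posterior hyperparameters after observing the segment.
    kappa_n = kappa + n
    alpha_n = alpha + n / 2.0
    beta_n = beta + 0.5 * sq_dev + kappa * n * (mean - mu) ** 2 / (2.0 * kappa_n)

    return (
        math.lgamma(alpha_n) - math.lgamma(alpha)
        + alpha * math.log(beta) - alpha_n * math.log(beta_n)
        + 0.5 * (math.log(kappa) - math.log(kappa_n))
        - 0.5 * n * math.log(2.0 * math.pi)
    )

# A series with a mean shift is better explained as two segments than as one.
series = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2]
whole = log_marginal_likelihood(series)
split = log_marginal_likelihood(series[:3]) + log_marginal_likelihood(series[3:])
print(whole, split)
```

In this toy comparison the split scores higher than the single segment. A full detector would also weigh the split against the prior on segment length, which this sketch leaves out.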

A constant hazard function \(H(\tau) = 1/\lambda\) controls the prior expectation of segment length. Two strategies handle multivariate inputs:

  • "l2" — reduces channels via L2 norm before inference.

  • "ensembling" — runs per-channel inference and aggregates change points.
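The "l2" reduction can be pictured as follows. This is a minimal sketch assuming a `(time, channels)` layout with time along `axis`. The helper name `reduce_l2` is hypothetical, and the library's own preprocessing may differ.

```python
import numpy as np

def reduce_l2(X, axis=0):
    """Collapse a multichannel array to one value per time step."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        return X  # already univariate
    X = np.moveaxis(X, axis, 0)          # put the time axis first
    flat = X.reshape(X.shape[0], -1)     # one row per time step
    return np.linalg.norm(flat, axis=1)  # Euclidean norm across channels

X = np.array([[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]])
reduced = reduce_l2(X)
print(reduced)  # norms 5, 0 and 10
```

Because the norm discards sign and channel identity, a change that only flips the sign of one channel is invisible after this reduction. Per-channel inference with "ensembling" avoids that limitation at the cost of running the detector once per channel.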

Type: change point detection
Supervision: unsupervised or semi-supervised
Scope: univariate and multivariate

Parameters

Name                   Type          Default  Description
---------------------  ------------  -------  -----------
hazard_lambda          float         300      Expected run length (constant hazard = 1/lambda).
mu                     float         0.0      Prior mean of segment observations.
kappa                  float         1.0      Strength of the prior mean (normal precision).
alpha                  float         1.0      Shape of the inverse-gamma prior over variance.
beta                   float         1.0      Scale of the inverse-gamma prior over variance.
truncate               int           -40      Log-probability truncation threshold for the DP matrix.
cp_prob_threshold      float         0.05     Minimum posterior probability to accept a change point.
min_distance           int           25       Minimum distance (samples) between successive change points.
max_cps                int / None    None     Optional upper bound on the number of change points.
multivariate_strategy  str           "l2"     "l2" (reduce via L2 norm) or "ensembling" (per-channel).
tolerance              int / float   0        Tolerance for aggregating change points in ensembling mode.
axis                   int           0        Time axis.

Usage

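import numpy as np
from tsseg.algorithms import BOCDDetector

# Illustrative input: 400 samples, 2 channels, time along axis 0, mean shift at t = 200.
X = np.random.default_rng(0).normal(size=(400, 2)) + np.repeat([[0.0], [3.0]], 200, axis=0)

detector = BOCDDetector(hazard_lambda=200, cp_prob_threshold=0.1)
labels = detector.fit_predict(X)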

Implementation: Adapted from hildensia/bayesian_changepoint_detection. Apache License 2.0.

Reference: Fearnhead (2006), Statistics and Computing; Adams & MacKay (2007), arXiv.

Submodules

tsseg.algorithms.bocd.bayesian_models module

tsseg.algorithms.bocd.bayesian_models.offline_changepoint_detection(data, prior_function, log_likelihood_class, truncate=-40)

Compute the likelihood of changepoints on data.

Parameters:
  • data – the time series data.

  • truncate – the cutoff probability 10^truncate below which computation stops for that change point's log-likelihood.

Outputs:

  • P – the log-likelihood of the data sequence [t, s], given that there is no change point between t and s.

  • Q – the log-likelihood of the data.

  • Pcp – the log-likelihood that the i-th change point is at time step t. To obtain the probability of a change point at time step t, sum the probabilities over i.
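Summing the probabilities in `Pcp` might look like the sketch below. It assumes `Pcp` is a 2-D array of log-probabilities with one row per change-point index and one column per time step. The toy values are invented for illustration and do not come from the library.

```python
import numpy as np

# Toy stand-in for Pcp: row i holds the log-probability that the i-th
# change point falls at each time step (values invented for illustration).
with np.errstate(divide="ignore"):  # log(0) = -inf is intended here
    Pcp = np.log(np.array([
        [0.0, 0.7, 0.1, 0.0, 0.0],
        [0.0, 0.0, 0.1, 0.6, 0.1],
    ]))

# Leave log space, then sum over the change-point index to get one
# probability per time step.
cp_prob = np.exp(Pcp).sum(axis=0)
print(cp_prob)
```

The result is a single curve over time whose peaks mark likely boundaries.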

tsseg.algorithms.bocd.bayesian_models.online_changepoint_detection(data, hazard_function, log_likelihood_class)

Run Bayesian online change-point detection. For an intuitive introduction, see https://scientya.com/bayesian-online-change-point-detection-an-intuitive-understanding-b2d2b9dc165b

Parameters: data – the time series data

Outputs:

  • R – the probability at time step t that the current segment is already s time steps long.

  • maxes – the argmax over the column axis of matrix R (growth probability value) for each time step.
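Two small illustrations may help here. The first shows a constant hazard of the kind that could be passed as `hazard_function`. The helper name `constant_hazard` and its `(lam, r)` signature are assumptions, not a documented tsseg API. The second reads a toy run-length matrix, assuming `R[s, t]` stores the probability of run length `s` at time step `t`.

```python
import numpy as np

def constant_hazard(lam, r):
    """Hazard 1/lam for every run length in r (hypothetical helper)."""
    return np.full(np.shape(r), 1.0 / lam)

# Under a constant hazard h the segment length L is geometric:
# P(L = k) = h * (1 - h)**(k - 1), so the expected length is 1/h = lam.
lam = 300
k = np.arange(1, 20001)
h = constant_hazard(lam, k)
pmf = h * (1.0 - h) ** (k - 1)
expected_length = float((k * pmf).sum())
print(expected_length)  # close to lam

# Toy run-length posterior (invented values); each column sums to 1.
R = np.array([
    [1.0, 0.1, 0.1, 0.8],
    [0.0, 0.9, 0.1, 0.1],
    [0.0, 0.0, 0.8, 0.1],
])
maxes = R.argmax(axis=0)  # most probable run length at each time step
print(maxes)              # run lengths 0, 1, 2, 0
```

A drop of the most probable run length back to 0, as in the last column, is the usual sign that a new segment has started.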

tsseg.algorithms.bocd.detector module

class tsseg.algorithms.bocd.detector.BOCDDetector(hazard_lambda=300, mu=0.0, kappa=1.0, alpha=1.0, beta=1.0, truncate=-40, cp_prob_threshold=0.05, min_distance=25, max_cps=None, multivariate_strategy='l2', tolerance=0, axis=0)

Bases: BaseSegmenter

Bayesian change-point detector based on offline inference.

This implementation wraps the offline Bayesian change-point detection heuristic introduced by Fearnhead (2006). It integrates out the mean and variance of each segment under a conjugate Normal-Gamma prior and selects change points by thresholding the posterior probability of a boundary.
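One plausible reading of the thresholding step is a greedy peak selection, sketched below. The function `select_change_points` is hypothetical. The order in which the detector applies `cp_prob_threshold`, `min_distance`, and `max_cps`, and whether the threshold comparison is inclusive, are assumptions made for this illustration.

```python
import numpy as np

def select_change_points(cp_prob, threshold=0.05, min_distance=25, max_cps=None):
    """Greedy peak picking on a change-point posterior (illustrative only)."""
    cp_prob = np.asarray(cp_prob, dtype=float)
    candidates = np.flatnonzero(cp_prob >= threshold)
    # Visit candidates from most to least probable.
    order = candidates[np.argsort(-cp_prob[candidates], kind="stable")]
    kept = []
    for t in order:
        # Keep a candidate only if it is far enough from every accepted one.
        if all(abs(int(t) - s) >= min_distance for s in kept):
            kept.append(int(t))
        if max_cps is not None and len(kept) >= max_cps:
            break
    return sorted(kept)

probs = np.zeros(200)
probs[[50, 52, 120, 170]] = [0.6, 0.4, 0.3, 0.02]
print(select_change_points(probs))             # [50, 120]
print(select_change_points(probs, max_cps=1))  # [50]
```

Here index 52 is dropped because it lies within `min_distance` of the stronger peak at 50, and index 170 never qualifies because its probability is below the threshold.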

Parameters:
  • hazard_lambda (float) – Expected run length. Internally converted to a constant hazard probability of 1 / hazard_lambda.

  • mu (float) – Prior mean of the segment observations.

  • kappa (float) – Strength of the prior mean (normal precision).

  • alpha (float) – Prior shape of the inverse-gamma distribution over the variance.

  • beta (float) – Prior scale of the inverse-gamma distribution over the variance.

  • truncate (int) – Log-probability truncation used by the dynamic programme. Terms whose contribution falls below a factor of 10**truncate relative to the running total are skipped.

  • cp_prob_threshold (float) – Minimum posterior probability required to accept a change point.

  • min_distance (int) – Minimum distance (in samples) enforced between successive change points.

  • max_cps (int | None) – Optional cap on the number of change points to return. None keeps all candidates above the probability threshold.

  • multivariate_strategy (str) –

    Strategy for handling multivariate data:

    • "l2" – reduce to univariate via L2 norm across channels.

    • "ensembling" – run BOCD independently on each channel and aggregate the detected change points.

  • tolerance (int | float) – Tolerance for aggregating change points in the ensembling strategy. If a float in (0, 1) it is interpreted as a fraction of the signal length.

  • axis (int) – Time axis in the input array.
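The interaction between `tolerance` and the "ensembling" strategy can be sketched as follows. Both helpers are hypothetical. How the detector groups nearby change points, and which member represents each group, are assumptions chosen for illustration.

```python
def resolve_tolerance(tolerance, n_samples):
    """Turn a fractional tolerance into a sample count (hypothetical helper)."""
    if isinstance(tolerance, float) and 0.0 < tolerance < 1.0:
        return int(round(tolerance * n_samples))
    return int(tolerance)

def aggregate_change_points(per_channel_cps, tolerance):
    """Merge per-channel change points that lie within `tolerance` samples."""
    merged = sorted(cp for cps in per_channel_cps for cp in cps)
    groups = []
    for cp in merged:
        if groups and cp - groups[-1][-1] <= tolerance:
            groups[-1].append(cp)   # close to the previous one: same event
        else:
            groups.append([cp])     # start a new group
    # Represent each group by its middle member (an arbitrary choice).
    return [g[len(g) // 2] for g in groups]

tol = resolve_tolerance(0.01, n_samples=1000)   # 1% of 1000 samples
cps = aggregate_change_points([[100, 400], [104, 700], [98, 405]], tol)
print(tol, cps)  # 10 [100, 405, 700]
```

With a tolerance of 0, only change points detected at exactly the same index in different channels would be merged.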

set_fit_request(*, axis: bool | None | str = '$UNCHANGED$') BOCDDetector

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

axis (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for axis parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, axis: bool | None | str = '$UNCHANGED$') BOCDDetector

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

axis (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for axis parameter in predict.

Returns:

self – The updated object.

Return type:

object

Module contents

Bayesian online change detection algorithm utilities.