tsseg.algorithms.bocd package
BOCD (Bayesian Online Change-point Detection) — posterior thresholding.
Description
This offline Bayesian change-point detector integrates out the mean and variance of each segment under a conjugate Normal-Gamma prior, constructs a run-length posterior via dynamic programming, and selects change points by thresholding the posterior probability of a boundary. It extends the classical framework of Fearnhead (2006) and Adams & MacKay (2007) to multivariate time series.
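To illustrate what "integrating out the mean and variance" means, here is a minimal sketch of the closed-form log evidence of one segment under a Normal-Gamma prior. The function name `log_marginal_likelihood` is hypothetical, the hyperparameter defaults mirror the detector's `mu`, `kappa`, `alpha` and `beta`, and the expression is the standard textbook result for i.i.d. Gaussian data. It is not necessarily the exact code path used inside the package.

```python
import math


def log_marginal_likelihood(segment, mu=0.0, kappa=1.0, alpha=1.0, beta=1.0):
    """Log evidence of a segment with its Gaussian mean and variance integrated out.

    Assumes i.i.d. Gaussian observations and a Normal-Gamma prior with
    hyperparameters (mu, kappa, alpha, beta). Hypothetical helper, not part of tsseg.
    """
    n = len(segment)
    mean = sum(segment) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in segment)

    # Conjugate posterior hyperparameters after seeing the whole segment.
    kappa_n = kappa + n
    alpha_n = alpha + n / 2.0
    beta_n = (
        beta
        + 0.5 * sum_sq_dev
        + kappa * n * (mean - mu) ** 2 / (2.0 * kappa_n)
    )

    # Ratio of Normal-Gamma normalising constants, in log space.
    return (
        math.lgamma(alpha_n) - math.lgamma(alpha)
        + alpha * math.log(beta) - alpha_n * math.log(beta_n)
        + 0.5 * (math.log(kappa) - math.log(kappa_n))
        - 0.5 * n * math.log(2.0 * math.pi)
    )


# With the default hyperparameters, a single observation has a Student-t
# marginal with 2 degrees of freedom and scale sqrt(2).
single_point_density = math.exp(log_marginal_likelihood([0.0]))
```

Because the evidence of every candidate segment is available in closed form, a dynamic programme can compare segmentations without estimating a mean or variance per segment.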
A constant hazard function \(H(\tau) = 1/\lambda\) controls the prior expectation of segment length. Two strategies handle multivariate inputs:
- "l2": reduces channels to a single series via the L2 norm before inference.
- "ensembling": runs inference on each channel independently and aggregates the detected change points.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| hazard_lambda | float | 300 | Expected run length (constant hazard = 1 / hazard_lambda). |
| mu | float | 0.0 | Prior mean of segment observations. |
| kappa | float | 1.0 | Strength of the prior mean (normal precision). |
| alpha | float | 1.0 | Shape of the inverse-gamma prior over variance. |
| beta | float | 1.0 | Scale of the inverse-gamma prior over variance. |
| truncate | int | -40 | Log-probability truncation threshold for the DP matrix. |
| cp_prob_threshold | float | 0.05 | Minimum posterior probability to accept a change point. |
| min_distance | int | 25 | Minimum distance (samples) between successive change points. |
| max_cps | int / None | None | Optional upper bound on the number of change points. |
| multivariate_strategy | str | "l2" | Multivariate handling: "l2" or "ensembling". |
| tolerance | float | 0 | Tolerance for aggregating change points in ensembling mode. |
| axis | int | 0 | Time axis. |
Usage
from tsseg.algorithms import BOCDDetector
# X holds the time series, with time along axis 0 (the default value of `axis`).
detector = BOCDDetector(hazard_lambda=200, cp_prob_threshold=0.1)
labels = detector.fit_predict(X)
Implementation: Adapted from hildensia/bayesian_changepoint_detection. Apache License 2.0.
Reference: Fearnhead (2006), Statistics and Computing; Adams & MacKay (2007), arXiv.
Submodules
tsseg.algorithms.bocd.bayesian_models module
- tsseg.algorithms.bocd.bayesian_models.offline_changepoint_detection(data, prior_function, log_likelihood_class, truncate=-40)
Compute the likelihood of change points on data.
Parameters:
data – the time series data.
truncate – the cutoff probability 10^truncate below which computation of that change point's log-likelihood stops.
- Outputs:
P – the log-likelihood of the data sequence [t, s], given that there is no change point between t and s.
Q – the log-likelihood of the data.
Pcp – the log-likelihood that the i-th change point is at time step t. To obtain the probability of a change point at time step t, sum the probabilities over i.
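The last step, turning `Pcp` into a per-time-step change-point probability, can be sketched as follows. This is an illustration only: it assumes `Pcp` is laid out with one row per change-point index i and one column per time step t, and uses a hand-written toy matrix instead of real output. The helper name `change_point_probability` is hypothetical.

```python
import math


def change_point_probability(log_pcp):
    """Sum exp(log_pcp[i][t]) over i to get P(change point at time step t)."""
    n_steps = len(log_pcp[0])
    return [
        sum(math.exp(row[t]) for row in log_pcp)
        for t in range(n_steps)
    ]


NEG_INF = float("-inf")  # log of a zero probability

# Toy log-probabilities: 2 change-point indices (rows) x 3 time steps (columns).
toy_log_pcp = [
    [math.log(0.10), math.log(0.60), NEG_INF],
    [NEG_INF, math.log(0.05), math.log(0.30)],
]
probs = change_point_probability(toy_log_pcp)
```

With NumPy arrays the same reduction is a single expression, `np.exp(Pcp).sum(0)`.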
- tsseg.algorithms.bocd.bayesian_models.online_changepoint_detection(data, hazard_function, log_likelihood_class)
Run online Bayesian change-point detection. For an intuitive introduction, see https://scientya.com/bayesian-online-change-point-detection-an-intuitive-understanding-b2d2b9dc165b
Parameters:
data – the time series data.
- Outputs:
R – the probability at time step t that the current sequence is already s time steps long.
maxes – the argmax along the column axis of matrix R (growth probability value) for each time step.
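To make the meaning of R concrete, here is a self-contained sketch of the run-length recursion of Adams & MacKay (2007) with a constant hazard and a Normal-Gamma observation model. It is written from the published algorithm, not copied from this module: `run_length_posterior` and `student_t_logpdf` are hypothetical names, and the function returns a list of per-step distributions instead of the library's matrix R.

```python
import math


def student_t_logpdf(x, df, loc, scale):
    """Log density of a location-scale Student-t distribution."""
    z = (x - loc) / scale
    return (
        math.lgamma((df + 1.0) / 2.0) - math.lgamma(df / 2.0)
        - 0.5 * math.log(df * math.pi) - math.log(scale)
        - (df + 1.0) / 2.0 * math.log1p(z * z / df)
    )


def run_length_posterior(data, hazard_lambda=50.0, mu=0.0, kappa=1.0,
                         alpha=1.0, beta=1.0):
    """Return a list whose entry t is the run-length distribution after t observations."""
    hazard = 1.0 / hazard_lambda
    prior = (mu, kappa, alpha, beta)
    params = [prior]      # Normal-Gamma parameters, one tuple per run length
    posterior = [1.0]     # before any data, the run length is 0 with certainty
    history = [posterior]
    for x in data:
        # Student-t posterior predictive of x under every current run length.
        pred = []
        for m, k, a, b in params:
            scale = math.sqrt(b * (k + 1.0) / (a * k))
            pred.append(math.exp(student_t_logpdf(x, 2.0 * a, m, scale)))
        # Either the run grows by one step, or a change point resets it to 0.
        growth = [p * q * (1.0 - hazard) for p, q in zip(posterior, pred)]
        change = hazard * sum(p * q for p, q in zip(posterior, pred))
        joint = [change] + growth
        total = sum(joint)
        posterior = [j / total for j in joint]
        history.append(posterior)
        # Conjugate update per run; run length 0 restarts from the prior.
        params = [prior] + [
            ((k * m + x) / (k + 1.0), k + 1.0, a + 0.5,
             b + k * (x - m) ** 2 / (2.0 * (k + 1.0)))
            for m, k, a, b in params
        ]
    return history


# Deterministic toy series: 30 samples near 0, then 30 samples near 10.
wiggle = [0.1 * ((7 * i) % 5 - 2) for i in range(60)]
series = [w if i < 30 else 10.0 + w for i, w in enumerate(wiggle)]
history = run_length_posterior(series)
final = history[-1]
map_run_length = max(range(len(final)), key=final.__getitem__)
```

On this toy series the most probable final run length should sit at about 30, the number of samples observed since the jump. Taking the argmax of each entry in `history` gives the analogue of `maxes`.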
tsseg.algorithms.bocd.detector module
- class tsseg.algorithms.bocd.detector.BOCDDetector(hazard_lambda=300, mu=0.0, kappa=1.0, alpha=1.0, beta=1.0, truncate=-40, cp_prob_threshold=0.05, min_distance=25, max_cps=None, multivariate_strategy='l2', tolerance=0, axis=0)
Bases: BaseSegmenter
Bayesian change-point detector based on offline inference.
This implementation wraps the offline Bayesian change-point detection heuristic introduced by Fearnhead (2006). It integrates out the mean and variance of each segment under a conjugate Normal-Gamma prior and selects change points by thresholding the posterior probability of a boundary.
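The thresholding step can be pictured with the sketch below, which applies `cp_prob_threshold`, `min_distance` and `max_cps` to a vector of per-sample boundary probabilities. The greedy "strongest candidate first" rule and the helper name `select_change_points` are assumptions made for illustration. The detector's actual selection logic may differ in detail.

```python
def select_change_points(cp_prob, cp_prob_threshold=0.05, min_distance=25,
                         max_cps=None):
    """Greedy peak picking (an assumed rule, not taken from the package source)."""
    # Candidates at or above the threshold, strongest first.
    candidates = sorted(
        (t for t, p in enumerate(cp_prob) if p >= cp_prob_threshold),
        key=lambda t: cp_prob[t],
        reverse=True,
    )
    selected = []
    for t in candidates:
        if max_cps is not None and len(selected) >= max_cps:
            break
        # Keep a candidate only if it is far enough from every accepted one.
        if all(abs(t - s) >= min_distance for s in selected):
            selected.append(t)
    return sorted(selected)


# Toy posterior: two clear boundaries, one near-duplicate, one weak blip.
cp_prob = [0.0] * 200
cp_prob[50] = 0.9
cp_prob[53] = 0.4    # within min_distance of index 50, so it is suppressed
cp_prob[120] = 0.6
cp_prob[170] = 0.02  # below the threshold
picked = select_change_points(cp_prob)
strongest_only = select_change_points(cp_prob, max_cps=1)
```

Raising `cp_prob_threshold` trades recall for precision, while `min_distance` prevents one boundary from being reported several times when its posterior mass is spread over neighbouring samples.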
- Parameters:
  - hazard_lambda (float) – Expected run length. Internally converted to a constant hazard probability of 1 / hazard_lambda.
  - mu (float) – Prior mean of the segment observations.
  - kappa (float) – Strength of the prior mean (normal precision).
  - alpha (float) – Prior shape of the inverse-gamma distribution over the variance.
  - beta (float) – Prior scale of the inverse-gamma distribution over the variance.
  - truncate (int) – Log probability truncation used by the dynamic programme. Values further than 10**truncate below the running total are skipped.
  - cp_prob_threshold (float) – Minimum posterior probability required to accept a change point.
  - min_distance (int) – Minimum distance (in samples) enforced between successive change points.
  - max_cps (int | None) – Optional cap on the number of change points to return. None keeps all candidates above the probability threshold.
  - multivariate_strategy (str) – Strategy for handling multivariate data:
    - "l2" – reduce to univariate via L2 norm across channels.
    - "ensembling" – run BOCD independently on each channel and aggregate the detected change points.
  - tolerance (int | float) – Tolerance for aggregating change points in the ensembling strategy. If a float in (0, 1) it is interpreted as a fraction of the signal length.
  - axis (int) – Time axis in the input array.
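The two multivariate strategies and the `tolerance` parameter can be illustrated with plain Python. The L2 reduction follows directly from the description above. The rounding of a fractional tolerance and the clustering rule in `merge_change_points` are assumptions made for this sketch, and all three helper names are hypothetical.

```python
import math


def l2_reduce(rows):
    """Collapse each multichannel sample to its L2 norm across channels."""
    return [math.sqrt(sum(v * v for v in row)) for row in rows]


def resolve_tolerance(tolerance, n_samples):
    """Treat a float in (0, 1) as a fraction of the signal length (rounding is assumed)."""
    if isinstance(tolerance, float) and 0.0 < tolerance < 1.0:
        return round(tolerance * n_samples)
    return int(tolerance)


def merge_change_points(per_channel_cps, tolerance):
    """Assumed aggregation: chain indices lying within `tolerance` of the
    previous one into a cluster and keep the integer mean of each cluster."""
    candidates = sorted(cp for cps in per_channel_cps for cp in cps)
    clusters = []
    for cp in candidates:
        if clusters and cp - clusters[-1][-1] <= tolerance:
            clusters[-1].append(cp)
        else:
            clusters.append([cp])
    return [sum(c) // len(c) for c in clusters]


# "l2": 3 time steps with 2 channels become one univariate series.
samples = [[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]]
reduced = l2_reduce(samples)

# "ensembling": per-channel detections are merged within the tolerance.
tol = resolve_tolerance(0.005, n_samples=1000)
merged = merge_change_points([[100, 250], [103, 400]], tolerance=tol)
```

The L2 reduction is cheap but can hide a change that affects direction more than magnitude. Per-channel ensembling keeps such changes visible at the cost of one inference run per channel.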
- set_fit_request(*, axis: bool | None | str = '$UNCHANGED$') -> BOCDDetector
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
  - True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
  - False: metadata is not requested and the meta-estimator will not pass it to fit.
  - None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
  - str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- set_predict_request(*, axis: bool | None | str = '$UNCHANGED$') -> BOCDDetector
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
  - True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
  - False: metadata is not requested and the meta-estimator will not pass it to predict.
  - None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
  - str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Module contents
Bayesian online change detection algorithm utilities.