tsseg.algorithms.hdp_hsmm package

HDP-HSMM — Bayesian non-parametric state detection with Gibbs sampling.

Description

HDP-HSMM (Hierarchical Dirichlet Process Hidden Semi-Markov Model) is a non-parametric Bayesian model for time series segmentation that does not require the number of states to be specified in advance. It extends the HMM by modelling arbitrary (non-geometric) state durations through explicit duration distributions.

The generative model:

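  • A Hierarchical Dirichlet Process places a prior over an unbounded set of states, approximated here by truncating at n_max_states. The concentration parameters gamma (top-level DP) and alpha (DP prior on transitions) control how readily new states are created rather than existing ones reused.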

  • Each state has Normal-Inverse-Wishart emission parameters, allowing multivariate Gaussian observations with learned mean and covariance.

  • Each state has a Poisson duration distribution whose rate carries a Gamma prior (shape dur_alpha, rate dur_beta); integrating out the rate gives a Negative-Binomial marginal over durations.

  • Inference uses blocked Gibbs sampling over n_iter iterations.
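The generative steps listed above can be sketched in a few lines of NumPy. This is an illustrative draw from a truncated (weak-limit) prior, not the package's sampler: the function name sample_hdp_hsmm is hypothetical, the covariance is fixed to the identity instead of being drawn from the NIW prior, the first state is drawn from the top-level weights rather than from a separate initial distribution, and durations are assumed to be Poisson shifted by one so that no segment is empty.

```python
import numpy as np

def sample_hdp_hsmm(n_steps, n_max_states=5, alpha=6.0, gamma=6.0,
                    dur_alpha=2.0, dur_beta=0.1, obs_dim=2, seed=0):
    """Draw one label sequence and observations from a truncated HDP-HSMM prior."""
    rng = np.random.default_rng(seed)
    L = n_max_states

    # Top-level weights: finite Dirichlet approximation to a DP with concentration gamma.
    beta = rng.dirichlet(np.full(L, gamma / L))

    # Each transition row is drawn around beta, then its self-transition is
    # removed and the row renormalised: dwell time is handled by durations.
    trans = np.empty((L, L))
    for j in range(L):
        row = rng.dirichlet(alpha * beta)
        row[j] = 0.0
        trans[j] = row / row.sum()

    # Assumption: unit-covariance Gaussian emissions with random means.
    means = rng.normal(0.0, 3.0, size=(L, obs_dim))

    # Poisson duration rates drawn from Gamma(shape=dur_alpha, rate=dur_beta).
    rates = rng.gamma(shape=dur_alpha, scale=1.0 / dur_beta, size=L)

    states = np.empty(n_steps, dtype=int)
    t, state = 0, rng.choice(L, p=beta)
    while t < n_steps:
        dur = 1 + rng.poisson(rates[state])   # assumed shift: durations start at 1
        states[t:t + dur] = state             # slicing clips the last segment
        t += dur
        state = rng.choice(L, p=trans[state])

    obs = means[states] + rng.normal(size=(n_steps, obs_dim))
    return states, obs

states, obs = sample_hdp_hsmm(300)
```

Sampling forward like this is a quick way to produce synthetic data with known labels for checking a segmenter.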

Type: state detection
Supervision: fully unsupervised
Scope: univariate and multivariate

Parameters

Name                      Type           Default  Description
------------------------  -------------  -------  --------------------------------------------------
alpha                     float          6.0      Concentration for the DP prior on transitions.
gamma                     float          6.0      Concentration for the top-level DP.
init_state_concentration  float          6.0      Concentration for the initial state distribution.
n_iter                    int            200      Number of Gibbs sampling iterations.
n_max_states              int            20       Truncation level (max states).
trunc                     int            100      Truncation level for duration distributions.
kappa0                    float          0.25     Prior strength for NIW.
nu0                       float / None   None     Degrees of freedom for NIW (default: obs_dim + 2).
prior_mean                float / array  0.0      Prior mean for emissions.
prior_scale               float / array  1.0      Scale matrix for NIW.
dur_alpha                 float          2.0      Shape for duration Gamma prior.
dur_beta                  float          0.1      Rate for duration Gamma prior.
axis                      int            0        Time axis.
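To make the roles of kappa0, nu0, prior_mean and prior_scale concrete, the following sketch applies the textbook Normal-Inverse-Wishart conjugate update to the observations assigned to one state. It is not taken from the package source: the helper name niw_posterior is hypothetical, and treating a scalar prior_mean as a constant vector and a scalar prior_scale as a multiple of the identity is an assumption about how those arguments are expanded.

```python
import numpy as np

def niw_posterior(X, kappa0=0.25, nu0=None, prior_mean=0.0, prior_scale=1.0):
    """Standard NIW posterior parameters for data X of shape (n, obs_dim)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    nu0 = d + 2 if nu0 is None else nu0              # documented default: obs_dim + 2

    # Assumption: scalars broadcast to a vector mean and an isotropic scale matrix.
    mu0 = np.broadcast_to(np.asarray(prior_mean, dtype=float), (d,))
    psi0 = np.asarray(prior_scale, dtype=float)
    if psi0.ndim == 0:
        psi0 = psi0 * np.eye(d)

    xbar = X.mean(axis=0)
    centered = X - xbar
    scatter = centered.T @ centered                   # sum of outer products about the mean

    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n        # prior mean weighted by kappa0
    diff = (xbar - mu0)[:, None]
    psi_n = psi0 + scatter + (kappa0 * n / kappa_n) * (diff @ diff.T)
    return mu_n, kappa_n, psi_n, nu_n

mu_n, kappa_n, psi_n, nu_n = niw_posterior([[0.0, 0.0], [2.0, 2.0]])
```

A small kappa0 such as the default 0.25 means the prior mean counts for a quarter of one observation, so the posterior mean follows the data closely even for short segments.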

Usage

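import numpy as np
from tsseg.algorithms import HdpHsmmDetector

# Placeholder data: 500 time points on axis 0, 2 channels.
X = np.random.default_rng(0).normal(size=(500, 2))

detector = HdpHsmmDetector(n_iter=200, n_max_states=10)
states = detector.fit_predict(X)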
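If the returned states is a one-dimensional sequence with one integer label per time point (an assumption here; check the return type of fit_predict in your installed version), it can be collapsed into segments and change points with the standard library alone. The helper labels_to_segments is hypothetical and not part of tsseg.

```python
from itertools import groupby

def labels_to_segments(labels):
    """Collapse per-sample labels into (start, end, state) runs, end exclusive."""
    segments, start = [], 0
    for state, run in groupby(labels):
        length = sum(1 for _ in run)      # number of consecutive samples in this run
        segments.append((start, start + length, state))
        start += length
    return segments

# Toy label sequence standing in for the detector output.
segments = labels_to_segments([0, 0, 0, 2, 2, 1, 1, 1, 1])
# Change points are the start indices of every segment after the first.
change_points = [start for start, _, _ in segments[1:]]
```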

Implementation: Pure NumPy/SciPy Gibbs sampler. Origin: new code. Replaces the earlier pyhsmm-backed implementation.

Reference: Johnson & Willsky (2013), Bayesian Nonparametric Hidden Semi-Markov Models, JMLR; Nagano, Nakamura, Nagai, Mochihashi, Kobayashi & Kaneko (2019), Sequence Pattern Extraction by Segmenting Time Series Data Using GP-HSMM with HDP, IEEE RA-L.

Submodules

tsseg.algorithms.hdp_hsmm.detector module

Adaptive HDP-HSMM detector implementing Gibbs sampling.

class tsseg.algorithms.hdp_hsmm.detector.HdpHsmmDetector(axis=0, alpha=6.0, gamma=6.0, init_state_concentration=6.0, n_iter=200, n_max_states=20, trunc=100, *, kappa0=0.25, nu0=None, prior_mean=0.0, prior_scale=1.0, dur_alpha=2.0, dur_beta=0.1)

Bases: BaseSegmenter

Bayesian non-parametric HDP-HSMM detector using Gibbs sampling.

This implementation faithfully reproduces the generative model of the original pyhsmm-based detector (Weak Limit HDP-HSMM with Gaussian emissions and Poisson durations).
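The link between Poisson durations, the Gamma prior parameters dur_alpha and dur_beta, and a Negative-Binomial marginal can be checked numerically. The sketch below is standard Gamma-Poisson conjugacy, not code from the package: both function names are hypothetical, and the counts are taken as unshifted (if the sampler shifts durations by one, subtract one from each observed duration first).

```python
import math

def gamma_poisson_posterior(counts, dur_alpha=2.0, dur_beta=0.1):
    """Posterior Gamma (shape, rate) over a Poisson rate after observing counts."""
    return dur_alpha + sum(counts), dur_beta + len(counts)

def marginal_duration_pmf(k, shape, rate):
    """Negative-Binomial pmf obtained by integrating the Poisson rate out."""
    p = rate / (1.0 + rate)
    log_pmf = (math.lgamma(k + shape) - math.lgamma(k + 1) - math.lgamma(shape)
               + shape * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

# Three observed segment lengths update the default prior Gamma(2.0, 0.1).
post_shape, post_rate = gamma_poisson_posterior([10, 12, 8])

# Under the prior alone, the marginal mean duration is dur_alpha / dur_beta.
pmf = [marginal_duration_pmf(k, 2.0, 0.1) for k in range(3000)]
prior_mean_duration = sum(k * p for k, p in enumerate(pmf))
```

With the defaults this prior mean is dur_alpha / dur_beta = 20 time steps, which gives a sense of the segment lengths the prior favours before any data is seen.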

Parameters:
  • axis (int) – Axis along which the time index lies.

  • alpha (float) – Concentration parameter for the Dirichlet Process prior on transitions.

  • gamma (float) – Concentration parameter for the top-level Dirichlet Process (global weights).

  • init_state_concentration (float) – Concentration parameter for the initial state distribution.

  • n_iter (int) – Number of Gibbs sampling iterations.

  • n_max_states (int) – Truncation level for the number of states.

  • trunc (int) – Truncation level for duration distributions.

  • kappa0 (float) – Prior strength for the Normal-Inverse-Wishart distribution.

  • nu0 (Optional[float]) – Degrees of freedom for the NIW prior. Defaults to obs_dim + 2.

  • prior_mean (Any) – Prior mean for the emissions.

  • prior_scale (Any) – Scale matrix for the NIW prior.

  • dur_alpha (float) – Shape parameter for the duration Gamma prior.

  • dur_beta (float) – Rate parameter for the duration Gamma prior.

get_fitted_params()

set_fit_request(*, axis: bool | None | str = '$UNCHANGED$') → HdpHsmmDetector

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

axis (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for axis parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, axis: bool | None | str = '$UNCHANGED$') → HdpHsmmDetector

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

axis (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for axis parameter in predict.

Returns:

self – The updated object.

Return type:

object

Module contents