tsseg.algorithms.hdp_hsmm package
HDP-HSMM — Bayesian non-parametric state detection with Gibbs sampling.
Description
HDP-HSMM (Hierarchical Dirichlet Process Hidden Semi-Markov Model) is a non-parametric Bayesian model for time series segmentation that does not require the number of states to be specified in advance. It extends the HMM by modelling arbitrary (non-geometric) state durations through explicit duration distributions.
The generative model:
- A Hierarchical Dirichlet Process draws an infinite discrete distribution over states (truncated at `n_max_states`). Concentration parameters `alpha` and `gamma` control state reuse.
- Each state has Gaussian emission parameters (mean and covariance) with a Normal-Inverse-Wishart prior, allowing multivariate Gaussian observations with learned mean and covariance.
- Each state has a Poisson duration distribution whose rate has a Gamma prior (shape `dur_alpha`, rate `dur_beta`); with the rate integrated out, durations follow a Negative-Binomial distribution.
Inference uses blocked Gibbs sampling over `n_iter` iterations.
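The generative model above can be sketched by drawing one sequence from the prior. This is a minimal NumPy illustration, not the package's sampler: the function name `sample_hdp_hsmm`, the identity scale matrix, the zero prior mean, the `1 + Poisson` duration shift, and the removal of self-transitions are all assumptions made for the example.

```python
import numpy as np

def sample_hdp_hsmm(T, obs_dim=2, n_max_states=5, alpha=6.0, gamma=6.0,
                    init_state_concentration=6.0, kappa0=0.25,
                    dur_alpha=2.0, dur_beta=0.1, seed=0):
    """Draw (X, states) of length T from a weak-limit HDP-HSMM prior (illustrative)."""
    rng = np.random.default_rng(seed)
    L = n_max_states

    # Weak-limit HDP: global weights, then one transition row per state.
    beta = rng.dirichlet(np.full(L, gamma / L))
    # The floor of 1e-3 is only a numerical guard against near-zero concentrations.
    trans = rng.dirichlet(np.maximum(alpha * beta, 1e-3), size=L)
    # Assumed HSMM convention: no self-transitions, durations carry persistence.
    np.fill_diagonal(trans, 0.0)
    trans /= trans.sum(axis=1, keepdims=True)
    pi0 = rng.dirichlet(np.full(L, init_state_concentration / L))

    # NIW draws: Sigma ~ Inverse-Wishart(nu0, Psi), mu ~ N(mu0, Sigma / kappa0).
    nu0 = obs_dim + 2
    psi = np.eye(obs_dim)      # assumed scale matrix
    mu0 = np.zeros(obs_dim)    # assumed prior mean
    means, covs = [], []
    for _ in range(L):
        # For integer nu0 >= obs_dim, a Wishart draw is a sum of outer products.
        g = rng.multivariate_normal(np.zeros(obs_dim), np.linalg.inv(psi), size=nu0)
        cov = np.linalg.inv(g.T @ g)
        means.append(rng.multivariate_normal(mu0, cov / kappa0))
        covs.append(cov)

    # Gamma prior (shape, rate) on each state's Poisson duration rate.
    rates = rng.gamma(shape=dur_alpha, scale=1.0 / dur_beta, size=L)

    states = np.empty(T, dtype=int)
    X = np.empty((T, obs_dim))
    t, k = 0, rng.choice(L, p=pi0)
    while t < T:
        d = 1 + rng.poisson(rates[k])          # assumed shift so that d >= 1
        end = min(t + d, T)
        states[t:end] = k
        X[t:end] = rng.multivariate_normal(means[k], covs[k], size=end - t)
        t, k = end, rng.choice(L, p=trans[k])
    return X, states

X, states = sample_hdp_hsmm(300)
```

Because durations are drawn explicitly, each state persists for a whole segment before a transition is sampled, which is the difference from a plain HMM.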
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| `alpha` | float | `6.0` | Concentration for the DP prior on transitions. |
| `gamma` | float | `6.0` | Concentration for the top-level DP. |
| `init_state_concentration` | float | `6.0` | Concentration for the initial state distribution. |
| `n_iter` | int | `200` | Number of Gibbs sampling iterations. |
| `n_max_states` | int | `20` | Truncation level (max states). |
| `trunc` | int | `100` | Truncation level for duration distributions. |
| `kappa0` | float | `0.25` | Prior strength for NIW. |
| `nu0` | float / None | `None` | Degrees of freedom for NIW (default: `obs_dim + 2`). |
| `prior_mean` | float / array | `0.0` | Prior mean for emissions. |
| `prior_scale` | float / array | `1.0` | Scale matrix for NIW. |
| `dur_alpha` | float | `2.0` | Shape for duration Gamma prior. |
| `dur_beta` | float | `0.1` | Rate for duration Gamma prior. |
| `axis` | int | `0` | Time axis. |
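To show how `kappa0`, `nu0`, `prior_mean` and `prior_scale` interact, here is a sketch of the standard Normal-Inverse-Wishart conjugate update for the observations assigned to one state. The helper `niw_posterior` is hypothetical, and the handling of scalar inputs (a scalar `prior_mean` broadcast to every dimension, a scalar `prior_scale` treated as a multiple of the identity) is an assumption about how such arguments could be interpreted, not a description of the package internals.

```python
import numpy as np

def niw_posterior(X, kappa0=0.25, nu0=None, prior_mean=0.0, prior_scale=1.0):
    """Return posterior NIW parameters (mu_n, kappa_n, nu_n, psi_n) for data X of shape (n, d)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    nu0 = d + 2 if nu0 is None else nu0           # documented default: obs_dim + 2
    mu0 = np.broadcast_to(np.asarray(prior_mean, dtype=float), (d,))
    psi0 = np.asarray(prior_scale, dtype=float)
    if psi0.ndim == 0:
        psi0 = psi0 * np.eye(d)                   # assumed: scalar means isotropic scale

    xbar = X.mean(axis=0)
    centered = X - xbar
    scatter = centered.T @ centered               # sum of squared deviations

    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    diff = (xbar - mu0)[:, None]
    psi_n = psi0 + scatter + (kappa0 * n / kappa_n) * (diff @ diff.T)
    return mu_n, kappa_n, nu_n, psi_n

mu_n, kappa_n, nu_n, psi_n = niw_posterior(np.array([[1.0, 2.0], [3.0, 4.0]]))
```

A small `kappa0` such as the default 0.25 makes the posterior mean follow the sample mean after only a few observations.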
Usage
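```python
import numpy as np
from tsseg.algorithms import HdpHsmmDetector
X = np.random.default_rng(0).normal(size=(500, 2))  # example data: (n_timepoints, n_channels), time on axis 0
detector = HdpHsmmDetector(n_iter=200, n_max_states=10)
states = detector.fit_predict(X)
```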
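If `fit_predict` returns one integer state label per time step (an assumption here, suggested by the variable name but not confirmed above), segment boundaries can be recovered from the label sequence. The helper `states_to_change_points` below is hypothetical and not part of `tsseg`.

```python
import numpy as np

def states_to_change_points(states):
    """Return the indices at which the state label differs from the previous step."""
    states = np.asarray(states)
    return np.flatnonzero(states[1:] != states[:-1]) + 1

# Toy label sequence standing in for the detector output.
change_points = states_to_change_points([0, 0, 1, 1, 1, 0])
```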
Implementation: Pure NumPy/SciPy Gibbs sampler. Origin: new code. Replaces
the earlier pyhsmm-backed implementation.
Reference: Johnson & Willsky (2013), Bayesian Nonparametric Hidden Semi-Markov Models, JMLR; Nagano, Nakamura, Nagai, Mochihashi, Kobayashi & Kaneko (2019), Sequence Pattern Extraction by Segmenting Time Series Data Using GP-HSMM with HDP, IEEE RA-L.
Submodules
tsseg.algorithms.hdp_hsmm.detector module
Adaptive HDP-HSMM detector implementing Gibbs sampling.
- class tsseg.algorithms.hdp_hsmm.detector.HdpHsmmDetector(axis=0, alpha=6.0, gamma=6.0, init_state_concentration=6.0, n_iter=200, n_max_states=20, trunc=100, *, kappa0=0.25, nu0=None, prior_mean=0.0, prior_scale=1.0, dur_alpha=2.0, dur_beta=0.1)
  Bases: BaseSegmenter
  Bayesian non-parametric HDP-HSMM detector using Gibbs sampling.
  This implementation reproduces the generative model of the original pyhsmm-based detector (weak-limit HDP-HSMM with Gaussian emissions and Poisson durations).
  - Parameters:
    - axis (int) – Axis along which the time index lies.
    - alpha (float) – Concentration parameter for the Dirichlet Process prior on transitions.
    - gamma (float) – Concentration parameter for the top-level Dirichlet Process (global weights).
    - init_state_concentration (float) – Concentration parameter for the initial state distribution.
    - n_iter (int) – Number of Gibbs sampling iterations.
    - n_max_states (int) – Truncation level for the number of states.
    - trunc (int) – Truncation level for duration distributions.
    - kappa0 (float) – Prior strength for the Normal-Inverse-Wishart distribution.
    - nu0 (Optional[float]) – Degrees of freedom for the NIW prior. Defaults to obs_dim + 2.
    - prior_mean (Any) – Prior mean for the emissions.
    - prior_scale (Any) – Scale matrix for the NIW prior.
    - dur_alpha (float) – Shape parameter for the duration Gamma prior.
    - dur_beta (float) – Rate parameter for the duration Gamma prior.
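The roles of `dur_alpha`, `dur_beta` and `trunc` can be illustrated with the Gamma-Poisson conjugate pair. The sketch below uses only the standard library; both helpers are hypothetical, and the convention that a duration `d >= 1` is modelled as `d - 1 ~ Poisson(rate)` is an assumption for the example rather than a documented detail of the detector.

```python
import math

def gamma_poisson_update(durations, dur_alpha=2.0, dur_beta=0.1):
    """Posterior (shape, rate) of the Poisson rate after observing segment durations."""
    counts = [d - 1 for d in durations]       # assumed shift: d - 1 ~ Poisson(rate)
    return dur_alpha + sum(counts), dur_beta + len(counts)

def truncated_duration_pmf(rate, trunc=100):
    """Probabilities of durations 1..trunc under the shifted Poisson, renormalised."""
    # log pmf of d - 1 under Poisson(rate); lgamma(d) equals log((d - 1)!).
    log_w = [(d - 1) * math.log(rate) - rate - math.lgamma(d)
             for d in range(1, trunc + 1)]
    m = max(log_w)                            # subtract the max for numerical stability
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]

shape, rate = gamma_poisson_update([5, 10, 15])
pmf = truncated_duration_pmf(shape / rate)    # plug in the posterior mean rate
```

With the defaults, the prior mean rate is `dur_alpha / dur_beta = 20`, so `trunc=100` leaves negligible mass beyond the truncation point; much longer expected segments would call for a larger `trunc`.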
- set_fit_request(*, axis: bool | None | str = '$UNCHANGED$') → HdpHsmmDetector
  Configure whether metadata should be requested to be passed to the fit method.
  Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
  The options for each parameter are:
  - True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
  - False: metadata is not requested and the meta-estimator will not pass it to fit.
  - None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
  - str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
  The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
  Added in version 1.3.
- set_predict_request(*, axis: bool | None | str = '$UNCHANGED$') → HdpHsmmDetector
  Configure whether metadata should be requested to be passed to the predict method.
  Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
  The options for each parameter are:
  - True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
  - False: metadata is not requested and the meta-estimator will not pass it to predict.
  - None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
  - str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
  The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
  Added in version 1.3.