tsseg.algorithms.hdp_hsmm package
HDP-HSMM — Bayesian non-parametric state detection with Gibbs sampling.
Description
HDP-HSMM (Hierarchical Dirichlet Process Hidden Semi-Markov Model) is a non-parametric Bayesian model for time series segmentation that does not require the number of states to be specified in advance. It extends the HMM by modelling arbitrary (non-geometric) state durations through explicit duration distributions.
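To make "non-geometric" concrete: in a plain HMM, a state with self-transition probability p persists for d steps with probability p^(d-1) (1 - p), which is a geometric distribution whose mode is always d = 1. The short sketch below illustrates that standard fact only; the function name `geometric_duration_pmf` and the value 0.9 are illustrative and not part of tsseg.

```python
def geometric_duration_pmf(p_stay, d):
    """P(duration = d) for an HMM state with self-transition probability p_stay."""
    return p_stay ** (d - 1) * (1.0 - p_stay)


p_stay = 0.9  # illustrative value, not a tsseg default
durations = range(1, 2001)  # the tail beyond 2000 steps is negligible here
pmf = [geometric_duration_pmf(p_stay, d) for d in durations]

# The mean dwell time of a geometric duration is 1 / (1 - p_stay).
mean_duration = sum(d * p for d, p in zip(durations, pmf))
print(round(mean_duration, 3))  # 10.0

# The probability mass decreases monotonically, so the most likely dwell time is 1.
print(pmf[0] > pmf[1] > pmf[2])  # True
```

An explicit duration distribution removes this constraint, so a state can favour dwell times well above one step.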
The generative model:
- A Hierarchical Dirichlet Process draws an infinite discrete distribution over states (truncated at `n_max_states`). Concentration parameters `alpha` and `gamma` control state reuse.
- Each state has Normal-Inverse-Wishart emission parameters, allowing multivariate Gaussian observations with learned mean and covariance.
- Each state has a Negative-Binomial duration distribution (shape `dur_alpha`, rate `dur_beta`).
- Inference uses blocked Gibbs sampling over `n_iter` iterations.
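The generative story above can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not the tsseg sampler: state weights come from a single truncated stick-breaking draw (the per-state transition rows governed by `alpha` are omitted), durations are drawn as a Gamma-Poisson mixture (which is marginally negative binomial) shifted to be at least 1, and emissions are left out. All function names and default values here are hypothetical.

```python
import math
import random


def stick_breaking(gamma, n_max_states, rng):
    """Truncated stick-breaking weights: a finite stand-in for GEM(gamma)."""
    weights, remaining = [], 1.0
    for _ in range(n_max_states - 1):
        frac = rng.betavariate(1.0, gamma)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # the last stick takes the leftover mass
    return weights


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_duration(dur_alpha, dur_beta, rng):
    """Gamma-Poisson mixture (negative binomial), shifted so durations are >= 1."""
    rate = rng.gammavariate(dur_alpha, 1.0 / dur_beta)  # gammavariate takes a scale
    return 1 + sample_poisson(rate, rng)


def sample_labels(n_samples, gamma=2.0, n_max_states=10,
                  dur_alpha=20.0, dur_beta=1.0, seed=0):
    """Draw a state-label sequence; defaults are illustrative, not tsseg's."""
    rng = random.Random(seed)
    weights = stick_breaking(gamma, n_max_states, rng)
    labels, current = [], None
    while len(labels) < n_samples:
        # No self-transitions: persistence is handled by the duration model.
        candidates = [k for k in range(n_max_states) if k != current]
        current = rng.choices(candidates, [weights[k] for k in candidates])[0]
        labels.extend([current] * sample_duration(dur_alpha, dur_beta, rng))
    return labels[:n_samples], weights


labels, weights = sample_labels(200)
print(len(labels), len(set(labels)))  # sequence length and number of states used
```

Smaller `gamma` values concentrate the stick-breaking mass on the first few states, so fewer distinct labels tend to appear. The sketch needs `n_max_states` of at least 2 because consecutive segments must differ.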
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| `alpha` | float | | Concentration for the DP prior on transitions. |
| `gamma` | float | | Concentration for the top-level DP. |
| | float | | Concentration for the initial state distribution. |
| `n_iter` | int | | Number of Gibbs sampling iterations. |
| `n_max_states` | int | | Truncation level (max states). |
| | int | | Truncation level for duration distributions. |
| | float | | Prior strength for NIW. |
| | float / None | | Degrees of freedom for NIW (default: …). |
| | float / array | | Prior mean for emissions. |
| | float / array | | Scale matrix for NIW. |
| `dur_alpha` | float | | Shape for duration Gamma prior. |
| `dur_beta` | float | | Rate for duration Gamma prior. |
| | int | | Time axis. |
Usage
from tsseg.algorithms import HdpHsmmDetector
detector = HdpHsmmDetector(n_iter=200, n_max_states=10)
states = detector.fit_predict(X)
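A common follow-up step is turning the per-sample labels into change points. The helper below is a minimal sketch that assumes `fit_predict` returns one integer state label per time step; that return shape is an assumption, and `labels_to_change_points` is a hypothetical helper, not part of tsseg.

```python
def labels_to_change_points(states):
    """Return every index t where states[t] differs from states[t - 1]."""
    states = list(states)
    return [t for t in range(1, len(states)) if states[t] != states[t - 1]]


# Hand-written labels standing in for the output of detector.fit_predict(X).
example_states = [0, 0, 0, 2, 2, 1, 1, 1]
print(labels_to_change_points(example_states))  # [3, 5]
```

Each returned index marks the first sample of a new segment, so the example above has three segments starting at 0, 3 and 5.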
Implementation: pure NumPy/SciPy Gibbs sampler. Origin: new code, replacing the earlier pyhsmm-backed implementation.
Reference: Johnson & Willsky (2013), Bayesian Nonparametric Hidden Semi-Markov Models, JMLR; Nagano, Nakamura, Nagai, Mochihashi, Kobayashi & Kaneko (2019), Sequence Pattern Extraction by Segmenting Time Series Data Using GP-HSMM with HDP, IEEE RA-L.