tsseg.algorithms.hidalgo package
Hidalgo — Heterogeneous Intrinsic Dimensionality Algorithm.
Description
Hidalgo performs Bayesian clustering by estimating the local intrinsic
dimensionality of data manifolds. It assigns each observation to one of
K_states manifolds using Gibbs sampling, a Potts-model spatial prior and
nearest-neighbour distance statistics.
The algorithm is designed for high-dimensional data and is particularly well suited to cases where different states occupy manifolds of different dimensionality.
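To make the "nearest-neighbour distance statistics" concrete: the reference paper builds on the ratio of each point's second to first nearest-neighbour distance, which is approximately Pareto-distributed with an exponent equal to the local intrinsic dimension. The sketch below is not tsseg's code. It computes a single global maximum-likelihood estimate from those ratios on hypothetical toy data, and the function names are illustrative only.

```python
import numpy as np


def nn_distance_ratios(X):
    """Return mu_i = r2 / r1, the ratio of second to first nearest-neighbour distance."""
    # Brute-force pairwise Euclidean distances; fine for a few hundred points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    # After partitioning at index 1, columns 0 and 1 hold the two smallest distances in order.
    nearest = np.partition(dist, 1, axis=1)[:, :2]
    return nearest[:, 1] / nearest[:, 0]


def two_nn_dimension(X):
    """Maximum-likelihood dimension estimate, assuming mu ~ Pareto(1, d)."""
    mu = nn_distance_ratios(X)
    return len(mu) / np.sum(np.log(mu))


rng = np.random.default_rng(0)
# Hypothetical toy data: a 2-D sheet embedded in 5-D space.
X = np.zeros((800, 5))
X[:, :2] = rng.uniform(size=(800, 2))
d_hat = two_nn_dimension(X)  # expected to be close to 2, not 5
```

Hidalgo goes further than this global estimate: it treats the ratios as a mixture of such distributions, one dimension per state, and infers the assignment of points to states jointly with the dimensions.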
K_states is required.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
|  | str / callable |  | Distance metric for sklearn |
| K_states | int |  | Number of manifolds / states. |
|  | float |  | Local homogeneity level, in \((0, 1)\). |
|  | int |  | Number of neighbours for local Z interaction. |
| n_iter | int |  | Number of Gibbs sampling iterations. |
|  | int |  | Number of random restarts. |
|  | float |  | Fraction of iterations discarded as burn-in. |
|  | bool |  | Estimate parameters with fixed allocation Z. |
|  | bool |  | Enable local Potts interaction between assignments. |
|  | bool |  | Update zeta during sampling. |
|  | int |  | Save samples every k iterations. |
|  | int |  | Random seed. |
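The burn-in fraction and the sampling interval together determine which Gibbs iterations contribute to the final estimate. The sketch below shows one plausible reading of those two descriptions. The function and argument names are hypothetical, and the exact indexing used by tsseg may differ.

```python
def kept_iterations(n_iter, burn_in, sampling_rate):
    """List the iteration indices retained after burn-in and thinning.

    Assumption: the first `burn_in * n_iter` iterations are dropped, and of
    the rest only every `sampling_rate`-th iteration is saved.
    """
    first_kept = int(burn_in * n_iter)
    return [it for it in range(first_kept, n_iter) if it % sampling_rate == 0]


# With 100 iterations, a 50% burn-in and a sample saved every 10 iterations:
kept = kept_iterations(n_iter=100, burn_in=0.5, sampling_rate=10)
```

A larger burn-in discards more of the early, not-yet-converged samples, while a larger sampling interval reduces the correlation between the saved samples at the cost of keeping fewer of them.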
Usage
from tsseg.algorithms import HidalgoDetector
detector = HidalgoDetector(K_states=3, n_iter=500)
states = detector.fit_predict(X)
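For segmentation it is often convenient to turn per-sample state labels into change points. The sketch below assumes that `fit_predict` returns a 1-D array with one integer state label per time step, which this page does not state explicitly. The helper `change_points` is hypothetical and not part of tsseg.

```python
import numpy as np


def change_points(states):
    """Return the indices at which the state label differs from the previous one."""
    states = np.asarray(states)
    return np.flatnonzero(states[1:] != states[:-1]) + 1


# Hypothetical label sequence standing in for the detector output.
labels = np.array([0, 0, 0, 2, 2, 1, 1, 1])
cps = change_points(labels)  # a new segment starts at indices 3 and 5
```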
Implementation: Adapted from aeon, with a numerical stability fix (log-domain
sample_p). BSD 3-Clause.
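As an illustration of why working in the log domain helps, the sketch below draws a categorical sample from unnormalised log-weights by subtracting the maximum before exponentiating. It is a generic version of the technique, not the actual `sample_p` from tsseg or aeon, and the function names are hypothetical.

```python
import numpy as np


def normalise_log_weights(log_w):
    """Convert unnormalised log-weights to probabilities without underflow."""
    log_w = np.asarray(log_w, dtype=float)
    shifted = log_w - np.max(log_w)  # the largest entry becomes 0, so exp() stays finite
    p = np.exp(shifted)
    return p / p.sum()


def sample_from_log_weights(log_w, rng):
    """Draw one category index with probability proportional to exp(log_w)."""
    p = normalise_log_weights(log_w)
    return int(rng.choice(len(p), p=p))


rng = np.random.default_rng(0)
log_w = [-1000.0, -1001.0, -1002.0]  # exp() of each value underflows to 0.0
probs = normalise_log_weights(log_w)
draw = sample_from_log_weights(log_w, rng)
```

Exponentiating these weights directly would give all zeros and an undefined normalisation, whereas the shifted version preserves their relative sizes.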
Reference: Allegra, Facco, Denti, Laio & Mira (2020), Data segmentation based on the local intrinsic dimension, Scientific Reports.