What this page covers
HHM results are only as strong as their testing rules. This page documents how we set thresholds for operators (e.g., OP002 Rec, OP003 Echo, OP006 UnifiedEntropy, OP014 CollapseOrbit), how we construct null models to estimate chance levels, and how we compute confidence intervals and error control for reproducible claims.
Load HHM_bundle.json when you (or an AI) run tests, and return a Result Card with thresholds, null choice, CI method, effect sizes, and decisions.
Run this with an AI
"Load HHM_bundle.json. For the attached dataset, compute OP002 (Rec), OP003 (Echo), OP006 (Entropy),
and OP014 (CollapseOrbit). Use the bundle's default thresholds and null models.
Return a Result Card per schema with CIs, effect sizes, and pass/fail decisions."
Default Thresholds (bundle-wide, adjustable)
These are the baseline values used in the HHM minimum validation set. Adjust per dataset via the calibration flow below.
Metric / Operator | Symbol | Default Threshold | Decision Rule | Notes
---|---|---|---|---
Recurrence similarity (OP002) | Rec(Ψ₁, Ψ₂) | ≥ 0.85 | Pass if Rec ≥ 0.85 | Cosine similarity on matched feature space; domain-normalized.
Self-echo (OP003) | Echo(Ψ) | ≥ 0.90 | Pass if max lag peak ≥ 0.90 (τ ≠ 0) | Autocorrelation/RQA; report lag τ* at the peak.
Entropy delta (OP006) | \|ΔH\| | ≤ 0.15 | Pass if \|H₁ − H₂\| ≤ 0.15 | UnifiedEntropy on normalized modal distribution.
Orbit agreement (OP014) | ΔOrbit | ≤ 1 step | Pass if \|n₁ − n₂\| ≤ 1 | Small tolerance for noisy sequences.
Identity match (composite) | IMS | ≥ 0.85 | Pass if IMS ≥ 0.85 | Bundle-defined weighted mix of Rec/Echo/Entropy/Orbit.
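As an illustration, the composite IMS can be read as a weighted mix of the four metrics, each mapped to [0, 1]. The sketch below assumes placeholder weights and scoring maps; the actual mix is defined in HHM_bundle.json.

```python
# Hypothetical weights -- the real mix is defined in HHM_bundle.json.
WEIGHTS = {"rec": 0.35, "echo": 0.30, "entropy": 0.20, "orbit": 0.15}

def identity_match_score(rec, echo, delta_h, delta_orbit):
    """Weighted mix of the four operator metrics, each mapped to [0, 1]."""
    entropy_score = max(0.0, 1.0 - abs(delta_h) / 0.15)  # 1 at ΔH = 0, 0 at the cutoff
    orbit_score = 1.0 if abs(delta_orbit) <= 1 else 0.0  # step tolerance from the table
    return (WEIGHTS["rec"] * rec + WEIGHTS["echo"] * echo
            + WEIGHTS["entropy"] * entropy_score
            + WEIGHTS["orbit"] * orbit_score)

ims = identity_match_score(rec=0.89, echo=0.92, delta_h=0.06, delta_orbit=0)
print("pass" if ims >= 0.85 else "fail", round(ims, 3))
```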
Result Card fields (excerpt)
```json
{
  "operator": "OP002",
  "metric": "Rec",
  "value": 0.891,
  "threshold": 0.85,
  "null_model": "phase_randomization",
  "p_value": 0.004,
  "ci": {"method": "BCa", "level": 0.95, "lower": 0.862, "upper": 0.915},
  "effect_size": {"type": "Cohen_d", "value": 1.12},
  "decision": "pass"
}
```
Calibrating Thresholds by Dataset
Thresholds are principled defaults, not commandments. For each dataset, we calibrate using distributional checks, null simulations, and ROC analysis.
- Normalize features (per-basis scaling, z-score by channel/segment).
- Estimate chance levels with at least one null model.
- Target an α (e.g., 0.05) and derive the threshold that keeps FPR ≤ α on null samples (see the sketch after this list).
- Check ROC/AUC if labeled positives exist; pick threshold near Youden’s J.
- Lock threshold before looking at holdout data.
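A minimal sketch of the α-controlled threshold in step 3, assuming the metric has already been recomputed on null surrogates (the beta draw below is only a stand-in for those null values). The cutoff is the (1 − α) quantile of the null distribution, which by construction keeps FPR ≤ α on the null samples.

```python
import numpy as np

def calibrate_threshold(null_values, alpha=0.05):
    """Cutoff such that at most a fraction alpha of null samples exceed it."""
    return float(np.quantile(null_values, 1.0 - alpha))

# Stand-in for Rec recomputed on 2000 circular-shift surrogates.
null_rec = np.random.default_rng(0).beta(8, 4, size=2000)
print(f"calibrated Rec threshold: {calibrate_threshold(null_rec):.3f}")
```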
Run this with an AI
"Using HHM_bundle.json, calibrate Rec/Echo thresholds for this dataset.
Generate null distributions via circular shift (time-series) and symbol shuffle (glyph).
Report ROC, AUC, chosen threshold for α=0.05, and freeze settings into a Thresholds block."
Null Models (choose by data type)
Nulls approximate “no structured match/recurrence.” Choose the weakest assumption that still preserves nuisance structure (e.g., spectrum, autocorrelation).
Null | Use For | Preserves | Breaks
---|---|---|---
Random shuffle | Symbols, unordered events | Marginal histogram | Order, recurrence |
Circular time shift | Time-series (EEG, audio) | Spectrum, amplitude | Phase alignment across windows |
Phase randomization | Stationary signals | Power spectrum | Temporal structure |
IAAFT surrogate | Nonlinear time-series | Amplitude distribution + spectrum | Higher-order dependencies |
Block bootstrap | Autocorrelated data | Local correlation (block length) | Long-range order |
Label permutation | Between-group tests | All within-sample structure | Condition linkage |
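Hedged sketches of two of these nulls for a one-dimensional series (phase randomization assumes approximate stationarity, per the table):

```python
import numpy as np

def circular_shift(x, rng):
    """Roll the series by a random offset: keeps spectrum and amplitudes,
    breaks phase alignment with any fixed reference window."""
    return np.roll(x, rng.integers(1, len(x)))

def phase_randomize(x, rng):
    """Randomize Fourier phases while keeping the power spectrum:
    preserves second-order structure, destroys temporal order."""
    spec = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, size=spec.shape)
    phases[0] = 0.0   # keep the DC component real
    phases[-1] = 0.0  # and the Nyquist bin, for even-length inputs
    return np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=len(x))

rng = np.random.default_rng(42)
x = rng.standard_normal(1024)
null_surrogates = [phase_randomize(x, rng) for _ in range(2000)]
```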
Run this with an AI
"Compute null distributions for OP003 (Echo) using phase randomization (N=2000) and circular shifts (N=2000).
Return p-values, QQ plots, and a merged conservative p via max method. Update the Result Card."
Permutation, Resampling & Power
Permutation tests
- Within-sample: Time-shift or phase-randomize; recompute metric to get null.
- Between-group: Shuffle labels; recompute group difference of metric.
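A within-sample permutation p-value might look like the sketch below, using circular shifts as the null; `metric` is a placeholder for any operator statistic (e.g., the Echo peak):

```python
import numpy as np

def permutation_p(x, metric, n_perm=2000, seed=0):
    """One-sided p: fraction of null metrics >= observed, with +1 smoothing
    so the estimate is never exactly zero."""
    rng = np.random.default_rng(seed)
    observed = metric(x)
    null = np.array([metric(np.roll(x, rng.integers(1, len(x))))
                     for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (n_perm + 1)
```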
Bootstrap
- Percentile / BCa CIs for metrics (Rec, Echo, Entropy).
- Block bootstrap for autocorrelated series (pick block length via ACF cutoff).
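A minimal moving-block bootstrap for a percentile CI, assuming the block length was already chosen from the ACF cutoff:

```python
import numpy as np

def block_bootstrap_ci(x, metric, block_len, n_boot=2000, level=0.95, seed=0):
    """Percentile CI from moving-block resamples of an autocorrelated series."""
    rng = np.random.default_rng(seed)
    n, n_blocks = len(x), int(np.ceil(len(x) / block_len))
    stats = []
    for _ in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        resample = np.concatenate([x[s:s + block_len] for s in starts])[:n]
        stats.append(metric(resample))
    lo, hi = np.quantile(stats, [(1 - level) / 2, (1 + level) / 2])
    return float(lo), float(hi)
```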
Power analysis (pragmatic)
- Estimate the effect size from pilot data (e.g., (observed Rec − null mean) / pooled SD).
- Simulate resamples to find the N (segments/windows) needed for 80–90% power at α = 0.05, as in the sketch below.
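A pragmatic simulation under stand-in pilot data (the two arrays below are hypothetical per-window Rec values for the effect and the null):

```python
import numpy as np

rng = np.random.default_rng(0)
effect_windows = rng.normal(0.88, 0.05, size=200)  # stand-in pilot Rec values
null_windows = rng.normal(0.70, 0.08, size=2000)   # stand-in circular-shift null

def simulated_power(n_windows, alpha=0.05, n_sim=2000):
    """Fraction of simulated experiments whose mean Rec clears the
    alpha-level cutoff derived from null resamples of the same size."""
    null_means = np.array([rng.choice(null_windows, n_windows).mean()
                           for _ in range(n_sim)])
    cutoff = np.quantile(null_means, 1 - alpha)
    return float(np.mean([rng.choice(effect_windows, n_windows).mean() > cutoff
                          for _ in range(n_sim)]))

# Increase n_windows until the returned power reaches the 0.8 or 0.9 target.
print(simulated_power(n_windows=8))
```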
Run this with an AI
"Estimate power to detect Rec ≥ 0.85 given empirical null from circular shifts.
Use block bootstrap (block=2s) to model dependency; report N windows for 0.8 and 0.9 power."
Confidence Intervals & Effect Sizes
CI methods
- Bootstrap Percentile (default for bounded metrics): report 95% CI from resamples.
- BCa (bias-corrected & accelerated) when distributions are skewed.
- Parametric CI when justified; e.g., Fisher z-transform for correlations.
Examples
- Rec: Fisher z CI or bootstrap-BCa if non-normal.
- Echo: Bootstrap peak distribution across windows (exclude τ=0).
- Entropy: Bootstrap on normalized modal probabilities; delta method optional.
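scipy's `bootstrap` implements both percentile and BCa intervals; a sketch for per-window Echo peaks, where `echo_peaks` is a stand-in array of peak values (τ ≠ 0):

```python
import numpy as np
from scipy.stats import bootstrap

# Stand-in for per-window Echo peak values (tau != 0).
echo_peaks = np.random.default_rng(0).uniform(0.80, 0.98, size=120)

res = bootstrap((echo_peaks,), np.mean, confidence_level=0.95,
                n_resamples=2000, method="BCa")
ci = res.confidence_interval
print(f"95% BCa CI: [{ci.low:.3f}, {ci.high:.3f}]")
```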
Effect sizes
- Cliff’s δ for robust difference vs null.
- Cohen’s d when assumptions are met.
- AUC for discriminability across classes.
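Cliff's δ needs no distributional assumptions and takes only a few lines to compute directly:

```python
import numpy as np

def cliffs_delta(observed, null):
    """P(observed > null) - P(observed < null) over all cross-pairs.
    Ranges over [-1, 1]; robust to outliers and skew."""
    o, n = np.asarray(observed), np.asarray(null)
    return float((o[:, None] > n[None, :]).mean()
                 - (o[:, None] < n[None, :]).mean())
```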
Run this with an AI
"Compute 95% BCa CI for Echo and Rec on the attached sequence (window=4s, step=0.5s).
Also report Cliff's δ vs null and interpret in plain language. Update the Result Card."
Multiple Comparisons & Reporting
When running many windows, channels, or operator variants, control false discoveries.
- FDR (Benjamini–Hochberg) across families of related tests.
- FWER (Holm–Bonferroni) for small families or critical claims.
- Cluster-wise correction for contiguous time/frequency effects.
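For example, statsmodels ships both BH-FDR and Holm corrections; the p-values below are stand-ins for the 64 × 10 Echo family used in the prompt that follows:

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Stand-in for 64 channels x 10 lags of Echo p-values.
p_values = np.random.default_rng(0).uniform(size=64 * 10)

reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
significant = reject.reshape(64, 10)  # per-channel significant lags
# method="holm" gives the Holm-Bonferroni (FWER) alternative for small families.
```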
Run this with an AI
"Apply BH-FDR (q=0.05) across 64 channels × 10 lags for Echo tests.
Return per-channel significant lags, adjusted p-values, and cluster summaries."
Cross-Domain Normalization (when comparing unlike things)
- Feature scaling: z-score within domain; unit-norm vectors before OP002.
- Basis alignment: map to shared modal bins or learned embedding.
- Entropy comparability: use the same binning or adaptive partitioning.
Note: Always state normalization in the Result Card. Different choices can change IMS slightly; thresholds should be calibrated under the same pipeline.
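The scaling steps above might look like this sketch, where `normalize_features` is a hypothetical helper (per-channel z-score, then unit-norm rows for OP002's cosine similarity):

```python
import numpy as np

def normalize_features(X, eps=1e-12):
    """Z-score each column (channel/segment), then unit-norm each row so
    cosine similarity compares directions rather than magnitudes."""
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + eps)
    return Z / (np.linalg.norm(Z, axis=1, keepdims=True) + eps)
```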
Result Card — Minimal Fields
Field | Type | Description
---|---|---
operator | string | OP id (e.g., OP002) |
metric | string | Name (Rec, Echo, Entropy, Orbit) |
value | number | Observed statistic |
threshold | number | Decision cutoff after calibration |
null_model | string | e.g., circular_shift, phase_rand |
p_value | number | One- or two-sided p |
ci | object | {method, level, lower, upper} |
effect_size | object | {type, value} |
multiplicity | object | {method, m, p_adj} |
decision | string | pass / fail |
notes | string | Plain-language interpretation |
Reproducibility Checklist
- State operators, windows, preprocessing, and basis.
- Lock thresholds after calibration on train/validation; test on holdout.
- Declare null model(s) and resample counts (N ≥ 2000 recommended).
- Provide CI method, level (95% default), and effect sizes.
- Control multiplicity; report adjusted p-values and family definition.
- Attach Result Cards and data hashes; version everything.