What this page covers
HHM results are only as strong as their testing rules. This page documents how we set thresholds for operators (e.g., OP002 Rec, OP003 Echo, OP006 UnifiedEntropy, OP014 CollapseOrbit), how we construct null models to estimate chance levels, and how we compute confidence intervals and error control for reproducible claims.

Use HHM_bundle.json when you (or an AI) run tests. Return a Result Card with thresholds, null choice, CI method, effect sizes, and decisions.
Run this with an AI
"Load HHM_bundle.json. For the attached dataset, compute OP002 (Rec), OP003 (Echo), OP006 (Entropy),
and OP014 (CollapseOrbit). Use the bundle's default thresholds and null models.
Return a Result Card per schema with CIs, effect sizes, and pass/fail decisions."
Default Thresholds (bundle-wide, adjustable)
These are the baseline values used in the HHM minimum validation set. Adjust per dataset via the calibration flow below.

| Metric / Operator | Symbol | Default Threshold | Decision Rule | Notes |
|---|---|---|---|---|
| Recurrence similarity (OP002) | Rec(Ψ₁,Ψ₂) | ≥ 0.85 | Pass if Rec ≥ 0.85 | Cosine similarity on matched feature space; domain-normalized. |
| Self-echo (OP003) | Echo(Ψ) | ≥ 0.90 | Pass if max lag peak ≥ 0.90 (τ ≠ 0) | Autocorr/RQA; report lag τ* at peak. |
| Entropy delta (OP006) | \|ΔH\| | ≤ 0.15 | Pass if \|H₁ − H₂\| ≤ 0.15 | UnifiedEntropy on normalized modal distribution. |
| Orbit agreement (OP014) | ΔOrbit | ≤ 1 step | Pass if \|n₁ − n₂\| ≤ 1 | Small tolerance for noisy sequences. |
| Identity match (composite) | IMS | ≥ 0.85 | Pass if IMS ≥ 0.85 | Bundle-defined weighted mix of Rec/Echo/Entropy/Orbit. |
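To make the OP003 decision rule concrete, here is one simple reading in NumPy: find the normalized autocorrelation peak away from τ = 0 and compare it to 0.90. This is a sketch only; the bundle's actual Echo estimator (autocorrelation vs. RQA) may differ.

```python
import numpy as np

def echo_peak(x):
    # Normalized autocorrelation; return the peak value and its lag,
    # excluding tau = 0 (which is trivially 1.0).
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    ac = ac / ac[0]
    tau_star = 1 + np.argmax(ac[1:])
    return ac[tau_star], tau_star

# peak, tau = echo_peak(signal)  # pass if peak >= 0.90 at tau != 0
```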
Result Card fields (excerpt)
```json
{
  "operator": "OP002",
  "metric": "Rec",
  "value": 0.891,
  "threshold": 0.85,
  "null_model": "phase_randomization",
  "p_value": 0.004,
  "ci": {"method":"BCa","level":0.95,"lower":0.862,"upper":0.915},
  "effect_size": {"type":"Cohen_d","value":1.12},
  "decision": "pass"
}
```
Calibrating Thresholds by Dataset
Thresholds are principled defaults, not commandments. For each dataset, we calibrate using distributional checks, null simulations, and ROC analysis (a minimal sketch follows the list).
- Normalize features (per-basis scaling, z-score by channel/segment).
- Estimate chance levels with at least one null model.
- Set a target α (e.g., 0.05) and derive the threshold that keeps FPR ≤ α on null samples.
- Check ROC/AUC if labeled positives exist; pick the threshold near Youden's J.
- Lock the threshold before looking at holdout data.
 
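A minimal sketch of the null-quantile and Youden's J steps in Python. The `null_scores` and `pos_scores` arrays are placeholders standing in for real operator scores, not bundle outputs:

```python
import numpy as np
from sklearn.metrics import roc_curve  # only needed when labeled positives exist

rng = np.random.default_rng(0)
null_scores = rng.normal(0.55, 0.10, 2000)  # placeholder: metric on null surrogates
pos_scores = rng.normal(0.90, 0.05, 200)    # placeholder: metric on labeled positives

# Rule 1: keep FPR <= alpha by thresholding at the (1 - alpha) null quantile.
alpha = 0.05
thr_fpr = np.quantile(null_scores, 1 - alpha)

# Rule 2: with labels, compare against the threshold at Youden's J = max(TPR - FPR).
scores = np.concatenate([null_scores, pos_scores])
labels = np.concatenate([np.zeros(len(null_scores)), np.ones(len(pos_scores))])
fpr, tpr, thresholds = roc_curve(labels, scores)
thr_youden = thresholds[np.argmax(tpr - fpr)]

# Lock the chosen threshold before evaluating holdout data.
```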
Run this with an AI
"Using HHM_bundle.json, calibrate Rec/Echo thresholds for this dataset.
Generate null distributions via circular shift (time-series) and symbol shuffle (glyph).
Report ROC, AUC, chosen threshold for α=0.05, and freeze settings into a Thresholds block."
Null Models (choose by data type)
Nulls approximate “no structured match/recurrence.” Choose the null with the weakest assumptions that still preserves the nuisance structure you need to control for (e.g., spectrum, autocorrelation).
| Null | Use For | Preserves | Breaks | 
|---|---|---|---|
| Random shuffle | Symbols, unordered events | Marginal histogram | Order, recurrence | 
| Circular time shift | Time-series (EEG, audio) | Spectrum, amplitude | Phase alignment across windows | 
| Phase randomization | Stationary signals | Power spectrum | Temporal structure | 
| IAAFT surrogate | Nonlinear time-series | Amplitude distribution + spectrum | Higher-order dependencies | 
| Block bootstrap | Autocorrelated data | Local correlation (block length) | Long-range order | 
| Label permutation | Between-group tests | All within-sample structure | Condition linkage | 
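For the two time-series nulls used most often here, a NumPy sketch (the toy signal is illustrative only):

```python
import numpy as np

def circular_shift_null(x, rng):
    # Rotate the series by a random offset: preserves spectrum and amplitude,
    # breaks phase alignment across windows.
    return np.roll(x, rng.integers(1, len(x)))

def phase_randomized_null(x, rng):
    # Keep the power spectrum, randomize Fourier phases: destroys temporal
    # structure; appropriate for roughly stationary signals.
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, spec.shape)
    phases[0] = 0.0  # keep the DC bin real
    return np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=len(x))

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0.0, 20.0 * np.pi, 1024))  # toy signal for illustration
surrogates = [phase_randomized_null(x, rng) for _ in range(2000)]
```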
Run this with an AI
"Compute null distributions for OP003 (Echo) using phase randomization (N=2000) and circular shifts (N=2000).
Return p-values, QQ plots, and a merged conservative p via max method. Update the Result Card."
Permutation, Resampling & Power
Permutation tests
- Within-sample: Time-shift or phase-randomize; recompute metric to get null.
- Between-group: Shuffle labels; recompute group difference of metric.
 
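A generic within-sample recipe, assuming `metric` is a callable computing the operator statistic and `make_null` is a surrogate generator such as the ones sketched under Null Models (both names are placeholders):

```python
import numpy as np

def permutation_p(x, metric, make_null, n_perm=2000, rng=None):
    # One-sided p-value: fraction of null surrogates whose statistic matches
    # or beats the observed one; +1 smoothing avoids reporting p = 0.
    rng = rng if rng is not None else np.random.default_rng(0)
    observed = metric(x)
    null_stats = np.array([metric(make_null(x, rng)) for _ in range(n_perm)])
    return (1 + np.sum(null_stats >= observed)) / (1 + n_perm)
```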
Bootstrap
- Percentile / BCa CIs for metrics (Rec, Echo, Entropy).
- Block bootstrap for autocorrelated series (pick block length via ACF cutoff).
 
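A minimal moving-block resampler; `block_len` (in samples) should come from the ACF cutoff mentioned above:

```python
import numpy as np

def block_bootstrap_resample(x, block_len, rng):
    # Stitch random contiguous blocks so each resample keeps local
    # autocorrelation up to roughly block_len samples.
    n_blocks = int(np.ceil(len(x) / block_len))
    starts = rng.integers(0, len(x) - block_len + 1, n_blocks)
    return np.concatenate([x[s:s + block_len] for s in starts])[:len(x)]
```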
Power analysis (pragmatic)
- Estimate effect size from a pilot (e.g., ΔRec vs null mean / pooled SD).
- Simulate resamples to find N (segments/windows) for 80–90% power at α = 0.05.
 
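One crude version of the simulation step: resample pilot window scores at candidate sizes and count how often the mean clears the calibrated threshold. This i.i.d. resample ignores dependence; substitute the block bootstrap above for correlated windows. All names are illustrative.

```python
import numpy as np

def empirical_power(pilot_scores, threshold, n_windows, n_sim=1000, rng=None):
    # Fraction of simulated experiments whose mean metric clears the
    # threshold: an empirical power estimate from pilot data.
    rng = rng if rng is not None else np.random.default_rng(0)
    sims = rng.choice(pilot_scores, size=(n_sim, n_windows), replace=True)
    return float(np.mean(sims.mean(axis=1) >= threshold))

# Scan n_windows upward until estimated power reaches 0.8-0.9.
```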
Run this with an AI
"Estimate power to detect Rec ≥ 0.85 given empirical null from circular shifts.
Use block bootstrap (block=2s) to model dependency; report N windows for 0.8 and 0.9 power."
Confidence Intervals & Effect Sizes
CI methods
- Bootstrap percentile (default for bounded metrics): report 95% CI from resamples.
- BCa (bias-corrected & accelerated) when distributions are skewed.
- Parametric CI when justified; e.g., Fisher z-transform for correlations.
 
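SciPy's `scipy.stats.bootstrap` implements both percentile and BCa; a sketch for a mean-Rec CI over windows (the scores are placeholders):

```python
import numpy as np
from scipy.stats import bootstrap

rng = np.random.default_rng(0)
rec_values = rng.normal(0.88, 0.03, 200)  # placeholder per-window Rec scores

# BCa corrects for bias and skew in the bootstrap distribution.
res = bootstrap((rec_values,), np.mean, confidence_level=0.95,
                n_resamples=2000, method='BCa')
ci_low, ci_high = res.confidence_interval.low, res.confidence_interval.high
```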
Examples
- Rec: Fisher z CI or bootstrap-BCa if non-normal.
- Echo: Bootstrap peak distribution across windows (exclude τ=0).
- Entropy: Bootstrap on normalized modal probabilities; delta method optional.
 
Effect sizes
- Cliff’s δ for robust difference vs null.
- Cohen’s d when assumptions are met.
- AUC for discriminability across classes.
 
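Cliff's δ has a direct pairwise form, P(a > b) − P(a < b); a small NumPy sketch (O(n·m) memory, fine at typical window counts):

```python
import numpy as np

def cliffs_delta(a, b):
    # P(a > b) - P(a < b) over all cross-pairs; ranges from -1 to 1 and
    # needs no distributional assumptions.
    diff = np.asarray(a)[:, None] - np.asarray(b)[None, :]
    return float((diff > 0).mean() - (diff < 0).mean())
```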
Run this with an AI
"Compute 95% BCa CI for Echo and Rec on the attached sequence (window=4s, step=0.5s).
Also report Cliff's δ vs null and interpret in plain language. Update the Result Card."
Multiple Comparisons & Reporting
When running many windows, channels, or operator variants, control false discoveries.
- FDR (Benjamini–Hochberg) across families of related tests.
- FWER (Holm–Bonferroni) for small families or critical claims.
- Cluster-wise correction for contiguous time/frequency effects.
 
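BH-FDR is a single call in statsmodels; here `pvals` stands in for the flattened family of tests (e.g., 64 channels × 10 lags → 640 tests):

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
pvals = rng.uniform(size=640)  # placeholder p-values: 64 channels x 10 lags

reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method='fdr_bh')
significant = np.flatnonzero(reject)  # tests surviving BH-FDR at q = 0.05
```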
Run this with an AI
"Apply BH-FDR (q=0.05) across 64 channels × 10 lags for Echo tests.
Return per-channel significant lags, adjusted p-values, and cluster summaries."
Cross-Domain Normalization (when comparing unlike things)
- Feature scaling: z-score within domain; unit-norm vectors before OP002.
- Basis alignment: map to shared modal bins or learned embedding.
- Entropy comparability: use the same binning or adaptive partitioning.
 
Note: Always state normalization in the Result Card. Different choices can change IMS slightly; thresholds should be calibrated under the same pipeline.
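The scaling steps are one-liners; basis alignment and binning are pipeline-specific and should be frozen alongside the thresholds:

```python
import numpy as np

def znorm(x):
    # Z-score within a domain so scales are comparable across datasets.
    return (x - x.mean()) / x.std()

def unit_norm(v):
    # Unit-length feature vectors before cosine-based Rec (OP002).
    return v / np.linalg.norm(v)
```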
Result Card — Minimal Fields
| Field | Type | Description |
|---|---|---|
| operator | string | OP id (e.g., OP002) |
| metric | string | Name (Rec, Echo, Entropy, Orbit) |
| value | number | Observed statistic |
| threshold | number | Decision cutoff after calibration |
| null_model | string | e.g., circular_shift, phase_rand |
| p_value | number | One- or two-sided p |
| ci | object | {method, level, lower, upper} |
| effect_size | object | {type, value} |
| multiplicity | object | {method, m, p_adj} |
| decision | string | pass / fail |
| notes | string | Plain-language interpretation |
Reproducibility Checklist
- State operators, windows, preprocessing, and basis.
- Lock thresholds after calibration on train/validation; test on holdout.
- Declare null model(s) and resample counts (N ≥ 2000 recommended).
- Provide CI method, level (95% default), and effect sizes.
- Control multiplicity; report adjusted p-values and family definition.
- Attach Result Cards and data hashes; version everything.