The methodology separates runtime evidence, cohort assumptions, reference-human priors, confidence evolution, reproducibility, limitations, and clinical interpretation.
Validate freshness, source provenance, repeatability, missingness, and whether signals are suitable for longitudinal interpretation.
Test whether cohort assignment and reference-human priors are appropriate for initialization and bounded comparison.
Review whether Twin confidence, state transitions, and trajectory updates behave consistently as evidence accumulates.
Evaluate drift detection, transition logic, confidence gates, and whether uncertainty remains visible to operators.
Compare recommended execution posture with observed practice outcomes and longitudinal recalibration behavior.
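As an illustration of the first validation step above, the freshness and missingness checks can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the methodology's actual implementation: the SignalSample record, the evidence_quality function, and the 7-day freshness window are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence


@dataclass
class SignalSample:
    # Hypothetical record shape; the field names are illustrative only.
    observed_at: datetime
    value: Optional[float]  # None marks a missing reading


def evidence_quality(
    samples: Sequence[SignalSample],
    now: datetime,
    max_age: timedelta = timedelta(days=7),  # assumed freshness window
) -> dict:
    """Summarize freshness and missingness for one signal."""
    usable = [s for s in samples if s.value is not None]
    if not usable:
        # No usable readings: treat the signal as stale and fully missing.
        return {"fresh": False, "missing_fraction": 1.0, "usable_count": 0}
    newest = max(s.observed_at for s in usable)
    return {
        "fresh": (now - newest) <= max_age,
        "missing_fraction": 1 - len(usable) / len(samples),
        "usable_count": len(usable),
    }


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
samples = [
    SignalSample(now - timedelta(days=1), 1.0),
    SignalSample(now - timedelta(days=2), None),
    SignalSample(now - timedelta(days=3), 3.0),
    SignalSample(now - timedelta(days=20), 2.0),
]
print(evidence_quality(samples, now))
```

Reporting the missing fraction and usable count alongside the freshness flag, rather than a single pass/fail value, keeps the reasons for a weak signal visible to whoever reviews it.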
Test whether confidence changes in proportion to evidence maturity, signal freshness, and recalibration continuity.
Review whether cohort priors and reference envelopes are appropriate, bounded, and transparent.
Check whether gameplay-derived behavioral signals remain repeatable, interpretable, and useful for runtime modeling.
Keep infrastructure interpretation separate from diagnosis, treatment, prescribing, or autonomous clinical claims.
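One way to make "confidence in proportion to evidence maturity and signal freshness" testable is to write the expected behavior down as a small function and check its properties. The sketch below is an assumption for illustration only: the source does not specify a formula, and the twin_confidence name, the saturating maturity term, the half-life decay, and every default parameter are hypothetical.

```python
import math


def twin_confidence(
    evidence_count: int,
    days_since_last_signal: float,
    *,
    saturation: float = 20.0,      # assumed: evidence count scale for maturity
    half_life_days: float = 14.0,  # assumed: freshness halves every 14 days
    ceiling: float = 0.95,         # assumed: confidence never reaches certainty
) -> float:
    """Bounded confidence that rises with evidence and decays with staleness."""
    if evidence_count < 0 or days_since_last_signal < 0:
        raise ValueError("inputs must be non-negative")
    # Maturity saturates toward 1 as evidence accumulates.
    maturity = 1.0 - math.exp(-evidence_count / saturation)
    # Freshness decays exponentially with time since the last signal.
    freshness = 0.5 ** (days_since_last_signal / half_life_days)
    return ceiling * maturity * freshness


for count, age in [(0, 0), (10, 0), (10, 14), (50, 0)]:
    print(count, age, round(twin_confidence(count, age), 3))
```

A review can then assert the properties that matter here: confidence starts at zero with no evidence, grows monotonically as evidence accumulates, shrinks as signals age, and stays below a stated ceiling.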
Evaluate change over time rather than isolated snapshots.
Document cohort, reference-human, and calibration assumptions.
Expose uncertainty, maturity, and evidence strength.
Preserve runtime events and model transitions for review.
Explicitly identify what the model cannot infer.
Do not convert computational signals into clinical claims without validation.
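The principles of preserving model transitions and exposing evidence strength can be illustrated with an append-only log. This is a hedged sketch, not a description of the actual system: TransitionEvent, TransitionLog, the state names, and the evidence identifiers are all hypothetical, and a real deployment would write to durable storage rather than an in-memory list.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TransitionEvent:
    # Hypothetical event shape; every field name is illustrative.
    sequence: int
    from_state: str
    to_state: str
    confidence: float     # recorded with the transition so uncertainty stays visible
    evidence_ids: tuple   # which evidence items supported the transition


class TransitionLog:
    """Append-only, in-memory log of model transitions serialized as JSON lines."""

    def __init__(self) -> None:
        self._lines: list[str] = []

    def append(self, from_state, to_state, confidence, evidence_ids) -> TransitionEvent:
        event = TransitionEvent(
            len(self._lines), from_state, to_state, confidence, tuple(evidence_ids)
        )
        # Serialize immediately so later mutation cannot alter the record.
        self._lines.append(json.dumps(asdict(event), sort_keys=True))
        return event

    def replay(self) -> list[dict]:
        """Return every recorded transition, in order, for review."""
        return [json.loads(line) for line in self._lines]


log = TransitionLog()
log.append("initializing", "tracking", 0.42, ["e1", "e2"])
log.append("tracking", "recalibrating", 0.37, ["e3"])
for record in log.replay():
    print(record)
```

Storing the confidence value and supporting evidence with each transition lets a reviewer reconstruct why the model moved between states and how certain it was at the time.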