Kinetica AI · Sleep Quality Predictor · Research

Sleep Quality Predictor
Nocturnal recovery as clinical signal.

An independent predictor that targets sleep quality as its own clinical output. Built on the same physiological pipeline as the ANS Predictor, but asking a different clinical question: can nocturnal autonomic recovery predict next-day fatigue burden in a longitudinal N=1 study?

47 paired nights
27 high-fatigue · 20 low-fatigue days
Algorithm: Logistic Regression (L2)
AUC LOO-CV: 0.74 · CI₉₅ [0.57, 0.88]
93% sensitivity · Youden-J threshold
243 nights monitored
47 training pairs
2 features selected
0.74 AUC LOO-CV
0.57–0.88 CI₉₅
93% sensitivity

Clinical context

Sleep quality as an independent clinical signal

In patients with complex chronic conditions — post-Lyme, ME/CFS, post-viral syndromes — sleep is not merely a background variable. Nocturnal autonomic recovery directly shapes next-day symptom burden. Poor sleep continuity and dysregulated overnight HRV are not just correlates of fatigue; they are part of the physiological pathway.

This relationship is independently documented in published literature on post-infectious syndromes and wearable physiological monitoring (see Scientific basis).

This predictor was built to isolate that pathway. It uses nocturnal RMSSD — measured from raw RR intervals during sleep — as the primary physiological window, and predicts whether the following diary day will be a high-fatigue day (fatigue ≥ 6/10, self-rated 0–10 scale).

It is not a subcomponent of the ANS Predictor. It is a separate model, with its own feature selection, its own validation, and its own clinical interpretation. Both predictors draw from the same cleaned physiological foundation, but ask different questions.

Methodology

N-of-1 longitudinal design — single autonomic target

A single post-Lyme patient wears a Polar Grit X2 every night. Raw RR intervals are processed into nocturnal HRV features across multiple lag windows. Each morning, fatigue severity is recorded in a standardized diary.
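As a minimal sketch of the core feature, RMSSD is the root mean square of successive differences between consecutive RR intervals. The example below uses only the standard library and made-up RR values; it omits the artefact cleaning and sleep-window detection that a real pipeline would need, and it is not the project's actual code.

```python
import math

def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Differences between each RR interval and the one before it.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Made-up RR intervals in milliseconds, for illustration only.
rr_example = [800, 810, 790, 805, 795]
rmssd_example = rmssd(rr_example)
```

Larger beat-to-beat differences raise RMSSD, which is why it is read as a marker of parasympathetic activity.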

Feature selection: Greedy forward selection from 20 candidate features (5 HRV variables × 4 lag windows: t0 to t3). Selection stops when AUC gain drops below 0.01. The model independently selected its own optimal feature set — it was not given the ANS Predictor's features.
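The greedy search described above can be sketched generically. The function below takes any scoring callback (for this predictor it would be the LOO-CV AUC of a model fitted on the trial feature set) and stops when the best available gain falls below 0.01. The starting baseline of 0.5 (chance-level AUC), the toy scoring function, and its gain values are assumptions made for illustration; only the hrv_rmssd_night_t0 and hrv_rmssd_night_t1 names come from the text.

```python
def forward_select(candidates, score, min_gain=0.01, baseline=0.5):
    """Greedy forward selection: add the feature with the best score gain per step."""
    selected = []
    best = baseline  # assumed starting point: chance-level AUC
    remaining = list(candidates)
    while remaining:
        # Score every remaining feature when added to the current set.
        trial = {f: score(selected + [f]) for f in remaining}
        top = max(trial, key=trial.get)
        if trial[top] - best < min_gain:
            break  # no candidate improves the score enough
        selected.append(top)
        best = trial[top]
        remaining.remove(top)
    return selected, best

# Toy scoring function with invented gains, standing in for LOO-CV AUC.
TOY_GAINS = {
    "hrv_rmssd_night_t0": 0.15,
    "hrv_rmssd_night_t1": 0.03,
    "hrv_rmssd_night_t2": 0.004,
}

def toy_score(features):
    return 0.5 + sum(TOY_GAINS.get(f, 0.0) for f in features)

toy_candidates = [f"hrv_rmssd_night_t{k}" for k in range(4)]
toy_selected, toy_best = forward_select(toy_candidates, toy_score)
```

In this toy run the third feature adds only 0.004, so the search stops at two features, mirroring the stopping rule rather than the real data.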

Validation: Leave-one-out cross-validation (LOO-CV) for primary AUC. 1,000× bootstrap for confidence intervals. There is no separate test set: every paired diary/physiology entry is held out for validation exactly once and used for training in the other 46 folds.

Algorithm: Logistic regression with L2 regularisation (C=0.5), class-balanced weighting to compensate for the 27/20 positive/negative split, StandardScaler re-fit inside each LOO fold to prevent data leakage.
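One way to reproduce this setup in scikit-learn is to wrap the scaler and the classifier in a pipeline, so that the scaler is re-fitted on the training portion of every fold. The sketch below uses the hyperparameters stated above on synthetic data; the data, the variable names, and the choice of cross_val_predict are assumptions, not the project's code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def loo_probabilities(X, y):
    """Held-out probability of the positive class for every sample (LOO-CV)."""
    model = make_pipeline(
        StandardScaler(),  # re-fitted inside each fold, so no leakage
        LogisticRegression(C=0.5, class_weight="balanced", solver="lbfgs"),
    )
    proba = cross_val_predict(model, X, y, cv=LeaveOneOut(), method="predict_proba")
    return proba[:, 1]

# Synthetic stand-in data: 27 high-fatigue and 20 low-fatigue days, two features.
rng = np.random.default_rng(0)
y_demo = np.array([1] * 27 + [0] * 20)
X_demo = rng.normal(size=(47, 2))
X_demo[y_demo == 1] -= 1.5  # high-fatigue days get lower feature values

probs_demo = loo_probabilities(X_demo, y_demo)
auc_demo = roc_auc_score(y_demo, probs_demo)
```

Scaling inside the pipeline matters because a scaler fitted on all 47 rows would let the held-out sample influence its own preprocessing.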

Results

Model performance

Trained on 47 diary entries paired with nocturnal HRV from 243 monitored nights.

KEY FINDING

Nocturnal RMSSD from the same night (t0) and the previous night (t1) are the only features selected. Forward selection stopped at two features — no additional lag provided ≥ 0.01 AUC gain. Both coefficients are negative, confirming that higher overnight RMSSD is associated with lower next-day fatigue probability. The result is physiologically coherent: stronger parasympathetic recovery during sleep predicts better next-day autonomic regulation.

N-of-1 study: results are subject-specific and cannot be generalised to other individuals. LOO-CV provides a nearly unbiased AUC estimate within the available data but does not constitute prospective validation. Confidence intervals are wide (≈0.31 range, from 0.57 to 0.88) due to the small sample (47 paired days). The Youden-J threshold produces high sensitivity (93%) at the cost of lower specificity (55%), a trade-off that favours catching high-fatigue days over avoiding false alarms.

Selected features

Two features, two lags — minimal and interpretable

The model selected exactly two features, both derived from a single HRV variable (RMSSD). Both measure the same physiological phenomenon, parasympathetic nocturnal recovery, at consecutive lag windows. It is the smallest feature set at which forward selection stopped, and it reaches AUC 0.74.

hrv_rmssd_night_t0
Coefficient: −0.8228
Nocturnal RMSSD from the same night as the diary entry. Strongest predictor — higher overnight parasympathetic activity strongly reduces predicted fatigue probability.
hrv_rmssd_night_t1
Coefficient: −0.2860
Nocturnal RMSSD from the previous night (lag 1). Weaker but additive effect — persistent autonomic recovery across two consecutive nights lowers fatigue probability further.

Physiological coherence check: both coefficients are negative, meaning higher RMSSD predicts lower fatigue. This is expected: RMSSD reflects parasympathetic dominance, and higher overnight parasympathetic activity indicates better recovery. The agreement between coefficient sign and physiology makes a purely statistical artefact less likely, although with n=47 it cannot be ruled out.
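Because the coefficients apply to standardised features, each one can be read as an odds ratio per standard deviation of RMSSD. The sketch below converts the two reported coefficients and shows how a probability would be computed. The intercept is not reported in the text, so it is left as a required, hypothetical argument.

```python
import math

# Coefficients as reported above (standardised features).
COEFFICIENTS = {
    "hrv_rmssd_night_t0": -0.8228,
    "hrv_rmssd_night_t1": -0.2860,
}

# Odds ratio per +1 SD of each feature: exp(coefficient).
odds_ratios = {name: math.exp(beta) for name, beta in COEFFICIENTS.items()}

def fatigue_probability(z_t0, z_t1, intercept):
    """Logistic probability of a high-fatigue day from z-scored RMSSD values.

    The intercept is NOT given in the source; callers must supply their own.
    """
    logit = (intercept
             + COEFFICIENTS["hrv_rmssd_night_t0"] * z_t0
             + COEFFICIENTS["hrv_rmssd_night_t1"] * z_t1)
    return 1.0 / (1.0 + math.exp(-logit))
```

The odds ratios come out at roughly 0.44 for t0 and 0.75 for t1, meaning that one standard deviation more RMSSD on the same night cuts the odds of a high-fatigue day by a little more than half, all else equal.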

ROC curve

How well does the model separate high-fatigue from low-fatigue days?

A ROC curve shows how well the model separates high-fatigue days from low-fatigue ones across every possible decision threshold. The top-left corner is perfect; the diagonal is a coin flip. The dot marks the operating point chosen for clinical use (Youden's J: the best trade-off between catching true cases and avoiding false alarms).

Technical: Curve drawn from leave-one-out cross-validation predicted probabilities. Each vertex represents one threshold — the step-function shape is the statistically correct representation for n=47; smooth interpolation would misrepresent the data. AUC and 95% CI from 1,000 bootstrap resamples (seed = 42).
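The step-shaped curve and the Youden operating point can be sketched in plain Python: each distinct predicted probability is tried as a threshold, and the threshold with the largest sensitivity minus false-positive rate is kept. The labels and scores below are invented for illustration and are not the study's predictions.

```python
def roc_points(y_true, scores):
    """(threshold, fpr, tpr) for each distinct score; positive when score >= threshold."""
    pos = sum(y_true)            # labels are assumed to be 0 or 1
    neg = len(y_true) - pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((thr, fp / neg, tp / pos))
    return points

def youden_threshold(y_true, scores):
    """Threshold maximising J = sensitivity + specificity - 1 = tpr - fpr."""
    thr, fpr, tpr = max(roc_points(y_true, scores), key=lambda p: p[2] - p[1])
    return thr

# Invented example: 4 high-fatigue (1) and 4 low-fatigue (0) days.
roc_labels = [1, 1, 0, 1, 1, 0, 0, 0]
roc_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
roc_best_threshold = youden_threshold(roc_labels, roc_scores)
```

With n samples there are at most n distinct thresholds, which is why the curve is a staircase and not a smooth line.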

Forest plot

AUC with 95% confidence interval

The bar shows the model's discriminative performance (AUC) at predicting next-day fatigue, with its uncertainty range (95% confidence interval) drawn as a horizontal line. An AUC of 1.0 means perfect prediction; 0.5 is random chance.

Technical: AUC = area under the ROC curve from LOO-CV. Confidence interval from 1,000 bootstrap resamples (seed = 42). Reference lines at 0.5 (random) and 0.7 (a commonly cited lower bound for acceptable discrimination). The wide CI reflects the small sample size (n=47); a larger dataset would be expected to narrow the interval.
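A percentile bootstrap for the AUC can be sketched as follows. The sketch assumes that the resampling is applied to the held-out LOO-CV probabilities (the text does not say whether the model is refitted per resample) and that the interval uses the percentile method. The labels and scores are invented.

```python
import random

def auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, scores, n_boot=1000, seed=42, alpha=0.05):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined without both classes; redraw
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Invented example data, not the study's predictions.
boot_labels = [1, 1, 0, 1, 1, 0, 0, 0]
boot_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
boot_point = auc(boot_labels, boot_scores)
boot_ci = bootstrap_auc_ci(boot_labels, boot_scores)
```

Fixing the seed makes the interval reproducible, but it does not make it narrower; only more paired days do that.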

Confusion matrix

Where exactly does the model make mistakes?

A confusion matrix shows exactly where the model makes mistakes. The table reports: how many truly high-fatigue days were correctly flagged (True Positives), how many were missed (False Negatives), how many low-fatigue days were wrongly flagged (False Positives), and how many were correctly left alone (True Negatives).

Technical: Predictions made at the Youden-optimal threshold (threshold = 0.419; maximises sensitivity + specificity − 1). Counts from leave-one-out cross-validation — each sample predicted exactly once by a model trained on all the other samples.
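The summary metrics all follow from the four cells of the confusion matrix. The counts used below are not printed in the text; they are inferred as the only whole-number counts consistent with the 27/20 class split and the reported 93% sensitivity and 55% specificity, and should be treated as an assumption.

```python
import math

def confusion_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / mcc_den,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "youden_j": sensitivity + specificity - 1,
    }

# Inferred counts (assumption): 25 of 27 high-fatigue days flagged,
# 11 of 20 low-fatigue days correctly left unflagged.
inferred_metrics = confusion_metrics(tp=25, fn=2, fp=9, tn=11)
```

Under these inferred counts, about one flagged day in four would be a false alarm, which is the price of missing only two high-fatigue days.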

Model details

What does the model actually use?

This section shows which physiological signals the model relies on and the direction of each signal's effect. Negative bars push toward "low fatigue" (protective), positive bars push toward "high fatigue" (risk factor). Coefficients are computed on standardised features and are directly comparable in magnitude.

Technical: LR coefficients computed on standardised features fitted on the full dataset (single fit — not LOO-CV). C=0.5 · solver=lbfgs · class_weight=balanced · StandardScaler · converged in 4 iterations.

Metric glossary
AUC-ROC: Probability that the model ranks a random high-fatigue day above a random low-fatigue one. 1.0 = perfect, 0.5 = coin flip.
Sensitivity (Recall): Of all truly high-fatigue days, how many did the model catch? High sensitivity → few missed flares.
Specificity: Of all truly low-fatigue days, how many did the model correctly leave unflagged? High specificity → few false alarms.
PPV (Precision): Of the days the model flags as high-fatigue, how many actually are? Answers: "when the alarm fires, can I trust it?"
NPV: Of the days the model clears as low-fatigue, how many actually are? Answers: "when no alarm, am I safe?"
F1-score: Harmonic mean of precision and sensitivity (0–1). Standard ML summary statistic when classes are unequal.
MCC: Matthews Correlation Coefficient (−1 to +1). The most robust single number for imbalanced binary classification. +1 = perfect, 0 = random.
Balanced Accuracy: (Sensitivity + Specificity) / 2. More honest than raw accuracy when one class is more frequent.
Youden's J threshold: Decision threshold that maximises Sensitivity + Specificity − 1. Used for the confusion matrix here.
LOO-CV: Leave-one-out cross-validation. Every sample is predicted once by a model that never saw it during training.
Bootstrap CI 95%: Confidence interval estimated by resampling the dataset 1,000 times. Captures uncertainty from the small N.
Forward selection: Greedy feature search: add the single feature that most improves AUC at each step. Stops when gain < 0.01.

Data pipeline

Built on the same physiological foundation as the ANS Predictor

This predictor does not have its own data collection layer. It uses the same automated pipeline that feeds the ANS Predictor: nightly Polar AccessLink pulls, RR interval processing via neurokit2, feature engineering, and diary alignment.

The Sleep Quality Predictor adds a separate modelling layer on top of that foundation, with its own target variable (fatiga, the diary fatigue score), its own feature selection run, its own logistic regression coefficients, and its own LOO-CV evaluation. The pipeline is shared. The clinical question is different.
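The diary alignment step can be pictured as a join between nightly RMSSD values and diary entries. The sketch below is a guess at its shape, not the project's code: it assumes each night is keyed by the date of the morning it ends on, that t0 is the night ending on the diary day and t1 the night before, and that diary days without both nights are dropped. The function name, the dictionary layout, and the sample values are hypothetical; the fatigue ≥ 6 cut-off comes from the text.

```python
from datetime import date, timedelta

def build_pairs(night_rmssd, diary_fatigue, lags=(0, 1), cutoff=6):
    """Join diary days to lagged nocturnal RMSSD and label high-fatigue days."""
    rows = []
    for day, fatigue in sorted(diary_fatigue.items()):
        nights = [day - timedelta(days=k) for k in lags]
        if not all(n in night_rmssd for n in nights):
            continue  # skip diary days lacking a complete set of nights
        row = {"date": day}
        for k, n in zip(lags, nights):
            row[f"hrv_rmssd_night_t{k}"] = night_rmssd[n]
        row["high_fatigue"] = int(fatigue >= cutoff)  # 1 if fatigue >= 6/10
        rows.append(row)
    return rows

# Made-up values: three monitored nights, two diary entries.
demo_nights = {date(2025, 1, 1): 30.0, date(2025, 1, 2): 42.0, date(2025, 1, 4): 35.0}
demo_diary = {date(2025, 1, 2): 7, date(2025, 1, 4): 3}
demo_pairs = build_pairs(demo_nights, demo_diary)
```

In this toy example the second diary entry is dropped because the preceding night is missing, which illustrates how many monitored nights can yield far fewer usable pairs.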

Stack: Python · scikit-learn · neurokit2 · Polar Grit X2 · GitHub Actions · CSV

References

Scientific basis

The Sleep Quality Predictor rests on three lines of published evidence: the predictive value of HRV features for sleep quality assessed by machine learning, nocturnal RMSSD as a marker of parasympathetic recovery, and the case for individual-level wearable modelling when aggregate patterns fail to generalise.

[1] Predicting Sleep Quality through Biofeedback: A Machine Learning Approach Using Heart Rate Variability and Skin Temperature

Di Credico A et al. · Clocks & Sleep · 2024 · https://doi.org/10.3390/clockssleep6030023

SVM on wearable HRV achieves 83.4% classification accuracy for sleep quality (PSQI) — the same predictive direction as Kinetica's RMSSD-based model, with Kinetica's AUC 0.74 being conservative given its unimodal, single-subject design.

[2] Detailed evaluation of sleep apnea using heart rate variability: a machine learning and statistical method using ECG data

Attar ET · Frontiers in Neurology · 2025 · https://doi.org/10.3389/fneur.2025.1636983

Nonlinear HRV features (SampEn) and parasympathetic indices (HF, RMSSD) show the strongest discriminatory power for sleep state classification — consistent with Kinetica's feature selection converging on RMSSD as the sole predictor.

[3] Benefit of the N-of-1 Approach Versus Aggregate Analysis in Tracking Individual Trajectories

Behrouzi T et al. · JMIR Formative Research · 2026 · https://doi.org/10.2196/86203

In a 256-participant wearable study tracking HRV and fatigue, aggregate models failed to match individual HRV trajectories in 95.24% of cases — methodological support for Kinetica's N-of-1 design targeting fatigue as primary outcome.

[4] Similar Patterns of Dysautonomia in Myalgic Encephalomyelitis/Chronic Fatigue and Post-COVID-19 Syndromes

Ryabkova VA et al. · Pathophysiology · 2024 · https://doi.org/10.3390/pathophysiology31010001

HRV parameters correlate closely with fatigue in ME/CFS and post-COVID patients but not with depression/anxiety scores — supporting the clinical hypothesis that nocturnal autonomic recovery is a fatigue-specific signal, not a general distress marker.

All references retrieved from PubMed. DOI links open the original publications.