eneuro — Latest Matching Preprints

1

Distinguishing Age-specific Patterns in Comorbidities of Obstructive Sleep Apnea Using Real-World Data

Goodman, M. O.; Alex, R. M.; Sands, S. A.; Azarbarzin, A.; Batool-anwar, S.; Pavlova, M. K.; Epstein, L. J.; Redline, S.; Cade, B. E.

2026-05-28 epidemiology 10.64898/2026.05.20.26352336 medRxiv

Top 10%

0.7%

Obstructive sleep apnea (OSA) is associated with a wide range of comorbidities, but the extent to which these follow predictable, age-dependent patterns is not well understood. Identifying such patterns could provide insight into OSA heterogeneity and its links to physiological measures of OSA. We trained age-dependent topic models (ATM) on longitudinal electronic health records from 36,426 patients with OSA in the Mass General Brigham Biobank. ATM organizes incident diagnoses into distinct comorbidity "topics," whose age-specific disease loadings represent predictive patterns linking related diagnoses across the life course. We applied the trained model to compute individual-level topic scores in independent data: a cohort of 11,689 OSA cases and 22,695 matched controls, and a cohort of 6,220 patients with polysomnography (PSG)-derived physiological measures. We identified 19 distinct age-dependent comorbidity profiles, all significantly associated with OSA case status (FDR-adjusted p<0.05). Topics reflected recognizable clusters including metabolic, neuropsychiatric, and immune-mediated conditions, and several were distinguished by age-of-onset of key comorbidities, such as early- vs late-onset asthma. Seventeen of the 19 topics were significantly associated with at least one of 13 PSG-derived physiological measures, including associations between cardiometabolic topics and the apnea-hypopnea index, sleep apnea specific hypoxic burden, and respiratory event-specific heart rate burden. These findings indicate that age-dependent comorbidity patterns distinguish meaningful OSA subtypes with differing prognoses and endophenotype associations. ATM offers insight into complex OSA comorbidity and suggests that age-informed, topic-based stratification may improve individualized risk assessment, interpretation of PSG findings, and targeting of clinical interventions.

2

VOGeo-Gaze: Calibration-Free, Geometry-Aware Deep Learning for Real-Time Gaze Tracking in Clinical Video-Oculography

Zhao, J.; Ahmadi, S.-A.; Decker, J.; Zwergal, A.; Eulenburg, P. z.; Flanagin, V. L.; Wuehr, M.

2026-05-29 health informatics 10.64898/2026.05.27.26354254 medRxiv

Top 11%

0.5%

Quantitative eye movement analysis is important for neurological diagnostics, yet existing video-oculography (VOG) systems typically require calibration, device-specific settings, or accurate gaze labels. We present VOGeo-Gaze, a real-time, calibration-free, geometry-aware neural network that estimates gaze by reconstructing anatomically meaningful eyeball parameters from image features. The method combines segmentation-driven projection geometry, a refraction-aware pupil correction module, and temporal anatomical stabilization, so gaze is derived from interpretable eye geometry rather than direct angular regression. Trained only on the public TEyeD dataset with weak gaze supervision, VOGeo-Gaze was evaluated on 116 clinical recordings from 17 patients and 19 healthy subjects using EyeSeeCam, a clinical gold-standard VOG system. It achieved median absolute angular errors of 0.33° horizontally and 0.35° vertically, with nearly 92% of recordings below 1° error while operating at >300 FPS. These results demonstrate sub-degree clinical gaze estimation without subject-specific calibration, camera intrinsics, or accurate gaze labels, providing a scalable and interpretable alternative to conventional VOG pipelines. Code is available at https://github.com/DSGZ-MotionLab/VOGeo-Gaze.

3

Generalized Sensory Sensitivity for Prediction of Post-Surgical Analgesic Outcomes: An Observational Cohort Study of Total Hip Arthroplasty and Hysterectomy

Schrepf, A.; Smith, T.; Waller, N.; Harris, R. E.; Ichesco, E.; Kaplan, C. M.; Till, S. R.; Williams, D. A.; As-Sanie, S.; Evanski, J. M.; Urquhart, A.; Brummett, C. M.; Clauw, D. J.; Harte, S. E.

2026-05-27 rheumatology 10.64898/2026.05.26.26354108 medRxiv

Top 13%

0.3%

Background. A substantial minority (~20%) of patients fail to achieve meaningful pain reduction following surgery intended to relieve pain. Risk is elevated in patients with nociplastic pain features, but available self-report measures were not designed for pre-surgical screening. We aimed to develop a brief, data-driven screener for poor analgesic response to surgery. Methods. Participants were recruited from tertiary orthopedic and chronic pelvic pain clinics. Total hip arthroplasty participants had Kellgren-Lawrence grades III-IV with hip pain greater than or equal to 1 year; hysterectomy participants had chronic pelvic pain greater than or equal to 6 months. The primary outcome was a 50% reduction in worst pain at six months. Items were selected via elastic net regression with k-fold cross-validation from 68 candidates. Results. Of 428 participants (81% female; mean age 51), 35% failed to achieve a 50% pain reduction. The resulting 11-item screener - the GenerAlized sensory sensitivity for sUrGical rEsponsiveness (GAUGE) - comprises pain across seven body regions and four symptom items measuring interoception (nausea, numbness/tingling) and exteroception (sensitivity to sound, sensitivity to odors). GAUGE outperformed the Central Sensitization Inventory, Fibromyalgia Survey Criteria, and PainDETECT for predicting surgical non-response (RR 1.535, 95% CI 1.342-1.55; AUC 0.738; sensitivity 0.741, specificity 0.635) and for predicting Patient Global Impression of Change. In an independent validation cohort of 54 total knee arthroplasty patients, GAUGE outperformed the Fibromyalgia Survey Criteria in predicting pain severity at six months. Conclusions. GAUGE is a data-driven, theoretically grounded screener for poor analgesic response to surgery, with potential utility for pre-surgical counseling and clinical trial enrichment.

4

Personalized Brain-Based Analgesia Detection with Portable fNIRS and AI

Minoccheri, C.; Joo, P.; Hu, X.-S.; Affendi, H.; Elayyan, F.; Harville, A.; McDonald, N. J.; Botero, T.; DaSilva, A. F.

2026-05-28 dentistry and oral medicine 10.64898/2026.05.20.26353377 medRxiv

Top 15%

0.2%

Neuroimaging-based pain decoding faces two underappreciated challenges: between-subject variability that prevents classifiers from generalizing across patients, and within-session cross-validation designs that inflate reported accuracy by conflating within-person and between-person variance. Here we address both using portable functional near-infrared spectroscopy (fNIRS) during pharmacologically verified local nerve anesthesia. Twenty-five patients with clinically painful teeth underwent 36-channel bilateral fNIRS during percussion before ("Pre") and after ("Post") local nerve anesthesia. In 13 block-success patients, a paired Pre versus Post comparison with healthy-tooth control identified three temporal hemodynamic response function (HRF) features (late slope, mean first derivative, and baseline-normalized amplitude) whose analgesia interaction effects (d = 0.63 to 0.79) exceeded that of raw general linear model (GLM) amplitude (d = 0.56), with a significant difference-in-differences interaction (p = 0.011). Per-patient calibration with these features yielded leave-one-subject-out (LOSO) AUC = 0.68 to 0.76 for nonlinear classifiers (permutation p = 0.002), with HbO-specific feature selection achieving the best performance (RF AUC = 0.760); a healthy-tooth negative control was non-significant. End-to-end deep learning on raw time series (CNN-LSTM AUC = 0.719) was competitive with feature-based classifiers, while linear models did not reach significance. Critically, head-to-head comparison of within-session CV and LOSO on the same data revealed mean inflation of +0.13 AUC across all model types, including deep learning, demonstrating that high within-session accuracy alone does not establish subject-independent validity.
Exploratory analyses suggested complementary roles for oxyhemoglobin (HbO; within-patient analgesia detection) and deoxyhemoglobin (HbR; cross-patient information), and that trial-to-trial response variability may complement amplitude for cross-patient pain detection. These results show that per-patient calibration with temporal HRF features supports subject-independent analgesic-state detection under strict LOSO evaluation, and that within-session validation (standard in the fNIRS pain-decoding literature) can substantially overestimate performance.
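
The difference between within-session cross-validation and leave-one-subject-out (LOSO) evaluation comes down to how trials are split: under LOSO, no trial from the held-out patient is ever seen during training. The following is a minimal sketch of that splitting rule only, not the authors' pipeline; the function name `loso_folds` and the toy patient labels are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def loso_folds(
    subject_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    Every trial from the held-out subject goes to the test set, so the
    training set never contains data from the person being evaluated.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Toy trial labels (made up): three hypothetical patients, two trials each.
subjects = ["p1", "p1", "p2", "p2", "p3", "p3"]
folds = list(loso_folds(subjects))
for held_out, train, test in folds:
    print(held_out, "train:", train, "test:", test)
```

A within-session design would instead shuffle all trials before splitting, which lets trials from the same person land on both sides of the split; that leakage is the mechanism the abstract identifies as inflating AUC.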

5

The Sleep-Wake Classification Performance of Pediatric-Trained Machine Learning Algorithms for Raw Accelerometer Data

Chen, P.-W.; Cielo, C.; Walsh, O.; Mcdonald, M.; Song, P. X.; Goldstein, C.; Moreno, J. P.; Jansen, E.; Mitchell, J. A.

2026-06-01 pediatrics 10.64898/2026.05.28.26354364 medRxiv

Top 17%

0.1%

Introduction: Actigraphy sleep-wake classification methods increasingly seek to leverage raw acceleration data and machine-learning-based classification, but performance evaluation in pediatrics is limited. We trained machine-learning models using pediatric data and compared their sleep-wake classification performance with existing algorithms for children. Methods: Sixty-five children (46% female, ages 5.3 to 17.7 years) completed in-lab overnight polysomnography and wore a GENEActiv device on their non-dominant wrist. The acceleration data were converted into 30-second epochs and aligned with physician-scored sleep-wake data from electroencephalography. Seven machine-learning models were trained using leave-one-subject-out cross-validation. Epoch-by-epoch analyses generated performance metrics (e.g., balanced accuracy [BA]) and discrepancy analyses provided overall sleep duration bias estimates. Models were ranked by combining performance and bias into a Euclidean distance score, where a lower score represents closer to perfect performance and zero bias. For benchmarking, we included GGIR sleep scoring algorithms and an adult-trained random forest classifier. Results: Overall, 560.1 hours of polysomnography and actigraphy data were collected (74.4% of epochs were scored as sleep). The pediatric-trained local-global long short-term memory (LSTM) classifier had the best epoch-by-epoch performance (e.g., BA=0.85, sensitivity=0.88, specificity=0.83, ROC-AUC=0.95, and Cohen kappa=0.67). These metrics exceeded those of an adult-trained random forest classifier and GGIR-based algorithms. Discrepancy analyses revealed that overall sleep duration was underestimated by an average of 25 minutes using the LSTM classifier with no proportional bias. Conclusion: We trained seven pediatric sleep-wake classifiers that had strong ability to detect sleep and wake, with the LSTM classifier performing best.
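
As a rough illustration of the metrics named above, balanced accuracy is the mean of sensitivity and specificity, and a distance-to-ideal score can combine it with sleep-duration bias. This is a sketch under stated assumptions: the abstract does not say how bias was scaled before computing the Euclidean distance, so `bias_scale_minutes` and the function `distance_to_ideal` are hypothetical choices, not the authors' formula.

```python
import math


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity for a binary sleep/wake classifier."""
    return (sensitivity + specificity) / 2.0


def distance_to_ideal(
    ba: float, bias_minutes: float, bias_scale_minutes: float = 60.0
) -> float:
    """Hypothetical ranking score: Euclidean distance from (BA = 1, bias = 0).

    bias_scale_minutes is an assumed normalizer that puts bias on a scale
    comparable to BA; it is not taken from the abstract.
    """
    return math.hypot(1.0 - ba, bias_minutes / bias_scale_minutes)


# Rounded values reported for the LSTM classifier in the abstract.
ba = balanced_accuracy(sensitivity=0.88, specificity=0.83)
score = distance_to_ideal(ba, bias_minutes=-25.0)
print(f"BA={ba:.3f}, distance={score:.3f}")
```

With the rounded sensitivity and specificity, the mean comes to about 0.855, in line with the reported BA of 0.85. A lower distance means the model is closer to perfect classification with zero duration bias.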

6

The emotional impact of gambling-related advertising: an experimental functional Near-Infrared Spectroscopy study protocol

Daniel, L.-I.; Ros-Leon, A.; Molina-Rodriguez, S.; Pellicer-Porcar, O.; Cabrera-Perona, V.; Ibanez-Ballesteros, J.

2026-05-27 addiction medicine 10.64898/2026.05.20.26353682 medRxiv

Top 17%

0.1%

The proliferation of gambling advertising has intensified concerns regarding its influence on vulnerable populations, yet the neural mechanisms underlying cue-reactivity to these stimuli remain underexplored in ecologically valid settings. This study protocol proposes a novel methodological framework to investigate prefrontal cortical responses to gambling advertisements in individuals with varying degrees of gambling experience. Materials and methods: This cross-sectional study will recruit 44 participants, divided into a clinical group (individuals with high-frequency gambling or gambling disorder) and a matched control group. Neural activity will be recorded using fNIRS while participants view gambling-related, neutral, violent, and sexual stimuli. Secondary measures include validated scales for gambling severity (SOGS), impulsivity, sensation seeking, and alexithymia. Data analysis will primarily utilize inter-subject correlation (ISC) to quantify neural synchronization and multiband frequency decomposition to capture dynamic affective processing. Advanced preprocessing, including short-channel regression, will be applied to ensure signal robustness. Discussion: By combining portable neuroimaging with a data-driven ISC approach, this study aims to identify objective neural markers of gambling vulnerability. The findings will provide novel insights into the idiosyncratic processing of commercial stimuli, potentially informing public health policies and the development of more effective evidence-based regulations for gambling marketing.

7

Validation of Gait Tasks in SynapTrack Mobile App for Cervical Spondylotic Myelopathy

Lewis, A.; Arkam, F.; Steel, B.; Chen, E.; Singh, P.; Yakdan, S.; Becker, I.; Guo, W.; Shahrabani, A.; Payne, P. R.; Ghogawala, Z.; Steinmetz, M. P.; Neuman, B.; Ray, W. Z.; Duncan, R.; Greenberg, J.

2026-05-29 surgery 10.64898/2026.05.27.26354225 medRxiv

Top 18%

0.1%

Background Gait impairment is a central sign of cervical spondylotic myelopathy (CSM) that is typically evaluated through subjective patient-reported questionnaires or objective in-clinic measures. These systems require substantial resources to administer and are poorly suited for longitudinal monitoring; however, emerging smartphone applications present an efficient alternative. We developed and assessed the validity of a data processing framework based on the SynapTrack smartphone application to assess gait function in individuals with CSM. Methods Participants completed walking tasks which were recorded on both the SynapTrack app and a gold standard gait mat. Acceleration data extracted from the smartphone by the app were filtered and processed to produce gait cycle features including velocity, step time, waveform features and frequency domain features. Standard gait features were compared across the two methods by correlation and Bland-Altman plots to assess validity. App-based gait features were then compared to the standard modified Japanese Orthopedic Association (mJOA) score to determine construct validity through correlation and ability to discriminate between individuals with CSM and healthy controls. Finally, intraclass correlation coefficients and coefficients of variation were used to measure test-retest reliability and standard variation across app features. Results A total of 110 participants were included in this study, of which 55 (50%) had CSM, 24 (22%) had peripheral neuropathy, and 31 (28%) were healthy controls. SynapTrack gait measures including velocity, step time, and double support showed strong validity as indicated through Bland-Altman plots and high correlation (>0.8) with mat features.
In addition to the gait features, acceleration root mean square, acceleration crest, spectral entropy, and dominant frequency showed strong construct validity compared to the mJOA across correlation (0.2-0.54), trend test (p < 0.001), and AUROC (0.62-0.79) analyses. ICCs showed moderate test-retest reliability (0.52-0.67). Discussion The proposed framework for processing gait data showed strong validity compared to the gold standard mat and high construct validity compared to the mJOA suggesting the utility of the SynapTrack app as an efficient alternative to existing methods. The confirmation of gait metrics related to CSM severity and identification of relevant waveform and frequency domain features present opportunities to use smartphone apps to develop ecologically valid data driven markers of CSM severity.
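
For readers unfamiliar with the Bland-Altman comparison used to validate the app against the gait mat, the core computation is the mean difference between paired measurements (bias) and the 95% limits of agreement around it. The sketch below uses made-up velocity values and a hypothetical function name `bland_altman`; it is not the study's data or code.

```python
import statistics
from typing import Sequence, Tuple


def bland_altman(
    method_a: Sequence[float], method_b: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (a - b); the limits of agreement are
    bias +/- 1.96 * sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired sequences with at least 2 values")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up gait velocities in m/s for four hypothetical walks.
app_velocity = [1.0, 1.2, 0.9, 1.1]
mat_velocity = [1.1, 1.1, 1.0, 1.0]
bias, lower, upper = bland_altman(app_velocity, mat_velocity)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

A bias near zero with narrow limits suggests the two methods can be used interchangeably for that feature; correlation alone would not reveal a constant offset between them.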

8

Can Large Language Models Diagnose Primary Immunodeficiency from Patient-Described Symptoms?

Reteig, L. C.; Woloshin, S.; Maglione, P. J.; Farmer, J. R.; Ong, M.-S.

2026-05-27 allergy and immunology 10.64898/2026.05.26.26353818 medRxiv

Top 18%

0.1%

Patients with primary immunodeficiency (PID) often face prolonged diagnostic delays and may increasingly turn to large language models (LLMs) to interpret their symptoms during this period. We evaluated whether an LLM could recognize PID from symptom descriptions derived from interviews with 21 PID patients. In a prior study, we showed that GPT-4o identified PID in 96% of cases when prompted with physician-written patient histories (Rider et al., JACI, 2024). Here, when prompted with symptom descriptions in patients' own words, GPT-5 identified PID in only 7 cases (33%), although it more broadly suggested immune system issues in 18 cases (81%). The gap between these findings indicates that LLMs are sensitive to the language and framing of symptom descriptions, performing substantially worse when patients describe their own symptoms in everyday language than when clinicians summarize patient histories in structured medical terms. This study underscores the need to carefully evaluate how LLMs are used in patient-facing applications.

9

Comparison of Mechanical Tissue Properties Using MyotonPRO and Time-Harmonic Elastography: Understanding Fundamental Differences and Statistical Relationships

Kurz, E.; Valli, G.; Meyer, T.; Proger, S.; Schwesig, R.; Bartels, T.; Delank, K.-S.; Sack, I.; Aghamiry, H. S.

2026-05-28 sports medicine 10.64898/2026.05.20.26353658 medRxiv

Top 18%

0.1%

Purpose: MyotonPRO (MTP) and time-harmonic elastography (THE) are increasingly used to assess muscle mechanical properties, yet they operate on fundamentally different physical principles. MTP measures composite MTP stiffness (N/m) through surface oscillations, while THE quantifies intrinsic shear modulus (THE stiffness, kPa) via propagating shear waves. This study aimed to systematically compare MTP and THE measurements in the vastus lateralis muscle across different contraction intensities and examine how the skin layer and subcutaneous fat (SLSF) thickness influences their relationship. Methods: Twenty-six healthy adults (15 males, 11 females; age 25 [SD 4] years) underwent MTP and THE measurements of the vastus lateralis at rest and during isometric contractions at 15% and 30% maximal voluntary contraction (MVC). Effects of contraction intensities on tissue properties were assessed using univariate analyses of variance with repeated measures. Associations between the different outcomes of THE and MTP technologies were explored using Pearson's correlations and partial correlation coefficients separately for each contraction intensity with adjustment for the SLSF thickness of participants. Results: Both technologies detected contraction intensity-dependent stiffening across all outcomes (p < 0.001). THE stiffness increased from 5.3 [1.2] kPa at rest to 15.6 [6.1] kPa at 30% MVC; THE wave attenuation increased from 0.83 [0.19] to 1.42 [0.36] s/m while MTP stiffness increased from 337.3 [49.3] N/m at rest to 529.4 [160.7] N/m at 30% MVC. Correlations between modalities were weak and condition-dependent. THE wave attenuation did not significantly correlate with any MTP outcome across conditions. Conclusion: MTP and THE detect contraction-induced stiffening through fundamentally different physical mechanisms and should not be regarded as interchangeable.
Their correlation is modest at rest and breaks down (or reverses) during active contraction, with subcutaneous fat as a key modifying factor. Clinical trial number: Not applicable.

10

Neonatal EEG network activity associates with 2-year neurodevelopment after perinatal asphyxia

Syvalahti, T.; Tokariev, M.; Nevalainen, P.; Tuiskula, A.; Metsaranta, M.; Haataja, L.; Vanhatalo, S.; Tokariev, A.

2026-05-27 pediatrics 10.64898/2026.05.26.26354098 medRxiv

Top 19%

0.1%

Background Prediction of long-term neurodevelopmental outcomes remains challenging after perinatal asphyxia. Here, we studied whether computational metrics of brain function derived from neonatal EEG are associated with long-term neurodevelopment in infants with perinatal asphyxia. Methods A total of 36 term-born infants with perinatal asphyxia with or without hypoxic-ischemic encephalopathy were studied with neonatal multichannel electroencephalography (EEG). We computed local EEG amplitudes and phase-amplitude coupling (PAC), as well as large-scale functional cortical networks estimated using amplitude-amplitude correlations (AAC) and phase-phase correlations (PPC). These EEG-derived markers were tested for associations with neurodevelopmental outcomes at two years, assessed using the Griffiths Scales of Child Development, 3rd edition (GMDS-III). Results EEG amplitudes showed positive associations with GMDS-III Foundations of Learning and General Development scores across most electrodes during quiet sleep, with the strongest effects observed at frontal and central regions (r = 0.44-0.66). PAC showed negative associations with the same scores mainly over parietal and temporal regions (r = -0.45 to -0.55). Cortical AAC networks demonstrated the most robust and widespread negative associations in all frequency bands during quiet sleep (r = -0.47 to -0.54), with 70-72% of connections significant in high delta frequency. In turn, PPC networks showed frequency-selective and more spatially constrained negative associations during quiet sleep (r = -0.48 to -0.53), involving 5-12% of the network. Conclusions Both local and network-based metrics in the newborn brain show significant association with neurodevelopmental outcome at 2 years after perinatal asphyxia.

11

High-resolution Orbitofrontal Cortex Morphometry and Cannabis Use Disorder Severity in High-risk Emerging Adults: A Preliminary Study

Hargreaves, T. L.; McIntyre-Wood, C.; Elsayed, M.; Vandehei, E.; Belisario, K. L.; Lee, L.; Blakely, A.; Halladay, J. L.; Amlung, M.; Sweet, L. H.; MacKillop, J.

2026-05-27 addiction medicine 10.64898/2026.05.26.26354113 medRxiv

Top 21%

0.1%

Background: Cannabis use is highly prevalent among emerging adults (18-25 years), a developmental period marked by ongoing neurodevelopment and heightened risk for cannabis use disorder (CUD). Structural alterations in the orbitofrontal cortex (OFC) and medial prefrontal/anterior cingulate cortex (mPFC/ACC) have been linked to cannabis use, though findings remain inconsistent in directionality. To address this, we examined cortical thickness and surface area of the OFC and mPFC/ACC subregions using the high-resolution Glasser atlas, allowing for more granular characterization of associations with CUD severity. Method: One hundred eleven emerging adults (41% male, age 20.6 ± 1.1 years) reporting significant alcohol and/or cannabis use completed clinical assessments and structural MRI. The OFC and mPFC/ACC were segmented into seven and six subregions per hemisphere, respectively. Multiple linear regressions tested associations between cortical thickness or surface area and DSM-5 CUD symptom count, controlling for alcohol use and intracranial volume. Subregions surviving false discovery rate correction were examined in relation to depression, trauma-related symptoms, impulsivity, and cannabis use motives. Results: Greater CUD severity was associated with lower cortical surface area and greater cortical thickness in OFC and mPFC/ACC subregions. Lower OFC surface area was correlated with coping- and enhancement-related cannabis use motives. Lower mPFC/ACC surface area and greater thickness were associated with more severe depression, trauma-related symptoms, and impulsivity. Conclusion: In high-risk emerging adults, greater CUD symptom burden is associated with lower surface area and greater thickness in OFC and mPFC/ACC subregions.
Using the high-resolution Glasser atlas, these findings provide a more precise characterization of structural correlates of CUD and highlight potential neurobiological markers linked to affective and motivational processes underlying cannabis use.

12

Ejaculatory Function and Clinical Outcomes Following Robotic Aquablation for Prostatic Bladder Outflow Obstruction: A Retrospective Real-World Cohort Study Protocol

Shroff, D. E.; Newman, T.; Malde, S.; Martyn-Hemphill, C.

2026-05-30 urology 10.64898/2026.05.28.26354125 medRxiv

Top 21%

0.0%

Introduction Aquablation for surgical treatment of benign prostatic enlargement (BPE) causing bladder outflow obstruction (BOO) has demonstrated good functional outcomes, even for large glands, with high rates of ejaculatory preservation reported. This is a protocol for a study that aims to review real-world outcomes of ejaculatory preservation or restoration post-Aquablation in an unselected cohort and compare them to published clinical trial outcomes. Methods Retrospective data will be collected from a prospectively maintained consecutive case series of patients who underwent Aquablation in a single UK centre. The primary outcome is ejaculatory function subjectively reported by men post-operatively, classified as: antegrade ejaculation, retrograde/low volume ejaculation, anejaculation or not sexually active. Secondary outcomes are International Prostate Symptom Score (IPSS), Quality of Life (QoL) Score, post-void residual (PVR), and incontinence. Descriptive and comparative statistical tests will be performed. Conclusions This study will review real-world ejaculatory function and clinical outcomes following robotic Aquablation for prostatic bladder outflow obstruction and compare these to published clinical trial outcomes.

13

Consumer Opinions, Lot-to-Lot Variability, and Pharmacokinetics of Transdermal Melatonin Products: A Randomized, Crossover Clinical Trial

Bonilla, K.; Sherman, V. M.; Arbaiza, A. S.; Dougherty, M.; Olson, L. E.

2026-05-29 pharmacology and therapeutics 10.64898/2026.05.27.26354234 medRxiv

Top 21%

0.0%

In some countries, melatonin is sold without a physician prescription and dosage is unregulated. Transdermal products have become popular, including those marketed for children. We measured consumer assumptions about these products among adult residents of the United States, analyzed lot-to-lot variability, and compared the pharmacokinetics of melatonin administered in oral, lotion, and bath product forms. Survey respondents (n=199) believed oral melatonin was more effective than transdermal products and that all melatonin products were relatively safe. Melatonin lotion products analyzed by HPLC displayed lot-to-lot variability as well as changes in formulation and product claims. To determine pharmacokinetics, three different treatments (oral tablets, lotion, and bath immersion) were administered to twelve undergraduate participants in a randomized, crossover design. Five additional participants completed bath product treatment only. Participants collected saliva samples up to 48 hours after administration, which were analyzed for melatonin by enzyme-linked immunosorbent assay. Oral (n=11) and lotion formulations (n=12) caused maximum salivary melatonin levels within 30 minutes after administration, but bath immersion did not cause increases in saliva melatonin (n=17). The half-life of oral melatonin was 1.17 [0.69–1.65] hours versus 5.72 [3.75–7.68] hours for lotion treatment (p = 0.011, effect size r = 0.770). Melatonin lotion may pose a risk to consumers who assume it is safe and less effective than oral tablets, when in fact it may be very potent and remain at high physiological levels into the following day. This study is registered on clinicaltrials.gov (NCT06382610) and was funded by the Sleep Research Society.

14

Preliminary Reliability and Validity of SynapTrack, a Smartphone-Based Digital Biomarker Platform for Remote Assessment of Cervical Spondylotic Myelopathy

Yakdan, S.; Singh, P.; Arkam, F.; Chen, E.; Lewis, A.; Steel, B.; Becker, I.; Guo, W.; Naveed, H.; Wang, C.; Yang, D.; Wang, Z.; Ray, W. Z.; Hassenstab, J.; Steinmetz, M. P.; Ghogawala, Z.; Kelleher, C.; Greenberg, J.

2026-06-01 surgery 10.64898/2026.05.29.26354454 medRxiv

Top 21%

0.0%

Background and Objectives: Cervical spondylotic myelopathy (CSM) is a leading cause of neurological disability in older adults. However, validated, scalable tools to quantify disease severity and changes over time are lacking. Recent advances in smartphone technology have opened new avenues for longitudinal, objective, and remote monitoring of neurological conditions. We performed a preliminary evaluation of the reliability and validity of SynapTrack, a smartphone-based digital platform for objective remote CSM assessments. Methods: In this single-center prospective cohort study, 265 participants (151 with CSM, 114 healthy controls) completed in-person SynapTrack assessments related to tapping, pinching, and vibratory detection, along with reference laboratory measures of dexterity (Box and Block Test, 9-Hole Peg Test) and vibratory sensation (tuning fork). A subset completed repeated home-based testing to assess test-retest reliability. We evaluated convergent validity, construct validity against the modified Japanese Orthopedic Association (mJOA) score, known-groups validity, and test-retest reliability (intraclass correlation coefficient, ICC). Results: Smartphone-derived metrics demonstrated good-to-excellent test-retest reliability, with the strongest stability for vibratory detection threshold (ICC = 0.92), overall and non-dominant tapping speed (ICC = 0.90 each), and pinching successful targets (ICC = 0.90). Convergent validity was supported by moderate-to-strong correlations between digital metrics and reference laboratory dexterity tests (ρ up to 0.60 for tapping speed; up to -0.65 for the vibratory threshold). Construct validity against the mJOA was strongest for the vibratory threshold (ρ = -0.53 to -0.54) and Level 2 non-dominant pinching errors (ρ = -0.45).
Selected metrics distinguished CSM patients from controls with good discrimination, including non-dominant tapping speed (AUROC = 0.76, 95% CI 0.68-0.85), Level 2 dominant pinching successful targets (AUROC = 0.78, 95% CI 0.62-0.94), and the non-dominant vibratory threshold (AUROC = 0.77, 95% CI 0.64-0.90). Conclusions and Relevance: A smartphone-based battery of upper-extremity sensorimotor tasks demonstrated preliminary reliability and validity in CSM. Furthermore, to our knowledge, the novel vibratory detection task represents the first smartphone-based sensory assessment used for CSM. Collectively, these findings position SynapTrack as a scalable platform for objective, remote neurological monitoring of CSM.

15

Cross-Sectional Measures of Periodontal Severity: Distortion from Severity-Dependent Tooth Loss

McCormick, K. M.; Amarasena, N.; Guzzo, G.; Nath, S.; Jamieson, L.

2026-05-30 dentistry and oral medicine 10.64898/2026.05.27.26354277 medRxiv

Top 21%

0.0%

Aim: Cross-sectional summaries of periodontitis based on clinical attachment loss (CAL) are, by definition, conditioned on surviving teeth. Because the most severely affected teeth are more likely to have been lost, these measures may underestimate cumulative disease burden and show an artificial flattening (attenuation) of severity with age. We hypothesised that measures more sensitive to severe attachment loss would show greater attenuation at older ages than measures defined across a broader range of sites. Materials and Methods: Using nationally representative data from adults aged 30+ years in NHANES 2009-2014, we examined age-specific trajectories across multiple continuous measures of periodontal severity and assessed whether divergence between measures followed the pattern predicted under severity-dependent tooth loss. Results: The proportion of observable sites declined from 93% at ages 30-34 to 68% at 80+ years, establishing the structural basis for the divergence observed across severity measures. All severity measures showed nonlinear attenuation with age, with distortion increasing with severity threshold. Higher-threshold measures exhibited the greatest attenuation, while lower-threshold measures showed more stable trajectories. Conclusions: Cross-sectional summaries of periodontitis reflect disease among surviving teeth rather than cumulative damage across teeth originally at risk. Attenuation at older ages is consistent with depletion of the most severely affected teeth rather than biological slowing. Distortion varies by measure, with higher-threshold and mean-based indices most affected, whereas the CAL 3+ mm threshold provides a more stable basis for age comparisons.

16

Morphological feature remodeling of intracranial arteries in the context of inflammation and HIV-associated cognitive impairment

Hoang, N.; Yang, H.; Uddin, M. N.; Zhong, J.; Faiyaz, A.; Singh, M. V.; Boodoo, Z. D.; Sutton, K. R.; Wang, H. Z.; Sahin, B.; Khan, M. W.; Weber, M. T.; Yuan, C.; Chen, L.; Schifitto, G.

2026-05-27 hiv aids 10.64898/2026.05.19.26353071 medRxiv

Top 21%

0.0%

Background: Despite the success of combination antiretroviral therapy (cART), vascular comorbidities, including cerebrovascular disease, are more prominent in people living with HIV (PLWH) than in people without HIV (PWOH). However, quantitative assessments of cerebrovascular morphometry and their associations with cognitive outcomes in the context of HIV are still limited. In this study, we explore this missing link. Methods: Magnetic Resonance Angiography (MRA) data, blood markers, and neurocognitive assessments were collected from 73 PWOH subjects (male: 57, female: 16; age: 53 ± 16) and 99 PLWH subjects (male: 66, female: 30; age: 53 ± 11). Vessel morphometric features were quantified using intraCranial Artery Feature Extraction (iCafe) to investigate associations between vessel morphometry, markers of monocytes, endothelial cell activation, and cognitive performance. Results: HIV status predicted a lower total number of branches (β = -0.224, p = 0.001, d = -0.517) and shorter total distal length (β = -0.173, p = 0.021, d = -0.370) with a moderate effect size. Total branch number was negatively associated with plasma levels of monocyte markers (sCD14: r = -0.167, p = 0.033; sCD163: r = -0.157, p = 0.045) and positively correlated with white matter cerebral blood flow (r = 0.550; p ≤ 0.05). HIV status was the strongest predictor of overall cognitive performance in the ANCOVA model (β = -0.219, p = 0.006, d = -0.453). Conclusions: Our results suggest that cognitive impairment in PLWH is associated with vessel morphology metrics. Monocyte immune activation may contribute to changes in vessel morphology.

17

The dangers of data double dipping in assessing the classification accuracies of blood biomarkers in Alzheimer's disease and related disorder research

Liu, T.; Zeng, X.; Snitz, B. E.; Karikari, T. K.; Deek, R. A.

2026-06-01 neurology 10.64898/2026.05.22.26353848 medRxiv

Top 22%

0.0%

Blood biomarker models are increasingly used in Alzheimer's disease and related dementia translational research, but predictive performance can be inflated when the same dataset is used for both model development and evaluation. We assess the effect of data double dipping using simulations and NULISA proteomic data from the MYHAT-NI community-based cohort to predict brain amyloid-beta neuroimaging status. In both settings, training AUC increased as more biomarkers were added, while testing AUC peaked earlier and then declined. These findings show that data double dipping can inflate model performance and highlight the need for external validation or internal validation with data partitioning.
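
A minimal sketch of the double-dipping effect follows, under deliberately simplified assumptions. It does not use NULISA or MYHAT-NI data and is not the authors' model: every simulated "biomarker" is pure noise, and model development is reduced to picking the single marker with the highest AUC in a small training set. The names `auc` and `simulate` and all sample sizes are hypothetical. The in-sample AUC of the selected marker can only rise as more candidates are screened, while its AUC on independent data should stay near chance.

```python
import random


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(n_train=40, n_test=1000, n_markers=200, seed=0):
    """Return (markers screened, in-sample AUC, held-out AUC) for nested marker sets."""
    rng = random.Random(seed)
    # Balanced labels; no marker carries any real signal about them.
    y_train = [1] * (n_train // 2) + [0] * (n_train - n_train // 2)
    y_test = [1] * (n_test // 2) + [0] * (n_test - n_test // 2)
    train = [[rng.gauss(0, 1) for _ in y_train] for _ in range(n_markers)]
    test = [[rng.gauss(0, 1) for _ in y_test] for _ in range(n_markers)]
    train_aucs = [auc(marker, y_train) for marker in train]

    results = []
    for k in [k for k in (1, 10, 50, 200) if k <= n_markers]:
        # "Model development": keep the best of the first k markers on training data.
        best = max(range(k), key=lambda j: train_aucs[j])
        # Double dipping reports train_aucs[best]; honest evaluation uses unseen data.
        results.append((k, train_aucs[best], auc(test[best], y_test)))
    return results
```

Iterating over `simulate()` and printing each tuple shows the gap between the two estimates widening as the candidate pool grows. Data partitioning or external validation, as the abstract recommends, corresponds to reporting the third value rather than the second.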

18

A Lasting Legacy: Long-Term Effects of Exercise Training on Cardiometabolic Health in the STRRIDE-Prediabetes Reunion Study

Ross, L. M.; Sudnick, A. M.; Collins-Bennett, K. A.; Bo, N.; Counts, J. D.; Johnson, J. L.; Bennett, W. C.; Saldana, A. A.; Kennedy, K. G.; Aliferis, C. F.; Ma, S.; Huffman, K. M.; Peskoe, S. B.; Kraus, W. E.

2026-05-28 cardiovascular medicine 10.64898/2026.05.26.26352907 medRxiv

Top 22%

0.0%

Background: Regular exercise is a highly effective yet underutilized strategy to reduce cardiometabolic disease burden. Whether brief structured exercise programs confer lasting cardiometabolic benefits remains unclear. The STRRIDE-Prediabetes Reunion study examined legacy effects of exercise training on cardiorespiratory fitness, body composition, and cardiometabolic health. Methods: Seventy-three participants (71.3 ± 7.2 years; 64% women; 77% White) completed Reunion assessments ~11 years after completing one of four 6-month interventions differing in exercise amount, intensity, and inclusion of diet-induced weight loss. Linear mixed effects models evaluated longitudinal trajectories; secondary analyses examined baseline-adjusted associations among short-term intervention response and Reunion outcomes. Results: Abdominal adiposity improved across all groups from baseline to Reunion, with waist circumference decreasing ~3 cm over the follow-up period. In contrast, cardiorespiratory fitness and fat-free mass declined significantly. A significant group by time interaction was observed for total fat mass (p=0.01), with continued fat mass reductions observed in women randomized to high amount exercise. After baseline adjustment, greater short-term intervention response was associated with more favorable Reunion outcomes across fitness, body composition, and cardiometabolic domains; fat-free mass showed the strongest association (β=0.84, p<0.0001). Conclusions: In older adults with prediabetes, the STRRIDE-Prediabetes interventions produced several legacy health effects persisting more than a decade later. Legacy effects differed by sex and exercise dose, and short-term intervention response relative to baseline was associated with long-term outcomes, supporting targeted exercise strategies to preserve cardiometabolic health and functional independence with aging.

19

The Verification Gap: Artificial Intelligence Adoption, Hallucination Awareness, and Verification Practices Among Early Career Medical Researchers in Pakistan

Sajjad, M.

2026-05-30 health informatics 10.64898/2026.05.28.26354373 medRxiv

Top 22%

0.0%

Artificial intelligence (AI) tools have been rapidly adopted by medical researchers, yet whether early-career researchers in low- and middle-income countries possess the awareness and habits needed to use these tools safely remains poorly documented. This study characterized AI adoption patterns, hallucination awareness, and verification and disclosure practices among early-career medical researchers in Pakistan. A cross-sectional anonymous online survey was conducted among medical students, house officers, residents, physicians, and faculty involved in research or academic work across Pakistan (May 2026). Descriptive statistics and chi-square tests were applied to 373 eligible responses. AI use was near universal (99.7%), with 60.3% using AI tools daily. The most commonly reported tool in this sample was Claude (40.5%), followed by ChatGPT (29.2%) and Perplexity (26.0%), though this ranking likely reflects sampling characteristics. Despite high adoption, 59.2% typically did not verify AI outputs before use, and 40.2% had never heard that AI can generate fabricated scientific references. In behavioral vignettes, 36.5% assumed convincing AI-generated references were authentic, and 54.2% would continue using remaining AI content after discovering one fabricated reference. Formal research training was strongly associated with consistent disclosure (51.7% vs. 17.1%; χ² = 48.43, p < 0.001). Role, daily use frequency, and research training were not significantly associated with verification behavior. Early-career medical researchers in Pakistan demonstrate high AI adoption alongside incomplete hallucination awareness and infrequent verification, a pattern that may carry implications for research integrity. Formal training was the only factor significantly associated with consistent disclosure. Integration of AI literacy into medical curricula and institutional governance frameworks merits consideration.

20

Keeping human in the loop: A three-phase generative AI workflow for research integrity in data-intensive science. A methodological case study using elite Ethiopian distance-running data

Galko, P.; Yisamaw, A.; Haugen, T.; Seiler, S.

2026-05-29 sports medicine 10.64898/2026.05.29.26354013 medRxiv

Top 22%

0.0%

Background: Generative AI tools can support data-intensive research by writing code, drafting prose, searching analytical possibilities, and stress-testing claims. They can also produce false citations, drift between statistical specifications, and lose continuity across long investigations. This paper describes a practical workflow for using AI systems in empirical research while keeping discovery, verification, and accountability inspectable. Methods: We developed and applied a three-phase human-AI workflow to a case study of 14 elite Ethiopian distance runners. The dataset contained 22,605 GPS-segments collected across 97 consecutive days in late 2025, supplemented by venue and athlete metadata collected in the field. Phase 1 used an autonomous data-exploration tool to pre-filter the hypothesis space across five seeded research questions. Phase 2 used an AI system under direct human guidance to construct candidate findings into numerical claims, verification scripts, and draft text. Phase 3 used an independent AI system in an adversarial role to stress-test methods, statistics, prose, figures, and citations. The workflow was informed by Pearl's distinction between association, intervention, and counterfactual reasoning, with human judgement retained for research direction, interpretation, and final claims. Results: The workflow produced three empirical analyses and a documented correction process. The analyses estimated an altitude-to-sea-level pace correction of +0.10 min/km per 1,000 m at matched heart rate, showed why pooled altitude-surface regression was not identifiable within this venue system, documented method-dependence in heart-rate-based intensity classification, characterised within-venue route variation as a 64/36 path-fixed-to-trail-variable split with the Sululta label resolving into two functionally distinct sub-venues, and reframed the cohort's training through a 3x3x3 prescription lattice grounded in Ethiopian coaching practice. The adversarial phase identified several hallucinated citations, a terminology error between HC1 and cluster-robust standard errors, and several inconsistencies between prose, figures, and computed results. Verification scripts re-derived nearly all numerical claims from the cleaned lap-level data. Conclusions: The case study shows how researchers can organise AI-assisted empirical work so that candidate discovery, claim construction, independent stress-testing, and final accountability remain separated. The workflow did not remove the need for domain expertise or human judgement. Its value was in making the route from candidate finding to manuscript claim explicit, reproducible, and open to challenge. Trial registration: Not applicable.