JAMA
● American Medical Association (AMA)
Preprints posted in the last 30 days, ranked by how well they match JAMA's content profile, based on 17 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.
Aschmann, H. E.; Tang, A. S.; Lee, M.; Salcedo, K. L.; Murrill, M. T.; Chen, G.; Ouyang, Y.; Lui, K.; Rahman, M.; Flood, J.; Kerkhoff, A. D.; Lin, T. K.; Shete, P. B.; for the Tuberculosis Epidemiologic Studies Consortium
Objectives: Tuberculosis (TB) in the United States disproportionately affects non-U.S.-born individuals. While testing this population for TB infection is recommended, little is known about individuals' willingness to take treatment for latent TB infection (LTBI). To address this gap, we conducted a pilot preference survey among individuals from countries with high TB incidence. Design: Cross-sectional survey supported by language concordant community health workers. Setting: Federally qualified health center, serving a primarily Asian immigrant community, in San Francisco. Participants: Adults eligible for risk-based LTBI testing based on place of birth seeking primary care. Outcome measures: Perspectives on TB disease, risk of reinfection, and willingness to accept treatment if diagnosed with LTBI conditional on different factors, such as side effects, costs, and other treatment burden. Results: Among 60 participants, the median age was 48 years (interquartile range 35-63 years), 52% were women, and 100% spoke Chinese. Infecting others (n=35, 58%), risk of death (n=30, 50%), and potential isolation (n=25, 42%) were the most worrisome consequences of TB disease. Reinfection risk, risk of liver damage, cost, TB progression risk, clinic visits, and blood draws were most often considered moderately or very important when deciding whether to take LTBI treatment (n=53 to 57, 88-95%). While most participants (n=56, 93%) were willing to take treatment if diagnosed with LTBI even at a 10-year TB progression risk below 1% and willing to accept a risk of liver damage (n=41, 68%), less than half would accept LTBI treatment if there were any associated cost (n=28, 47%). Finally, many participants had concerns about their reinfection risk after completing LTBI treatment (n=34, 57%).
Conclusions: Amongst surveyed participants, TB disease and its consequences such as hospitalization, death and infecting others were worrisome, and participants had a high level of willingness to take treatment if diagnosed with LTBI. Future assessments of how people weigh tradeoffs regarding LTBI diagnosis and treatment could inform interventions to increase LTBI treatment acceptance and completion.
Moe, A. B.; Haverty, C.; Lee, M.; Hahn, S. E.; McElrath, T. F.; Jain, M.; Rasmussen, M.; Corso, A.; Larson, M. L.; Morrison, H.; Melroy, L. M.; Roofeh, J.; Phelps-Sandall, B.; Kiefer, D.; Biggio, J. R.
Introduction: Preeclampsia (PE) is a leading cause of maternal and neonatal morbidity and mortality, and low-dose aspirin (LDA) prophylaxis is the cornerstone of evidence-based prevention. Despite guideline recommendations, LDA adherence remains poor, with 10-25% of moderate-risk patients taking aspirin. Objective personalized risk stratification using biomarkers has been shown to motivate behavior change in other disease contexts. Survey data suggest that patients are more motivated to take aspirin if informed by an objective predictive test. Here, we report real-world LDA adherence among patients who received a high-risk result from a cell-free RNA (cfRNA) PE risk prediction test. Methods: This retrospective, observational survey study included asymptomatic patients of advanced maternal age (AMA; ≥35 years at delivery) with singleton pregnancies without USPSTF-defined preexisting high-risk conditions for PE who received the cfRNA PE risk prediction test. Patients who opted in to receive text message surveys were asked about LDA use following receipt of test results. High adherence was defined as reporting LDA use on at least 6 of 7 days per week at least 85% of the time surveyed. The primary analysis included patients with a high-risk test result and at least one LDA frequency survey response following receipt of test result. The observed proportion of adherent patients was compared to a baseline estimate of 25% using an exact binomial test. Results: Of 166 patients who received a cfRNA PE risk prediction test result, 48 (28.9%) received a high-risk result. Of these, 29 (60%) opted in and responded to at least one survey, constituting the primary analysis population. Twenty-seven of the 29 (93.1%; 95% CI: 78.0-98.1%) were classified as highly adherent, significantly higher than the 25% baseline adherence estimate for moderate-risk patients (p < 0.0001).
Conclusion: Among surveyed patients who received a high-risk cfRNA PE test result, the proportion classified as highly adherent to LDA (93%) substantially exceeded published estimates of adherence in a similar patient population and met the clinically meaningful threshold of ≥80% associated with reduced risk of preterm preeclampsia. These findings indicate that objective and personalized biomarker risk testing may be a powerful driver of behavior change that current guidelines have failed to produce.
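The exact binomial comparison described in this abstract can be reproduced from the published counts alone. The following is a minimal sketch, not the authors' code: it assumes a one-sided upper-tail test of 27 adherent patients out of 29 against the 25% baseline, and it uses a Wilson score interval, which is an assumption because the abstract does not name the interval method. The function names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_upper_tail(k: int, n: int, p0: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def wilson_interval(k: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion k/n (assumed method)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = k / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Counts taken from the abstract: 27 of 29 highly adherent, 25% baseline.
p_value = binom_upper_tail(27, 29, 0.25)
low, high = wilson_interval(27, 29)
print(f"adherent: {27 / 29:.1%}, one-sided exact p = {p_value:.2e}")
print(f"Wilson 95% CI: {low:.1%} to {high:.1%}")
```

Under these assumptions the interval comes out close to the reported 78.0-98.1%, and the tail probability is far below 0.0001. With only 29 respondents out of 48 high-risk patients, the calculation says nothing about those who did not opt in.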
Kierulf, G.; Emmerson, M.; Krautscheid, P.; Bleyl, S.; Tristani-Firouzi, M.; Sawyer, B.
Congenital heart defects (CHD) are a common congenital anomaly and a leading cause of neonatal mortality. Even in ostensibly isolated cases, genetic testing can reveal monogenic causes of isolated CHD or identify syndromic conditions before additional features become clinically apparent. A timely and accurate genetic diagnosis can inform medical management and surveillance, reduce the need for unnecessary investigations, and offer families valuable information about prognosis, recurrence risk, and anticipatory guidance. In September of 2023, Primary Children's Hospital introduced a universal genetic testing protocol that implemented whole genome sequencing for all neonates admitted to the cardiac intensive care unit (CICU) undergoing cardiac surgery before 30 days of life, with the goal of increasing the number of patients who receive a timely genetic diagnosis and improving clinical care. This is a retrospective chart review of patients who underwent whole genome sequencing (WGS) under the new universal genetic testing protocol at Primary Children's Hospital from its initiation in September 2023 to February 2026. Over the study period, 217 neonates with CHD participated in the universal WGS protocol. Of these patients, 23 (10.6%) received a genetic diagnosis that was causative of their CHD, of which 11 patients (48%) had no major extracardiac features at the time testing was ordered. Twenty patients were diagnosed with a syndromic condition, and three patients were diagnosed with a non-syndromic condition. All of these patients received additional referrals to specialists following their new diagnosis, and six families used results to inform decisions regarding continuation of care. An additional 19 patients (8.8%) received WGS results that were clinically relevant but non-diagnostic for their CHD, including partial diagnoses, secondary findings, and carrier status. In total, 19.4% of patients (n=42) had clinically relevant variants identified on their WGS.
Hartmann, K.; Gannon, M.; Natarajan, P.; Greenland, P.; Biobank, P. M.; Levin, M.
Background: Polygenic risk scores (PRS) for coronary artery disease (CAD) are associated with cardiovascular events, but the relationship between inherited risk and routinely reported coronary computed tomography angiography (CTA) findings has not been studied. Objectives: To evaluate associations between a genome-wide PRS for angiographic coronary disease burden and coronary CTA-derived measures of atherosclerotic severity in a real-world clinical cohort. Methods: We studied Penn Medicine BioBank participants with available genotypes and clinically obtained coronary CTA reports. A previously published PRS for angiographic CAD burden was calculated using pgsc_calc. CAD-RADS scores and coronary artery calcium (CAC) values were extracted from radiology reports using the large language model Llama 3.1 8B. Associations between PRS and CAD-RADS severity were evaluated using Bayesian cumulative ordinal logit regression, while associations with log-transformed CAC burden were assessed using Bayesian linear regression. Results: Among 630 participants, median age was 59 years (IQR 49-68), 53% were female, 62% were genetically similar to a European reference population, and 34% to an African reference population. LLM-extracted CAD-RADS and CAC values demonstrated near-perfect agreement with manual abstraction. Higher PRS was associated with greater coronary atherosclerotic burden on CTA. Each 1-standard deviation (SD) increase in PRS was associated with a 20% higher odds of belonging to a more severe CAD-RADS category (cumulative OR 1.20, 95% credible interval 1.06-1.44). Higher PRS was also associated with greater CAC burden (β 0.38, 95% credible interval 0.15-0.61). Conclusions: Polygenic risk for angiographic coronary disease burden is reflected in clinically reported coronary CTA severity measures, including CAD-RADS and CAC.
These findings demonstrate that inherited susceptibility to CAD manifests as greater anatomic atherosclerotic burden at the time of clinical presentation and support further investigation of genetic risk integration into imaging-based cardiovascular risk assessment.
Chen, T.; Watanabe, M.; Callaghan, T.; Shioda, K.
Background: Statewide immunization data are essential for monitoring vaccination trends and evaluating immunization program impact. In the United States, Immunization Information Systems (IIS) were established in the early 1990s to collect these data; however, operational, legal, and procedural details vary across states and over time. This study summarized differences in IIS characteristics, such as legal requirements and reporting procedures, across U.S. states and jurisdictions over time. Methods: We analyzed survey data from previous work in 2000 and the Centers for Disease Control and Prevention (CDC) in 2012, 2018, and 2024. Our review focused on legislation and reporting requirements for immunization registries across 50 states and 14 jurisdictions, including U.S. territories and Freely Associated States. Results: Between 2000 and 2024, legal frameworks and reporting practices for immunization registries expanded across U.S. states and jurisdictions. The number of states with laws or administrative rules authorizing immunization registries increased from 24 states in 2000 to all 50 states, the District of Columbia, five metropolitan areas, five U.S. territories, and three Freely Associated States in 2024. Over the same period, reporting requirements also became more widespread. The number of states and jurisdictions mandating providers to report immunization records increased from 12 in 2000 to 54 in 2024. Consent policies also changed over time. By 2024, most states and jurisdictions had adopted implicit consent for reporting children's immunization records (41; 64%), while a smaller proportion required explicit parental consent (7; 11%) or implemented mandatory reporting without consent (14; 22%). Discussion: IIS infrastructure and reporting requirements have expanded across U.S. states and jurisdictions over the past two decades, while heterogeneity in consent policies and reporting practices persists. 
These temporal changes may need to be considered when interpreting IIS data, particularly in longitudinal and cross-jurisdictional analyses.
Fagerberg, P.; Sallander, O.; Vikhe Patil, K.; Thunborg, C.; Lundstrom, L.; Berg, A.; Nyman, A.; Borg, N.; Linden, T.
Title and abstract screening limit the timeliness of systematic reviews used for clinical guidelines. We evaluated audited large language model (LLM) triage at Sweden's National Board of Health and Welfare. Ten LLMs from five model families were tested on 419 Cochrane reviews comprising 26,892 records, and the selected ensemble was externally validated on 133 reviews including 8,501 records matched to planned guideline topics. The same locked model pair was then used prospectively across 24 systematic reviews in two national guideline programmes. On the 419-review selection benchmark, the selected Gemini-3-flash plus GPT-5.1 ensemble achieved 98.0% (95% CI, 97.3-98.7) mean review-level sensitivity, while topic-matched validation yielded 96.7% sensitivity (95% CI, 93.7-98.9). Prospective deployment screened 74,679 records, placed 63,858 (85.5%) in the AI-excluded pool and reduced estimated first-pass screening effort from 415 to 34 person-days. Across 600 randomly sampled AI-excluded records from the migraine and dementia programmes, none was confirmed as a final false negative after post-unblinding adjudication; across the completed 680-record audit, all 38 final retained records had been AI flagged, whereas locked blinded human consensus missed seven. These findings support locked, audited LLM triage, with human oversight and programme-specific monitoring, for systematic reviews used in national guidelines.
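The workload figures in this abstract follow from simple ratios of the reported counts. The sketch below only re-derives them; the variable names are illustrative, and the person-day estimates are taken as given from the abstract rather than recomputed.

```python
# Figures reported in the abstract for the prospective deployment.
screened = 74_679                  # records screened
ai_excluded = 63_858               # records placed in the AI-excluded pool
days_before, days_after = 415, 34  # estimated first-pass person-days

# Share of records the ensemble set aside, and what remains for humans.
excluded_share = ai_excluded / screened
remaining_for_humans = screened - ai_excluded

# Relative reduction in estimated first-pass screening effort.
effort_reduction = (days_before - days_after) / days_before

print(f"AI-excluded share: {excluded_share:.1%}")
print(f"Records left for human screening: {remaining_for_humans:,}")
print(f"Estimated effort reduction: {effort_reduction:.1%}")
```

The effort reduction is larger than the excluded share, which suggests the person-day estimate reflects more than a per-record count; the abstract does not say how it was derived.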
Silverman, R. A.; Ahrens, M. L.; Helmick, M.; Finkielstein, C. V.; Cohen, A.; Short, E.; Bordwine, P.
Background and Objectives: SARS-CoV-2 (COVID-19) continues to mutate, circulate, and adversely impact health and quality of life. While COVID-19 vaccines remain safe and effective, uptake remains low, especially among children, the youngest of whom were not vaccine-eligible until after Omicron and are underrepresented in published research. This study estimated vaccine effectiveness (VE) among under-5-year-olds. Methods: We used Virginia Department of Health surveillance data from June 2022 through October 2022 to conduct a test-negative case-control study. We estimated VE derived from odds ratios (ORs) of reported infections using logistic regression among children aged 6 months to 5 years. Results: Using the earliest positive (cases) or negative (controls) post-vaccine-eligible test results, the VE associated with two doses of a COVID-19 vaccine was 78% (95% CI=45%, 93%; p=0.004) in unadjusted analyses and 70% (95% CI=25%, 91%; p=0.023) when adjusting for age, sex, prior testing behavior, and prior reported infections. The adjusted VE was 74% (95% CI=28%, 94%; p=0.025) among those with no prior positives reported and 45% (95% CI=-302%, 97%; p=0.569) among those with a prior positive reported. Conclusions: These results show that even though the vaccine was not closely matched to the dominant variants circulating during the time period analyzed, it was effective at reducing the risk of reported infections. This study adds to the body of knowledge on pediatric COVID-19 VE in an underrepresented age group and in a rural region, illustrates the utility of surveillance data for evaluation, and can inform vaccine decisions to improve vaccine uptake for young children.
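In a test-negative design, vaccine effectiveness is conventionally derived from the odds ratio as VE = (1 - OR) x 100%. A minimal sketch of that conversion follows. The odds ratios are back-calculated from the reported unadjusted VE of 78% (95% CI 45% to 93%) purely for illustration; they are assumed values, not figures from the study, and the function names are hypothetical.

```python
def ve_from_or(odds_ratio: float) -> float:
    """Vaccine effectiveness (%) from an odds ratio: VE = (1 - OR) * 100."""
    return (1.0 - odds_ratio) * 100.0


def ve_interval(or_low: float, or_high: float) -> tuple[float, float]:
    """Convert an OR confidence interval to a VE interval.

    The bounds swap: the upper OR bound gives the lower VE bound.
    """
    return ve_from_or(or_high), ve_from_or(or_low)


# Assumed odds ratios implied by the reported unadjusted estimate
# (back-calculated from VE = 78%, CI 45% to 93%; not study data).
or_point, or_low, or_high = 0.22, 0.07, 0.55

ve = ve_from_or(or_point)
ve_lo, ve_hi = ve_interval(or_low, or_high)
print(f"VE = {ve:.0f}% (95% CI {ve_lo:.0f}% to {ve_hi:.0f}%)")
```

The same relationship explains the negative lower bound in the prior-positive subgroup: a VE of -302% corresponds to an odds ratio of about 4, so that interval is compatible with both strong protection and increased odds of reported infection.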
Jia, E.; Omar, M.; Barash, Y.; Brook, O. R.; Ahmed, M.; Kruskal, J. B.; Gorenshtein, A.; Klang, E.
AI-assisted clinical care may compound, rather than correct, existing health inequities. We applied Omar and colleagues' validated four-domain emergency-medicine benchmark to OpenEvidence (OE), a literature-grounded clinical LLM used by tens of thousands of US physicians daily, across 100 emergency-department cases and 20 sociodemographic labels. OE was consistent on the codified clinical decisions, triage, workup, and treatment, but diverged sharply on mental-health screening, where it flagged many historically marginalized groups between three and ten times more often than demographically unmarked cases. Cases labeled as unhoused received recommendations in 78 to 87 percent of responses (versus a 9 percent no-identifier-control rate); cases labeled as transgender in 22 to 24 percent; and Black transgender women specifically in 47 percent. A pre-registered audit of 193 free-text rationales localized the differential to the inner layer of the response, in the structure and tone of the rationale rather than the recommendation itself. Literature grounding may redistribute sociodemographic disparity in clinical AI rather than remove it. As clinical LLMs move toward agentic deployment, equity audits should examine how evidence is applied to each patient, not only whether citations are present.
Golshani, P.; Joseph, M. S.
The US Food and Drug Administration (FDA) maintains a public list of artificial intelligence and machine learning (AI/ML)-enabled medical devices that have received marketing authorization. Prior published analyses examined this list at earlier time points and reported a marked dominance of radiology applications. We performed a cross-sectional analysis of all 1,430 AI/ML-enabled medical device authorizations recorded by the FDA between September 1995 and December 2025 to characterize the cumulative growth, specialty distribution, and manufacturer concentration of authorized devices. The annual authorization volume increased from a mean of 1.8 per year between 1995 and 2014 to 264 per year between 2023 and 2025, with 331 authorizations recorded in 2025 alone. Devices reviewed by the FDA's Radiology panel accounted for 1,094 of 1,430 authorizations (76.5%), and the three most represented panels (Radiology, Cardiovascular, and Neurology) accounted for 90.6% of all authorizations. Several large clinical specialties were represented by very small numbers of authorized devices, including Pathology (n = 9), Microbiology (n = 6), and Obstetrics and Gynecology (n = 4). No authorizations were recorded under a psychiatry or behavioral health review panel. Of 740 unique companies, 502 (67.8%) had a single authorized device, while 13 companies (1.8%) accounted for 217 devices (15.2%). The cumulative regulatory record demonstrates rapid growth that has been concentrated in image-rich diagnostic specialties, with limited representation across many specialties that account for substantial clinical activity in the United States. These findings may inform policy discussions about where regulatory, infrastructure, and dataset investments are most needed to broaden the clinical scope of medical AI.
Gharibyan, I.; Ahner, E.; Shao, R.; Sharma, D.; Navarsartian Tazehkand, T.; Diep, J.; Assoumou, B.
Background: Statins are key to preventing atherosclerotic cardiovascular disease and lowering low-density lipoprotein cholesterol and cardiovascular events. However, skepticism regarding their safety and value persists and is increasingly influenced by social media. TikTok has emerged as a major source of health information, but its content varies in quality and accuracy. This study evaluated the quality, attitudes, misinformation, and engagement of statin-related content on TikTok. Methods: Public TikTok videos were collected using predefined search terms and coded by creator type, thematic content, and overall attitude. Video quality was assessed using the DISCERN instrument, the Patient Education Materials Assessment Tool for Audiovisual Materials, and the Global Quality Score. False or misleading claims were independently reviewed by two cardiology fellows. Associations between engagement and quality were also examined. Results: Of 1,349 screened videos, 258 met inclusion criteria. Most were educational (91.0%), with non-physician healthcare providers (34.5%) as the largest creator group. Risks or negative effects were discussed more often than benefits (63.2% vs 42.2%), and 39.5% contained at least one false or misleading claim, most often from complementary and alternative medicine providers and wellness promoters. Quality differed by creator type across all instruments, with physician-created content scoring highest. Video popularity showed minimal association with informational quality. Conclusion: Statin-related TikTok content frequently emphasizes harms, often contains misinformation, and varies substantially in quality by creator type. Greater involvement of healthcare professionals on social media may help improve digital health literacy and counter misleading information about statin therapy.
Ryder, R.; Elder, J.; Panditrao, M.; Grosgebauer, K.; Katz, R.; Tello, L.; Carroll, E.; Borthwick, D.; Kaur, C.; Smith, R.; Shiau, V.; Wheeler, W.; Reilly, E.; Myers, J.; Nelson, L.; Lim, E.; Arunleung, P.; Baylis, E.; Gilliam, S.; Hennesy-Burt, T.; Bregman, B.; Silver, E.; Kapsak, C.; Wright, S.; Leon, T.; Bell, J.; Morales, C.; Wadford, D. A.
In July 2021, the California Code of Regulations Title 17 required all laboratories performing SARS-CoV-2 whole genome sequencing (WGS) to report their sequencing results to the California Department of Public Health (CDPH). These viral genomic data and patient metadata were compiled into the Integrated Genomic Epidemiology Database (IGED). Linking anonymized viral sequences with patient-level information enabled monitoring of infectiousness, pathogenicity, transmission dynamics, evolution, and vaccine evasion among emerging SARS-CoV-2 lineages. Laboratories performing SARS-CoV-2 WGS transmitted sequencing results to CDPH through Electronic Laboratory Reporting (ELR) and non-ELR pathways. CDPH applied uniform reporting requirements but allowed flexibility in specific data formats to accommodate diverse data systems. To preserve data quality and interoperability across heterogeneous sources, CDPH implemented standardization, validation, and deduplication protocols. Snowflake, a cloud-based data storage and analytics platform, and Posit Connect, a cloud deployment and automation platform, supported the management, processing, and integration of data within the IGED. The IGED established links between SARS-CoV-2 WGS data and epidemiologic metadata for 801,418 sequences, representing 81.7% of all sequences reported in California. Lineages reported to the IGED showed strong concordance with lineage proportions in GISAID. Sequences reported to the IGED had average turnaround times longer than one month, and the majority of sequencing was performed in Southern California and Los Angeles. The IGED enhanced genomic surveillance through predictive modeling and monitoring concerning evolutionary trends such as recombination and saltations in persistent infections. 
Development of the IGED highlighted the need for standardized data requirements, sustained funding for sequencing, incentives for data submission, and interdisciplinary collaboration to build an effective genomic surveillance system. This framework for linking genomic and epidemiologic data has not only generated critical insights for SARS-CoV-2 but also provided the foundation for CDPH and other public health organizations to develop similar IGED-like systems for other priority pathogens as genomic surveillance expands.
Rivers, B.; Murray, B.; Applegate, C. D.; Tichnell, C.; Gordon, C.; McClellan, R.; Brown, E.; Nunez, K.; Barth, A. S.; Taylor, C. O.; Yanek, L. R.; Day, J.; James, C. A.
Background: Pretest genetic counseling (GC) is recommended in conjunction with genetic testing (GT) for cardiovascular (CV) indications, yet access to CVGC is limited, leading to delayed GT. Posttest GC could increase GC and GT access but requires efficient pretest education that supports both informed GT decision-making and robust GT uptake. Methods: We developed four indication-tailored online CV genetics education videos and deployed them in a 3-arm randomized trial comparing pretest vs. posttest outpatient CVGC (RESEQUENCE-GC, NCT05422573). Participants were 1:1:1 randomized to pretest video education plus an optional (efficiency arm) or required (flipped arm) phone call with a genetic counselor and planned posttest CVGC or to standard pretest CVGC (SOC arm). Questionnaires administered at baseline and post-education included the CV Multidimensional Model of Informed Choice (MMIC) to quantify GT knowledge and informed GT choice. Results: 389/767 (50.7%) adults aged 18-80 (mean 51.2±14.9 years) scheduling a first CVGC appointment consented to RESEQUENCE-GC and completed the baseline questionnaire. Efficiency arm participants (video education + optional phone call) were most likely to complete pretest education (134, 97.4% efficiency; 107, 85.6% flipped; 111, 87.4% SOC, p=0.0012) and elect GT (131, 95.6% efficiency; 105, 84.0% flipped; 107, 84.2% SOC, p=0.0036). Few (4, 2.9%) efficiency arm participants requested an optional pretest phone call. Most flipped arm participants (90, 84.1%) had no post-video questions, consistent with the 97-second (IQR 65-145 s) median call duration. CV genetics knowledge was high post-education (median 8 [IQR 7-8] of 8 MMIC items correct). Only video-based pretest education was associated with a significant increase in knowledge (p<0.0001). Nearly all participants made an informed GT choice with no difference between intervention (95.6%) and SOC (90.4%) arms (p=0.074).
Conclusions: Tailored, online video pretest education can enhance CV GT uptake, support informed GT decision-making, and be integrated into efficient pretest workflows, suggesting utility in scalable posttest CVGC.
Xu, S.; Sy, L. S.; Hong, V.; Farrington, P.; Glenn, S. C.; Kim, S.; Ryan, D. S.; Tubert, J. E.; Tong, P.; Lewin, B. J.; Tseng, H. F.; Carbayo, A.; Davis, C.; Sangha, N. S.; Belongia, E. A.; Sundaram, M. E.; Nelson, J. C.; Daley, M. F.; Klein, N. P.; Fireman, B.; Haapala, J.; Hurley, L. P.; Irving, S. A.; Cocoros, N. M.; Weintraub, E. S.; Duffy, J.; Qian, L.
Background: The Vaccine Safety Datalink (VSD) detected a statistical signal for ischemic events (ischemic stroke or transient ischemic attack) following bivalent mRNA COVID-19 vaccination through prospective surveillance during 2022-2023. Although multiple studies from other surveillance systems and countries reported no increased risk, important methodological limitations remained. This U.S. study addressed those limitations by evaluating the ischemic stroke risk following bivalent mRNA COVID-19 vaccination, influenza vaccination, and their same-day coadministration using an event-dependent self-controlled case series (SCCS) design. Methods: Study outcomes included first-ever ischemic stroke (primary outcome), first-in-1-year ischemic stroke (secondary outcome), and ischemic events (exploratory outcomes), identified using ICD-10-CM codes in inpatient and emergency department settings during September 1, 2022-March 31, 2023, among individuals aged ≥12 years across eight VSD sites. Analyses were conducted separately for Pfizer-BioNTech and Moderna bivalent vaccines, with relative incidences (RI) and 95% confidence intervals (CI) estimated for 1-21-day and 1-42-day risk intervals, using person-time outside these intervals as the control period. Subgroup analyses were performed by age group (12-64, >65 years) and history of documented SARS-CoV-2 infection. Results: A total of 6,510 first-ever ischemic strokes were identified among more than 6.8 million participants. Among recipients of Pfizer-BioNTech bivalent COVID-19 and influenza vaccines, no statistically significant increased risk of first-ever ischemic stroke was observed following bivalent COVID-19 vaccination (RI=0.94; 95% CI: 0.63-1.41), influenza vaccination (RI=0.95; 95% CI: 0.82-1.10), or same-day coadministration (RI=1.15; 95% CI: 0.88-1.49) within 1-21-day risk intervals; findings were similar for 1-42-day intervals.
Comparable null results were observed for Moderna vaccines and across all subgroups, secondary, and exploratory outcomes. Conclusion: No increased risk of ischemic stroke was found following bivalent mRNA COVID-19 vaccination, influenza vaccination, or their coadministration in this multi-site SCCS study. These findings are consistent with previous studies and underscore the importance of continued vaccine safety monitoring.
Panagiotopoulos, A.-P.; Laskaris, A.; Tsakri, D.; Manoussopoulos, Y.; Anastassopoulou, C.; Tsakris, A.; Ioannidis, J.
Objectives: To quantify the frequency of baseline control-group use in published long COVID prevalence studies and assess their key methodological features. Design: Cross-sectional meta-epidemiological evaluation of published post-acute COVID-19 prevalence studies, supplemented by a corresponding-author survey. Setting: Published studies identified through a systematic review by Hou et al. (2025) and supplementary data obtained through direct email contact with corresponding authors. Participants: A total of 440 published long COVID prevalence studies. Main outcome measures: Presence and type of comparator group, reliance on solely self-reported outcomes, acknowledgment of lack of a control group among uncontrolled studies, and availability of additional comparator data through author survey. Results: Among 440 studies, 372 (84.5%) reported no control group in their publication. Healthy or uninfected comparators were reported in 55 studies (12.5%) and other comparator types in 14 (3.2%); 1 study included both categories. Solely self-reported outcomes were used in 279 studies (63.4%). Among 372 uncontrolled studies, 244 (65.6%) did not explicitly acknowledge the absence of a baseline comparator as a limitation anywhere in the text. Corresponding authors of 140 studies (31.8%) responded to the survey; among them, 126 (90.0%) reported no additional comparative data, while 14 (10.0%) mentioned some available comparative datasets (19 additional datasets). Almost all of that information (10 of 14 authors, 17 of 19 datasets) had already been published in other articles not captured by the Hou et al. systematic review. Conclusions: Most published long COVID prevalence studies lacked comparator groups and relied exclusively on self-reported outcomes without acknowledging this limitation. Direct author contact identified little additional comparator information.
Much of the long COVID prevalence literature may therefore be poorly suited to estimating burden attributable specifically to SARS-CoV-2, underscoring the need for appropriately matched comparators and more objective outcome assessment. Registration: The protocol was prospectively registered on the Open Science Framework (https://osf.io/f4hra).
Sajjad, M.
Artificial intelligence (AI) tools have been rapidly adopted by medical researchers, yet whether early-career researchers in low- and middle-income countries possess the awareness and habits needed to use these tools safely remains poorly documented. This study characterized AI adoption patterns, hallucination awareness, and verification and disclosure practices among early-career medical researchers in Pakistan. A cross-sectional anonymous online survey was conducted among medical students, house officers, residents, physicians, and faculty involved in research or academic work across Pakistan (May 2026). Descriptive statistics and chi-square tests were applied to 373 eligible responses. AI use was near universal (99.7%), with 60.3% using AI tools daily. The most commonly reported tool in this sample was Claude (40.5%), followed by ChatGPT (29.2%) and Perplexity (26.0%), though this ranking likely reflects sampling characteristics. Despite high adoption, 59.2% typically did not verify AI outputs before use, and 40.2% had never heard that AI can generate fabricated scientific references. In behavioral vignettes, 36.5% assumed convincing AI-generated references were authentic, and 54.2% would continue using remaining AI content after discovering one fabricated reference. Formal research training was strongly associated with consistent disclosure (51.7% vs. 17.1%; chi-square=48.43, p<0.001). Role, daily use frequency, and research training were not significantly associated with verification behavior. Early-career medical researchers in Pakistan demonstrate high AI adoption alongside incomplete hallucination awareness and infrequent verification, a pattern that may carry implications for research integrity. Formal training was the only factor significantly associated with consistent disclosure. Integration of AI literacy into medical curricula and institutional governance frameworks merits consideration.
Serrano, A. E.
Machine learning (ML) has emerged as a transformative technology across biomedical and life science sectors, with applications spanning drug discovery, medical imaging, genomics, and clinical decision support (Goecks et al., 2020; Patel et al., 2020). Despite exponential growth in ML-related publications, from fewer than 100 articles in 2003 to nearly 25,000 by 2021 (NCBI, 2022), adoption among industry professionals remains uneven and sector-dependent. Understanding what drives or inhibits this adoption is critical for organisations seeking to leverage ML capabilities in research and clinical practice. Technology adoption in organisational contexts has been extensively studied through the Technology Acceptance Model (TAM), originally proposed by Davis (1989) and subsequently extended to incorporate external variables influencing perceived usefulness (PU) and perceived ease of use (PEU) (Venkatesh & Davis, 1996). While TAM has been applied across multiple industries, its application within biomedical and life science contexts remains limited, and the industry-specific factors that shape ML acceptance in this sector have not been systematically examined. Two external variables are particularly relevant to life science professionals. First, the bibliometric journal impact factor (JIF) functions as a cognitive signal of scientific credibility in a sector where evidence-based decision-making is culturally embedded and publication quality serves as a proxy for technological legitimacy (Garfield, 1996). Second, technology hype, operationalised through the Gartner Hype Cycle framework, represents a social influence variable that shapes organisational expectations and investment decisions around emerging technologies (Gartner Inc., 2018). Whether these variables influence ML acceptance among life science professionals, alongside individual knowledge and experience, has not been empirically tested.
This study addresses that gap by investigating ML technology acceptance among 213 biomedical and life science professionals across EMEA, LATAM, and North America, using a cross-sectional quantitative survey and PLS-SEM analysis. The TAM model is extended with three external variables, JIF, technology hype, and prior knowledge and experience, to test their influence on PU and PEU in this specific professional context. Additionally, the study examines demographic and regional differences in ML acceptance, with particular attention to variation between academic researchers and healthcare professionals. The findings contribute a validated, sector-specific extension of TAM for life sciences, provide actionable insights for organisations seeking to accelerate ML implementation, and establish a framework for future subsector-specific research.
Edwards, P. J.; Caddick, B.; Skeen, A.; Lin, J.; Ridd, M. J.; Barnes, R. K.; Salisbury, C.
Background In 2024, one-third of GP appointments in England were conducted by telephone. What happens during these consultations is largely unknown. Aim To test the feasibility of collecting recorded GP telephone consultations with linked data and consent for future research use. Design and setting Retrospective observational study in seven practices in South West England. Method Adults who had a telephone consultation at practices that routinely record calls were invited to consent to retrieval of call audio, a 4-month electronic health record (EHR) extract and a post-consultation patient questionnaire. Practice-level consent rates were analysed using regression models. Results Of 28 clinicians recruited, 19 GPs had consultations with patients whose recordings were retrievable, usable, and consented for future research. Of 2,053 invitations, 123 patients consented (6.0%). Consent was lower in more deprived practices (IMD 1-2 vs 9-10: OR 0.22, 95% CI 0.09-0.54). Of 101 recordings retrieved, 96 were usable and 91 had consent for future research. Of these 91, 86 were linked to EHRs and 89 to post-consultation patient questionnaires. Mean consultation duration was 7 minutes 13 seconds; audible typing was heard in 69% (63/91). A total of 161 problems were discussed (mean 1.77 per consultation). Most patients were happy their consultation was by telephone (96/117, 82%), although the majority reported usually preferring face-to-face appointments (68/115, 59%). Conclusion It is feasible to assemble a reusable archive of GP telephone consultations with linked data. However, recruitment was low using retrospective remote consent. Future work should test alternative recruitment approaches, particularly to improve patient engagement at practices serving deprived populations.
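The proportions reported in this abstract can be recomputed from the counts it gives. The short Python sketch below does only that arithmetic; the helper names `pct` and `mean_per` are illustrative and not part of the study, and no data beyond the counts quoted in the abstract are used.

```python
# Recompute the percentages and the mean quoted in the abstract
# from the raw counts it reports. Helper names are hypothetical.

def pct(numerator: int, denominator: int, digits: int = 0) -> float:
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)

def mean_per(total: int, units: int, digits: int = 2) -> float:
    """Mean count per unit, rounded."""
    return round(total / units, digits)

consent_rate = pct(123, 2053, digits=1)   # patients consenting / invitations
typing_heard = pct(63, 91)                # consultations with audible typing
happy_phone = pct(96, 117)                # happy the consultation was by telephone
prefer_f2f = pct(68, 115)                 # usually prefer face-to-face
problems_mean = mean_per(161, 91)         # problems discussed per consultation

print(consent_rate, typing_heard, happy_phone, prefer_f2f, problems_mean)
```

Each value matches the figure given in the abstract (6.0%, 69%, 82%, 59%, and 1.77 respectively).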
Yang, Y.; Peracchio, L.; Mayourian, J.; Miller, T.; La Cava, W.
Background Artificial intelligence-enhanced electrocardiography (AI-ECG) enables scalable, low-cost cardiac dysfunction screening, but existing models are annotation-intensive and predominantly adult-derived, leaving paediatric generalizability uncertain. Paediatric cohorts exhibit highly variable cardiac morphology and function compared to adults, which may be useful for learning generalizable AI-ECG models. Methods We pretrained ECG-Fyler on a predominantly paediatric, all-age cohort at Boston Children's Hospital (1992-2023), annotated with a cardiology-specific coding system (Fyler codes), and evaluated it on assessments from echocardiography (echo) and cardiac magnetic resonance (CMR) studies. We validated on an external adult cohort from Columbia University Irving Medical Center. Performance was benchmarked against several AI-ECG foundation models by AUROC across age groups, lesion types, and limited-data scenarios. Findings The pretraining cohort comprised 782,138 ECGs from 255,271 patients (median age: 10.9 years, IQR: [2.8-16.8]). Internal evaluation included 178,495 ECG-echo pairs (median age: 10.9 [3.7-17.0]) and 8,584 ECG-CMR pairs (median age: 20.7 [15.6-29.6]). External validation included 82,543 ECG-echo pairs from adults (median age: 64.0 [52.0-74.0]). ECG-Fyler improved AUROC across biventricular dysfunction and dilation tasks, with the largest gains in low-data settings. In internal validation, ECG-Fyler detected low left ventricular ejection fraction (LVEF ≤ 40%) from only 100 fine-tuning samples (AUROC: 0.80, 95% CI: [0.78-0.80]), outperforming other models (AUROC < 0.65) and improving with additional fine-tuning (AUROC: 0.94 [0.93-0.94]). Similar improvements were observed for CMR-derived LVEF, RVEF, and ventricular dilation. In external validation on adults, ECG-Fyler exhibited an AUROC of 0.83 (CI: [0.82-0.85]) for LVEF ≤ 40%.
After fine-tuning on less than 10% of external data, LVEF ≤ 45% performance (AUROC: 0.87 [0.86-0.88]) outperformed a fully trained, site-specific prior model (AUROC: 0.85 [0.84-0.87]). Interpretation Pretraining on richly annotated, paediatric-dominant ECGs yields models that transfer efficiently across institutions and ages, supporting AI-ECG screening and triage when labels or imaging access are limited. Funding National Institutes of Health (R01LM012973); Kostin Innovation Fund, Boston Children's Hospital
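For readers less familiar with the benchmark metric used in this abstract, AUROC can be read as the probability that a randomly chosen positive case (for example, LVEF ≤ 40%) receives a higher model score than a randomly chosen negative case, with ties counted as half. The Python sketch below illustrates that definition only; the function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study, and this is not the authors' evaluation code.

```python
# Minimal pairwise (Mann-Whitney) definition of AUROC.
# Toy inputs are invented for illustration and carry no clinical meaning.

def auroc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = low LVEF, 0 = normal; scores are model outputs.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auc = auroc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

The pairwise loop is quadratic in sample size, so a rank-based implementation would be preferable for cohorts as large as those described here; the loop is kept because it mirrors the definition directly.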
Alahdab, F.; Mittendorfer, B.
Objective: To estimate the adjusted relative risk (RR) of administrative grant disruption faced by first-time and mechanism-first principal investigators (PIs) during the 2025 National Institutes of Health (NIH) grant disruptions. Design: Retrospective cohort study linking NIH RePORTER data to a publicly curated registry of grants disrupted in 2025. Setting: All NIH active research grants in fiscal years 2024 to 2025. Participants: 80,976 active projects: 4,961 disrupted during the wave that peaked in May 2025, 76,015 non-disrupted controls. Main outcome measures: Adjusted RR of disruption by two pre-specified first-time PI constructs: absolute first-time PI (no prior NIH grant) and mechanism-first PI (no prior NIH grant with the same activity code). Modified Poisson regression with institution-clustered standard errors adjusted for project, institutional, and geographic covariates. A pre-specified fiscal year 2024 common-anchor analysis addressed year-of-disruption confounding. Results: Of 4,961 disrupted grants, 237 (4.8%) had an absolute first-time PI and 396 (8.0%) had a mechanism-first PI. After adjustment, absolute first-time PIs faced 77% elevated risk of disruption (RR 1.77, 95% CI 1.34 to 2.32) and mechanism-first PIs faced 57% elevated risk (RR 1.57, 1.16 to 2.11). Under the common-anchor analysis, the absolute first-time effect attenuated to RR 1.22 (0.95 to 1.58); the mechanism-first effect persisted (RR 1.48, 1.07 to 2.06). The elevated risk was concentrated in research-mechanism grants (RR 1.78, 1.26 to 2.52) and was robust across 8 of 9 pre-specified sensitivity analyses. The Track A start-time construct, which asks whether the disrupted project was the PI's debut grant, yielded null estimates (RR 0.98, 0.93 to 1.04), with any effect concentrated entirely in newly started projects. 
Conclusions: First-time and mechanism-first PIs faced disproportionately elevated risk of disruption during the 2025 NIH wave, concentrated in research-mechanism grants and robust to year-confounding-free identification. The relevant exposure was being early-career at the moment of administrative action, not at project initiation. The findings have direct implications for workforce equity in US biomedical research.
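The descriptive figures in this abstract follow from simple arithmetic on the reported counts and relative risks. The Python sketch below recomputes them; the helper names are illustrative, only numbers quoted in the abstract are used, and it does not reproduce the adjusted modified Poisson models themselves.

```python
# Recompute cohort size, subgroup shares, and the "percent elevated risk"
# reading of each relative risk (RR). Helper names are hypothetical.

disrupted = 4961
controls = 76015
total_projects = disrupted + controls

def share_pct(n: int, d: int) -> float:
    """Share of n in d as a percentage with one decimal place."""
    return round(100 * n / d, 1)

def excess_risk_pct(rr: float) -> int:
    """Relative risk expressed as percent increase over the reference group."""
    return round((rr - 1) * 100)

absolute_first_share = share_pct(237, disrupted)
mechanism_first_share = share_pct(396, disrupted)

print(total_projects)                                # cohort size
print(absolute_first_share, mechanism_first_share)   # shares among disrupted grants
print(excess_risk_pct(1.77), excess_risk_pct(1.57))  # adjusted RRs as % increase
```

The outputs agree with the abstract: 80,976 projects, 4.8% and 8.0% of disrupted grants, and 77% and 57% elevated risk.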
Wang, Y.; He, H.; Zhu, R.; Lu, Y.; Phadungsaksawasdi, P.; Peng, M.; Liu, Z.; Zou, K.; Zhang, Y.; Chew, S. P.; Tham, Y. C.; Khorasani, A.; Deng, H.; Cheng, C.-Y.; Yang, J.; Liu, D.
Background Patients worldwide receive healthcare in many languages, yet medical AI systems are validated almost exclusively in high-resource languages such as English and Chinese, exposing patients in other linguistic settings to unquantified diagnostic risk. Existing multilingual evaluations rely on translated research-style benchmarks that fail to capture authentic clinical work. We aimed to characterise the patient safety consequences of multilingual medical AI deployment in real-world clinical settings and to develop an auditable detection method for unsafe outputs. Methods We evaluated different language models (LLMs) and visual language models (VLMs) across four real-world clinical tasks (conversational QA, radiology report generation, glaucoma diagnosis, ICU re-intubation prediction) in five languages (English, Chinese, Malay, Thai, Persian). We developed a token-level uncertainty toolkit to localise reasoning instability, compared three inference paradigms (native-language, English chain-of-thought, back-translation pivot), and conducted a prospective study (50 dialogues, 150 physician-reviewed records). Findings LLM/VLM performance degraded consistently from high- to low-resource languages across all tasks. Key gaps included: HealthBench score declining from 0.3743 to 0.3180; radiology macro-F1 from 0.2938 to 0.2149-0.2424, consistent with selective pathology suppression; glaucoma accuracy from 50.7% to 32.7%; ICU parameter recall from 100.0% to 48.5%. Multimodal inputs amplified degradation. Qwen3 VL 235B showed attenuated decline with no resource-ordered pattern in glaucoma classification. Token-level analysis localised instability to mid-chain stages (40-70% of the normalised trajectory); perplexity-based confidence failed to flag errors (AUROC 0.41-0.66). Back-translation pivot consistently restored performance. 
In the prospective study, 98.7% of records required physician edits (overall modification score 53.6%); Thai-pivot correction burden (59.0%) exceeded English-pivot (50.7%, p=0.003) and Chinese-direct (51.0%, p=0.004). Interpretation Multilingual deployment produced clinically consequential failures, including missed pathology, distorted physiological extraction, and amplified multimodal misclassification, that were invisible to monolingual validation and not reliably flagged by model confidence. Pretraining data composition may contribute to multilingual safety risk. Language-specific safety auditing should precede deployment in non-dominant-language healthcare settings; the open-source detection toolkit enables this without model retraining.