Diagnostics
MDPI AG
Preprints posted in the last 90 days, ranked by how well they match Diagnostics's content profile, based on 48 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is already an above-average fit.
Pham, H. T.; Bussey, K. J.; Oshiro, M. M.; Rounseville, M.; Moses, M.; Zulbaran-Rojas, A.; Nguyen, V.; Bernert, R. A.; Routh, J.; Watts, G.; Block, G. D.; Fisher, W. E.; Nelson, M. A.
Context: Pancreatic ductal adenocarcinoma (PDAC) is an aggressive malignancy often diagnosed at advanced stages due to the lack of early clinical symptoms. DNA methylation alterations arise early in PDAC tumorigenesis and may serve as promising biomarkers for blood-based cancer detection. Objective: To evaluate the performance of EPISEEK, a laboratory-developed blood-based multi-cancer early detection (MCED) assay, for detecting PDAC across disease stages. Design: A retrospective cohort study included 97 patients with stage I-IV PDAC and 201 asymptomatic healthy controls. Sensitivity, specificity, area under the curve (AUC), and stage-specific performance were assessed. EPISEEK-MCED was compared with CA 19-9 alone and was also evaluated in combination with CA 19-9. Results: EPISEEK-MCED classified 65 of 97 PDAC cases as positive, corresponding to an observed sensitivity of 70.1% (95% CI, 60.3% - 78.3%) at 99.5% specificity. The assay demonstrated strong discrimination between PDAC cases and healthy controls, with an AUC of 0.916 (95% CI, 0.88 - 0.952). Sensitivity increased with advancing stage while remaining substantial in early-stage disease, measuring 53.6% for stage I, 65.1% for stage II, 100% for stage III, and 94.7% for stage IV PDAC. Across stages, EPISEEK-MCED outperformed CA 19-9 alone, particularly in early-stage disease. Combined analysis of EPISEEK-MCED and CA 19-9 further improved detection performance, achieving sensitivity of 57.1% and 81.4% for stage I and II, respectively. Conclusions: EPISEEK-MCED demonstrated high specificity and sensitivity for PDAC detection across disease stages, including early-stage disease. Combining EPISEEK-MCED with CA 19-9 further improved performance, supporting its clinical utility for PDAC detection.
Sidiropoulou, Z.; Santos, C.
Rationale and Objectives: Published estimates of benign breast disease (BBD) are derived mainly from clinical, surgical, screening-recall, or reduction-mammoplasty series. Forensic autopsy cohorts can reduce referral and symptom-selection bias, although they are not necessarily representative of the whole living population. We describe imaging-detected benign breast findings in the Sisyphus forensic autopsy cohort. Materials and Methods: Consecutive medico-legal autopsies of individuals aged 40 years or older were prospectively evaluated over a multi-year period at a medico-legal autopsy service in Portugal. Bilateral breast specimens obtained by subcutaneous modified radical mastectomy were examined with specimen digital mammography and ultrasonography. Findings were classified according to BI-RADS terminology. Lesions requiring tissue diagnosis in the post-mortem protocol underwent wire-guided or direct excisional biopsy. Female cadavers were analysed as the primary cohort; male cadavers were analysed separately as an exploratory subgroup. Proportions are reported with exact 95% confidence intervals (CIs). Results: The cohort included 291 cadavers: 217 women and 74 men. Among female breast specimens, 236/434 were BI-RADS 1 (54.4%; 95% CI, 49.6-59.1), 189/434 were BI-RADS 2 (43.5%; 95% CI, 38.8-48.4), and 8/434 were protocol-sampled suspicious findings (1.8%; 95% CI, 0.8-3.6). At the cadaver level, 99/217 women had at least one benign imaging finding (45.6%; 95% CI, 38.9-52.5). Mammographic benign findings were present in 91/217 women (41.9%; 95% CI, 35.3-48.8), dominated by calcifications; ultrasonographic benign findings were present in 51/217 (23.5%; 95% CI, 18.0-29.7), most often simple cysts and duct ectasia. Plasma cell mastitis-pattern calcifications were observed in 8/217 women (3.7%; 95% CI, 1.6-7.1). Male benign findings were less frequent (9/74, 12.2%; 95% CI, 5.7-21.8) and were dominated by benign lymph-node variants.
All nine protocol-sampled lesions were benign at histology. Clinical breast examination identified 5/8 protocol-sampled female lesions (62.5%; 95% CI, 24.5-91.5). Conclusion: In this forensic autopsy cohort unselected for breast symptoms, benign imaging findings were common in women aged 40 years or older and less frequent in men. The results provide descriptive post-mortem imaging reference data, but lesion-specific estimates, especially for rare entities, should be interpreted with caution because of small numerators, the older age profile, limited clinical history, and the original cancer-focused design of the Sisyphus study.
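The abstract above reports "exact" 95% confidence intervals without naming the method. The sketch below assumes the Clopper-Pearson interval, a common choice for exact binomial intervals, and finds its bounds by bisection on the binomial CDF using only the standard library. The function names are illustrative and do not come from the study.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed term by term.

    Adequate for the sample sizes in this abstract (n in the hundreds);
    very large n would need a log-space or library implementation.
    """
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f: Callable[[float], float], target: float) -> float:
    """Bisection for f(p) = target on (0, 1), where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid  # p is still too small
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Two-sided Clopper-Pearson interval for k successes out of n trials."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1.0 - alpha / 2.0
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2.0
    )
    return lower, upper


if __name__ == "__main__":
    low, high = clopper_pearson(99, 217)  # women with >= 1 benign finding
    print(f"99/217: {100 * low:.1f}% to {100 * high:.1f}%")
```

Under this assumption, the interval for 99/217 should come out close to the reported 38.9-52.5, and the wide interval for 5/8 shows why the authors urge caution with small numerators.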
Horaguchi, T.; Nomura, R.; Sakai, S. A.; Saito, N.; Kurihara, K.; Ohira, M.; Takaha, R.; Mitsui, N.; Yokoi, R.; Hatanaka, Y.; Hayashi, H.; Kuno, M.; Fukada, M.; Sato, Y.; Yasufuku, I.; Asai, R.; Bando, H.; Yamashita, R.; Matsuhashi, N.
Purpose: In this study, we aimed to develop and evaluate an artificial intelligence-based model for diagnosing acute cholecystitis (AC) using non-contrast CT images and clinical data. Materials and Methods: This retrospective study included 199 patients (100 AC, 99 non-AC) treated between January 2016 and December 2025 at a single center. Patients were randomly divided into training (n=139) and test (n=60) datasets. Three models were constructed: an imaging-based deep learning model, a clinical data-based machine learning model, and a hybrid machine learning model integrating deep learning-derived imaging features with clinical data. CT images were preprocessed, and gallbladder regions were segmented. Clinical variables included white blood cell counts and levels of C-reactive protein and liver function markers. Model performance was evaluated using accuracy, precision, recall, specificity, F1 score, and area under the receiver operating characteristic curve (AUC). Statistical comparisons were performed using Welch's t-test and the chi-square test. Results: The imaging-based model achieved accuracy 0.883, precision 0.848, recall 0.933, specificity 0.833, and AUC 0.916. The blood-based model achieved accuracy 0.917, precision 0.931, recall 0.900, specificity 0.933, and AUC 0.949. The hybrid model showed the highest performance, with accuracy 0.950, precision 0.909, recall 1.000, specificity 0.900, F1 score 0.952, and AUC 0.986. Conclusion: A hybrid model integrating CT imaging and clinical data improved diagnostic performance for AC compared with single-modality models.
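The metrics listed in this abstract all derive from a 2x2 confusion matrix. The sketch below shows those definitions. The example counts are an assumption: the abstract does not give the test-set confusion matrix, and the values used here (a 30/30 class split with 3 false positives and no false negatives) are only one set of counts that happens to reproduce the hybrid model's reported figures.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Confusion:
    """Counts from a binary classifier evaluated on a labelled test set."""
    tp: int  # AC correctly flagged
    fp: int  # non-AC flagged as AC
    tn: int  # non-AC correctly cleared
    fn: int  # AC missed


def metrics(c: Confusion) -> Dict[str, float]:
    """Standard threshold-dependent metrics; AUC needs raw scores, not counts."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)          # also called sensitivity
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts (not stated in the abstract), n = 60.
    hybrid = Confusion(tp=30, fp=3, tn=27, fn=0)
    for name, value in metrics(hybrid).items():
        print(f"{name}: {value:.3f}")
```

With a test set of 60, each misclassified case shifts accuracy by about 1.7 percentage points, which is worth keeping in mind when comparing the three models.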
Mao, S.; Sahli, A. J.; Buoy, S. N.; Hutcheson, C.; Gelabert, G. A.; Barbon, C. E. A.; Naser, M. A.; Fuller, C. D.; Brock, K. K.; Hutcheson, K. A.
Purpose: Modified Barium Swallow (MBS) studies utilize videofluoroscopy, a dynamic X-ray technique for evaluating swallowing anatomy and physiology. Each MBS exam typically includes multiple bolus trials, often involving different bolus consistencies. Accurate classification of bolus types is essential, as swallowing dynamics, aspiration risks, and residue levels vary with bolus consistency. In this preliminary study, we propose a deep learning-based approach for automated bolus type classification in MBS, aiming to provide a standardized and efficient framework for automated processing of swallowing assessments. Methods: A total of 206 patients (Mean +/- SD age: 60.24 +/- 9.02 years; 89.32% men) underwent MBS examinations, comprising 277 individual MBS studies. The dataset included 2,752 bolus-level video segments, categorized by bolus type as follows: 1,711 liquid (IDDSI 0-3, 62.17%), 521 pudding (IDDSI 4, 18.93%), and 520 solid boluses (IDDSI 7, cookie or cracker, 18.89%). To standardize variable video lengths for the data pipeline, each MBS video was temporally segmented into a fixed-length frame sequence, with shorter videos padded using static frames and longer videos randomly cropped to the target length. We employed an Inflated 3D convolutional neural network to develop the deep learning model. Results: Each video segment contained an average of 273.03 +/- 195.81 frames. On the independent test set, the deep learning model achieved an overall accuracy of 96.13%, and the macro F1-score was 95.05% in classifying food bolus types within MBS videos. Conclusions: The developed AI-based system demonstrated effective automated classification of food bolus types in MBS videos, representing an important step toward fully automated MBS analysis for swallowing efficiency assessment. The AI model reduces the reliance on manual labels, thereby promising to streamline clinical and research workflows.
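The length-standardization step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not state the target length or how the "static frames" were produced, so the sketch assumes padding by repeating the final frame, and the name `fix_length` is hypothetical.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def fix_length(
    frames: Sequence[Frame],
    target_len: int,
    rng: Optional[random.Random] = None,
) -> List[Frame]:
    """Return exactly target_len frames.

    Longer clips get a random contiguous crop; shorter clips are padded
    by repeating the last frame (an assumed reading of "static frames").
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if not frames:
        raise ValueError("frames must not be empty")
    rng = rng or random.Random()
    n = len(frames)
    if n >= target_len:
        start = rng.randint(0, n - target_len)  # inclusive on both ends
        return list(frames[start:start + target_len])
    return list(frames) + [frames[-1]] * (target_len - n)


if __name__ == "__main__":
    clip = list(range(10))  # stand-in for 10 decoded video frames
    print(fix_length(clip, 16))                    # padded to 16
    print(fix_length(clip, 4, random.Random(0)))   # 4 consecutive frames
```

Random cropping is normally a training-time augmentation; an evaluation pipeline would more likely use a deterministic crop, which the abstract does not specify either way.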
Luo, Y.; Zhang, X.; Li, R.; Zeng, Y.; Zhao, Y.; Li, L.; Qian, B.; Xiao, Y.; Li, M.; Zhao, Y.; Xu, S.; Yang, Q.; Zhang, H.; Chen, H.; Lu, C.; Lan, X.; Liu, C.
Assessment of pathologic complete response (pCR) following neoadjuvant chemotherapy (NAC) remains an unmet clinical need in breast cancer. Fibroblast activation protein inhibitor (FAPI) PET targets the tumor microenvironment and may therefore enhance response evaluation after NAC. This study aimed to compare the performance of [68Ga]Ga-FAPI-04 PET, [18F]FDG PET, and contrast-enhanced MRI for predicting pathologic response after NAC in breast cancer, with separate analyses for primary breast lesions and axillary lymph nodes. Methods: In this prospective single-center diagnostic accuracy study, women with biopsy-confirmed stage II-III breast cancer underwent baseline and post-therapy [68Ga]Ga-FAPI-04 PET/MRI, [18F]FDG PET/CT, and contrast-enhanced MRI before surgery. Quantitative PET parameters were evaluated for primary tumors and axillary lymph nodes. pCR was defined as ypT0/isN0. Significant variables identified in univariable analyses were further explored using least absolute shrinkage and selection operator (LASSO) analysis, and receiver-operating-characteristic (ROC) analysis was performed to assess diagnostic performance. Fibroblast activation protein expression was also assessed by immunohistochemistry in paired pre- and post-therapy tumor specimens from a subset of patients. Results: Twenty-four patients completed the study protocol, yielding 25 primary lesions and 44 metastatic lymph nodes across 27 axillary compartments. Overall patient-level pCR was achieved in 13 of 24 patients (54.17%). The lesion-level pCR rate was 60.00% (15/25) for primary breast lesions, and the node-level pCR rate was 72.73% (32/44) for axillary lymph nodes. For primary tumor response, post-therapy [68Ga]Ga-FAPI-04 SUVmax showed the highest diagnostic performance (AUC, 0.84; sensitivity, 80.00%; specificity, 80.00%; accuracy, 80.00%), whereas the optimal [18F]FDG parameter was ΔTBR% (AUC, 0.747).
For nodal response, post-therapy [68Ga]Ga-FAPI-04 SULmean showed the highest diagnostic performance (AUC, 0.89; sensitivity, 91.67%; specificity, 81.25%; accuracy, 84.09%) and was significantly different from the best [18F]FDG parameter (ΔSULmax%; AUC, 0.669) on DeLong testing (P < 0.05). MRI achieved AUCs of 0.733 for primary lesions and 0.770 for lymph nodes. Stromal FAP expression positively correlated with [68Ga]Ga-FAPI-04 SUVmax and was markedly reduced in lesions achieving pCR. Conclusion: Post-therapy [68Ga]Ga-FAPI-04 PET may serve as a promising adjunctive imaging biomarker for predicting pathologic response after NAC in breast cancer, particularly for axillary nodal assessment. These findings suggest that FAPI PET may provide clinically relevant information for preoperative evaluation of residual disease burden, potentially contributing to more individualized surgical planning and treatment decision-making.
Ueda, Y.; Okazaki, T.; Isome, H.; Patel, A.; Ichimasa, T.; Asaumi, R.; Kawai, T.; Suyama, K.; Hayashi, S.
Background: Vertebral artery calcification (VAC), a critical indicator of cerebrovascular disease, is often overlooked in head-and-neck imaging. Manual detection is time-consuming and prone to inter-observer variability. This study aimed to develop and validate a deep learning model for automated detection and quantitative risk assessment of VAC in non-contrast head-and-neck computed tomography (CT) images, bridging the diagnostic gap between dentistry and vascular medicine. Methods: We developed a deep learning model based on the ResNet-18 architecture, designated as Grayscale ResNet, optimized for single-channel CT images. The development followed a two-phase strategy: initial training on 539 axial images from head-and-neck CT images, followed by iterative refinement (fine-tuning) using a targeted dataset of clinically significant cases to ensure generalizability. The model's performance was evaluated using patient-level receiver operating characteristic (ROC) analysis and saliency map visualization for clinical interpretability. Results: The optimized model demonstrated robust performance in distinguishing between cases with and without VAC. In the independent cohort, the model achieved an area under the curve (AUC) of 0.846. At a specific threshold value (98.6%), the system yielded a sensitivity of 80.0% and a specificity of 90.6%. A saliency map analysis confirmed that the model consistently focused on anatomically relevant vascular regions. Conclusions: The proposed automated system provides an accurate and reliable method for VAC screening using routine head-and-neck CT scans. By transforming incidental imaging findings into a quantifiable risk index, this tool can serve as a vital decision-support system for dentists and radiologists, facilitating early patient referrals and contributing to global stroke prevention.
Hu, Y.; Shui, Y.; Li, W.; Liang, J.; Song, Y.; Wang, M.; Zhang, F.; Zhang, M.; Wang, H.; Ji, L.; Li, M.; Wang, C.; Shao, N.; Kuang, X.; He, S.; Zhang, X.
Background: Immune-related adverse events (irAEs) involving the breast remain rarely reported. Purpose: To characterize clinical and imaging features of camrelizumab-associated breast lesions (CABLs). Materials and Methods: This retrospective dual-cohort study (October 2019 to February 2026) included 196 female patients. Cohort A comprised 180 non-breast cancer patients; Cohort B comprised 16 breast cancer patients receiving neoadjuvant camrelizumab. Baseline characteristics, treatment response, and CT/MRI features were compared between CABL-positive and CABL-negative groups using Mann-Whitney U and chi-square tests. Results: CABLs developed in 34.4% (62/180) of Cohort A and 93.8% (15/16) of Cohort B. CABL-positive patients were younger (median 50.5 vs 54.5 years; P = 0.006) and more often premenopausal (46.8% vs 26.3%; P = 0.009). The objective response rate was relatively high among CABL-positive patients: in Cohort A, the disease progression rate was lower in the CABL-positive group than in the CABL-negative group (3.2% vs 17.8%), whilst in Cohort B, the pathological complete response rate was 53.3% (8/15). On CT/MRI, CABLs were predominantly multiple (62.5%), with well-defined margins and unrestricted diffusion. The predominant time-intensity curve (TIC) pattern was washout (46.7%). Median time to onset was 2-3 cycles (the second MRI scan); during follow-up, most lesions either disappeared (40.3%) or shrank (46.8%). ADC values of lesions were significantly higher than those of primary tumors (1.847 ± 0.284 vs 0.976 ± 0.055 ×10⁻³ mm²/s; P < 0.001). Histopathology of four lesions revealed lymphocytic infiltration and fibrosis without malignancy. Conclusion: CABLs are benign reactive changes driven by multiple factors. Their recognition prevents misinterpretation as disease progression, thereby avoiding unnecessary treatment discontinuation or biopsy.
Veverkova, L.; Dolezalova, Z.; Marackova, V.; Mathew, E.; Urbankova, M.; Ambrozova, M.; Piskovsky, T.; Ngo, O.; Majek, O.
Objectives: The aim of mammographic screening is the early detection of invasive cancers. In the era of artificial intelligence (AI), this tool may improve diagnosis of earlier stages. The purpose of this study was to assess the impact on selected quality indicators retrospectively. Method: The data source was the Breast Cancer Screening Registry using data from one Screening Unit that currently uses AI routinely. The indicators of the cancer detection rate (CDR), further assessment rate (FAR), and recall rate (RR) in the year 2023, when AI was used, and the year 2022, without AI, in women aged 45-69 were compared. The statistical evaluation used the chi-square test and logistic regression adjusting for the effects of age, a woman's risk level, and the screening round at a 5% significance level. Results: In 2022, without AI, 4,034 women aged 45-69 were included, compared with 4,049 women in 2023 when AI was used. This study showed a non-significant increase in CDR from 5.0 breast cancers detected per 1,000 women (non-AI assessment) to 5.2 (AI-assisted assessment), p = 0.919; OR (95% CI): 1.034 (0.542-1.974), a significant decrease in the FAR from 5.2% to 3.9%, p < 0.001; OR (95% CI): 0.665 (0.529-0.836), and a decrease in RR from 2.4% to 1.9%, p = 0.083; OR (95% CI): 0.754 (0.548-1.037). Conclusion: AI has the potential to be a useful tool in the early detection of breast cancer by improving quality through a decrease in FAR and RR, while probably maintaining CDR.
Obeti, F.; Asiku, R. A.
Background: Hepatocellular carcinoma (HCC) is a leading cause of cancer-related mortality worldwide, with particularly severe consequences in sub-Saharan Africa where access to advanced diagnostic imaging remains limited. Ultrasound is the most widely available imaging modality in low-resource settings, yet its sensitivity for detecting early-stage HCC remains insufficient when used in conventional B-mode alone. Methods: We present a dual-path convolutional neural network (CNN) that jointly analyzes B-mode and contrast-enhanced ultrasound (CEUS) images for automated HCC detection. The model processes 1,057 labeled liver ultrasound images from 85 patients sourced from The Cancer Imaging Archive, a publicly available single-center dataset. A preprocessing pipeline extracts liver-centered regions of interest from heterogeneous DICOM files, including automatic separation of dual-panel B-mode and CEUS frames. Each imaging modality is processed through a dedicated ResNet-34 backbone initialized with ImageNet weights, and the resulting feature embeddings are fused through a late-fusion classification head. The model is evaluated using patient-wise five-fold cross-validation and a held-out 20% patient-level test set. Results: On the held-out test set, the model achieved 94.2% accuracy, 93.6% precision, 100% sensitivity, 83.3% specificity, and a 96.7% F1-score for binary HCC versus non-HCC classification. Cross-validation analysis showed consistently high discrimination across folds, with AUC values ranging from 0.93 to 0.98. Training dynamics indicated that early stopping typically activated between epochs seven and eleven, with validation loss closely tracking training loss and no evidence of severe overfitting under the chosen regularization scheme.
Conclusions: These findings demonstrate that a relatively lightweight multimodal CNN, trained on carefully preprocessed public data, can provide strong imaging-level discrimination between HCC and non-HCC findings within a single-center dataset. However, the small sample size, pronounced class imbalance, and single-center origin of the data preclude any claims of clinical utility at this stage. This work is a transparent, reproducible methodological baseline intended to support future multi-site validation, particularly in African and other low-resource clinical settings where ultrasound-based decision support could have the greatest impact.
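Patient-wise cross-validation, as named in the Methods, means that every image from a given patient falls into the same fold, so no patient contributes to both training and validation. The sketch below is one simple way to build such folds with the standard library. It is an illustration under assumptions: the function name, the round-robin assignment, and the seed are not taken from the study, and it does not balance folds by class or image count.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def patientwise_folds(
    patient_ids: Sequence[str], k: int = 5, seed: int = 0
) -> List[List[int]]:
    """Split image indices into k validation folds, grouped by patient.

    patient_ids[i] is the patient that image i belongs to.
    """
    by_patient: Dict[str, List[int]] = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)
    patients = sorted(by_patient)           # fixed order before shuffling
    if len(patients) < k:
        raise ValueError("need at least k distinct patients")
    random.Random(seed).shuffle(patients)   # reproducible assignment
    folds: List[List[int]] = [[] for _ in range(k)]
    for i, pid in enumerate(patients):
        folds[i % k].extend(by_patient[pid])  # whole patient goes to one fold
    return folds


if __name__ == "__main__":
    # Toy stand-in for 1,057 images from 85 patients.
    ids = [f"p{i % 85:02d}" for i in range(1057)]
    for n, fold in enumerate(patientwise_folds(ids)):
        print(f"fold {n}: {len(fold)} images, "
              f"{len({ids[i] for i in fold})} patients")
```

Splitting at the image level instead would let near-duplicate frames from one patient appear on both sides of the split and inflate the validation AUC.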
Jean, A.; Benillouche, P.; Jacques, T.
This study analyzes the adoption, barriers, and expectations of French radiologists regarding the use of Artificial Intelligence (AI) solutions in their daily practice. Despite a recognition of AI's potential to make radiology more precise, predictive, and personalized, its adoption remains limited. The main obstacles identified are the high cost of those solutions and the insufficient equipment of French imaging centers with AI technologies. Nevertheless, the survey reveals a strong willingness to adopt, with over 70% of radiologists expressing their desire to use AI and 0% declaring a refusal to use it. Furthermore, the radiologists' fears of being replaced by AI are very low (0 to 8.8%).
Solanki, S.; Solanki, N.; Prasad, J.; Prasad, R.; Harsulkar, A.
Background: Early breast cancer detection remains central to improving clinical outcomes, yet conventional screening pathways, particularly mammography, have recognized limitations in sensitivity, specificity, and performance in dense breast tissue. Circulating microRNAs (miRNAs) have emerged as promising minimally invasive biomarkers, while artificial intelligence and machine learning (AI/ML) offer powerful tools for identifying diagnostically relevant multi-marker patterns within complex biomarker datasets. This systematic review and meta-analysis evaluated the diagnostic performance of AI/ML-based circulating miRNA signatures for early breast cancer detection. Methods: A systematic search of PubMed/MEDLINE, Scopus, and Web of Science Core Collection was conducted from database inception to 31 December 2025. Studies were eligible if they were original human investigations evaluating circulating miRNAs using an AI/ML-based diagnostic model for breast cancer detection and reporting extractable diagnostic performance metrics. Study selection followed PRISMA 2020 and PRISMA-DTA guidance. Methodological quality was assessed using QUADAS-2. Pooled sensitivity and specificity were synthesized using a bivariate random-effects model, and overall diagnostic performance was summarized using a hierarchical summary receiver operating characteristic framework. Results: Seven studies met the inclusion criteria for qualitative synthesis, with eligible studies contributing to the quantitative analysis depending on data availability. Across the pooled analysis, AI/ML-based circulating miRNA models demonstrated good overall diagnostic performance, with a pooled AUC of 0.905 (95% CI: 0.890 to 0.921), pooled sensitivity of 81.3% (95% CI: 76.8% to 85.2%), and pooled specificity of 87.0% (95% CI: 82.4% to 90.7%). Heterogeneity was moderate for AUC (I² = 42.3%) and sensitivity (I² = 38.7%) and low for specificity (I² = 28.4%).
Risk-of-bias assessment showed overall low-to-moderate methodological concern, with patient selection representing the most variable domain. Deeks' funnel plot asymmetry test showed no significant evidence of publication bias (p = 0.34). Conclusions: AI/ML-based circulating miRNA signatures show promising diagnostic accuracy for early breast cancer detection and may have value as noninvasive adjunctive tools within imaging-supported diagnostic pathways. However, the evidence base remains limited by methodological heterogeneity, variable validation rigor, and the predominance of retrospective case-control designs. Prospective, standardized, and externally validated studies are needed before routine clinical implementation can be justified.
Xu, Y.; Heacock, L.; Park, J.; Pasadyn, F. L.; Lei, Q.; Lewin, A.; Geras, K. J.; Moy, L.; Schnabel, F.; Shen, Y.
Background: Imaging-based breast cancer risk prediction models primarily use full-field digital mammography (FFDM). As digital breast tomosynthesis (DBT) has become a predominant screening modality in the United States, its potential for long-term breast cancer risk prediction remains under-explored. Objective: To develop and evaluate a deep learning model that uses longitudinal DBT exams to predict long-term breast cancer risk. Methods: This retrospective study included 313,531 DBT exams from 161,165 women (mean age, 58.5 years; SD, 11.7 years) between January 2016 and August 2020 at Institute A. A risk prediction (DRP) model was developed to estimate 2-5 year breast cancer risk using longitudinal DBT exams, patient age, and breast density. Model performance was compared with a single-time-point DBT model, the Mirai model using same-day FFDM, and the Tyrer-Cuzick model using the area under the receiver operating characteristic curve (AUC), time-dependent concordance index, and integrated Brier score. Results: In an independent test set (n = 34,580), the longitudinal DRP model achieved a 5-year AUC of 0.720 (95% CI, 0.703-0.738), improving on the single-time-point DRP model (AUC, 0.706; 95% CI, 0.687-0.724; p < 0.001) and the Mirai model (AUC, 0.687; 95% CI, 0.668-0.705; p < 0.001). In a matched case-control cohort (n = 432), the DRP model achieved a 5-year AUC of 0.676 (95% CI, 0.626-0.727), compared with 0.567 (95% CI, 0.514-0.621; p < 0.001) for the Tyrer-Cuzick model. The model reclassified 37.6% (705/1,877) of women with extremely dense breasts as average risk, with a 5-year cancer incidence of 0.7% (5/705), and identified 15.5% (404/2,605) of women with fatty breasts as high risk, with a 5-year cancer incidence of 2.5% (10/404). Conclusion: A deep learning model using longitudinal DBT examinations improved long-term breast cancer risk prediction compared with FFDM-based and clinical risk models.
Clinical Impacts: Longitudinal DBT-based risk prediction may enable dynamic risk assessment using screening images, supporting personalized screening strategies and more targeted use of supplemental imaging.
Hovda, T.; Sober, S.; Padrik, P.; Kruuv-Kao, K.; Grindedal, E. M.; Vamre, T. B. A.; Eikeland, E.; Hofvind, S.; Sahlberg, K. K.
Background: Population-based mammographic screening is primarily age-based. However, breast cancer risk is multifactorial, and women may benefit from personalized risk-based screening. This pilot study aimed to explore the use of polygenic risk score (PRS) as a tool for risk stratification in personalized screening. Methods: We included 80 women aged 40-49 years referred for clinical mammography. Exclusion criteria were prior breast cancer or premalignant breast disease, and previous genetic testing. After DNA collection, PRS was calculated from 2,805 single-nucleotide polymorphisms (SNPs). Screening recommendations were based on each participant's relative 10-year breast cancer risk estimated from PRS and compared with the 10-year risk of an average woman of the same age. Women with a self-reported family history of cancer meeting standard criteria were referred for gene panel testing for pathogenic variants in high-risk genes. A follow-up questionnaire regarding participants' experiences was distributed 6-9 months after PRS testing. Results: Mean age was 45.2 years (SD 2.8). Mean relative 10-year breast cancer risk was 1.18 (SD 0.57). Based on PRS, 40 participants were recommended standard biennial screening at 50-69 years, while 40 were advised to begin biennial screening before age 50. Among these, 7 were recommended annual mammography from when their 10-year risk reached twice that of an average 50-year-old. Twenty-one women underwent gene panel testing; no pathogenic variants in breast cancer genes were identified. Five women were advised annual mammography from 40-60 years due to family history of breast cancer, regardless of PRS. Most respondents viewed breast cancer risk assessment positively and did not report increased anxiety after testing. Conclusions: Polygenic risk score testing may influence current screening recommendations and contribute to more personalized risk-based breast cancer screening strategies.
Ludwig, K. D.; Hatt, C. R.; Keith, L.; Matyga, A. W.; Te, H. S.; Landeras, L.; Chelala, L.; Patel, A. R.; Chung, J. H.
Objective: Coronary artery calcification (CAC) assessment for cardiovascular risk stratification is traditionally achieved using ECG-gated computed tomography (CT). Automated deep-learning (DL) algorithms may streamline opportunistic CAC detection and scoring, particularly on non-gated CT scans. This study evaluated the performance of a fully automated DL-based CAC scoring algorithm ("DL-CAC") against expert human scoring. Methods: The algorithm was trained on 1,260 chest CT scans from multiple databases to automatically identify coronary calcium, calculate Agatston scores, and assign a cardiovascular disease (CVD) risk classification. Performance was assessed on a holdout dataset (n=500) comprising ECG-gated calcium scoring CT scans and lung cancer screening non-gated chest CTs as well as in an external, independent CT dataset (n=129) from liver transplant candidates. Agreement with expert scoring was assessed using intraclass correlation coefficient (ICC) for Agatston scores and Cohen's κ for CVD risk classification. Results: The algorithm demonstrated high agreement with expert scoring in the pooled calcium scoring and lung cancer screening cohorts, with an ICC of 0.947 for Agatston scores and κ of 0.936 for CVD risk classification. For liver transplant candidates, the algorithm exhibited substantial agreement with expert scoring of non-gated CT scans (κ=0.79) and a sensitivity of 90.4% and specificity of 96.4% in high-risk cases. Conclusion: These findings suggest that DL-based CAC scoring on non-gated CT scans may be a feasible alternative to traditional methods and could support opportunistic cardiovascular risk assessment in routine imaging. Further validation is warranted to assess clinical integration in broader practice settings.
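Agreement on risk categories in this abstract is summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below computes the unweighted form. Whether the study used a weighted or unweighted kappa is not stated, so the unweighted version is an assumption, and the category labels in the example are purely illustrative.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(
        count_a[label] * count_b[label] for label in count_a.keys() | count_b.keys()
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Illustrative labels only; not data from the study.
    algorithm = ["low", "low", "high", "high"]
    expert = ["low", "high", "high", "high"]
    print(cohens_kappa(algorithm, expert))
```

Because risk categories are ordered, a weighted kappa that penalizes distant disagreements more heavily is a common alternative; the unweighted form treats every disagreement equally.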
Singh, V.; Jhamb, A.; Sil, S.; Kumar, S.; Agrawal, C.; Pareek, A.; Gautam, A.; Parale, G.; Singh, S.; Padmanabhan, D.
Background: A critical radiologist shortage exists in India, leading to delayed chest radiograph (CXR) interpretation. This leads to disease progression, higher morbidity, and mortality. Artificial intelligence-based CXR interpretation by Lenek Intelligent Radiology Assistant (LIRA) is a promising solution. This study aims to establish the screening and triaging capabilities of LIRA by assessing its accuracy in detecting abnormalities and pathologies in CXRs from geographically diverse institutions. Methods: We conducted a retrospective multi-source validation of the diagnostic accuracy of LIRA for the detection of general abnormalities, tuberculosis, consolidation, pleural effusion, pneumothorax, and cardiomegaly. De-identified chest radiographs were input into LIRA models. The obtained interpretations were compared to the established ground truth reporting for the calculation of sensitivity, specificity, and AUROC with 95% CI for individual pathologies across varying probability thresholds. Results: LIRA demonstrated high sensitivity for general abnormality detection (AUROC 0.93-0.986, 84.4-97.1% sensitivity, 88.9-92.4% specificity) and tuberculosis triaging (Shenzhen & Montgomery: 88.5-89.7% sensitivity, 89.9-90.5% specificity; Jaypee: 98.7% sensitivity, 63.6% specificity). For consolidation (AUROC 0.884-0.895, 96.4-96.9% sensitivity, 70.8-77.1% specificity), pleural effusion (AUROC 0.942-0.967, 79.7-99.1% sensitivity, 81.2-87.7% specificity), pneumothorax (AUROC 0.87, 90.6-94.8% sensitivity, 79.5-82.7% specificity) and cardiomegaly (AUROC 0.883, 95.1% sensitivity, 81.6% specificity), the model exhibited commendable accuracy as well. Conclusions: The diagnostic performance of LIRA was consistent across various pathologies and chest radiographs from diverse geographic locations, with particular strengths in abnormality detection and tuberculosis screening.
The risk-stratified triaging and high sensitivity of LIRA make it a reliable adjunct solution to address radiologist shortages, reduce turnaround times, and support India's tuberculosis elimination goals.
de Boer, S.; Häntze, H.; Ziegelmayer, S.; van Ginneken, B.; Prokop, M.; Bressem, K. K.; Hering, A.
Background: Medical imaging, especially computed tomography and magnetic resonance imaging, is essential in clinical care of patients with renal cell carcinoma (RCC). Artificial intelligence (AI) research into computer-aided diagnosis, staging and treatment planning needs curated and annotated datasets. Across the literature, The Cancer Genome Atlas (TCGA) datasets are widely used for model training and validation. However, re-annotation is often necessary due to limited access to public annotations, raising entry barriers and hindering comparison with prior work. Methods: We screened 1915 CT scans from three TCGA-RCC databases and employed a segmentation model to annotate kidney lesions. After a meta-data-based exclusion step, we hosted a reader study with all papillary (n=56), chromophobe (n=27) and 200 randomly selected clear cell RCC cases. Two students quality-checked and corrected the data and annotated tumors and cysts. Uncertain cases were checked by a board-certified radiologist. Results: After data exclusion and quality control, a total of 142 annotated CT scans from 101 patients (26 female, 75 male, mean age 56 years) remained. These include 95 CTs with clear cell RCC, 29 with papillary RCC and 18 with chromophobe RCC. Images and voxel-level annotations of kidneys and lesions are open sourced at https://zenodo.org/records/19630298. Conclusion: By making the annotations open-source, we encourage accessible and reproducible AI research for renal cell carcinoma. We invite other researchers who have previously annotated any of these cohorts to share their annotations.
Yotsutsuji, S.; Kataoka, H.; Ando, T.; Inada, M.; Sugano, M.; Takada, M.; Esaki, M.; Kato, K.; Yamamoto, Y.; Sano, Y.
Background: For pancreatic cancer, practical blood-based tests for early detection and postoperative surveillance remain elusive. We sought to develop a qPCR-measurable serum microRNA (miRNA) panel that robustly discriminates pancreatic cancer from non-cancer controls and other malignancies. Methods: We profiled 255 serum miRNAs in batch 1 (n=72) and selected 27 candidates. Candidates were refined in batch 2 (n=552) and cross-batch evaluation was performed with batch 3 (n=391) to derive a miRNA model. Independent validation used batch 4 (n=616). Clinical relevance was assessed in an independent clinical cohort of resection patients with samples obtained preoperatively and at 1 and 12 months postoperatively. Results: The miRNA model trained on batches 2 and 3 achieved an area under the curve (AUC) of 0.91 and 0.83 for pancreatic cancer versus non-cancer controls and non-cancer plus other cancers, respectively, when independently validated in batch 4. Stage-wise AUCs in batch 4 were 0.91 (I), 0.94 (II), 0.86 (III) and 0.90 (IV). In the clinical batch, the score decreased postoperatively (preoperative vs month 1; p<0.01) and was higher in recurrence than non-recurrence (p<0.001). Conclusions: The developed compact miRNA qPCR assay discriminated pancreatic cancer across independent assay batches and showed clinical relevance for postoperative surveillance. Clinical Trial Registration: Not applicable.
Holen, A. S.; Larsen, M.; Hofvind, S.
Background and Objective: Increasing screening volumes, combined with a global shortage of radiologists and a high proportion of normal mammograms, challenge the efficiency and sustainability of breast cancer screening. Artificial intelligence (AI) has the potential to improve resource allocation, workflow efficiency and diagnostic performance by supporting and partially replacing radiologists in the interpretation process. This randomized, controlled, parallel-group, non-inferiority, single-blinded trial evaluates whether an AI-supported reading strategy, involving one or two radiologists depending on AI risk stratification, is non-inferior to standard independent double reading. The primary outcome is the number of screen-detected breast cancer cases in each group. Methods: Women invited to BreastScreen Norway in the Western, Central, and Northern Norway Regional Health Authorities are eligible for inclusion. Following written informed consent, participants are randomized 1:1 to the control group (standard independent double reading by two radiologists) or the intervention group. In the intervention group, mammograms are analyzed using Transpara. Examinations with AI scores of 1-7 are interpreted by a single radiologist, whereas examinations with scores of 8-10 undergo independent double reading. Radiologists are blinded to AI scores and AI image markings during the initial interpretation; this information is disclosed during consensus meetings. Non-inferiority will be assessed by estimating the confidence interval for the difference in screen-detected cancer rates between groups. Non-inferiority will be concluded if the upper bound of the confidence interval does not exceed the predefined non-inferiority margin. Conclusions: The trial addresses a critical challenge in breast cancer screening: maintaining diagnostic performance while improving efficiency in the context of workforce constraints and a high prevalence of normal examinations.
By evaluating a risk-stratified AI-supported reading strategy within a population-based screening program, the study will provide important evidence on whether AI can be safely integrated to optimize workload distribution while preserving cancer detection rates. Trial registration: The ClinicalTrials.gov registry (NCT06032390).
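The reading rule and the non-inferiority decision described in this protocol can be sketched as follows. The score cut-offs (1-7 single reading, 8-10 double reading) come from the abstract; everything else is an assumption made for illustration: the function names, the Wald confidence interval, the direction of the difference (control minus intervention), the margin and the example counts. The abstract does not state which interval method or margin the trial uses.

```python
import math


def readers_required(ai_score):
    """Intervention-arm rule from the abstract: AI scores 1-7 are read by one
    radiologist, scores 8-10 undergo independent double reading."""
    if not 1 <= ai_score <= 10:
        raise ValueError("AI score must be between 1 and 10")
    return 1 if ai_score <= 7 else 2


def non_inferior(cancers_control, n_control, cancers_ai, n_ai, margin, z=1.96):
    """Assumed decision rule: Wald CI for (control rate - intervention rate).

    Non-inferiority is concluded if the upper CI bound does not exceed the margin.
    """
    p_control = cancers_control / n_control
    p_ai = cancers_ai / n_ai
    diff = p_control - p_ai
    se = math.sqrt(p_control * (1 - p_control) / n_control
                   + p_ai * (1 - p_ai) / n_ai)
    upper = diff + z * se
    return upper <= margin, upper


# Hypothetical counts and margin, for illustration only.
ok, upper_bound = non_inferior(600, 100_000, 610, 100_000, margin=0.001)
```

With these made-up counts the detection rates are almost equal, so the upper bound stays below the assumed margin. A markedly lower detection rate in the intervention arm would push the upper bound above it.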
Heine, J.; Fowler, E.; Egan, K.; Weinfurtner, R. J.; Balagurunathan, Y.; Schabath, M. B.
A substantial body of evidence demonstrates that measures from mammograms are predictive of breast cancer risk. In this matched case-control study, mammograms acquired near the time of diagnosis were analyzed to investigate bilateral breast asymmetry as a measure of short-term risk prediction. Specifically, contralateral breast images were compared with measures derived in the Fourier domain (FD); this technique summarizes power in concentric radial bands that cover the Fourier plane. Equivalently, this approach can be described as a multiscale characterization of the image. The summarized power difference between respective contralateral bands produces an asymmetry measure. Full-field digital mammography (FFDM) and synthetic two-dimensional images from digital breast tomosynthesis (DBT) were investigated for women who had both types of mammograms acquired at the same time. Odds ratios (ORs) and the area under the receiver operating characteristic curves (Azs) were generated from conditional logistic regression modeling with 95% confidence intervals. Raw unprocessed FFDM images produced significant findings: OR = 1.90 (1.58, 2.29) and Az = 0.72 (0.67, 0.76) per one standard deviation unit. Associations were significant but attenuated for both clinical FFDM and DBT images: OR = 1.31 (1.11, 1.54) and Az = 0.63 (0.58, 0.67); and OR = 1.48 (1.25, 1.76) and Az = 0.65 (0.60, 0.70), respectively. Results suggest that clinical FFDM and DBT images are inferior to raw FFDM images in capturing breast asymmetry with information loss for breast cancer risk prediction. Moreover, these DBT images have lower spatial resolution but produced stronger associations than the clinical FFDM images.
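The radial-band idea can be illustrated with a toy sketch: compute the 2-D power spectrum of each image, sum the power in concentric rings around the zero-frequency origin, and compare the ring totals between the two sides. This is not the authors' implementation. The naive DFT, the number of bands, the band edges, the exclusion of the zero-frequency term and the use of an absolute per-band difference are all assumptions chosen to keep the example small and dependency-free.

```python
import cmath
import math


def dft2_power(image):
    """Power spectrum |F(u, v)|^2 of a small square image via a naive separable DFT."""
    n = len(image)
    w = [[cmath.exp(-2j * math.pi * k * x / n) for x in range(n)] for k in range(n)]
    # 1-D DFT along each row, then along each column.
    rows = [[sum(image[y][x] * w[u][x] for x in range(n)) for u in range(n)]
            for y in range(n)]
    return [[abs(sum(rows[y][u] * w[v][y] for y in range(n))) ** 2
             for u in range(n)] for v in range(n)]


def radial_band_power(image, n_bands):
    """Sum spectral power inside concentric rings of the Fourier plane."""
    n = len(image)
    power = dft2_power(image)
    max_radius = n / 2  # outermost ring ends at the axis Nyquist frequency
    bands = [0.0] * n_bands
    for v in range(n):
        for u in range(n):
            # Indices above n/2 correspond to negative frequencies.
            fu = u if u <= n // 2 else u - n
            fv = v if v <= n // 2 else v - n
            r = math.hypot(fu, fv)
            if r == 0 or r > max_radius:
                continue  # skip the zero-frequency term and the plane corners
            idx = min(int(r / max_radius * n_bands), n_bands - 1)
            bands[idx] += power[v][u]
    return bands


def asymmetry(left, right, n_bands=4):
    """Assumed asymmetry measure: absolute per-band power difference."""
    left_bands = radial_band_power(left, n_bands)
    right_bands = radial_band_power(right, n_bands)
    return [abs(a - b) for a, b in zip(left_bands, right_bands)]


# Hypothetical 8x8 test images, for illustration only.
textured = [[(x * y) % 5 for x in range(8)] for y in range(8)]
flat = [[1] * 8 for _ in range(8)]
```

Because the ring totals depend only on the magnitude of the spectrum, mirroring an image left to right leaves them unchanged, so in this sketch a contralateral image needs no flipping before comparison.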
Yang, J.; Li, L.; Cao, J.; Zhang, J.
Objective: This study aims to compare the advantages and disadvantages of deep learning image reconstruction (DLIR) and adaptive statistical iterative reconstruction-V (ASIR-V) in thin-slice (2.5 mm) CT images of hepatic lesions characterized by high and low contrast. Additionally, the study seeks to determine the optimal DLIR strength for the evaluation of liver lesions. Methods: A retrospective analysis was performed on 90 patients who underwent abdominal contrast-enhanced CT scans. Group A comprised 48 patients with low-contrast lesions, while Group B included 42 patients with high-contrast lesions. The acquired images were reconstructed using post-processing DLIR at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) strengths, all with a slice thickness of 2.5 mm (subgroups A1-A3, B1-B3). Furthermore, images were reconstructed with ASIR-V at 50% strength at slice thicknesses of 2.5 mm and 5 mm (subgroups A4/B4 and A5/B5, respectively). CT values and standard deviations (SD) of the liver and lesions were measured, and the corresponding signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) were calculated. The edge rise slope (ERS) was determined using ImageJ software by measuring CT values along a line from the liver parenchyma to the lesion. Objective metrics were compared using one-way ANOVA, with independent samples t-tests applied for inter-group differences. Subjective scoring, which encompassed noise level, diagnostic confidence, and lesion margin delineation, was conducted by two radiologists, with inter-reader agreement analyzed using the Kappa test. Results: Objective evaluation revealed a progressive decrease in lesion SD and a progressive increase in SNR and CNR from subgroups A1/B1 to A3/B3. The SD of subgroup A2 decreased by 57.4% compared to A4, while the SNR and CNR of A2 increased by 19.3% and 24.6%, respectively, compared to A4. Although subgroup B2 had a lower SNR than B5, the difference was not statistically significant. SNR and CNR in B2 increased by 24.1% and 11.9%, respectively, compared to B4.
ERS gradually decreased from A1/B1 to A3/B3. ERS values in A2 and B2 increased by 27.0% and 39.4%, respectively, relative to A5 and B5. Although A3 had a lower ERS than A1 and A2, all DLIR subgroups exhibited higher ERS than A5; similar trends were observed in Group B. Subjective evaluation indicated good inter-reader agreement (Kappa > 0.61, p < 0.05). As DLIR strength increased, noise scores rose progressively in both groups. However, noise in A2 and B2 was lower than in A4/A5 and B4/B5. Diagnostic confidence and lesion margin delineation scores were highest in A2 and B2, while all subjective scores were lowest in A5 and B5. Discussion: Most prior studies evaluated the liver or vessels, or confirmed that image quality can be maintained at low doses; few have examined specific individual lesions, which this study set out to investigate. The details and detection rate were analyzed separately to confirm the clinical acceptability of 2.5-mm DLIR images for lesions of different contrast. Conclusion: For both high- and low-contrast hepatic lesions, DLIR provides superior image quality compared to ASIR-V, with the 2.5-mm DLIR-M setting being optimal. DLIR-M reduces image noise, improves spatial resolution, and produces images more suitable for diagnostic purposes.
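The abstract does not give the formulas used for SNR and CNR, so the sketch below uses one common convention as an assumption: SNR as the mean CT value divided by its standard deviation, and CNR as the absolute liver-to-lesion difference divided by the liver standard deviation. The function names and the Hounsfield-unit values are hypothetical. The percent-change helper shows how relative differences between reconstructions, such as those reported above, are typically expressed.

```python
def snr(mean_hu, sd_hu):
    """Signal-to-noise ratio (assumed definition: mean CT value / SD)."""
    return mean_hu / sd_hu


def cnr(mean_liver_hu, mean_lesion_hu, sd_liver_hu):
    """Contrast-to-noise ratio (assumed definition: |liver - lesion| / liver SD)."""
    return abs(mean_liver_hu - mean_lesion_hu) / sd_liver_hu


def percent_change(new, reference):
    """Relative change of `new` versus `reference`, in percent."""
    return (new - reference) / reference * 100.0


# Hypothetical measurements in Hounsfield units, for illustration only.
liver_snr = snr(110.0, 10.0)
lesion_cnr = cnr(110.0, 60.0, 10.0)
change = percent_change(12.4, 10.0)
```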