
A hybrid-computer vision model to predict lung cancer in diverse patient populations

Zakkar, A.; Perwaiz, N.; Zhong, W.; Krule, A.; Burrage-Burton, M.; Kim, D.; Miglani, M.; Narra, V.; Yousef, F.; Gadi, V.; Korpics, M. C.; Kim, S. J.; Khan, A. A.; Molina, Y.; Dai, Y.; Marai, E.; Meidani, H.; Nguyen, R.; Salahudeen, A. A.

2024-10-07 oncology

10.1101/2024.10.07.24315011 medRxiv


PURPOSE: Disparities in lung cancer incidence exist in Black populations, and screening criteria underserve these populations because of disparately elevated risk in the screening-eligible population. Prediction models that integrate clinical and imaging-based features to individualize lung cancer risk are a potential means to mitigate these disparities.

PATIENTS AND METHODS: This multicenter (NLST) and catchment-population-based (UIH, urban and suburban Cook County) cross-sectional study included participants at risk of lung cancer with available lung CT imaging and follow-up between 2015 and 2024. A total of 53,452 participants in NLST and 11,654 in UIH were included based on age- and tobacco-use-based risk factors for lung cancer. The cohorts were used to train and test deep learning and machine learning models using clinical features alone or combined with CT image features (hybrid computer vision).

RESULTS: An optimized model using 7 clinical features achieved ROC-AUC values of 0.64-0.67 in the NLST cohort and 0.60-0.65 in the UIH cohort across multiple years. Incorporating imaging features to form a hybrid computer vision model significantly improved ROC-AUC values to 0.78-0.91 in NLST, but performance deteriorated in UIH, with ROC-AUC values of 0.68-0.80. This drop was attributable to Black participants, in whom ROC-AUC values ranged from 0.63 to 0.72 across multiple years. Retraining the hybrid computer vision model with Black and other participants from the UIH cohort improved performance, with ROC-AUC values of 0.70-0.87 in a held-out UIH test set.

CONCLUSION: Hybrid computer vision predicted risk with improved accuracy compared to clinical risk models alone. However, potential biases in the image training data reduced model generalizability in Black participants. Performance improved upon retraining with a subset of the UIH cohort, suggesting that inclusive training and validation datasets can minimize racial disparities. Future studies incorporating vision models trained on representative data sets may demonstrate improved health equity upon clinical use.
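
The subgroup comparison reported above (ROC-AUC computed separately for each participant group) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `roc_auc` and `auc_by_group`, the group labels, and the toy scores are all hypothetical, and the AUC is computed by the standard pairwise (Mann-Whitney) definition rather than by any particular library.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute ROC-AUC separately within each subgroup (hypothetical helper)."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: roc_auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Toy data only: labels are 1 = cancer, 0 = no cancer; scores are model risks.
labels = [0, 1, 0, 1, 0, 1, 1, 0]
scores = [0.1, 0.9, 0.2, 0.8, 0.6, 0.4, 0.7, 0.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
per_group = auc_by_group(labels, scores, groups)
```

A gap between the per-group values is the kind of signal the abstract describes: a model can look strong on pooled data while discriminating less well within one subgroup. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for cohorts as large as those in the study.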

