Urban infrastructure and spatiotemporal environmental features for EGFR-mutant lung cancer

Lu, D.; Cui, L.; Kunz, N.; Wong, M.; Tayarani, M.; Solomon, J. P.; Garcia, C. A.; Altorki, N. K.; Choi, E.; Gao, H. O.; Shieh, Y.

2026-05-21 oncology
10.64898/2026.05.18.26353481 medRxiv

Background: Lung cancer in never-smokers is rising, and a substantial proportion of these cancers harbor an EGFR mutation. Fine particulate matter (PM2.5) is a recognized risk factor, but the roles of other modifiable pollutants and built-environment factors remain unclear.

Objectives: To identify urban characteristics associated with EGFR-mutant (vs. wild-type) lung cancer using high-resolution spatiotemporal data.

Methods: We analyzed 2,699 lung cancer patients with documented EGFR status treated at a high-volume academic medical center in New York City. Patient residential addresses were linked to high-resolution (300 m x 300 m) 5-year cumulative exposures to 3 air pollutants and 26 urban features. We developed Light Gradient Boosting Machine (LightGBM) models to classify EGFR status, comparing a basic clinical model built on established predictors (Asian race, female sex, never-smoking status, and adenocarcinoma histology) with an extended model that added the urban factors. Predictive performance was assessed by discrimination (AUC).

Results: Of the 2,699 patients, 54.1% were female; 25.8% self-identified as Asian, 11.2% as Black, and 7.4% as Hispanic; and 29% had EGFR-mutated cancer. The extended model showed a modest improvement in discrimination over the clinical model (AUC: 0.775 [95% CI, 0.739-0.809] vs. 0.768 [0.723-0.811]). Newly identified factors for EGFR-mutant status included black carbon (BC), nitrogen dioxide (NO2), proximity to airports, reduced access to public transportation, elevated noise levels, and lead exposure.

Conclusions: Traffic-related pollutants (BC, NO2) from diesel engines and motor vehicles, along with proximity to airports, were among the novel spatiotemporal features associated with EGFR-mutant lung cancer. These results may inform policy interventions.
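The abstract compares two classifiers by AUC with 95% confidence intervals but does not say how those intervals were obtained. The following is a minimal sketch of one common approach, a percentile bootstrap over patients, written in plain Python. The function names (`auc`, `bootstrap_auc_ci`), the toy labels and scores, and the choice of bootstrap itself are assumptions for illustration only and are not taken from the study.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    case scores higher than a randomly chosen negative case (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy data: 1 = EGFR-mutant, 0 = wild-type.
labels = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
clinical_scores = [0.20, 0.45, 0.40, 0.70, 0.30, 0.55, 0.50, 0.80, 0.10, 0.35]
extended_scores = [0.15, 0.40, 0.50, 0.75, 0.25, 0.60, 0.45, 0.85, 0.10, 0.30]

for name, scores in [("clinical", clinical_scores), ("extended", extended_scores)]:
    point = auc(labels, scores)
    lower, upper = bootstrap_auc_ci(labels, scores)
    print(f"{name}: AUC {point:.3f} [95% CI, {lower:.3f}-{upper:.3f}]")
```

When two intervals overlap as heavily as those reported in the abstract, a paired comparison (resampling the same patients for both models and examining the AUC difference) is usually more informative than reading the two intervals side by side.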

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. Scientific Reports | 3102 papers in training set | Top 11% | 7.4%
2. Annals of Epidemiology | 19 papers in training set | Top 0.1% | 7.0%
3. Environment International | 42 papers in training set | Top 0.2% | 7.0%
4. JNCI: Journal of the National Cancer Institute | 16 papers in training set | Top 0.1% | 7.0%
5. PLOS ONE | 4510 papers in training set | Top 26% | 6.5%
6. JAMA Network Open | 127 papers in training set | Top 0.5% | 5.0%
7. Interface Focus | 14 papers in training set | Top 0.1% | 5.0%
8. Environmental Health Perspectives | 17 papers in training set | Top 0.1% | 4.4%
9. Environmental Pollution | 35 papers in training set | Top 0.7% | 4.1%
50% of probability mass above
10. Nature Communications | 4913 papers in training set | Top 44% | 2.7%
11. Science of The Total Environment | 179 papers in training set | Top 2% | 2.5%
12. Environmental Science & Technology | 64 papers in training set | Top 1% | 2.5%
13. eLife | 5422 papers in training set | Top 35% | 2.1%
14. BMJ Open | 554 papers in training set | Top 7% | 2.1%
15. PeerJ | 261 papers in training set | Top 6% | 1.8%
16. Environmental Research | 46 papers in training set | Top 0.8% | 1.7%
17. The Innovation | 12 papers in training set | Top 0.3% | 1.7%
18. PLOS Global Public Health | 293 papers in training set | Top 4% | 1.4%
19. Indoor Air | 10 papers in training set | Top 0.2% | 1.3%
20. International Journal of Radiation Oncology*Biology*Physics | 21 papers in training set | Top 0.3% | 1.3%
21. ACS Nano | 99 papers in training set | Top 3% | 1.1%
22. eBioMedicine | 130 papers in training set | Top 3% | 0.9%
23. European Respiratory Journal | 54 papers in training set | Top 1% | 0.9%
24. Nature | 575 papers in training set | Top 15% | 0.8%
25. Science | 429 papers in training set | Top 19% | 0.8%
26. BMC Medicine | 163 papers in training set | Top 7% | 0.8%
27. Annals of Biomedical Engineering | 34 papers in training set | Top 1% | 0.8%
28. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 44% | 0.8%
29. Cancer Research | 116 papers in training set | Top 3% | 0.7%
30. Frontiers in Neuroscience | 223 papers in training set | Top 8% | 0.7%
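
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked against the listed percentages with a running sum. The sketch below is illustrative only: the function name `journals_covering` is hypothetical, and because the displayed probabilities are rounded to one decimal place, the cumulative total is approximate.

```python
def journals_covering(probs, threshold=50.0):
    """Return (k, cumulative) for the smallest k whose top-k probabilities
    sum to at least `threshold` percent, or None if never reached."""
    total = 0.0
    for k, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return k, total
    return None


# Predicted probabilities (%) for the ten highest-ranked journals above.
top_probs = [7.4, 7.0, 7.0, 7.0, 6.5, 5.0, 5.0, 4.4, 4.1, 2.7]

k, mass = journals_covering(top_probs)
print(f"top {k} journals cover about {mass:.1f}% of the probability mass")
```

The first eight entries sum to roughly 49.3%, so the ninth journal is the one that crosses the 50% mark, consistent with the count given on the page.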