
Reproducibility test of radiomics using network analysis and Wasserstein K-means algorithm

Oh, J. H.; Apte, A.; Katsoulakis, E.; Riaz, N.; Hatzoglou, V.; Yu, Y.; Leeman, J.; Mahmood, U.; Pouryahya, M.; Iyer, A.; Shukla-Dave, A.; Tannenbaum, A.; Lee, N.; Deasy, J.

2019-09-19 biophysics
10.1101/773168 bioRxiv

Purpose: To construct robust and validated radiomic predictive models, the development of a reliable method that can identify reproducible radiomic features robust to varying image acquisition methods and other scanner parameters should be preceded with rigorous validation. Due to the property of high correlation present between radiomic features, we hypothesize that reproducible radiomic features across different datasets that are obtained from different image acquisition settings preserve some level of connectivity between features in the form of a network.

Methods: We propose a regularized partial correlation network to identify robust and reproducible radiomic features. This approach was tested on two radiomic feature sets generated with two different reconstruction methods from a cohort of 47 lung cancer patients. The commonality of the resulting two networks was assessed. A largest common network component from the two networks was tested on phantom data consisting of 5 cancer samples. We further propose a novel K-means algorithm coupled with the optimal mass transport (OMT) theory to cluster samples. This approach following the regularized partial correlation analysis was tested on computed tomography (CT) scans from 77 head and neck cancer patients that were downloaded from The Cancer Imaging Archive (TCIA) and validated on CT scans from 83 head and neck cancer patients treated at our institution.

Results: Common radiomic features were found in relatively large network components between the resulting two partial correlation networks from a cohort of 47 lung cancer patients. The similarity of network components in terms of the common number of radiomic features was statistically significant. For phantom data, the Wasserstein distance on a largest common network component from the lung cancer data was much smaller than the Wasserstein distance on the same network using random radiomic features, implying the reliability of those radiomic features present in the network. Further analysis using the proposed Wasserstein K-means algorithm on TCIA head and neck cancer data showed that the resulting clusters separate tumor subsites and this was validated on our institution data.

Conclusions: We showed that a network-based analysis enables identifying reproducible radiomic features. This was validated using phantom data and external data via the Wasserstein distance metric and the proposed Wasserstein K-means method.
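
The abstract does not spell out how the proposed Wasserstein K-means algorithm works, so the following is only a rough illustration of the general idea of pairing K-means with an optimal-transport distance, not the authors' method. It assumes each sample is a one-dimensional set of feature values, that all samples have the same size, and that the 2-Wasserstein distance is used (in one dimension this reduces to comparing sorted values, and the barycenter is the position-wise mean of the sorted values). The function names `wasserstein2_1d`, `barycenter_1d`, and `wasserstein_kmeans` are hypothetical, and the data are synthetic.

```python
import math
import random


def wasserstein2_1d(a, b):
    """2-Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size samples")
    sa, sb = sorted(a), sorted(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)) / len(sa))


def barycenter_1d(samples):
    """1-D 2-Wasserstein barycenter: position-wise mean of the sorted values."""
    sorted_samples = [sorted(s) for s in samples]
    n = len(sorted_samples)
    return [sum(col) / n for col in zip(*sorted_samples)]


def wasserstein_kmeans(samples, k, n_iter=50, seed=0):
    """Lloyd-style K-means using the Wasserstein distance and barycenters."""
    rng = random.Random(seed)
    # Assumption: initialise centroids from k randomly chosen samples.
    centroids = [sorted(s) for s in rng.sample(samples, k)]
    labels = [-1] * len(samples)
    for _ in range(n_iter):
        # Assignment step: nearest centroid in Wasserstein distance.
        new_labels = [
            min(range(k), key=lambda j: wasserstein2_1d(s, centroids[j]))
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: replace each centroid by the barycenter of its members.
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = barycenter_1d(members)
    return labels, centroids


# Synthetic usage: two groups of samples drawn around different means.
data_rng = random.Random(1)
group_a = [[data_rng.gauss(0.0, 1.0) for _ in range(30)] for _ in range(5)]
group_b = [[data_rng.gauss(10.0, 1.0) for _ in range(30)] for _ in range(5)]
labels, centroids = wasserstein_kmeans(group_a + group_b, k=2)
print(labels)
```

Real radiomic samples are multi-dimensional, and a faithful implementation would need a general optimal-transport solver in place of the sorting shortcut used here.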

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Medical Physics: 14 papers in training set; Top 0.1%; 22.7%
2. Physics in Medicine & Biology: 17 papers in training set; Top 0.1%; 18.8%
3. PLOS ONE: 4510 papers in training set; Top 12%; 14.8%

50% of probability mass above

4. Magnetic Resonance in Medicine: 72 papers in training set; Top 0.2%; 10.2%
5. Scientific Reports: 3102 papers in training set; Top 27%; 4.3%
6. PLOS Computational Biology: 1633 papers in training set; Top 10%; 3.6%
7. Magnetic Resonance Imaging: 21 papers in training set; Top 0.2%; 2.6%
8. Annals of Biomedical Engineering: 34 papers in training set; Top 0.6%; 1.7%
9. Statistics in Medicine: 34 papers in training set; Top 0.2%; 1.3%
10. Journal of Magnetic Resonance Imaging: 14 papers in training set; Top 0.5%; 1.0%
11. Bioinformatics: 1061 papers in training set; Top 9%; 0.9%
12. NeuroImage: 813 papers in training set; Top 5%; 0.9%
13. Computer Methods and Programs in Biomedicine: 27 papers in training set; Top 0.8%; 0.8%
14. Archives of Clinical and Biomedical Research: 28 papers in training set; Top 2%; 0.8%
15. Frontiers in Physiology: 93 papers in training set; Top 5%; 0.8%
16. NMR in Biomedicine: 24 papers in training set; Top 0.4%; 0.8%
17. Expert Systems with Applications: 11 papers in training set; Top 0.4%; 0.8%
18. Bioengineering: 24 papers in training set; Top 1%; 0.8%
19. Journal of Theoretical Biology: 144 papers in training set; Top 2%; 0.8%
20. IEEE Access: 31 papers in training set; Top 1.0%; 0.8%
21. Biology Methods and Protocols: 53 papers in training set; Top 3%; 0.6%