External validation of self-supervised transfer learning for noninvasive molecular subtyping of pediatric low-grade glioma using T2-weighted MRI
Yoo, J. J.; Tak, D.; Namdar, K.; Wagner, M. W.; Liu, A.; Tabori, U.; Hawkins, C.; Ertl-Wagner, B. B.; Kann, B. H.; Khalvati, F.
Purpose: To externally evaluate three binary classification models that differentiate the molecular subtype of pediatric low-grade glioma (pLGG) among BRAF Fusion, BRAF Mutation, and Wild Type on T2-weighted magnetic resonance imaging (MRI). The models use self-supervised transfer learning, which enables effective performance in a low-data setting.
Materials and methods: This retrospective study evaluates pLGG molecular subtyping models, pre-trained using data collected at Dana-Farber Cancer Institute/Boston Children's Hospital, on two datasets from the Hospital for Sick Children: one consisting of patients identified from the electronic health record between January 2000 and December 2018 (n=336), and another consisting of patients identified from the electronic health record between January 2019 and April 2023 (n=87). The datasets consist of T2-weighted MRI of pLGG with corresponding genetic marker identifications, labelled as BRAF Fusion, BRAF Mutation, or Wild Type. They include manually annotated ground-truth segmentations, which were used in the classification pipeline during evaluation. The models were evaluated using the area under the receiver operating characteristic curve (AUC). To obtain per-class probabilities across all three molecular subtypes, we used the output probabilities from each binary model as logits input to a softmax function. These probabilities were used to determine the AUC of the models on each evaluated dataset.
Results: The models achieved a macro-average AUC of 0.7671 on the newer dataset from the Hospital for Sick Children but a lower macro-average AUC of 0.6463 on the older dataset from the same institution.
Conclusions: The evaluated pLGG molecular subtyping models have the potential for effective generalization but may require further fine-tuning for consistent performance across varying datasets.
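The scoring step described in the methods can be illustrated with a minimal sketch. It is not the authors' code. It assumes each binary model is a one-vs-rest classifier for one subtype, and that the macro-average AUC is the unweighted mean of the one-vs-rest AUCs; the abstract states neither detail. The function names and the toy values are hypothetical.

```python
import numpy as np

# Class order is an assumption made for this illustration.
CLASSES = ("BRAF Fusion", "BRAF Mutation", "Wild Type")


def softmax(scores):
    """Row-wise softmax; subtracting the row maximum keeps exp() stable."""
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def binary_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half, which matches the Mann-Whitney U formulation.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)


def macro_auc(labels, probs):
    """Unweighted mean of the one-vs-rest AUCs, one per class column."""
    labels = np.asarray(labels)
    return float(np.mean([binary_auc(labels == k, probs[:, k])
                          for k in range(probs.shape[1])]))


# Toy values (not study data): column k holds the output probability of the
# hypothetical binary model for CLASSES[k], one row per patient.
binary_probs = np.array([
    [0.9, 0.2, 0.1],
    [0.8, 0.3, 0.2],
    [0.2, 0.7, 0.3],
    [0.1, 0.9, 0.2],
    [0.3, 0.1, 0.8],
    [0.2, 0.2, 0.6],
])
labels = np.array([0, 0, 1, 1, 2, 2])

# The binary outputs are treated as logits, as the abstract describes.
class_probs = softmax(binary_probs)
score = macro_auc(labels, class_probs)
```

Because the inputs to the softmax are probabilities in [0, 1] rather than unbounded logits, the resulting per-class probabilities are compressed toward a uniform distribution. AUC depends only on how cases are ranked within each class column, so this compression does not by itself limit the score.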