Training machine learning models on patient level data segregation is crucial in practical clinical applications

2020-04-25, health informatics, medRxiv

This article discusses the effect of segregating histopathology image data into three sets: a training set for fitting the machine learning model, a validation set for model selection, and a test set for evaluating model performance. We found that one must be cautious when segregating histological image data (slides) into training, validation and test sets, because subtle mishandling of the data can introduce data leakage and give deceptively good results on the test set. We performed this study on gene muta...
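
To make the leakage concern concrete, here is a minimal sketch of a patient-level split, in which every slide from a given patient lands in exactly one of the three sets. It is an illustration under assumptions, not the authors' procedure: the function name `split_by_patient`, the `(patient_id, slide_id)` record layout, the 70/15/15 fractions and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_patient(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Assign whole patients (not individual slides) to train/val/test.

    `records` is a list of (patient_id, slide_id) pairs. Both identifiers
    are hypothetical stand-ins for whatever a real dataset uses.
    """
    # Group slides by the patient they came from.
    slides_of = defaultdict(list)
    for patient_id, slide_id in records:
        slides_of[patient_id].append(slide_id)

    # Shuffle patients, not slides, so a patient can never straddle two sets.
    patients = sorted(slides_of)
    random.Random(seed).shuffle(patients)

    n_train = int(len(patients) * train_frac)
    n_val = int(len(patients) * val_frac)
    assignment = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }
    return {
        name: [(p, s) for p in ids for s in slides_of[p]]
        for name, ids in assignment.items()
    }


def patients_in(split):
    """Return the set of patient ids that appear in one split."""
    return {patient_id for patient_id, _ in split}


# Toy data (assumed): 20 patients with 3 slides each.
records = [
    (f"patient_{i:02d}", f"slide_{i:02d}_{j}")
    for i in range(20)
    for j in range(3)
]
splits = split_by_patient(records)
```

Shuffling the slides directly would let slides from one patient fall on both sides of the train/test boundary, which is the kind of mishandling the abstract warns about. Checking that `patients_in(splits["train"])` and `patients_in(splits["test"])` share no members is a cheap safeguard before any training run.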