From claims to care: Machine learning algorithm to classify urinary tract infection cases using Swiss health insurance data

Aghlmandi, S.; Shafiezadeh, S.; Huber, C.; Godet, P.; Bucher, H. C.; Bielicki, J. A.

2025-10-01 primary care research

10.1101/2025.09.29.25336862 medRxiv


Objectives: To evaluate whether machine learning (ML) applied to comprehensive claims data without diagnostic codes can distinguish a high proportion of antibiotic treatment episodes as urinary tract infection (UTI) or non-UTI cases. Such approaches may be valuable for antimicrobial stewardship when diagnosis-linked datasets are unavailable.

Methods: Outpatient antibiotic prescription claims from three major Swiss insurers (2017-2020; ~40% of the Swiss population) were analyzed. Based on clinical input, specific constellations of claims codes (e.g. positive urine culture plus typical antibiotic) were a priori assigned as indicating UTI episodes, providing the reference classification. Predictors included sex, age group, comorbidity, and diagnostic tests ordered during the episode. Four ML classifiers were tested; performance and interpretability were evaluated, with XGBoost prioritized.

Results: After cleaning and balancing, 38,982 records (19,491 UTI; 19,491 non-UTI) were included. XGBoost achieved an AUC of 0.94, accuracy of 87.6%, sensitivity of 79.2%, and specificity of 96.1%. Misclassification was asymmetric: 11% of non-UTI cases were labeled UTI, while 2% of UTI cases were misclassified as non-UTI. Diagnostics ordered were the strongest predictors, followed by female sex and older age.

Conclusions: Even in the absence of diagnosis codes, ML applied to claims data can reliably identify UTI-related prescriptions. This supports the feasibility of claims-based surveillance tools for stewardship, while in parallel highlighting the need for scalable, low-burden approaches to improve direct diagnostic coding in routine data.
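As a reading aid, the reported accuracy can be related to the reported sensitivity and specificity: on a balanced dataset, accuracy is the average of the two. The sketch below is not the authors' code; the function name `accuracy_from_rates` is hypothetical, and the only inputs are the class sizes and rates quoted in the abstract.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Overall accuracy implied by sensitivity and specificity.

    n_pos / n_neg are the numbers of reference-positive (UTI) and
    reference-negative (non-UTI) records.
    """
    true_pos = sensitivity * n_pos   # UTI episodes correctly labeled UTI
    true_neg = specificity * n_neg   # non-UTI episodes correctly labeled non-UTI
    return (true_pos + true_neg) / (n_pos + n_neg)


# Values quoted in the abstract: 19,491 records per class,
# sensitivity 79.2%, specificity 96.1%.
acc = accuracy_from_rates(0.792, 0.961, 19_491, 19_491)
print(f"implied accuracy: {acc:.4f}")
```

With equal class sizes this reduces to (0.792 + 0.961) / 2, roughly 87.65%, which is in line with the reported accuracy of 87.6% given rounding of the inputs.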
