Improving Medicare Fraud Detection Accuracy in Deep Learning by Exploring Feature Selection and Data Sampling Techniques.

Ahammed, F.

2026-03-20 health informatics
10.64898/2026.03.18.26348763 medRxiv
Fraud in healthcare is a worsening problem with far-reaching consequences: it burdens the financial stability of the health industry and threatens the quality of medical care. It arises from vulnerabilities in the current healthcare framework that fraudsters exploit. Although many models have been developed to detect fraudulent patterns in insurance claims, their accuracy frequently suffers from class imbalance in the Medicare dataset and from irrelevant features. This study aims to improve detection performance by combining a deep learning model with data sampling and feature selection techniques. Different combinations are compared to determine how effectively each improves the accuracy of the fraud detection model. The results show that combining data sampling with feature selection improves accuracy and performance. Using Chi-square feature selection together with the Synthetic Minority Over-sampling Technique (SMOTE), the model reached an accuracy of 95.4% with negligible evidence of overfitting. The findings underscore the value of combining these techniques, rather than relying on the baseline deep learning model alone, for detecting Medicare insurance fraud.
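
The abstract names Chi-square feature selection and SMOTE but gives no implementation details. The following is a minimal, standard-library-only sketch of the two ideas on invented toy data, not the study's pipeline. The function names (`chi2_scores`, `select_top_k`, `smote_like`), the data, and every parameter value are assumptions made for illustration. The chi-square score compares per-class feature sums with the sums expected if the feature were independent of the label; the oversampler interpolates between a minority row and one of its nearest minority neighbors.

```python
import random

def chi2_scores(X, y):
    """Chi-square statistic per feature, for non-negative features.
    Observed = per-class feature sum; expected = feature total * class share."""
    classes = sorted(set(y))
    n = len(y)
    scores = []
    for j in range(len(X[0])):
        total = sum(row[j] for row in X)
        stat = 0.0
        for c in classes:
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            expected = total * sum(1 for label in y if label == c) / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
        scores.append(stat)
    return scores

def select_top_k(X, scores, k):
    """Keep the k highest-scoring feature columns, in original column order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return [[row[j] for j in keep] for row in X], keep

def smote_like(X, y, minority, k=2, seed=0):
    """SMOTE-style oversampling for a binary problem: add synthetic minority
    rows until both classes have equal size. Needs >= 2 minority rows."""
    rng = random.Random(seed)
    minority_rows = [row for row, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_new, y_new = [list(row) for row in X], list(y)
    for _ in range(max(0, n_needed)):
        base = rng.choice(minority_rows)
        # k nearest minority neighbors of `base` by squared Euclidean distance
        neighbors = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> other
        X_new.append([a + gap * (b - a) for a, b in zip(base, other)])
        y_new.append(minority)
    return X_new, y_new

# Invented toy data: 6 majority rows (label 0), 2 minority rows (label 1).
X = [[5, 1, 0], [6, 0, 1], [5, 1, 1], [7, 0, 0],
     [6, 1, 0], [5, 0, 1], [0, 1, 9], [1, 0, 8]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_sel, keep = select_top_k(X, chi2_scores(X, y), k=2)
X_bal, y_bal = smote_like(X_sel, y, minority=1)
print(keep, len(X_bal), y_bal.count(0), y_bal.count(1))
```

In practice, oversampling is usually applied to the training split only, so that synthetic rows do not leak into evaluation data; whether the study did so is not stated in the abstract.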

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. BMC Medical Informatics and Decision Making | 39 papers in training set | Top 0.1% | 23.3%
2. PLOS ONE | 4510 papers in training set | Top 24% | 7.1%
3. JMIR Medical Informatics | 17 papers in training set | Top 0.1% | 6.5%
4. Computers in Biology and Medicine | 120 papers in training set | Top 0.4% | 5.0%
5. International Journal of Medical Informatics | 25 papers in training set | Top 0.2% | 5.0%
6. Scientific Reports | 3102 papers in training set | Top 21% | 5.0%

50% of probability mass above

7. Informatics in Medicine Unlocked | 21 papers in training set | Top 0.1% | 4.1%
8. JAMIA Open | 37 papers in training set | Top 0.5% | 3.2%
9. PLOS Digital Health | 91 papers in training set | Top 1% | 2.4%
10. Frontiers in Artificial Intelligence | 18 papers in training set | Top 0.2% | 2.1%
11. Biology Methods and Protocols | 53 papers in training set | Top 0.6% | 2.0%
12. Journal of Medical Internet Research | 85 papers in training set | Top 2% | 1.8%
13. JMIR Public Health and Surveillance | 45 papers in training set | Top 1% | 1.8%
14. Journal of Biomedical Informatics | 45 papers in training set | Top 0.7% | 1.8%
15. Computer Methods and Programs in Biomedicine | 27 papers in training set | Top 0.3% | 1.8%
16. Cureus | 67 papers in training set | Top 3% | 1.5%
17. BMJ Health & Care Informatics | 13 papers in training set | Top 0.5% | 1.4%
18. BioMed Research International | 25 papers in training set | Top 2% | 1.4%
19. International Journal of Environmental Research and Public Health | 124 papers in training set | Top 5% | 1.4%
20. Frontiers in Public Health | 140 papers in training set | Top 6% | 1.4%
21. JMIRx Med | 31 papers in training set | Top 1% | 1.3%
22. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 1% | 1.1%
23. Physical Biology | 43 papers in training set | Top 2% | 0.9%
24. Chaos, Solitons & Fractals | 32 papers in training set | Top 1% | 0.9%
25. Frontiers in Applied Mathematics and Statistics | 10 papers in training set | Top 0.4% | 0.8%
26. Bioengineering | 24 papers in training set | Top 1% | 0.8%
27. Expert Systems with Applications | 11 papers in training set | Top 0.4% | 0.7%
28. Journal of Personalized Medicine | 28 papers in training set | Top 2% | 0.7%
29. JMIR Formative Research | 32 papers in training set | Top 2% | 0.7%
30. Sensors | 39 papers in training set | Top 2% | 0.5%
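
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below uses the first ten values from the list above; the variable names are illustrative, and because the displayed percentages are rounded, the totals are approximate.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-10, as displayed in the list.
probs = [23.3, 7.1, 6.5, 5.0, 5.0, 5.0, 4.1, 3.2, 2.4, 2.1]

# Running total of probability mass by rank.
cumulative = list(accumulate(probs))

# First rank at which the running total reaches 50%.
cutoff = next(rank for rank, total in enumerate(cumulative, start=1)
              if total >= 50.0)

print(cutoff, round(cumulative[cutoff - 1], 1))
```

The running total is about 46.9% after rank 5 and about 51.9% after rank 6, which matches the placement of the 50% marker.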