Early Detection of Absurdity Signals in Pharmacovigilance: A Machine Learning Ensemble Approach to Identify Rare Adverse Drug Reactions
Dasgupta, R.
Background: Traditional pharmacovigilance methods based on biostatistical approaches systematically exclude outliers and rare events, potentially missing critical safety signals. These methods fail to detect micro-clusters of adverse events and comorbidity patterns that may indicate serious but low-frequency adverse drug reactions (ADRs). We introduce the concept of absurdity signal detection: the identification of statistically anomalous but clinically significant adverse event patterns that conventional methods dismiss as outliers.
Methods: We developed an ensemble machine learning framework combining five distinct algorithms (Random Forest, Gradient Boosting, XGBoost, Neural Networks, and Support Vector Machines) to analyze FDA Adverse Event Reporting System (FAERS) data. The system employs outlier-inclusive modeling, multi-dimensional cluster detection, and severity-weighted propensity scoring. We validated our approach on Losartan, analyzing 500 adverse event reports to detect absurdity signals that may have been missed by conventional biostatistical surveillance.
Results: Our ensemble approach achieved 75% accuracy in identifying high-risk adverse events, with the best-performing model successfully detecting 15 distinct absurdity signals. The top five identified events were: cough (propensity score 1.525), angioedema (1.298), insomnia (1.290), nausea (1.180), and hyperkalemia (1.114). Notably, our method identified several rare but severe ADRs that would have been excluded as statistical outliers in traditional disproportionality analyses. The ensemble approach demonstrated superior performance compared to individual models, with inter-model agreement providing an additional confidence metric for signal validation.
Conclusions: Machine learning-based absurdity signal detection offers a paradigm shift in pharmacovigilance by preserving and analyzing rare adverse events rather than excluding them. This approach has significant implications for patient safety, potentially preventing serious adverse events in vulnerable populations with atypical response profiles. Our methodology is scalable, validated against FDA data sources, and provides a framework for real-time safety monitoring in the $138 billion pharmaceutical industry. Future work will extend this approach to drug-drug interaction detection and personalized risk stratification.
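The abstract mentions inter-model agreement as a confidence metric but does not define it. One plausible reading, offered here only as a minimal sketch, is the fraction of the five models that vote for the majority label on each report. The function name `ensemble_agreement` and the vote values below are hypothetical and made up for illustration; they are not the author's implementation or results from the study.

```python
from collections import Counter

# Hypothetical binary predictions (1 = high-risk) from five models for four
# adverse event reports. These values are illustrative, not study data.
MODEL_VOTES = {
    "random_forest":     [1, 0, 1, 0],
    "gradient_boosting": [1, 0, 1, 1],
    "xgboost":           [1, 0, 0, 1],
    "neural_network":    [1, 1, 1, 0],
    "svm":               [1, 0, 0, 0],
}


def ensemble_agreement(model_votes):
    """Return a (majority_label, agreement_fraction) pair for each report.

    Assumption: agreement is the share of models that vote for the
    majority label. With an odd number of binary voters there are no ties.
    """
    results = []
    # zip(*...) regroups the per-model lists into per-report vote tuples.
    for votes in zip(*model_votes.values()):
        label, count = Counter(votes).most_common(1)[0]
        results.append((label, count / len(votes)))
    return results


if __name__ == "__main__":
    for i, (label, agreement) in enumerate(ensemble_agreement(MODEL_VOTES), 1):
        print(f"report {i}: majority={label}, agreement={agreement:.1f}")
```

Under this reading, a report flagged by all five models would carry an agreement of 1.0, while a 3-to-2 split would carry 0.6 and could be routed for closer manual review.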
Matching journals
The top 4 journals account for 50% of the predicted probability mass.