Monte Carlo Committee Simulation with Large Language Models for Predicting Drug Reimbursement Recommendations and Conditions: A Novel Neurosymbolic AI Approach
Janoudi, G.; Rada (Uzun), M.; Yasinov, E.; Richter, T.
Background: Health technology assessment (HTA) agencies issue reimbursement recommendations that determine patient access to new therapies. Predicting these outcomes would enable sponsors to optimize market access strategies and health systems to anticipate budget impacts. However, traditional machine learning approaches require extensive manual feature extraction and predict only categorical outcomes, not the specific conditions attached to recommendations.
Methods: We developed Monte Carlo Committee Simulation, a neurosymbolic system that simulates multi-panelist deliberation using 14 persona-conditioned large language model panelists with weighted voting and uncertainty quantification. We conducted a temporal external validation study on CDA-AMC (Canada's Drug Agency) sponsor-submitted recommendations published between October 2024 and December 2025 (n=67), after the knowledge cutoff of the underlying models, ensuring predictions reflected reasoning rather than memorization. The system predicted both recommendation category (Reimburse with Conditions, Do Not Reimburse) and five condition categories (Population Restrictions, Prescriber/Setting Requirements, Continuation Conditions, Economic Conditions, Evidence Conditions).
Results: On submissions where the system expressed confidence (n=44), recommendation prediction achieved 93.2% accuracy (95% CI: 84.1-100.0%), exceeding the 91.8% (95% CI: 83.7-98.0%) majority-class baseline. The system discriminated better than chance (AUROC 0.817, 95% CI: 0.45-0.99, vs 0.500) and produced calibrated confidence estimates (ECE = 0.091). The pre-specified Strength of Mandate measure stratified accuracy from 96.8% (High, 95% CI: 90.3-100.0%) to 40.0% (Weak, 95% CI: 0.0-80.0%), with 83.3% of errors occurring in cases flagged as uncertain (p=0.0025). Analysis of the 5 abstained cases confirmed 40.0% accuracy, validating the system's identification of uncertain predictions.
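The calibration figure reported above (ECE = 0.091) can be illustrated with a minimal sketch of expected calibration error. It assumes the common equal-width binning definition; the abstract does not state the binning scheme or bin count, so `n_bins=10`, the function name, and the toy inputs are assumptions for illustration only, not the authors' implementation.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE: sum over bins of (n_b / N) * |accuracy_b - mean_confidence_b|.

    confidences: predicted confidence per case, each in [0, 1]
    correct:     booleans, True where the prediction matched the outcome
    n_bins:      number of equal-width bins (assumed value, not from the source)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Bin i covers [i / n_bins, (i + 1) / n_bins); a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, bool(ok)))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Toy data (made up): two cases at 0.9 confidence, two at 0.6.
toy_conf = [0.9, 0.9, 0.6, 0.6]
toy_correct = [True, True, True, False]
toy_ece = expected_calibration_error(toy_conf, toy_correct)
```

A lower value means stated confidence tracks observed accuracy more closely; a value of 0 would mean every bin's mean confidence equals its accuracy.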
For condition prediction, the system achieved 48.8% subset accuracy, which requires correct simultaneous prediction of all 5 condition categories (2^5 = 32 possible combinations), and 86.3% Hamming accuracy versus 25.8% for a no-conditions baseline. Per-category accuracy ranged from 68.3% (Continuation Conditions) to 97.6% (Economic Conditions), with Continuation Conditions demonstrating the strongest discriminative ability (AUROC 0.896, 95% CI: 0.79-0.98).
Conclusions: Monte Carlo Committee Simulation enables a shift from reactive to proactive market access: anticipating specific reimbursement conditions before committee review, with calibrated confidence that identifies which predictions to trust. Validated on temporally separated data the models could not have memorized, the system can be positioned as a forecasting aid that complements rather than replaces human deliberation.
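The difference between the subset accuracy and Hamming accuracy reported for condition prediction can be made concrete with a short sketch. It assumes each case is encoded as five binary flags in the order the condition categories are listed in the abstract; the function names and toy vectors are hypothetical and are not data from the study.

```python
CATEGORIES = (
    "Population Restrictions",
    "Prescriber/Setting Requirements",
    "Continuation Conditions",
    "Economic Conditions",
    "Evidence Conditions",
)

# Five binary categories give 2**5 = 32 possible label combinations.
N_COMBINATIONS = 2 ** len(CATEGORIES)


def subset_accuracy(y_true, y_pred):
    """Share of cases where all five flags are predicted correctly at once."""
    exact = sum(tuple(t) == tuple(p) for t, p in zip(y_true, y_pred))
    return exact / len(y_true)


def hamming_accuracy(y_true, y_pred):
    """Share of individual (case, category) flags predicted correctly."""
    matches = total = 0
    for t, p in zip(y_true, y_pred):
        for a, b in zip(t, p):
            matches += int(a == b)
            total += 1
    return matches / total


# Toy data (made up): two cases, one flag wrong in the second case.
y_true = [(1, 0, 1, 1, 0), (1, 1, 0, 1, 0)]
y_pred = [(1, 0, 1, 1, 0), (1, 0, 0, 1, 0)]

subset = subset_accuracy(y_true, y_pred)      # 1 of 2 cases fully correct
hamming = hamming_accuracy(y_true, y_pred)    # 9 of 10 flags correct

# A no-conditions baseline predicts zero for every category.
baseline_pred = [(0,) * len(CATEGORIES) for _ in y_true]
baseline_hamming = hamming_accuracy(y_true, baseline_pred)
```

Because a single wrong flag fails the whole case under subset accuracy, it is always less than or equal to Hamming accuracy on the same predictions, which is consistent with the gap between the two figures in the abstract.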