
Improving the detection of clinically significant steatotic liver disease using a machine learning algorithm in a real-world primary care population

Purssell, H.; Bennett, L.; Mostafa, M.; Landi, S.; Mysko, C.; Hammersley, R.; Patel, M.; Scott, J.; Street, O.; Piper Hanley, K.; The ID LIVER Consortium; Hanley, N. A.; Morling, J.; Guha, I. N.; Athwal, V. S.

2026-03-05 gastroenterology
10.64898/2026.03.04.26347631 medRxiv

Background and aims: Population screening for liver disease in high-risk groups is recommended. Community diagnosis of liver disease is a challenge because the disease is asymptomatic until very advanced stages. Moreover, regional variation in testing availability can result in people with clinically significant liver disease being missed. Machine learning (ML) has been proposed as a method to reduce diagnostic error and automate screening. We present a novel machine learning-derived algorithm (ID LIVER-ML) designed to predict the risk of clinically significant liver disease in a high-risk community population, to identify those needing further investigations or specialist referral.

Methods: Using data from 2039 patients recruited to two UK cohorts, we created a parsimonious model from investigations that would be available in primary care, with liver stiffness measurement as the reference standard. The performance of ID LIVER-ML was compared against the FIB-4 score in a second unseen hold-out cohort (n=327).

Results: ID LIVER-ML performed well at identifying patients at risk of clinically significant liver fibrosis (sensitivity 0.90, specificity 0.43, PPV 0.54, NPV 0.86, AUC 0.83) and outperformed conventional risk scoring systems (FIB-4: AUC 0.65; NAFLD Fibrosis Score: AUC 0.66; APRI: AUC 0.53; BARD: AUC 0.58).

Conclusion: Machine learning-derived algorithms can help screen high-risk populations in a community setting for liver fibrosis.

ClinicalTrials.gov ID: NCT04666402

Impact and Implications: The prevalence of steatotic liver disease is rising globally and is an increasingly significant challenge for healthcare systems. Existing risk stratification scores are not validated in a real-world cohort where patients have risk factors for multiple aetiologies of liver disease. Our work shows that a machine learning model can predict the risk of clinically significant liver disease using routine primary care data, better than existing non-invasive risk stratification tools in a real-world cohort. This highlights a potential role for machine learning in the automation of fibrosis risk assessment in primary care.

Highlights
- Machine learning-derived algorithms can predict the risk of clinically significant liver disease in an at-risk community population with a mixed aetiology of liver diseases.
- The performance of the ML algorithm (ID LIVER-ML) is not affected by metabolic, alcohol, or mixed aetiologies.
- ID LIVER-ML outperforms traditional risk stratification scoring systems such as FIB-4 and the NAFLD Fibrosis Score.
- Compared to the FIB-4 score, the use of machine learning can reduce the need for secondary care investigations by 59%.
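For readers less familiar with the screening metrics quoted in the abstract, the sketch below shows how sensitivity, specificity, PPV, and NPV are derived from a 2x2 confusion matrix. The counts are invented for illustration and are not taken from the study; the names `ConfusionCounts` and `diagnostic_metrics` are hypothetical helpers, not part of ID LIVER-ML.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary test against a reference standard."""
    tp: int  # test positive, reference positive
    fp: int  # test positive, reference negative
    fn: int  # test negative, reference positive
    tn: int  # test negative, reference negative


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the four standard screening metrics as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of diseased patients flagged
        "specificity": c.tn / (c.tn + c.fp),  # share of healthy patients cleared
        "ppv": c.tp / (c.tp + c.fp),          # share of flagged patients with disease
        "npv": c.tn / (c.tn + c.fn),          # share of cleared patients without disease
    }


# Illustrative counts only (NOT study data): 100 hypothetical patients.
example = ConfusionCounts(tp=45, fp=30, fn=5, tn=20)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A high sensitivity paired with a modest specificity, as in the reported results, is a common trade-off for a rule-out screening tool: few cases are missed, at the cost of more false positives. PPV and NPV additionally depend on disease prevalence in the tested population, so they do not transfer directly between cohorts.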

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Hepatology Communications | 21 papers in training set | Top 0.1% | 18.9%
2. Hepatology | 18 papers in training set | Top 0.1% | 14.5%
3. BMC Medicine | 163 papers in training set | Top 0.4% | 6.9%
4. PLOS ONE | 4510 papers in training set | Top 30% | 4.9%
5. Frontiers in Medicine | 113 papers in training set | Top 1% | 4.0%
6. Wellcome Open Research | 57 papers in training set | Top 0.3% | 3.6%
50% of probability mass above
7. American Journal of Gastroenterology | 15 papers in training set | Top 0.1% | 3.6%
8. Scientific Reports | 3102 papers in training set | Top 35% | 3.6%
9. BMJ Open | 554 papers in training set | Top 6% | 3.3%
10. Journal of Translational Medicine | 46 papers in training set | Top 0.4% | 2.4%
11. Gut | 36 papers in training set | Top 0.4% | 1.9%
12. Journal of Clinical Medicine | 91 papers in training set | Top 3% | 1.7%
13. Clinical and Translational Science | 21 papers in training set | Top 0.4% | 1.7%
14. Frontiers in Pharmacology | 100 papers in training set | Top 2% | 1.5%
15. Clinical Pharmacology & Therapeutics | 25 papers in training set | Top 0.4% | 1.5%
16. Metabolites | 50 papers in training set | Top 0.7% | 1.2%
17. British Journal of Cancer | 42 papers in training set | Top 1% | 1.2%
18. Modern Pathology | 21 papers in training set | Top 0.3% | 1.1%
19. Diabetes | 53 papers in training set | Top 0.5% | 1.1%
20. Archives of Clinical and Biomedical Research | 28 papers in training set | Top 1% | 1.1%
21. Computers in Biology and Medicine | 120 papers in training set | Top 3% | 1.1%
22. Diabetes, Obesity and Metabolism | 17 papers in training set | Top 0.4% | 1.1%
23. International Journal of Infectious Diseases | 126 papers in training set | Top 3% | 1.0%
24. PLOS Digital Health | 91 papers in training set | Top 2% | 0.8%
25. PeerJ | 261 papers in training set | Top 14% | 0.8%
26. Biology Methods and Protocols | 53 papers in training set | Top 3% | 0.7%
27. Frontiers in Cell and Developmental Biology | 218 papers in training set | Top 10% | 0.7%
28. Metabolomics | 11 papers in training set | Top 0.6% | 0.7%
29. eBioMedicine | 130 papers in training set | Top 6% | 0.5%
30. Nature Communications | 4913 papers in training set | Top 68% | 0.5%
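
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked against the listed percentages with a running sum. The sketch below is a minimal illustration using the first ten probabilities from the listing; the function name `journals_needed` is hypothetical, and because the displayed values are rounded to one decimal place, the totals are approximate.

```python
from typing import Optional, Sequence, Tuple

# Predicted probabilities (percent) for ranks 1-10, copied from the listing.
top_probabilities = [18.9, 14.5, 6.9, 4.9, 4.0, 3.6, 3.6, 3.6, 3.3, 2.4]


def journals_needed(
    percentages: Sequence[float], threshold: float = 50.0
) -> Optional[Tuple[int, float]]:
    """Return (count, cumulative %) for the smallest prefix reaching the threshold.

    Returns None if the threshold is never reached.
    """
    total = 0.0
    for count, value in enumerate(percentages, start=1):
        total += value
        if total >= threshold:
            return count, total
    return None


result = journals_needed(top_probabilities)
if result is not None:
    count, total = result
    print(f"{count} journals reach {total:.1f}% of the probability mass")
```

With the rounded values shown, the first five journals sum to about 49.2% and the sixth brings the total to about 52.8%, which is consistent with the cutoff marker placed after rank 6.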