Machine learning model for the detection of non-alcoholic fatty liver disease (NAFLD) in the general population
Refereed conference paper presented and published in conference proceedings

Other information
Abstract
Background: Non-alcoholic fatty liver disease (NAFLD) is an increasingly important cause of hepatocellular carcinoma. Its prevalence is rising, and it affects 20–40% of the general population in developed countries. NAFLD is currently the second leading indication for liver transplantation in the United States. Epidemiological studies on NAFLD are therefore of great importance. To facilitate such studies, we aimed to develop a novel yet simple machine learning model, based on routine clinical and laboratory parameters, to detect NAFLD in the general population. Data from a population screening study in Hong Kong from 2008 to 2010 were used to build the NAFLD prediction model.

Methods: A total of 146 NAFLD patients and 354 healthy subjects without NAFLD, with NAFLD status determined by proton-magnetic resonance spectroscopy, were included in the training group used to develop the NAFLD prediction model. Four machine learning algorithms, namely logistic regression, ridge regression, AdaBoost and bagged decision trees, were each used to derive the predictors for the NAFLD prediction model from among 23 routine clinical and laboratory parameters. The performance of the four algorithms was compared using the area under the receiver operating characteristic curve (AUC) in the training group and in a validation group comprising another 118 NAFLD patients and 304 healthy subjects.
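As an illustration of the comparison metric named above, the AUC can be read as the probability that a randomly chosen NAFLD patient receives a higher model score than a randomly chosen healthy subject. The following is a minimal sketch of that pairwise definition in plain Python. The function name `auc` and the toy scores are hypothetical and are not taken from the study; the abstract does not state how the authors computed the AUC.

```python
def auc(scores_pos, scores_neg):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores_pos: model scores for subjects with the condition
    scores_neg: model scores for subjects without the condition
    A tie between a positive and a negative score counts as half a win.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical toy scores, for illustration only (not study data).
toy_pos = [0.9, 0.6, 0.4]
toy_neg = [0.5, 0.3, 0.2, 0.1]
toy_auc = auc(toy_pos, toy_neg)
```

The double loop is quadratic in the number of subjects, which is acceptable for cohorts of a few hundred; a rank-based formulation would scale better for larger data sets.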

Results: The NAFLD prediction model fitted by logistic regression gave an AUC of 0.87 (95% CI 0.83–0.90) in the training group; the corresponding AUC in the validation group was 0.88 (0.84–0.91), the highest among the four algorithms. This model comprised 6 predictors: alanine transaminase, high-density lipoprotein, white blood cell count, hemoglobin A1c, triglyceride and the presence of hypertension. Two cut-offs were provided for this model, the lower to achieve 90% sensitivity and the higher to achieve 90% specificity. At the lower cut-off of 0.24, the model detected NAFLD patients with sensitivity of 91% (85–95%), specificity of 66% (61–71%), PPV of 52% (46–59%) and NPV of 95% (91–97%) in the training group, and with sensitivity of 94% (88–97%), specificity of 64% (58–69%), PPV of 50% (44–57%) and NPV of 97% (93–98%) in the validation group. At the higher cut-off of 0.39, the model detected NAFLD patients with sensitivity of 55% (46–63%), specificity of 90% (87–93%), PPV of 70% (61–78%) and NPV of 83% (79–86%) in the training group, and with sensitivity of 55% (46–64%), specificity of 91% (87–94%), PPV of 70% (59–79%) and NPV of 84% (79–88%) in the validation group.
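The relationship between the reported figures can be sketched in a few lines of Python. The first function derives PPV and NPV from sensitivity, specificity and prevalence using the standard definitions; applied to the training-group values at the lower cut-off (sensitivity 0.91, specificity 0.66, prevalence 146/500), it gives approximately 52% and 95%, in line with the reported values. The second function shows one plausible way to apply the two cut-offs as a three-way triage. The function names, the category labels and the handling of scores exactly at a cut-off are assumptions for illustration; the abstract does not specify them, and the model coefficients are not given, so the score is taken as an input.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence                # true-positive fraction
    fn = (1.0 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1.0 - prevalence)        # true-negative fraction
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Cut-offs as reported in the abstract.
LOWER_CUTOFF = 0.24
HIGHER_CUTOFF = 0.39


def triage(score):
    """Three-way reading of a model score (labels are hypothetical).

    Assumption: scores below the lower cut-off rule NAFLD out, scores at
    or above the higher cut-off rule it in, and the rest are indeterminate.
    """
    if score < LOWER_CUTOFF:
        return "NAFLD unlikely"
    if score >= HIGHER_CUTOFF:
        return "NAFLD likely"
    return "indeterminate"


# Training group: 146 NAFLD patients and 354 healthy subjects.
training_prevalence = 146 / (146 + 354)
ppv_low, npv_low = predictive_values(0.91, 0.66, training_prevalence)
```

Because PPV and NPV depend on prevalence, the same cut-offs would yield different predictive values in a population where NAFLD is more or less common than in these cohorts.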

Conclusion: The proposed NAFLD prediction model is a simple and robust reference that clinicians can use to screen patients with suspected NAFLD for further assessment, and that researchers can use to classify NAFLD patients in epidemiological studies.
All Author(s) List: Terry Cheuk-Fung Yip, Andy Jinhua Ma, Vincent Wai-Sun Wong, Pong-Chi Yuen, Yee-Kit Tse, Henry Lik-Yuen Chan, Grace Lai-Hung Wong
Name of Conference: The 26th Annual Conference of Asian Pacific Association for the Study of the Liver (APASL)
Start Date of Conference: 15/02/2017
End Date of Conference: 19/02/2017
Place of Conference: Shanghai
Country/Region of Conference: China
Proceedings Title: Hepatology International
Volume Number: 11
Issue Number: Suppl 1
Pages: S950–S951
Languages: English-United Kingdom

Last updated on 2019-21-10 at 17:09