Artificial Intelligence in Health
ORIGINAL RESEARCH ARTICLE
Application of supervised and semi-supervised
learning prediction models to predict
progression to cirrhosis in chronic hepatitis C
Yueying Hu 1†, Weijing Tang 2†, Lauren A. Beste 3,4†, Grace L. Su 5,6,
George N. Ioannou 7,8, Tony Van 9, Ji Zhu 10†, and Akbar K. Waljee 9,11,12†*
1 Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan,
United States of America
2 Department of Statistics and Data Science, Dietrich College of Humanities and Social Sciences,
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
3 General Medicine Service, Veterans Affairs Puget Sound Healthcare System, Seattle, Washington,
United States of America
4 Department of Medicine, Veterans Affairs Puget Sound Healthcare System, Seattle, Washington,
United States of America
5 Gastroenterology Service, VA Ann Arbor Healthcare System, Ann Arbor, Michigan, United States
of America
6 Department of Internal Medicine, Michigan Medicine, Ann Arbor, Michigan, United States of America
7 Gastroenterology Service, Veterans Affairs Puget Sound Healthcare System, Seattle, Washington,
United States of America
8 Division of Gastroenterology, School of Medicine, University of Washington, Seattle, Washington,
United States of America
† These authors contributed equally to this work.
*Corresponding author: Akbar Waljee (awaljee@med.umich.edu)
Citation: Hu Y, Tang W, Beste LA, et al. Application of supervised and semi-supervised learning prediction models to predict progression to cirrhosis in chronic hepatitis C. Artif Intell Health. 2025;2(2):87-99. doi: 10.36922/aih.4671
Received: August 27, 2024
Revised: October 31, 2024
Accepted: December 19, 2024
Published online: January 2, 2025
Copyright: © 2025 Author(s). This is an Open-Access article distributed under the terms of the Creative Commons Attribution License, permitting distribution, and reproduction in any medium, provided the original work is properly cited.
Publisher’s Note: AccScience Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Abstract
In this study, we aim to examine the efficacy of deep learning methods, in comparison with conventional models, in predicting the 1-year risk of developing cirrhosis, as defined by transient elastography (TE), in patients with chronic hepatitis C (CHC). We also assess whether semi-supervised learning can improve performance relative to supervised learning when labels are limited. We used the electronic health records of 169,317 valid patients in the Veterans Health Administration system from 2000 to 2016. Predictor variables comprised baseline characteristics (such as age, gender, race, and hepatitis C virus genotype) and 26 liver-related longitudinal variables, such as sustained virologic response and laboratory data. The response variable, developing cirrhosis, is defined as liver stiffness >12.5 kPa on TE within a 1-year window. Using baseline and longitudinal variables, we fitted four prediction models, namely logistic regression (LR), random forest (RF), supervised recurrent neural network (RNN), and semi-supervised RNN (semi-RNN), and evaluated their performance. Both RNN (area under the receiver operating characteristic curve [AuROC] 0.744) and semi-RNN (AuROC 0.785) accurately predicted the risk of cirrhosis within 1 year and significantly outperformed RF (AuROC 0.731) and LR (AuROC 0.724). By enabling early identification of high-risk patients, these models hold promise for targeted interventions in clinical CHC treatment.

Keywords: Semi-supervised learning; Electronic health records; Longitudinal predictors
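
To make the outcome definition and the evaluation metric in the abstract concrete, the following minimal Python sketch labels a patient as positive when a TE reading exceeds 12.5 kPa and computes AuROC from its pairwise-ranking definition. The function names (`label_cirrhosis`, `auroc`), the toy stiffness values, and the risk scores are hypothetical and are not taken from the study. The sketch also assumes each reading already falls within the 1-year window. It is not the authors' pipeline.

```python
from typing import Sequence

# Threshold stated in the abstract: liver stiffness > 12.5 kPa on TE.
CIRRHOSIS_KPA = 12.5


def label_cirrhosis(stiffness_kpa: float) -> int:
    """Return 1 if the TE reading exceeds the threshold, else 0.

    Assumption: the reading was taken within the 1-year outcome window.
    """
    return int(stiffness_kpa > CIRRHOSIS_KPA)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AuROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: TE readings (kPa) and model-predicted risks.
stiffness = [6.1, 14.2, 9.8, 12.5, 20.3, 7.4]
risk = [0.10, 0.80, 0.35, 0.70, 0.65, 0.20]

labels = [label_cirrhosis(k) for k in stiffness]
print(labels)               # [0, 1, 0, 0, 1, 0]
print(auroc(labels, risk))  # 0.875
```

The reading of exactly 12.5 kPa is labeled negative because the stated criterion is a strict inequality. The pairwise form of AuROC is quadratic in the number of patients, so for a cohort of the size reported here a rank-based or library implementation would be the practical choice.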
Volume 2 Issue 2 (2025) 87 doi: 10.36922/aih.4671

