
Artificial Intelligence in Health                                         Cirrhosis prediction in hepatitis C




            9 Center for Clinical Management Research, VA Ann Arbor Healthcare System, Ann Arbor, Michigan, United States of America
            10 Department of Statistics, College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, Michigan, United States of America
            11 Center for Global Health and Equity, University of Michigan, Ann Arbor, Michigan, United States of America
            12 Department of Learning Health Sciences, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, United States of America



1. Introduction

Chronic hepatitis C (CHC) is a leading cause of cirrhosis, liver transplantation, and hepatocellular carcinoma. The development of cirrhosis in patients with CHC is highly variable and non-linear.¹ Many cofactors influence the risk of cirrhosis in patients with CHC, including patient characteristics (e.g., alcohol use, obesity, and age), viral factors (e.g., genotype), successful antiviral treatment, and others.² Reliable models to predict cirrhosis risk are needed to facilitate population-level screening and guide treatment decision-making. Machine learning methods to predict cirrhosis development offer greater flexibility than traditional predictive methods because they can accommodate large numbers of predictor variables and can handle data with complex inter-variable relationships and irregular collection intervals.

Traditional methods for predicting cirrhosis are subject to several limitations. Liver biopsy, which is considered the gold standard for diagnosis, is invasive and poorly scalable for large populations. Non-invasive markers of liver disease, such as the AST-to-platelet ratio index (APRI), the fibrosis-4 index (FIB-4), transient elastography (TE), and others, offer only single snapshots in time and do not account for longitudinal changes. Fibrosis assessment is not performed across large populations at consistent intervals or in all patients. Moreover, few previous models examining patients with CHC evaluate the risk of continued liver disease progression after antiviral therapy (e.g., due to comorbid liver disease that persists after CHC eradication). In addition, the variable rate of fibrosis progression over time complicates the development of reliable risk prediction models using conventional methods.

Machine learning is a form of artificial intelligence that uses computer algorithms to identify patterns in large datasets and allows computers to assimilate new information without being explicitly programmed.³ It has demonstrated success in many real-world healthcare applications, including computer-aided interpretation of liver imaging, prediction of hepatocellular carcinoma risk, and prediction of cirrhosis development.⁴⁻⁶ For example, conventional machine learning models such as logistic regression (LR) and random forest (RF) have been shown to effectively predict disease progression in CHC by incorporating both baseline predictors and summary statistics of longitudinal predictors, with RF able to capture non-linear trends that LR represents only to a limited extent.⁷ Recent advances in deep learning, a subtype of machine learning, have seen the emergence of the deep recurrent neural network (RNN) as a powerful tool to process sequential data collected at various times.⁸ The RNN architecture has shown superior performance for applications such as machine translation, and it can be applied flexibly in both supervised learning and semi-supervised learning. In supervised learning, we use training data with known outcomes ("labeled data") to train an algorithm that can make accurate predictions for new, unseen data. In contrast, semi-supervised learning uses both labeled data and data with missing outcomes ("unlabeled data"), where the unlabeled data can help identify relevant patterns. Semi-supervised learning can help improve prediction performance, especially when labeled data are scarce. Both supervised RNN and semi-supervised RNN (semi-RNN) offer advantages over conventional methods such as LR, because they can handle varying time durations and irregular time gaps between consecutive visits, and they can automatically learn predictive patterns from raw data rather than requiring pre-specified feature extraction.

Machine learning methods have previously demonstrated superior performance compared to linear Cox proportional hazards models in predicting the risk of cirrhosis in patients with CHC, as defined based on APRI score thresholds.⁴ However, it remains unknown whether machine learning methods perform well in predicting progression to cirrhosis, as defined by TE, a far more sensitive modality for assessing liver fibrosis. Although TE has become more widely used in the last decade, it is still used in only a minority of patients with CHC. Therefore, we hypothesized that outcomes from patients who underwent TE could be used to train a model to accurately predict the 1-year risk of developing cirrhosis in CHC patients, as defined by TE. The Veterans Health Administration (VHA) serves the largest single cohort of CHC patients in the United States. Our analysis aimed to evaluate the predictive performance of deep learning methods and compare it to that of conventional models. Moreover, we aimed to assess whether semi-RNN can obtain better performance than a supervised RNN when the number of patients with TE outcomes is limited.
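
The Introduction contrasts two ways of presenting longitudinal predictors to a model: fixed-length summary statistics, as used by LR and RF, and the raw visit sequence with its irregular time gaps, which an RNN can process directly. As a minimal sketch of that distinction, the Python fragment below builds both representations from one hypothetical laboratory series. The function names, the choice of statistics, and the platelet values are illustrative assumptions, not the features or data used in this study.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical visit record: (day_of_visit, lab_value).
# Assumption: visits are non-empty and already sorted by day.
Visit = Tuple[float, float]


def summary_features(visits: List[Visit]) -> Dict[str, float]:
    """Collapse a series into fixed-length summary statistics (LR/RF-style input)."""
    days = [day for day, _ in visits]
    values = [value for _, value in visits]
    span = days[-1] - days[0]
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "last": values[-1],
        # Average change per day from first to last visit (0.0 for a single visit).
        "slope": (values[-1] - values[0]) / span if span > 0 else 0.0,
    }


def sequence_with_gaps(visits: List[Visit]) -> List[Tuple[float, float]]:
    """Keep every visit, pairing each value with the days since the previous visit."""
    steps = []
    prev_day = visits[0][0]
    for day, value in visits:
        steps.append((value, day - prev_day))
        prev_day = day
    return steps


# Made-up platelet counts measured at irregular intervals (illustration only).
platelets = [(0.0, 210.0), (30.0, 195.0), (200.0, 150.0), (215.0, 148.0)]
features = summary_features(platelets)   # one fixed-length record per patient
sequence = sequence_with_gaps(platelets)  # one (value, gap) step per visit
```

The summary record discards the timing of individual visits, whereas the sequence keeps it: here the third step carries a gap of 170 days and the fourth a gap of 15 days. A sequence model could take such (value, gap) pairs as its per-visit input, although how the time gap is actually encoded is a design choice this sketch does not settle.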


            Volume 2 Issue 2 (2025)                         88                               doi: 10.36922/aih.4671