
Artificial Intelligence in Health                                         Cirrhosis prediction in hepatitis C



2. Data and methods

2.1. Data source

The national VHA system is the largest integrated healthcare system in the United States. It includes 172 medical centers and 1,069 outpatient sites of care, serving 9 million enrollees.⁹ All data were obtained from the VA Corporate Data Warehouse, which is a comprehensive repository of data from the VA’s universal electronic medical record system, including laboratory data, biometric data, diagnoses, and pharmacy data.¹⁰

All study procedures were approved by the VA Ann Arbor Institutional Review Board. All procedures conform to the ethical guidelines of the 1975 Declaration of Helsinki. A waiver of informed patient consent was obtained before project initiation.

2.2. Study population

We identified 182,747 VHA users with a history of positive HCV RNA tests seen in the VHA at least once between January 2000 and January 2016. Patients were followed from the date of the first APRI (enrollment) to their last visit recorded in the VA system through January 2019. To ensure that patients did not have cirrhosis at enrollment, we included only patients with APRI results <2.0 (72% negative predictive value for cirrhosis in CHC) at enrollment.¹¹ Because antiviral treatment outcome is a key predictor of cirrhosis development, we excluded an additional 13,430 patients who received antiviral treatment regimens but lacked RNA tests in VHA electronic records to document whether sustained virologic response (SVR) was achieved. After these exclusions, the cohort contained 169,317 patients, of whom 10,575 had undergone TE after enrollment. Finally, since we aimed to develop longitudinal models predicting the development of cirrhosis over a 1-year period, we excluded TE results for 297 patients who had less than 1 year of available follow-up time between enrollment and their last available TE. This resulted in a final analytic cohort of 10,278 patients with valid TE results (the “labeled cohort”) for 1-year prediction and a cohort of 159,039 patients without TE results (the “unlabeled cohort”).

2.3. Progression to cirrhosis defined by TE

TE was introduced into the VHA system in 2013 for the non-invasive assessment of fibrosis and can be considered a reliable measure for cirrhosis outcome. Our primary outcome, the development of cirrhosis, was defined based on liver stiffness >12.5 kPa on TE measured at least once in the VHA data. The earliest date of liver stiffness >12.5 kPa on available TEs was defined as the date of cirrhosis.

2.4. Predictor variables

Predictor variables for cirrhosis development were selected based on our previous research and biological plausibility. We employed both baseline and longitudinal variables for our analysis. The baseline predictors consisted of age at enrollment, gender, race, and HCV genotype. The longitudinal predictors, which may be assessed multiple times, included achievement of SVR, body mass index, and 24 laboratory blood tests. Achievement of SVR was defined as a serum HCV RNA viral load below the lower limit of detection, measured at least 12 weeks after the end of HCV treatment; we identified all antiviral treatment regimens received, including both interferon and direct-acting antiviral-based therapies. The blood tests used in this study included total bilirubin, aspartate aminotransferase (AST), alanine aminotransferase (ALT), alpha-fetoprotein (AFP), alkaline phosphatase (ALP), albumin, AST:ALT ratio, FIB-4, APRI, blood urea nitrogen, creatinine, glucose, international normalized ratio (INR), hemoglobin, leukocyte count, platelet count (PLT), sodium, potassium, chloride, and total protein. FIB-4 and APRI scores were defined using published formulae to assess the degree of liver fibrosis.¹² In addition, the laboratory values of AST, ALT, AFP, and ALP, which were measured through standardized blood tests, were divided by the corresponding upper limits of normal to account for differences in reference ranges across laboratories.

2.5. Cohort building

Labeled patients were followed from enrollment (time 0) to the date of the last available TE or the date of diagnosis of cirrhosis through TE, if applicable. Unlabeled patients (i.e., those without TE outcomes) were followed from enrollment to the last visit documented in the VHA records (Figure 1). The training cohort was created by randomly selecting visit dates from the patient’s follow-up records. This approach simulates the scenario in which we aim to predict the risk of cirrhosis within a year of a clinical visit based on a patient’s medical history.

2.5.1. Labeled cohort for supervised learning

All patients with known cirrhosis outcomes by TE were included in this cohort (Figure 1). The models used baseline predictors as well as the entire trajectory of the longitudinal predictors from enrollment to their sampled visit time t. The outcome measured whether the patient developed cirrhosis within 1 year, starting from time t.

2.5.1.1. Cases

There were 2,247 patients in the labeled cohort who developed cirrhosis during follow-up according to their
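As a quick consistency check on the cohort counts reported in Section 2.2, the exclusion steps can be replayed as plain arithmetic. This is only an illustrative sketch: the counts are taken from the text, while the variable names are assumptions introduced here.

```python
# Counts are taken from Section 2.2; variable names are illustrative only.
hcv_rna_positive = 182_747       # VHA users with a positive HCV RNA test
treated_without_rna = 13_430     # treated, but no RNA test to document SVR
te_after_enrollment = 10_575     # underwent TE after enrollment
te_under_one_year = 297          # <1 year between enrollment and last TE

# Cohort remaining after the treatment-documentation exclusion.
after_exclusions = hcv_rna_positive - treated_without_rna

# Patients with valid TE results for 1-year prediction.
labeled_cohort = te_after_enrollment - te_under_one_year

# Everyone else in the cohort is treated as having no usable TE outcome.
unlabeled_cohort = after_exclusions - labeled_cohort

print(after_exclusions, labeled_cohort, unlabeled_cohort)
# prints: 169317 10278 159039
```

The arithmetic suggests that the 297 patients whose TE results were excluded are counted within the unlabeled cohort, since 169,317 minus 10,278 gives the reported 159,039.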

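The fibrosis scores and the upper-limit-of-normal scaling described in Section 2.4 can be sketched as follows. The text only states that FIB-4 and APRI follow published formulae; the commonly published forms are assumed here, and the function names, argument names, and example values are hypothetical.

```python
import math


def apri(ast_u_l, ast_uln_u_l, platelets_1e9_l):
    """APRI, assumed form: (AST / upper limit of normal) / platelets (10^9/L) * 100."""
    return (ast_u_l / ast_uln_u_l) / platelets_1e9_l * 100.0


def fib4(age_years, ast_u_l, alt_u_l, platelets_1e9_l):
    """FIB-4, assumed form: age * AST / (platelets (10^9/L) * sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def uln_ratio(value, upper_limit_of_normal):
    """Scale a lab value by its laboratory-specific upper limit of normal."""
    return value / upper_limit_of_normal


# Hypothetical patient: AST 80 U/L (ULN 40), ALT 64 U/L, platelets 200 x 10^9/L, age 50.
example_apri = apri(80, 40, 200)       # (80 / 40) / 200 * 100 = 1.0
example_fib4 = fib4(50, 80, 64, 200)   # 50 * 80 / (200 * 8) = 2.5
eligible_at_enrollment = example_apri < 2.0  # enrollment criterion from Section 2.2
```

Dividing by the upper limit of normal puts values from laboratories with different reference ranges on a common scale, which is the stated reason for normalizing AST, ALT, AFP, and ALP.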

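The outcome definition in Section 2.3 and the visit sampling in Sections 2.5 and 2.5.1 can be illustrated with a small sketch. It assumes that TE results are available as (date, kPa) pairs, that "within 1 year" means 365 days after the sampled visit time t, and that visits sampled after the cirrhosis date are not labeled as cases; none of these details are specified in the text, and all names are hypothetical.

```python
import random
from datetime import date, timedelta

CIRRHOSIS_KPA = 12.5               # liver stiffness threshold from Section 2.3
HORIZON = timedelta(days=365)      # assumption: 1 year taken as 365 days


def cirrhosis_date(te_results):
    """Earliest TE date with stiffness > 12.5 kPa, or None if never exceeded."""
    dates = [d for d, kpa in te_results if kpa > CIRRHOSIS_KPA]
    return min(dates) if dates else None


def sample_visit(visit_dates, rng):
    """Randomly pick one visit date from a patient's follow-up records."""
    return rng.choice(sorted(visit_dates))


def history_up_to(t, measurements):
    """Keep the longitudinal trajectory from enrollment up to visit time t."""
    return [(d, v) for d, v in measurements if d <= t]


def develops_cirrhosis_within_year(t, te_results):
    """True if the cirrhosis date falls in the year following visit time t."""
    onset = cirrhosis_date(te_results)
    return onset is not None and t < onset <= t + HORIZON


# Hypothetical patient with three TE measurements.
te = [(date(2014, 3, 1), 8.0), (date(2015, 6, 1), 14.2), (date(2016, 1, 10), 15.0)]
visits = [date(2013, 1, 1), date(2014, 9, 1)]

t = sample_visit(visits, random.Random(0))
label = develops_cirrhosis_within_year(t, te)
```

For this hypothetical patient the cirrhosis date is 1 June 2015, so a visit sampled on 1 September 2014 is labeled positive, whereas one sampled on 1 January 2013 is not. How visits with less than a year of subsequent TE follow-up are handled is left out of this sketch.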
            Volume 2 Issue 2 (2025)                         89                               doi: 10.36922/aih.4671