
Artificial Intelligence in Health





ORIGINAL RESEARCH ARTICLE
Machine learning-driven prediction of EBNA1 inhibitors against Epstein–Barr virus in nasopharyngeal carcinoma



Lavinia Clarisa Wicklem¹, Siaw San Hwang¹, Bee Theng Lau¹, Mrinal Bhave², and Xavier Wezen Chee¹*
¹ Science Programme, School of Engineering and Science, Swinburne University of Technology (Sarawak Campus), Kuching, Sarawak, Malaysia
² Department of Chemistry and Biotechnology, School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Melbourne, Victoria, Australia




Abstract
Nasopharyngeal carcinoma (NPC), particularly prevalent in regions such as Malaysia, is a significant health concern often linked to Epstein–Barr virus (EBV) infection. The EBV nuclear antigen 1 (EBNA1), crucial for EBV survival and NPC tumorigenicity, has emerged as a potential therapeutic target for EBV-positive NPC. In this study, we utilized quantitative structure-activity relationship (QSAR) models to predict potential inhibitors of EBNA1. These models were developed based on the molecular fingerprints of known EBNA1 inhibitors, using both classification and regression approaches. Our QSAR classification models demonstrated consistently high precision, recall, F1 score, and accuracy scores across the training set. The top-performing models, constructed using logistic regression algorithms, achieved perfect precision scores of 1.000 in the test set evaluation. These models’ recall, F1 score, and accuracy scores were 0.571, 0.727, and 0.667, respectively. On the other hand, the best-performing model among the regression models was built using the sequential minimal optimization regression algorithm, achieving a correlation coefficient of 0.703. The mean absolute error and root mean square error of this QSAR regression model were 0.173 and 0.217, respectively, whereas the relative absolute error was 0.689. We screened the Enamine advanced compound library using this regression model to predict compounds with potential EBNA1 inhibitory effects. This led to the identification of the top 10 compounds with the most promising predicted EBNA1 inhibitory properties.

Keywords: Epstein–Barr virus nuclear antigen 1; Nasopharyngeal carcinoma; Quantitative structure-activity relationship; Inhibitor; Machine learning; Compound screening

*Corresponding author: Xavier Wezen Chee (xchee@swinburne.edu.my)

Citation: Wicklem LC, Hwang SS, Lau BT, Bhave M, Chee XW. Machine learning-driven prediction of EBNA1 inhibitors against Epstein–Barr virus in nasopharyngeal carcinoma. Artif Intell Health. 2025;2(1):93-104. doi: 10.36922/aih.4375

Received: July 30, 2024
Revised: September 10, 2024
Accepted: September 23, 2024
Published Online: November 8, 2024

Copyright: © 2024 Author(s). This is an Open-Access article distributed under the terms of the Creative Commons Attribution License, permitting distribution, and reproduction in any medium, provided the original work is properly cited.

Publisher’s Note: AccScience Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1. Introduction
The drug discovery process involves several stages, starting with the identification of disease targets and the search for small molecules that can modulate these targets. This often involves testing thousands to millions of compounds in various assays, with only a few progressing to animal testing and pre-clinical studies.¹ In short, developing new and effective drugs is tedious, requiring millions of dollars and spanning over a decade.²
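
As an aside on the classification metrics quoted in the abstract, the short Python sketch below shows how precision, recall, F1 score, and accuracy follow from confusion-matrix counts. The counts used here (4 true positives, 0 false positives, 3 false negatives, 2 true negatives) are an assumption chosen only because they reproduce the reported test-set values of 1.000, 0.571, 0.727, and 0.667; the abstract does not state the confusion matrix or the test-set size, and the names `ConfusionCounts` and `classification_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (inhibitor = positive class)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard binary metrics; assumes all denominators are non-zero."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# ASSUMPTION: illustrative counts, not taken from the article.
counts = ConfusionCounts(tp=4, fp=0, fn=3, tn=2)
metrics = {
    name: round(value, 3)
    for name, value in classification_metrics(counts).items()
}
print(metrics)
```

Under these assumed counts, a precision of 1.000 alongside a recall of 0.571 corresponds to a model that raises no false alarms but misses some true inhibitors, which is one way to read the reported combination of scores.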

            Volume 2 Issue 1 (2025)                         93                               doi: 10.36922/aih.4375