Artificial Intelligence in Health
ORIGINAL RESEARCH ARTICLE
Machine learning-driven prediction of EBNA1
inhibitors against Epstein–Barr virus in
nasopharyngeal carcinoma
Lavinia Clarisa Wicklem 1, Siaw San Hwang 1, Bee Theng Lau 1,
Mrinal Bhave 2, and Xavier Wezen Chee 1*
1 Science Programme, School of Engineering and Science, Swinburne University of Technology
(Sarawak Campus), Kuching, Sarawak, Malaysia
2 Department of Chemistry and Biotechnology, School of Science, Computing and Engineering
Technologies, Swinburne University of Technology, Melbourne, Victoria, Australia
Abstract
Nasopharyngeal carcinoma (NPC), particularly prevalent in regions such as Malaysia,
is a significant health concern often linked to Epstein-Barr virus (EBV) infection. The
EBV nuclear antigen 1 (EBNA1), crucial for EBV survival and NPC tumorigenicity,
has emerged as a potential therapeutic target for EBV-positive NPC. In this study,
we utilized quantitative structure-activity relationship (QSAR) models to predict
potential inhibitors of EBNA1. These models were developed based on the molecular
fingerprints of known EBNA1 inhibitors, using both classification and regression
approaches. Our QSAR classification models demonstrated consistently high
precision, recall, F1 score, and accuracy across the training set. The
top-performing models, constructed using logistic regression algorithms, achieved
perfect precision scores of 1.000 in the test set evaluation. These models’ recall, F1
score, and accuracy were 0.571, 0.727, and 0.667, respectively. On the other
hand, the best-performing regression model was built using
the sequential minimal optimization regression algorithm, achieving a correlation
coefficient of 0.703. The mean absolute error and root mean square error of this QSAR
regression model were 0.173 and 0.217, respectively, whereas the relative absolute
error was 0.689. We screened the Enamine Advanced compound library using this
regression model to predict compounds with potential EBNA1 inhibitory effects. This
led to the identification of the top 10 compounds with the most promising predicted
EBNA1 inhibitory properties.

Keywords: Epstein-Barr virus nuclear antigen 1; Nasopharyngeal carcinoma; Quantitative
structure-activity relationship; Inhibitor; Machine learning; Compound screening

*Corresponding author: Xavier Wezen Chee (xchee@swinburne.edu.my)
Citation: Wicklem LC, Hwang SS, Lau BT, Bhave M, Chee XW. Machine learning-driven
prediction of EBNA1 inhibitors against Epstein–Barr virus in nasopharyngeal carcinoma.
Artif Intell Health. 2025;2(1):93-104. doi: 10.36922/aih.4375
Received: July 30, 2024
Revised: September 10, 2024
Accepted: September 23, 2024
Published Online: November 8, 2024
Copyright: © 2024 Author(s). This is an Open-Access article distributed under the
terms of the Creative Commons Attribution License, permitting distribution and
reproduction in any medium, provided the original work is properly cited.
Publisher’s Note: AccScience Publishing remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

1. Introduction
The drug discovery process involves several stages, starting with the identification of
disease targets and the search for small molecules that can modulate these targets. This
often involves testing thousands to millions of compounds in various assays, with only a
few progressing to animal testing and pre-clinical studies.¹ Consequently, developing new
and effective drugs is a tedious process that costs millions of dollars and spans over a
decade.²

