
Tumor Discovery                                                  Drug repurposing for pancreatic cancer via AI



2. Materials and methods

2.1. Overview of PDAC and healthy control genome-wide genetic and epigenetic networks using systems biology approach

In this study, we aim to establish the GWGENs of PDAC and non-PDAC core genomes. Microarray data for PDAC and non-PDAC were obtained from the National Center for Biotechnology Information (NCBI) under accession number GSE183795. Four processes were then conducted to identify the core signaling pathways of candidate GWGENs, as illustrated in Figure 1 and detailed below.
i.   Construction of candidate GWGENs: We utilized a data mining approach to construct Boolean matrices representing candidate protein-protein interaction networks (PPINs) and candidate gene regulatory networks (GRNs), which include interactions among proteins and regulation among genes, microRNAs (miRNAs), and long non-coding RNAs (lncRNAs). Specifically, if an interaction or regulation exists between two nodes, it is denoted as 1; if not, it is denoted as 0.
ii.  Identification of real GWGENs: We employed PDAC and non-PDAC (control) microarray data to construct real GWGENs, identifying parameters for protein-protein interaction (PPI) models and GRN regulatory models by solving constrained linear least squares parameter estimation problems. To address potential false positive interactions in the candidate GWGENs, we pruned these false positives using the Akaike Information Criterion (AIC) system order identification method, obtaining real GWGENs for PDAC and non-PDAC.
iii. Extraction of core GWGENs: We applied the PNP method to extract core GWGENs from the real GWGENs. The PNP method calculates the projection value of each node in the real GWGEN to capture 85% of the network’s energy, sorting the projection values of all nodes from highest to lowest. Given the maximum allowable annotated node count of 6,000 (as per KEGG pathways), we selected the top 6,000 significant nodes to form the core GWGEN.
iv.  Construction and comparison of core signaling pathways: We annotated the KEGG pathways of the PDAC and non-PDAC core GWGENs based on relevant literature, completing the construction of core signaling pathways for each. We then compared the upstream microenvironmental factors, core signaling pathways, and corresponding downstream aberrant cellular functions between PDAC and non-PDAC to explore the oncogenic molecular mechanisms of PDAC and identify potential biomarkers as drug targets for PDAC therapeutics.

2.2. Constructing candidate genome-wide genetic and epigenetic networks for PDAC and healthy controls based on big data mining

In this research, we obtained microarray data from the NCBI under accession number GSE183795. The dataset was divided into two groups: the disease group, comprising 139 PDAC samples, and the healthy control group, consisting of 105 non-PDAC samples.

The candidate GWGENs include candidate PPINs and candidate GRNs. We represented the candidate GWGEN using a binary Boolean matrix, where a value of 1 is assigned if an interaction or regulation exists between two nodes, and 0 if it does not. To construct the candidate PPINs, we consulted various databases, including the Database of Interacting Proteins (DIP),¹⁸ IntAct,¹⁹ the Biological General Repository for Interaction Datasets (BioGRID),²⁰ and the Molecular INTeraction Database (MINT).²¹ For the candidate GRNs, we utilized multiple resources such as the Human Transcriptional Regulation Interaction Database (HTRIdb),²² the integrated transcription factor platform (ITFP),²³ TRANSFAC,²⁴ CircuitDB,²⁵ TargetScanHuman,²⁶ and StarBase.²⁷

2.3. Establishing a system model for identifying real genome-wide genetic and epigenetic networks for PDAC and healthy controls based on candidate genome-wide genetic and epigenetic networks

To investigate the oncogenic molecular mechanisms of PDAC, we referenced relevant databases and utilized PDAC microarray data to construct candidate GWGENs.²⁸ Following the establishment of these candidate GWGENs, we employed PDAC microarray data to discern the real GWGENs for PDAC and non-PDAC samples. This process required the development of a stochastic system model to enable candidate GWGENs to capture stochastic interactions and regulations, such as protein-protein interactions, as well as the regulation of transcription factors (TFs), miRNAs, and lncRNAs. Additionally, the stochastic model should account for residuals from the initial model establishment and stochastic noise resulting from experimental measurements. Furthermore, the main protein interaction model in Equation I and the miRNA regulation models in Equations II–IV were designed as bilinear interaction models based on the product of the concentrations of the interacting proteins in Equation I or the regulations of miRNAs on their target mRNAs, miRNAs, or lncRNAs in Equations II–IV. However, for simplification, we presented the interaction and regulation coefficients as linear in PPINs and GRNs.²⁸ First, we established a system model of the interactions involving the w-th protein and other proteins in the candidate PPINs, presented as follows:

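As a side note on the candidate-network representation of Section 2.2 (separate from the system model itself), the Boolean matrix can be sketched as the union of interaction lists taken from several sources. This is a minimal illustration only: the node names, the edge lists, and the function names `build_candidate_matrix` and `symmetrize` are hypothetical and do not come from the databases or the pipeline described here.

```python
import numpy as np

def build_candidate_matrix(nodes, edge_lists):
    """Return a 0/1 matrix C with C[i, j] = 1 if any source lists an edge i -> j."""
    index = {name: k for k, name in enumerate(nodes)}
    C = np.zeros((len(nodes), len(nodes)), dtype=np.uint8)
    for edges in edge_lists:              # one edge list per (hypothetical) source
        for src, dst in edges:
            if src in index and dst in index:   # ignore nodes outside the network
                C[index[src], index[dst]] = 1
    return C

def symmetrize(C):
    """Make the matrix undirected, as is natural for protein-protein interactions."""
    return np.maximum(C, C.T)

# Toy data (assumed, for illustration only)
nodes = ["A", "B", "C", "D"]
source_1 = [("A", "B")]
source_2 = [("A", "B"), ("B", "C"), ("X", "A")]  # "X" is not in the node list

candidate_ppin = symmetrize(build_candidate_matrix(nodes, [source_1, source_2]))
```

Taking the union over sources keeps every reported interaction as a candidate, which matches the role of the candidate network as a superset that is pruned later; treating regulatory edges as directed (skipping `symmetrize`) is an assumption of this sketch.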

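The node-ranking step in item iii of Section 2.1 can also be illustrated in code. This section does not give the details of the PNP method, so the sketch below assumes one plausible reading: take the singular vectors of a network weight matrix whose squared singular values cumulatively reach 85% of the total energy, score each node by the norm of its row projected onto those vectors, and keep the top-ranked nodes. The SVD-based formulation, the matrix `H`, and the function names are assumptions, not the authors' implementation.

```python
import numpy as np

def rank_nodes_by_projection(H, energy=0.85):
    """Rank rows (nodes) of H by their projection onto the leading singular vectors.

    H is an assumed (n_nodes x n_columns) matrix of estimated interaction and
    regulation weights, one row per node.
    """
    _, s, Vt = np.linalg.svd(H, full_matrices=False)
    total = np.sum(s ** 2)
    if total == 0:
        raise ValueError("H has no energy (all-zero matrix)")
    frac = np.cumsum(s ** 2) / total
    # Smallest number of components whose cumulative energy reaches the threshold
    m = min(int(np.searchsorted(frac, energy)) + 1, len(s))
    # Projection value of each node: norm of its row in the m-dimensional subspace
    proj = np.linalg.norm(H @ Vt[:m].T, axis=1)
    order = np.argsort(-proj, kind="stable")   # highest projection value first
    return order, proj, m

def select_core(H, top_k, energy=0.85):
    """Indices of the top_k nodes by projection value."""
    order, _, _ = rank_nodes_by_projection(H, energy)
    return order[:top_k]

# Toy weight matrix (assumed, for illustration only)
H_toy = np.diag([3.0, 2.0, 1.0, 0.1])
core = select_core(H_toy, top_k=2)
```

In the text the cut-off is the top 6,000 nodes; here `top_k` is a free parameter so the same function works on a toy matrix.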
            Volume 4 Issue 1 (2025)                         49                                doi: 10.36922/td.4709