
Tumor Discovery                                                  Drug repurposing for pancreatic cancer via AI



employed the systematic order detection method using the AIC to prune these inaccuracies and derive real GWGENs for both PDAC and non-PDAC.$^{29}$

The system order detection scheme, based on the AIC method, estimates the number of interactions among proteins and the number of regulations involving genes, lncRNAs, and miRNAs within the candidate GWGENs as follows:

$$\mathrm{AIC}(R_w) = \log\left(\Omega_w^2\right) + \frac{2(1 + R_w)}{N_w} \tag{XXI}$$

where $\Omega_w^2 = \dfrac{(\Phi_w \cdot \hat{\Theta}_w - P_w)^T (\Phi_w \cdot \hat{\Theta}_w - P_w)}{N_w}$

where the parameter vector $\hat{\Theta}_w$ can be obtained from Equation (XVII) using the least-squares method. $\Omega_w$ represents the residual estimation of the $w$-th protein model, and $R_w$ is the number of interactions (i.e., the model's complexity) with the $w$-th protein.

$$\mathrm{AIC}(S_x, U_x, V_x) = \log\left(\Omega_x^2\right) + \frac{2(S_x + U_x + V_x + 1)}{N_x} \tag{XXII}$$

where $\Omega_x^2 = \dfrac{(\Phi_x \cdot \hat{\Theta}_x - G_x)^T (\Phi_x \cdot \hat{\Theta}_x - G_x)}{N_x}$

where the parameter vector $\hat{\Theta}_x$ can be obtained from Equation (XVIII) using the least-squares method. $\Omega_x$ represents the residual estimation of the $x$-th gene model, while $S_x$, $U_x$, and $V_x$ represent the number of gene, lncRNA, and miRNA regulations acting on the $x$-th gene, respectively.

$$\mathrm{AIC}(S_y, U_y, V_y) = \log\left(\Omega_y^2\right) + \frac{2(S_y + U_y + V_y + 1)}{N_y} \tag{XXIII}$$

with $\Omega_y^2 = \dfrac{(\Phi_y \cdot \hat{\Theta}_y - L_y)^T (\Phi_y \cdot \hat{\Theta}_y - L_y)}{N_y}$

where the parameter vector $\hat{\Theta}_y$ can be obtained from Equation (XIX) using the least-squares method. $\Omega_y$ represents the residual estimation of the $y$-th lncRNA model, whereas $S_y$, $U_y$, and $V_y$ represent the number of gene, lncRNA, and miRNA regulations acting on the $y$-th lncRNA, respectively.

$$\mathrm{AIC}(S_z, U_z, V_z) = \log\left(\Omega_z^2\right) + \frac{2(S_z + U_z + V_z + 1)}{N_z} \tag{XXIV}$$

where $\Omega_z^2 = \dfrac{(\Phi_z \cdot \hat{\Theta}_z - M_z)^T (\Phi_z \cdot \hat{\Theta}_z - M_z)}{N_z}$

where the parameter vector $\hat{\Theta}_z$ can be obtained from Equation (XX) using the least-squares method. $\Omega_z$ represents the residual estimation of the $z$-th miRNA model, and $S_z$, $U_z$, and $V_z$ represent the number of gene, lncRNA, and miRNA regulations acting on the $z$-th miRNA, respectively.

The AIC method was used to assess the complexity of stochastic models and their fit to data. A smaller value of AIC indicates a better model fit while accounting for the model's complexity. To obtain the real GWGEN, we minimized the four AIC expressions in Equations (XXI)–(XXIV):

$$R_w^* = \arg\min_{R_w} \mathrm{AIC}(R_w) \tag{XXV}$$

$$S_x^*, U_x^*, V_x^* = \arg\min_{S_x, U_x, V_x} \mathrm{AIC}(S_x, U_x, V_x) \tag{XXVI}$$

$$S_y^*, U_y^*, V_y^* = \arg\min_{S_y, U_y, V_y} \mathrm{AIC}(S_y, U_y, V_y) \tag{XXVII}$$

$$S_z^*, U_z^*, V_z^* = \arg\min_{S_z, U_z, V_z} \mathrm{AIC}(S_z, U_z, V_z) \tag{XXVIII}$$

$R_w^*$ represents the actual quantity of protein interactions with the $w$-th protein. $S_x^*$, $U_x^*$, and $V_x^*$ represent the actual number of regulatory TFs, lncRNAs, and miRNAs on the $x$-th gene, respectively. $S_y^*$, $U_y^*$, and $V_y^*$ indicate the actual quantities of regulations between TFs, lncRNAs, and miRNAs on the $y$-th lncRNA, respectively. $S_z^*$, $U_z^*$, and $V_z^*$ denote the actual number of regulatory TFs, lncRNAs, and miRNAs on the $z$-th miRNA, respectively. By minimizing the AIC, we successfully identified the true number of interactions for each protein and the actual number of regulations for each gene, miRNA, and lncRNA, effectively pruning false positives from the candidate GWGENs, and thus obtaining the real GWGENs for both PDAC and non-PDAC, as depicted in Figure S1.

2.5. Extracting core genome-wide genetic and epigenetic networks from real genome-wide genetic and epigenetic networks using the principal network projection method

By employing the AIC method to prune false positive interactions and regulations from candidate GWGENs, real GWGENs for both PDAC and non-PDAC were obtained. However, due to the complexity of real GWGENs from both PDAC and non-PDAC, the carcinogenic molecular mechanisms of PDAC have not yet been fully confirmed. Therefore, it was necessary to leverage KEGG pathways to elucidate the signaling pathways underlying real GWGENs of both PDAC and non-PDAC. Since the current KEGG database can only annotate networks containing up to 6,000 nodes,
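
Returning to the AIC-based order detection of Equations (XXI) and (XXV): the following is a minimal sketch of that step for a single protein model, not the authors' implementation. It assumes the penalty term reads 2(1 + R_w)/N_w, that candidate interactions are already ranked so that models of increasing order are nested (order r keeps the first r columns), and that at least one interaction is retained. The function names `aic_for_order` and `select_order` and the synthetic data are hypothetical.

```python
import numpy as np


def aic_for_order(phi, target, order):
    """AIC of a least-squares fit that keeps the first `order` columns of phi."""
    n_samples = target.shape[0]
    sub = phi[:, :order]
    # Least-squares estimate of the parameter vector for this candidate order.
    theta_hat, *_ = np.linalg.lstsq(sub, target, rcond=None)
    residual = sub @ theta_hat - target
    # Residual variance estimate, playing the role of Omega_w^2.
    omega_sq = float(residual @ residual) / n_samples
    # Assumed form of Equation (XXI): log(Omega^2) + 2(1 + R)/N.
    return float(np.log(omega_sq) + 2.0 * (1 + order) / n_samples)


def select_order(phi, target):
    """Return the order with the smallest AIC and the AIC of every order tried."""
    aics = [aic_for_order(phi, target, r) for r in range(1, phi.shape[1] + 1)]
    best = int(np.argmin(aics)) + 1  # orders are 1-based
    return best, aics


# Synthetic data (hypothetical): 200 samples and 8 candidate interactions,
# of which only the first 3 truly contribute to the target.
rng = np.random.default_rng(0)
phi = rng.normal(size=(200, 8))
true_theta = np.array([2.0, -1.5, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
target = phi @ true_theta + 0.1 * rng.normal(size=200)

best, aics = select_order(phi, target)
print("selected order:", best)
```

Dropping any of the three true interactions inflates the residual term far more than the penalty saves, so the selected order should not fall below 3 here. AIC can still keep an occasional spurious candidate, so the selected order may exceed the true one.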
            Volume 4 Issue 1 (2025)                         54                                doi: 10.36922/td.4709