
Microbes & Immunity                                               Big data and DNN-based DTI model in CHP



Hence, we can use constrained least-squares estimation to estimate the parameter vectors $\beta_k$, $\beta_l$, $\beta_m$, and $\beta_n$, which contain the protein interaction abilities, transcriptional regulatory abilities, post-transcriptional regulatory abilities, and basal levels, from the corresponding microarray data and the DNA methylation, phosphorylation, and ubiquitination profiles. The estimation of the parameter vectors $\beta_k$, $\beta_l$, $\beta_m$, and $\beta_n$ in Equations XIV–XVII can be carried out by solving the following constrained least-squares parameter estimation problems (Equations XVIII–XXI), respectively.¹⁸

$$\hat{\beta}_k = \arg\min_{\beta_k} \frac{1}{2}\left\| \sigma_k \beta_k - S_k \right\|_2^2 \tag{XVIII}$$

$$\hat{\beta}_l = \arg\min_{\beta_l} \frac{1}{2}\left\| \sigma_l \beta_l - T_l \right\|_2^2
\quad \text{subject to} \quad
\left[\begin{array}{cccc}
\underbrace{\begin{matrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{matrix}}_{X_l} &
\underbrace{\begin{matrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{matrix}}_{Y_l} &
\underbrace{\begin{matrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{matrix}}_{Z_l} &
\begin{matrix} 0 \\ \vdots \\ 0 \end{matrix}
\end{array}\right] \beta_l \le \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \tag{XIX}$$

$$\hat{\beta}_m = \arg\min_{\beta_m} \frac{1}{2}\left\| \sigma_m \beta_m - D_m \right\|_2^2
\quad \text{subject to} \quad
\left[\begin{array}{cccc}
\underbrace{\begin{matrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{matrix}}_{X_m} &
\underbrace{\begin{matrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{matrix}}_{Y_m} &
\underbrace{\begin{matrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{matrix}}_{Z_m} &
\begin{matrix} 0 \\ \vdots \\ 0 \end{matrix}
\end{array}\right] \beta_m \le \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \tag{XX}$$

$$\hat{\beta}_n = \arg\min_{\beta_n} \frac{1}{2}\left\| \sigma_n \beta_n - Q_n \right\|_2^2
\quad \text{subject to} \quad
\left[\begin{array}{cccc}
\underbrace{\begin{matrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{matrix}}_{X_n} &
\underbrace{\begin{matrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{matrix}}_{Y_n} &
\underbrace{\begin{matrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{matrix}}_{Z_n} &
\begin{matrix} 0 \\ \vdots \\ 0 \end{matrix}
\end{array}\right] \beta_n \le \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \tag{XXI}$$

The inequality constraints in the above constrained least-squares parameter estimation problems ensure that the post-transcriptional regulatory abilities of miRNAs on genes/miRNAs/lncRNAs are always non-positive. Therefore, we could solve the constrained least-squares parameter estimation problems to obtain the protein interactive parameters, that is, $\hat{\beta}_k$, and the gene/miRNA/lncRNA regulatory parameters, that is, $\hat{\beta}_l$, $\hat{\beta}_m$, and $\hat{\beta}_n$, through the system identification method in the MATLAB optimization toolbox, which is based on a reflective Newton method for minimizing a quadratic function.¹⁸ Through this process, we are able to obtain the optimal estimated parameter vectors for the PPIs and for the gene, lncRNA, and miRNA regulations in the GWGENs of fibrosis slice cells of CHP patients and non-fibrosis slice cells of healthy controls. Since large-scale measurement of expressed cellular proteins is not yet feasible, and mRNA abundance explains 79% of the variance in protein abundance, we can use gene expression microarray data as a proxy for protein expression. By incorporating gene, miRNA, and lncRNA expression data, along with the cross-talk between DNA methylation, phosphorylation, and ubiquitination profiles in CHP patients and healthy controls, we can solve the least-squares parameter estimation problems in Equations XVIII–XXI, using $S_k$, $T_l$, $D_m$, $Q_n$, $\sigma_k$, $\sigma_l$, $\sigma_m$, and $\sigma_n$, to identify the protein interactive abilities, transcriptional regulatory abilities, post-transcriptional regulatory abilities, and basal levels in $\beta_k$, $\beta_l$, $\beta_m$, and $\beta_n$.

Since the candidate GWGEN interactions and regulations were constructed using big text data mining from a substantial variety of databases and experimental datasets, which may include information that is plausible but unverified, it is possible that many false positives of protein interactive abilities, transcriptional regulatory abilities, and post-transcriptional regulatory abilities are included in the candidate GWGEN. To remove false positives among the protein interactive abilities, transcriptional regulatory abilities, and post-transcriptional regulatory abilities estimated in Equations XVIII–XXI, we applied a system order detection scheme¹⁷⁻¹⁹ that excludes these insignificant abilities as false positives, yielding the real GWGENs of CHP and healthy controls.

The Akaike Information Criterion (AIC) is a system order detection method²⁰ that builds on the system identification method used to solve Equations XVIII–XXI in order to detect the real system order. The AICs of the k-th protein of the PPIN, the l-th gene of the GRN, the m-th miRNA of the GRN, and the n-th lncRNA of the GRN are shown in Equations XXII–XXV, respectively.

$$\mathrm{AIC}_k(W_k) = \log\left(\Theta_k^2\right) + \frac{2\,(1 + W_k)}{I} \tag{XXII}$$

$$\mathrm{AIC}_l(X_l, Y_l, Z_l) = \log\left(\Theta_l^2\right) + \frac{2\,(1 + X_l + Y_l + Z_l)}{I} \tag{XXIII}$$

$$\mathrm{AIC}_m(X_m, Y_m, Z_m) = \log\left(\Theta_m^2\right) + \frac{2\,(1 + X_m + Y_m + Z_m)}{I} \tag{XXIV}$$

$$\mathrm{AIC}_n(X_n, Y_n, Z_n) = \log\left(\Theta_n^2\right) + \frac{2\,(1 + X_n + Y_n + Z_n)}{I} \tag{XXV}$$
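The estimation and order-detection steps in Equations XVIII–XXV can be illustrated with a minimal sketch on toy data. Several things here are assumptions rather than the authors' procedure. The function names (`constrained_lstsq`, `aic`, `detect_order`) are hypothetical. A plain projected-gradient loop stands in for the reflective Newton solver of the MATLAB optimization toolbox. $\Theta^2$ is taken to be the mean squared residual and $I$ the number of samples, neither of which is defined on this page. Ranking candidates by the magnitude of their estimated ability before scanning the order is one simple search strategy, not necessarily the one used in the study.

```python
import math

def matvec(A, beta):
    return [sum(a * b for a, b in zip(row, beta)) for row in A]

def residual(A, beta, y):
    return [p - t for p, t in zip(matvec(A, beta), y)]

def constrained_lstsq(A, y, nonpos, iters=500):
    """Minimize 0.5*||A beta - y||^2 subject to beta[j] <= 0 for j in nonpos.

    Projected gradient descent (a stand-in for a reflective Newton solver).
    """
    n = len(A[0])
    # 1/||A||_F^2 never exceeds 1/lambda_max(A^T A), so the iteration is stable.
    step = 1.0 / sum(a * a for row in A for a in row)
    beta = [0.0] * n
    for _ in range(iters):
        r = residual(A, beta, y)
        grad = [sum(row[j] * e for row, e in zip(A, r)) for j in range(n)]
        beta = [b - step * g for b, g in zip(beta, grad)]
        for j in nonpos:  # project onto the constraint set
            beta[j] = min(beta[j], 0.0)
    return beta

def aic(A, y, beta, n_abilities):
    """log(Theta^2) + 2*(1 + n_abilities)/I; the '1' counts the basal level."""
    I = len(y)  # assumed: number of data samples
    theta2 = sum(e * e for e in residual(A, beta, y)) / I  # assumed definition
    return math.log(theta2) + 2.0 * (1 + n_abilities) / I

def detect_order(A, y, nonpos):
    """Scan model orders and keep the one with minimum AIC.

    The last column of A is the basal (all-ones) term; the rest are candidates.
    """
    n_cand = len(A[0]) - 1
    full = constrained_lstsq(A, y, nonpos)
    ranked = sorted(range(n_cand), key=lambda j: -abs(full[j]))
    best = None
    for w in range(1, n_cand + 1):
        keep = sorted(ranked[:w])
        cols = keep + [n_cand]
        A_w = [[row[j] for j in cols] for row in A]
        nonpos_w = [cols.index(j) for j in keep if j in nonpos]
        beta_w = constrained_lstsq(A_w, y, nonpos_w)
        score = aic(A_w, y, beta_w, w)
        if best is None or score < best[0]:
            best = (score, keep, beta_w)
    return best

# Toy data: candidate 0 is a true activator, candidate 2 a true miRNA
# repressor, candidate 1 a spurious miRNA candidate (a false positive).
x0 = [1, 1, 1, 1, -1, -1, -1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, -1, 1, -1, 1, -1, 1, -1]
noise = [0.12, -0.08, -0.12, 0.08, -0.08, 0.12, 0.08, -0.12]
A = [[a, b, c, 1.0] for a, b, c in zip(x0, x1, x2)]
y = [2.0 * a - 1.5 * c + 0.5 + e for a, c, e in zip(x0, x2, noise)]
nonpos = [1, 2]  # miRNA abilities are constrained to be non-positive

beta_full = constrained_lstsq(A, y, nonpos)
score, selected, beta_sel = detect_order(A, y, nonpos)
print("kept candidates:", selected)  # [0, 2]
print("abilities + basal:", [round(b, 3) for b in beta_sel])
```

In this toy case the non-positivity constraint clamps the spurious miRNA ability to zero, and the minimum-AIC order then drops it from the network. With SciPy available, `scipy.optimize.lsq_linear` with `bounds` and its trust-region reflective method would be a more practical solver than the hand-written loop.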
            Volume 2 Issue 2 (2025)                         83                               doi: 10.36922/mi.4620