Microbes & Immunity Big data and DNN-based DTI model in CHP
Hence, we can use constrained least-squares estimation to estimate the parameter vectors $\beta_k$, $\beta_l$, $\beta_m$, and $\beta_n$, which contain the protein interaction abilities, transcriptional regulatory abilities, post-transcriptional regulatory abilities, and basal levels, from the corresponding microarray data and DNA methylation, phosphorylation, and ubiquitination profiles. The estimation of the parameter vectors $\beta_k$, $\beta_l$, $\beta_m$, and $\beta_n$ in Equations XIV–XVII can be posed as the following constrained least-squares parameter estimation problems (Equations XVIII–XXI), respectively.¹⁸

$$\beta_k = \min_{\beta_k} \frac{1}{2}\left\| \sigma_k \cdot \beta_k - S_k \right\|_2^2 \tag{XVIII}$$

$$\beta_l = \min_{\beta_l} \frac{1}{2}\left\| \sigma_l \cdot \beta_l - T_l \right\|_2^2
\quad \text{subject to} \quad
\begin{bmatrix}
0 \cdots 0 & 0 \cdots 0 & 1 \cdots 0 & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\underbrace{0 \cdots 0}_{X_l} & \underbrace{0 \cdots 0}_{Y_l} & \underbrace{0 \cdots 1}_{Z_l} & 0
\end{bmatrix}
\beta_l \le 0 \tag{XIX}$$

$$\beta_m = \min_{\beta_m} \frac{1}{2}\left\| \sigma_m \cdot \beta_m - D_m \right\|_2^2
\quad \text{subject to} \quad
\begin{bmatrix}
0 \cdots 0 & 0 \cdots 0 & 1 \cdots 0 & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\underbrace{0 \cdots 0}_{X_m} & \underbrace{0 \cdots 0}_{Y_m} & \underbrace{0 \cdots 1}_{Z_m} & 0
\end{bmatrix}
\beta_m \le 0 \tag{XX}$$

$$\beta_n = \min_{\beta_n} \frac{1}{2}\left\| \sigma_n \cdot \beta_n - Q_n \right\|_2^2
\quad \text{subject to} \quad
\begin{bmatrix}
0 \cdots 0 & 0 \cdots 0 & 1 \cdots 0 & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\underbrace{0 \cdots 0}_{X_n} & \underbrace{0 \cdots 0}_{Y_n} & \underbrace{0 \cdots 1}_{Z_n} & 0
\end{bmatrix}
\beta_n \le 0 \tag{XXI}$$

The inequality constraints in Equations XIX–XXI ensure that the post-transcriptional regulatory abilities of miRNAs on genes/miRNAs/lncRNAs are always non-positive. Therefore, we can solve the constrained least-squares parameter estimation problems to obtain the protein interactive parameters, that is, $\beta_k$, and the gene/miRNA/lncRNA regulatory parameters, that is, $\beta_l$, $\beta_m$, and $\beta_n$, through the system identification method in the MATLAB optimization toolbox, which is based on a reflective Newton method for minimizing a quadratic function.¹⁸ Through this process, we obtain the optimal estimated parameter vectors for PPIs and for gene, lncRNA, and miRNA regulations in the GWGENs of fibrosis slice cells of CHP patients and non-fibrosis slice cells of healthy controls.

Since large-scale measurement of expressed cellular proteins is not yet feasible, and mRNA abundance explains 79% of the variance in protein abundance, we can use gene expression microarray data as a proxy for protein expression. By incorporating gene, miRNA, and lncRNA expression data, along with DNA methylation, phosphorylation, and ubiquitination profiles in CHP patients and healthy controls, we can obtain $S_k$, $T_l$, $D_m$, $Q_n$, $\sigma_k$, $\sigma_l$, $\sigma_m$, and $\sigma_n$ for the least-squares parameter estimation problems in Equations XVIII–XXI and identify the protein interactive abilities, transcriptional regulatory abilities, post-transcriptional regulatory abilities, and basal levels in $\beta_k$, $\beta_l$, $\beta_m$, and $\beta_n$.

Since the candidate GWGEN interactions and regulations were constructed using big text data mining from a substantial variety of databases and experimental datasets, which may include some information that is only plausible, it is possible that many false positives of protein interactive abilities, transcriptional regulatory abilities, and post-transcriptional regulatory abilities are included in the candidate GWGEN. To remove false positives among the protein interactive abilities, transcriptional regulatory abilities, and post-transcriptional regulatory abilities estimated in Equations XVIII–XXI, we applied a system order detection scheme¹⁷⁻¹⁹ that excludes these insignificant abilities as false positives, yielding the real GWGENs of CHP and healthy control.

The Akaike Information Criterion (AIC) is a system order detection method²⁰ that builds on the system identification method used to solve Equations XVIII–XXI in order to detect the real system order. The AICs of the k-th protein of the PPIN, the l-th gene of the GRN, the m-th miRNA of the GRN, and the n-th lncRNA of the GRN are shown in Equations XXII–XXV, respectively.

$$\mathrm{AIC}(W_k) = \log\left(\Theta_k^2\right) + \frac{2(1 + W_k)}{I} \tag{XXII}$$

$$\mathrm{AIC}(X_l, Y_l, Z_l) = \log\left(\Theta_l^2\right) + \frac{2(1 + X_l + Y_l + Z_l)}{I} \tag{XXIII}$$

$$\mathrm{AIC}(X_m, Y_m, Z_m) = \log\left(\Theta_m^2\right) + \frac{2(1 + X_m + Y_m + Z_m)}{I} \tag{XXIV}$$

$$\mathrm{AIC}(X_n, Y_n, Z_n) = \log\left(\Theta_n^2\right) + \frac{2(1 + X_n + Y_n + Z_n)}{I} \tag{XXV}$$
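As a rough illustration of how the sign-constrained least-squares fit in Equations XIX–XXI and the AIC in Equations XXII–XXV could work together, the following Python sketch fits one hypothetical target gene on synthetic data. It rests on several assumptions that are not taken from the paper. SciPy's `lsq_linear` with its trust-region reflective solver stands in for the MATLAB routine. The non-positivity constraint on miRNA coefficients is written as an upper bound of zero. $\Theta^2$ is taken to be the mean squared residual and $I$ the number of samples. The pruning rule (rank regulators by absolute coefficient, then keep the model order with the lowest AIC) is a simplification chosen for brevity. All function and variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import lsq_linear


def fit_bounded(sigma, y, nonpos_cols):
    """Minimise 0.5 * ||sigma @ beta - y||^2 with beta[nonpos_cols] <= 0."""
    n = sigma.shape[1]
    lower = np.full(n, -np.inf)
    upper = np.full(n, np.inf)
    # Assumption: the inequality constraint is expressed as an upper bound of 0
    # on the miRNA coefficients (miRNAs may only repress).
    upper[np.asarray(nonpos_cols, dtype=int)] = 0.0
    return lsq_linear(sigma, y, bounds=(lower, upper), method="trf").x


def aic(sigma, y, beta, n_regulators):
    """log(residual variance) + 2 * (1 + n_regulators) / I."""
    n_samples = len(y)
    resid = y - sigma @ beta
    theta2 = resid @ resid / n_samples  # assumed definition of Theta^2
    return np.log(theta2) + 2.0 * (1 + n_regulators) / n_samples


def select_by_aic(sigma, y, mirna_cols):
    """Pick the regulator subset with minimum AIC (last column = basal level)."""
    n_reg = sigma.shape[1] - 1
    basal = n_reg
    full = fit_bounded(sigma, y, mirna_cols)
    ranking = np.argsort(-np.abs(full[:n_reg]))  # strongest regulators first
    best = None
    for order in range(n_reg + 1):
        keep = sorted(ranking[:order].tolist())
        cols = keep + [basal]
        nonpos = [i for i, c in enumerate(cols) if c in mirna_cols]
        beta = fit_bounded(sigma[:, cols], y, nonpos)
        score = aic(sigma[:, cols], y, beta, order)
        if best is None or score < best[0]:
            best = (score, keep, beta)
    return best


# Synthetic example: columns 0-1 are TFs, 2 is a lncRNA, 3-4 are miRNAs,
# and the last column of ones carries the basal level.
rng = np.random.default_rng(0)
n_samples = 60
regulators = rng.normal(size=(n_samples, 5))
sigma = np.column_stack([regulators, np.ones(n_samples)])
true_beta = np.array([0.8, 0.0, 0.5, -0.6, 0.0, 1.0])
y = sigma @ true_beta + 0.05 * rng.normal(size=n_samples)

mirna_cols = [3, 4]
beta_full = fit_bounded(sigma, y, mirna_cols)
score, kept, beta_kept = select_by_aic(sigma, y, mirna_cols)
print("full fit:", np.round(beta_full, 3))
print("regulators kept by AIC:", kept)
```

In this toy setting the regulators with zero true effect are candidates for removal, which mirrors how insignificant interactions would be treated as false positives. A real analysis would use the regression matrices and expression vectors defined for Equations XIV–XVII rather than random data.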
Volume 2 Issue 2 (2025) 83 doi: 10.36922/mi.4620

