Tumor Discovery Drug repurposing for pancreatic cancer via AI
employed the systematic order detection method using the AIC to prune these inaccuracies and derive real GWGENs for both PDAC and non-PDAC.²⁹

The system order detection scheme, based on the AIC method, estimates the number of interactions among proteins and the number of regulations involving genes, lncRNAs, and miRNAs within the candidate GWGENs as follows:

$$\mathrm{AIC}(R_w) = \log\left(\Omega_w^2\right) + \frac{2(1 + R_w)}{N_w} \tag{XXI}$$

where

$$\Omega_w^2 = \frac{(\Phi_w \cdot \hat{\Theta}_w - P_w)^T (\Phi_w \cdot \hat{\Theta}_w - P_w)}{N_w}$$

and the parameter vector $\hat{\Theta}_w$ can be obtained from Equation (XVII) using the least-squares method. $\Omega_w$ represents the residual estimation of the w-th protein model, and $R_w$ is the number of interactions (i.e., the model's complexity) with the w-th protein.

$$\mathrm{AIC}(S_x, U_x, V_x) = \log\left(\Omega_x^2\right) + \frac{2(S_x + U_x + V_x + 1)}{N_x} \tag{XXII}$$

where

$$\Omega_x^2 = \frac{(\Phi_x \cdot \hat{\Theta}_x - G_x)^T (\Phi_x \cdot \hat{\Theta}_x - G_x)}{N_x}$$

and the parameter vector $\hat{\Theta}_x$ can be obtained from Equation (XVIII) using the least-squares method. $\Omega_x$ represents the residual estimation of the x-th gene model, while $S_x$, $U_x$, and $V_x$ represent the numbers of gene, lncRNA, and miRNA regulations acting on the x-th gene, respectively.

$$\mathrm{AIC}(S_y, U_y, V_y) = \log\left(\Omega_y^2\right) + \frac{2(S_y + U_y + V_y + 1)}{N_y} \tag{XXIII}$$

with

$$\Omega_y^2 = \frac{(\Phi_y \cdot \hat{\Theta}_y - L_y)^T (\Phi_y \cdot \hat{\Theta}_y - L_y)}{N_y}$$

where the parameter vector $\hat{\Theta}_y$ can be obtained from Equation (XIX) using the least-squares method. $\Omega_y$ represents the residual estimation of the y-th lncRNA model, whereas $S_y$, $U_y$, and $V_y$ represent the numbers of gene, lncRNA, and miRNA regulations acting on the y-th lncRNA, respectively.

$$\mathrm{AIC}(S_z, U_z, V_z) = \log\left(\Omega_z^2\right) + \frac{2(S_z + U_z + V_z + 1)}{N_z} \tag{XXIV}$$

where

$$\Omega_z^2 = \frac{(\Phi_z \cdot \hat{\Theta}_z - M_z)^T (\Phi_z \cdot \hat{\Theta}_z - M_z)}{N_z}$$

and the parameter vector $\hat{\Theta}_z$ can be obtained from Equation (XX) using the least-squares method. $\Omega_z$ represents the residual estimation of the z-th miRNA model, and $S_z$, $U_z$, and $V_z$ represent the numbers of gene, lncRNA, and miRNA regulations acting on the z-th miRNA, respectively.

The AIC method was used to assess the complexity of stochastic models and their fit to data. A smaller value of AIC indicates a better model fit while accounting for the model's complexity. To obtain the real GWGEN, we had to minimize the four AIC expressions in Equations (XXI)–(XXIV):

$$R_w^* = \arg\min_{R_w} \mathrm{AIC}(R_w) \tag{XXV}$$

$$S_x^*, U_x^*, V_x^* = \arg\min_{S_x, U_x, V_x} \mathrm{AIC}(S_x, U_x, V_x) \tag{XXVI}$$

$$S_y^*, U_y^*, V_y^* = \arg\min_{S_y, U_y, V_y} \mathrm{AIC}(S_y, U_y, V_y) \tag{XXVII}$$

$$S_z^*, U_z^*, V_z^* = \arg\min_{S_z, U_z, V_z} \mathrm{AIC}(S_z, U_z, V_z) \tag{XXVIII}$$

$R_w^*$ represents the actual quantity of protein interactions with the w-th protein. $S_x^*$, $U_x^*$, and $V_x^*$ represent the actual numbers of regulatory TFs, lncRNAs, and miRNAs on the x-th gene, respectively. $S_y^*$, $U_y^*$, and $V_y^*$ indicate the actual quantities of regulations between TFs, lncRNAs, and miRNAs on the y-th lncRNA, respectively. $S_z^*$, $U_z^*$, and $V_z^*$ denote the actual numbers of regulatory TFs, lncRNAs, and miRNAs on the z-th miRNA, respectively. By minimizing the AIC, we successfully identified the true number of interactions for each protein and the actual number of regulations for each gene, miRNA, and lncRNA, effectively pruning false positives from the candidate GWGENs, and thus obtaining the real GWGENs for both PDAC and non-PDAC, as depicted in Figure S1.

2.5. Extracting core genome-wide genetic and epigenetic networks from real genome-wide genetic and epigenetic networks using the principal network projection method

By employing the AIC method to prune false positive interactions and regulations from candidate GWGENs, real GWGENs for both PDAC and non-PDAC were obtained. However, due to the complexity of real GWGENs from both PDAC and non-PDAC, the carcinogenic molecular mechanisms of PDAC have not yet been fully confirmed. Therefore, it was necessary to leverage KEGG pathways to elucidate the signaling pathways underlying real GWGENs of both PDAC and non-PDAC. Since the current KEGG database can only annotate networks containing up to 6,000 nodes,
Volume 4 Issue 1 (2025) 54 doi: 10.36922/td.4709
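As an illustration of the AIC-based order detection in Equations (XXI) to (XXVIII), the following is a minimal Python sketch for a single node (for example, one gene). It is not the authors' implementation. The function names `aic_score` and `select_order`, the constant column standing in for the "+1" term, the heuristic of ranking candidate regulators by the magnitude of their full-model least-squares coefficients, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def aic_score(phi, target):
    """AIC of one least-squares model: log(residual variance) + 2 * n_params / N.

    `phi` is the regression matrix (N samples x n_params columns, including the
    constant column), and `target` is the expression vector of the node.
    """
    n_samples, n_params = phi.shape
    theta, *_ = np.linalg.lstsq(phi, target, rcond=None)  # least-squares estimate
    residual = phi @ theta - target
    omega_sq = residual @ residual / n_samples  # residual variance estimate
    return np.log(omega_sq) + 2.0 * n_params / n_samples


def select_order(candidates, target):
    """Pick the number of regulators that minimizes the AIC.

    Assumption: candidates are ranked by the absolute value of their
    coefficients in the full model, then added one at a time.
    """
    n_samples, n_candidates = candidates.shape
    basal = np.ones((n_samples, 1))  # hypothetical constant (basal-level) column
    full = np.hstack([candidates, basal])
    theta_full, *_ = np.linalg.lstsq(full, target, rcond=None)
    ranking = np.argsort(-np.abs(theta_full[:n_candidates]))

    scores = []
    for k in range(n_candidates + 1):
        phi = np.hstack([candidates[:, ranking[:k]], basal])
        scores.append(aic_score(phi, target))

    best_k = int(np.argmin(scores))  # order with the smallest AIC
    selected = sorted(ranking[:best_k].tolist())
    return best_k, selected, scores


# Synthetic example: 10 candidate regulators, of which 0, 4, and 7 are real.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = 2.0 * X[:, 0] - 3.0 * X[:, 4] + 1.5 * X[:, 7] + 0.5
y = y + 0.1 * rng.standard_normal(200)

best_k, selected, scores = select_order(X, y)
print("selected order:", best_k)
print("selected regulators:", selected)
```

Because the AIC penalty is mild, the selected order can occasionally include a spurious regulator alongside the real ones. Ranking by coefficient magnitude also presumes that candidate regulators are on comparable scales.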

