Gene & Protein in Disease Drugs and immune infiltration in IPF
cheminformatics, displaying drug targets, mechanisms of action, and other data, which facilitates the screening of candidate drugs targeting hub genes. DGIdb (https://www.dgidb.org/) focuses on drug–gene interactions, integrating rich data to explore the relationship between drug mechanisms and genes. The 10 key genes identified were collagen type XV alpha 1 chain (COL15A1), collagen type VI alpha 3 chain (COL6A3), asporin (ASPN), collagen type XIV alpha 1 chain (COL14A1), fibrillin 1 (FBN1), sulfatase 1 (SULF1), versican (VCAN), thrombospondin 2 (THBS2), fibroblast activation protein (FAP), and latent transforming growth factor beta binding protein 1 (LTBP1). These genes were imported into the DrugBank and DGIdb databases to search for potential drugs targeting these genes. As all 10 key genes are expressed in IPF, these hub genes were imported into the CTD to screen for compounds that can reduce the expression levels of key genes. They were considered as potential targeted drugs for IPF, with the condition that the number of genes reduced should be >5. Finally, Cytoscape was used to visualize the drugs and their interacting hub genes from the three databases.

2.9. Molecular docking

To evaluate the binding energy and interaction mode between candidate drugs/small molecules and the top two hub genes (COL15A1 and COL6A3), we employed the online molecular docking platform CB-Dock2 (https://cadd.labshare.cn/cb-dock2/).16 CB-Dock2 enables automated protein–ligand blind docking through four steps: data input, processing, cavity detection and docking, and visualization and analysis. The program automatically refines the protein structure and removes impurities. The 3D structures of COL15A1 and COL6A3 were retrieved from the PDB database (https://www.rcsb.org/) as receptors, after limiting “organisms” to “Homo sapiens” and “method” to “X-ray diffraction.” The 3D structures of compounds were obtained from the PubChem Compound Database (https://pubchem.ncbi.nlm.nih.gov/) as ligands. Compounds whose structures were unavailable in PubChem were excluded. Docking was performed using CB-Dock2 to obtain optimal results, which were then visualized using Discovery Studio software.

2.10. Statistical analysis

In this study, all statistical analyses were conducted using R software (version 4.2.2). The R packages and versions used in each analysis are listed in each section and are available from the Bioconductor website (https://bioconductor.org/) and the R website. All statistical tests were two-sided, with a P < 0.05 considered to indicate statistical significance.

3. Results

3.1. DEG analysis in patients with IPF and normal controls

This study integrated gene expression microarray data from three sources: GSE2052, GSE110147, and GSE53845. The GSE2052 dataset included 13 samples from patients with IPF and 11 from normal controls. The GSE110147 dataset contained 22 samples from patients with IPF and 11 from normal controls. Similarly, the GSE53845 dataset comprised 40 samples from patients with IPF and eight from normal controls (Table 1).

Differential analysis between patients with IPF and normal controls revealed 215 DEGs, including 106 significantly upregulated and 109 significantly downregulated genes in the IPF group (Supplementary file: Table S1). The top 5 upregulated genes were transmembrane protein 100 (TMEM100; |logFC| = 2.810), carboxypeptidase B2 (CPB2; |logFC| = 2.430), vasoactive intestinal peptide receptor 1 (VIPR1; |logFC| = 2.424), carbonic anhydrase IV (CA4; |logFC| = 2.368), and advanced glycosylation end product-specific receptor (AGER; |logFC| = 2.033). The top 5 downregulated genes were secreted phosphoprotein 1 (SPP1; |logFC| = 3.894), matrix metalloproteinase 7 (MMP7; |logFC| = 3.218), interleukin 13 receptor alpha 2 (IL13RA2; |logFC| = 2.800), BPI fold containing family B member 1 (BPIFB1; |logFC| = 2.755), and ceruloplasmin (CP; |logFC| = 2.639). Figure 1A shows the heatmap of DEGs with |logFC| >1.5 and an adjusted P < 0.05, and Figure 1B presents the volcano plot.

3.2. Identification of gene interaction networks and modules in IPF

The upper quartile genes (n = 2060) were selected for clustering analysis by calculating the variation in expression values for each gene. An outlier detection threshold of 60 was established. As no samples exceeded this threshold, none of them were excluded. The optimal soft-thresholding power β value was determined to be 5, with an R-squared value of 0.8 (Appendix: Figure A1). The dynamic tree-cutting method was used for module detection, and modules with highly correlated feature genes (dissimilarity coefficient of <0.2) were merged. Genes within the same module showed high connectivity and shared functional characteristics (Figure 2A). Each module was assigned a distinct color label, and correlation analysis was performed between module eigengenes (MEs) and clinical phenotypes (IPF and normal control samples). Five modules were identified: MEblue, MEbrown, MEturquoise, MEyellow, and MEgrey (Supplementary File: Table S2). The MEyellow module showed the strongest positive correlation
Volume 3 Issue 4 (2024) 4 doi: 10.36922/gpd.4101

