Tumor Discovery LCP2 regulates melanoma progression
melanoma samples and four normal melanocyte samples). GSE15605 (including 58 melanoma samples and 16 normal skin samples) was used to validate the identified DEGs. The RNA-seq data of cutaneous melanoma from The Cancer Genome Atlas (TCGA) were downloaded from the UCSC Cancer Browser (https://xenabrowser.net/) and contained 471 melanoma samples. The TCGA RNA-seq data were used to further identify prognostic genes that could predict the survival of patients with melanoma, and to construct prognostic models. In addition, GSE65904, which contains 214 melanoma samples, was used to validate the identified prognostic gene signatures. Variables with missing values in >20% of all samples were excluded from this study, and the remaining missing values were imputed by the nearest neighbor averaging method using the R package impute [19]. The characteristics of the five data sets used in our study are provided in Table S1.

2.2. Identification and validation of DEGs

The GSE3189 and GSE31879 data sets were used to select DEGs. DEGs between melanoma and benign skin nevus, melanoma and normal skin tissue, and melanoma and normal melanocytes were screened separately using the Wilcoxon rank-sum test. The false discovery rate (FDR) method was used to correct for the type I error that arises when conducting multiple comparisons [20]. Variables with an FDR-adjusted q value < 0.01 were selected, and the DEGs that were shared between melanoma and all three controls (nevus, normal skin, and melanocyte) were retained as the robust DEGs. The robust DEGs were then validated in GSE15605, and DEGs that were also statistically significant (q value < 0.01) in GSE15605 were used in the following analyses. The identified DEGs were further validated in the Gene Expression Profiling Interactive Analysis (GEPIA) online database (http://gepia.cancer-pku.cn/) [21], which contains a large number of normal samples derived from the Genotype-Tissue Expression (GTEx) project in addition to the TCGA melanoma samples.

2.3. Pathway and functional enrichment analyses of DEGs

To gain more insight into the functions and biological processes of the DEGs, we used the R package clusterProfiler for Kyoto Encyclopedia of Genes and Genomes (KEGG) and gene ontology (GO) enrichment analyses [22]. Significant pathways and GO items were selected with FDR q < 0.05.

2.4. Identification and validation of prognostic gene signatures

DEGs that could be used to predict the OS of patients with melanoma were selected using univariate Cox proportional hazards models in the TCGA data set. Genes with P < 0.05 were then fitted in a multivariate LASSO Cox regression model with the R package glmnet [23]. The optimal λ value was determined by ten-fold cross-validation as the value giving the minimum cross-validated error. Coefficients of genes without a significant effect on OS were shrunk to zero. The correlations among significant genes were assessed using Spearman’s correlation method. Then, the prognostic immune score (PIS) for each patient was constructed as the linear combination of the expression of the genes and the corresponding coefficients estimated in the LASSO model. The patients in the TCGA data set were divided into a high-risk group and a low-risk group based on the median PIS. OS and disease-free survival (DFS) times were compared between the two groups using Kaplan–Meier analyses and log-rank tests, separately. The identified prognostic gene signature was validated in an independent cohort, GSE65904. PIS was calculated from the expression of each identified gene and its associated coefficient obtained from the TCGA data set. Subsequently, the high-risk and low-risk groups were formed according to the median PIS, and OS and distant metastasis-free survival (DMFS) times were compared between the two groups by Kaplan–Meier analyses and log-rank tests, separately. Since the TCGA data set has more complete clinical information than the other data sets, we investigated whether the prognostic value of PIS was independent of clinical baseline information using multivariate Cox proportional hazards models in the TCGA data set. P < 0.05 was set as the cutoff for statistical significance. All data were statistically analyzed with R software (version 3.4.4).

2.5. Exploring the associations between identified prognostic genes and tumor-associated leukocyte (TAL) subsets

Leukocyte compositions were inferred from the bulk tumor transcriptomes using CIBERSORT, which can estimate the relative proportions of TALs from expression profiles of bulk tumors and outperforms other TAL decomposition methods [24]. After applying CIBERSORT to the TCGA data set, 22 distinct TAL subsets were obtained, and the correlation between each identified prognostic gene and each TAL subset was calculated using Spearman’s rank correlation. A gene was considered strongly associated with a given TAL subset if its absolute correlation coefficient with that subset was the largest among all TAL subsets. In addition, a melanoma subset was defined as the average expression of known melanoma markers (MLANA, S100A1, PAN2, SOX10, TYR, S100B, and MITF), and the association between each identified prognostic gene and the melanoma subset was explored using Spearman’s rank correlation.
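
As an illustration of the FDR step in Section 2.2, the following Python sketch computes adjusted q values and intersects the significant genes across comparisons. It assumes the Benjamini-Hochberg procedure, whereas the text only states that "the FDR method" was used; the study itself was run in R. The function names and the gene labels in the usage example are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q values), returned in input order.

    Assumption: the source only says "FDR method"; BH is used here for illustration.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


def robust_degs(q_by_comparison, cutoff=0.01):
    """Genes whose q value is below the cutoff in every comparison."""
    significant = [
        {gene for gene, q in qvals.items() if q < cutoff}
        for qvals in q_by_comparison
    ]
    return set.intersection(*significant)


# Made-up q values for three comparisons (vs. nevus, normal skin, melanocyte).
example = [
    {"GENE_A": 0.001, "GENE_B": 0.020, "GENE_C": 0.004},
    {"GENE_A": 0.002, "GENE_B": 0.003, "GENE_C": 0.008},
    {"GENE_A": 0.009, "GENE_B": 0.001, "GENE_C": 0.050},
]
shared = robust_degs(example)
```

In this toy input only GENE_A passes q < 0.01 in all three comparisons, which mirrors the "shared between melanoma and all three controls" rule.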
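
The PIS construction and median split in Section 2.4 can be sketched as follows. This is a minimal Python illustration, not the code used in the study (which relied on R and glmnet). The coefficients, gene labels, and patient identifiers are invented, and assigning patients whose score equals the median to the low-risk group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def prognostic_score(expression, coefficients):
    """Linear combination of gene expression and LASSO coefficients.

    Genes whose coefficient was shrunk to zero contribute nothing.
    """
    return sum(
        coef * expression[gene]
        for gene, coef in coefficients.items()
        if coef != 0.0
    )


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score.

    Assumption: scores equal to the median are placed in the low-risk group.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients, standing in for those estimated on the training cohort.
coefficients = {"GENE_A": -0.5, "GENE_B": 0.25, "GENE_C": 0.0}

# Hypothetical expression values for four patients.
cohort = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 9.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 1.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 8.0, "GENE_C": 3.0},
    "P4": {"GENE_A": 1.0, "GENE_B": 4.0, "GENE_C": 0.0},
}

scores = {pid: prognostic_score(expr, coefficients) for pid, expr in cohort.items()}
groups = split_by_median(scores)
```

For validation in an independent cohort, the same `coefficients` would be reused unchanged while the median is recomputed on that cohort's scores, as the text describes for GSE65904.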
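
To make the association rule in Section 2.5 concrete, the sketch below computes Spearman's rank correlation between one gene and several TAL fractions and picks the subset with the largest absolute coefficient. It is a plain Python illustration with invented numbers. The helper names are hypothetical, ties receive average ranks, and inputs are assumed to be non-constant so that the denominator is not zero.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def strongest_subset(gene_expression, fractions):
    """Return (subset name, rho) with the largest absolute correlation."""
    rho = {name: spearman(gene_expression, frac) for name, frac in fractions.items()}
    best = max(rho, key=lambda name: abs(rho[name]))
    return best, rho[best]


# Made-up expression of one gene and made-up fractions for two subsets.
gene = [1.0, 2.0, 3.0, 4.0, 5.0]
fractions = {
    "T cells CD8": [0.1, 0.2, 0.3, 0.5, 0.4],
    "Macrophages M2": [0.5, 0.4, 0.3, 0.2, 0.1],
}
best_name, best_rho = strongest_subset(gene, fractions)
```

Here the gene is assigned to the second subset because its correlation, although negative, has the larger absolute value; the rule as stated in the text does not distinguish the sign of the association.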
Volume 2 Issue 1 (2023) 3 https://doi.org/10.36922/td.308

