
Tumor Discovery                                                       LCP2 regulates melanoma progression



melanoma samples and four normal melanocyte samples). GSE15605 (including 58 melanoma samples and 16 normal skin samples) was used to validate the identified DEGs. The RNA-seq data of cutaneous melanoma from The Cancer Genome Atlas (TCGA) were downloaded from the UCSC Cancer Browser (https://xenabrowser.net/), which contained 471 melanoma samples. The TCGA RNA-seq data were used to further identify prognostic genes that could predict the survival of patients with melanoma, and to construct prognostic models. In addition, GSE65904, which contains 214 melanoma samples, was used to validate the identified prognostic gene signatures. Variables with missing values in >20% of all samples were excluded from this study, and the remaining missing values were imputed by the nearest neighbor averaging method using the R package impute [19]. The characteristics of the five data sets used in our study are provided in Table S1.

2.2. Identification and validation of DEGs
GSE3189 and GSE31879 data sets were used to select DEGs. DEGs between melanoma and benign skin nevus, melanoma and normal skin tissue, and melanoma and normal melanocytes were screened separately by the Wilcoxon rank-sum test. The false discovery rate (FDR) method was used to correct for type I error arising from multiple comparisons [20]. Variables with an FDR q value < 0.01 were selected, and the DEGs that were shared between melanoma and all three controls (nevus, normal skin, and melanocyte) were retained as the robust DEGs. The robust DEGs were then validated in GSE15605, and DEGs that were also statistically significant (q value < 0.01) in GSE15605 were used in the following analyses. The identified DEGs were further validated in the Gene Expression Profiling Interactive Analysis (GEPIA) online database (http://gepia.cancer-pku.cn/) [21], which contains a large number of normal samples derived from the Genotype-Tissue Expression (GTEx) project in addition to TCGA melanoma samples.

2.3. Pathway and functional enrichment analyses of DEGs
To gain more insight into the functions and biological processes of the DEGs, we used the R package clusterProfiler for Kyoto Encyclopedia of Genes and Genomes (KEGG) and gene ontology (GO) enrichment analyses [22]. Significant pathways and GO items were selected with FDR q < 0.05.

2.4. Identification and validation of prognostic gene signatures
DEGs that could be used to predict the OS of patients with melanoma were selected using univariate Cox proportional hazards models in the TCGA data set. Genes with P < 0.05 were then fitted in a multivariate LASSO Cox regression model with the R package glmnet [23]. The optimal λ value was determined by ten-fold cross-validation as the value that gives the minimum cross-validated error. Coefficients of genes without significant effect on OS were shrunk to zero. The correlations among significant genes were assessed using Spearman’s correlation method. Then, the prognostic immune score (PIS) for each patient was constructed as the linear combination of the expression of the genes and the corresponding coefficients estimated in the LASSO model. The patients in the TCGA data set were divided into a high-risk group and a low-risk group based on the median of PIS. OS and disease-free survival (DFS) were each compared between the two groups using Kaplan–Meier analyses and log-rank tests. The identified prognostic gene signature was validated in an independent cohort, GSE65904. PIS was calculated according to the expression of each identified gene and its associated coefficient obtained from the TCGA data set. Subsequently, the high-risk group and low-risk group were formed according to the median of PIS, and OS and distant metastasis-free survival (DMFS) were each compared between the two groups by Kaplan–Meier analyses and log-rank tests. Since the TCGA data set has relatively more complete clinical information than the other data sets, we investigated whether the prognostic value of PIS was independent of clinical baseline information using multivariate Cox proportional hazards models in the TCGA data set. P < 0.05 was set as the cutoff for significant difference. All data were statistically analyzed with R software (version 3.4.4).

2.5. Exploring the associations between identified prognostic genes and tumor-associated leukocyte (TAL) subsets
Leukocyte compositions were inferred from the bulk tumor transcriptomes using CIBERSORT, which estimates relative proportions of TAL from expression profiles of bulk tumors and outperforms other TAL decomposition methods [24]. After applying CIBERSORT to the TCGA data set, 22 distinct TAL subsets were obtained, and the correlation between each identified prognostic gene and each TAL subset was calculated using Spearman’s rank correlation. A gene was considered to have a strong association with the TAL subset for which its absolute correlation coefficient was largest among all TAL subsets. In addition, the melanoma subset was defined as the average expression of known melanoma markers (MLANA, S100A1, PAN2, SOX10, TYR, S100B, and MITF), and the association between each identified prognostic gene and the melanoma subset was explored using Spearman’s rank correlation.
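
The gene-to-TAL assignment rule in Section 2.5 can be illustrated with a minimal Python sketch. It is not the authors' R code. It assumes that "largest absolute correlation coefficient" means that, for each gene, the TAL subset with the largest |ρ| is selected. The subset names and all numbers below are invented for illustration.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def strongest_subset(gene_expr, subset_fractions):
    """Pick the subset whose fractions have the largest |rho| with the gene."""
    rho = {name: spearman(gene_expr, frac) for name, frac in subset_fractions.items()}
    best = max(rho, key=lambda name: abs(rho[name]))
    return best, rho[best]

# Hypothetical data: one gene and two subsets across five samples.
gene_expr = [1.0, 2.0, 3.0, 4.0, 5.0]
subset_fractions = {
    "subset_a": [0.5, 0.4, 0.3, 0.2, 0.1],
    "subset_b": [0.1, 0.3, 0.2, 0.5, 0.4],
}
best_name, best_rho = strongest_subset(gene_expr, subset_fractions)
```

The same `spearman` function would apply to the melanoma subset if the per-sample average of the marker genes were passed in place of a TAL fraction vector.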

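The PIS construction and median split described in Section 2.4 can be sketched in a few lines of Python. The gene names, coefficients, and expression values are hypothetical placeholders, not results from the study. Assigning patients whose score equals the median to the low-risk group is also an assumption, since the text does not specify how ties are handled.

```python
from statistics import median

def prognostic_score(expression, coefficients):
    """Linear combination of gene expression and model coefficients."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def split_by_median(scores):
    """Label each patient high-risk if the score exceeds the cohort median.

    Assumption: scores equal to the median are labeled low-risk.
    """
    cutoff = median(scores.values())
    return {
        patient: ("high-risk" if score > cutoff else "low-risk")
        for patient, score in scores.items()
    }

# Hypothetical coefficients, standing in for those estimated by a LASSO Cox model.
coefficients = {"GENE_A": -0.5, "GENE_B": 0.25}

# Hypothetical expression values for four patients.
cohort = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 8.0},
    "P4": {"GENE_A": 2.0, "GENE_B": 8.0},
}

scores = {pid: prognostic_score(expr, coefficients) for pid, expr in cohort.items()}
groups = split_by_median(scores)
```

For validation in an independent cohort, the same `coefficients` dictionary would be reused unchanged and only the expression values would differ, mirroring how the TCGA-derived coefficients are applied to GSE65904.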

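The DEG screening logic of Section 2.2 (FDR correction followed by intersection across the three comparisons) can be sketched as follows. This assumes the Benjamini-Hochberg procedure, which the text does not name explicitly. The p values are invented, whereas in the study they would come from Wilcoxon rank-sum tests.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

def significant_genes(p_by_gene, q_cutoff=0.01):
    """Genes whose q value falls below the cutoff in one comparison."""
    genes = list(p_by_gene)
    q_values = benjamini_hochberg([p_by_gene[g] for g in genes])
    return {g for g, q in zip(genes, q_values) if q < q_cutoff}

# Hypothetical p values for three comparisons (melanoma vs. each control).
vs_nevus = {"GENE_A": 0.0001, "GENE_B": 0.002, "GENE_C": 0.4}
vs_skin = {"GENE_A": 0.0005, "GENE_B": 0.2, "GENE_C": 0.001}
vs_melanocyte = {"GENE_A": 0.001, "GENE_B": 0.003, "GENE_C": 0.004}

# Robust DEGs are those significant in all three comparisons.
robust_degs = (
    significant_genes(vs_nevus)
    & significant_genes(vs_skin)
    & significant_genes(vs_melanocyte)
)
```

Taking the intersection rather than the union keeps only genes that differ from every control type, which is what makes the resulting list "robust" in the sense used above.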
            Volume 2 Issue 1 (2023)                         3                           https://doi.org/10.36922/td.308