
Eurasian Journal of
            Medicine and Oncology                                           Novel senescence-based melanoma risk model



to develop a risk model. Finally, the model’s predictive accuracy for patient survival was validated using the validation set and an external Gene Expression Omnibus (GEO) dataset.

2. Materials and methods

2.1. Data acquisition and processing

Bulk RNA-seq fragments per kilobase of transcript per million mapped reads (FPKM) data for SKCM and the corresponding clinical information were downloaded from the UCSC Xena database (https://xenabrowser.net/datapages/). After excluding samples lacking clinical data, including stage, T stage, N stage, M stage, gender, age, overall survival (OS), and OS time, 413 samples were retained. These samples were randomly allocated into a training set and a validation set at a 7:3 ratio.

An additional validation dataset, GSE65904, comprising RNA microarray data, was retrieved from GEO (https://www.ncbi.nlm.nih.gov/geo/). This dataset represents a population-based retrospective cohort consisting of 214 melanoma patients, including 16 primary tumor samples and 188 metastatic samples of various types. The male-to-female ratio is 1.39. In addition, 84 patients harbor mutations in the B-Raf proto-oncogene, NRAS proto-oncogene (NRAS), neurofibromatosis Type I, and KIT proto-oncogene (KIT) genes. The dataset contains raw signal intensity values, ensuring high data fidelity for downstream analysis. Moreover, it provides two distinct survival metrics, distant metastasis-free survival and disease-specific survival (DSS), allowing for a comprehensive evaluation of the prognostic performance of the risk model across different clinical endpoints. Probes with a detection p<0.05 and present in more than 60% of the samples were retained. The R package “lumi” (https://bioconductor.org/packages/lumi/) was used to normalize the data and convert raw Illumina probe intensities to expression values. Probe IDs were then mapped to their corresponding gene symbols according to the platform annotation. For genes mapped by several probe IDs, the mean expression value was calculated using the “avereps” function from the “limma” R package (https://bioconductor.org/packages/limma/).

To further validate the robustness and reliability of the risk model, an additional external dataset, GSE19234, was utilized, containing raw signal intensity values in CEL format. This dataset includes 44 metastatic melanoma samples from patients who experienced at least two or three recurrences, with all samples from Stage III or higher. The male-to-female ratio is 1.75. To ensure consistency in data processing, the “rma” method within the “affy” R package was utilized to normalize the raw expression data, thereby minimizing technical variations and enhancing comparability across samples. The relationship between risk scores and patient survival was evaluated based on the number of days lived since metastasis, providing further insights into the prognostic utility of the risk model in a clinically relevant setting.

2.2. Identification of prognostic senescence-related genes

Senescence-related genes were compiled from three gene lists derived from previous studies that identified key genes associated with cellular senescence and their potential impact on various diseases, including cancer.25-27 To identify prognostic senescence-related genes, univariate Cox regression analysis was initially conducted, selecting 198 genes significantly correlated with survival outcome using a statistical significance threshold of p<0.05. Subsequently, multivariate Cox regression analysis was performed to refine this list by selecting 190 prognostic senescence-related genes with p<0.05 while adjusting for the effects of each gene identified in the univariate analysis. This analysis was performed using the survival R package (https://CRAN.R-project.org/package=survival). To visually represent the relationships between the selected prognostic genes and patient survival, a forest plot was generated using the survminer R package (https://CRAN.R-project.org/package=survminer).

2.3. Subtype classification and survival analysis

To uncover distinct molecular subtypes within the SKCM cohort, unsupervised clustering analysis was performed based on the expression profiles of the 190 prognostic senescence-related genes. This analysis was conducted using the ConsensusClusterPlus package (https://bioconductor.org/packages/ConsensusClusterPlus/), a widely used tool for robust and reproducible clustering of high-dimensional genomic data. The clustering algorithm employed the hierarchical clustering (hc) method, which groups samples based on similarities in gene expression patterns, with the Pearson correlation coefficient serving as the metric to quantify pairwise similarity. The Pearson method was chosen for its sensitivity to both the magnitude and direction of gene expression changes, making it particularly suitable for capturing subtle yet biologically relevant differences in gene expression profiles. To determine the optimal number of clusters (k), the cumulative distribution function (CDF) and its area under the curve (AUC) were systematically evaluated across a range of cluster numbers (k = 2 to 6). The CDF plot visualizes the stability of the consensus matrix, with a flatter curve indicating higher clustering stability, while the AUC quantifies the overall agreement across multiple clustering runs. The optimal k was selected at the point where the CDF curve began to plateau, indicating minimal improvement in clustering
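
The random 7:3 allocation of the 413 retained samples described in Section 2.1 can be sketched as follows. This is a minimal illustration in Python, not the authors' code (the study used R). The function name `split_samples`, the seed, the rounding rule, and the sample identifiers are all assumptions made for the example. The source does not report the exact set sizes; rounding 413 × 0.7 gives 289 training and 124 validation samples.

```python
import random


def split_samples(sample_ids, train_fraction=0.7, seed=0):
    """Randomly partition sample IDs into training and validation sets."""
    ids = sorted(sample_ids)      # fixed starting order so the seed is reproducible
    rng = random.Random(seed)     # the seed value is an illustrative assumption
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)  # rounding rule is an assumption
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers standing in for the 413 retained samples.
samples = [f"sample_{i:03d}" for i in range(413)]
train, valid = split_samples(samples)
print(len(train), len(valid))
```

Fixing the seed keeps the partition reproducible, which matters when the same split is reused for model fitting and validation.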

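
The probe filtering and per-gene averaging described in Section 2.1 for GSE65904 can be illustrated with a small sketch. It is written in plain Python as an analogue of the R workflow, not a reproduction of “lumi” or “avereps”. One interpretation is assumed: a probe is kept when its detection p-value is below 0.05 in more than 60% of samples. The function names, probe IDs, gene symbols, and values are hypothetical.

```python
from collections import defaultdict


def filter_probes(detection_p, p_cutoff=0.05, min_fraction=0.6):
    """Keep probes detected (p < p_cutoff) in more than min_fraction of samples."""
    kept = []
    for probe, pvals in detection_p.items():
        detected = sum(1 for p in pvals if p < p_cutoff)
        if detected / len(pvals) > min_fraction:
            kept.append(probe)
    return kept


def average_by_gene(expression, probe_to_gene):
    """Average expression, sample by sample, over probes mapping to one gene."""
    grouped = defaultdict(list)
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:          # probes without an annotation are skipped
            grouped[gene].append(values)
    return {
        gene: [sum(col) / len(col) for col in zip(*rows)]
        for gene, rows in grouped.items()
    }


# Toy data: three hypothetical probes measured in five samples.
detection_p = {
    "probe_1": [0.01, 0.02, 0.20, 0.01, 0.03],   # detected in 4/5 samples
    "probe_2": [0.01, 0.30, 0.40, 0.02, 0.50],   # detected in 2/5 samples
    "probe_3": [0.001, 0.001, 0.001, 0.001, 0.001],
}
expression = {
    "probe_1": [2.0, 4.0, 6.0, 8.0, 10.0],
    "probe_2": [1.0, 1.0, 1.0, 1.0, 1.0],
    "probe_3": [4.0, 6.0, 8.0, 10.0, 12.0],
}
probe_to_gene = {"probe_1": "GENE_A", "probe_2": "GENE_B", "probe_3": "GENE_A"}

kept = filter_probes(detection_p)
gene_expr = average_by_gene({p: expression[p] for p in kept}, probe_to_gene)
print(kept, gene_expr)
```

Filtering before averaging keeps poorly detected probes from diluting the per-gene mean.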

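
The use of the consensus CDF and its area to choose k (Section 2.3) can be made concrete with a short sketch. It assumes a consensus matrix is already available for each k, with entry (i, j) giving the fraction of resampling runs in which samples i and j clustered together. The code integrates the empirical CDF of the upper-triangle entries exactly over [0, 1] and then reports the relative change in area between successive k. This is an illustrative choice and is not claimed to match the internals of ConsensusClusterPlus. The function names and all toy numbers are made up for the example.

```python
def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def cdf_area(values):
    """Exact area under the empirical CDF of values in [0, 1], over [0, 1]."""
    xs = sorted(values)
    n = len(xs)
    area, prev = 0.0, 0.0
    for i, x in enumerate(xs):
        area += (x - prev) * (i / n)   # the CDF equals i/n on [prev, x)
        prev = x
    area += 1.0 - prev                 # the CDF equals 1 above the largest value
    return area


def relative_area_change(areas_by_k):
    """Relative increase in CDF area from each k to the next larger k."""
    ks = sorted(areas_by_k)
    return {
        k: (areas_by_k[k] - areas_by_k[prev_k]) / areas_by_k[prev_k]
        for prev_k, k in zip(ks, ks[1:])
    }


# Toy consensus matrix: two perfectly stable clusters, {0, 1} and {2, 3}.
perfect = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
area_perfect = cdf_area(upper_triangle(perfect))

# Made-up areas for k = 2..6, chosen only to show a plateau.
toy_areas = {2: 0.40, 3: 0.55, 4: 0.60, 5: 0.61, 6: 0.615}
changes = relative_area_change(toy_areas)
print(area_perfect, changes)
```

In the toy areas, the gain falls from 37.5% at k = 3 to under 2% from k = 5 onward, which is the kind of plateau the text describes as the signal for selecting k.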
            Volume 9 Issue 3 (2025)                         88                              doi: 10.36922/ejmo.8574