Gene & Protein in Disease Significance of MXRA7 in bladder cancer
2.3. Kaplan–Meier (KM) survival analysis
The analysis of clinical characteristics of BLCA patients and MXRA7 expression level primarily involved KM survival curve analysis to evaluate progression-free survival (PFS) and disease-free interval (DFI).³⁰ PFS is defined as the time from diagnosis or treatment initiation to disease progression (including local tumor growth, new lesion development, or metastasis) or death.³¹ PFS is commonly used in settings where patients still have measurable disease, and its assessment helps evaluate treatment efficacy by measuring how long a patient can survive without disease worsening. Conversely, DFI refers to the time from achieving complete remission after initial treatment to the first documented recurrence of the disease.³¹ DFI is particularly relevant for patients who have undergone curative treatment, such as surgery or a complete response to therapy, and serves as an indicator of recurrence risk. These definitions distinguish different aspects of patient prognosis, with PFS reflecting ongoing disease control and treatment response, while DFI assesses the risk of recurrence after remission.³¹ In this study, both metrics were analyzed to determine whether MXRA7 expression is associated with disease progression or recurrence risk, respectively, providing insight into its potential prognostic relevance. Clinical data samples derived from TCGA were used to correlate MXRA7 expression with the clinical outcomes. After excluding samples lacking MXRA7 expression data, a total of 408 patients with PFS data and 161 patients with DFI data were analyzed for MXRA7 expression, and survival curves were compared using the log-rank test.³⁰

2.4. LASSO-Cox regression analysis
The prognostic relevance of any single gene could be evaluated using the Cox method with the “survival” R package, which combined survival time, status, and gene expression levels. Subsequently, the “glmnet” R package was utilized to merge these data, applying the LASSO-Cox approach for regression analysis.³² LASSO regression is particularly useful for selecting the most relevant genes influencing the survival time of patients with BLCA, while the Cox model can analyze the relationship between survival time and multiple factors. The LASSO-Cox method was preferred over traditional Cox regression due to its ability to perform automatic variable selection and reduce overfitting, which is essential for handling high-dimensional transcriptomic data.³³ Unlike standard Cox regression, which includes all variables and may face multicollinearity issues, LASSO applies an L1 penalty, shrinking irrelevant coefficients to zero and retaining only the most predictive genes.³³ Alternative models, such as Random Forest and Support Vector Machines, were considered but not preferred due to their lack of direct survival time interpretation.³⁴ Thus, LASSO-Cox was selected as the most robust and interpretable approach for survival analysis in BLCA. To determine the optimal model, a 10-fold cross-validation was conducted.³⁵ By selecting the proper lambda value, 15 genes were initially identified from the 230 up-regulated and 63 down-regulated DEGs. Then, downstream analyses (Cox regression) were performed on the 15 genes to further evaluate their individual prognostic significance across 325 clinical samples. A risk score was calculated based on the prognostically significant genes identified. After obtaining the risk score for each BLCA patient, the performance of the LASSO-Cox model was evaluated by the receiver operating characteristic (ROC) curve, which serves to assess the accuracy of predictive models.

2.5. Multifactor Cox regression analysis and nomogram construction
The Cox proportional hazards regression analysis was performed using Statistical Product and Service Software Automatically (SPSSAU, version 24.0) to identify factors significantly associated with the survival of BLCA patients.³⁶ A total of 12 factors, including “Age”, “BMI”, “MXRA7”, “Risk score”, “Sex”, “MXRA7 expression level”, “Cancer status”, “Stage”, “Tumor grade”, “Clinical_T”, “Clinical_N”, and “Clinical_M”, were included as independent variables, while the dependent variable was the survival state of patients (alive or dead). These factors are defined as follows: age (continuous variable), BMI (body mass index), sex, cancer status (indicating whether the patient currently has an active tumor: tumor-free vs. with tumor), tumor grade (low vs. high), stage (overall cancer stage), clinical_T (tumor invasion depth: T1–T4), clinical_N (lymph node involvement: N0: no nodes, N1–N3: increasing levels of nodal metastasis, NX: unknown status), clinical_M (presence [M1] or absence [M0] of distant metastasis, and MX for unknown status), MXRA7 (measured in fragments per kilobase per million mapped fragments [FPKM]), MXRA7 expression level (a categorical variable based on the median MXRA7 expression value), and risk score (composite score from LASSO regression). Although MXRA7 expression is a continuous variable reflecting absolute gene expression levels, and MXRA7 expression level is a dichotomized variable for clinical stratification, both were included to assess their independent prognostic value. This allows for precise quantification while ensuring interpretability in clinical settings, with statistical validation confirming their complementary roles in survival prediction.³⁷ The analysis aimed to assess the impact of these independent variables on patients’ survival time. The initial Cox analysis identified seven significant factors, which included “Age”, “MXRA7”, “MXRA7 expression level”, “Risk score”, “Tumor grade”, “Cancer status”, and “Clinical_N”.
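
As an illustration of two derived covariates used above, the composite risk score and the median-dichotomized expression level, the following minimal Python sketch may help. It rests on assumptions not stated in the text: the risk score is taken to be the usual linear predictor (the sum of each selected gene's regression coefficient multiplied by its expression value), values equal to the median are assigned to the "low" group, and the gene names, coefficients, and expression values are hypothetical placeholders rather than results from this study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the selected genes.

    Assumption: the study's exact formula is not given in the text; this is the
    conventional form for a LASSO-Cox signature.
    """
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def dichotomize_by_median(values):
    """Label each value 'high' if strictly above the median, otherwise 'low'.

    Assumption: ties with the median go to the 'low' group.
    """
    cutoff = median(values)
    return ["high" if v > cutoff else "low" for v in values]


# Hypothetical coefficients for two placeholder genes (not from the study).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}

# Hypothetical expression values for four patients.
patients = [
    {"GENE_A": 2.0, "GENE_B": 4.0},
    {"GENE_A": 1.0, "GENE_B": 0.0},
    {"GENE_A": 4.0, "GENE_B": 2.0},
    {"GENE_A": 0.0, "GENE_B": 4.0},
]

scores = [risk_score(p, coefficients) for p in patients]
groups = dichotomize_by_median(scores)

for score, group in zip(scores, groups):
    print(f"risk score = {score:+.2f} -> {group}")
```

The same dichotomization helper could be applied to a list of MXRA7 FPKM values to obtain a categorical "expression level" alongside the continuous measurement.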
Volume 4 Issue 2 (2025) 3 doi: 10.36922/gpd.6256

