Page 129 - ITPS-7-3
INNOSC Theranostics and Pharmacological Sciences
Prognostic values of peripheral blood CD4T transcriptomic signature

negative LASSO coefficients against the 12,549 genes used as the input for gene-signature discovery. Raw P-values were adjusted by the Benjamini–Hochberg false discovery rate (FDR) method.

2.5. Gene signature-based stratification of an independent cohort of HIV-1-positive men

The CD4T gene signature was subsequently applied to the application population consisting of 24 peripheral blood samples from HIV-1-infected men receiving anti-retroviral therapies. All study participants had the absolute CD4T cell count before and after treatment. The percent change is defined as:

$$\text{Percent change} = 100 \times \frac{\text{Change in CD4T}}{\text{Baseline CD4T}} \tag{III}$$

Unsupervised hierarchical clustering with Euclidean and Ward D hyperparameters (pheatmap R package v.1.0.12) followed by a dendrogram-tree split at the first node was used to stratify the application population into two groups for downstream statistical analyses.

2.6. Statistical analysis

Unless otherwise specified, the computational environment used was R 4.3.1 (https://www.r-project.org) with data analysis and visualization packages base (4.3.1), ggplot2 (v.3.4.3), and matrixStats (v.1.0.0). All statistical tests used were two-sided. The univariate (unadjusted) association between two binary variables was determined by a Fisher’s exact test with odds ratio (OR) and 95% confidence interval (CI) estimates. To address potential confounding, the multivariate association was determined by multivariate logistic regression (a generalized linear model with a family “Binomial”) with adjusted ORs estimated by exponentiating the model coefficients. The mean difference between any two groups was determined by Welch’s t-test. A generalized linear model with family “Gaussian” was the generalization of the t-test to control for potential confounding variables. All R code is deposited to GitHub (https://github.com/ydavidchen/cd4t_pilot_signature).

3. Results

3.1. Identification and characterization of a CD4T abundance signature in the transcriptomes of healthy peripheral blood samples

A transcriptomic gene signature was identified by supervised modeling of gene expression against CD4T proportions using LASSO regression, a conservative statistical approach for selecting the most important features from high-dimensional data space.⁹,¹⁰ LASSO is well-known for its ability and tendency to shrink the model coefficients to zero, thereby revealing input features (i.e., genes) most strongly correlated with the outcome (i.e., CD4T abundance). A tenfold cross-validation experiment demonstrated the robustness of the final model (tenfold average Pearson’s r = 0.89 and r² = 0.79; Table S1). The initial CD4T gene signature consisted of 334 (2.7%) genes with non-zero LASSO coefficients (Figure 1A). The final version of the signature with 207 genes (1.6%) was obtained by coefficient thresholding (Figure 1A and Table S2).

To explore the biological relevance of the identified gene signature, Gene Ontology: Biological Processes analysis was performed on the gene features in the positive and negative directions, separately. The gene features positively associated with CD4T abundance strongly enriched for cellular adhesion (OR = 4.7, FDR = 0.01; Table 1). Notably, the members comprising this ontology term included high-profile immune genes involved in HIV-1 pathogenesis: CD28 and CTLA-4. The gene set negatively associated with CD4T abundance strongly enriched for metabolic processes of macromolecules (all OR ≥ 4.5 and all FDR = 0.05; Table 1). The genes encoding CD8 subunits, CD8A and CD8B, showed strong, negative association with CD4 abundance and were selected by the LASSO procedure (Table S3).

Hierarchical clustering of the identified gene signature segregated the discovery population into two major clusters: Cluster 1 and Cluster 2 (Figure 1B). The distribution of subject demographics, including race/ethnicity, sex, and age, appeared balanced across the gene-signature clusters (Figure 1B, horizontal tracking bars).

3.2. Application of CD4T gene signature to an HIV-1-positive cohort for biomedical knowledge discovery

The next objective was to assess the clinical relevance of the CD4T gene signature in human disease. Given the well-known role of CD4T in HIV-1 infection and recovery³ on anti-retroviral treatment, the CD4T gene signature was evaluated in this disease context. Dataset GSE19087 has 24 HIV-1 positive men treated with an anti-retroviral regimen for approximately 1 year.³ Hierarchical clustering of the CD4T gene signature stratified the HIV-1 positive, anti-retroviral treated men into two major groups (Figure 2). The cluster structure, indicated by the pattern of dendrogram branching, showed striking similarity to that of the discovery population. On stratification of cluster membership, the subject demographics present no significant differences between the two clusters (Table 2). However, the HIV-1-positive men in Cluster 1 had an average of 122.7% CD4 increase at the end of the anti-retroviral therapy treatment, compared to
Volume 7 Issue 3 (2024) 3 doi: 10.36922/itps.2761
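
The Benjamini–Hochberg adjustment of raw P-values named in the methods on this page can be illustrated with a minimal sketch. The study itself used R; Python is used here only for illustration, and the function name `benjamini_hochberg` and the five example P-values are hypothetical, not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone: each is the minimum of p * m / rank over all
    # ranks at or above its own.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (illustration only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw P = {p:.3f}  FDR-adjusted = {q:.5f}")
```

A gene would then be called significant when its adjusted value falls below the chosen FDR cutoff.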

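
Equation (III) and the Welch's t-test described in the statistical analysis can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the authors' R code: the helper names `percent_change` and `welch_t` and all CD4T counts are hypothetical, and only the test statistic and the Welch–Satterthwaite degrees of freedom are computed (the P-value would require a t-distribution function that the Python standard library does not provide).

```python
from statistics import mean, variance


def percent_change(baseline, followup):
    """Equation (III): 100 x (change in CD4T) / (baseline CD4T)."""
    return 100.0 * (followup - baseline) / baseline


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Per-group squared standard errors (sample variance / group size).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical (baseline, follow-up) CD4T counts for two clusters.
cluster1 = [(200, 450), (150, 330), (300, 620)]
cluster2 = [(250, 320), (310, 400), (180, 260)]

pc1 = [percent_change(b, f) for b, f in cluster1]
pc2 = [percent_change(b, f) for b, f in cluster2]
t_stat, dof = welch_t(pc1, pc2)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Unlike Student's t-test, this form does not assume equal variances in the two groups, which is why the degrees of freedom are generally not an integer.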

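
The tendency of LASSO to shrink coefficients exactly to zero, described in Section 3.1, is easiest to see in the special case of an orthonormal design, where the LASSO solution is the soft-thresholded least-squares coefficient. The sketch below assumes that special case; the gene labels, coefficient values, and penalty `lam` are hypothetical, and the study's actual model was fitted in R on 12,549 genes.

```python
def soft_threshold(value, lam):
    """Shrink a coefficient toward zero by lam; values within +/-lam become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


# Hypothetical least-squares coefficients for four genes.
ols = {"gene_a": 0.9, "gene_b": -0.05, "gene_c": 0.2, "gene_d": -0.7}
lam = 0.25  # hypothetical penalty strength

# LASSO coefficients under the orthonormal-design assumption.
lasso = {gene: soft_threshold(beta, lam) for gene, beta in ols.items()}

# The "signature" keeps only genes whose coefficient survived shrinkage.
signature = {gene: beta for gene, beta in lasso.items() if beta != 0.0}
print(sorted(signature))
```

Genes with positive and negative surviving coefficients correspond to the positive and negative directions that the text analyzes separately.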