Global Translational Medicine TEs link to Parkinson’s risk and progression
shortest sequences (median length of 280 bp, Figure 1C). The annotation of TE insertion sites was also consistent with expectations (Figure 1D): approximately 51.3% of TE insertion sites were located within intergenic regions, while approximately 38.0% were located in the intronic areas. This observation confirms the propensity of TE insertions to occur predominantly in the regions of the genome with minimal or negligible impact on genomic structure and function, favoring the retention of TEs within the genome.

TE insertions are prone to preferentially occur in non-coding regions, where they do not significantly disrupt normal genomic structure and function, indicating a higher likelihood of these regions being non-conserved. To evaluate the conservation status of TE insertion sites, we used two conservation scoring tools and found that highly conserved TE insertion sites accounted for a mere 0.033% of all TE insertion events (25,805 TEs). In comparison, non-conserved TE insertion sites accounted for 87.3% of all TE insertion sites (Figure 1E). We also measured the repeatability of our TE calling by comparing it to data from the 1KGP and Genome Aggregation Database (gnomAD) databases. Our detected TEs showed high reproducibility with these existing 1KGP and gnomAD data (Figure 1F): out of the 19,119 ALU detected, 9,269 and 7,286 were validated in the 1KGP and the gnomAD databases, respectively; among the 4,454 LINE1 detected, 1,371 and 1,398 were confirmed in the 1KGP and the gnomAD databases, respectively; regarding the 2,232 SVA detected, 718 and 800 were validated in the 1KGP and the gnomAD databases, respectively. In total, 11,358 (44.0%) and 9,484 (36.8%) out of the 25,805 TEs were validated in the 1KGP and the gnomAD databases, respectively.

3.2. The associations of TEs with the risk and the progression of PD

To investigate the genetic association between TEs and PD, we performed a TE-GWAS. Following rigorous quality control (QC), we retained 1,910 samples and identified 2,867 high-quality TEs suitable for TE-GWAS analysis. The dataset included 689 healthy controls and 1,221 PD patients (Figure S1). During our analysis, we identified a significant TE insertion site, labeled as chr1_246429040_ALU (Figure 2A, P = 8.73 × 10⁻⁶, FDR = 0.024, effect size β = −0.44), which exhibited a significant association with PD onset. This finding suggests that subjects carrying this TE insertion have a lower risk of developing PD. The TE-GWAS inflation factor λ was calculated as 1.058, indicating the reliability of the overall results (Figure S2) and their independence from confounding factors. To further explore potential genetic associations, we conducted an LD analysis of SNPs within the 500 kb region upstream and downstream of the TE risk locus chr1_246429040_ALU. This analysis aimed to determine whether this TE risk locus exhibited strong LD (r² > 0.8) with known PD risk SNPs. The TE LD analysis revealed no significant PD risk SNP signals in this TE locus region, indicating that this TE insertion site represents an independently discovered PD risk locus (Figure S3).

There were 691 PD cases with longitudinal visits, and 2,111 TE loci were used (Table S1) to examine potential differences in disease progression over time among subjects with TE genotypes. We constructed TE-LMM models for six clinical indicators separately, and our analysis revealed that chr8_114592257_ALU carriers (Figure 2B) exhibited a faster progression in the Hoehn and Yahr stage compared to non-carriers (P = 1.87 × 10⁻⁵). In addition, patients carrying chr13_81793576_SVA (Figure 2C) on chromosome 13 q31.1 demonstrated a more rapid progression in MDS-UPDRS Part I scores than non-carriers (P = 2.47 × 10⁻⁶).

3.3. Effects of TEs on gene expression

After undergoing QC processing, a total of 1,709 subjects from three cohorts were included in the TE-expression quantitative trait loci (TE-eQTL) analysis. This cohort consisted of 611 healthy controls and 1,098 patients with PD. A total of 2,867 TEs and 28,644 genes were used to investigate the associations between TEs and genes. In our investigation, we identified a total of 18 cis TE-eQTLs (27 TE-gene pairs) in the interaction model and 290 cis TE-eQTLs (800 TE-gene pairs) in the non-interaction model. Notably, we observed that ALU contributed significantly to the identified TE-eQTLs, a finding consistent with previous reports [58]. Quantile–quantile analysis demonstrated the observed distribution of P-values for outliers in the TE-eQTL analysis, demonstrating the reliability of interaction-TE-eQTL (Figure S4A) and non-interaction-TE-eQTL analyses (Figure S4D). We also assessed the distribution of effect size (β) for different types of TEs-eGene (eGene: A gene with a TE-eQTL) in both interaction (Figure S4B) and non-interaction modes (Figure S4E). Our findings revealed that TE insertions were linked to reduced gene expression levels in both models. Furthermore, we investigated the types of genes linked to TE polymorphisms and found that only a minority of these genes corresponded to protein-coding genes. Instead, they primarily comprised pseudogenes and non-coding RNAs (Figure S4C and S4F).

To explore the signaling pathways or functions associated with genes regulated by cis-TE-eQTL loci in normal cellular physiology, we performed Gene Ontology (GO) enrichment analysis on the significant eGenes identified through the TE-eQTL analysis. This
Volume 2 Issue 3 (2023) 6 https://doi.org/10.36922/gtm.1583
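For readers unfamiliar with the inflation factor λ reported in Section 3.2, the following is a minimal sketch of one conventional way to compute it: the median of the observed 1-degree-of-freedom χ² statistics divided by the median expected under the null (about 0.455). It assumes two-sided P-values as input. The function name `genomic_inflation` and the synthetic P-values are illustrative only and are not taken from the study's pipeline, which may have computed λ differently.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# obtained from the 75th percentile of the standard normal.
_NULL_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda = median(observed chi2) / median(null chi2, 1 df).

    Assumes each entry is a two-sided P-value in (0, 1].
    """
    chi2 = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"P-value out of range: {p}")
        # Convert the two-sided P-value to a z-score via the lower tail,
        # which stays numerically stable for very small P-values.
        z = _STD_NORMAL.inv_cdf(p / 2.0)
        chi2.append(z * z)
    return median(chi2) / _NULL_MEDIAN_CHI2


if __name__ == "__main__":
    # Synthetic, evenly spaced P-values standing in for a null scan
    # (hypothetical data; 2,867 matches the number of tested TEs only
    # for illustration).
    n = 2867
    null_p = [(i + 0.5) / n for i in range(n)]
    print(f"lambda (null-like) = {genomic_inflation(null_p):.3f}")
```

A value close to 1, such as the reported 1.058, indicates that the bulk of the test statistics follows the null distribution, which is the usual basis for arguing that population stratification or other confounders are not inflating the association signal.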

