
Global Translational Medicine                                     TEs link to Parkinson’s risk and progression



shortest sequences (median length of 280 bp, Figure 1C). The annotation of TE insertion sites was also consistent with expectations (Figure 1D): approximately 51.3% of TE insertion sites were located within intergenic regions, while approximately 38.0% were located in intronic regions. This observation confirms that TE insertions occur predominantly in regions of the genome where they have minimal or negligible impact on genomic structure and function, which favors the retention of TEs within the genome.

Because TE insertions occur preferentially in non-coding regions, where they do not significantly disrupt normal genomic structure and function, these insertion sites are more likely to be non-conserved. To evaluate the conservation status of TE insertion sites, we used two conservation scoring tools and found that highly conserved TE insertion sites accounted for a mere 0.033% of all TE insertion events (25,805 TEs). In comparison, non-conserved TE insertion sites accounted for 87.3% of all TE insertion sites (Figure 1E). We also measured the repeatability of our TE calling by comparing it to data from the 1KGP and the Genome Aggregation Database (gnomAD). Our detected TEs showed high reproducibility with the existing 1KGP and gnomAD data (Figure 1F): out of the 19,119 ALU detected, 9,269 and 7,286 were validated in the 1KGP and the gnomAD databases, respectively; among the 4,454 LINE1 detected, 1,371 and 1,398 were confirmed in the 1KGP and the gnomAD databases, respectively; regarding the 2,232 SVA detected, 718 and 800 were validated in the 1KGP and the gnomAD databases, respectively. In total, 11,358 (44.0%) and 9,484 (36.8%) out of the 25,805 TEs were validated in the 1KGP and the gnomAD databases, respectively.

3.2. The associations of TEs with the risk and the progression of PD

To investigate the genetic association between TEs and PD, we performed a TE-GWAS. Following rigorous quality control (QC), we retained 1,910 samples and identified 2,867 high-quality TEs suitable for TE-GWAS analysis. The dataset included 689 healthy controls and 1,221 PD patients (Figure S1). We identified one TE insertion site, labeled chr1_246429040_ALU (Figure 2A, P = 8.73 × 10⁻⁶, FDR = 0.024, effect size β = −0.44), that was significantly associated with PD onset. This finding suggests that subjects carrying this TE insertion have a lower risk of developing PD. The TE-GWAS inflation factor λ was calculated as 1.058, indicating the reliability of the overall results (Figure S2) and suggesting that they were not driven by confounding factors. To further explore potential genetic associations, we conducted an LD analysis of SNPs within the 500 kb region upstream and downstream of the TE risk locus chr1_246429040_ALU. This analysis aimed to determine whether this TE risk locus exhibited strong LD (r² > 0.8) with known PD risk SNPs. The TE LD analysis revealed no significant PD risk SNP signals in this TE locus region, indicating that this TE insertion site represents an independently discovered PD risk locus (Figure S3).

There were 691 PD cases with longitudinal visits, and 2,111 TE loci were used (Table S1) to examine whether disease progression over time differed among subjects by TE genotype. We constructed TE-LMM models for six clinical indicators separately, and our analysis revealed that chr8_114592257_ALU carriers (Figure 2B) exhibited a faster progression in the Hoehn and Yahr stage compared to non-carriers (P = 1.87 × 10⁻⁵). In addition, patients carrying chr13_81793576_SVA (Figure 2C) on chromosome 13q31.1 demonstrated a more rapid progression in MDS-UPDRS Part I scores than non-carriers (P = 2.47 × 10⁻⁶).

3.3. Effects of TEs on gene expression

After QC processing, a total of 1,709 subjects from three cohorts were included in the TE-expression quantitative trait loci (TE-eQTL) analysis. This cohort consisted of 611 healthy controls and 1,098 patients with PD. A total of 2,867 TEs and 28,644 genes were used to investigate the associations between TEs and genes. We identified a total of 18 cis TE-eQTLs (27 TE-gene pairs) in the interaction model and 290 cis TE-eQTLs (800 TE-gene pairs) in the non-interaction model. Notably, we observed that ALU contributed significantly to the identified TE-eQTLs, a finding consistent with previous reports [58]. Quantile–quantile analysis of the observed distribution of P-values in the TE-eQTL analysis supported the reliability of the interaction-TE-eQTL (Figure S4A) and non-interaction-TE-eQTL (Figure S4D) analyses. We also assessed the distribution of effect size (β) for different types of TE-eGenes (eGene: a gene with a TE-eQTL) in both the interaction (Figure S4B) and non-interaction (Figure S4E) models. Our findings revealed that TE insertions were linked to reduced gene expression levels in both models. Furthermore, we investigated the types of genes linked to TE polymorphisms and found that only a minority of these genes corresponded to protein-coding genes. Instead, they primarily comprised pseudogenes and non-coding RNAs (Figure S4C and S4F).

To explore the signaling pathways or functions associated with genes regulated by cis-TE-eQTL loci in normal cellular physiology, we performed Gene Ontology (GO) enrichment analysis on the significant eGenes identified through the TE-eQTL analysis. This

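As an illustration of the inflation factor λ reported in Section 3.2, the sketch below uses the common median-based definition: the median observed 1-df chi-square statistic divided by the median of the chi-square distribution with one degree of freedom (about 0.455). The text does not state how λ was computed, so this definition, the function names, and the assumption of two-sided P-values from a 1-df test are all illustrative rather than the authors' method.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def chi2_from_pvalue(p: float) -> float:
    """Convert a two-sided P-value (0 < p <= 1) to a 1-df chi-square statistic.

    For one degree of freedom, chi-square = z**2, where z is the normal
    quantile at p / 2 (the lower tail is used for numerical stability).
    """
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def genomic_inflation(pvalues) -> float:
    """Median-based genomic inflation factor (assumed definition).

    lambda = median(observed chi-square) / median(chi-square with 1 df)
    """
    observed = median(chi2_from_pvalue(p) for p in pvalues)
    expected = chi2_from_pvalue(0.5)  # median of chi-square(1), about 0.455
    return observed / expected


if __name__ == "__main__":
    # Hypothetical, evenly spaced P-values that mimic a null distribution.
    null_like = [(i + 0.5) / 1000 for i in range(1000)]
    print(round(genomic_inflation(null_like), 3))
```

Under this definition, a value close to 1 means the bulk of the test statistics follows the null distribution, which is how a λ of 1.058 is usually read.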

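The LD screen described in Section 3.2 (strong LD defined as r² > 0.8) can be sketched as follows. This is a minimal example that approximates r² by the squared Pearson correlation between genotype dosages (0, 1, or 2 copies) of the TE insertion and each SNP. The LD software and estimator used in the study are not given in the text, and the function names and SNP identifiers here are hypothetical.

```python
def genotype_r2(te_dosage, snp_dosage):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(te_dosage)
    if n != len(snp_dosage) or n < 2:
        raise ValueError("dosage vectors must have equal length >= 2")
    mean_te = sum(te_dosage) / n
    mean_snp = sum(snp_dosage) / n
    cov = sum((t - mean_te) * (s - mean_snp)
              for t, s in zip(te_dosage, snp_dosage))
    var_te = sum((t - mean_te) ** 2 for t in te_dosage)
    var_snp = sum((s - mean_snp) ** 2 for s in snp_dosage)
    if var_te == 0 or var_snp == 0:
        # A monomorphic variant has no defined correlation; treat as no LD.
        return 0.0
    return cov * cov / (var_te * var_snp)


def snps_in_strong_ld(te_dosage, snp_dosages, threshold=0.8):
    """Return the IDs of SNPs whose r² with the TE exceeds the threshold."""
    return [snp_id for snp_id, dosage in snp_dosages.items()
            if genotype_r2(te_dosage, dosage) > threshold]


if __name__ == "__main__":
    # Hypothetical dosages for six subjects; the SNP IDs are placeholders.
    te = [0, 1, 2, 1, 0, 2]
    snps = {"snp_A": [0, 1, 2, 1, 0, 2], "snp_B": [0, 0, 1, 2, 2, 1]}
    print(snps_in_strong_ld(te, snps))
```

In practice the candidate SNPs would be limited to those within 500 kb upstream and downstream of the TE locus, and an empty result corresponds to the conclusion that the TE is not tagged by a known PD risk SNP.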
            Volume 2 Issue 3 (2023)                         6                        https://doi.org/10.36922/gtm.1583