
Gene & Protein in Disease                      DNA methylation and gene expression on rats with protein malnutrition



ordinate is the gene, and different colors indicate different gene expression levels.

2.9. RNA-seq reads mapping

We aligned the reads of sample A and sample B to the UCSC (http://genome.ucsc.edu/) Homo sapiens reference genome using the HISAT package, which initially removed a portion of the reads based on the quality information accompanying each read and then mapped the remaining reads to the reference genome. HISAT allows multiple alignments per read (up to 20 by default) and a maximum of two mismatches when mapping the reads to the reference. HISAT builds a database of potential splice junctions and confirms these by comparing the previously unmapped reads against the database of putative junctions.

2.10. Transcript abundance estimation and differential expression testing

The mapped reads of each sample were assembled using StringTie. Then, the transcriptomes from all samples were merged to reconstruct a comprehensive transcriptome using Perl scripts. After the final transcriptome was generated, StringTie and edgeR were used to estimate the expression levels of all transcripts. StringTie was used to quantify mRNA expression levels by calculating FPKM (fragments per kilobase of transcript per million mapped reads). The differentially expressed mRNAs and genes were selected with log2 (fold change) >1 or log2 (fold change) <−1 and with statistical significance (P < 0.01) using R.

2.11. Genome-wide DNA methylation assay

Total DNA was extracted using the QIAamp Fast DNA Tissue Kit (Qiagen, Dusseldorf, Germany). The bisulfite sequencing libraries were constructed using the Acegen Bisulfite-Seq Library Prep Kit (AceGen, Cat. No. AG0311), according to the manufacturer’s protocol. Briefly, the genomic DNA spiked with methylated Lambda DNA was fragmented by sonication (for whole-genome bisulfite sequencing) or using MspI (NEB, USA, for reduced representation bisulfite sequencing) to a mean size of approximately 200–500 bp, then end-repaired, 5’-phosphorylated, 3’-dA-tailed, and ligated to 5-methylcytosine-modified adapters. After bisulfite treatment, the DNA was amplified with 10 cycles of polymerase chain reaction (PCR). The constructed libraries were then analyzed on an Agilent 2100 Bioanalyzer and finally sequenced on Illumina platforms using a 2×150 bp paired-end sequencing protocol.

2.12. Single-nucleotide polymorphism and indel analysis

We analyzed single-nucleotide polymorphism (SNP) sites in coding regions at the transcriptomic level. Samtools was used for mpileup processing of the HISAT alignment results of each sample against the reference genome, and the candidate SNPs and indels of each sample were then annotated with Annovar.

2.13. Statistical analysis

The methylKit software was used to analyze the differentially methylated regions (DMRs) between groups. A 1000 bp window with a 500 bp overlap was used by default. P < 0.01 was the screening threshold for differences. The GoMiner database was used to analyze the enrichment of Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways for the DMR-related genes obtained from the comparison between groups. The number of DMR-related genes included in each GO term (or KEGG pathway entry) was counted, and the P-value of enrichment significance of DMR-related genes in each GO term (or KEGG pathway entry) was calculated by the hypergeometric distribution test. A t-test was used to screen for differentially methylated sites between groups after data processing.

3. Results

3.1. The quality of raw sequencing data and differential expression analysis

All the raw sequence data were eligible for further analysis, and the results of quality control are shown in Table 1. Mapping to the genome with Hisat2 yielded a high concordant alignment rate (Table 2). The regional distribution of the alignments on the reference genome is shown in Figure 1. Valid reads that align to the reference genome can be assigned to exonic, intronic, and intergenic regions based on the region annotation of the reference genome. Under normal circumstances, the percentage of reads located in exonic regions should be the highest, while reads that map to intronic and intergenic regions may be caused by pre-mRNA splicing events, incomplete genome annotation, DNA contamination, and background noise.

3.2. Analysis of total gene expression level

The distribution statistics of expression values in Table 3 can be further visualized with a box plot of the per-sample FPKM values (Figure 2), which gives an overview of the gene expression level. For biological replicates, the reproducibility of the samples can also be preliminarily judged from the box plot. The x-coordinate is the sample name, the y-coordinate is log10 (FPKM), and the box for each sample corresponds to five statistics (maximum, upper quartile, median, lower quartile, and minimum from top to bottom).
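The five statistics behind each box in Section 3.2 can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function name `five_number_summary`, the choice to drop genes with FPKM of zero (because log10 of zero is undefined), and the "inclusive" quartile method are all assumptions made for illustration.

```python
import math
import statistics


def five_number_summary(fpkm_values, min_fpkm=0.0):
    """Return (max, upper quartile, median, lower quartile, min) of log10(FPKM).

    Assumption: genes with FPKM <= min_fpkm are dropped before the log
    transform, since log10(0) is undefined. The source does not state how
    zero values were handled.
    """
    logged = sorted(math.log10(v) for v in fpkm_values if v > min_fpkm)
    if len(logged) < 2:
        raise ValueError("need at least two expressed genes")
    # Quartiles by linear interpolation over the observed range (assumed method).
    q1, median, q3 = statistics.quantiles(logged, n=4, method="inclusive")
    return (logged[-1], q3, median, q1, logged[0])


# Hypothetical FPKM values for one sample, for illustration only.
summary = five_number_summary([0.0, 1.0, 10.0, 100.0, 1000.0, 10000.0])
```

Different plotting tools compute quartiles and whiskers slightly differently, so values from this sketch need not match a given figure exactly.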

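The hypergeometric enrichment test named in Section 2.13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the GoMiner implementation: the function name and parameter names are hypothetical, the background is assumed to be the full set of annotated genes, and the P-value is taken as the upper tail (at least as many DMR-related genes in the term as observed).

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, dmr_genes, overlap):
    """Upper-tail P(X >= overlap) for a hypergeometric variable X.

    total_genes: size of the background gene set (assumed annotation universe)
    term_genes:  background genes annotated to the GO term or KEGG pathway
    dmr_genes:   DMR-related genes drawn from the background
    overlap:     DMR-related genes that fall in the term
    """
    max_overlap = min(term_genes, dmr_genes)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap must lie between 0 and min(term_genes, dmr_genes)")
    # Exact integer arithmetic; math.comb returns 0 when k > n.
    favourable = sum(
        comb(term_genes, i) * comb(total_genes - term_genes, dmr_genes - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favourable / comb(total_genes, dmr_genes)


# Hypothetical counts: 10 background genes, 4 in the term, 3 DMR-related, 2 shared.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

In practice, P-values computed across many GO terms or KEGG pathways are usually corrected for multiple testing; the source does not state whether a correction was applied.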

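The selection rule for differentially expressed genes in Section 2.10 (log2 fold change above 1 or below −1, with P < 0.01) can be sketched as a simple filter. The record type `GeneResult`, the function name, and the example values are hypothetical; the authors performed this step in R, and this Python sketch only restates the thresholds given in the text.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene_id: str
    log2_fold_change: float
    p_value: float


def select_differentially_expressed(results, min_abs_log2fc=1.0, max_p=0.01):
    """Keep genes with |log2 fold change| > min_abs_log2fc and P < max_p."""
    return [
        r for r in results
        if abs(r.log2_fold_change) > min_abs_log2fc and r.p_value < max_p
    ]


# Hypothetical rows, for illustration only.
table = [
    GeneResult("geneA", 2.3, 0.001),    # up-regulated, significant: kept
    GeneResult("geneB", -1.7, 0.004),   # down-regulated, significant: kept
    GeneResult("geneC", 0.6, 0.0001),   # fold change too small: dropped
    GeneResult("geneD", 3.1, 0.2),      # not significant: dropped
]
selected = select_differentially_expressed(table)
```

Both comparisons are strict, matching the ">1", "<−1", and "P < 0.01" wording of the text.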
            Volume 1 Issue 2 (2022)                         4                      https://doi.org/10.36922/gpd.v1i2.169