
Eurasian Journal of Medicine and Oncology
Genomics of breast cancer in Western Kazakhstan


Flex Real-Time PCR System (Thermo Fisher Scientific, USA) in the research laboratory of Karaganda Medical University, ensuring high sensitivity and specificity in the detection of SNPs.

2.3. Data collection and integration of genome-wide association studies

2.3.1. Collection of genome-wide association studies data

Publicly available GWAS datasets relevant to BC were reviewed to identify SNPs associated with BC susceptibility. These datasets were integrated with the results of our NGS sequencing using bioinformatics pipelines. The integration process involved cross-referencing SNPs identified in our cohort with previously reported GWAS findings, enabling a comprehensive evaluation of genetic risk factors specific to the Kazakh population.

2.3.2. Bioinformatics analysis

Bioinformatics analysis was conducted at the Karaganda Medical University using the following pipeline: raw sequencing data were aligned to the human reference genome (GRCh38) using the BWA-MEM algorithm; variant calling was performed using the Genome Analysis Toolkit; and variants were annotated using the ANNOVAR tool to identify functional implications of the detected SNPs. Quality control measures included filtering for a minimum read depth of 30×, a minimum Phred quality score of 20, and exclusion of variants with a minor allele frequency of < 0.05.

2.4. Statistical analyses

2.4.1. Genotype-phenotype association analysis

The association between SNPs and BC risk was analyzed using various genetic inheritance models, including dominant, codominant, recessive, overdominant, and log-additive models. Hardy-Weinberg equilibrium was tested to verify the assumptions of population genetics. Logistic regression was applied, with Bonferroni correction for multiple comparisons to control for false positives.

2.4.2. Risk factor analysis and model validation

Pearson's Chi-square test was used to assess the statistical significance of SNPs associated with BC. Variables were ranked by descending significance, allowing for the selection of key risk factors. In addition, receiver operating characteristic (ROC) curve analysis was performed to evaluate the predictive power of the identified genetic markers. The area under the curve (AUC) was calculated to determine the discriminative ability of the model. A P < 0.05 was considered statistically significant for all tests.

3. Results

3.1. Comparative analysis of allele and genotype differences between breast cancer patients and control group

A total of 149 patients with a confirmed diagnosis of BC, admitted to the Medical Center of West Kazakhstan Marat Ospanov Medical University, were included in the study. The distribution of patients by cancer stage was as follows: 17 patients (11.4%) at stage I, 100 patients (67.1%) at stage II, 29 patients (19.5%) at stage IIIa, and 3 patients (2.0%) at stage IIIb. The control group consisted of 150 conditionally healthy women, recruited as part of a project funded by the Ministry of Education and Science of the Republic of Kazakhstan. All participants were unrelated women of Kazakh lineage. The average age of the patients was 56 years (95% confidence interval [CI]: 29.81–45.36), and a positive family history of BC was identified in 15 patients (10%). The genotyping panel included 113 polymorphisms selected on the basis of GWAS data and located on various chromosomes, in different functional regions of genes, and in intergenic regions.

The statistical analysis of our data yielded absolute and relative allele and genotype frequencies for both BC patients and the control group, along with the P-values (P < 0.05) for the Hardy–Weinberg equilibrium, as presented in Table S1. A comparative analysis of allele and genotype differences between BC patients and the control group revealed 28 polymorphisms with a statistically significant association with BC. These results are detailed in Table 1.

For the polymorphisms that differed significantly between the study groups, we consulted the GWAS catalog (https://www.ebi.ac.uk/gwas/) to determine whether they had previously been associated with BC risk in other researchers' data. Seven of them were listed in the GWAS catalog as statistically significant risk polymorphisms: RARG (rs2229774), FGFR2 (rs2981582), ATM (rs1800057), MAP3K1 (rs889312), BRCA2 (rs11571833), FGFR2 (rs7895676), and FGFR2 (rs1219648).

Next, we assessed the influence of genetic inheritance models on BC risk. Given the varying levels of statistical significance observed in the allele and genotype differences, we evaluated genotype-phenotype associations across different inheritance models. The reference (non-risk) allele is the major allele in most cases but can, in some instances, be the minor allele; therefore, calculations were performed under both assumptions. The assessment was based on a case–control design using a generalized linear model.
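The inheritance models named in Section 2.4.1 differ only in how the three genotypes are collapsed into a predictor before the case–control comparison. The following is a minimal Python sketch of that idea, not the study's actual pipeline: the function name `odds_ratio`, the `flip` switch for treating the other allele as the risk allele, and the genotype counts are all hypothetical, and a simple 2×2 odds ratio with a Wald interval stands in for the generalized linear model.

```python
import math

# Genotypes are coded as the number of copies of the presumed risk allele (0, 1, 2).
# Each binary model maps that count to exposed (1) or unexposed (0).
# The codominant and per-allele models need a regression fit and are omitted here.
MODELS = {
    "dominant": lambda g: int(g >= 1),      # one or two copies vs none
    "recessive": lambda g: int(g == 2),     # two copies vs none or one
    "overdominant": lambda g: int(g == 1),  # heterozygotes vs both homozygotes
}

def odds_ratio(case_counts, control_counts, model, flip=False):
    """Return (OR, lower, upper) with a 95% Wald interval.

    Counts are (n_0_copies, n_1_copy, n_2_copies). With flip=True the other
    allele is treated as the risk allele. Assumes no cell of the 2x2 table is zero.
    """
    if flip:
        case_counts = case_counts[::-1]
        control_counts = control_counts[::-1]
    code = MODELS[model]
    a = sum(n for g, n in enumerate(case_counts) if code(g))     # exposed cases
    b = sum(case_counts) - a                                     # unexposed cases
    c = sum(n for g, n in enumerate(control_counts) if code(g))  # exposed controls
    d = sum(control_counts) - c                                  # unexposed controls
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)

# Made-up counts for illustration only (149 cases, 150 controls).
cases = (40, 70, 39)
controls = (60, 65, 25)
for name in MODELS:
    or_, lo, hi = odds_ratio(cases, controls, name)
    print(f"{name}: OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Running each model twice, once with `flip=False` and once with `flip=True`, mirrors the idea of considering either allele as the reference.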

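The Hardy-Weinberg equilibrium check mentioned in Section 2.4.1 is commonly done as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The sketch below assumes that approach (the source does not state which test variant was used); the function name `hwe_chi_square` and the example counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for a biallelic SNP and returns
    (chi-square statistic, P-value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0              # monomorphic site: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Made-up control-group counts for illustration only.
chi2, p_value = hwe_chi_square(60, 65, 25)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A small P-value in the control group indicates a departure from equilibrium, which is usually treated as a flag for genotyping problems or population structure.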

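The AUC described in Section 2.4.2 equals the probability that a randomly chosen case receives a higher marker score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition; the function name `roc_auc`, the use of risk-allele counts as the score, and the toy data are assumptions made for illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    scores: numeric risk scores, one per subject.
    labels: 1 for cases, 0 for controls, in the same order as scores.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # A case scoring above a control counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: score = number of risk alleles carried at one SNP.
scores = [2, 1, 1, 0, 1, 0, 0, 0]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(f"AUC = {roc_auc(scores, labels):.3f}")
```

An AUC of 0.5 corresponds to no discrimination between cases and controls, and 1.0 to perfect separation.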
Volume 9 Issue 1 (2025), p. 94, doi: 10.36922/ejmo.5385