Eurasian Journal of Medicine and
Oncology
Genomics of breast cancer in Western Kazakhstan
Flex Real-Time PCR System (Thermo Fisher Scientific, USA) in the research laboratory of Karaganda Medical University, ensuring high sensitivity and specificity in the detection of SNPs.

2.3. Data collection and integration of genome-wide association studies

2.3.1. Collection of genome-wide association studies data

Publicly available GWAS datasets relevant to BC were reviewed to identify SNPs associated with BC susceptibility. These datasets were integrated with the results of our NGS sequencing using bioinformatics pipelines. The integration process involved cross-referencing SNPs identified in our cohort with previously reported GWAS findings, enabling a comprehensive evaluation of genetic risk factors specific to the Kazakh population.

2.3.2. Bioinformatics analysis

Bioinformatics analysis was conducted at the Karaganda Medical University using the following pipeline: raw sequencing data were processed using the BWA-MEM algorithm for alignment to the human reference genome (GRCh38); variant calling was performed using the Genome Analysis Toolkit, followed by annotation using the ANNOVAR tool to identify functional implications of the detected SNPs. Quality control measures included filtering for a minimum read depth of 30×, a minimum Phred quality score of 20, and exclusion of variants with a minor allele frequency of < 0.05.

2.4. Statistical analyses

2.4.1. Genotype-phenotype association analysis

The association between SNPs and BC risk was analyzed using various genetic inheritance models, including dominant, codominant, recessive, overdominant, and logarithmic models. Hardy-Weinberg equilibrium was tested to verify the assumptions of population genetics. Logistic regression, with Bonferroni correction for multiple comparisons, was applied to adjust for false positives.

2.4.2. Risk factor analysis and model validation

To assess the statistical significance of SNPs associated with BC, Pearson’s Chi-square test was used. Variables were ranked by descending significance, allowing for the selection of key risk factors. In addition, receiver operating characteristic (ROC) curve analysis was performed to evaluate the predictive power of the identified genetic markers. The area under the curve (AUC) was calculated to determine the discriminative ability of the model. A P < 0.05 was considered statistically significant for all tests.

3. Results

3.1. Comparative analysis of allele and genotype differences between breast cancer patients and control group

A total of 149 patients with a confirmed diagnosis of BC, admitted to the Medical Center of West Kazakhstan Marat Ospanov Medical University, were included in the study. The distribution of patients by cancer stage was as follows: 17 patients (11.4%) at stage I, 100 patients (67.1%) at stage II, 29 patients (19.5%) at stage IIIa, and 3 patients (2.0%) at stage IIIb. The control group consisted of 150 conditionally healthy women, recruited as part of a project funded by the Ministry of Education and Science of the Republic of Kazakhstan. All participants were unrelated Kazakh women based on their lineage. The average age of the patients was 56 years (95% confidence interval [CI]: 29.81 – 45.36), and a positive family history of BC was identified in 15 patients (10%). The genotyping panel included 113 polymorphisms localized in different regions of various chromosomes, as well as in different functional regions of genes and intergenic regions, based on GWAS data.

The statistical analysis of our data yielded absolute and relative allele and genotype frequencies for both BC patients and the control group, along with the P-values (P < 0.05) for the Hardy–Weinberg equilibrium, as presented in Table S1. A comparative analysis of allele and genotype differences between BC patients and the control group revealed 28 statistically significant polymorphisms associated with BC. These results are detailed in Table 1.

For the identified polymorphisms demonstrating statistically significant differences between the study groups, we referred to the GWAS catalog (https://www.ebi.ac.uk/gwas/) to determine whether these polymorphisms had been previously associated with BC risk in other researchers’ data. The GWAS catalog identified seven statistically significant risk polymorphisms: RARG (Rs2229774), FGFR2 (Rs2981582), ATM (Rs1800057), MAP3K1 (Rs889312), BRCA2 (Rs11571833), FGFR2 (Rs7895676), and FGFR2 (Rs1219648).

Next, we assessed the influence of genetic inheritance models on BC risk. Given the varying levels of statistical significance observed in the allele and genotype differences, we evaluated genotype-phenotype associations across different inheritance models. Calculations were performed assuming that the reference (non-risk) allele could either be the major allele (which is true in most cases) or, in some instances, the minor allele. Therefore, both possibilities were considered in the analysis. The assessment was based on a case–control design using a generalized linear model.
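
As an illustration of how genotypes can be coded under the inheritance models named above, and how the choice of reference allele changes the result, the following minimal Python sketch may help. It is not the authors' pipeline: the function names (`encode`, `two_by_two`, `odds_ratio`, `chi_square_p`), the alleles, and the genotype counts are hypothetical. In place of the generalized linear model it uses a plain 2×2 odds ratio with Pearson's Chi-square test; for a single binary predictor this odds ratio matches the unadjusted estimate from logistic regression.

```python
from math import erfc, sqrt

def encode(genotype, risk_allele, model):
    """Map a two-letter genotype such as 'AG' to a numeric predictor."""
    n = genotype.count(risk_allele)  # copies of the assumed risk allele: 0, 1, or 2
    if model == "dominant":       # at least one risk allele
        return int(n >= 1)
    if model == "recessive":      # two risk alleles
        return int(n == 2)
    if model == "overdominant":   # heterozygotes versus both homozygotes
        return int(n == 1)
    if model == "log-additive":   # per-allele (trend) coding
        return n
    raise ValueError(f"unknown model: {model}")

def two_by_two(cases, controls, risk_allele, model):
    """Return (a, b, c, d): exposed/unexposed cases, exposed/unexposed controls."""
    if model == "log-additive":
        raise ValueError("a 2x2 table needs a binary coding")
    a = sum(encode(g, risk_allele, model) for g in cases)
    c = sum(encode(g, risk_allele, model) for g in controls)
    return a, len(cases) - a, c, len(controls) - c

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

def chi_square_p(a, b, c, d):
    """Pearson's Chi-square (1 df, no continuity correction) and its P-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, erfc(sqrt(chi2 / 2))  # survival function of chi-square with 1 df

# Hypothetical toy data, not taken from the study.
cases = ["AA"] * 10 + ["AG"] * 20 + ["GG"] * 20
controls = ["AA"] * 25 + ["AG"] * 20 + ["GG"] * 5

# Treat G as the risk allele (A as reference), then swap the roles.
table_g = two_by_two(cases, controls, "G", "dominant")
table_a = two_by_two(cases, controls, "A", "dominant")
or_g = odds_ratio(*table_g)
or_a = odds_ratio(*table_a)
chi2_g, p_g = chi_square_p(*table_g)
print(table_g, or_g, chi2_g, p_g)
print(table_a, or_a)
```

With these toy counts, treating G as the risk allele gives an odds ratio above 1 under the dominant model, whereas treating A as the risk allele gives an odds ratio below 1. This is why the direction of effect has to be read together with the reference allele that was assumed.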
Volume 9 Issue 1 (2025) 94 doi: 10.36922/ejmo.5385

