
Design+                                                             ML for predicting Alzheimer’s progression




modeling, evaluation, and deployment. Figure 1 illustrates a graphic representation of these CRISP-DM phases.

Figure 1. Phases of the cross-industry process for data mining

4.1. Business understanding

The business understanding phase involves defining business objectives, assessing the current context, establishing data mining goals, and formulating a project plan. As outlined in the introduction, a background study was conducted, and the research objectives were clearly defined. The success criteria for this study involved benchmarking classifier performance against the AD classification model presented by Rahman and Prasad⁶ and comparing the best diagnosis classifier with the one identified in their study.

This comparison focused on four key metrics critical for evaluating classifier performance: (i) accuracy, indicating the proportion of correctly predicted instances relative to the total number of instances in the dataset; (ii) precision, a measure of prediction reliability, reflecting the ratio of true positive predictions to all positive predictions; (iii) recall, also referred to as sensitivity, measuring the classifier’s ability to identify actual positive cases; and (iv) F1-score, the harmonic mean of recall and precision, which balances the trade-off between these two metrics.⁸

A comprehensive project plan was formulated based on available resources, requirements, and risk assessments. The plan encompassed tasks across each CRISP-DM phase, including the selection of appropriate tools, methodologies, and risk mitigation strategies. The primary tools utilized were Google Colab and Python, with tasks involving data preparation, cleaning, and analysis. Python libraries, particularly functionalities

Table 1. Dataset summary

Variable                                            Description
Demographics
 Age                                                55–96 years old
 Gender                                             Categorized as “Female” or “Male”
Medical history
 Psychiatric (MH_PSYCH)                             Binary features
 Neurologic (MH_NEURL)
 Cardiovascular (MH_CARD)
 Hepatic (MH_HEPAT)
 Musculoskeletal (MH_MUSCL)
 Endocrine–metabolic (MH_ENDO)
 Gastrointestinal (MH_GAST)
 Renal–genitourinary (MH_RENA)
 Smoking (MH_SMOK)
 Malignancy (MH_MALI)
ApoE genotype
 Two-allele genotype                                Each individual carries two ApoE alleles, and each allele can be E2, E3, or E4
Neuropsychological assessments
 Clinical dementia rating (CDGLOBAL)                Total number of story units recalled immediately; scores ranged from 0 to 25
 Mini-mental state exam (MMSCORE)                   Total number of story units recalled after a delay; scores ranged from 0 to 25
 Logical memory immediate recall (LIMMTOTAL)        -
 Logical memory delayed recall (LDELTOTAL)
Blood analysis
 Thyroid stimulating hormone (AXT117)
 Vitamin B12 (BAT126)
 Red blood cell count (HMT3)
 White blood cell count (HMT7)
 Platelet count (HMT13)
 Hemoglobin (HMT40)
 Mean corpuscular hemoglobin (HMT100)
 Mean corpuscular hemoglobin concentration (HMT102)
 Urea nitrogen (RCT6)
 Serum glucose (RCT11)
 Cholesterol (high performance; RCT120)
 Creatinine (rate blanked; RCT329)
Diagnosis
 Diagnostic results                                 Categorized into healthy control, mild cognitive impairment, and Alzheimer’s disease
Abbreviation: ApoE: Apolipoprotein E.
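
The four evaluation metrics defined in Section 4.1 can be illustrated with a minimal Python sketch. It is an illustration only, not the implementation used in this study: the function name `classification_metrics`, the one-vs-rest treatment of a single "positive" class, and the toy labels are all assumptions introduced here, and the text does not state how the metrics were aggregated across the three diagnostic classes.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy over all classes, plus precision, recall, and F1 for one
    class treated as 'positive' (one-vs-rest; an assumption of this sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))

    # Counts for the chosen positive class
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # (i) accuracy: correct predictions / all instances
    accuracy = correct / len(pairs) if pairs else 0.0
    # (ii) precision: true positives / all positive predictions
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # (iii) recall (sensitivity): true positives / all actual positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # (iv) F1-score: harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (not data from the study):
# HC = healthy control, MCI = mild cognitive impairment, AD = Alzheimer's disease
y_true = ["AD", "AD", "MCI", "HC", "AD", "HC", "MCI", "HC"]
y_pred = ["AD", "MCI", "MCI", "HC", "AD", "AD", "MCI", "HC"]
metrics = classification_metrics(y_true, y_pred, positive="AD")
```

In this toy example, six of the eight predictions are correct, so accuracy is 0.75. With AD as the positive class there are two true positives, one false positive, and one false negative, so precision, recall, and F1-score each equal 2/3.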


            Volume 2 Issue 3 (2025)                         3                            doi: 10.36922/DP025270031