Design+ ML for predicting Alzheimer’s progression
One significant obstacle in addressing AD is the high cost associated with traditional imaging techniques and diagnostic procedures. While these methods are beneficial, they are often highly expensive for patients and healthcare systems. Nevertheless, emerging alternatives, such as genetic markers, neuropsychological assessments, and biomarker analysis, show promise as more accessible and cost-effective diagnostic tools [2]. By prioritizing these non-imaging methods, the financial burden of diagnosis may be alleviated, thereby broadening access to care for individuals with AD.

In this landscape of challenges and opportunities, machine learning (ML) has emerged as a transformative tool. With its ability to process complex datasets and extract valuable insights, ML holds the potential to improve AD diagnosis and management. Through the utilization of novel data and rigorous training, ML algorithms excel at predicting outcomes and providing invaluable guidance for decision-making processes. Moreover, ML enables earlier disease detection and intervention, thereby contributing to improved patient outcomes and enhanced quality of life [3]. The adaptability of ML models further allows for continual refinement and optimization, ensuring ongoing improvements in prediction accuracy and diagnostic efficacy.

The primary objective of this study is to develop a robust multi-class classification model for predicting AD among three distinct groups: healthy control (HC), individuals with mild cognitive impairment (MCI), and those diagnosed with AD. Leveraging non-imaging data from the Australian AD Neuroimaging Initiative [4], with a particular emphasis on the Australian Imaging Biomarkers and Lifestyle Study of Aging (AIBL) [4,5], this study utilizes random forest (RF) and Extreme Gradient Boosting (XGBoost) algorithms, along with their optimized models. Through comparative analysis, the most effective classification model is identified. In addition, this study aims to enhance interpretability through feature importance analysis and the evaluation of various classifiers. These efforts are expected to streamline the predictive process for AD, facilitate early detection, enable personalized treatment strategies, and optimize resource allocation. The ultimate goal is to provide valuable insights to inform the development of improved, cost-effective diagnostic and therapeutic approaches for addressing this debilitating condition.

2. Existing work

Many researchers have conducted studies on classifying AD using various datasets. In alignment with the present study’s objectives, Rahman and Prasad [6] addressed the challenge of accurately diagnosing AD (a disease that severely impacts cognitive and behavioral abilities) as a binary classification problem. Utilizing non-imaging data from the AIBL, they built RF models employing different combinations of data and preprocessing steps. An RF is an ML algorithm that uses an ensemble of decision trees to make predictions. It is a supervised learning method, trained on labeled data to classify or predict outcomes. RFs are known for their accuracy and ability to handle complex datasets.

Their approach included using scaled and unscaled data for simple RF classifiers, tuned RF classifiers, and RF classifiers with selected features using DALEX and Boruta packages in R software. Their results showed that the tuned RF classifier, which utilized the original data, achieved an impressive 96% accuracy in classifying AD into HC and non-HC categories, with precision and recall scores exceeding 97%. Model evaluation was primarily focused on accuracy, in line with their research objective of effectively classifying instances of AD. Furthermore, they developed multiple diagnostic classifiers and evaluated them to streamline the prediction process, aiming to create a cost-effective diagnosis method.

Notably, their classifier based on neuropsychological assessment variables demonstrated exceptional performance, achieving an accuracy of 93.68%. This model required only 4 out of 30 test variables, highlighting its potential to increase efficiency in diagnostic processes.

3. Dataset description

The AIBL study commenced in 2006 with the aim of investigating the origins of AD and developing tools for identifying cognitive decline at its early stages [5]. The study includes a diverse population comprising healthy individuals, those with MCI, and those diagnosed with AD. With over 1,000 participants, the AIBL dataset represents a comprehensive resource for AD research. It supports investigations into the associations between lifestyle factors and cognitive impairment and facilitates the development and evaluation of algorithms for early AD detection. A summary of the dataset is presented in Table 1.

4. Methodology

The Cross-Industry Process for Data Mining (CRISP-DM), a widely adopted methodology recognized for its effectiveness across industries, was employed in this study. It offers flexibility while maintaining a comprehensive and structured approach compared to other methods [7]. The method comprises distinct phases: business understanding, data understanding, data preparation,
Volume 2 Issue 3 (2025) 2 doi: 10.36922/DP025270031

