Artificial Intelligence in Health ViT for neurodegeneration diagnosis
Diagnosing NDDs is exceedingly demanding and requires years of training and experience. Hence, according to some studies, it has been estimated that 75% of NDD cases are undiagnosed worldwide due to various reasons, including the diagnosis complexity.[3] Astoundingly, this number rises to 90% in low- and middle-income countries, according to the same analysis.[3] Moreover, the growing number of NDD cases could devastate healthcare systems in coming years, according to this study.[3] Therefore, innovative and affordable methods are needed to assist doctors and decrease this diagnosis gap.

The rapid progress of artificial intelligence (AI) and its sub-fields has led to outstanding results in different domains, including medical image processing. Thus, researchers attempted to harness the power of deep neural networks (DNNs) in diagnosing NDDs and demonstrated that they could have competitive performance compared to human experts.[4,5]

The advent of vision transformers (ViTs) resulted in distinguished performance in various computer vision tasks, surpassing traditional approaches like convolutional neural networks (CNNs).[6] Therefore, their application in NDD diagnosis has been a trending research subject and the focal point of various studies, including this paper. We developed our model based on vanilla ViT, proposed by Dosovitskiy et al.,[6] and trained it using ¹⁸F-fluorodeoxyglucose (¹⁸F-FDG) positron emission tomography (PET) brain scans provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI).[7] The motivation behind our work is as follows:
• Dosovitskiy et al.[6] achieved exceptional results in image classification tasks by applying standard transformers,[8] utilized in natural language processing (NLP), directly to images with the least possible modifications. In addition to its notable performance, this approach enables vision models to benefit from advancements in the NLP domain, including large language models, because of architectural similarities. Consequently, vanilla ViT[6] was a rational and sustainable foundation due to its design, performance, and simplicity for investigating what transformer-based vision models accomplish in diagnosing NDDs.
• ¹⁸F-FDG PET scans, which reveal metabolic activities of various brain regions by measuring their glucose consumption, are considered pivotal in diagnosing and discriminating different NDDs, including mild cognitive impairment (MCI) and AD.[9] Although other brain imaging technologies such as computed tomography (CT) and magnetic resonance imaging (MRI) can expose NDDs too, PET scans have proved to be superior in exposing these brain conditions as soon as possible and earlier than other methods.[10,11]

Understanding the model’s logic is the key to obtaining explainability in the medical domain, as human users must comprehend the reasoning behind each prediction before considering it. Therefore, we combined ViT’s attention maps and the Automated Anatomical Atlas 3 (AAL3)[12] brain atlas to develop an explainable model that provides the most critical brain regions in the classification. The proposed model also delivers a heatmap of the input scan, in which the brightness of each pixel corresponds to its significance in the model’s decision, overlaid on the original image, allowing the user to investigate pivotal regions further.

Our model achieves an F1 score of 81% (macro-average of all classes) on the test dataset, surpassing other approaches by a considerable gap. Please note that we only analyze our results against comparable studies regarding classes and the type of input brain scans. Furthermore, our proposed ViT has remarkable performance, in contrast to other models, in distinguishing MCI, which has proved to be one of the most challenging brain conditions to diagnose due to its prodromal nature. MCI is a transition stage between cognitively normal (CN) and AD. Consequently, MCI patients may experience some common NDD symptoms, such as memory loss or language problems, but the extent is such that they do not impede daily life.[13] Therefore, differentiating MCI cases from other categories can be inherently complicated.

Finally, we conducted experiments to reveal the contribution of different brain regions to the model’s decisions. Although NDDs can affect various areas, this study showed that some brain regions are significantly more critical in the model’s predictions.

To summarize, the contribution and novelty of this research is as follows:
• Introducing a complete data pre-processing and reshaping pipeline for 3D PET scans and brain atlases, allowing for fine-tuning of pre-trained ViTs on this type of data. This step is crucial since most ViTs are pre-trained on natural three-channel RGB images. Therefore, resizing and reshaping 3D data into three channels are essential to match the model’s input shape.
• Obtaining competitive performance in ternary NDD classification (CN/MCI/AD) utilizing ¹⁸F-FDG PET brain scans and vanilla ViT.[6] This approach is beneficial since vanilla ViT mostly shares the same architecture as the standard transformer,[8] used in NLP. Therefore, these architectural similarities could enable future studies to leverage advancements in NLP.
• Outperforming previous approaches by a noticeable margin (specifically in predicting MCI cases) in
Volume 2 Issue 4 (2025) 34 doi: 10.36922/AIH025140026
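
To make the reshaping requirement mentioned among the contributions more concrete, the following is a minimal sketch of one possible way to map a 3D volume onto a three-channel 2D image. It is not the authors' pipeline, which is not specified on this page. The function name `volume_to_three_channels`, the mosaic layout (contiguous thirds of the slices tiled into a grid per channel), and the zero-padding are all assumptions made for illustration. Resizing to a particular ViT input resolution is not covered.

```python
import numpy as np


def volume_to_three_channels(volume, grid_rows, grid_cols):
    """Tile a (D, H, W) volume into a (3, grid_rows*H, grid_cols*W) mosaic.

    Hypothetical layout: slices are assigned to the three channels in
    contiguous groups of grid_rows * grid_cols slices each.
    """
    depth, height, width = volume.shape
    needed = 3 * grid_rows * grid_cols
    if depth > needed:
        raise ValueError("grid is too small for the number of slices")

    # Zero-pad along the depth axis so the slices fill all three mosaics.
    padded = np.zeros((needed, height, width), dtype=volume.dtype)
    padded[:depth] = volume

    # (3, rows, cols, H, W) -> (3, rows, H, cols, W) -> (3, rows*H, cols*W)
    tiles = padded.reshape(3, grid_rows, grid_cols, height, width)
    return tiles.transpose(0, 1, 3, 2, 4).reshape(
        3, grid_rows * height, grid_cols * width
    )


# Toy example: 12 slices of 2x2 voxels become a 3-channel 4x4 image.
toy_volume = np.arange(12 * 2 * 2).reshape(12, 2, 2)
toy_image = volume_to_three_channels(toy_volume, grid_rows=2, grid_cols=2)
```

In this layout, the tile at grid position (row i, column j) of channel c holds slice `c * grid_rows * grid_cols + i * grid_cols + j` of the padded volume.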

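
The idea of combining attention maps with a brain atlas to identify the most critical regions can be illustrated with a short sketch. It assumes that the attention map has already been resampled to the same shape as an integer-labeled atlas array, and it scores each region by its mean attention. The function `rank_regions_by_attention`, the averaging rule, and the placeholder region names are hypothetical and are not taken from the paper or from the AAL3 label set.

```python
import numpy as np


def rank_regions_by_attention(attention, atlas_labels, region_names):
    """Rank atlas regions by mean attention (illustrative scoring rule).

    attention    : float array, one attention value per voxel/pixel
    atlas_labels : int array of the same shape; 0 is treated as background
    region_names : dict mapping label id -> region name (placeholders here)
    """
    if attention.shape != atlas_labels.shape:
        raise ValueError("attention map and atlas must have the same shape")

    scores = {}
    for label, name in region_names.items():
        mask = atlas_labels == label
        if mask.any():  # skip labels that do not occur in this atlas array
            scores[name] = float(attention[mask].mean())

    # Highest mean attention first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy example with two placeholder regions and one background pixel.
toy_attention = np.array([[0.9, 0.8], [0.1, 0.3]])
toy_atlas = np.array([[1, 1], [2, 0]])
toy_ranking = rank_regions_by_attention(
    toy_attention, toy_atlas, {1: "region_a", 2: "region_b"}
)
```

Other aggregation rules, such as the sum or the maximum attention per region, would fit the same structure; the mean is used here only because it does not favor larger regions.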

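
Since the reported F1 score is a macro-average over all classes, a small worked example may help clarify the metric: the F1 score is computed separately for each class and the per-class scores are then averaged with equal weight. The labels and predictions below are invented toy data for the CN/MCI/AD setting and do not reproduce any result from the paper.

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels (invented for illustration only).
toy_true = ["CN", "CN", "MCI", "MCI", "AD", "AD"]
toy_pred = ["CN", "MCI", "MCI", "MCI", "AD", "CN"]
toy_score = macro_f1(toy_true, toy_pred, classes=["CN", "MCI", "AD"])
```

For this toy data, the per-class F1 scores are 0.5 (CN), 0.8 (MCI), and 2/3 (AD), so the macro-average is approximately 0.656. Because every class contributes equally, weak performance on a difficult class such as MCI lowers the macro-average regardless of how many samples that class has.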