Artificial Intelligence in Health ViT for neurodegeneration diagnosis
$$w_c = \frac{1}{\text{Class Frequency}} \;\rightarrow\; w_{CN} = \frac{1}{140},\quad w_{MCI} = \frac{1}{160},\quad w_{AD} = \frac{1}{160} \tag{III}$$

Finally, Algorithm 1 shows the data augmentation process for model training.

Algorithm 1. The data augmentation procedure used in the model training
t1 ← GaussianBlur(kernel_size=(3, 3), sigma=(0.1, 2))
t2 ← GaussianNoise(mean=0, std=0.05)
t3 ← ColorJitter(brightness=0.1)
t4 ← ColorJitter(contrast=0.1)
t5 ← ColorJitter(saturation=0.1)
random_choice ← RandomChoice([t1, t2, t3, t4, t5])
transforms ← RandomApply([random_choice], p=0.7)

3.6. Explainability

Explainability is vital in healthcare since experts should understand the reason behind the model’s predictions before considering or relying on them. Therefore, we combined the model’s attention maps and the AAL3 brain atlas to discover the brain regions that most strongly affect the model’s conclusions. During inference, our model follows these steps to provide various details to the user:
• The model extracts the attention map of each input scan and overlays this data on the original image. The outcome of this step is a heatmap of the brain regions, in which pixel values correspond to their influence on the model’s decision
• The model highlights pixels with values exceeding 95% of the maximum value using red rectangles. This step enables the user to examine and analyze all key areas in the input scan
• Ultimately, the model overlays the heatmap, extracted in the first step, on the AAL3 atlas and locates pixels with the highest intensity to provide the names of the brain regions that encompass them. Providing these areas’ names is crucial to the user since they are the most influential in the model’s prediction.

We reshaped the AAL3 atlas to 3 × 950 × 570 to fit the size of our input scans using the following procedure, the result of which is shown in Figure 4:
• By overlaying the AAL3 atlas on a pre-processed sample in MRIcron,³⁸ we first reshaped AAL3 to 79 × 95 × 79
• It was crucial to verify that both the reshaped atlas and the input scan followed the same coordinate system. Therefore, we loaded the resulting new atlas and the input scan into MRIcron again and compared their coordinate systems side by side. This step ensured that corresponding coordinates referred to the same brain area in both files
• We discarded the first ten and last nine slices from AAL3, similar to the input scans, resulting in a shape
Figure 3. The model architecture is identical to the ViT-Base introduced by Dosovitskiy et al.⁶ First, the scan is reshaped into 3 × 384 × 384 to fit the model’s input. Then, it is split into patches of shape 3 × 32 × 32, flattened, and provided to a standard transformer along with position embeddings that contain spatial information. At the last stage, an MLP acts as the classification head to map the final hidden state into the probability of classes. The illustration of the model’s architecture was inspired by Dosovitskiy et al.⁶
Abbreviations: AD: Alzheimer’s disease; CN: Cognitively normal; MCI: Mild cognitive impairment; MLP: Multilayer perceptron.
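
The shapes quoted in the Figure 3 caption determine how many patches the transformer receives. The short sketch below works through that arithmetic. The variable names are illustrative, and the extra classification token is an assumption based on the original ViT design rather than something stated in the caption.

```python
# Patch arithmetic for the shapes given in the Figure 3 caption.
channels, height, width = 3, 384, 384   # input shape from the caption
patch = 32                              # patch side length from the caption

# Non-overlapping patches tile the image in a regular grid.
num_patches = (height // patch) * (width // patch)

# Each 3 x 32 x 32 patch is flattened into a single vector.
patch_dim = channels * patch * patch

# Assumption: one [CLS] token is prepended, as in the original ViT design.
sequence_length = num_patches + 1

print(num_patches, patch_dim, sequence_length)  # prints: 144 3072 145
```

Under these assumptions, each scan is presented to the transformer as 144 flattened patches of length 3072 before the linear projection.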
Volume 2 Issue 4 (2025) 38 doi: 10.36922/AIH025140026
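
The inference-time steps of Section 3.6 (thresholding the heatmap at 95% of its maximum and naming the atlas regions under the selected pixels) can be illustrated with a minimal sketch. This is not the authors’ implementation: the function names `key_pixels` and `regions_at`, the toy 3 × 3 arrays, the region names, and the convention that label 0 marks background are all assumptions made for illustration, and the drawing of red rectangles is omitted.

```python
def key_pixels(heatmap, fraction=0.95):
    """Return (row, col) positions whose value exceeds fraction * max."""
    peak = max(max(row) for row in heatmap)
    cutoff = fraction * peak
    return [(r, c)
            for r, row in enumerate(heatmap)
            for c, value in enumerate(row)
            if value > cutoff]


def regions_at(pixels, atlas, names):
    """Look up the atlas label under each pixel and return unique region names.

    Assumption: label 0 is background and carries no region name.
    """
    found = []
    for r, c in pixels:
        label = atlas[r][c]
        if label != 0 and names[label] not in found:
            found.append(names[label])
    return found


# Toy attention heatmap and a label map of the same shape (hypothetical data).
heatmap = [
    [0.10, 0.20, 0.10],
    [0.30, 0.98, 1.00],
    [0.05, 0.40, 0.96],
]
atlas = [
    [0, 1, 1],
    [0, 1, 2],
    [0, 2, 2],
]
names = {1: "region_A", 2: "region_B"}  # hypothetical region names

pixels = key_pixels(heatmap)
print(pixels)                             # [(1, 1), (1, 2), (2, 2)]
print(regions_at(pixels, atlas, names))   # ['region_A', 'region_B']
```

The lookup only works if the heatmap and the atlas share the same shape and coordinate system, which is why the atlas is reshaped and checked against the input scans.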

