
Artificial Intelligence in Health                                       ViT for neurodegeneration diagnosis




Figure 9. The mean attention maps of brain areas in our regions’ importance study, where the brightness of a pixel translates to its significance in the model’s predictions. This figure results from merging the training, validation, and test datasets and feeding the final dataset of 580 samples to the model. Then, we saved the model’s attention maps for all correctly classified scans. Finally, we computed the mean of saved attention maps to generate a mean attention map for the whole dataset and each class. (A) CN, (B) MCI, (C) AD, (D) Overall.
Abbreviations: AD: Alzheimer’s disease; CN: Cognitively normal; MCI: Mild cognitive impairment.
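The averaging procedure described in the Figure 9 caption can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `mean_attention_maps`, the label encoding (0 = CN, 1 = MCI, 2 = AD), and the use of 2D maps are assumptions made for the example. It also assumes that the "Overall" map is the mean over all correctly classified scans rather than the mean of the three class maps.

```python
import numpy as np

CLASS_NAMES = ("CN", "MCI", "AD")  # assumed encoding: label indices 0, 1, 2


def mean_attention_maps(attn_maps, y_true, y_pred, class_names=CLASS_NAMES):
    """Average attention maps over correctly classified scans.

    attn_maps: array of shape (n_scans, H, W), one 2D attention map per scan.
    y_true, y_pred: integer class labels of shape (n_scans,).
    Returns a dict with one mean map per class plus an "Overall" entry.
    """
    attn_maps = np.asarray(attn_maps, dtype=np.float64)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    correct = y_true == y_pred  # keep only correctly classified scans
    means = {}
    for idx, name in enumerate(class_names):
        mask = correct & (y_true == idx)
        # A class with no correct predictions has no defined mean map.
        means[name] = attn_maps[mask].mean(axis=0) if mask.any() else None
    means["Overall"] = attn_maps[correct].mean(axis=0) if correct.any() else None
    return means


# Toy example: four 2x2 "maps"; the third scan is misclassified and ignored.
maps = np.array([
    [[1.0, 0.0], [0.0, 0.0]],
    [[3.0, 0.0], [0.0, 0.0]],
    [[9.0, 9.0], [9.0, 9.0]],
    [[0.0, 2.0], [0.0, 0.0]],
])
result = mean_attention_maps(maps, y_true=[0, 0, 1, 2], y_pred=[0, 0, 2, 2])
```

In the toy example, the CN map is the average of the first two scans, the AD map equals the fourth scan, and MCI has no map because its only scan was misclassified. Excluding misclassified scans mirrors the caption's statement that only attention maps of correctly classified scans were saved.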

knowledge and training. Therefore, this complexity in disease diagnosis, together with other factors, has left about 75% of NDD patients undiagnosed worldwide, and this number rises to about 90% in low- and middle-income countries, according to some studies.³ In addition to revealing a considerable diagnosis gap, these investigations predict that the fast-growing number of NDD patients could strain healthcare systems in the future.³ Consequently, novel and affordable tools and techniques are required for the final diagnosis of NDD and/or for assisting healthcare providers in this task.

As the rapid progress of AI and its sub-fields has revolutionized our lives, researchers have attempted to harness the power of these new technologies in the healthcare domain, including NDD diagnosis. Although most of these research projects utilized long-established approaches and architectures like CNNs, the emergence of ViTs and their groundbreaking performance convinced us to explore the potential of employing this new architecture in NDD classification.

In this work, we developed a model to classify ¹⁸F-FDG PET brain scans into CN, MCI, and AD. Specifically, we designed the model based on the vanilla ViT-Base, introduced by Dosovitskiy et al.,⁶ and trained it on the ADNI dataset.⁷ Combining the proposed data, pre-processing procedure, training recipe, and transfer learning enabled our model to achieve an F1 score of 81% (macro-average of all classes), significantly outperforming previous approaches. To put the model’s performance in context, we also compare our results with those presented in other papers, as listed in Table 4. However, a few points are worth mentioning for a fair comparison:
•   While several papers examined NDD classification using DNNs, we exclusively selected studies that aimed for ternary classification (CN/MCI/AD) using ¹⁸F-FDG PET scans.
•   Although all chosen studies employed ADNI as their primary dataset, the authors may have used different subsets to train and test the models.
•   As shown in Table 4, DNNs can surpass physicians when NDD diagnosis is solely based on brain scans. However, domain experts usually consider a comprehensive collection of information for the clinical diagnosis, including the patient’s medical history, genetics, blood tests, and cognitive and physical evaluations. Consequently, instead of solely relying on brain scans, doctors and nuclear medicine physicians consider various factors, a practice that gives them a substantial advantage over AI models.

According to Table 4, in addition to achieving a significantly higher F1 score, our model outperforms human experts and other models in distinguishing MCI cases, which proved to be the most challenging condition to classify.

Apart from improving the model’s performance, developing an explainable model was a pivotal goal of this research. Therefore, we integrated the AAL3 brain atlas information into the attention maps. This method resulted in a model that provides the brain region with the highest


            Volume 2 Issue 4 (2025)                         42                          doi: 10.36922/AIH025140026