
Artificial Intelligence in Health                                       ViT for Glioma Classification in MRI



MRI is a noninvasive medical imaging technique used for diagnosing brain tumors, injuries, and bleeding.⁴ Over 120 types of brain tumors have been identified using MRI, including both primary and secondary tumors. Glioma is a type of brain tumor that originates from glial cells and can be visualized through MRI or CT. Hence, automated image analysis techniques can positively contribute to the diagnosis of glioma.

Automated image analysis has rapidly evolved in the past few years owing to the introduction of AI and CV into traditional image processing techniques.² DL has been used in medical imaging to recognize cells of various sizes and shapes, locate organs and body parts, and automatically identify local anatomical features.³⁻⁵ Owing to the intrinsic locality of convolution operations, popular architectures such as convolutional neural networks (CNNs) have shown limitations in directly modeling long-range relations. Therefore, CNNs with attention mechanisms that help AI models focus on specific pixels, regions, or features have gained research attention in image analysis.

The transformer architecture, a DL method with a self-attention mechanism, has become vital in natural language processing (NLP) tasks.⁶ Recently, it has considerably impacted text classification, machine translation, and query responding. However, its application in CV problems requires further research. In CV, attention can either work in tandem with CNNs or replace some of their components while maintaining the overall network structure. Thus, this architecture has considerable potential to provide promising results in object detection, video classification, image classification, and image generation.

1.1. Contribution

This study focuses on evaluating the transformer architecture for MRI image classification when applied directly to sequences of image patches. It concentrates on the classification of MRI images based on the presence or absence of glioma while overcoming the persistent class imbalance within a dataset to obtain feasible and resource-optimized solutions. We focus on glioma as it is a malignant (cancerous) brain tumor, which is treatable with a favorable prognosis if detected early. The classification of brain tumors before segmentation is beneficial for saving time and resources, improving accuracy, and providing valuable information for treatment planning. Moreover, organized data aid in analysis and model training. However, medical data are mostly biased toward the absence of disease (negative outcome) and require careful implementation of algorithms to avoid model overfitting. This study aims to present a comprehensive ground-up mechanism for glioma classification using vision transformers (ViTs).

The primary contributions of this study are as follows:
(a)  A ViT was proposed as an alternative to CNN for glioma classification using MRI.
(b)  A pretraining method was proposed for ViTs when working with small datasets.
(c)  An effective intensity uniformization approach for MRI images was introduced as a preprocessing technique.
(d)  The performance of CNN models and ViTs was compared for classifying two grades of glioma as well as for classifying tumorous and nontumorous MRI images.
(e)  The effects of class imbalance in the medical dataset were discussed.

2. Background literature review

With the advancement in AI technologies, computer-aided diagnosis has been extensively studied in the medical sciences for various diseases. In particular, noninvasive image-based diagnosis has garnered the attention of researchers and medical practitioners owing to its high accuracy, high precision, and auxiliary capabilities in applications such as brain tumor classification and segmentation using AI and DL models. Moreover, this field has gained popularity among medical image analysis researchers owing to well-established open challenges such as the BraTS challenge and publicly available large MRI datasets.⁷⁻⁹ For instance, in their brain tumor classification and segmentation study, Kaldera et al.⁵ proposed a simple CNN-based classifier for classifying glioma, meningioma, and the absence of a tumor using MRI. One of the main bottlenecks faced when using DL architectures in the medical domain is data scarcity. This bottleneck was addressed using general data augmentation techniques such as flipping, rotation, and translation. Alsaif et al.⁸ presented an improved ResNet50 architecture, which incorporated data augmentation techniques for effective brain tumor classification.

Because of the intrinsic locality of convolution operations, CNN-based approaches are generally inadequate for directly modeling long-range relations. Therefore, CNN-based architectures exhibit weak performance, particularly for target structures with varying textures, shapes, and sizes across patients. In previous studies, self-attention mechanisms with CNN features were used to overcome these limitations.⁹ Transformers, intended for sequence-to-sequence prediction, have emerged as ideal candidates to replace CNNs. They were first proposed for machine translation by Vaswani et al.⁶ and were subsequently established as the state-of-the-art method for many NLP tasks. They have the capacity to substitute attention mechanisms for convolution.⁹⁻¹³
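
The contrast drawn above between the locality of convolution and the global reach of self-attention can be illustrated with a minimal sketch. It is not the model used in this study: the function names `image_to_patches` and `self_attention`, the toy 4 x 4 image, and the patch size of 2 are hypothetical, and the query, key, and value projections are assumed to be identity maps (a real ViT learns these projections and also adds positional embeddings and a class token). The sketch only shows that, once an image is treated as a sequence of patches, every patch receives a nonzero attention weight from every other patch regardless of spatial distance.

```python
import math

def image_to_patches(image, patch):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: identity Q/K/V projections (no learned weights),
    so each token serves as its own query, key, and value.
    """
    d = len(tokens[0])
    weights = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights.append(softmax(scores))
    # Each output token is a weighted average of ALL input tokens.
    outputs = [[sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d)]
               for row in weights]
    return outputs, weights

# Hypothetical toy 4x4 "image" split into four 2x2 patches.
image = [[0.0, 0.1, 0.9, 1.0],
         [0.1, 0.0, 1.0, 0.9],
         [0.5, 0.5, 0.2, 0.2],
         [0.5, 0.5, 0.2, 0.2]]
patches = image_to_patches(image, 2)
outputs, weights = self_attention(patches)
```

Each row of `weights` is a probability distribution over all patches, which is the sense in which attention models long-range relations directly, whereas a single convolution only mixes values inside its kernel window.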


            Volume 2 Issue 1 (2025)                         69                               doi: 10.36922/aih.4155