Artificial Intelligence in Health ViT for Glioma Classification in MRI
MRI is a noninvasive medical imaging technique used for diagnosing brain tumors, injuries, and bleeding.⁴ Over 120 types of brain tumors have been identified using MRI, including both primary and secondary tumors. Glioma is a type of brain tumor that originates from glial cells and can be visualized through MRI or CT. Hence, automated image analysis techniques can aid the diagnosis of glioma.

Automated image analysis has rapidly evolved in the past few years owing to the introduction of AI and CV into traditional image processing techniques.² DL has been used in medical imaging to recognize cells of various sizes and shapes, locate organs and body parts, and automatically identify local anatomical features.³⁻⁵ Owing to the intrinsic locality of convolution operations, popular architectures such as convolutional neural networks (CNNs) have shown limitations in explicitly modeling long-range relations. Therefore, CNNs with attention mechanisms that help AI models focus on specific pixels, regions, or features have gained research attention in image analysis.

The transformer architecture, a DL method built on a self-attention mechanism, has become vital in natural language processing (NLP) tasks.⁶ Recently, it has considerably impacted text classification, machine translation, and question answering. However, its application in CV problems requires further research. In CV, attention can either work in tandem with CNNs or replace some of their components while maintaining the overall network structure. Thus, this architecture has considerable potential to provide promising results in object detection, video classification, image classification, and image generation.

1.1. Contribution

This study mainly focuses on evaluating the transformer architecture for MRI image classification when applied directly to sequences of image patches. It concentrates on the classification of MRI images based on the presence or absence of glioma while overcoming the persistent class imbalance within a dataset to obtain feasible and resource-optimized solutions. We focus on glioma as it is a malignant (cancerous) brain tumor that is treatable with a favorable prognosis if detected early. The classification of brain tumors before segmentation is beneficial for saving time and resources, improving accuracy, and providing valuable information for treatment planning. Moreover, organized data aid in analysis and model training. However, medical data are mostly biased toward the absence of disease (negative outcome) and require careful implementation of algorithms to avoid model overfitting. This study aims to present a comprehensive ground-up mechanism for glioma classification using vision transformers (ViTs).

The primary contributions of this study are as follows:
(a) A ViT was proposed as an alternative to CNNs for glioma classification using MRI.
(b) A pretraining method was proposed for ViTs when working with small datasets.
(c) An effective intensity uniformization approach for MRI images was introduced as a preprocessing technique.
(d) The performance of CNN models and ViTs was compared for classifying two grades of glioma as well as for classifying tumorous and nontumorous MRI images.
(e) The effects of class imbalance in the medical dataset were discussed.

2. Background literature review

With the advancement in AI technologies, computer-aided diagnoses have been extensively studied in medical sciences for different disease diagnoses. In particular, noninvasive image-based diagnosis has garnered the attention of researchers and medical practitioners owing to its high accuracy, high precision, and auxiliary capabilities in applications such as brain tumor classification and segmentation using AI and DL models. Moreover, this field has gained popularity among medical image analysis researchers owing to well-established open challenges such as the BraTS challenge and publicly available large MRI datasets.⁷⁻⁹ For instance, in their brain tumor classification and segmentation study, Kaldera et al.⁵ proposed a simple CNN-based classifier for classifying glioma, meningioma, and the absence of a tumor using MRI. One of the main bottlenecks faced when using DL architectures in the medical domain is data scarcity. This bottleneck was addressed using general data augmentation techniques such as flipping, rotation, and translation. Alsaif et al.⁸ presented an improved ResNet50 architecture, which incorporated data augmentation techniques for effective brain tumor classification.

Because of the intrinsic locality of convolution operations, CNN-based approaches are generally inadequate for explicitly modeling long-range relations. Therefore, CNN-based architectures exhibit weak performance, particularly for target structures with varying textures, shapes, and sizes across patients. In previous studies,⁹ self-attention mechanisms with CNN features were used to overcome these limitations. Transformers, intended for sequence-to-sequence prediction, have emerged as ideal candidates to replace CNNs. They were first proposed for machine translation by Vaswani et al.⁶ and were then established as the state-of-the-art method for many NLP tasks. They have the capacity to substitute attention mechanisms for convolution.⁹⁻¹³
Volume 2 Issue 1 (2025) 69 doi: 10.36922/aih.4155

