Artificial Intelligence in Health Segmentation and classification of DR using CNN
3662 training, 1928 validation, and 13,000 testing images as organized by the Kaggle competition organizers. All datasets exhibit similar class distributions, as illustrated in Figure 1 for APTOS 2019. We maintained the original distribution of the datasets without any modifications, such as undersampling or oversampling. The smallest native size among all datasets is 640 × 480. A sample image from APTOS 2019 is presented in Figure 3 [23].

Figure 2. Class distribution in the APTOS 2019 dataset. Image generated using VS Code
Abbreviation: DR: Diabetic retinopathy.

Figure 3. Sample of fundus photograph from the dataset. Image is a screenshot from VS Code

1.2.4. Large language models (LLMs)

In the dataset section, test/treatment recommendations are generated by pre-trained LLMs from a comprehensive range of inputs derived from segmented images. These inputs encompass binary indicators for various lesions, including blood vessel segmentation, HE segmentation, EX segmentation, MA segmentation, optic disc segmentation, and SE segmentation. Each lesion is represented as either true (present) or false (absent) in the binary inputs. In addition, string inputs are generated from a classification or image grading model and give the DR stage, classified as classes 0 through 4. The combination of binary and string inputs forms the dataset that is processed by ChatGPT, a pre-trained LLM. ChatGPT interprets and synthesizes this information to generate test/treatment recommendations, contributing to a decision-support system that factors in both the detailed visual segmentation features and the clinical classifications of DR severity [24].

1.3. Research gap

1.3.1. Treatment recommendations

While significant strides have been made in early DR detection, the existing research reveals a distinct gap between traditional methodologies and emerging approaches, particularly those that integrate pre-trained LLMs with segmented image inputs to generate test/treatment recommendations. Classical methods, as evidenced by Priya and Aruna [6], have predominantly employed computer vision and machine learning techniques for DR stage detection using color fundus images. Similarly, deep learning, particularly CNNs, has demonstrated promising results in intricate feature identification for DR classification tasks. Noteworthy work by Pratt et al. [8] has showcased the effectiveness of CNN architectures, achieving high sensitivity and accuracy in diagnosing retinal abnormalities [18].

However, the existing literature primarily emphasizes isolated aspects such as lesion segmentation or DR classification, with limited exploration of the synergies between visual segmentation features and clinical classifications within a decision-support system. The literature reviewed often overlooks the intricacies that arise from combining binary indicators for various lesions with string inputs representing DR stages. The research gap lies in the absence of comprehensive investigations into the challenges and opportunities associated with the proposed methodology’s integration of diverse data inputs. While previous studies have contributed valuable insights and benchmarking using classical methods and deep learning architectures, there is a need for focused research that bridges the gap between visual segmentation and clinical classifications to refine the efficacy of decision-support systems in DR management. Exploring this gap will contribute to advancing the field by providing a holistic
Volume 1 Issue 4 (2024) 33 doi:10.36922/aih.2783
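
The input format described in Section 1.2.4 (one true/false indicator per segmentation output plus a DR grade string from 0 through 4) could be assembled into a text prompt roughly as follows. This is only a minimal sketch: the class `LesionFlags`, the function `build_prompt`, the field names, and the prompt wording are all hypothetical and are not taken from the paper, and the actual call to the LLM is deliberately left out.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class LesionFlags:
    """Binary segmentation outputs (True = present, False = absent).

    Field names are hypothetical; they mirror the abbreviations in the text.
    """
    blood_vessel: bool
    he: bool
    ex: bool
    ma: bool
    optic_disc: bool
    se: bool


# DR stage produced by the grading model, passed as a string (classes 0 through 4).
VALID_GRADES = {"0", "1", "2", "3", "4"}


def build_prompt(flags: LesionFlags, dr_grade: str) -> str:
    """Combine binary lesion flags and the DR grade string into one prompt.

    The prompt wording is an assumption made for illustration only.
    """
    if dr_grade not in VALID_GRADES:
        raise ValueError(f"DR grade must be one of 0-4, got {dr_grade!r}")

    # One line per segmentation output, in the order the fields are declared.
    flag_lines = [
        f"- {f.name}: {'true' if getattr(flags, f.name) else 'false'}"
        for f in fields(flags)
    ]
    return "\n".join(
        ["Segmentation findings (true = present, false = absent):"]
        + flag_lines
        + [
            f"DR grade (class 0-4): {dr_grade}",
            "Suggest appropriate tests or treatment for this patient.",
        ]
    )
```

The resulting string would then be sent to the LLM as the user message. Validating the grade before building the prompt keeps malformed classifier output from reaching the model.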

