Artificial Intelligence in Health AI vs humans in clinical code conversion
1. Introduction

The volume of data generated annually by hospitals and health services far exceeds the analytical capacity of humans.1 Murphy1 estimated that hospitals produce approximately 50 petabytes (equivalent to 50,000,000 gigabytes) of data each year, 97% of which remains unanalyzed or unused. Electronic health records contain a wide range of information, including patient demographics, images, clinical notes, and pathology results. These records offer significant potential for retrospective analysis to support data-driven decision-making and more accurate predictions of service utilization.1,2 However, increasingly financially constrained and resource-limited healthcare systems lack the capacity to manually process such large datasets, limiting opportunities to improve healthcare system efficiency.1,3

Generative artificial intelligence (GenAI) refers to a type of artificial intelligence algorithm that enables the creation of new content (such as text, images, video, or audio files) based on a set of training data.4,5 GenAI has a wide range of applications, including creating illustrations, writing code, and processing datasets.4-7 Additionally, GenAI has the potential to support the analysis of large-scale datasets within healthcare settings.5,8 Healthcare has traditionally required significant human labor and expertise, and as such, it has often resisted large-scale efforts for effective automation, particularly in the form of clinical and administrative decision-making.9-12 A recent literature review by Li et al.13 has identified some of the key areas in which GenAI is starting to make an impact within healthcare, including generating discharge summaries,14 determining appropriate screening procedures for a patient,15 answering clinical questions, and providing medical education.16-19

The increasing complexity of global healthcare challenges necessitates new data analysis approaches that can expeditiously and efficiently leverage the vast datasets available to healthcare systems. Recent advancements in automation tools, such as GenAI, provide new opportunities to efficiently complete large-scale healthcare data analytics.20 The widespread implementation of GenAI represents one of the most rapid technological advancements in recent years.21 OpenAI’s ChatGPT is currently one of the most widely used GenAI tools, with over 100 million online users per week.22 ChatGPT allows users to input prompts, commands, or questions and generates corresponding responses. Its interface is driven by a large language model, a form of natural language processing capable of learning and refining its conversational abilities through both self- and semi-structured training.23 Data processing is carried out using large-scale neural networks, incorporating feed-forward and convolutional architectures.23

Following the widespread success of ChatGPT, competitors have launched other GenAI tools available to the general public, including Google Gemini,24 Microsoft Copilot,25 and Claude.26 The accuracy and completeness of outputs are limited by the data available to the GenAI model (i.e., what it has been trained on, its access to real-time search capabilities), which may be biased or inaccurate. GenAI tools also have limited knowledge of more specialized topics, resulting in a tendency to “hallucinate,” a phenomenon in which a GenAI tool generates information to fill knowledge gaps, thereby decreasing the accuracy of outputs.27 Healthcare professionals require an up-to-date understanding of the current and evolving limitations of GenAI in order to optimally select tasks at which it is likely to excel and to prompt it appropriately.

A key challenge in analyzing large-scale healthcare data is ensuring the consistency of data recording across different health services. Standardized diagnostic coding systems help maintain clinical data uniformity by providing a universal language through which diagnoses can be coded and interpreted consistently across healthcare settings. The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT)28 is a diagnostic coding system utilized by 48 countries (as of August 2024)29 to capture detailed clinical information on procedures, diseases, and clinical findings. SNOMED CT presents diagnoses using both a numeric code (e.g., “230690007”) and a corresponding descriptor (e.g., “Stroke”). It employs a polyhierarchical structure, in which any given code may belong to one or more “parent” categories (e.g., “asthma” may be categorized under both “respiratory diseases” and “allergic conditions”). While SNOMED CT provides a comprehensive framework for patient-level diagnostic coding (encompassing symptoms, procedures, and clinical observations), the system’s complexity can pose challenges for users with limited training.

The International Statistical Classification of Diseases and Related Health Problems (ICD)30 is currently the global standard for coding diagnostic information. ICD focuses on the classification of diseases, disorders, and causes of death using alphanumeric codes. These codes are determined using a hierarchical system, in which codes are categorized by chapters (e.g., F: mental and behavioral disorders) and then further subdivided as more detail is provided (e.g., “F30–F39: mood [affective] disorders,” “F30.9: manic episode, unspecified”). Although the ICD provides less detail than SNOMED CT, its broader categories facilitate population health analytics and provide a standard for international health system comparison.
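The structural difference between the two coding systems can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not an implementation of either standard. The function names `icd_ancestors` and `all_ancestors`, the dictionary `EXAMPLE_PARENTS`, and the root label "disease" are hypothetical. The concept labels are the descriptors used as examples in the text rather than real SNOMED CT identifiers. Treating the first letter of an ICD code as its chapter follows the "F" example in the text and is a simplification, since chapters in the published classification do not always correspond to a single letter.

```python
from typing import Dict, List, Set


def icd_ancestors(code: str) -> List[str]:
    """Return the broader categories implied by an ICD-style code.

    Simplifying assumption: the hierarchy can be read from the code text,
    e.g. "F30.9" -> category "F30" -> chapter letter "F".
    """
    code = code.strip().upper()
    ancestors: List[str] = []
    if "." in code:
        # Subcategory such as F30.9 sits under the three-character category F30.
        ancestors.append(code.split(".", 1)[0])
    if len(code) > 1:
        # Chapter letter, as in the "F" example from the text.
        ancestors.append(code[0])
    return ancestors


# Hypothetical parent map for a polyhierarchy: one concept, several parents.
# Labels are illustrative descriptors, not SNOMED CT identifiers.
EXAMPLE_PARENTS: Dict[str, Set[str]] = {
    "asthma": {"respiratory diseases", "allergic conditions"},
    "respiratory diseases": {"disease"},  # "disease" root is an assumption
    "allergic conditions": {"disease"},
}


def all_ancestors(concept: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every ancestor of a concept by walking all parent links."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(icd_ancestors("F30.9"))
    print(sorted(all_ancestors("asthma", EXAMPLE_PARENTS)))
```

In the single-parent case, each code has exactly one path upward, so a simple ordered list is enough. In the polyhierarchical case, a concept can be reached through more than one branch, so a graph traversal that tracks visited nodes is needed.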
Volume 2 Issue 4 (2025) 93 doi: 10.36922/AIH025200045

