Artificial Intelligence in Health AI vs humans in clinical code conversion
The ability to convert between diagnostic coding systems has practical applications, particularly within research contexts. For instance, extracting a subset of SNOMED CT codes related to a specific diagnostic grouping (e.g., mental health) is challenging, as there are no broader categories for each condition, unlike ICD codes. This presents challenges when working with large SNOMED CT datasets while attempting to analyze only a subset. Converting diagnostic codes can be a time-consuming task, particularly when this process relies heavily on manual data input and extraction.

To the authors’ knowledge, it remains unexplored whether GenAI can assist in the conversion of clinical data from one diagnostic coding language to another, such as from SNOMED CT to ICD. Such conversions require specialized knowledge of clinical coding and are labor-intensive to complete manually. Performing diagnostic code conversion tasks using AI models may enable less qualified staff to complete the work in less time, thereby reducing the cost of data processing.

Therefore, this study aims to examine whether publicly accessible GenAI tools – namely ChatGPT-4o and Claude 3.5 Sonnet – can accurately convert clinical diagnostic codes from SNOMED CT to the 10th revision of the ICD (ICD-10). This study also seeks to address the following sub-objectives:
(i) Compare the level of agreement between ChatGPT-4o and a human rater
(ii) Compare the level of agreement between Claude 3.5 Sonnet and a human rater
(iii) Compare the level of agreement between ChatGPT-4o and Claude 3.5 Sonnet
(iv) Examine the economic benefit, in terms of time and labor cost, of using GenAI to complete this task compared to a human rater.

2. Materials and methods

The SNOMED CT codes used in this study originate from a broader emergency department (ED) dataset, obtained as part of a study investigating mental health presentations to hospital EDs²¹ (ethics approval: HREC/2023/QGC/95219). This dataset consists of 19,764 unique SNOMED-CT-AU (Australian Extension) numeric codes (e.g., 48694002) and SNOMED-CT-AU names (e.g., “Anxiety reaction”) representing the diagnoses made to the ED over a 3-year period (August 2020 to August 2023). The current evaluation utilizes a randomly selected 10% subset of this data (n = 1,976) (Table S1).

To convert the SNOMED CT-AU codes³¹ to ICD-10 Clinical Modification (ICD-10-CM),³² a three-phase approach was employed. First, codes were manually converted by human raters. Second, the codes were converted using ChatGPT-4o (https://chatgpt.com/). Third, the same set of codes was converted using Claude 3.5 Sonnet (https://claude.ai/). Both GenAI tools required paid subscriptions at the time of analysis.

The methodology and results of this study were reported in accordance with the METRICS reporting checklist, which outlines standardized reporting metrics – such as model, evaluation, timing, transparency, range of tested topics, randomization, individual factors, query count, and prompt specificity – for GenAI-based studies in healthcare.³³ The completed reporting checklist is listed in Table S2.

2.1. Phase 1: Manual conversion of SNOMED-CT-AU codes

The SNOMED CT-AU codes were manually converted by a team of three raters (AG = 800 codes; AJ = 644 codes; CH = 532 codes). Conversions were performed using the Interactive Map-Assisted Generation of ICD Codes (I-MAGIC) algorithm (https://imagic.nlm.nih.gov/imagic/code/map),³⁴ an online tool that provides a mapping between the two diagnostic coding systems. Codes were entered into the tool in the format “SNOMED CT-AU name (SNOMED CT-AU code)” (e.g., “Anxiety reaction [48694002]”), and the corresponding ICD-10-CM code was extracted.

In this study, the I-MAGIC tool was employed as the reference standard against which all other conversion methods were compared. However, some SNOMED CT codes could not be located within the I-MAGIC database. As the dataset utilized the Australian extension of SNOMED CT, while the mapping tool used the standard SNOMED CT list, it is likely that the missing codes were region-specific.³⁵ In such cases, the absence of an equivalent was noted.

2.2. Phase 2: Conversion of SNOMED-CT-AU codes using ChatGPT-4o

ChatGPT-4o was used to automatically convert the SNOMED CT-AU codes and names into ICD-10-CM codes (completed in August 2024). A Microsoft Excel file containing the SNOMED CT-AU codes and names was uploaded to ChatGPT-4o. The prompt used for the conversion was refined through an iterative process to improve efficiency and reduce the risk of “hallucinations” (i.e., providing false information) and data processing errors.

It was necessary to state that ChatGPT-4o could take as much time as required to complete this task, otherwise the message would time out and cease to produce output. Additionally, a limit was observed regarding the number
Volume 2 Issue 4 (2025) 94 doi: 10.36922/AIH025200045
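The subset selection in Section 2 and the I-MAGIC input format in Section 2.1 can be illustrated with a minimal Python sketch. It is not the authors' procedure: the text does not state how the random 10% subset was drawn or how codes were assigned to raters beyond the reported counts. The function names, the fixed seed, and the contiguous-slice allocation are hypothetical, and the query string follows the worked example given in the text (“Anxiety reaction [48694002]”).

```python
import random


def sample_subset(codes, fraction=0.10, seed=0):
    """Draw a random subset without replacement (seed is an assumption)."""
    k = int(len(codes) * fraction)  # rounds down: 10% of 19,764 gives 1,976
    rng = random.Random(seed)
    return rng.sample(codes, k)


def imagic_query(name, code):
    """Format one entry as in the worked example: 'name [code]'."""
    return f"{name} [{code}]"


def allocate(items, workloads):
    """Split items among raters by the reported counts.

    Contiguous slices are a hypothetical choice; the source only
    reports how many codes each rater converted.
    """
    if sum(workloads.values()) != len(items):
        raise ValueError("workloads must sum to the number of items")
    allocation, start = {}, 0
    for rater, count in workloads.items():
        allocation[rater] = items[start:start + count]
        start += count
    return allocation


if __name__ == "__main__":
    all_codes = list(range(19764))  # placeholder IDs, not real SNOMED codes
    subset = sample_subset(all_codes)
    shares = allocate(subset, {"AG": 800, "AJ": 644, "CH": 532})
    print(len(subset), {rater: len(v) for rater, v in shares.items()})
    print(imagic_query("Anxiety reaction", 48694002))
```

The three reported rater workloads (800 + 644 + 532) sum to 1,976, which matches the size of the 10% subset.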

