Artificial Intelligence in Health
Schema-less text2sql conversion with LLMs
models against our fine-tuned Flan-T5 models, which were explicitly trained on the MIMICSQL dataset. Comparison of our fine-tuned Flan-T5 models with the baseline TREQS, as well as other LLMs such as Defog-SQLCoder, LLaMA-2-7B, GPT-3.5, and GPT-4, highlights the significant impact of compact LLMs and illustrates the effectiveness of schema-less querying. A key finding is the ability of smaller models, such as Flan-T5, to generate SQL queries directly from natural language questions without relying on explicit database schema information. This approach challenges the traditional schema-dependent paradigm. Furthermore, as illustrated in Table 6, our study emphasizes the importance of model competence. The Flan-T5 models demonstrated proficiency in interpreting complex medical terminology, even when certain terms were absent from the model's vocabulary.

In the first example, the natural language query is accurately transformed into SQL, selecting the count of distinct patient IDs from the DEMOGRAPHIC table where the corresponding HADM_ID in the DIAGNOSES table matches the specified ICD9 code "28803". This showcases Flan-T5's ability to understand and incorporate complex medical terminology, such as ICD9 codes, into SQL queries effectively. In the second example, the model accurately generates an SQL query that selects the count of distinct patient IDs from the DEMOGRAPHIC table based on the presence of the specific drug code "NEOLEVO5" in the PRESCRIPTIONS table. Despite the unfamiliar term "neolevo5," Flan-T5 effectively incorporates it into the SQL query, showcasing its ability to comprehend and handle diverse medical and pharmaceutical terminology. This capability is vital in healthcare, where accuracy and domain-specific language are paramount. Our findings encourage further exploration into strategies that reduce model and data complexity, focusing on task-oriented models suitable for commodity hardware.

Table 5. Comparison with current state-of-the-art models in the test set
Model                  Parameter size   LFA
TREQS                  2.8 M            0.48
TREQS+Recover          2.8 M            0.55
GPT-3.5-Turbo          20 B             0.60
LLaMA-2-7B             7 B              0.60
Defog-SQLCoder         15 B             0.65
GPT-4                  -                0.70
Flan-T5 Base (Ours)    220 M            0.56
Flan-T5 Large (Ours)   770 M            0.85

Notes: Flan-T5 Large outperforms all other models. The parameter size of GPT-4 has not been disclosed by OpenAI. Boldface indicates the model with the highest LFA score.
Abbreviations: M: Million; B: Billion.

Table 6. Examples demonstrating proficiency in interpreting complex medical terminology using our fine-tuned Flan-T5 Large model

Input question: How many patients have diagnoses icd9 code 28803?
SQL query generated by Flan-T5 Large model:
SELECT COUNT (DISTINCT DEMOGRAPHIC."SUBJECT_ID")
FROM DEMOGRAPHIC
INNER JOIN DIAGNOSES ON DEMOGRAPHIC.HADM_ID = DIAGNOSES.HADM_ID
WHERE DIAGNOSES."ICD9_CODE" = '28803';

Input question: Which patients have a neolevo5 drug code?
SQL query generated by Flan-T5 Large model:
SELECT COUNT (DISTINCT DEMOGRAPHIC."SUBJECT_ID")
FROM DEMOGRAPHIC
INNER JOIN PRESCRIPTIONS ON DEMOGRAPHIC.HADM_ID = PRESCRIPTIONS.HADM_ID
WHERE PRESCRIPTIONS."FORMULARY_DRUG_CD" = 'NEOLEVO5';

5. Conclusion

This study investigates the efficacy of smaller, task-specific language models compared to more complex LLMs in the Text-to-SQL task, with a focus on the healthcare domain using the MIMICSQL dataset. Our findings reveal the remarkable performance of the fine-tuned Flan-T5 models, particularly Flan-T5 Large, which achieved an LFA score of 0.85. This score surpasses the current state-of-the-art model, Defog-SQLCoder, as well as other advanced LLMs such as LLaMA-2-7B, GPT-3.5-Turbo, and GPT-4. Our approach, advocating for the removal of schema definitions from input prompts and training separate models for distinct schemas, has proven effective, requiring fewer hardware resources and less data for training. These findings underscore the potential of tailored compact language models for domain-specific applications, opening avenues for more efficient and effective natural language understanding in specialized contexts.

Acknowledgments

None.

Funding

None.

Conflict of interest

The authors declare no competing interests.

Author contributions

Conceptualization: Youssef Mellah, Veysel Kocaman
Investigation: All authors
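As a sanity check on the two queries generated in Table 6, they can be executed verbatim against a minimal in-memory mock-up of the corresponding MIMICSQL tables. The sketch below uses Python's sqlite3 with fabricated sample rows (not MIMIC-III data); only the table and column names follow those referenced in Table 6.

```python
import sqlite3

# Toy MIMIC-like schema: column names mirror the tables used in Table 6,
# but all rows are fabricated for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute('CREATE TABLE DEMOGRAPHIC ("SUBJECT_ID" TEXT, "HADM_ID" TEXT)')
cur.execute('CREATE TABLE DIAGNOSES ("HADM_ID" TEXT, "ICD9_CODE" TEXT)')
cur.execute('CREATE TABLE PRESCRIPTIONS ("HADM_ID" TEXT, "FORMULARY_DRUG_CD" TEXT)')
cur.executemany('INSERT INTO DEMOGRAPHIC VALUES (?, ?)',
                [("p1", "a1"), ("p2", "a2"), ("p3", "a3")])
cur.executemany('INSERT INTO DIAGNOSES VALUES (?, ?)',
                [("a1", "28803"), ("a2", "28803"), ("a3", "4019")])
cur.executemany('INSERT INTO PRESCRIPTIONS VALUES (?, ?)',
                [("a1", "NEOLEVO5"), ("a3", "ASA325")])

# First query from Table 6: distinct patients with ICD9 code 28803.
q1 = '''SELECT COUNT (DISTINCT DEMOGRAPHIC."SUBJECT_ID")
        FROM DEMOGRAPHIC
        INNER JOIN DIAGNOSES ON DEMOGRAPHIC.HADM_ID = DIAGNOSES.HADM_ID
        WHERE DIAGNOSES."ICD9_CODE" = '28803' '''
# Second query from Table 6: distinct patients with drug code NEOLEVO5.
q2 = '''SELECT COUNT (DISTINCT DEMOGRAPHIC."SUBJECT_ID")
        FROM DEMOGRAPHIC
        INNER JOIN PRESCRIPTIONS ON DEMOGRAPHIC.HADM_ID = PRESCRIPTIONS.HADM_ID
        WHERE PRESCRIPTIONS."FORMULARY_DRUG_CD" = 'NEOLEVO5' '''
print(cur.execute(q1).fetchone()[0])  # 2 sample patients match the ICD9 code
print(cur.execute(q2).fetchone()[0])  # 1 sample patient matches the drug code
```

Both statements parse and execute unmodified, which illustrates that the model's output is syntactically valid SQL with correct join and filter structure, independent of any schema information in the prompt.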
Volume 1 Issue 2 (2024) 104 doi: 10.36922/aih.2661

