            Table 4. Details of the tuning scenario
            Parameter                        Value                 Description
            output_dir                       /logs_long            Directory where the trained model and logs will be saved
            per_device_train_batch_size      1                     Number of training samples per device (GPU) in each batch
            per_device_eval_batch_size       1                     Number of evaluation samples per device (GPU) in each batch
            predict_with_generate            True                  Whether to use generation during evaluation
            fp16                             False                 Whether to use mixed precision training with FP16
            learning_rate                    5e-5                  Learning rate for training
            num_train_epochs                 5                     Number of training epochs
            logging_dir                      ./logs_long           Directory for logging training metrics and logs
            logging_strategy                 steps                 Strategy for logging training metrics (steps or epoch)
            logging_steps                    500                   Interval for logging training metrics
            evaluation_strategy              epoch                 Strategy for evaluation during training (steps or epoch)
            save_strategy                    epoch                 Strategy for saving checkpoints during training (steps or epoch)
            save_total_limit                 2                     Maximum number of checkpoints to keep
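
            The parameter names in Table 4 appear to correspond to the argument names of Hugging Face's Seq2SeqTrainingArguments, so a minimal sketch of how this configuration could be expressed is shown below; the mapping to this API is our assumption, and model, tokenizer, and dataset preparation are omitted.

from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the values listed in Table 4 (assumed to map onto
# Hugging Face's Seq2SeqTrainingArguments); trainer and model setup are omitted.
training_args = Seq2SeqTrainingArguments(
    output_dir="/logs_long",             # directory for the trained model and logs
    per_device_train_batch_size=1,       # training samples per GPU per batch
    per_device_eval_batch_size=1,        # evaluation samples per GPU per batch
    predict_with_generate=True,          # generate sequences during evaluation
    fp16=False,                          # FP16 mixed-precision training disabled
    learning_rate=5e-5,                  # learning rate for training
    num_train_epochs=5,                  # number of training epochs
    logging_dir="./logs_long",           # directory for training metrics and logs
    logging_strategy="steps",            # log by steps rather than by epoch
    logging_steps=500,                   # log metrics every 500 steps
    evaluation_strategy="epoch",         # evaluate at the end of each epoch
    save_strategy="epoch",               # save a checkpoint at the end of each epoch
    save_total_limit=2,                  # keep at most two checkpoints
)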

            Figure 3. Comparison of prompt construction for SQL query generation. This figure illustrates the detailed prompts used for LLaMA-2-7B, GPT-3.5-Turbo, GPT-4, and DeFog-SQLCoder, highlighting the inclusion of schema information in each, except for our Flan-T5 models, in compliance with the schema-less approach. Since TREQS is not regarded as an LLM, no prompt was generated for that approach.

            query exactly aligns with the ground-truth as described in formula (III).

                  LFA = NCLF / TNI                                        (III)

            Where NCLF is the number of correct logical forms, and TNI is the total number of instances in the test set. A higher LFA score signifies a model’s enhanced capability in accurately translating natural language questions into their corresponding SQL queries, reflecting a deeper understanding of the logic and relationships inherent in these representations. This makes LFA a particularly pertinent measure for the text-to-SQL task, as it directly gauges the fine-tuned models’ effectiveness in interpreting natural language and generating precise SQL queries.
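
            For illustration, LFA reduces to an exact-match count over the test set; the sketch below is one assumed way to compute it, and the case/whitespace normalization applied before comparison is our addition rather than part of the paper's definition.

def logical_form_accuracy(predicted_sqls, reference_sqls):
    """Compute LFA = NCLF / TNI (formula III).

    NCLF: number of generated queries that exactly match their ground-truth query.
    TNI:  total number of instances in the test set.
    """
    assert len(predicted_sqls) == len(reference_sqls), "one prediction per test instance"

    def normalize(sql):
        # Assumed normalization: compare case-insensitively with collapsed whitespace.
        return " ".join(sql.lower().split())

    nclf = sum(normalize(pred) == normalize(ref)
               for pred, ref in zip(predicted_sqls, reference_sqls))
    return nclf / len(reference_sqls)

# Example: if 2 of 3 generated queries match their references, LFA = 2/3 ≈ 0.67.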

