            Table 4. Details of the tuning scenario
            Parameter                        Value                 Description
            output_dir                       /logs_long            Directory where the trained model and logs will be saved
            per_device_train_batch_size      1                     Number of training samples per device (GPU) in each batch
            per_device_eval_batch_size       1                     Number of evaluation samples per device (GPU) in each batch
            predict_with_generate            True                  Whether to use generation during evaluation
            fp16                             False                 Whether to use mixed precision training with FP16
            learning_rate                    5e-5                  Learning rate for training
            num_train_epochs                 5                     Number of training epochs
            logging_dir                      ./logs_long           Directory for logging training metrics and logs
            logging_strategy                 steps                 Strategy for logging training metrics (steps or epoch)
            logging_steps                    500                   Interval for logging training metrics
            evaluation_strategy              epoch                 Strategy for evaluation during training (steps or epoch)
            save_strategy                    epoch                 Strategy for saving checkpoints during training (steps or epoch)
            save_total_limit                 2                     Maximum number of checkpoints to keep
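
            The parameter names in Table 4 appear to correspond to the argument names of Hugging Face's Seq2SeqTrainingArguments, so a minimal sketch of how this configuration could be expressed is shown below; the mapping to this API is our assumption, and model, tokenizer, and dataset preparation are omitted.

from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the values listed in Table 4 (assumed to map onto
# Hugging Face's Seq2SeqTrainingArguments); trainer and model setup are omitted.
training_args = Seq2SeqTrainingArguments(
    output_dir="/logs_long",             # directory for the trained model and logs
    per_device_train_batch_size=1,       # training samples per GPU per batch
    per_device_eval_batch_size=1,        # evaluation samples per GPU per batch
    predict_with_generate=True,          # generate sequences during evaluation
    fp16=False,                          # FP16 mixed-precision training disabled
    learning_rate=5e-5,                  # learning rate for training
    num_train_epochs=5,                  # number of training epochs
    logging_dir="./logs_long",           # directory for training metrics and logs
    logging_strategy="steps",            # log by steps rather than by epoch
    logging_steps=500,                   # log metrics every 500 steps
    evaluation_strategy="epoch",         # evaluate at the end of each epoch
    save_strategy="epoch",               # save a checkpoint at the end of each epoch
    save_total_limit=2,                  # keep at most two checkpoints
)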

            Figure 3. Comparison of prompt construction for SQL query generation. This figure illustrates the detailed prompts used for LLaMA-2-7B, GPT-3.5-Turbo, GPT-4, and DeFog-SQLCoder, highlighting the inclusion of schema information in each, except for our Flan-T5 models, in compliance with the schema-less approach. Since TREQS is not regarded as an LLM, no prompt was generated for that approach.

            query exactly aligns with the ground-truth as described in formula (III).

                  LFA = NCLF / TNI                                        (III)

            Where NCLF is the number of correct logical forms, and TNI is the total number of instances in the test set. A higher LFA score signifies a model’s enhanced capability in accurately translating natural language questions into their corresponding SQL queries, reflecting a deeper understanding of the logic and relationships inherent in these representations. This makes LFA a particularly pertinent measure for the text-to-SQL task, as it directly gauges the fine-tuned models’ effectiveness in interpreting natural language and generating precise SQL queries.
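
            For illustration, LFA reduces to an exact-match count over the test set; the sketch below is one assumed way to compute it, and the case/whitespace normalization applied before comparison is our addition rather than part of the paper's definition.

def logical_form_accuracy(predicted_sqls, reference_sqls):
    """Compute LFA = NCLF / TNI (formula III).

    NCLF: number of generated queries that exactly match their ground-truth query.
    TNI:  total number of instances in the test set.
    """
    assert len(predicted_sqls) == len(reference_sqls), "one prediction per test instance"

    def normalize(sql):
        # Assumed normalization: compare case-insensitively with collapsed whitespace.
        return " ".join(sql.lower().split())

    nclf = sum(normalize(pred) == normalize(ref)
               for pred, ref in zip(predicted_sqls, reference_sqls))
    return nclf / len(reference_sqls)

# Example: if 2 of 3 generated queries match their references, LFA = 2/3 ≈ 0.67.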

