implication suggested by the results of this experiment is that "a more powerful base model is preferable to start with": an overall performance improvement from upgrading the base model is highly expected.

6. Conclusion

In this paper, we explore the capabilities and limitations of LoRA through various comparative analyses in the medical domain. LoRA-based instruction-tuning, while avoiding an excessive number of steps, can partially integrate domain-specific knowledge into LLMs, with larger models demonstrating more pronounced effects. We also observe a decrease in performance after additional pretraining on a scarce training dataset. Furthermore, our results underscore the potential of adapting larger English-centric models for Japanese applications in domain adaptation, while also highlighting the persisting limitations of Japanese-centric models, including the deterioration of 1-shot performance after instruction-tuning. Our findings suggest that, at present, the most promising approach to constructing a domain-specific LLM is applying QLoRA to larger English-centric base models.
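To make this recommendation concrete, the following is a minimal sketch of QLoRA-style instruction-tuning using the Hugging Face transformers, peft, and bitsandbytes libraries. It is illustrative only, not the training code used in this study; the base model name, adapter rank, and target modules are placeholder assumptions.

```python
# A minimal QLoRA instruction-tuning sketch (illustrative only; not the
# training code used in this study). Assumes the Hugging Face transformers,
# peft, and bitsandbytes libraries; the model name and LoRA hyperparameters
# below are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-70b-hf"  # a larger English-centric base model

# 4-bit NF4 quantization of the frozen base weights: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Only small low-rank adapter matrices are trained; the quantized base stays frozen.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters

# From here, the adapted model can be passed to a standard supervised
# fine-tuning loop (e.g., transformers.Trainer) on instruction-response pairs.
```

The design point worth noting is that 4-bit quantization of the frozen base weights is what makes adapter training on models of this scale feasible within modest GPU memory budgets.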
Given the current situation, the clinical translation of medical LLMs into real-life applications still falls short of our expectations. To fully harness the potential of medical LLMs in healthcare settings, addressing both the performance limitations and the associated security and privacy concerns is imperative. Further research and development efforts are needed to enhance the accuracy and reliability of these models, ensuring they meet the rigorous standards required for clinical decision-making.
Furthermore, the integration of medical LLMs with other AI technologies, such as those used in electrocardiography and electronic medical records, has the potential to amplify their impact significantly. By using these AI systems cohesively alongside medical LLMs, physicians can achieve a more comprehensive understanding of patient data, with which they can formulate more personalized treatment plans to improve patient outcomes.
Acknowledgments

None.

Funding

This study was supported by the Japan Agency for Medical Research and Development (Grant Number: JP23hk0102078h0003).

Conflict of interest

The authors declare they have no competing interests.

Author contributions

Conceptualization: Issey Sukeda, Satoshi Kodera
Formal analysis: Issey Sukeda
Investigation: Issey Sukeda, Satoshi Kodera
Methodology: Issey Sukeda, Masahiro Suzuki, Hiroki Sakaji
Writing – original draft: Issey Sukeda
Writing – review & editing: Issey Sukeda, Masahiro Suzuki, Hiroki Sakaji

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data

Journal articles used in the study are available online as PDFs. ChatGPT was utilized for generating and cleansing the data. IgakuQA is available online. JJSIMQA is not made publicly available.

Further disclosure

Part of the findings was presented at Deep Generative Models for Health at NeurIPS 2023. In addition, a submission made to a NeurIPS workshop is available on arXiv (https://doi.org/10.48550/arXiv.2310.10083).

