Artificial Intelligence in Health LLMs-Healthcare: Application and challenges
area will improve, but continued research is essential to fully understand and harness its potential.

In a study by Haemmerli et al.,¹³ the capability of ChatGPT was explored in the context of central nervous system tumor decision-making, specifically for glioma management. Using clinical, surgical, imaging, and immunopathological data from ten randomly chosen glioma patients discussed in a tumor board, ChatGPT’s recommendations were compared with those of seven central nervous system tumor experts. Most of the patients had glioblastomas. The findings revealed that ChatGPT’s diagnostic accuracy was limited, with notable discrepancies in glioma classification. However, it demonstrated competence in recommending adjuvant treatments, aligning closely with expert opinions. Despite its limitations, ChatGPT shows potential as a supplementary tool in oncological decision-making, particularly in settings with constrained expert resources.

In a study on the effectiveness of ChatGPT in offering cancer treatment advice, Chen et al.¹⁴ scrutinized the model’s alignment with the NCCN guidelines for breast, prostate, and lung cancer treatments. Using four different prompt templates, the study assessed whether the mode of questioning influenced the model’s responses. While ChatGPT’s recommendations aligned with the NCCN guidelines in 98% of the prompts, 34.3% of these recommendations also contained information that was not fully aligned with the NCCN guidelines. The study concluded that, despite its potential, ChatGPT’s performance in consistently delivering reliable cancer treatment advice was unsatisfactory. Consequently, patients and medical professionals must exercise caution when relying on ChatGPT and similar tools for educational purposes.

2.1. Challenges associated with LLMs as a decision-support tool in cancer care

While integrating LLMs like ChatGPT into oncology shows promise, particularly in decision support for cancer treatment, it also presents several critical challenges, as discussed in the previous section. These challenges must be addressed to ensure LLMs’ safe and effective use in high-stakes medical environments. First, the accuracy and precision of LLMs are a significant concern. For instance, in a study by Haemmerli et al.¹³ on glioma therapy, ChatGPT demonstrated limitations in accurately classifying glioma types. Similarly, the study by Lukac et al.⁷ revealed errors in patient-specific therapy suggestions, such as misidentifying patients for trastuzumab therapy. These inaccuracies highlight the risk of misdiagnoses or inappropriate treatment recommendations, which could have profound implications for patient care.

Another challenge is the capacity of LLMs to consider the comprehensive clinical picture, including patient functional status, which is often a nuanced judgment call made by experienced physicians. ChatGPT’s moderate performance in this area, as seen in Haemmerli et al.,¹³ indicates a gap between current LLM capabilities and the complex decision-making processes in medical practice. Furthermore, the integration of LLMs into existing medical workflows raises concerns. For example, the study by Gebrael et al.⁸ on triage in metastatic prostate cancer showed that while ChatGPT had high sensitivity, its low specificity for discharges could lead to operational inefficiencies. Integrating LLMs within health-care systems also poses challenges in data privacy, interoperability, and the need for robust IT infrastructure.

Finally, the role of LLMs in patient education and communication is not without limitations. Haver et al.¹⁰ demonstrated inconsistencies in ChatGPT’s responses to questions on breast cancer prevention and screening. This inconsistency highlights the importance of human oversight in verifying the information provided by LLMs, to ensure it aligns with established medical guidelines and practices. In summary, while LLMs present exciting opportunities for enhancing cancer care, their current limitations in accuracy, comprehensive clinical assessment, integration into existing systems, and patient education necessitate a cautious and critical approach. These models should be viewed as supplementary tools that augment, rather than replace, the expertise of medical professionals. Continuous evaluation, refinement, and ethical consideration are essential to harness the full potential of LLMs in oncology.

3. Skin care (dermatology)

Our skin is a barrier against external threats such as viruses, bacteria, and other harmful organisms. Dermatology is the branch of medicine dealing with skin diseases. There has been a surge in cases related to skin diseases in recent years, affecting people of all ages.¹⁵ Common skin-related diseases include acne, alopecia, bacterial skin infections, decubitus ulcers, fungal skin diseases, pruritus, and psoriasis.¹⁶

Traditional dermatology diagnosis is based on a visual inspection of skin features and subjective evaluation by a dermatologist.¹⁷ Dermatology diagnosis faces several significant challenges. First, accurately interpreting skin disease imagery is complex due to the wide variety of skin conditions and their subtle visual differences. Second, this task requires a high level of expertise, and dermatologists are in short supply, especially in remote or underserved areas. Finally, creating patient-friendly diagnostic reports is another hurdle because preparing reports that are detailed yet understandable to non-specialists is a time-consuming and labor-intensive endeavor for dermatologists.
Volume 1 Issue 2 (2024) 19 doi: 10.36922/aih.2558

