


like a low-hanging fruit, that would be easy to sell to the burnt-out clinician with the promise of alleviating some of the burden of clinical documentation.⁵ This has led to an explosion of startups offering such applications, with additional features such as recommendation of International Classification of Diseases codes, patient instructions in simple language, and even some degree of clinical decision-making.

2. The challenges

The AI scribe offers a potential solution to a problem that seemed impossible to solve. However, this rapid adoption has not been devoid of challenges. Current popular large language models (LLMs), like those that power ChatGPT and Google's Gemini, are very good at general tasks, but their performance is suboptimal in domain-specific tasks. Thus, despite early excitement and some good feedback, it was soon realized that the level of diligence and accuracy required of these models for clinical documentation may be out of reach. Some of the challenges are as follows:

2.1. Hallucinations and unfaithfulness

LLMs are known to hallucinate, meaning that they can "make up" information that may not be accurate.⁶,⁷ Because they have been trained on a large amount of textual data, these models try to "fill in the gaps" with generated text based on that training data. This can be very helpful in tasks where accuracy is not a major concern; in healthcare, however, it poses a significant risk of introducing inaccuracies into clinical documentation, which may compromise patient care.

2.2. Omission of information

The transcript of a clinical encounter may contain information that is not clinically relevant, such as small talk between the patient and the clinician. The LLM may decide to include that information or, conversely, decide not to include information that is clinically relevant, generating a note deficient in clinical information.

2.3. Note formatting inconsistencies

Even though there are generally accepted formats for clinical notes, each clinician has their own unique style of note-taking. Some may prefer to document problem-wise, while others may like it system-wise; some like their notes in a descriptive format, while others prefer bullet points. The LLMs can be prompted to draft a note in a certain format; however, their responses are not always consistent, potentially leading to frustration for clinicians who expect their notes to be laid out in their preferred format.
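As a rough illustration of such format prompting, the sketch below pins a clinician's preferred layout into the instruction passed to a general-purpose LLM; the section names and the build_prompt helper are hypothetical, not taken from any specific AI scribe product.

```python
# Hypothetical prompt template that fixes a clinician's preferred note layout.
# The section headings are illustrative; a real deployment would load each
# clinician's own template from their profile.
PREFERRED_FORMAT = (
    "Draft the clinical note using exactly these sections, in this order:\n"
    "1. Chief Complaint\n"
    "2. History of Present Illness (bulleted)\n"
    "3. Assessment and Plan (problem-wise, one bullet per problem)\n"
    "Do not add, remove, or rename sections."
)

def build_prompt(transcript: str) -> str:
    """Combine the formatting instructions with the encounter transcript."""
    return (
        "You are an AI scribe. Summarize the encounter below into a clinical note.\n"
        f"{PREFERRED_FORMAT}\n\nTranscript:\n{transcript}"
    )

if __name__ == "__main__":
    print(build_prompt("Doctor: What brings you in today? Patient: A cough for three days."))
```

Even with such explicit instructions, the generated note may drift from the requested layout, which is the inconsistency described above.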

2.4. Context window limitations

LLMs have a finite context window, which means they take only a certain amount of input textual data into consideration when crafting a response for the user. If the length of the input data exceeds the context window, some information will likely be missed. In the context of AI scribes, if the encounter goes on for too long and the input transcript contains a large amount of text, it is possible that the LLM misses information because of the narrow context window, leading to incomplete documentation.
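To make the constraint concrete, here is a minimal sketch of a pre-flight length check for a scribe pipeline; the 4-characters-per-token estimate and the 8,000-token window are assumptions for illustration, and a real system would use the model's own tokenizer and documented limit.

```python
# Minimal sketch of a pre-flight length check for an AI scribe pipeline.
# The token estimate and window size below are illustrative assumptions.
CONTEXT_WINDOW_TOKENS = 8_000
RESERVED_FOR_OUTPUT = 1_500  # leave room for the generated note

def estimate_tokens(text: str) -> int:
    """Very rough token estimate (~4 characters per token for English text)."""
    return len(text) // 4

def fits_in_context(transcript: str) -> bool:
    """True if the transcript is likely to fit alongside the prompt and the output."""
    return estimate_tokens(transcript) <= CONTEXT_WINDOW_TOKENS - RESERVED_FOR_OUTPUT

def split_transcript(transcript: str, max_tokens: int = 2_000) -> list[str]:
    """Naive fallback: break an overly long transcript into smaller chunks by line."""
    chunks, current = [], []
    for line in transcript.splitlines():
        current.append(line)
        if estimate_tokens("\n".join(current)) >= max_tokens:
            chunks.append("\n".join(current))
            current = []
    if current:
        chunks.append("\n".join(current))
    return chunks
```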
2.5. Data security/Health Insurance Portability and Accountability Act (HIPAA) compliance

Many AI scribe applications utilize third-party LLMs through application programming interfaces (APIs), requiring data to be passed on to external servers. This poses a data security risk, as the organization loses control of the security and privacy of the data once it leaves the organization's systems. In addition, if the organization that owns the AI scribe application does not implement HIPAA-compliant technologies⁸ for data transmission and storage, the confidentiality of patient data may be compromised.

3. The way forward

Even though AI scribes come with a unique set of challenges, their place in healthcare is undeniable. Therefore, a lot of work is being done to improve their performance. Some of the potential solutions are as follows:

3.1. Fine-tuning

LLMs, such as OpenAI's Generative Pre-trained Transformer, can be fine-tuned for a specific task with the right data. In this process, the model is provided with sample input data and the expected response; it then learns from these examples and adjusts its future output to match the desired output. Fine-tuning is relatively easy to implement and may improve performance.
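As a hedged sketch of what such task-specific data might look like, the snippet below packages de-identified transcript/note pairs into a JSONL file using a chat-style "messages" layout similar to that accepted by several hosted fine-tuning APIs; the system prompt, file name, and helper functions are hypothetical.

```python
import json

# Sketch of packaging transcript/note pairs for supervised fine-tuning.
# The chat-style "messages" layout mirrors the JSONL format accepted by several
# hosted fine-tuning APIs; the system prompt and file name are illustrative.
SYSTEM_PROMPT = "You are an AI scribe that turns encounter transcripts into clinical notes."

def to_training_record(transcript: str, approved_note: str) -> dict:
    """Pair a de-identified transcript (input) with its clinician-approved note (target)."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": transcript},
            {"role": "assistant", "content": approved_note},
        ]
    }

def write_dataset(pairs: list[tuple[str, str]], path: str = "scribe_finetune.jsonl") -> None:
    """Write one JSON object per line, the usual layout for uploading such datasets."""
    with open(path, "w", encoding="utf-8") as f:
        for transcript, note in pairs:
            f.write(json.dumps(to_training_record(transcript, note)) + "\n")
```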
3.2. Selective information extraction

One potential solution to improve accuracy could be labeling information in the transcript based on its clinical relevance, then omitting segments labeled as not clinically relevant and retaining the relevant ones. This would provide the model with input that is relevant and concise, reducing the amount of data passed to it, helping it fit within the context window, and reducing computation time.
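A minimal sketch of this idea follows, assuming the transcript is already segmented into utterances; the classify_relevance stub is a placeholder that a real system would replace with a dedicated classifier or a separate labeling call to an LLM.

```python
from dataclasses import dataclass

# Sketch of selective information extraction: label each utterance and drop the
# ones judged not clinically relevant before the note-generation step.

@dataclass
class Utterance:
    speaker: str  # "clinician" or "patient"
    text: str

def classify_relevance(utterance: Utterance) -> bool:
    """Placeholder labeler: a trivial keyword check stands in for a real model."""
    clinical_cues = ("pain", "medication", "fever", "cough", "allerg", "dose")
    return any(cue in utterance.text.lower() for cue in clinical_cues)

def filter_transcript(utterances: list[Utterance]) -> str:
    """Keep only clinically relevant utterances and return a condensed transcript."""
    kept = [u for u in utterances if classify_relevance(u)]
    return "\n".join(f"{u.speaker}: {u.text}" for u in kept)

transcript = [
    Utterance("patient", "Traffic was terrible this morning."),            # small talk, dropped
    Utterance("patient", "The cough started three days ago with fever."),  # kept
    Utterance("clinician", "Any allergies to medications?"),               # kept
]
print(filter_transcript(transcript))
```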
3.3. Domain-specific models

Models trained on curated medical data and designed specifically for the task of clinical note generation from

