
Artificial Intelligence in Health                                                        NLP in EHR



their synonyms. Emphasizing the expansion of methods to other domains is deemed essential.

Koza et al. [47] studied lexical and semantic levels extracted from radiology reports in the Spanish language. The study is limited by typing and spelling errors in the radiological findings. Future research perspectives include addressing negation findings, testing the methodology on a more diverse corpus, and creating a more complex dictionary. Lee et al. [48] extracted polyp information from colonoscopy reports, with manual data extraction affecting the degree of accuracy. The study's findings suggest the potential for replication in other healthcare settings. Shen et al. [49] used a combination of lexical and semantic levels, focusing on surgical site infection from clinical notes in English related to colorectal surgery. The study revealed a relatively low F1 score, indicating room for improvement. Future studies could explore various machine learning algorithms and sub-language supporting techniques. Topaz et al. [50] used a combination of rule-based and machine learning approaches to study the semantic level, processing clinical notes in English within a limited domain. There is potential for future development by incorporating different sources. Topaz et al. [50] investigated neuropsychiatric symptoms from free-text clinical notes in the English language. Due to the use of inadequate information, there is potential to expand the symptoms category. In addition, since the findings are not validated, validating them is a task for future research.

Shi et al. [51] performed surveillance of surgical site infection using clinical notes in English. The study is susceptible to mention-level and document-level errors, and removing them is an avenue for future research. Annotation errors also contribute to the study's limitations. Senders et al. [52] focused on diagnosing brain metastasis based on free-text radiology reports, employing a machine learning approach. The human classification in the study introduces ambiguities, and the reliance on data from a single institute limits interoperability. External validation of findings, extraction of higher-level concepts, use of unsupervised machine learning approaches, and automated medical text analysis are crucial aspects guiding further studies. Misra-Hebert et al. [53] aimed to detect hypoglycemia using information extracted from clinical progress notes in English. However, the study is confined to one EMR system and lacks information on the duration of diabetes. Wulff et al. [15] endeavored to convert unstructured information from the pediatric ICU of Hannover Medical School into a structured format, using free text in the English language as the information source. The study's limitation lies in the absence of a retrospective aspect, as it did not consider the patient's history from the past month, past week, or even yesterday. In addition, the study is constrained by a small sample size. König et al. [54] extracted clinical events from discharge letters, employing a combination of lexical, semantic, and morphological levels. The study, based on data from the German language, has the potential for application to other languages.

Oliveira et al. [55] conducted surveillance of cervical, anal, and pre-cancer conditions using a combination of rule-based and machine learning approaches. Notably, the classification in the study was performed at the document level rather than at the patient level. The reports considered for the study originated from a single healthcare system, suggesting the possibility of expanding the study to multiple institutes to address interoperability challenges among them. Wang et al. [56] conducted a study to recognize named entities in Chinese, using both phonetic and semantic levels. The study focused solely on character-based named entity recognition (NER), leaving room for potential exploration of word-based NER in future research.

3.1.6. Discourse Level

Tou et al. [57] focused on the automatic detection of infections before hospitalization using records from a surgical emergency department in the Chinese language. Limitations of the study include the lack of Chinese resources and feature extraction relying on manually prepared wordlists. Future studies could explore a reinforcement-based approach.

Bozkurt et al. [58] centered their research on lesion summarization and cancer response from mammography reports, employing a rule-based approach. The study utilized small datasets from a single institute. Exploring larger datasets, enhancing generalizability, and incorporating convolutional networks represent potential avenues for future work.

3.1.7. Pragmatic level

Doan et al. [59] identified Kawasaki disease from emergency department notes. The study encountered limitations such as a limited variety of syntax, spelling errors, and hypothetical clauses, for which the tool needs additional training. To enhance the study, future efforts could involve incorporating data from several visits and conducting timestamp analysis.

Most of the articles reviewed emphasized the semantic level (67%), with a relatively smaller proportion related to phonetic and pragmatic levels. Annotation processes were conducted by annotators whose expertise varied, leading to the introduction of personalized biases into the annotation process. Proficiency in language, grammar, and vocabulary is essential when conducting analyses at the semantic, syntactic, or lexical levels. It is worth noting that tools like UMLS, while effective in English,


            Volume 1 Issue 1 (2024)                         25                        https://doi.org/10.36922/aih.2147