
Artificial Intelligence in Health                                                        NLP in EHR




Table 1. Levels of NLP with their focus

NLP levels                 Focus
Phonetic or phonological   Pronunciation
Morphological              The smallest parts of words that carry meaning, suffixes, and prefixes
Lexical                    Lexical meaning of words and parts of speech analyses
Syntactic                  Grammar and structure of sentences
Semantic                   Meaning of words and sentences
Discourse                  Structure of different kinds of text using document structure
Pragmatic                  The knowledge that stems from the outside world

Abbreviation: NLP: Natural language processing.

as the absence of improvement in performance with the use of empirical methods, the unigram model requiring to account for unigram, negation and consideration, information loss due to the use of template notes, and external validation of the model lacked interoperability. Further, generalization of the model is conceivable. Workman et al. [21] identified correct and misspelled terms within emergency department notes using small corpora of surgical pathology and emergency department documents. The methods employed in the study [21] could potentially be extended to other domains.

Hanauer et al. [22] performed a systematic review investigating the utilization of electronic medical records (EMR) in cancer-related research. Although only a small dataset was used in the study, there is potential for application to a larger dataset. Baxter et al. [23] detected fungal ocular involvement from critical care records. However, the study lacked a relative assessment of sensitivity and specificity. Challenges, including the de-identification of records, issues with queries, and regular expressions, still need to be addressed. Notably, the study was performed on data from the critical care unit of a single institute, underscoring the importance of the inclusion of positive cases in future research.

3.1.3. Lexical level

Tao et al. [24] automated information extraction from prescriptions, employing unstructured discharge summaries as the primary data source. Limitations of the study include the de-identification of documents and the delineation of relationships among entities. Liu et al. [25] focused their study on named entity recognition from clinical text in Chinese. The use of fuzzy feature engineering affects long short-term memory, which, in turn, can be applied to other specific domains. Li et al. [25] conducted research on automatic detection using pediatric EMR. Notably, the study did not consider images for feature extraction, but it holds potential for prediction.

Cai et al. [26] extracted numerical information using a rule-based approach. The abbreviations used in the study underwent manual review. The study faces challenges related to overfitting due to additions, decision-making regarding variable boundaries, and differences between the formats of clinical notes across different hospitals. The addition of more keywords could facilitate the expansion of the study to multiple hospitals. In a separate study, Cai et al. [26] worked on named entity recognition in Chinese, conducting the study on data from two hospitals. Thirukumaran et al. [27] identified surgical site infections from orthopedics notes, employing a rule-based approach on data from a single institute. It is noteworthy that the study did not classify the infections.

Dipaola et al. [28] recognized syncope patients from free-text notes written in Italian at emergency departments. Their research was replicated in other languages, including the identification of patients with rheumatoid arthritis from free text in German [29]. The imperative for performance optimization, testing with computational language experts, and the implementation of encryption processes for clinical notes is underscored to enhance protection against security breaches.

3.1.4. Syntactic level

Gregg et al. [30] focused on risk stratification in prostate cancer care. The algorithm demonstrated efficacy within a single institute but was limited to prostatectomy, which restricts its applicability to the broader health system or health groups. Clinical staging forms and electronic laboratory results constituted the data sources for the study, facilitating the identification of incidental pulmonary nodules from radiology reports through a rule-based approach. The study’s limitations encompass the need for a small amount of manual review and the use of non-specific, ambiguous terminologies. To enhance performance, the application of a machine learning approach holds promise.

3.1.5. Semantic level

Nath et al. [31] performed the extraction of information from echocardiography reports. However, the study’s inclusion/exclusion criteria introduced certain limitations, compounded by the use of data sourced solely from a single institute. The extracted data elements held qualitative value but required manual review. To enhance the algorithm’s applicability, consideration should be given to its implementation in a larger domain. Goldstein and Shahar [32], in their work on CliniText, focused on structured data while excluding images. The system’s testing was conducted

            Volume 1 Issue 1 (2024)                         20                        https://doi.org/10.36922/aih.2147