
Artificial Intelligence in Health                                              SDoH in clinical narratives



To analyze the link between article features and SDoH mentions, we conducted six logistic regression analyses using the Python package statsmodels 0.14.0 to gauge the adjusted odds ratio (AOR) for each article trait. We also employed a stepwise additive method,³⁵ where features that could enhance the likelihood of the model were sequentially incorporated with a P-value threshold of 0.001 for the likelihood ratio test.

3. Results

3.1. Study population and data inclusion

We analyzed a comprehensive dataset comprising 463,546 clinical case reports indexed in Medline from 1975 through 2022. The distribution of the articles across four key characteristics (author's geographic region, journal's geographic region, journal specialty, and clinical diagnosis) is displayed in Table 1.

3.2. Recall and precision of identifying mentions of the social determinants of health

In our corpus analysis, the precision of SDoH identification was 99.3% (95% confidence interval [CI]: 99.2–99.4%) for race/ethnicity, 90.2% (95% CI: 88.8–91.4%) for marital status, 90.8% (95% CI: 86.9–93.6%) for population group, 97.4% (95% CI: 95.6–98.4%) for sexual orientation, 100% (95% CI: 94.6–100%) for housing, and 98.4% (95% CI: 91.7–99.7%) for spiritual beliefs.

During external validation, the precision results were 97.4% (95% CI: 86.5–99.5%) for race/ethnicity, 100% (95% CI: 92.3–100%) for marital status, 88.9% (95% CI: 56.5–98.0%) for population group, 93.8% (95% CI: 71.7–98.9%) for sexual orientation, 98.6% (95% CI: 92.3–99.7%) for housing, and 83.0% (95% CI: 70.8–90.8%) for spiritual beliefs.

The recalls in the external validation were 90.2% (95% CI: 77.5–96.1%) for race/ethnicity, 97.9% (95% CI: 88.9–99.6%) for marital status, 88.9% (95% CI: 56.5–98.0%) for population group, 100% (95% CI: 79.6–100%) for sexual orientation, 85.2% (95% CI: 75.9–91.37%) for housing, and 83.0% (95% CI: 70.8–90.8%) for spiritual beliefs.

In our analysis comparing the recall and precision of the JSL SDoH-NER model with those of zero-shot learning (i.e., GPT-3.5 and GPT-4), JSL and GPT-4 displayed comparable results. Some differences were nevertheless evident: JSL outperformed GPT-4 in precision for marital status (P = 0.005; GPT-4 scored 82.9%; 95% CI: 67.3–91.9%) and housing (P < 0.001; GPT-4 scored 82.9%; 95% CI: 67.3–91.9%). The results of this comparison are detailed in Figures S1 and S2.

3.3. Prevalence of social determinants of health mentions

Among the total case reports examined, 20,420 (4.4%) included references to at least one SDoH category. A breakdown revealed that 17,765 case reports specifically mentioned race/ethnicity, followed by 1,991 articles that discussed marital status, 524 on sexual orientation, 284 on immigrant status, 63 on spiritual beliefs, and 60 on homelessness. The mean and confidence intervals of the mention rates within the study period are summarized in Table 2.

The analysis of the proportion of clinical cases reporting SDoH within the study period indicated a statistically significant association between publication year and mentions of race/ethnicity (P < 0.001), sexual orientation (P < 0.001), and homelessness (P < 0.001). Notably, there was a peak of sexual orientation mentions from 1980 to 1995, and we hypothesized that this could be related to the rise of acquired immunodeficiency syndrome (AIDS) cases, as depicted in Figure S3. There was also a prominent increase in race/ethnicity mentions between 2011 and 2013 (Figure S4) and a less evident but statistically significant increase in homelessness mentions since 1990.

3.4. Factors associated with reporting social determinants of health

3.4.1. Race/ethnicity

Significant associations were observed between the author's geographic origins and the frequency of race/ethnicity mentions. Authors from sub-Saharan Africa were most likely to discuss race/ethnicity (AOR: 4.47; 95% CI: 3.96–5.04), followed by the Caribbean (AOR: 3.31; 95% CI: 2.24–4.89), Southeast Asia (AOR: 2.89; 95% CI: 2.58–3.25), East Asia (AOR: 2.00; 95% CI: 1.90–2.09), and North America (AOR: 1.77; 95% CI: 1.68–1.86). Conversely, authors from the Indian subcontinent (AOR: 0.69; 95% CI: 0.62–0.76) and the Middle East (AOR: 0.77; 95% CI: 0.70–0.84) were less inclined to mention race/ethnicity in their case reports.

The journal's geographic region also exerted an independent influence on race/ethnicity mentions. Journals originating from Australia-Oceania (AOR: 1.34; 95% CI: 1.17–1.53) and Western Europe (AOR: 1.30; 95% CI: 1.18–1.43) were slightly more prone to include race/ethnicity. In contrast, journals from East Asia (AOR: 0.48; 95% CI: 0.43–0.54), Eastern Europe (AOR: 0.54; 95% CI: 0.45–0.64), and South America (AOR: 0.55; 95% CI: 0.43–0.69) had far fewer race/ethnicity mentions than expected.
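As a reading aid for the adjusted odds ratios reported in Section 3.4.1, the sketch below shows how a logistic regression coefficient on the log-odds scale is commonly converted to an odds ratio with a 95% confidence interval. It assumes a Wald-type interval, which the text does not confirm. The function name `odds_ratio_with_ci` and the coefficient and standard error are hypothetical, chosen only for illustration, and are not values from the study.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error into an
    odds ratio with a Wald confidence interval (assumed method)."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper


# Hypothetical coefficient and standard error, not taken from the study.
aor, lower, upper = odds_ratio_with_ci(beta=math.log(2.0), se=0.1)
print(f"AOR: {aor:.2f}; 95% CI: {lower:.2f} - {upper:.2f}")
```

Under this reading, an AOR above 1 with an interval that excludes 1 indicates higher odds of an SDoH mention relative to the reference category, and an AOR below 1 indicates lower odds.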

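The precision and recall intervals in Section 3.2 include cases where the point estimate is 100% but the lower bound is well below it. A score-type interval for a binomial proportion behaves this way. The sketch below uses the Wilson score interval as one plausible choice; the method is an assumption, since this page does not name the one used. The function name `wilson_interval` and the counts are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical counts: every one of 15 predicted mentions judged correct.
low, high = wilson_interval(successes=15, total=15)
print(f"precision 100% (95% CI: {low:.1%} - {high:.1%})")
```

Unlike the simple normal-approximation interval, which collapses to zero width when every prediction is correct, this form still yields an informative lower bound for small validation samples.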

            Volume 1 Issue 2 (2024)                        120                               doi: 10.36922/aih.2737