
Artificial Intelligence in Health                                       ChatGPT in writing scientific articles






[Figure 3, panels A-F]

Figure 3. Impact factor (IF) distribution of all unique sources for ChatGPT-generated articles in the medical fields of cardiology, oncology, and remote medical examination. The X-axis of the charts presents the ranges of the IF distribution, whereas the Y-axis of the charts presents the source quantity. (A, C, and E) correspond to the results of ChatGPT 3.5, and (B, D, and F) correspond to the results of ChatGPT 4. Image created using MATLAB (R2021b, The MathWorks Inc., USA).
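
The binning behind charts of this kind (IF ranges on the X-axis, number of sources on the Y-axis) can be sketched as follows. This is a minimal illustration in Python, not the authors' MATLAB code: the function name `bin_impact_factors`, the bin width of 8, and the sample IF values are all hypothetical and serve only to show the counting step.

```python
from collections import Counter


def bin_impact_factors(impact_factors, bin_width=8.0):
    """Count sources per IF range [k * bin_width, (k + 1) * bin_width).

    The bin width is an assumption for illustration; the study does not
    state the width used in its charts.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter()
    for value in impact_factors:
        if value < 0:
            raise ValueError("impact factors cannot be negative")
        # Lower edge of the half-open range that contains this value.
        lower = int(value // bin_width) * bin_width
        counts[(lower, lower + bin_width)] += 1
    # Sort by range so the result reads left to right like the X-axis.
    return dict(sorted(counts.items()))


# Hypothetical IF values, NOT data from the study.
sample = [1.2, 3.4, 5.0, 7.9, 8.0, 12.5, 25.1]
histogram = bin_impact_factors(sample)
for (low, high), count in histogram.items():
    print(f"IF {low:g}-{high:g}: {count} source(s)")
```

Ranges with no sources are simply absent from the result, which is how an isolated bar or a gap between two clusters would show up in such a count.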
The noticeable peak in the lower IF range in Figure 4A suggests that ChatGPT 3.5 often cited sources with low IF for reliable or semi-reliable content. Figure 4C, also associated with ChatGPT 3.5, shows a predominance of fictitious sources with IFs in the range of 0–16.

For ChatGPT 4, Figure 4B reveals a single isolated bar in the IF range from 20 to 30 for reliable or semi-reliable sources, indicating a narrower scope of sourcing compared to ChatGPT 3.5. Figure 4D, on the other hand, displays a bimodal distribution of fictitious sources, with clusters in the low and middle of the IF spectrum.

The analysis of fictitious sources shows that both ChatGPT versions are more likely to invent sources with an IF of <16. However, sources with very high IF are also present in the sample.

Based on the results of the source analysis, it is important to note that the IF values are also closely related to the topics of the articles. Figure 5 shows two IF distributions of article sources generated by the third prompt for ChatGPT 4.

For the topic “biotelemetry in cardiology,” most of the sources provided by ChatGPT 4 have an IF below 8, and only a few have an IF above 16. For the topic “biotelemetry in oncology,” there is a smoother decrease in the number of sources as the IF decreases.

3.5. Characteristics of ChatGPT responses

On examination of the responses generated by ChatGPT 4, it became evident that certain characteristics could be discerned. These characteristics included the structure, format, content, and thematic remarks of the responses. In certain instances, ChatGPT 4 appends a note to the source of the article indicating that the content is implausible. An example of such a note is:


            Volume 1 Issue 3 (2024)                         58                               doi: 10.36922/aih.2592