Artificial Intelligence in Health Predicting mortality in COVID-19 using ML
ended up with 3,809,119 COVID-19 patients with valid attribute values. Fourth, we removed the non-correlated attributes – for instance, those related to geography, residency, and indigeneity – resulting in 24 attributes. Finally, we transformed the “Date of Death” attribute into a categorical attribute by renaming it “Survived” and replacing all the “9999-99-99” date values with “1” (Yes) and the rest with “0” (No). We also combined the “Symptom Onset Date” and “Hospital Admission Date” attributes into a numerical attribute labeled “Days from Symptom to Hospitalization.” Hence, we settled on 23 attributes, as shown in Table 1. Nineteen of these attributes were included in the ML models, with “Survived” being the target attribute used for predictions, and the remaining four being the “Registration ID” and three indicators. The resulting “gold standard dataset” (Table 2), consisting of the 23 attribute values of the 3,809,119 patients, was saved in CSV file format. The dataset cleansing process flowchart is shown in Figure 1. The dataset value distribution for the categorical attributes is depicted in Figure 2, whereas Figure 3 illustrates the distribution of age groups for both genders. In Figure 3, we observe that most of the patients

Table 1. The 23 attributes of each patient

| Attribute name | Values |
|---|---|
| Registration ID | Patient’s unique identification code |
| Sex | 1: Female; 2: Male |
| Age | Numerical positive |
| Smoker | 1: Yes; 2: No |
| Pneumonia | 1: Yes; 2: No |
| Diabetes | 1: Yes; 2: No |
| Obesity | 1: Yes; 2: No |
| COPD | 1: Yes; 2: No |
| Asthma | 1: Yes; 2: No |
| Immunosuppressed | 1: Yes; 2: No |
| Hypertension | 1: Yes; 2: No |
| Cardiovascular disease | 1: Yes; 2: No |
| Chronic kidney failure | 1: Yes; 2: No |
| Other chronic disease | 1: Yes; 2: No |
| Pregnancy | 1: Yes; 2: No; 97: Not applicable (Male) |
| Contact with COVID-19 case^a | 1: Yes; 2: No |
| Laboratory result^a | 1: SARS-CoV-2 positive; 2: SARS-CoV-2 negative; 3, 4: Not clear |
| Final classification^a,b | 1, 2, 3: Confirmed case; 4: Invalidly identified case; 5, 6, 7: Unconfirmed case |
| Patient type | 1: Not admitted; 2: Admitted |
| Intubated | 1: Yes; 2: No; 97: Not applicable |
| ICU | 1: Admitted to ICU; 2: Not admitted to ICU; 97: Not applicable |
| Days from symptom to hospitalization^c | Numerical positive (created attribute) |
| Survived^b | 1: Yes; 2: No (created attribute) |

Notes: ^a Indicators; ^b COVID-19 sample classification; ^c Created attributes.
Abbreviations: COPD: Chronic obstructive pulmonary disease; ICU: Intensive care unit.

Table 2. The gold standard dataset

| Attribute name | Value distribution |
|---|---|
| Registration ID | 3,809,119 unique values |
| Sex | 1,921,058: Female; 1,888,061: Male |
| Age | Mean: 40.723; Standard deviation: 17.173; Minimum: 0; Maximum: 121 |
| Smoker | 250,558: Yes; 3,558,561: No |
| Pneumonia | 395,528: Yes; 3,413,591: No |
| Diabetes | 403,336: Yes; 3,405,783: No |
| Obesity | 448,412: Yes; 3,360,707: No |
| COPD | 31,477: Yes; 3,777,642: No |
| Asthma | 74,682: Yes; 3,734,437: No |
| Immunosuppressed | 23,825: Yes; 3,785,294: No |
| Hypertension | 526,891: Yes; 3,282,228: No |
| Cardiovascular disease | 44,362: Yes; 3,764,757: No |
| Chronic kidney failure | 42,703: Yes; 3,766,416: No |
| Other chronic disease | 60,307: Yes; 3,748,812: No |
| Pregnancy | 29,332: Yes; 1,891,726: No; 1,888,061: Not applicable (Male) |
| Contact with COVID-19 case | 1,519,968: Yes; 2,289,151: No |
| Laboratory result | 3,802,238: SARS-CoV-2 positive; 0: SARS-CoV-2 negative; 6,881: Not clear |
| Final classification | 3,809,119: Confirmed cases; 0: Invalidly identified cases; 0: Unconfirmed cases |
| Patient type | 3,284,671: Not admitted; 524,448: Admitted |
| Intubated | 60,539: Yes; 463,909: No; 3,284,671: Not applicable |
| ICU | 42,095: Admitted to ICU; 482,353: Not admitted to ICU; 3,284,671: Not applicable |
| Days from symptom to hospitalization | Mean: 3.673; Standard deviation: 3.160; Minimum: -13^a; Maximum: 43 |
| Survived | 3,558,390: Yes; 250,729: No |

Note: ^a The minimum value is a negative number in the cases where the patient contracted the disease inside the hospital where they were being treated.
Abbreviations: COPD: Chronic obstructive pulmonary disease; ICU: Intensive care unit.
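The two created attributes described in the text, “Survived” and “Days from Symptom to Hospitalization,” can be illustrated with a minimal sketch. It is not the authors' code. The field names (`date_of_death`, `symptom_onset_date`, `hospital_admission_date`) and the ISO `YYYY-MM-DD` date format are assumptions made for illustration, and the sketch assumes both dates are present for every record. It follows the 1/0 coding stated in the prose.

```python
from datetime import date

# Sentinel that, per the text, marks "no recorded death" in the Date of Death field.
NO_DEATH_SENTINEL = "9999-99-99"


def survived_flag(date_of_death: str) -> int:
    """Return 1 (Yes) for the sentinel date and 0 (No) for any other value."""
    return 1 if date_of_death.strip() == NO_DEATH_SENTINEL else 0


def days_from_symptom_to_hospitalization(onset: str, admission: str) -> int:
    """Whole days from symptom onset to admission (negative if admission came first)."""
    return (date.fromisoformat(admission) - date.fromisoformat(onset)).days


def derive_attributes(record: dict) -> dict:
    """Replace the three raw date fields (hypothetical names) with the two derived ones."""
    raw_fields = ("date_of_death", "symptom_onset_date", "hospital_admission_date")
    derived = {k: v for k, v in record.items() if k not in raw_fields}
    derived["days_from_symptom_to_hospitalization"] = days_from_symptom_to_hospitalization(
        record["symptom_onset_date"], record["hospital_admission_date"]
    )
    derived["survived"] = survived_flag(record["date_of_death"])
    return derived


if __name__ == "__main__":
    # Made-up record, for illustration only.
    example = {
        "registration_id": "example-001",
        "symptom_onset_date": "2020-04-01",
        "hospital_admission_date": "2020-04-05",
        "date_of_death": "9999-99-99",
    }
    print(derive_attributes(example))
```

A negative day count arises naturally in this sketch when the admission date precedes the symptom onset date, which is the situation the note to Table 2 describes.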
Volume 1 Issue 3 (2024) 34 doi: 10.36922/aih.2591

