Materials Science in Additive Manufacturing Data imputation strategies of PBF Ti64
research question. In this case, it is necessary to impute a large proportion of missing data to maintain a sufficient sample size and to include important variables in the analysis. However, imputing a high proportion of missing data can also increase the risk of bias and lead to inaccurate results. Therefore, it is important to carefully evaluate the validity of imputed data through methods such as statistical summaries and comparison with observed data.

Statistical summaries can be used to validate imputed values, and Table 2 shows the compiled observed and imputed datasets. In the kNN-imputed dataset, the minimum and maximum values for all imputed variables remained unchanged from the original values. The mean and standard deviation of observed and imputed energy density values were similar (89.20 vs. 89.07 J/mm³, and 68.05 vs. 65.14 J/mm³, respectively). However, variables such as laser spot showed disparities in mean (125.56 vs. 106.03 µm) and standard deviation (133.86 vs. 97.57 µm), possibly due to differences in the proportion of missing data for each variable, with energy density having 364 observed values out of a total of 401, compared to only 194 observed values for laser spot.

Similarly, for the MICE-imputed and GINN-imputed datasets, the minimum and maximum values for all imputed variables did not change. There were also disparities in mean and standard deviation for the variables laser focus and laser spot, possibly due to a large proportion of missing data, as discussed above. For the MICE-imputed dataset, the standard deviations of ultimate tensile strength and yield strength do differ (147.47 vs. 218.33 MPa, and 189.93 vs. 269.83 MPa, respectively), but given the proportion of missing data for these two variables, the imputed values may be reasonable. Thus, individual values of the imputed dataset have to be checked to ascertain if the imputations are sensible.

Other than using the graphical and statistical methods to evaluate the imputed datasets, imputed values are also manually checked for any illogical values for the material properties: density and porosity values should add up to 100%, and the microhardness should be higher than the macrohardness [40]. Imputed values for process parameters should also fall within the processing window.

Table 1. Percentage of missing values for each variable

Variables                          Missingness (%)
Laser power (W)                    0.00
Laser type (0 for cw, 1 for pw)    0.00
Layer thickness (µm)               0.25
Hatch spacing (µm)                 5.93
Energy density (J/mm³)             9.14
Scan speed (mm/s)                  14.81
Density (%)                        43.46
Laser spot (µm)                    51.60
Porosity (%)                       69.14
Laser focus (mm)                   73.09
Elongation, EL (%)                 79.51
Ultimate tensile strength (MPa)    79.51
Exposure duration (µs)             80.49
Point distance (µm)                81.98
Yield strength (MPa)               82.22
Macrohardness (HV)                 88.15
Microhardness (HV)                 90.12
Young’s modulus (GPa)              92.35

3.2. Comparison of imputation models

Comparing the distribution graphs, all three imputed datasets have relatively close distributions to the original dataset for the process parameters, as well as for the density and porosity variables. The discriminating features are the remaining variables, namely, elongation, microhardness, macrohardness, ultimate tensile strength, yield strength, and Young’s modulus. The models perform better for the processing parameters as they are deterministic and depend on fewer external factors. In addition, more datapoints are available for the processing parameters as they are reported in most of the studies. The material properties have a higher deviation because they have fewer datapoints, as not every study focused on every aspect of material properties. There are also other factors, such as different scan strategies, microstructures, and mechanical test conditions, that are not captured in the dataset, leading to poorer imputation accuracy. As seen from the cumulative distribution plots (Figure 9) and distribution plots (Figure 10) of the three imputed datasets, GINN imputation results in the closest distribution to the original dataset.

The distribution of the kNN-imputed dataset has an acceptable deviation from the original distribution. However, an examination of the imputed dataset found that many imputed values for material properties are identical, even with different process parameters. The kNN algorithm did not manage to adequately capture the relationship between process parameters and material properties. Even so, it did successfully model the relationship between density and porosity, with all imputed values for these two variables adding up to 100%. There were also only a few instances where microhardness was lower than macrohardness.

Mean square error of the distributions is calculated and tabulated in Table 3. It was found that kNN performed
Volume 2 Issue 1 (2023) 9 https://doi.org/10.36922/msam.50
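The manual plausibility checks described in Section 3.1 (imputed values staying within the observed minimum and maximum, density and porosity summing to 100%, and microhardness exceeding macrohardness) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the dictionary keys, the 0.5 percentage-point tolerance, and the sample values are all assumptions made for illustration only.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (min, max, mean, sample standard deviation) of a list."""
    return min(values), max(values), mean(values), stdev(values)


def range_preserved(observed, completed):
    """True if imputation left the minimum and maximum unchanged.

    `observed` holds only the originally reported values; `completed`
    holds observed plus imputed values for the same variable.
    """
    return (min(completed) == min(observed)
            and max(completed) == max(observed))


def implausible_rows(rows, tol=0.5):
    """Flag rows that violate the two material-property checks.

    `tol` (percentage points) is an assumed tolerance on the
    density + porosity = 100% rule; it is not taken from the paper.
    """
    flagged = []
    for index, row in enumerate(rows):
        reasons = []
        if abs(row["density"] + row["porosity"] - 100.0) > tol:
            reasons.append("density + porosity != 100%")
        if row["microhardness"] <= row["macrohardness"]:
            reasons.append("microhardness not above macrohardness")
        if reasons:
            flagged.append((index, reasons))
    return flagged


# Hypothetical completed rows (not data from the study).
rows = [
    {"density": 99.2, "porosity": 0.8,
     "microhardness": 380.0, "macrohardness": 350.0},
    {"density": 98.0, "porosity": 3.5,
     "microhardness": 370.0, "macrohardness": 360.0},
    {"density": 99.5, "porosity": 0.5,
     "microhardness": 340.0, "macrohardness": 355.0},
]
flagged = implausible_rows(rows)

# Hypothetical energy density values (J/mm^3): observed vs. completed.
observed_ed = [60.0, 89.0, 120.0]
completed_ed = [60.0, 75.0, 89.0, 101.0, 120.0]
range_ok = range_preserved(observed_ed, completed_ed)
obs_summary = summarize(observed_ed)
```

In this sketch, the second row would be flagged by the density and porosity check and the third by the hardness check. The same `summarize` helper could be applied to the observed and completed values of each variable to produce a side-by-side comparison like the one reported in Table 2.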

