Page 48 - IJAMD-1-3

P. 48

International Journal of AI for
Materials and Design
Metal AM porosity prediction using ML

a layer is critical in the manufacturing process since a slight separately on the “low” and “high” datasets were trained.
variance in the porosity at this level could affect the overall Then, the models evaluate their performance on the unseen
quality of the built product. held-out test dataset. Figure 9 illustrates the absolute error
and RMSE scores of the two models. The black dotted line
As shown in Figure 7, the significant split in the
distribution (x-axis) observed in Figure 3 is absent once represents the identity line, and the red line denotes the
we split the dataset. Hence, there is no clear distinction line of best fit.
between the two distributions once we separate the dataset Figure 10A shows the performance of the XT regressor
into “low” and “high” subsets. Therefore, predicting on the “low” porosity dataset. The model achieves a low
the porosity level should be more challenging than RMSE score of 0.0367, echoed in the absolute n error
classification. plot, as the best-fit line fairly aligns with the identity line.
Likewise, the model has also achieved a relatively low RMSE
3. Results and discussion score of 0.108 on the “high” porosity dataset (Figure 10B).

3.1. Regression However, the best-fit line has a slightly different slope
compared to the “low” dataset result, showcasing a slightly
All the regression models were evaluated using the RMSE weaker performance.
separately on the “low” and the “high” porosity training
datasets. The models’ performances are shown in Figure 8. Table 2 provides a summary of the 10 regression
In Figure 8A shows the performance of the regression models, including the general speed of each model based
models on the “low” porosity dataset. Linear Regression on the given dataset size, assuming GridSearch CV is
and ensemble models (which are RF, XT, GB, Adaptive used for hyperparameter tuning, which is considered
Boosting [AdaBoost], and eXtreme Gradient Boosting part of the model training. The models displayed low
[XGBoost]) attained similar RMSE values, whereas the XT RMSE values comparable to those reported by Ho
57
model achieved the lowest RMSE score of 0.04, signifying et al. who achieved RMSE values of around 5% when
better performance. Regarding the performance on the using DL models including CNN and transfer learning
“high” porosity dataset, both XT and AdaBoost are tied models such as VGG‑16. These DL models generally take
much longer to train, highlighting a strong advantage of
with the best RMSE score of 0.12. To keep our evaluations 21
consistent and comparable, we selected XT regressor for these ML models. Liu et al. also reported the use of ML in
both datasets and further investigated this model in the predicting porosity, achieving significantly higher RMSE
values between 10% and 26%, which are much higher
hyper-parameter optimization stage.
compared to those achieved in this study.
The best hyper-parameters for the XT regressor The execution time of Python scripts can be measured
model from Grid SearchCV on the “low” dataset are: using the time function to provide an indication of model
n_estimators = 700, max_depth = 76, min samples_leaf = 2, and performance. LR is the fastest in terms of real-time
min_samples_split = 2. The best hyper-parameters discovered processing due to the simplicity of solving a single linear
for the model on the “high” dataset are: n_estimators = 841, equation and the absence of hyperparameter tuning.
max_depth = 138, min_samples_leaf = 2, and min_samples_ However, the preference for models like XT over LR,
split = 3.
despite similar RMSE values (0.05 and 0.04) and slower
As the final step, the two XT regressors supplied with speed, stems from their ability to capture non-linear
the hyper-parameters discovered in the previous step patterns in the data. LR is limited to linear relationships,

A B

Figure 8. Performance of all the models on training dataset (lower RMSE is better): (A) Low‑porosity layers and (B) high‑porosity layers.
Abbreviations: AdaB: Adaptive Boosting; DT: Decision Tree; GB: Gradient Boosting; k-NN: k-Nearest Neighbors; LR: Linear Regression; RF: Random
Forest; RMSE: Root‑mean‑square error; SVR: Support Vector Regression; XGB: eXtreme Gradient Boosting; XT: Extremely Randomized Trees.

Volume 1 Issue 3 (2024) 42 doi: 10.36922/ijamd.4812

43 44 45 46 47 48 49 50 51 52 53