Engineering Science in
Additive Manufacturing ML in additive manufacturing
categorized into macroscale and microscale structural characteristics. Macroscale structural characteristics cover various macroscopic nonconformities such as delamination, warping, over/under-extrusion, and geometric deviation. Microscale structural characteristics refer to microscopic defects such as micro-cracks, balling, porosity, and lack of fusion. Frequently measured product property characteristics are physical properties such as mass, density, hardness, and surface roughness, as well as mechanical properties, including tensile strength, flexural strength, and fatigue life. Due to such a variety of relevant characteristics,12 datasets in AM processes are collected from different sources at different time points in different formats, leading to high data heterogeneity and posing a challenge to data management.54

Low data quality, including noise, biases, and inconsistencies, has been recognized as a primary challenge for AI-driven AM.7,12,54 Data quality assessment and improvement methods have become increasingly important in this context because they can enhance the information richness of a dataset. Conventional primary data quality dimensions, including completeness, timeliness, uniqueness, validity, accuracy, and consistency, cover most data quality concerns for data management while overlooking data quality characteristics that are crucial for AI, such as data biases.54 As AI-driven AM applications emerge, recent reviews and research papers have started to discuss data biases in the context of ML-based AM and to propose evaluation and mitigation methods.7,54,55,61,62 Measurement bias, omitted variable bias, aggregation bias, and representation bias are the most prevalent data biases in AM. Measurement bias occurs when data collection methods systematically distort the measurement of variables. Omitted variable bias arises when relevant features influencing the outcome are excluded from the analysis. Aggregation bias occurs when combining data across groups obscures meaningful patterns or relationships. Representation bias arises when the dataset does not adequately reflect the diversity of the population of interest. Biases in the dataset, if not mitigated, will accumulate and propagate to the ML models and eventually lead to unreliable predictions. Therefore, metrics such as coverage and diversity have been integrated into adaptive sampling and data augmentation to evaluate and reduce representation bias.61 In contrast, other types of data biases remain rarely discussed in this domain.54

The above-mentioned data challenges are intertwined in AI-driven AM applications and ultimately lead to poor modeling performance. The intertwined relationship between data quantity, data quality, and ML model performance is shown in Figure 3, which consists of three aspects:

(1) Small data size and high dimensionality. Data scarcity arises when only a small-scale dataset is available for a high-dimensional task with heterogeneous data. As dimensionality increases, the dataset becomes sparser, leading to an elevated risk of overfitting.
(2) Small data size and low data quality. A small-scale dataset usually implies that only limited information can be extracted. Low data quality further compromises the information richness of a small-scale dataset with noise and redundancy, making it information-poor.
(3) Low data quality and high dimensionality. For low-quality, high-dimensional, and heterogeneous data, a sophisticated data management pipeline is required to perform data cleansing for a variety of data types and formats. In this case, feature engineering techniques must also be utilized to select or derive the most representative features by removing data noise and redundancy.63

When all three challenges occur concurrently, the limited information in the dataset becomes challenging to extract using ML, even with a sophisticated data management pipeline.

4. Model architecture, learning strategies and emerging trends in AI-driven AM

ML approaches developed to address AM concerns span over a decade of design, process, structure, and property-based applications. The distribution of these learning algorithms and methods is as diverse as ML applications in any engineering discipline. In the past, some reviews have focused on the applications of specific algorithms (e.g., feedforward neural networks [FFNNs], convolutional neural networks [CNNs]) due to their popularity and compatibility with datasets available for learning AM concerns. Literature reviews have also summarized the overall landscape of ML models and techniques in AM. In recent years, the AM community has moved toward developing more customized learning algorithms tailored to process and quality concerns, as well as applying more advanced and recent learning techniques to leverage their unique advantages. This section covers ML techniques applied to AM concerns in the past decade to draw perspectives for the future research and development of AI-driven AM and its subsequent adoption in the industry.

Owing to its simplicity and suitability for AM datasets, shallow learning dominated the early years of ML applications in the field. Unlike DL, these simpler and less computationally expensive models (e.g., linear models including linear regression, logistic regression, ridge regression, lasso regression, and elastic net, as well as decision trees and ensembles12) required only small AM datasets to learn basic tasks (e.g., regression and clustering).47 With more and more neural network-based algorithms applied to growing AM
Volume 1 Issue 1 (2025) 5 doi: 10.36922/ESAM025040004

