
Engineering Science in Additive Manufacturing
ML in additive manufacturing



categorized into macroscale and microscale structural characteristics. Macroscale structural characteristics cover various macroscopic unconformities such as delamination, warping, over/under-extrusion, and geometric deviation. Microscale structural characteristics refer to microscopic defects such as micro-cracks, balling, porosity, and lack of fusion. Frequently measured product property characteristics are physical properties such as mass, density, hardness, and surface roughness, as well as mechanical properties, including tensile strength, flexural strength, and fatigue life.12 Due to such a variety of relevant characteristics, datasets in AM processes are collected from different sources at different time points in different formats, leading to high data heterogeneity and posing a challenge to data management.54

Low data quality, including noise, biases, and inconsistencies, has been recognized as a primary challenge for AI-driven AM.7,12,54 Data quality assessment and improvement methods have become increasingly important in this context because they enhance the information richness of datasets. Conventional primary data quality dimensions, including completeness, timeliness, uniqueness, validity, accuracy, and consistency, cover most data quality concerns for data management while overlooking data quality characteristics that are crucial for AI, such as data biases.54 As AI-driven AM applications emerge, recent reviews and research papers have started to discuss data biases in the context of ML-based AM and to propose evaluation and mitigation methods.7,54,55,61,62 Measurement bias, omitted variable bias, aggregation bias, and representation bias are the most prevalent data biases in AM. Measurement bias occurs when data collection methods systematically distort the measurement of variables. Omitted variable bias arises when relevant features influencing the outcome are excluded from the analysis. Aggregation bias occurs when combining data across groups obscures meaningful patterns or relationships. Representation bias arises when the dataset does not adequately reflect the diversity of the population of interest. Biases in the dataset, if not mitigated, accumulate and propagate to the ML models and eventually lead to unreliable predictions.61 Therefore, metrics such as coverage and diversity have been integrated into adaptive sampling and data augmentation to evaluate and reduce representation bias.54 In contrast, other types of data biases remain rarely discussed in this domain.

The above-mentioned data challenges are intertwined in AI-driven AM applications and ultimately lead to poor modeling performance. The intertwined relationship between data size, dimensionality, quality, and ML model performance is shown in Figure 3, which consists of three aspects:

(1) Small data size and high dimensionality. Data scarcity arises when only a small-scale dataset is available for a high-dimensional task with heterogeneous data. As dimensionality increases, the dataset becomes sparser, leading to an elevated risk of overfitting.
(2) Small data size and low data quality. A small-scale dataset usually implies that only limited information can be extracted. Low data quality further compromises the information richness of a small-scale dataset with noise and redundancy, making it information-poor.
(3) Low data quality and high dimensionality. For low-quality, high-dimensional, and heterogeneous data, a sophisticated data management pipeline is required to perform data cleansing for a variety of data types and formats. In this case, feature engineering techniques must also be utilized to select or derive the most representative features by removing data noise and redundancy.63

When all three challenges occur concurrently, the limited information in the dataset becomes challenging to extract using ML, even with a sophisticated data management pipeline.

4. Model architecture, learning strategies and emerging trends in AI-driven AM

ML approaches developed to address AM concerns span over a decade of design, process, structure, and property-based applications. The distribution of these learning algorithms and methods is as diverse as ML applications in any engineering discipline. In the past, some reviews have focused on the applications of specific algorithms (e.g., feedforward neural networks [FFNNs], convolutional neural networks [CNNs]) due to their popularity and compatibility with datasets available for learning AM concerns. Literature reviews have also summarized the overall landscape of ML models and techniques in AM. In recent years, the AM community has expanded to developing more customized learning algorithms tailored to process and quality concerns, as well as applying more advanced and recent learning techniques to leverage their unique advantages. This section covers ML techniques applied to AM concerns in the past decade to draw perspectives for the future research and development of AI-driven AM and its subsequent adoption in the industry.

Due to its simplicity and suitability to AM datasets, shallow learning dominated the early years of ML applications in the field. Unlike DL, these simpler and less computationally expensive models (e.g., linear models including linear regression, logistic regression, ridge regression, lasso regression, and elastic net, as well as decision trees and ensembles12) required only small AM datasets47 to learn basic tasks (e.g., regression and clustering). With more and more neural network-based algorithms applied to growing AM
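As a minimal illustration of the data quality discussion above, the sketch below computes two of the conventional dimensions named in the text (completeness and uniqueness) together with a simple coverage score of the kind used to assess representation bias. The record fields, the value range, the bin count, and the equal-width binning definition of coverage are all assumptions made for illustration; they are not taken from the cited works.

```python
def completeness(records, fields):
    """Fraction of (record, field) cells that hold a non-missing value."""
    total = len(records) * len(fields)
    filled = sum(1 for r in records for f in fields if r.get(f) is not None)
    return filled / total if total else 0.0


def uniqueness(records, fields):
    """Fraction of records that are distinct on the given fields."""
    keys = [tuple(r.get(f) for f in fields) for r in records]
    return len(set(keys)) / len(keys) if keys else 0.0


def coverage(values, low, high, n_bins):
    """Fraction of equal-width bins over [low, high] with at least one sample.

    This binning-based definition is an assumption chosen for simplicity.
    """
    width = (high - low) / n_bins
    occupied = set()
    for v in values:
        if v is None or not (low <= v <= high):
            continue  # skip missing or out-of-range measurements
        occupied.add(min(int((v - low) / width), n_bins - 1))
    return len(occupied) / n_bins


# Hypothetical process records; field names and values are illustrative only.
FIELDS = ["laser_power_w", "scan_speed_mm_s", "density_pct"]
records = [
    {"laser_power_w": 200.0, "scan_speed_mm_s": 800.0, "density_pct": 99.1},
    {"laser_power_w": 210.0, "scan_speed_mm_s": 820.0, "density_pct": None},
    {"laser_power_w": 200.0, "scan_speed_mm_s": 800.0, "density_pct": 99.1},
    {"laser_power_w": 350.0, "scan_speed_mm_s": None, "density_pct": 97.4},
]

powers = [r["laser_power_w"] for r in records]
print("completeness:", completeness(records, FIELDS))
print("uniqueness:", uniqueness(records, FIELDS))
print("coverage of laser power:", coverage(powers, 100.0, 400.0, 6))
```

A low coverage score here would indicate that large parts of the assumed process window have no samples, which is one way representation bias can show up before any model is trained.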

            Volume 1 Issue 1 (2025)                         5                          doi: 10.36922/ESAM025040004
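The statement in aspect (1) that a dataset becomes sparser as dimensionality increases can be made concrete with a small sketch. It assumes a fixed sample count and features scaled to the unit hypercube, and reports (a) an upper bound on the fraction of grid cells that the samples can occupy and (b) the mean nearest-neighbor distance between uniformly random points. The sample count, bin count, and uniform sampling are illustrative assumptions, not properties of any AM dataset.

```python
import math
import random


def max_occupied_fraction(n_samples, bins_per_dim, n_dims):
    """Upper bound on the fraction of grid cells that n_samples can occupy."""
    n_cells = bins_per_dim ** n_dims
    return min(n_samples, n_cells) / n_cells


def mean_nearest_neighbor_distance(n_samples, n_dims, seed=0):
    """Mean Euclidean distance from each random point to its closest neighbor."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_samples)]
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / n_samples


# Same 50 samples' worth of data, spread over more and more features.
for d in (1, 2, 3, 6, 20):
    frac = max_occupied_fraction(50, 5, d)
    dist = mean_nearest_neighbor_distance(50, d)
    print(f"dims={d:2d}  max occupied fraction={frac:.6f}  mean NN distance={dist:.3f}")
```

With the sample count held fixed, the occupied fraction bound falls geometrically with each added dimension, which is the sparsity that raises the overfitting risk described in the text.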