
International Journal of AI for Materials and Design
Prediction of AM defect based on DL


contrastive divergence.20 Pre-training and fine-tuning are implemented while training a DBN. The following is specific information regarding the methodology and related algorithms.21,22 The network energy is expressed as E(v,h).

Let p(v) be the probability of a visible vector, and it is described as follows:

$$p(v) = \frac{1}{Z} \sum_{h} e^{-E(v,h)} \qquad \text{(VII)}$$

where $Z = \sum_{v,h} \exp(-E(v,h))$ is the partition function.

To train an RBM, weights are updated as follows:

$$w_{ij}(t+1) = w_{ij}(t) + \eta \cdot \frac{\partial \log(p(v))}{\partial w_{ij}} \qquad \text{(VIII)}$$

4.3. Regular DNNs

In DNN learning,23,24 each training tuple can be handled in two steps:23 propagating inputs forward and backpropagating the error. For an input vector $V = (V_1, V_2, \ldots, V_p)$, each hidden layer transforms its inputs and passes them from one layer to the next by applying an affine transform and a nonlinear mapping as follows:

$$z^{(1)} = V \qquad \text{(IX)}$$

$$y_j^{(l)} = \sum_{i=1}^{N^{(l-1)}} w_{ij}^{(l)} z_i^{(l-1)} + \theta_j^{(l)} \quad (l = 2, 3, \ldots, L) \qquad \text{(X)}$$

$$z_j^{(l)} = f\left(y_j^{(l)}\right) \qquad \text{(XI)}$$

where L is the number of layers; $N^{(l)}$, $\theta_j^{(l)}$, and $w_{ij}^{(l)}$ are the number of nodes in the $l$th layer, the bias of node j in the $l$th layer, and the weight of the connection from node i in the previous layer to node j of the $l$th layer, respectively; and f is the activation function (nonlinear).

In this paper, the Elman neural network, the Jordan neural network, the DNN with weights initialized by the DBN, and regular DNNs based on various algorithms were employed to establish DL models and predict the LOF of LPBF because all of these DL methods achieved good ACC, FNR, and FPR when large, high-quality databases such as “spambase” (https://archive.ics.uci.edu/ml/datasets/Spambase) were employed in the author’s past research work.

Four algorithms, “rprop+,” “rprop−,” “sag,” and “slr,” were used in this paper. “rprop+” and “rprop−” refer to resilient backpropagation with and without weight backtracking, respectively. “sag” and “slr” induce the usage of the modified globally convergent version of resilient backpropagation (grprop), which adjusts the learning rate associated with the smallest absolute gradient (sag) or the smallest learning rate (slr).

The calculation of ACC, FPR, and FNR (Equations I, II, and III) in this paper is based on the confusion matrix and its components. The confusion matrix (CM) is given as follows:

$$CM = \begin{bmatrix} FP & TN \\ FN & TP \end{bmatrix} \qquad \text{(XII)}$$

5. Results and discussion

5.1. Results of the Elman neural network and the Jordan neural network

Both the Elman neural network and the Jordan neural network are RNNs. An Elman neural network can be thought of as a model that is constantly unfolding as a sequence of predictors. A Jordan neural network uses a context layer to process sequential data.15 The dataset (Table 1) used in this paper is experimental data that can be regarded as a sequential dataset over time.

Tables 2 and 3 show part of the results of the Elman neural network and the Jordan neural network after min-max normalization and z-score standardization of the dataset, respectively. Neither the Elman nor the Jordan neural network worked well in establishing DL models on a small dataset (with unbalanced data) and predicting the LOF of LPBF. Most of the ACC values were low, and most of the FPR and FNR values were high or somewhat high. The reason is the small size of the dataset and the unbalanced data in it. The structures of context layers show the number of nodes in the context layers. For instance, c(8) indicates that eight nodes are in the context layer.

5.2. Results of the DNN with weights initialized by the DBN

Tables 4 and 5 list part of the results of the DNN with weights initialized by the DBN after min-max normalization and z-score standardization are employed, respectively. DNN-DBN did not work well in establishing DL models on a small dataset (with unbalanced data) and predicting the LOF of LPBF. This technique uses an input layer, an output layer, and two or three hidden layers. The performance (according to the ACC, FPR, and FNR) of the established DL models was not good due to the small size of the dataset and the unbalanced data in it. The structures of hidden layers indicate the number of nodes in the hidden layers. For instance, c(8, 6, 4) indicates that there are three hidden layers, and the number of nodes
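As an illustration of the forward-propagation step in Section 4.3 (Equations IX to XI), the following is a minimal Python sketch, not the implementation used in this paper. The function name `forward`, the nested-list weight layout, and the choice of `tanh` as the activation f are assumptions made only for this example.

```python
import math

def forward(v, weights, biases, f=math.tanh):
    """Forward pass of a fully connected network (sketch of Equations IX-XI).

    weights[k][i][j] is the weight from node i of one layer to node j of the
    next layer; biases[k][j] is the bias of node j in that next layer.
    The layout and the default activation are assumptions for illustration.
    """
    z = list(v)  # Equation IX: the first layer's output is the input vector
    for w, theta in zip(weights, biases):
        # Equation X: affine transform of the previous layer's outputs
        y = [
            sum(w[i][j] * z[i] for i in range(len(z))) + theta[j]
            for j in range(len(theta))
        ]
        # Equation XI: nonlinear mapping applied node by node
        z = [f(y_j) for y_j in y]
    return z

# Hypothetical network: 2 inputs, one hidden layer with 2 nodes, 1 output node.
hidden_w = [[0.5, -0.3], [0.8, 0.1]]
hidden_b = [0.0, 0.1]
output_w = [[1.0], [-1.0]]
output_b = [0.2]
prediction = forward([1.0, 2.0], [hidden_w, output_w], [hidden_b, output_b])
```

Backpropagating the error, the second step named in Section 4.3, is not covered by this sketch.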
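The three measures used to judge the models (ACC, FPR, and FNR) can be computed from the four confusion-matrix components as sketched below. Equations I to III are not reproduced in this section, so the sketch assumes the standard definitions: ACC = (TP + TN)/(TP + TN + FP + FN), FPR = FP/(FP + TN), and FNR = FN/(FN + TP). The function names and the label vectors are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, and FN for binary labels (positive class assumed to be 1)."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn

def acc_fpr_fnr(tp, tn, fp, fn):
    """Return (ACC, FPR, FNR) under the standard definitions assumed here."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    return acc, fpr, fnr

# Hypothetical, unbalanced labels: 3 positive samples and 5 negative samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
acc, fpr, fnr = acc_fpr_fnr(*confusion_counts(y_true, y_pred))
```

With these made-up labels, two of the three positive samples are missed, which gives a high FNR while ACC stays moderate. This is one way unbalanced data can distort a single measure.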
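The two preprocessing options mentioned in Sections 5.1 and 5.2, min-max normalization and z-score standardization, can be sketched for a single feature column as follows. This is only an illustration under stated assumptions: the paper does not say whether the sample or the population standard deviation is used, so the sample version is assumed, and the function names and data values are hypothetical.

```python
import statistics

def min_max_normalize(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: no spread to rescale, so return zeros (a design choice).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def z_score_standardize(column):
    """Center values on the mean and scale by the sample standard deviation (assumed)."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        return [0.0 for _ in column]
    return [(x - mean) / sd for x in column]

# Hypothetical feature values, not taken from Table 1.
feature = [2.0, 4.0, 6.0]
normalized = min_max_normalize(feature)       # [0.0, 0.5, 1.0]
standardized = z_score_standardize(feature)   # [-1.0, 0.0, 1.0]
```

In practice, the minimum, maximum, mean, and standard deviation would be estimated on the training data and reused for the test data, so that both sets are scaled consistently.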
            Volume 2 Issue 2 (2025)                         73                        doi: 10.36922/IJAMD025060005