Bayesian optimization has been widely recognized for its convenience and accuracy.

Bayesian optimization is derived from the famous “Bayes theorem”:39

  p(f | D) = p(D | f) p(f) / p(D)                                    (XI)

where f, D_n = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}, x_n, and y_n represent the unknown objective function, the set of observed sampling points, the decision vector, and the observed value at a sampling point, respectively; p(D | f) and p(f) represent the likelihood distribution of y and the prior probability distribution of f (that is, the assumption about the state of the unknown objective function); and p(D) represents the marginal likelihood distribution of f. The function p(D) is usually difficult to calculate because it involves the product and integral of probability density functions. However, since it does not depend on f, it is treated as a normalization constant in Bayesian optimization. In addition, p(f | D) represents the posterior probability distribution of f, which describes the confidence in the unknown objective function after the prior has been updated with the observed data set.
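To make Equation XI concrete, the following is a minimal sketch of the posterior update, assuming a Gaussian-process (GP) surrogate with an RBF kernel; the paper specifies only a generic probabilistic proxy model, so the GP choice, the kernel, the noise level, and the sample values here are illustrative assumptions.

```python
# Minimal GP posterior update for the surrogate of f (cf. Equation XI);
# the GP surrogate, RBF kernel, and all constants are assumptions.
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential covariance between two sets of 1-D points.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Condition the GP prior p(f) on the data D_n = {(x_i, y_i)} to get
    # the posterior mean and variance of f at the query points.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    K_ss = rbf_kernel(x_query, x_query)
    alpha = np.linalg.solve(K, y_train)
    mean = K_s.T @ alpha
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, np.diag(cov)

# Three observed sampling points and a grid of candidate points.
x_n = np.array([0.1, 0.4, 0.9])
y_n = np.sin(3.0 * x_n)  # stand-in objective values, not from the paper
mu, var = gp_posterior(x_n, y_n, np.linspace(0.0, 1.0, 5))
```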
The Bayesian optimization algorithm consists of two core parts: the probabilistic proxy model and the acquisition function. The probabilistic proxy model includes the prior probability model and the observation model. The former is p(f), while the latter describes the mechanism by which the observed data are generated, that is, the likelihood distribution p(D | f). The posterior probability distribution p(f | D), containing the observations at the latest evaluation points, is obtained by using the Bayesian formula to update the probabilistic proxy model. According to the posterior probability distribution, the most “promising” next evaluation point is selected by maximizing the acquisition function, and an effective acquisition function can ensure that the selected sequence of evaluation points minimizes the total loss:

  Loss = Σ_{i=1}^{n} (y* − y_i)                                      (XII)

where y* represents the optimal solution among the current evaluation points.
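As an illustration of how the acquisition function balances exploitation and exploration, below is a sketch of expected improvement (EI), one common choice; the paper refers only to a generic acquisition function α, so EI and its exploration margin xi are assumptions made here.

```python
# Expected improvement (EI) acquisition function for minimization; EI is
# one common choice -- the paper only names a generic acquisition function.
import numpy as np
from scipy.stats import norm

def expected_improvement(mean, std, y_best, xi=0.01):
    # Credit candidates whose posterior mean falls below the best observed
    # value y*, weighted by the posterior uncertainty at each candidate.
    improvement = y_best - mean - xi
    z = improvement / np.maximum(std, 1e-12)
    return improvement * norm.cdf(z) + std * norm.pdf(z)
```

Maximizing EI over the candidate points then yields the next evaluation point x_{n+1}, which corresponds to step (a) of the loop in Table 1.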
The specific calculation process of the Bayesian optimization algorithm involves an iterative process of parameter updates, and its specific algorithm framework is presented in Table 1.40

Table 1. Bayesian optimization algorithm calculation process

Bayesian optimization algorithm
For n = 1, 2, …, do
  a. Obtain the next evaluation point x_{n+1} by maximizing the acquisition function α;
  b. Get the objective function value y_{n+1} at the evaluation point;
  c. Augment the data: D_{n+1} = {D_n, (x_{n+1}, y_{n+1})};
  d. Update the probabilistic proxy model;
end for
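Putting the pieces together, the following is a minimal sketch of the Table 1 loop; it reuses the gp_posterior and expected_improvement helpers sketched above (hypothetical names, not from the paper), with objective standing in for the true but expensive function f.

```python
# Sketch of the Table 1 loop over steps a-d, under the assumptions above.
import numpy as np

def bayesian_optimize(objective, candidates, x_n, y_n, n_iter=20):
    for _ in range(n_iter):
        # a. Maximize the acquisition function to pick x_{n+1}.
        mu, var = gp_posterior(x_n, y_n, candidates)
        ei = expected_improvement(mu, np.sqrt(var), y_n.min())
        x_next = candidates[np.argmax(ei)]
        # b. Evaluate the objective to obtain y_{n+1}.
        y_next = objective(x_next)
        # c. Augment the data: D_{n+1} = {D_n, (x_{n+1}, y_{n+1})}.
        x_n = np.append(x_n, x_next)
        y_n = np.append(y_n, y_next)
        # d. The proxy model is refit on the augmented data at the start
        #    of the next iteration, when gp_posterior is called again.
    return x_n[np.argmin(y_n)], y_n.min()
```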
Figure 9 displays the principle of the Bayesian optimization algorithm. Each repeated sampling generates a candidate minimum value for the objective function. After the first random sampling of the function, the second sampling assesses points near the possible minimum value and in regions that have not yet been sampled. This approach helps avoid entrapment in local optima, improves the proxy function’s approximation of the true objective function, and facilitates finding the minimum value of the objective function. In simple terms, the Bayesian optimization algorithm selects the next sampling point to maximize the return.

For the tuning method, the type and range of the tuning parameters need to be defined. In the ANN model, the learning rate, the number of hidden layers, and the number of neurons in each layer are selected as the tuning parameters. Among them, the learning rate in the optimizer is an important hyperparameter of the neural network, regulating the step size of each parameter update and directly affecting the convergence speed and performance of the model. The number of hidden layers and the number of neurons in each layer determine the structure of the model. To streamline parameter adjustment, the number of neurons is set to be the same in every layer, so that the number of hidden layers and the number of neurons per layer can be used as two independent tuning parameters.

Figure 10 presents the results of the Bayesian optimization of the hyperparameters in the ANN model. The learning rate ranges from 1e-4 to 0.1 and is sampled on an exponential scale to improve optimization efficiency. The number of hidden layers is an integer ranging from 1 to 5, while the number of neurons is an integer ranging from 1 to 140. The evaluation index MSE (Equation IV) of the model is reflected by the color of the data points, whose value range is given by the color scale on the right. Sampling occurs across the entire hyperparameter space, avoiding local optima. The points around the final optimal result are relatively dense, indicating that the model undergoes fine-tuning at the later stages of optimization. The final hyperparameters include two hidden layers, each with 32 neurons, and a learning rate of 9.29e-4.
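For the search itself, below is a sketch of how the hyperparameter space described above could be encoded, using Optuna as an illustrative optimizer (its default TPE sampler is one form of Bayesian optimization); the paper does not name a library, and build_and_train_ann is a hypothetical helper that trains the ANN and returns its validation MSE.

```python
# Hypothetical encoding of the search space from Figure 10 using Optuna;
# the library choice and the build_and_train_ann helper are assumptions.
import optuna

def objective(trial):
    # Learning rate on a log (exponential) scale from 1e-4 to 0.1.
    lr = trial.suggest_float("learning_rate", 1e-4, 0.1, log=True)
    # Integer-valued structural hyperparameters.
    n_layers = trial.suggest_int("hidden_layers", 1, 5)
    n_neurons = trial.suggest_int("neurons_per_layer", 1, 140)
    # Hypothetical helper: trains the ANN and returns validation MSE.
    return build_and_train_ann(lr, n_layers, n_neurons)

study = optuna.create_study(direction="minimize")  # minimize MSE
study.optimize(objective, n_trials=100)
print(study.best_params)
```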
3. Results and discussion

The established ANN model was used to map the relationship between the input characteristics

