
of complex relationships between input features and target outputs. The output layer consists of a single node performing a final linear transformation, specifically configured for continuous value prediction in regression tasks.

The training process employs the Adam optimizer with a learning rate of 0.001.⁴² This optimizer was selected for its ability to dynamically adjust learning rates for each parameter based on first- and second-moment estimates of the gradients, enhancing both convergence speed and model performance. The training procedure processes data in batches to optimize memory usage and computational efficiency, leveraging stochastic mini-batch gradient computation. This batch processing approach, combined with multiple training epochs, allows the model to progressively refine its predictions through repeated iterations over the complete dataset.
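For illustration, this training configuration can be sketched in PyTorch as follows. Only the optimizer and learning rate are specified in this excerpt; the stand-in model, loss function, batch size, and epoch count below are placeholders rather than the implementation used in this work.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in model and data: the actual architecture and dataset pipeline are
# described elsewhere in the paper and are not reproduced in this excerpt.
model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
data = TensorDataset(torch.randn(256, 8), torch.randn(256, 1))
loader = DataLoader(data, batch_size=32, shuffle=True)  # batch size assumed

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # rate from the text
loss_fn = nn.MSELoss()  # assumed loss for continuous-value regression

for epoch in range(100):  # epoch count is not stated in this excerpt
    for features, targets in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(features), targets)
        loss.backward()   # stochastic mini-batch gradient computation
        optimizer.step()  # per-parameter update from moment estimates
```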

2.6. Graph neural network model

We developed a GNN model specifically designed to handle the structural complexities of Ru crystals. The model represents atomic structures as graphs, where each node corresponds to an atom and incorporates both static and dynamic properties. The node feature vector combines the constant atomic number (Z = 44 for Ru) with the components of the force vector acting on each atom (F_ix, F_iy, F_iz). Thus, each node incorporates both elemental and dynamic properties and is represented as:

x_i = [Z, F_ix, F_iy, F_iz]        (Ⅷ)
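As a minimal sketch, the node feature matrix of Equation (Ⅷ) can be assembled as follows, assuming PyTorch tensors; the function name and example values are illustrative.

```python
import torch

Z_RU = 44.0  # constant atomic number of ruthenium

def node_features(forces: torch.Tensor) -> torch.Tensor:
    """Assemble per-atom feature vectors x_i = [Z, F_ix, F_iy, F_iz].

    `forces` has shape (num_atoms, 3): one force vector per atom.
    Returns a (num_atoms, 4) node feature matrix.
    """
    z_column = torch.full((forces.shape[0], 1), Z_RU)
    return torch.cat([z_column, forces], dim=1)

# Example: three Ru atoms with arbitrary force components
x = node_features(torch.randn(3, 3))
```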
The edge structure of the graph is determined by a predefined connectivity matrix characteristic of Ru crystal structures. Each edge carries geometric information, specifically the Euclidean distance between connected atoms, calculated from their respective position vectors. This distance-based edge attribution ensures that spatial relationships between atoms are explicitly encoded in the graph structure, allowing the model to learn from both local atomic environments and their spatial arrangements:

d_ij = ||r_i − r_j||        (Ⅸ)

where r_i and r_j are the position vectors of atoms i and j, respectively.
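A corresponding sketch for Equation (Ⅸ), again assuming PyTorch tensors, computes the per-edge distances from a predefined connectivity matrix; the names and example connectivity are illustrative.

```python
import torch

def edge_distances(positions: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
    """Compute d_ij = ||r_i - r_j|| for every edge in the connectivity matrix.

    `positions` has shape (num_atoms, 3); `edge_index` has shape (2, num_edges)
    and holds the predefined (source, target) atom index pairs.
    Returns a (num_edges, 1) edge attribute tensor.
    """
    r_i = positions[edge_index[0]]
    r_j = positions[edge_index[1]]
    return (r_i - r_j).norm(dim=1, keepdim=True)

# Example: three atoms joined by two edges; the connectivity is illustrative
pos = torch.randn(3, 3)
edges = torch.tensor([[0, 1], [1, 2]])
d = edge_distances(pos, edges)
```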
The neural network architecture implements a hierarchical processing scheme through multiple graph convolutional layers. The initial layer transforms the four-dimensional input features into a 64-dimensional latent space, initiating the abstract representation of the atomic environment. The network progressively expands this representation through subsequent layers, first to 128 dimensions and then to 256 dimensions. Each convolutional operation is followed by a ReLU activation function,⁴⁵ introducing non-linearity and sparsity into the representation, thereby focusing the model's attention on the most significant features of the atomic structure.

The feature aggregation process culminates in a global max pooling operation that consolidates information across the entire graph structure. This pooled representation then passes through a series of fully connected layers, systematically reducing the dimensionality from 256 to 128, and finally to the output value representing the predicted energy and force of the crystal structure.
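This architecture can be sketched compactly with PyTorch Geometric. The choice of GCNConv as the convolution operator, the use of the interatomic distances as scalar edge weights, and the activation on the first fully connected layer are assumptions, as this excerpt does not name them explicitly.

```python
import torch
from torch import nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_max_pool

class RuCrystalGNN(nn.Module):
    """Sketch of the hierarchical GNN described above; GCNConv is an
    assumed operator choice, with distances passed as edge weights."""

    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(4, 64)     # [Z, F_ix, F_iy, F_iz] -> 64-dim latent
        self.conv2 = GCNConv(64, 128)   # expand to 128 dimensions
        self.conv3 = GCNConv(128, 256)  # expand to 256 dimensions
        self.fc1 = nn.Linear(256, 128)  # fully connected reduction
        self.fc2 = nn.Linear(128, 1)    # final predicted value

    def forward(self, x, edge_index, edge_weight, batch):
        x = F.relu(self.conv1(x, edge_index, edge_weight))  # ReLU after each conv
        x = F.relu(self.conv2(x, edge_index, edge_weight))
        x = F.relu(self.conv3(x, edge_index, edge_weight))
        x = global_max_pool(x, batch)   # graph-level max pooling
        x = F.relu(self.fc1(x))         # activation here is assumed
        return self.fc2(x)
```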
The model optimization utilizes the Adam optimizer with an initial learning rate of 0.001, carefully chosen to balance training efficiency with optimization stability. This configuration allows for effective convergence while mitigating the risk of overshooting the optimal parameters during training. The gradient-based optimization process enables the model to learn the complex relationships between atomic structure and crystal energy while maintaining numerical stability throughout the training procedure.

2.7. Accurate neural network engine for molecular energies model

We implemented a standard ANI model following the original framework specifications.⁴⁶ The model utilizes an atomic environment vector (AEV) computed with radial and angular cutoffs of 5.2 and 3.5 Å, respectively.⁴⁷ The radial basis functions employ EtaR parameters of 16.0, 8.0, 4.0, and 2.0, with corresponding ShfR values of 0.0, 0.9, 1.8, and 2.7. The angular component uses EtaA values of 8.0, 4.0, and 2.0, a zeta parameter of 32.0, and both ShfA and ShfZ values set to 0.0, 0.9, and 1.8.
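One way to express this AEV configuration is through TorchANI's AEVComputer. The mapping below is an illustrative sketch of the stated hyperparameters onto that API, not a reproduction of the paper's code.

```python
import torch
import torchani

# Illustrative mapping of the stated AEV hyperparameters onto TorchANI.
aev_computer = torchani.AEVComputer(
    Rcr=5.2,                                   # radial cutoff (Å)
    Rca=3.5,                                   # angular cutoff (Å)
    EtaR=torch.tensor([16.0, 8.0, 4.0, 2.0]),  # radial Gaussian widths
    ShfR=torch.tensor([0.0, 0.9, 1.8, 2.7]),   # radial shift centers
    EtaA=torch.tensor([8.0, 4.0, 2.0]),        # angular Gaussian widths
    Zeta=torch.tensor([32.0]),                 # angular sharpness
    ShfA=torch.tensor([0.0, 0.9, 1.8]),        # angular radial shifts
    ShfZ=torch.tensor([0.0, 0.9, 1.8]),        # angular angle shifts
    num_species=1,                             # single element (Ru)
)
```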
The neural network architecture consists of three primary blocks processing the AEV inputs. The first block transforms the input to 128 dimensions using a linear layer with SELU activation,⁴⁸ followed by layer normalization and 0.1 dropout. The second block maintains this dimension with identical transformation and regularization schemes. The third block halves the dimensionality while retaining SELU activation and layer normalization. The network concludes with a linear transformation to three output dimensions.
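The three blocks and the output head can be sketched as a plain PyTorch module. The AEV input length is fixed by the AEV configuration; the value used below is illustrative only.

```python
import torch
from torch import nn

aev_dim = 384  # AEV length follows from the AEV configuration; value illustrative

ani_head = nn.Sequential(
    # Block 1: project the AEV to 128 dimensions
    nn.Linear(aev_dim, 128), nn.SELU(), nn.LayerNorm(128), nn.Dropout(0.1),
    # Block 2: same dimension, identical regularization scheme
    nn.Linear(128, 128), nn.SELU(), nn.LayerNorm(128), nn.Dropout(0.1),
    # Block 3: halve the dimensionality; no dropout is stated for this block
    nn.Linear(128, 64), nn.SELU(), nn.LayerNorm(64),
    # Output: linear map to three output dimensions
    nn.Linear(64, 3),
)
```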
We also employed a batch size of 16 across 70 epochs of training, with an initial learning rate of 0.001 managed by the Adam optimizer. The learning rate was dynamically adjusted using a ReduceLROnPlateau scheduling mechanism, which reduced the rate by a factor of 0.5 when validation performance plateaued for five consecutive epochs.⁴⁴ Training stability was maintained through gradient clipping with a maximum norm of 1.0,

