of complex relationships between input features and target outputs.45 The output layer consists of a single node performing a final linear transformation, specifically configured for continuous value prediction in regression tasks.

The training process employs the Adam optimizer42 with a learning rate of 0.001. This optimizer was selected for its ability to dynamically adjust learning rates for each parameter based on first- and second-moment estimates of the gradients, enhancing both convergence speed and model performance. The training procedure processes data in batches to optimize memory usage and computational efficiency, leveraging stochastic gradient descent for gradient computation. This batch processing approach, combined with multiple training epochs, allows the model to progressively refine its predictions through repeated iterations over the complete dataset.
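As a concrete illustration of this training configuration, the sketch below wires the stated Adam settings into a standard PyTorch loop. The stand-in MLP, random data, batch size, and epoch count are placeholders for exposition; only the optimizer choice and the 0.001 learning rate come from the text.

```python
import torch
from torch import nn

# Minimal sketch of the training configuration described above.
# A small MLP stands in for the KAN, and random tensors stand in
# for the dataset; only the Adam settings come from the paper.
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
X, y = torch.randn(512, 8), torch.randn(512)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X, y), batch_size=32, shuffle=True
)

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # lr from the paper
loss_fn = nn.MSELoss()

for epoch in range(100):            # epoch count is illustrative
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb).squeeze(-1), yb)
        loss.backward()             # per-batch (stochastic) gradients
        optimizer.step()            # moment-based per-parameter update
```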
2.6. Graph neural network model between atomic structure and crystal energy while
We developed a GNN model specifically designed to handle the structural complexities of Ru crystals. The model represents atomic structures as graphs, where each node corresponds to an atom and incorporates both static and dynamic properties. The node feature vector combines the constant atomic number (Z = 44 for Ru) with the components of the force vector acting on each atom (F_ix, F_iy, F_iz). Thus, each node incorporates both elemental and dynamic properties and is represented as:

x_i = [Z, F_ix, F_iy, F_iz]  (Ⅷ)
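A minimal sketch of how the node feature matrix of Equation (Ⅷ) can be assembled, assuming PyTorch; the force components here are random placeholders rather than values from this work.

```python
import torch

# Node features per Equation (VIII): each atom i is described by
# [Z, F_ix, F_iy, F_iz]. Forces are dummy values for illustration.
n_atoms = 4
Z = 44.0                             # atomic number of Ru
forces = torch.randn(n_atoms, 3)     # (F_ix, F_iy, F_iz) per atom
x = torch.cat([torch.full((n_atoms, 1), Z), forces], dim=1)  # shape (n_atoms, 4)
```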
The edge structure of the graph is determined by a predefined connectivity matrix characteristic of Ru crystal structures. Each edge carries geometric information, specifically the Euclidean distance between connected atoms, calculated from their respective position vectors. This distance-based edge attribution ensures that spatial relationships between atoms are explicitly encoded in the graph structure, allowing the model to learn from both local atomic environments and their spatial arrangements:

d_ij = ||r_i − r_j||  (Ⅸ)

where r_i and r_j are the position vectors of atoms i and j, respectively.
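The edge attributes of Equation (Ⅸ) follow directly from the position vectors; in the sketch below, both the positions and the connectivity matrix edge_index are illustrative placeholders.

```python
import torch

# Edge attributes per Equation (IX): d_ij = ||r_i - r_j|| for each
# connected pair. Positions and connectivity are dummy placeholders.
positions = torch.randn(4, 3)              # r_i for each atom
edge_index = torch.tensor([[0, 0, 1, 2],   # source atoms i
                           [1, 2, 3, 3]])  # target atoms j
src, dst = edge_index
edge_attr = (positions[src] - positions[dst]).norm(dim=1, keepdim=True)
```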
The neural network architecture implements a hierarchical processing scheme through multiple graph convolutional layers. The initial layer transforms the four-dimensional input features into a 64-dimensional latent space, initiating the abstract representation of the atomic environment. The network progressively expands this representation through subsequent layers, first to 128 dimensions and then to 256 dimensions. Each convolutional operation is followed by a ReLU activation function,44 introducing non-linearity and sparsity into the representation, thereby focusing the model's attention on the most significant features of the atomic structure.

The feature aggregation process culminates in a global max pooling operation that consolidates information across the entire graph structure. This pooled representation then passes through a series of fully connected layers, systematically reducing the dimensionality from 256 to 128, and finally to the output value representing the predicted energy and force of the crystal structure.
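Under stated assumptions, this architecture maps onto a compact PyTorch Geometric module. The sketch below uses GCNConv as the convolution operator (the text does not name one), feeds the interatomic distances in as edge weights, and emits a single scalar per graph; these choices, and the class name RuGNN, are illustrative rather than the authors' implementation.

```python
import torch
from torch import nn
from torch_geometric.nn import GCNConv, global_max_pool

class RuGNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Hierarchical expansion: 4 -> 64 -> 128 -> 256, as described above
        self.conv1 = GCNConv(4, 64)      # [Z, F_ix, F_iy, F_iz] -> 64
        self.conv2 = GCNConv(64, 128)
        self.conv3 = GCNConv(128, 256)
        # Fully connected reduction: 256 -> 128 -> output
        self.fc1 = nn.Linear(256, 128)
        self.fc2 = nn.Linear(128, 1)     # scalar output per graph (assumed)

    def forward(self, x, edge_index, edge_weight, batch):
        x = torch.relu(self.conv1(x, edge_index, edge_weight))
        x = torch.relu(self.conv2(x, edge_index, edge_weight))
        x = torch.relu(self.conv3(x, edge_index, edge_weight))
        x = global_max_pool(x, batch)    # consolidate over the whole graph
        x = torch.relu(self.fc1(x))      # activation here is an assumption
        return self.fc2(x)
```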
The model optimization utilizes the Adam optimizer with an initial learning rate of 0.001, carefully chosen to balance training efficiency with optimization stability. This configuration allows for effective convergence while mitigating the risk of overshooting the optimal parameters during training. The gradient-based optimization process enables the model to learn the complex relationships between atomic structure and crystal energy while maintaining numerical stability throughout the training procedure.
2.7. Accurate neural network engine for molecular energies model

We implemented a standard ANI model following the original framework specifications.46 The model utilizes an atomic environment vector (AEV)47 computed with radial and angular cutoffs of 5.2 and 3.5 Å, respectively. The radial basis functions employ EtaR parameters of 16.0, 8.0, 4.0, and 2.0, with corresponding ShfR values of 0.0, 0.9, 1.8, and 2.7. The angular component uses EtaA values of 8.0, 4.0, and 2.0, a zeta parameter of 32.0, and both ShfA and ShfZ values set to 0.0, 0.9, and 1.8.
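These hyperparameters can be expressed through TorchANI's AEVComputer, as in the sketch below; the single-species (Ru-only) setting is our assumption.

```python
import torch
import torchani

# AEV configuration with the parameters listed above; num_species=1
# (Ru only) is an assumption, not stated in the text.
aev_computer = torchani.AEVComputer(
    Rcr=5.2,                                   # radial cutoff (Å)
    Rca=3.5,                                   # angular cutoff (Å)
    EtaR=torch.tensor([16.0, 8.0, 4.0, 2.0]),
    ShfR=torch.tensor([0.0, 0.9, 1.8, 2.7]),
    EtaA=torch.tensor([8.0, 4.0, 2.0]),
    Zeta=torch.tensor([32.0]),
    ShfA=torch.tensor([0.0, 0.9, 1.8]),
    ShfZ=torch.tensor([0.0, 0.9, 1.8]),
    num_species=1,
)
# aev_computer.aev_length gives the resulting descriptor size
```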
The neural network architecture consists of three primary blocks processing the AEV inputs. The first block transforms the input to 128 dimensions using a linear layer with SELU activation,48 followed by layer normalization and 0.1 dropout. The second block maintains this dimension with identical transformation and regularization schemes. The third block halves the dimensionality while retaining SELU activation and layer normalization. The network concludes with a linear transformation to three output dimensions.
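A minimal PyTorch rendering of the three blocks described above might look as follows; the AEV input dimension is a placeholder that would normally be taken from the AEV computer.

```python
from torch import nn

aev_dim = 384  # placeholder; in practice use aev_computer.aev_length

# Three-block atomic network as described above.
atomic_net = nn.Sequential(
    # Block 1: project the AEV to 128 dimensions
    nn.Linear(aev_dim, 128), nn.SELU(), nn.LayerNorm(128), nn.Dropout(0.1),
    # Block 2: same dimension, identical regularization
    nn.Linear(128, 128), nn.SELU(), nn.LayerNorm(128), nn.Dropout(0.1),
    # Block 3: halve the dimensionality, no dropout
    nn.Linear(128, 64), nn.SELU(), nn.LayerNorm(64),
    # Final linear transformation to three output dimensions
    nn.Linear(64, 3),
)
```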
We also employed a batch size of 16 across 70 epochs of training, with an initial learning rate of 0.001 managed by the Adam optimizer. The learning rate was dynamically adjusted using a ReduceLROnPlateau scheduling mechanism, which reduced the rate by a factor of 0.5 when validation performance plateaued for five consecutive epochs.
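This scheduling behavior, together with the gradient clipping noted next, can be sketched as follows; the tiny stand-in network, dummy batches, and loss are placeholders, while the 70 epochs, batch size of 16, factor of 0.5, patience of five epochs, and clipping norm of 1.0 come from the text.

```python
import torch
from torch import nn

# Stand-in network and data; only the hyperparameters come from the text.
net = nn.Linear(8, 3)
optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=5   # halve the LR after 5 flat epochs
)

for epoch in range(70):                 # 70 training epochs
    for xb in torch.randn(4, 16, 8):    # batches of size 16 (dummy data)
        optimizer.zero_grad()
        loss = net(xb).pow(2).mean()    # stand-in loss
        loss.backward()
        # clip gradients to a maximum norm of 1.0 for stability
        torch.nn.utils.clip_grad_norm_(net.parameters(), max_norm=1.0)
        optimizer.step()
    val_loss = loss.item()              # stand-in validation metric
    scheduler.step(val_loss)            # reduce LR when this plateaus
```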
Training stability was maintained through gradient clipping with a maximum norm of 1.0,

