International Journal of AI for
Materials and Design
A unified industrial AI foundation framework
AI foundation framework, guiding knowledge extraction, data preparation, model development, and evaluation in a structured and systematic manner.

The process began with the knowledge module. Under its guidance, domain knowledge was extracted from existing publications related to the dataset by leveraging the latest LLMs, GPT-4o, through OpenAI’s application programming interface. Researchers interacted with the LLM using targeted research questions such as: “Can you provide the background information about this dataset?;” “In [target] paper, how did the authors perform data processing?;” and “How can one develop specific ML models such as 1D-CNN, LSTM, or Transformer for this task?” GPT-4o provided structured responses, summaries, and code generation examples, helping researchers quickly understand the dataset’s background, signal characteristics, data preparation requirements, feature extraction strategies, and model development practices.

The data module was applied to ensure systematic preprocessing. A structured pipeline was designed under the guidance of domain knowledge from prior studies. The process began with data cleaning to improve data quality, followed by data segmentation to increase the number of usable samples for model training. Next, feature engineering was performed using the fast Fourier transform to convert raw time-series signals into frequency domain features. The dataset was then split into training, validation, and test sets in a 3:1:1 ratio to ensure robust model evaluation. Throughout this stage, researchers continued to interact with LLMs to validate the soundness of their preprocessing strategies or to quickly obtain code examples for implementing new ideas.

After ensuring that the data were AI-ready, in the model module, researchers developed and evaluated eight different AI models based on previous methodologies and their expertise. These included: (i) tree-based model (decision tree and random forest); (ii) CNN-based model (naive 1D-CNN, residual 1D-CNN); (iii) long short-term memory (LSTM)-based model (naive LSTM, bi-LSTM, hybrid-LSTM); and (iv) transformer-based model (vanilla transformer). Deep learning models were selected for their ability to automatically extract complex patterns from large-scale, frequency-domain features and for their proven robustness in handling variations in operating conditions without the need for extensive manual feature engineering. Hyperparameter tuning and performance evaluation were performed following the experience indicated in the previous research and researchers’ development experience.

The classification accuracy and confusion matrix are presented in Figure 3, demonstrating high performance across various architectures, with the transformer model achieving the highest accuracy of 99.47%. These results highlight the potential of advanced deep learning architectures in industrial fault diagnosis. Compared with previous studies on similar gearbox fault classification tasks, the proposed approach demonstrates clear improvement. For instance, Su and Lee⁶⁹ developed a residual CNN that achieved 96.99% accuracy, Vaerenberg et al.⁷⁰ used power spectral density preprocessing, log normalization, and a 3-layer CNN to reach 96.9% accuracy, and Gauriat et al.⁷¹ proposed multi-class neural additive models with 92.03% accuracy. The higher performance achieved in this study is attributed to not only the model design but also the systematic application of the industrial AI foundation framework. The structured guidance from the knowledge, data, and model modules ensured consistent data preprocessing, appropriate model selection, and effective hyperparameter tuning, ultimately enhancing reliability, scalability, and real-world applicability.

6. Future direction

To further strengthen the industrial AI foundation framework, several key directions require further exploration. One critical aspect is talent development. Incorporating 4P-based learning (principle, practice, problem-solving, and professional) and interdisciplinary training in AI/ML, engineering, and industrial applications could benefit the next generation of industrial AI practitioners. Another promising direction is data foundry, which aims to establish a standardized framework for industrial dataset collection, annotation, benchmarking, and management. A well-structured data foundry would enhance collaborative AI research, reproducibility, and cross-industry data sharing while also facilitating the hosting of the Industrial AI Data Challenge Competitions to promote innovation and benchmark AI model performance on industrial datasets. Meanwhile, the foundation framework can be extended toward discrete event dynamic systems and hybrid control systems. This would involve adapting the knowledge, data, and model modules to better handle event-based transitions, symbolic representations, and hierarchical system logic. Furthermore, future research should explore the development of an LLM-assisted intelligent knowledge management system to better make use of historical case studies, domain expertise, and best practices. Such a system could autonomously acquire, structure, and retrieve relevant information, providing researchers and engineers with contextualized and actionable insights to improve AI model development and decision-making in industrial applications.
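
As an illustration of the data module steps described in the case study (segmentation, fast Fourier transform, and a 3:1:1 split), the following minimal sketch may be helpful. It is not the authors' implementation. The function names, window length, step size, random shuffling, and seed are all assumptions introduced here, since the text does not report them.

```python
import numpy as np


def segment_signal(signal, window, step):
    """Cut a 1-D signal into fixed-length windows.

    `window` and `step` are hypothetical parameters; the paper does not
    state the segment length or overlap it used.
    """
    signal = np.asarray(signal, dtype=float)
    n_segments = (len(signal) - window) // step + 1
    if n_segments <= 0:
        raise ValueError("signal is shorter than one window")
    starts = np.arange(n_segments) * step
    return np.stack([signal[s:s + window] for s in starts])


def fft_features(segments):
    """Return the single-sided FFT magnitude spectrum of each segment (row)."""
    spectrum = np.fft.rfft(segments, axis=1)
    return np.abs(spectrum) / segments.shape[1]


def split_3_1_1(features, labels, seed=0):
    """Shuffle samples and split them 3:1:1 into train/validation/test.

    Shuffling with a fixed seed is an assumption made for reproducibility.
    """
    features = np.asarray(features)
    labels = np.asarray(labels)
    n = len(features)
    order = np.random.default_rng(seed).permutation(n)
    n_train = 3 * n // 5
    n_val = n // 5
    train, val, test = np.split(order, [n_train, n_train + n_val])
    return {
        "train": (features[train], labels[train]),
        "val": (features[val], labels[val]),
        "test": (features[test], labels[test]),
    }


if __name__ == "__main__":
    # Synthetic 50 Hz tone sampled at 1 kHz stands in for a vibration signal.
    t = np.arange(2000) / 1000.0
    raw = np.sin(2 * np.pi * 50 * t)
    segments = segment_signal(raw, window=200, step=200)
    features = fft_features(segments)
    labels = np.zeros(len(features), dtype=int)  # placeholder class labels
    parts = split_3_1_1(features, labels)
    print({name: x.shape for name, (x, _) in parts.items()})
```

One design point worth noting: if overlapping windows are used (`step < window`), shuffling before the split can place near-duplicate segments in both the training and test sets, so splitting by recording before segmentation may be the safer choice in practice.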
Volume 2 Issue 2 (2025) 64 doi: 10.36922/IJAMD025080006

