Table 1. Evaluation results of the human digital twin perception model
Method    Ergonomic scores mean error
MTL53     0.83
Ours      0.66
Abbreviation: MTL: Multi-task learning.

Figure 5. Qualitative examples of the human digital twin perception model. Abbreviations: GT: Ground truth; Pre: Prediction.

criteria of users. The internal collaborative mechanisms of generative model-based MAS for personalized seat design are illustrated in Figure 6.

4.2.1. User requirement analysis agent
The capture and expression of user requirements are crucial components of the product design process. Users’ emotional needs are often implicit and challenging to fully quantify through traditional research methods. Kansei Engineering provides a methodology to translate users’ esthetic demands into design parameters by utilizing Kansei vocabulary to express users’ affective responses. When combined with the powerful natural language processing capabilities of LLMs, this approach enables a more systematic and accurate generation of prompts for design inputs, guiding the designer agent to create more user-friendly designs that meet user expectations.

Initially, the user provides voice input detailing their personalized emotional demands. For instance, users might describe their desire for a new automobile seat to “be comfortable to use” or “look fashionable.” The user requirement analysis agent must identify Kansei words relevant to the design problem. These words can include adjectives that characterize the user’s desired feelings, such as “comfortable” and “fashionable,” or nouns that represent sources of inspiration, such as “nature.” Subsequently, the LLM can reason and output visual design elements such as shapes, colors, and materials based on the identified Kansei vocabulary.
For example, if a user describes their requirement as “I want a vintage vehicle seat,” the system needs to deconstruct the concept of “vintage” into its key design elements to create a specific prompt for generating a vintage vehicle seat. The system identifies “vintage” as a key Kansei word and maps it to design elements in terms of shape, color, and materials. The specific prompt output by GPT-4o-mini for a vintage vehicle seat is described as follows (an illustrative code sketch of this step is given after the list):
(i) Shape: Incorporate curved lines and ornate details typical of vintage furniture. The overall form should be reminiscent of classic armchairs or lounge chairs from the early 20th century.
(ii) Color: Use warm, rich tones such as mahogany, deep browns, or muted greens. The color palette should evoke a timeless and elegant feel.
(iii) Materials: Utilize high-quality materials such as leather for the upholstery, with tufted detailing to enhance the vintage look. The frame should be made of polished wood and include brass accents or embellishments to add an extra touch of sophistication.
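To make this step concrete, the following is a minimal sketch of how such a requirement analysis agent could be built around GPT-4o-mini. It assumes the OpenAI Python client and an illustrative system prompt with a JSON output schema (kansei_words, shape, color, materials); the exact prompting strategy and Kansei lexicon used in this work are not reproduced here.

```python
# Minimal sketch of the user requirement analysis agent (illustrative only).
# Assumes the OpenAI Python client; the actual system prompt, Kansei lexicon,
# and output schema used in the paper are not specified here.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a user requirement analysis agent for vehicle seat design. "
    "Identify the Kansei words in the user's request and map each one to "
    "concrete design elements. Respond with JSON containing the keys "
    "'kansei_words', 'shape', 'color', and 'materials'."
)

def analyze_requirement(user_utterance: str) -> dict:
    """Turn a free-form user request into a structured design prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_utterance},
        ],
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    design_prompt = analyze_requirement("I want a vintage vehicle seat.")
    # Expected structure (content will vary), e.g.:
    # {"kansei_words": ["vintage"],
    #  "shape": "curved lines and ornate details ...",
    #  "color": "warm, rich tones such as mahogany ...",
    #  "materials": "leather upholstery with tufted detailing ..."}
    print(json.dumps(design_prompt, indent=2))
```

The structured fields returned by such an agent can then be concatenated into the text prompt that is passed on to the designer agent.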
4.2.2. Designer agent

In this subsection, the TAPS3D54 model, a state-of-the-art text-to-3D generative model, has been fine-tuned to function as a designer agent for creating personalized vehicle seat design solutions. By leveraging advanced AI technologies, TAPS3D can generate high-quality, textured 3D meshes with arbitrary topologies based on design prompts derived from the user requirement analysis agent. This functionality enables the model to address both esthetic and functional specifications provided by customers. The implementation details for the text-to-3D generative model are illustrated in Figure 7, which primarily encompasses three components: pseudo-caption generation, a 3D textured mesh generator, and a 3D textured mesh discriminator.

First, a 3D object dataset is rendered into multi-view images using Blender v2.90.0. The system then leverages the contrastive language-image pre-training (CLIP)55 model, a pre-trained model for image-text similarity and zero-shot image classification, to establish pseudo-captions. These pseudo-captions are subsequently encoded as embeddings by the CLIP model.
The caption embeddings, along with two random noise vectors that control the 3D shape and texture, respectively, are fed into the 3D textured mesh generator backbone from GET3D.55 Finally, two discriminators (an RGB image discriminator and a silhouette discriminator), based on the StyleGAN43 architecture, evaluate the authenticity of the generated objects.
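The sketch below illustrates only this conditioning and discrimination data flow, using placeholder PyTorch modules rather than the actual GET3D/TAPS3D networks: the caption embedding conditions both branches, the two noise vectors drive the shape and texture style codes separately, and two image discriminators (RGB and silhouette) would score differentiably rendered views of the generated mesh. All module names and dimensions are assumptions for illustration.

```python
# Structural sketch of the conditioning and discrimination data flow
# (placeholder modules, not the actual GET3D/TAPS3D implementation).
# The caption embedding conditions both branches; z_shape and z_texture are
# the two random noise vectors that drive geometry and texture, respectively.
import torch
import torch.nn as nn

class TexturedMeshGenerator(nn.Module):
    def __init__(self, text_dim=512, z_dim=128, w_dim=512):
        super().__init__()
        # Map (noise, caption embedding) to per-branch style codes.
        self.shape_mapping = nn.Sequential(
            nn.Linear(z_dim + text_dim, w_dim), nn.ReLU(), nn.Linear(w_dim, w_dim))
        self.texture_mapping = nn.Sequential(
            nn.Linear(z_dim + text_dim, w_dim), nn.ReLU(), nn.Linear(w_dim, w_dim))
        # Placeholder heads standing in for the surface and texture-field
        # synthesis networks of the real backbone.
        self.geometry_head = nn.Linear(w_dim, 3 * 1024)   # e.g., 1024 vertex offsets
        self.texture_head = nn.Linear(w_dim, 3 * 1024)    # e.g., per-vertex RGB

    def forward(self, z_shape, z_texture, caption_embedding):
        w_shape = self.shape_mapping(torch.cat([z_shape, caption_embedding], dim=-1))
        w_texture = self.texture_mapping(torch.cat([z_texture, caption_embedding], dim=-1))
        vertices = self.geometry_head(w_shape).view(-1, 1024, 3)
        colors = torch.sigmoid(self.texture_head(w_texture)).view(-1, 1024, 3)
        return vertices, colors

class ImageDiscriminator(nn.Module):
    """Stand-in for a StyleGAN-style discriminator on rendered images."""
    def __init__(self, in_channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1))

    def forward(self, images):
        return self.net(images)  # realism score per image

# Two discriminators: one for rendered RGB views, one for silhouettes (masks).
rgb_discriminator = ImageDiscriminator(in_channels=3)
silhouette_discriminator = ImageDiscriminator(in_channels=1)

generator = TexturedMeshGenerator()
caption_embedding = torch.randn(1, 512)          # from the CLIP text encoder
z_shape, z_texture = torch.randn(1, 128), torch.randn(1, 128)
vertices, colors = generator(z_shape, z_texture, caption_embedding)
# In the full pipeline the mesh would be differentiably rendered into RGB and
# silhouette images before being scored by the two discriminators.
```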

