
International Journal of AI for Materials and Design
Smart cockpit design with generative models



Table 1. Evaluation results of the human digital twin perception model

Method    Ergonomic scores mean error
MTL⁵³     0.83
Ours      0.66

Abbreviation: MTL: Multi-task learning.

Figure 5. Qualitative examples of the human digital twin perception model
Abbreviations: GT: Ground truth; Pre: Prediction.

criteria of users. The internal collaborative mechanisms of generative model-based MAS for personalized seat design are illustrated in Figure 6.

4.2.1. User requirement analysis agent

The capture and expression of user requirements are crucial components of the product design process. Users' emotional needs are often implicit and challenging to fully quantify through traditional research methods. Kansei Engineering provides a methodology to translate users' esthetic demands into design parameters by utilizing Kansei vocabulary to express users' affective responses. When combined with the powerful natural language processing capabilities of LLMs, this approach enables a more systematic and accurate generation of prompts for design inputs, guiding the designer agent to create more user-friendly designs that meet user expectations.

Initially, the user provides voice input detailing their personalized emotional demands. For instance, users might describe their desire for a new automobile seat to “be comfortable to use” or “look fashionable.” The user requirement analysis agent must identify Kansei words relevant to the design problem. These words can include adjectives that characterize the user's desired feelings, such as “comfortable” or “fashionable,” or nouns that represent sources of inspiration, such as “nature.” Subsequently, the LLM can reason and output visual design elements such as shapes, colors, and materials based on the identified Kansei vocabulary.

For example, if a user describes their requirement as “I want a vintage vehicle seat,” the system needs to deconstruct the concept of “vintage” into its key design elements to create a specific prompt for generating a vintage vehicle seat. The system identifies “vintage” as a key Kansei word and maps it to design elements in terms of shape, color, and materials. The specific prompt output by GPT-4o-mini for a vintage vehicle seat is as follows:
(i) Shape: Incorporate curved lines and ornate details typical of vintage furniture. The overall form should be reminiscent of classic armchairs or lounge chairs from the early 20th century.
(ii) Color: Use warm, rich tones such as mahogany, deep browns, or muted greens. The color palette should evoke a timeless and elegant feel.
(iii) Materials: Utilize high-quality materials such as leather for the upholstery, with tufted detailing to enhance the vintage look. The frame should be made of polished wood and include brass accents or embellishments to add an extra touch of sophistication.

4.2.2. Designer agent

In this subsection, the TAPS3D model,⁵⁴ a state-of-the-art text-to-3D generative model, has been fine-tuned to function as a designer agent for creating personalized vehicle seat design solutions. TAPS3D can generate high-quality, textured 3D meshes with arbitrary topologies based on design prompts derived from the user requirement analysis agent. This functionality enables the model to address both esthetic and functional specifications provided by customers. The implementation details for the text-to-3D generative model are illustrated in Figure 7, which primarily encompasses three components: pseudo caption generation, 3D textured mesh generator, and 3D textured mesh discriminator.

First, a 3D object dataset is rendered into multi-view images using Blender v2.90.0. The system then leverages the contrastive language-image pre-training (CLIP) model,⁵⁵ a pre-trained model for image-text similarity and zero-shot image classification, to establish pseudo captions. These pseudo captions are subsequently encoded as embeddings by the CLIP model. The caption embeddings, along with two random noise vectors that control the 3D shape and texture, respectively, are fed into the 3D textured mesh generator backbone from GET3D.⁵⁵ Finally, two discriminators (an RGB image discriminator and a silhouette discriminator), based on the StyleGAN⁴³ architecture, evaluate the authenticity of the generated objects.
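
The pseudo caption step of the designer agent can be illustrated with a minimal sketch. It assumes that pseudo captions are chosen by comparing image embeddings of the rendered views against embeddings of candidate captions, which is how image-text similarity models such as CLIP are typically used; the paper does not spell out the selection rule. The function names, the candidate captions, and the toy three-dimensional vectors are all hypothetical stand-ins for real CLIP embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pick_pseudo_caption(view_embeddings, caption_embeddings):
    """Return the candidate caption with the highest mean similarity over all views."""
    def mean_score(caption):
        emb = caption_embeddings[caption]
        total = sum(cosine(view, emb) for view in view_embeddings)
        return total / len(view_embeddings)

    return max(caption_embeddings, key=mean_score)


# Toy vectors standing in for CLIP embeddings (hypothetical values).
# Each entry in `views` represents one rendered view of the same 3D object.
views = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
candidates = {
    "a brown leather seat": [1.0, 0.1, 0.0],
    "a blue plastic chair": [0.0, 0.2, 1.0],
}

best = pick_pseudo_caption(views, candidates)
print(best)
```

Averaging over views is one simple design choice: it favors captions that describe the object consistently from every rendered angle rather than from a single favorable viewpoint.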

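
The requirement analysis flow of Section 4.2.1 (identify Kansei words, then ask the LLM for shape, color, and material descriptions) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the fixed vocabulary, the function names, and the instruction wording are hypothetical, and no LLM is called. In the described system the LLM itself identifies the Kansei words.

```python
import re

# Hypothetical Kansei vocabulary; the words are taken from the examples in the text.
KANSEI_VOCABULARY = ("vintage", "comfortable", "fashionable", "nature")
# Design elements the LLM is asked to describe, as listed in the text.
DESIGN_ELEMENTS = ("Shape", "Color", "Materials")


def extract_kansei_words(utterance, vocabulary=KANSEI_VOCABULARY):
    """Return vocabulary words found in the utterance, in order of first appearance."""
    found = []
    for token in re.findall(r"[a-z]+", utterance.lower()):
        if token in vocabulary and token not in found:
            found.append(token)
    return found


def build_llm_instruction(utterance, product="vehicle seat"):
    """Compose an instruction asking an LLM to expand Kansei words into design elements."""
    words = extract_kansei_words(utterance)
    if not words:
        raise ValueError("no Kansei word recognized in the user input")
    element_lines = "\n".join(
        f"({numeral}) {name}:"
        for numeral, name in zip(("i", "ii", "iii"), DESIGN_ELEMENTS)
    )
    return (
        f"User request: {utterance}\n"
        f"Kansei words: {', '.join(words)}\n"
        f"Describe a {product} that conveys these words, "
        f"filling in each design element:\n{element_lines}"
    )


instruction = build_llm_instruction("I want a vintage vehicle seat")
print(instruction)
```

The resulting text would be sent to the LLM, whose completed (i) to (iii) entries then serve as the design prompt for the designer agent.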
            Volume 1 Issue 3 (2024)                         13                             doi: 10.36922/ijamd.4220