
Design+                                                               Da Vinci AI Tutor in art history learning



provided an optimal balance of responsiveness and cost-effectiveness. The responses generated by the model were displayed in a text box within Unity’s user interface (UI) Canvas, enabling clear and dynamic interaction during text-to-text communication.

To further enhance the user experience, provisions were made for converting voice input into text and vice versa. Whisper, an automatic speech recognition tool, was employed to transcribe user voice input into text. Although this feature demonstrated potential, it was not fully developed in the initial version of the project. Similarly, the final phase of this development step involved selecting a text-to-speech service, such as Amazon Polly or ElevenLabs, to convert the textual responses of ChatGPT into spoken words. This addition would allow users to engage in verbal conversations with the tutor, creating a more immersive and accessible experience.

These foundational steps ensured that the tutor was not only capable of understanding and generating complex responses but also fully integrated into Unity for a seamless and engaging user experience. The design decisions made during this phase reflected a commitment to balancing technological sophistication with user accessibility, laying the groundwork for further enhancements and features in subsequent iterations. This process underscores the collaborative and interactive nature of the technology, where careful integration is essential to translate computational capabilities into meaningful real-world applications.

The initial approach to integrating the tutor into Unity, which relied on the OpenAI API for direct interaction with the ChatGPT model, was ultimately set aside in favor of Convo.ai (https://convo.ai/), a more comprehensive solution designed to streamline back-end requirements for integrating LLMs within Unity. The platform offered a robust software development kit that addressed several limitations of the original method and ensured smoother implementation for developmental purposes. These features provided significant advantages over the earlier approach. One of its standout functionalities was the ability to create a custom “character description,” allowing developers to input detailed information about the AI persona, including a name, backstory, and specific characteristics (Figure 2). This customization extended to the inclusion of the “Knowledge Bank” feature, enabling the addition of specialized information tailored to the project. By doing so, it minimized the likelihood of model hallucinations and ensured that the avatar’s responses remained accurate and contextually appropriate. In addition, the “Personality & Style” settings in Convo.ai allowed for fine-tuning of traits such as openness, extraversion, and agreeableness, ensuring that the model reflected a personality befitting the historical figure of Leonardo da Vinci. The “Core AI Settings” provided another layer of control. Developers could select the foundational LLM to be used; for this project, GPT-4o was chosen for its advanced capabilities in generating nuanced and contextually accurate responses. Temperature settings were also adjustable, offering precise control over the randomness of the model’s outputs, a crucial feature for balancing creativity with reliability in responses.

Despite these advantages, Convo.ai presented challenges specific to the needs of the project. Its default avatar creator, Ready Player Me (https://readyplayer.me/), proved useful for initial development but lacked the historical fidelity required for a serious educational application (Figure 3). Specifically, creating a convincing 3D model of Leonardo da Vinci posed significant obstacles. Many character creation tools fell short in replicating period-specific attire, and models often failed to accurately depict distinctive






















                                              Figure 2. The “Core AI Settings” of Convo.AI
                                                Abbreviation: AI: Artificial intelligence.
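
The role of the temperature setting mentioned in the text can be illustrated with a minimal sketch, assuming the common softmax-with-temperature formulation used when sampling from language models. It is not the implementation used by Convo.ai or GPT-4o; the function names, candidate words, and scores below are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature.

    Low temperatures concentrate probability on the top-scoring option;
    high temperatures spread it more evenly across options.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    """Draw one token according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical candidate words and scores (assumptions, not model output).
tokens = ["fresco", "tempera", "oil"]
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # near-deterministic choice
high = softmax_with_temperature(logits, 2.0)  # flatter, more varied choice
choice = sample_token(tokens, logits, 0.2, random.Random(0))
```

In this sketch, a temperature of 0.2 places almost all of the probability on the highest-scoring word, whereas a temperature of 2.0 leaves the alternatives with substantial probability, which mirrors the trade-off between reliability and creativity described in the text.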


            Volume 2 Issue 2 (2025)                         10                               doi: 10.36922/dp.8365