Design+ Da Vinci AI Tutor in art history learning
provided an optimal balance of responsiveness and cost-effectiveness. The responses generated by the model were displayed in a text box within Unity’s user interface (UI) Canvas, enabling clear and dynamic interaction during text-to-text communication.

To further enhance the user experience, provisions were made for converting voice input into text and vice versa. Whisper, an automatic speech recognition tool, was employed to transcribe user voice input into text. Although this feature demonstrated potential, it was not fully developed in the initial version of the project. Similarly, the final phase of this development step involved selecting a text-to-speech service, such as Amazon Polly or ElevenLabs, to convert the textual responses of ChatGPT into spoken words. This addition would allow users to engage in verbal conversations with the tutor, creating a more immersive and accessible experience.

These foundational steps ensured that the tutor was not only capable of understanding and generating complex responses but also fully integrated into Unity for a seamless and engaging user experience. The design decisions made during this phase reflected a commitment to balancing technological sophistication with user accessibility, laying the groundwork for further enhancements and features in subsequent iterations. This process underscores the collaborative and interactive nature of the technology, where careful integration is essential to translate computational capabilities into meaningful real-world applications.

The initial approach to integrating the tutor into Unity, which relied on the OpenAI API for direct interaction with the ChatGPT model, was ultimately set aside in favor of Convo.ai (https://convo.ai/), a more comprehensive solution designed to streamline back-end requirements for integrating LLMs within Unity. The platform offered a robust software development kit that addressed several limitations of the original method and ensured smoother implementation for developmental purposes. These features provided significant advantages over the earlier approach.

One of its standout functionalities was the ability to create a custom “character description,” allowing developers to input detailed information about the AI persona, including a name, backstory, and specific characteristics (Figure 2). This customization extended to the inclusion of the “Knowledge Bank” feature, enabling the addition of specialized information tailored to the project. By doing so, it minimized the likelihood of model hallucinations and ensured that the avatar’s responses remained accurate and contextually appropriate. In addition, the “Personality & Style” settings in Convo.ai allowed for fine-tuning of traits such as openness, extraversion, and agreeableness, ensuring that the model reflected a personality befitting the historical figure of Leonardo da Vinci. The “Core AI Settings” provided another layer of control. Developers could select the foundational LLM to be used; for this project, GPT-4o was chosen for its advanced capabilities in generating nuanced and contextually accurate responses. Temperature settings were also adjustable, offering precise control over the randomness of the model’s outputs, a crucial feature for balancing creativity with reliability in responses.

Despite these advantages, Convo.ai presented challenges specific to the needs of the project. Its default avatar creator, Ready Player Me (https://readyplayer.me/), proved useful for initial development but lacked the historical fidelity required for a serious educational application (Figure 3). Specifically, creating a convincing 3D model of Leonardo da Vinci posed significant obstacles. Many character creation tools fell short in replicating period-specific attire, and models often failed to accurately depict distinctive
Figure 2. The “Core AI Settings” of Convo.AI
Abbreviation: AI: Artificial intelligence.
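
As an illustration of how a temperature setting trades creativity against reliability, the following minimal Python sketch applies the standard temperature-scaled softmax to a set of candidate scores. It is not the internal implementation of Convo.ai or GPT-4o; the function name and the example scores are assumptions introduced purely for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by a temperature.

    Lower temperatures sharpen the distribution (more predictable output);
    higher temperatures flatten it (more varied output).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words (illustrative only).
example_logits = [2.0, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(example_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At a low temperature nearly all probability mass falls on the highest-scoring candidate, which favors consistent answers; at a higher temperature the candidates become more evenly weighted, which favors variety at the cost of predictability.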
Volume 2 Issue 2 (2025) 10 doi: 10.36922/dp.8365

