A new model that uses sound and speech to guide the robotic painting process, known as robotic synesthesia, was proposed by a research team from Netaji Subhas University of Technology (NSUT) and the Robotics Institute at Carnegie Mellon University (CMU).
Their method has been fully integrated into the robotic painting framework FRIDA, adding sound and speech to its existing input modes.

FRIDA can now experience the world around it with a heightened sense of synesthesia, perceiving sounds and emotions as colors and shades.
The model
The model integrates the artistic intentions of human users, which are expressed through text, styles, and sketches, into the robot system. Additionally, the team introduced audio input.
Two distinct approaches were developed, one for each type of sound:
- natural sounds, which encompass a wide range of sound samples from various sources
- speech sounds, which are a special subset of natural sounds characterized by language and tone
The input speech is transcribed to text using Whisper.
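The split between the two kinds of audio can be pictured with a small routing sketch. It is only an illustration, not the authors' code: the names `prepare_audio`, `AudioGuidance`, and `whisper_transcribe` are hypothetical, the caller is assumed to say whether a clip is speech or a natural sound, and the "base" Whisper model size is an arbitrary choice. The only real library calls are `whisper.load_model` and `model.transcribe` from the open-source `openai-whisper` package.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def whisper_transcribe(path: str) -> str:
    """Transcribe a speech clip with the open-source openai-whisper package."""
    import whisper  # imported lazily so the routing logic runs without it

    model = whisper.load_model("base")  # assumption: model size not given in the article
    return model.transcribe(path)["text"].strip()


@dataclass
class AudioGuidance:
    """Hypothetical container for what the painter receives from an audio clip."""
    kind: str                         # "speech" or "natural"
    audio_path: str                   # the raw clip is kept in both cases
    transcript: Optional[str] = None  # filled in only for speech


def prepare_audio(
    path: str,
    kind: str,
    transcribe: Callable[[str], str] = whisper_transcribe,
) -> AudioGuidance:
    """Route a clip: speech is transcribed to text, natural sound is passed on as audio."""
    if kind == "speech":
        return AudioGuidance(kind, path, transcript=transcribe(path))
    if kind == "natural":
        return AudioGuidance(kind, path)
    raise ValueError(f"unknown audio kind: {kind!r}")
```

Calling `prepare_audio("greeting.wav", "speech")` would run Whisper on the file, while `prepare_audio("rain.wav", "natural")` leaves the transcript empty. Passing a different `transcribe` function makes it possible to try the routing without installing Whisper.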

The approach is depicted in the picture above. Using multi-modal input in the form of text, style, sketch, and audio, the model generates a plan for the painting process. A Rethink Sawyer robot executes the plan, while a camera provides feedback that is used to modify the painting process in real time and guide the painting toward an emotional appearance.
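The plan, paint, observe, and replan cycle described here can be illustrated with a toy closed loop. This is a deliberately simplified sketch and not FRIDA's planner: the "canvas" is a short list of paint intensities, the "camera" reads the canvas directly, and the imperfect robot is modelled by a fixed `gain` below 1. All function names and parameters are made up for the illustration.

```python
from typing import List, Tuple

Stroke = Tuple[int, float]  # (canvas cell index, amount of paint to add)


def total_error(canvas: List[float], target: List[float]) -> float:
    """Sum of absolute differences between the canvas and the target."""
    return sum(abs(t - c) for c, t in zip(canvas, target))


def plan_strokes(canvas: List[float], target: List[float], max_strokes: int = 3) -> List[Stroke]:
    """Plan strokes for the cells that currently differ most from the target."""
    ranked = sorted(
        ((abs(t - c), i) for i, (c, t) in enumerate(zip(canvas, target))),
        reverse=True,
    )
    return [(i, target[i] - canvas[i]) for err, i in ranked[:max_strokes] if err > 0]


def execute_strokes(canvas: List[float], strokes: List[Stroke], gain: float = 0.7) -> List[float]:
    """Apply the strokes imperfectly: only `gain` of each planned amount lands."""
    painted = list(canvas)
    for index, amount in strokes:
        painted[index] += gain * amount
    return painted


def paint(target: List[float], rounds: int = 10, tolerance: float = 0.05):
    """Repeat plan -> execute -> observe until the canvas is close to the target."""
    canvas = [0.0] * len(target)
    history = [total_error(canvas, target)]
    for _ in range(rounds):
        if history[-1] <= tolerance:
            break
        strokes = plan_strokes(canvas, target)       # replan from the latest observation
        canvas = execute_strokes(canvas, strokes)    # the "robot" paints imperfectly
        history.append(total_error(canvas, target))  # the "camera" measures the result
    return canvas, history
```

Because each round replans from what was actually painted rather than from what was intended, the remaining error shrinks even though no single stroke lands exactly as planned.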
Results
The authors conducted a survey to assess the correlation between the input audio and generated paintings.
The results indicate that participants were able to correctly identify the emotion or natural sound used to create a given painting 43.3% of the time, which is higher than the random selection rate of 16.7%.
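The 16.7% baseline is what random guessing among six answer options would give (1/6 ≈ 16.7%). The number of options is not stated here, so six is an inference from the baseline. Under that assumption, the short calculation below compares the reported accuracy with chance; the function names are illustrative only.

```python
def chance_rate(n_options: int) -> float:
    """Probability of picking the right answer by guessing uniformly at random."""
    if n_options < 1:
        raise ValueError("n_options must be at least 1")
    return 1.0 / n_options


def lift_over_chance(observed: float, n_options: int) -> float:
    """How many times better than random guessing the observed accuracy is."""
    return observed / chance_rate(n_options)


baseline = chance_rate(6)           # assumption: six options, inferred from 16.7%
lift = lift_over_chance(0.433, 6)   # 43.3% is the accuracy reported in the survey
print(f"chance: {baseline:.1%}, lift: {lift:.1f}x")  # chance: 16.7%, lift: 2.6x
```

In other words, participants picked the correct emotion or sound roughly 2.6 times as often as guessing would predict. Whether that gap is statistically significant depends on the number of responses, which is not given here.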

Conclusion, future research
This approach may allow artists to explore the creative process in a more holistic way, considering not only the visual aspects of the painting but also the auditory and emotional aspects.
The emotions and sounds that are associated with a painting can offer valuable insights into an artist’s vision and the intentions behind their work.
Learn more:
- Research paper: “Robot Synesthesia: A Sound and Emotion Guided AI Painter” (on arXiv)
- Related work: “FRIDA: A Collaborative Robot Painter with a Differentiable, Real2Sim2Real Planning Environment” (on arXiv)