VOXReality

Voice-driven interaction in XR spaces

Business Categories

Cultural Heritage · Manufacturing · Telecommunications

Project Timeline

October 1, 2022 – September 30, 2025

VOXReality

VOXReality brings together two technologies: Natural Language Processing (NLP) and Computer Vision (CV). Building on advances in data-driven methods such as machine learning (ML) and artificial intelligence (AI), the project combines them to improve Extended Reality (XR).

CV and ML drive much of the innovation in XR, while speech-based interfaces and text-based understanding improve human-machine and human-human interaction. VOXReality integrates language- and vision-based AI models efficiently, enabling both unidirectional and bidirectional exchanges between the two modalities. Vision systems underpin AR and VR experiences, and language understanding gives users a natural way to interact with backend XR systems, supporting multimodal experiences that combine vision and sound.

The project’s outcomes include:

  1. Pretrained XR models that merge language and vision AI, delivering immersive, natural experiences to boost XR adoption.
  2. Applications demonstrating innovations across diverse sectors, validated through three use cases:
    • Personal Assistants that support daily tasks through enhanced human-machine interaction.
    • Virtual Conferences held in fully online environments, with shared virtual spaces for global participation.
    • Theaters that integrate language translation, audiovisual synchronization, and speech-triggered AR VFX for enriched performances.

These advancements aim to redefine XR’s capabilities and drive its integration into everyday life.