The "Speaking the world into existence" research effort aims to apply recent progress in large language models to prompt-based creation of interactive 3D scenes. Integrating LLMs with a game engine should enable not only faster development of 3D content in domains such as gaming, mixed-reality applications, and animated films, but also spontaneous user-generated content in the course of an interactive experience. Our previous work has also demonstrated that giving a large AI model the ability to act in a simulated environment with feedback has the potential to improve the outputs of generative models by grounding them in the real world. One of the fundamental research directions of this project is to make large multimodal models more reliable in the domain of human-scale activity, not only by incorporating what has been said about the world, but also by testing results in a simulation of the world.
Beyond gaming and building compelling virtual worlds, easier creation of interactive 3D content and simulations by non-coding users opens up applications in education (e.g. allowing teachers to create immersive VR lessons in a short time), rapid creation of interactive training scenarios (imagine a group of first responders spinning up a simulation of a disaster site in the 30 minutes it takes them to arrive, so that they can prepare for the difficulties they may face there), and many other domains, such as real-time generation of virtual environments for therapy or creative applications.