James Stanley


The future of virtual reality

Sat 3 September 2022
Tagged: futurology

I know very little about artificial intelligence. Mainly I just like to argue that machines can be sentient, because I don't see the difference between a computer and a brain. But I think the new prompt-driven AI stuff is incredibly powerful, and I don't think we're that many steps away from being able to create fully immersive virtual worlds that can be summoned at will from free-form English-language prompts.

GPT-3 takes a prompt and creates more text. Stable Diffusion takes a prompt and creates an image. Copilot takes a prompt and creates source code.

We only need to imagine a very small number of new technologies, along similar lines, in order to create 3-dimensional worlds, and interaction logic, from simple text prompts.

Just imagine an extra layer on top of Stable Diffusion that takes the same kinds of prompts, but generates a 3d world instead of a 2d image.

Imagine an extra layer on top of Copilot that generates an entire application rather than just small functions, and prime it to create common video game interaction patterns rather than leetcode solutions.

Maybe it would require very detailed prompts? OK, let's imagine using something like GPT-3 to turn a single coarse-grained "world prompt" into the fine-grained prompts for the underlying layers.

Now put on your VR headset. Connect to your friend who lives on the other side of the world. Give it a world prompt:

I'm a cop, my friend is a robber. I am chasing him through the streets of Chicago at night time. In the rain. If I catch him within 15 minutes I win, otherwise he wins. Also we each have a radar on our wrist that tells us the direction to the other player. Go!

The VR system comes with a built-in concept of what a 3d world is, how your motions in the real world control a character in the virtual world, how to link up multiple players into a single world over the Internet, etc.

Copilot++ has created the tiny bit of interaction logic: it sets a 15-minute timer, works out whether you've "caught" the robber, and implements the standard "radar arrow on the wrist" pattern that is probably just a Unity++ plugin.
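To give a feel for how tiny that interaction logic really is, here is a rough sketch in Python of the sort of thing it might generate. Everything in it is my own assumption rather than any real engine's API: the `Player` class, the 1.5-metre catch radius, and the function names are all made up for illustration, and a real game would work in 3d rather than on a flat plane.

```python
import math
from dataclasses import dataclass

# Assumed values: only the 15-minute limit comes from the world prompt.
CATCH_RADIUS_M = 1.5       # hypothetical: how close counts as "caught"
TIME_LIMIT_S = 15 * 60     # 15 minutes, in seconds


@dataclass
class Player:
    """Hypothetical player state on a flat ground plane."""
    x: float               # metres
    y: float               # metres
    heading: float = 0.0   # radians, anticlockwise from the +x axis


def caught(cop: Player, robber: Player) -> bool:
    """True if the cop is within the catch radius of the robber."""
    return math.hypot(robber.x - cop.x, robber.y - cop.y) <= CATCH_RADIUS_M


def radar_angle(me: Player, other: Player) -> float:
    """Angle to turn the wrist arrow, relative to my heading.

    Result is in radians, wrapped to [-pi, pi): positive means the other
    player is to my left (anticlockwise), negative means to my right.
    """
    bearing = math.atan2(other.y - me.y, other.x - me.x)
    return (bearing - me.heading + math.pi) % (2 * math.pi) - math.pi


def winner(cop: Player, robber: Player, elapsed_s: float):
    """Return "cop", "robber", or None if the game is still running."""
    if elapsed_s <= TIME_LIMIT_S and caught(cop, robber):
        return "cop"
    if elapsed_s >= TIME_LIMIT_S:
        return "robber"
    return None
```

The game loop would call `winner` every frame with the current positions and the elapsed time, and feed `radar_angle` into whatever draws the arrow on each player's wrist.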

StableDiffusion++ has created the 3d world (both geometry and textures) and continuously paints in new areas of the world as they come into view.

So now you've got a reasonably photorealistic, fully-interactive, immersive 3d world, created to your specification, and a game to play against your friend, also created to your specification. You're almost actually a cop chasing a robber through the streets of Chicago at night time. In the rain.

And you can set the prompt to whatever you want! Literally anything you want. Literally everything is possible. It's like AI Dungeon on steroids.

When you ask it for a world, your experience will be exactly like doing the thing for real, except maybe sometimes the people have extra limbs and none of the text makes sense.

Why can't this exist? I think this can exist. I think the only question is how long it's going to take.


