"It’s much easier to brainstorm designs with simple sketches, and this technology is able to
convert sketches into highly realistic images," said Bryan Catanzaro, vice president of applied
deep learning research at NVIDIA.
The goal is to go from a semantic sketch map to a photorealistic image. To do this, the underlying artificial intelligence has to be trained not just on how scenes and objects look, but on how they interact with each other. That part is key for, say, not just placing a tree next to a body of water, but also having its reflection appear in the water, with realistic distortion.
"It’s like a coloring book picture that describes where a tree is, where the sun is, where the sky
is," Catanzaro said. "And then the neural network is able to fill in all of the detail and texture, and
the reflections, shadows and colors, based on what it has learned about real images."
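In practice, that "coloring book" is a semantic label map: an image in which every pixel stores a class ID (sky, tree, water) rather than a color. Systems like this typically one-hot encode the map and hand it to a conditional generator network that paints in the texture. The sketch below is a simplified illustration of that input format in PyTorch, not NVIDIA's code; the class IDs and the stand-in one-layer "generator" are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

# Hypothetical class IDs for a tiny semantic label map.
SKY, TREE, WATER = 0, 1, 2
NUM_CLASSES = 3

# A 4x4 "coloring book": each pixel is a class ID, not a color.
label_map = torch.tensor([
    [SKY,   SKY,   SKY,   SKY],
    [SKY,   TREE,  TREE,  SKY],
    [WATER, WATER, WATER, WATER],
    [WATER, WATER, WATER, WATER],
])

# One-hot encode to shape (1, NUM_CLASSES, H, W), the usual input
# format for semantic image synthesis generators.
one_hot = F.one_hot(label_map, NUM_CLASSES)              # (H, W, C)
one_hot = one_hot.permute(2, 0, 1).float().unsqueeze(0)  # (1, C, H, W)

# Stand-in for a real generator, which would map this label tensor
# to a 3-channel photorealistic image of the same height and width.
generator = torch.nn.Conv2d(NUM_CLASSES, 3, kernel_size=3, padding=1)
fake_image = generator(one_hot)
print(fake_image.shape)  # torch.Size([1, 3, 4, 4])
```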
Filling in that detail convincingly requires a massive amount of data, and so far NVIDIA has fed its GauGAN deep learning model a million Creative Commons images. To be clear, though, GauGAN does not just stitch together a bunch of preexisting photos and clean up the end result. What you're seeing are actually unique images.
"It's actually synthesizing new images, very similar to how an artist would draw something," Catanzaro added.
In a sense, GauGAN becomes the artist, constructing photorealistic images based on what the human artist is trying to create. It's nothing short of impressive, and there are numerous uses for something like this, from architectural design and urban planning to creating virtual worlds and scenes in games.