NVLink was co-developed with IBM and will be incorporated into the OpenPower architecture that IBM is spearheading, Nvidia said. NVLink will be used not only to connect a GPU to the motherboard, but also to connect GPUs to each other, with improvements of up to 5X in GPU-to-GPU scaling, Huang said.
Likewise, scaling is one of the problems that the new 3D architecture will help solve. Nvidia's existing Kepler architecture already provides 288 Gbytes/s of memory bandwidth, according to Huang, but that too will inevitably increase over time. By stacking memory and other chips on top of one another, "in a couple of years we're going to take bandwidth to a whole new level," Huang said.
The idea is to use these GPUs not only to run big-data simulations of weather, economics, and other computationally intensive problems, but also to render images photorealistically. The next step, Huang said, is to combine both: for example, Nvidia sent engineers to take high-dynamic-range photos of the stage itself, then added a realistic car model in the center and moved the "camera" about to explore it.
Combining realistic graphics with a dynamic scene has typically been the province of CGI movies, but that has also moved from the big screen to the computer monitor. Huang showed off a demonstration of the next-generation Unreal Engine 4 running on top of the new GeForce GTX Titan Z that looked, in places, completely real. But there's obviously a price: the Titan Z will cost $3,000, although it will provide 5,760 CUDA cores across the two Kepler GPUs inside it, 12GB of memory, and 8 teraflops of computing power. Oh, and it will consume 2,000 watts by itself.
Combining just three GTX Titan Zs, Huang said, would provide the computational power of the "Google Brain," Google's effort to model the human brain, which originally used a cluster of 16,000 computers.
And if that's not enough, Nvidia has an Iray VCA to offer you. Essentially, the VCA is a remote server designed as a "render farm" for companies, taking a scene and rendering it as quickly as possible. The technology uses what Nvidia calls "Iray," which models photons that fly through the air, bouncing off objects and being absorbed by them. Each VCA contains 8 GPUs for a total of 23,000 CUDA cores, which can access 12 Gbytes of memory per VCA. Each VCA works with 3D modeling packages like Maya and 3DS Max. "What would take an hour to render, now takes a minute," Huang said.
Each Iray VCA can be tied to others, using Nvidia software to connect them together and run them in parallel. Nvidia combined 19 Iray VCAs to produce the equivalent of a petaflop, which matches the fastest supercomputer in the world six years ago, Huang said.
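The figures quoted above can be cross-checked with a little arithmetic. The short Python sketch below uses only numbers reported in this article (rounded as reported); the variable and function names are illustrative, and the per-unit throughput is simply the petaflop figure divided evenly across the 19 units, which assumes each unit contributes equally.

```python
# Figures as quoted in the article (rounded as reported).
TITAN_Z_CUDA_CORES = 5760      # GeForce GTX Titan Z, two GPUs on one card
TITAN_Z_GPUS = 2
VCA_CUDA_CORES = 23_000        # "a total of 23,000 CUDA cores" per VCA
VCA_GPUS = 8
CLUSTER_UNITS = 19             # units combined in Nvidia's demonstration
CLUSTER_FLOPS = 1e15           # "the equivalent of a petaflop"


def cores_per_gpu(total_cores, gpu_count):
    """Average number of CUDA cores per GPU."""
    return total_cores / gpu_count


def flops_per_unit(total_flops, units):
    """Throughput per unit, assuming every unit contributes equally."""
    return total_flops / units


titan_z_per_gpu = cores_per_gpu(TITAN_Z_CUDA_CORES, TITAN_Z_GPUS)
vca_per_gpu = cores_per_gpu(VCA_CUDA_CORES, VCA_GPUS)
per_unit_tflops = flops_per_unit(CLUSTER_FLOPS, CLUSTER_UNITS) / 1e12

print(f"Titan Z cores per GPU: {titan_z_per_gpu:.0f}")
print(f"VCA cores per GPU:     {vca_per_gpu:.0f}")
print(f"Teraflops per unit:    {per_unit_tflops:.1f}")
```

The two per-GPU core counts come out nearly identical, so the rounded "23,000" figure is consistent with eight GPUs of the same class as those in the Titan Z.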