Crowning Achievement: NVIDIA Research Model Enables Fast, Efficient Dynamic Scene Reconstruction
Nvidia

Crowning Achievement: NVIDIA Research Model Enables Fast, Efficient Dynamic Scene Reconstruction

Content streaming and engagement are entering a new dimension with QUEEN, an AI model by NVIDIA Research and the University of Maryland that makes it possible to stream free-viewpoint video, which lets viewers experience a 3D scene from any angle.

QUEEN could be used to build immersive streaming applications that teach skills like cooking, put sports fans on the field to watch their favorite teams play from any angle, or bring an extra level of depth to video conferencing in the workplace. It could also be used in industrial environments to help teleoperate robots in a warehouse or a manufacturing plant.

The model will be presented at NeurIPS, the annual conference for AI research that begins Tuesday, Dec. 10, in Vancouver.

“To stream free-viewpoint videos in near real time, we must simultaneously reconstruct and compress the 3D scene,” said Shalini De Mello, director of research and a distinguished research scientist at NVIDIA. “QUEEN balances factors including compression rate, visual quality, encoding time and rendering time to create an optimized pipeline that sets a new standard for visual quality and streamability.”

Reduce, Reuse and Recycle for Efficient Streaming

Free-viewpoint videos are typically created using video footage captured from different camera angles, like a multicamera film studio setup, a set of security cameras in a warehouse or a system of videoconferencing cameras in an office.

Prior AI methods for generating free-viewpoint videos either took too much memory for livestreaming or sacrificed visual quality for smaller file sizes. QUEEN balances both to deliver high-quality visuals — even in dynamic scenes featuring sparks, flames or furry animals — that can be easily transmitted from a host server to a client’s device. It also renders visuals faster than previous methods, supporting streaming use cases.

In most real-world environments, many elements of a scene stay static. In a video, that means a large share of pixels don’t change from one frame to another. To save computation time, QUEEN tracks and reuses renders of these static regions — focusing instead on reconstructing the content that changes over time.

Using an NVIDIA Tensor Core GPU, the researchers evaluated QUEEN’s performance on several benchmarks and found the model outperformed state-of-the-art methods for online free-viewpoint video on a range of metrics. Given 2D videos of the same scene captured from different angles, it typically takes under five seconds of training time to render free-viewpoint videos at around 350 frames per second.

This combination of speed and visual quality can support media broadcasts of concerts and sports games by offering immersive virtual reality experiences or instant replays of key moments in a competition.

In warehouse settings, robot operators could use QUEEN to better gauge depth when maneuvering physical objects. And in a videoconferencing application — such as the 3D videoconferencing demo shown at SIGGRAPH and NVIDIA GTC — it could help presenters demonstrate tasks like cooking or origami while letting viewers pick the visual angle that best supports their learning.

The code for QUEEN will soon be released as open source and shared on the project page.

QUEEN is one of over 50 NVIDIA-authored NeurIPS posters and papers that feature groundbreaking AI research with potential applications in fields including simulation, robotics and healthcare.

Generative Adversarial Nets, the paper that first introduced GAN models, won the NeurIPS 2024 Test of Time Award. Cited more than 85,000 times, the paper was coauthored by Bing Xu, distinguished engineer at NVIDIA. Hear more from its lead author, Ian Goodfellow, research scientist at DeepMind, on the AI Podcast:

Learn more about NVIDIA Research at NeurIPS.

See the latest work from NVIDIA Research, which has hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics.

Academic researchers working on large language models, simulation and modeling, edge AI and more can apply to the NVIDIA Academic Grant Program.

See notice regarding software product information.

This material is for informational purposes is not intended to be relied upon as a forecast, research or investment advice, and is not a recommendation, offer or solicitation to buy or sell any securities or to adopt any investment strategy. The opinions expressed are as of date of publication and are subject to change. Reliance upon information in this material is at the sole discretion of the reader. Past performance is not indicative of current or future results. This information provided is neither tax nor legal advice and investors should consult with their own advisors before making investment decisions. Investment involves risk including possible loss of principal.