
First look: Nvidia DLSS 3 - AI upscaling enters a new dimension - Eurogamer.net
Sep 28, 2022 4 mins, 42 secs
DLSS 3 adds AI frame generation to its existing DLSS 2-based spatial upscaling.

Nvidia is talking about DLSS 3 as an enabler for next generation experiences, showing its highly impressive Racer RTX, Portal RTX and the Overdrive RT version of Cyberpunk - which, believe it or not, is effectively a path-traced rendition of the game.

At the nuts and bolts level, DLSS 3 is actually a suite of three different technologies Nvidia has spent years developing.

The existing DLSS 2 upscaling is joined by DLSS frame generation.

Essentially, the GPU renders two frames and then inserts a new frame between the two, generated via a mixture of game data such as motion vectors along with optical flow analysis, delivered by a revised fixed function block in the new Ada Lovelace architecture - which Nvidia says is three times faster than the last-gen Ampere.
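To make the ordering concrete, here is a minimal sketch of how generated frames slot in between rendered ones. It only models the sequence of frames, not the image synthesis: the function name and the choice to place each generated frame at the midpoint timestamp are illustrative assumptions, and the real frame is built from motion vectors and optical flow rather than from timestamps.

```python
def interleave_generated(rendered_times_ms):
    """Return the display sequence with one generated frame between each rendered pair.

    Assumption for illustration: each generated frame represents the midpoint
    in time between the two rendered frames around it.
    """
    sequence = []
    for prev, nxt in zip(rendered_times_ms, rendered_times_ms[1:]):
        sequence.append(("rendered", prev))
        sequence.append(("generated", (prev + nxt) / 2))
    if rendered_times_ms:
        # The final rendered frame has no successor to interpolate towards.
        sequence.append(("rendered", rendered_times_ms[-1]))
    return sequence


# Hypothetical timestamps for three rendered frames, 16ms apart.
for kind, t in interleave_generated([0.0, 16.0, 32.0]):
    print(kind, t)
```

Three rendered frames become five displayed frames, which is why the displayed frame-rate approaches double the rendered one over a long run.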

Because frames are now being buffered, extra latency is added to the pipeline, which Nvidia seeks to mitigate with its lag-reduction technology, Reflex.
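A rough way to see where the extra latency comes from is a toy pacing model. This is not Nvidia's actual pipeline: it assumes the in-between frame is shown as soon as the newer rendered frame is finished and the generated frame is ready, and that the newer rendered frame follows half a rendered-frame interval later to keep output evenly paced. The function name and the 60fps input are hypothetical; the 3ms generation cost is the figure quoted in this article.

```python
def added_latency_ms(rendered_fps, generation_cost_ms):
    """Toy estimate of the extra delay before a rendered frame reaches the display.

    Assumption: the rendered frame is held back by the generation cost plus
    half of one rendered-frame interval, so that output stays evenly paced.
    """
    rendered_interval_ms = 1000.0 / rendered_fps
    return generation_cost_ms + rendered_interval_ms / 2.0


# Hypothetical input: 60fps rendered, 3ms to generate a frame.
print(round(added_latency_ms(60, 3.0), 1))  # 11.3
```

Under these assumptions, the penalty shrinks as the rendered frame-rate rises, which is consistent with pairing frame generation with upscaling and Reflex.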

At worst, the game may have some extra latency added - we'll share some initial findings later on.

Nothing stops you from turning frame generation off entirely and simply banking the lag reduction Reflex offers, if that's what you prefer.

Because of the speed of the optical flow analyser in Ada Lovelace, prior Turing and Ampere cards cannot run DLSS frame generation.

For owners of RTX 2000 and RTX 3000 series cards, this means that DLSS 3 supported titles still offer DLSS 2 upscaling and Reflex latency benefits, but frame generation is off the table.

In looking at how the buffering works for frame generation, I'm reminded of the old AFR (alternate frame rendering) techniques used with SLI - where two graphics cards worked in tandem rendering every other frame.

This had a similar increase in latency, but without the mitigation of Reflex.

So, in effect, DLSS frame generation on the same GPU is taking the place of the second graphics card from the SLI days.

Still, the bottom line is that the likes of DLSS 2/FSR 2.x/XeSS speed up rendering and reduce latency - frame generation does not.

It's only really with prolonged eyeballing that you can tell where DLSS 3 frame generation has fallen short.

In Nvidia's keynote, we saw how Morrowind received a new RT look, but we've actually been hands-on with Portal RTX - and it's a truly beautiful new way to look at the game.

Path tracing is exceptionally heavy on the GPU, and the heavier the workload, the bigger the performance uplift provided - not only by DLSS 3 frame generation but by DLSS 2 upscaling too.

The table below shows a 3.19x performance uplift from DLSS 2 on its own, which rises to 5.29x with the addition of frame generation.
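The two multipliers can be separated to see what frame generation contributes by itself. The short calculation below uses only the 3.19x and 5.29x figures quoted above; the function name is illustrative.

```python
def frame_generation_factor(upscaling_uplift, total_uplift):
    """Uplift attributable to frame generation on top of upscaling."""
    return total_uplift / upscaling_uplift


print(round(frame_generation_factor(3.19, 5.29), 2))  # 1.66
```

So in this test, frame generation adds roughly 1.66x on top of DLSS 2 rather than a full 2x, which fits with the generated frames carrying a cost of their own.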

Also note the latency numbers: in this case, Nvidia Reflex is indeed nullifying the extra lag introduced by frame generation buffering.

Looking at the screenshot directly below, you can see that this quicktime event only sees a 15.2 percent increase in frame-rate with DLSS 2.

What's actually happening here is that at native 4K, we're GPU constrained, while DLSS 2 sees us hitting the CPU limit.

Because DLSS 3 frame generation does not rely on the CPU preparing instructions for the frames it creates, the performance increase kicks in despite the CPU being fully tapped out.

DLSS 3 frame generation is effectively doubling the frame-rate.
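The CPU-limited case can be sketched with a simple bottleneck model. The frame-rates below are hypothetical placeholders (Nvidia asked that actual numbers not be disclosed), chosen only so that DLSS 2 yields the 15.2 percent gain mentioned above; the idealised 2x multiplier is also an assumption, since real-world gains can be lower.

```python
def displayed_fps(cpu_fps, gpu_fps, frame_generation=False):
    """Idealised model: rendered rate is capped by the slower of CPU and GPU.

    Assumption: frame generation doubles the displayed rate without needing
    any extra CPU work for the frames it creates.
    """
    rendered = min(cpu_fps, gpu_fps)
    return rendered * 2 if frame_generation else rendered


CPU_LIMIT = 115.2    # hypothetical CPU-bound ceiling
NATIVE_GPU = 100.0   # hypothetical GPU rate at native 4K
DLSS2_GPU = 300.0    # hypothetical GPU rate once upscaling lightens the load

print(displayed_fps(CPU_LIMIT, NATIVE_GPU))        # GPU-bound at native
print(displayed_fps(CPU_LIMIT, DLSS2_GPU))         # CPU-bound with DLSS 2
print(displayed_fps(CPU_LIMIT, DLSS2_GPU, True))   # frame generation lifts the cap
```

In this model, upscaling can only raise performance as far as the CPU ceiling, while frame generation multiplies whatever rendered rate is left.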

Frame generation continues to increase frame-rate, however.

Also noteworthy here is that Reflex doesn't help latency much with DLSS 3 - the tech works by optimising the relationship between CPU and GPU, which is hard to achieve if the CPU is hitting its performance limit.

In this pre-release preview code, Nvidia Reflex latency figures with DLSS 3 can't match DLSS 2 with Reflex off, which I expect to be the 'unofficial' target.

After all, this isn't a twitch shooter or an esports competitive experience - but with that said, we'll definitely need to see how latency fares in more DLSS 3 titles going forward.

Apart from not disclosing frame-rate numbers, the only other restriction Nvidia asked for was to limit gen-on-gen comparisons to DLSS 2 on the older card versus DLSS 3 on the new one.

DLSS 2 on Ampere vs DLSS 3 on Ada Lovelace essentially provides a three-times increase to performance overall.

The same can be said of the preview build of Cyberpunk 2077 we played, where the performance multiplier gen-on-gen may not be as large as Portal RTX but the base frame-rate on the RTX 3090 Ti side is larger.

The discontinuities in the second AI generated frame are easy to see - but are they easy to see with each frame persisting for just 8.3 milliseconds?

Also pay attention to how different Spider-Man's arms and legs are from frame to frame: it indicates how fast the motion is across these three images, spanning a total of 24.9ms of game time.
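The timing figures here follow directly from the output frame-rate. As a quick check, an 8.3ms persistence per frame corresponds to 120fps output, and three such frames come to 24.9ms when the rounded figure is used (exactly 25ms unrounded). The function name is illustrative.

```python
def frame_time_ms(fps):
    """Time each frame stays on screen at a given frame-rate."""
    return 1000.0 / fps


per_frame = round(frame_time_ms(120), 1)
print(per_frame)                 # 8.3
print(round(3 * per_frame, 1))   # 24.9
```

At 120fps output with frame generation active, every other one of those 8.3ms frames is AI generated, so any single flawed frame is on screen only briefly.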

In this scenario, the DLSS 3 generated frame is close to perfect, with only the yellow HUD element having issues.

The next of the obvious questions: why isn't DLSS 3 frame generation available on RTX 2000 and 3000 cards?

Nvidia says that the optical flow analyser in Ada Lovelace is three times faster than the Ampere equivalent, which would have profound implications for DLSS 3's 3ms generation cost.
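A back-of-the-envelope calculation hints at why that matters. It rests on a deliberately naive assumption that is not confirmed by Nvidia: that the whole 3ms generation cost would scale with optical flow speed on older hardware. The function names and the 120fps output target are illustrative.

```python
ADA_GENERATION_COST_MS = 3.0   # figure quoted in the article
OPTICAL_FLOW_SPEEDUP = 3.0     # Ada Lovelace vs Ampere, per Nvidia


def naive_older_gpu_cost_ms(ada_cost_ms=ADA_GENERATION_COST_MS,
                            speedup=OPTICAL_FLOW_SPEEDUP):
    """Naive estimate: assume the entire generation cost scales with the speedup."""
    return ada_cost_ms * speedup


def display_interval_ms(output_fps):
    """Time available per displayed frame at a given output frame-rate."""
    return 1000.0 / output_fps


cost = naive_older_gpu_cost_ms()
budget = display_interval_ms(120)
print(cost, round(budget, 1), cost > budget)  # 9.0 8.3 True
```

Under that assumption, generating a frame would take longer than the 8.3ms each frame is displayed for at 120fps, leaving little or no benefit.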

Next up, let's tackle how frame generation overcomes the CPU limit.

However, frame generation can also be called frame amplification.

The frame-rate increased dramatically with frame generation, but the stutter was amplified too.

In terms of unknowns we're still looking to test, there's the question of just how low the base frame-rate can be, post DLSS 2.

At the extreme level, could DLSS 3 actually work in making a 30fps game look like 60fps?

It's an enticing thought and we'll be returning to DLSS 3 and Cyberpunk 2077 in future content.

