An input lag investigation

I honestly do not know how it is possible to have such low latency on any modern system. If I were you I would do the test multiple times to rule out any frame skipping in the recording app, and average over some 10 button presses.

1 Like

Here’s the original 240 fps video with 5 button presses; the deviation is less than 1 frame: https://drive.google.com/file/d/1HJhmiZianoUYacdztOC_jT25y03Ffv5h/view?usp=sharing . I am pretty sure that sub-4 ms system latency is possible, since most of the input latency comes from my monitor, which only supports 100 Hz.

If that video’s metadata is correct, then I count 2 frames of delay at 240fps. That is about 8ms of latency, which is indeed extremely good.

The DualShock 4 has about 5~6 ms of average latency by itself, according to this, so that leaves 2 or 3 ms for the rest of the system, including the monitor, which is hard to believe even if VRR is active.
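For reference, here’s that arithmetic as a quick Python sketch (the 240 fps capture rate, the 2-frame count and the ~5-6 ms controller figure are the numbers from the posts above, not new measurements):

```python
# Rough latency-budget arithmetic for a high-speed camera test.
capture_fps = 240        # recording frame rate of the phone camera
delay_frames = 2         # frames counted between button press and on-screen reaction

total_ms = delay_frames * 1000.0 / capture_fps   # 2 frames at 240 fps ~= 8.3 ms
controller_ms = 5.5                              # ~5-6 ms average for a stock DualShock 4

rest_of_system_ms = total_ms - controller_ms     # what is left for the PC + monitor

print(f"measured end-to-end: {total_ms:.1f} ms")             # 8.3 ms
print(f"left for PC + display: {rest_of_system_ms:.1f} ms")  # 2.8 ms
```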

This looks like an extreme best-case scenario. Even if that is just a game menu, I gotta take my hat off to the Ripout developers.

One way for you to make sure the video is being recorded at the correct speed is to do the same experiment, but with a running stopwatch visible somewhere in the video (e.g. on another phone).
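A quick way to turn that check into numbers (just a sketch; the frame count below is a hypothetical example, not a measurement):

```python
# Sanity check of the camera's real capture rate: count the video frames
# spanning a stopwatch interval and compare against what the nominal fps predicts.
nominal_fps = 240
stopwatch_seconds = 10.0    # elapsed time shown on the stopwatch in the video
frames_counted = 2390       # hypothetical number of video frames across that interval

effective_fps = frames_counted / stopwatch_seconds
print(f"effective capture rate: {effective_fps:.1f} fps "
      f"({effective_fps / nominal_fps * 100:.1f}% of nominal)")
```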

Forgot to mention that the DS4 v2 controller is overclocked to 2000 Hz (250 Hz default) over a USB connection using hidusb, and the best thing is that it actually works at that frequency; I found it has more lag at 1000 Hz than at 2000 Hz. The 100 Hz monitor adds about 1-10 ms on average at minimum, plus the monitor’s own latency; PC latency of 3.5 ms is reported by the Nvidia latency overlay, and the rest is the DS4 v2 controller. Take a 360 Hz monitor (1.39 ms average at best), a better CPU/RAM/motherboard for even lower PC latency, and overclock the DS4 to 4000 Hz, and maybe we will see end-to-end system latency under 4 ms. Maybe someone can do it.
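To put those polling and refresh rates in perspective, here is a small sketch of the average wait each periodic stage adds, assuming the event lands at a random point within the interval (so the average wait is half the interval; this is where the 1.39 ms figure for 360 Hz comes from):

```python
# Average added wait for a periodic sampling/refresh stage:
# an event landing at a random point in the interval waits half the interval on average.
def avg_wait_ms(rate_hz: float) -> float:
    return 0.5 * 1000.0 / rate_hz

for label, hz in [("DS4 default poll", 250),
                  ("DS4 overclocked poll", 2000),
                  ("DS4 overclocked poll", 4000),
                  ("100 Hz monitor refresh", 100),
                  ("360 Hz monitor refresh", 360)]:
    print(f"{label:>24} @ {hz:>4} Hz: {avg_wait_ms(hz):.2f} ms average wait")
```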

2 Likes

Ah, if the DS4 is overclocked, that helps things a bit. That sounds about right then. Excellent result!

You are running this above the monitor’s max refresh rate, correct? With Vsync disabled, I presume.

I was not aware of the Nvidia latency tool. What subsystems is it measuring, exactly?

I have a question about shaders and input lag. Are shaders applied in parallel or consecutively? I mean, if a new frame can be emulated and rendered while the old one is still being shaded, and (emulation + shading) exceeds 16.6 ms, then this creates 1 frame of input lag. Can this happen, or not?

And about frame delay: what will happen if it is set too large? One extra frame of lag without any indication, or sound stuttering with less than 60 fps?

Sure. If a shader takes >16 ms to render, you’ll drop frames, audio will crackle, etc.

Excessive frame delay will cause audio crackles, as well, though auto frame delay will adjust downward until that stops (in the space of a couple of frames).
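As a rough illustration of how an automatic frame-delay scheme can back off (a sketch of the general idea, not RetroArch’s actual code):

```python
# Sketch of an auto frame-delay loop: keep the delay as long as delay +
# emulation/shading work still fits inside the refresh interval, and step
# it back down as soon as a frame overruns (which is when audio crackles).
FRAME_BUDGET_MS = 1000.0 / 60.0   # ~16.7 ms at 60 Hz

def next_frame_delay(current_delay_ms: float, last_frame_work_ms: float) -> float:
    if current_delay_ms + last_frame_work_ms > FRAME_BUDGET_MS:
        return max(0.0, current_delay_ms - 1.0)   # overran: back off
    return current_delay_ms                       # still fits: keep it

# Example: 12 ms of emulation + shading only leaves room for ~4 ms of delay,
# so a delay of 8 ms gets walked down over a few frames.
delay = 8.0
for work_ms in [12.0] * 6:
    delay = next_frame_delay(delay, work_ms)
    print(f"frame work {work_ms:.1f} ms -> frame delay now {delay:.1f} ms")
```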

2 Likes

Note that on Vulkan, parallel emulation and shading (2 frames in flight) only happens with Max Swapchain Images set to 3 (the default) or more.

If you want to reduce one frame of input lag, you can set Max Swapchain Images to 2, but in that case the frame will be emulated and then shaded sequentially in the same 16 ms window.

Or at least this is what I’ve been told. @hunterk correct me if I am wrong.
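If it helps, here is a toy timing model of the two modes (my own sketch of the idea above, not RetroArch code):

```python
# Toy model: with 3+ swapchain images, frame N is emulated while frame N-1
# is still being shaded (pipelined), which costs one extra refresh of latency.
# With 2 swapchain images, emulation and shading run back to back and must
# both fit inside a single refresh interval.
REFRESH_MS = 1000.0 / 60.0

def display_latency_ms(emulate_ms: float, shade_ms: float, pipelined: bool) -> float | None:
    if pipelined:
        if emulate_ms <= REFRESH_MS and shade_ms <= REFRESH_MS:
            return 2 * REFRESH_MS          # each stage gets its own interval
    else:
        if emulate_ms + shade_ms <= REFRESH_MS:
            return REFRESH_MS              # both stages inside one interval
    return None                            # the work does not fit: dropped frames

for pipelined in (True, False):
    latency = display_latency_ms(emulate_ms=9.0, shade_ms=9.0, pipelined=pipelined)
    label = "3 swapchain images (pipelined) " if pipelined else "2 swapchain images (sequential)"
    print(label, "->", f"{latency:.1f} ms" if latency is not None else "does not fit, frames dropped")
```

With 9 ms of emulation and 9 ms of shading, only the pipelined mode keeps up, which is exactly the trade-off: the sequential mode saves a frame of latency, but only when emulation + shading fits in one refresh.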

3 Likes

Do input latency mitigation techniques rely more on the CPU or the GPU? From many tests I’ve done over the years, it always seemed that you needed at least a decent GPU for those things to work. I once tested a machine with a robust CPU and a weak GPU, and the results were not great.

Nonetheless, I suspect it taxes both parts somewhat, so the user would need a base standard of performance from the two.

1 Like

The answer is that it depends. If the emulator is using software rendering, then that’s fully handled by the CPU. The GPU will only receive the rendered pixels into a framebuffer and then handle the displaying of that framebuffer. There could be some differences due to the available memory bandwidth, but generally you’d get good results with a fast CPU and a comparably slow GPU.

If the emulator uses the 3D API (OpenGL, Vulkan, etc.) for doing actual rendering, then parts of the emulation workload will be handled by the GPU. For example, let’s say the load ends up 50% on the CPU and 50% on the GPU; then both are equally important. This also means that even if you had an infinitely fast CPU, you’d still have 50% of the original processing time left (unless you also upgrade the GPU).

Finally, when you add shaders, you add a GPU load. This work will happen sequentially, after the emulator has generated the frame. So no matter if the emulator itself is SW or HW rendered, you’ll have this additional GPU dependent processing happening before the frame can be sent to the display. Obviously, for shaders, having a faster GPU will reduce this processing time.

EDIT: I almost forgot to add: All of the above decides what kind of headroom you have available for latency mitigating features. These features usually rely on being able to generate the frames as quickly as possible, and that in turn is decided by the composition of the particular load vs your particular HW.
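A rough sketch of that decomposition (the numbers are placeholders, and treating the CPU and GPU parts of the emulation as simply additive is a simplification):

```python
# Toy model of where a frame's time goes, per the explanation above:
# SW-rendered cores put the emulation cost entirely on the CPU, HW-rendered
# cores split it between CPU and GPU, and shader passes always run on the
# GPU afterwards, sequentially. Whatever is left of the refresh interval is
# the headroom available to latency-mitigation features like frame delay.
REFRESH_MS = 1000.0 / 60.0

def frame_time_ms(cpu_emulation_ms: float, gpu_emulation_ms: float, shader_ms: float) -> float:
    # Simplification: CPU and GPU emulation work treated as additive.
    return cpu_emulation_ms + gpu_emulation_ms + shader_ms

def frame_delay_headroom_ms(total_ms: float) -> float:
    return max(0.0, REFRESH_MS - total_ms)

sw_core = frame_time_ms(cpu_emulation_ms=6.0, gpu_emulation_ms=0.0, shader_ms=4.0)
hw_core = frame_time_ms(cpu_emulation_ms=5.0, gpu_emulation_ms=5.0, shader_ms=4.0)

print(f"SW-rendered core + shader: {sw_core:.1f} ms used, {frame_delay_headroom_ms(sw_core):.1f} ms headroom")
print(f"HW-rendered core + shader: {hw_core:.1f} ms used, {frame_delay_headroom_ms(hw_core):.1f} ms headroom")
```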

5 Likes

Excellent answer! Thank you very much! It explains what happened on my side, as I like to test shaders alongside input latency, and I probably didn’t notice that I could’ve been stressing the GPU with something else.