An input lag investigation

It tears now with vsync disabled as it should, but it ALSO stutters somehow. Stuttering and tearing are usually mutually exclusive.

So I have a G-Sync monitor; what settings should I be setting to get the lowest input lag?

vsync OFF, audio sync ON, audio latency down as low as you can get without crackling.

Just tried the latest AMD driver (16.8.2) which includes VulkanRT 1.0.17.0. However, this VulkanRT version was already installed on my system. I tried anyway, but the rather severe stuttering remains. Max swapchain images set to 2 or 3 makes no difference.

That’s too bad. Hopefully AMD doesn’t end up too far behind on fixing that.

Hi! Sorry for this dumb user question: as of now, using RetroPie, DispManX with its one frame of reduced delay makes a real difference to me, and I cannot stand the delay in OpenGL mode. Believe it or not, my characters miss jumps in OpenGL mode and not in DispManX mode in several games :). However, I cannot stand bilinear filtering or rendering without any filter. It seems one cannot apply overlays in DispManX mode. Or maybe I missed something? Cheers

If you’re interested in N64 emulation, there is this Mupen + GlideN64 libretro core that’s quite great except for a really high input lag. When the frame buffer is ON, we got about 80 ms of lag (probably more than 100 ms in real conditions).

If you feel like checking where the input polling occurs (as you should be the expert for that :slight_smile: ) the github and issue are there: https://github.com/loganmc10/GLupeN64/issues/55

No wonder my lag tests with Mupen64-Libretro and with any plugins on standalone Project64 and Mupen64Plus come out different. Doom 64 and the Quake games are easier to play in Mupen64-Libretro with no vsync. The last time I tested, the standalone emulators seemed to add 4 frames.

[QUOTE=Tatsuya79;47205]If you’re interested in N64 emulation, there is this Mupen + GlideN64 libretro core that’s quite great except for a really high input lag. When the frame buffer is ON, we got about 80 ms of lag (probably more than 100 ms in real conditions).

If you feel like checking where the input polling occurs (as you should be the expert for that :slight_smile: ) the github and issue are there: https://github.com/loganmc10/GLupeN64/issues/55[/QUOTE] I’ll have to decline the offer for the time being. I really don’t have much time right now and I’ve already spent a lot of time on investigations like this lately. I haven’t posted in a while, but I have been busy with some stuff that I’m not ready to show just yet: I’m testing the new (and still experimental) VC4 OpenGL driver for Raspberry Pi to determine if it has any positive effects on input lag. While I have been able to do some tests, I have also run into what seems to be a bug in the driver. I’m currently awaiting feedback on that, but unfortunately there’s no timetable.

Ok, hope you’ll have success with that.

https://www.phoronix.com/scan.php?page=news_item&px=VC4-Job-Shuffling

Damn, those are some big performance boosts!

Is there a definitive setup for the lowest latency and highest accuracy? I know those two goals are often at opposite ends, but I think it would be great to have a definitive guide for the “Lowest Latency” and another guide for “Highest Accuracy”.

I’ll see if I can write a wiki page on this topic soon, but here’s a quick general guide:

Linux

Important: Run RetroArch from an X-less terminal. This requires a working DRM video driver, which most modern systems appear to have. See https://github.com/libretro/RetroArch/wiki/KMS-mode
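As a reference point, here is roughly what that looks like in practice (a sketch; it assumes RetroArch is already installed and in your PATH):

# Switch to a text console (e.g. Ctrl+Alt+F2) and log in there, so no X session owns the display.
# With a working KMS/DRM setup, RetroArch will then drive the screen directly:
retroarch --verbose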

Important #2: You may get performance issues unless you set your CPU to max frequency. This is because the CPU’s power management thinks the CPU is idle enough to be downclocked. In Ubuntu, you can run sudo cpufreq-set -g performance to do this. You may want to put this in a startup script.
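As a sketch of the startup-script idea (assuming the cpufrequtils package that provides cpufreq-set, and that your distro still uses /etc/rc.local or an equivalent boot script):

# In /etc/rc.local (or any script that runs at boot, before RetroArch starts):
cpufreq-set -g performance
# Alternative without cpufrequtils: write the governor directly through sysfs
# echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor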

In retroarch.cfg set:

video_driver = "gl"
video_vsync = true
video_threaded = false
video_max_swapchain_images = 2
video_frame_delay = (see description further down)

Windows

In retroarch.cfg set:

video_driver = "gl"
video_vsync = true
video_threaded = false
video_fullscreen = true
video_windowed_fullscreen = false
video_hard_sync = true
video_frame_delay = (see description further down)

Note on video_max_swapchain_images setting

When using the OpenGL (“gl”) video driver, this setting switches between using two or three buffers for rendering. Without going into details, a setting of 3 allows the emulator to run ahead and prepare the next frame before the current one has even been shown. This improves performance (i.e. makes framerate hiccups less likely), especially on slow hardware, but increases input lag by one whole frame in the general case.

So, the general rule is to use a setting of 2 if the system can handle it. It will shave off one frame of input lag compared to the default setting of 3. Please also note that a setting of 2 forces vsync on.

For OpenGL, this setting only applies under Linux KMS/DRM. On Windows, setting video_hard_sync = true has a similar effect.
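To put the trade-off in retroarch.cfg terms (pick one of the two; the comments are just my summary of the above):

# Default: lets the emulator run one frame ahead, smoother on slow hardware, ~1 extra frame of lag
video_max_swapchain_images = 3
# Lower input lag, but the system must finish every frame in time (also forces vsync on)
video_max_swapchain_images = 2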

Note on video_frame_delay setting

This setting delays the running of the emulator by the specified number of milliseconds. This sounds bad, but actually improves input lag, since it pushes the input polling and rendering closer to when the frame will actually be displayed. For example, setting video_frame_delay = 10 shaves off 10 ms of input lag.

The general rule here is to use the highest value possible that doesn’t cause framerate or audio issues. This is highly system dependent. The faster your system is and the less demanding the emulator is, the higher you can push this setting. On my Core i7-6700K, I can put this setting at 12-13 ms when using snes9x2010, but not nearly as high when using bsnes-mercury-balanced.

Please note that the frame delay value can’t be higher than a frame period (which is 16.67 ms at 60 Hz). I believe the GUI caps this setting to a maximum value of 15.
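To illustrate the budget with some example numbers (nothing RetroArch-specific, just arithmetic):

# At 60 Hz, one frame period is ~16.67 ms. With video_frame_delay = 10, the core
# only starts working 10 ms into the period, so emulation + rendering must fit in:
echo "16.67 - 10" | bc
# => 6.67 ms, which is why faster systems and lighter cores tolerate higher values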

I would also advise playing with this setting last. It takes a bit of trial and error to find a good value, and unless you’re willing to make per-game settings (see the sketch below), you might not be able to find one that works well in all situations while still giving a worthwhile improvement.
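If you do go the per-game route, one possible approach (a sketch; the core, ROM and config paths here are made up) is to keep a small extra config and append it when launching that particular game:

# frame_delay_game.cfg contains a single line:  video_frame_delay = 12
retroarch -L /path/to/snes9x2010_libretro.so /path/to/game.sfc --appendconfig /path/to/frame_delay_game.cfg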

A general note on GPU drivers

Input lag can vary depending on GPU driver, so it’s not possible to guarantee a certain input lag without testing the particular combination of hardware and GPU driver. For example, I have measured different input lag when just upgrading from one GPU driver version to another.

Note on Raspberry Pi

The Raspberry Pi is sort of a special case. In general, it’s too slow to use anything other than the default value for video_frame_delay (which is 0). Also, unless you’re using the DispManX driver or OpenGL via the experimental open source driver (VC4), the video_max_swapchain_images setting has no effect.

In retroarch.cfg set:

video_driver = "dispmanx" (use "gl" if you require 3D acceleration or shaders; with the default GPU driver, this will add one frame of input lag compared to the dispmanx driver)
video_vsync = true
video_threaded = false
video_frame_delay = 0

The settings above are what’s recommended for all of those using the default Raspberry Pi GPU driver. I have some comments coming up regarding the experimental OpenGL driver.

If you’re using DispManX with the default GPU driver or OpenGL with the experimental GPU driver (VC4), you can try setting video_max_swapchain_images = 2. It will reduce input lag by one frame, but framerate will suffer unless you’re running some very lightweight stuff. It seems to work better with DispManX than OpenGL on VC4, probably thanks to lower overhead. If you want to try video_max_swapchain_images = 2 with the DispManX driver, please make sure you’ve rebuilt RetroArch after October 17 2016, since this setting wasn’t enabled on the DispManX driver before that.
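On RetroPie, the simplest way to get onto a newer build is to update RetroArch from source via the RetroPie-Setup script (assuming the standard install location; the exact menu entries may differ between versions):

# Launch the setup script, then choose to update/install RetroArch from source in the menus
sudo ~/RetroPie-Setup/retropie_setup.sh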

Also, I would highly recommend adding the setting force_turbo=1 to /boot/config.txt when using the video_max_swapchain_images = 2 setting. This will force the Raspberry Pi’s CPU to run at max frequency at all times and has been shown to provide much better performance, since the Pi otherwise occasionally tries to downclock to 600 MHz.
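A quick way to add it (a sketch; check that force_turbo isn’t already present in the file, and note that it only takes effect after a reboot):

echo "force_turbo=1" | sudo tee -a /boot/config.txt
sudo reboot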

Regarding accuracy vs input lag

There’s no real correlation between the two, except that accuracy usually comes with a performance penalty (i.e. frame rendering times increase). This, in turn, makes it less likely that you can use video_max_swapchain_images = 2 and high video_frame_delay values. I’d choose the emulator(s) I prefer/need for the games I play and then tweak the above-mentioned settings to their optimal values.

Thank you very much for this Brunnis, much appreciated. I do have a couple of questions though. Obviously these settings are for V-Sync On, but what settings, if any, would change if you have V-Sync Off like I do because of having a G-Sync monitor? The way I understand it (and hopefully I understand correctly), the video frame delay has zero effect when running with V-Sync Off. But what about the max swapchain images setting? Does it have any effect with V-Sync Off?

Thanks again very much, and I hope you get around to writing up a Wiki page for this soon. I also would like to copy / paste this information over to the Launchbox forums with your permission and all credit to you of course. Or if you like you can just post over there yourself if you prefer, I am sure we would have some appreciative people over there for this information.

Nice summarization.

I would say to ignore the video_frame_delay setting at first; it’s too unreliable and gives a smaller gain. It’s just for advanced users who want to experiment (and make per-game configs).

@Brunnis: thanks a lot! Could you also give us some advice about the hardware one should pick (especially the GPU)? Also, I read about recent monitor features like “FreeSync”, which seem to let you disable VSync without getting tearing. Are those relevant? Do they make it possible to switch to simple buffering and get one fewer frame of lag? Thanks again for all your contributions!!

Also, as you said, on the Raspberry Pi the DispManX mode is necessary to remove one frame of lag. Is this additional frame of lag absent on a regular Linux PC, even when using OpenGL mode and pixel shaders? Does anyone know if the RPi driver will eventually support KMS? Apart from input lag, the RPi is good enough for me as far as emulation is concerned. But input lag is a major issue :slight_smile:

Edit: eheh, I asked the same question as lordmonkus :slight_smile:

[QUOTE=lordmonkus;49147]Thank you very much for this Brunnis, much appreciated. I do have a couple of questions though. Obviously these settings are for V-Sync On, but what settings, if any, would change if you have V-Sync Off like I do because of having a G-Sync monitor? The way I understand it (and hopefully I understand correctly), the video frame delay has zero effect when running with V-Sync Off. But what about the max swapchain images setting? Does it have any effect with V-Sync Off?

Thanks again very much, and I hope you get around to writing up a Wiki page for this soon. I also would like to copy / paste this information over to the Launchbox forums with your permission and all credit to you of course. Or if you like you can just post over there yourself if you prefer, I am sure we would have some appreciative people over there for this information.[/QUOTE]

Do you think using a G-Sync monitor makes the video frame delay setting useless? Because I have a G-Sync monitor too and I always set the video frame delay to at least 7 or higher, which sometimes results in crackling sound when the value is too high.

Other than that, nice summary Brunnis. :slight_smile: It’s too bad the dispmanx video driver on the Pi can’t use shaders… (By the way, is the dispmanx driver enabled on official Pi Lakka nightlies?)

Thank you so much for the detailed explanation of the settings! I’m looking forward to your information about the experimental GL driver. Can you briefly say whether or not it’s worth it?

Also, does Lakka automatically adjust most of these settings that you mentioned to optimal values? (and Linux KMS mode, etc)

Would an ODROID-XU4 have lower latency than the RPi3?

[QUOTE=Brunnis;49145]Note on video_max_swapchain_images setting

When using the OpenGL (“gl”) video driver, this setting switches between using two or three buffers for rendering. Without going into details, a setting of 3 allows the emulator to run ahead and prepare the next frame before the current one has even been shown. This improves performance (i.e. makes framerate hiccups less likely), especially on slow hardware, but increases input lag by one whole frame in the general case.

So, the general rule is to use a setting of 2 if the system can handle it. It will shave off one frame of input lag compared to the default setting of 3.[/QUOTE]

Just two quick questions concerning this setting: would a value of 1 offer any further benefit in terms of responsiveness? And does this setting even work with the OpenGL driver under Windows?

[QUOTE=Tromzy;49161]Do you think using a G-Sync monitor makes the video frame delay setting useless? Because I have a G-Sync monitor too and I always set the video frame delay to at least 7 or higher, which sometimes results in crackling sound when the value is too high.

Other than that, nice summary Brunnis. :slight_smile: It’s too bad the dispmanx video driver on the Pi can’t use shaders… (By the way, is the dispmanx driver enabled on official Pi Lakka nightlies?)[/QUOTE]

It’s not that I think it’s useless. I just seem to recall reading somewhere that it was a setting tied to V-Sync On. I could be entirely wrong though.