An input lag investigation

whoa, that’s pretty surprising, but in a good way. Pretty great news all around :slight_smile: Thanks for your testing and reporting, as always!

1 Like

@Brunnis Have you had a look at RetroFlag’s Classic USB Controller-J/U?

No ghost inputs, a great d-pad and buttons, and as far as I can tell no added latency.

I also forced 1000 Hz polling for USB gamepads (add usbhid.jspoll=1 at the end of the line in /boot/cmdline.txt on Raspbian).
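For reference, a sketch of what the edited file might look like. Everything in /boot/cmdline.txt stays on a single line, and the existing parameters (PARTUUID etc.) will differ per install, so treat everything before the appended usbhid.jspoll=1 as a placeholder example:

```
console=serial0,115200 console=tty1 root=PARTUUID=xxxxxxxx-02 rootfstype=ext4 fsck.repair=yes rootwait usbhid.jspoll=1
```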

Interesting! Could you measure any performance degradation with this option? I wonder if it would be a ‘safe’ default in RetroPie (do you know what RetroPie defaults to?).

Note that this is not guaranteed to work. For Xbox controllers, for example, when using the kernel’s xpad driver, you need xpad.cpoll=1 for a 1 millisecond poll interval (1000 Hz).

And you should verify the actual polling rate by running the evhz tool:
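A rough sketch of how to build and run it, assuming you’ve grabbed evhz.c (it’s a single C file, linked from the MiSTer wiki) and have gcc installed:

```
gcc -O2 -o evhz evhz.c
sudo ./evhz    # needs root to read /dev/input/event*; wiggle the pad and watch the reported Hz
# Ctrl+C to quit
```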

I’ve found this information on the MiSTer wiki though, so I’m not sure if this is something that only works there or in general. Need to test.

Update:
Nope, no effect with XInput gamepads. xpad.cpoll is a custom patch in the MiSTer kernel.

1 Like

Yeah, I bought one a good while ago. It’s really nice in most ways (look, feel, 250 Hz USB polling by default), except for one important aspect: D-pad sensitivity. I noticed immediately when playing Street Fighter II that when rocking your thumb left and right, there’s a very high likelihood of performing an involuntary jump or crouch. This phenomenon is not nearly as likely to occur on my 8bitdo controllers or my original SNES Mini controllers.

I’ve not noticed any performance degradation, but I’ve not run any formal tests on it. I would guess that if there is any measurable performance impact, it would only be seen while any button/stick is being pressed. I guess there might also be some risk that certain devices don’t like being polled at 1kHz. It would be nice if this could become a new default for RetroPie, but it certainly needs thorough testing.

Good info. Thanks.

@Brunnis

I have not noticed any d-pad sensitivity issues so far.

Recently completed Super Castlevania for SNES.

Yeah, it could of course be my sample that is particularly sensitive.

1 Like

The problem you’re describing could be the same one Level1online mentions in his review of both the US version and the Japanese Famicom version.

I myself have two J versions and don’t have this problem.

So is there any chance waterbox save states could be implemented in the MAME core to eliminate all input lag?

1 Like

Hello! I’m doing some measurements right now (RetroArch, Windows 10, LCD…) with a custom LED SNES Classic controller + Raphnet adapter, my NES test ROM, and Xperia 960 fps HD video. I will give you my conclusions later (translated from French via Google, sorry).

3 Likes

Hello! I made a short comparison video (only with favorable input timing). Details in the description and pinned comment.

https://youtu.be/NyrcPyZtfMg

and another to illustrate the concept of favorable and unfavorable input timing

https://youtu.be/YOMIV6PAyR0
4 Likes

Well of course RetroArch is going to be the fastest with Run Ahead = 1.

The question is: how fast is it without it, with only GPU sync ON?

1 Like

In theory, without Run-Ahead it’s +16 Xperia frames, without Frame Delay +12.5, and without Hard GPU Sync +32.
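To translate those into milliseconds (assuming “Xperia frames” means frames of the 960 fps slow-motion footage from the earlier post):

```
1 Xperia frame = 1/960 s ≈ 1.04 ms
+16   frames ≈ 16.7 ms ≈ 1 game frame @ 60 Hz   (Run-Ahead = 1)
+12.5 frames ≈ 13.0 ms                          (Frame Delay)
+32   frames ≈ 33.3 ms ≈ 2 game frames @ 60 Hz  (Hard GPU Sync)
```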

Vulkan with “max swapchain images” set to 2 should provide the same latency as Hard GPU Sync 0 in GL, without the increased CPU cost (I think it was just under 20%).
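For anyone who wants to try it, the relevant retroarch.cfg entries would be something like this (both settings also exist in the menu; the values are just the combination being suggested here):

```
video_driver = "vulkan"
video_max_swapchain_images = "2"    # 2 should behave like GL + Hard GPU Sync 0

# GL equivalent, for comparison:
# video_driver = "gl"
# video_hard_sync = "true"
# video_hard_sync_frames = "0"
```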

It would be interesting to see if that’s working. :smirk:

1 Like

I can’t seem to modify “max swapchain images”. It is locked to 3 on my Android phone. Is it better or worse than 2? Thanks!

3 is worse: one additional frame of lag.

That sucks. Sticking to GL hard sync then.

I’d like to share some input latency tests I did recently, with a focus on which video renderer shows better results: D3D11 vs. GL. Unfortunately I can’t test Vulkan…

I used a 120 FPS smartphone camera (8.3 ms/frame… yeah, the results are not super accurate) and took the average value of 5-10 tries (jumps) per test. The test game was Little Samson with the Nestopia core and Run-Ahead enabled. V-Sync is ON for all tests but the last one (#5), resulting in constant frame-pacing and perfectly smooth scrolling.

#1 GL = 117ms

#2 GL + Hard GPU Sync = 67ms

#3 D3D11 - mode A = 67ms

  • this mode means Aero is disabled (Windows 7)
  • there is a permanent tear-line in a fixed position at the top of the screen, slightly wobbling

#4 D3D11 - mode B = 100ms

  • this mode is enabled by pressing ALT+ENTER after the game runs. Pressing ALT+ENTER again switches back to mode A
  • this mode is always active if you have Aero enabled (with Enable Desktop Composition = ON)

#5 D3D11 + RTSS Sync = 67ms

  • for this test I disabled RetroArch’s V-Sync Option and used the Scanline Sync function of RTSS
  • like in test #3 there is a permanent tear-line, but it can be moved and hidden via RTSS
  • downside: fast-forward can’t be used while Scanline Sync is ON, and you will have to start RTSS manually each time you start RetroArch

Conclusion

#2, #3 and #5 have the same latency of 67ms. #4 is 2 frames behind and #1 is the worst with 3 frames behind.

Performance Test

I was also interested to see how performance-intensive the “low latency modes” #2, #3 and #5 are. For that, I limited my max CPU clock speed to 80% (via Windows energy options; a command-line way to set this is sketched after the results), which led to 2.1 GHz. I used Beetle PSX HW for this test, and “fast forward” shows that it can run at 79 FPS with this setup. Now the results:

#2 and #3 drop to 40 FPS, while #5 can keep the 60 FPS. This shows that #2 and #3 need quite a bit of CPU headroom to properly run at the 60 FPS target. RTSS, on the other hand, keeps the 60 FPS even with only a small “fast forward” overhead of 63 FPS, and it doesn’t drop to 40/30/20 FPS when the CPU can’t maintain full speed. So this would be the best solution for weaker systems that can barely run a core/game at full speed but don’t want to miss out on lower input lag.
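As an aside, that 80% cap can also be applied from an elevated command prompt instead of the energy options GUI; a sketch, assuming it’s the standard “maximum processor state” setting of the active power plan being changed:

```
:: cap the maximum processor state at 80% on the active plan (AC power)
powercfg /setacvalueindex scheme_current sub_processor PROCTHROTTLEMAX 80
powercfg /setactive scheme_current
```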

Specs

  • RetroArch 1.8.5
  • Laptop: Lenovo N581
  • OS: Windows 7 - 64-bit
  • Screen Resolution: 1366x768
  • CPU: Intel Core i5-3230M
  • IGP: Intel HD Graphics 4000
  • RAM: 4GB, DDR3-1600
2 Likes

Thank you for those tests Ortega, exactly what I was searching for :slight_smile: . Do you know if mode B is the default setup that you get under Windows 10 (with desktop composition always on)? Have you tested Frame Delay to gain additional milliseconds? Also, do you confirm that turning off Run-Ahead increases the lag by multiples of 16 ms?

I’m not familiar with Windows 10 and whether it has a similar input lag issue as Windows 7. Run-Ahead should work as intended, as you can confirm with a quick RetroArch test using this method:

pause game > press & hold action/jump button > press frame advance until the action is displayed
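With the stock keyboard hotkeys that’s p to pause and k for frame advance; in retroarch.cfg terms (assuming the bindings haven’t been remapped):

```
input_pause_toggle = "p"     # pause the running game
input_frame_advance = "k"    # step one frame at a time while holding the action button
```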

Testing Frame Delay in addition is a good idea, so I just did that with the above-mentioned “low input lag modes” #2, #3 and #5. The requirement for an acceptable Frame Delay value is absolutely no stutter/audio crackling.
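For reference, the combination being tested maps to these retroarch.cfg entries (the frame delay value is per-system and per-core; the 12 below is just an example value, raise it until just before stutter/crackling appears):

```
run_ahead_enabled = "true"
run_ahead_frames = "1"
run_ahead_secondary_instance = "true"
video_frame_delay = "12"    # example value, matches the Nestopia result below
```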

The first test is with Nestopia, a core with low CPU requirements. Run-Ahead = 1 and Second Instance = ON. The results show barely any difference:

  • #2 GL + Hard GPU Sync = 12ms
  • #3 D3D11 - mode A = 13ms
  • #5 D3D11 + RTSS Sync = 13ms

There is a noticeable difference when using a much more CPU-intensive core, Beetle PSX HW (in software rendering mode). I ran the test in a game scene where the fast-forward overhead was 88 FPS:

  • #2 GL + Hard GPU Sync = 2ms
  • #3 D3D11 - mode A = 3ms
  • #5 D3D11 + RTSS Sync = 5ms

So combined with Frame Delay, the RTSS Scanline Sync method can offer the lowest average input lag on my system.

3 Likes