This could be the same problem you are describing that Level1online mentions in his review of both the US version and the Japanese Famicom.
I myself have two J versions and don’t have this problem.
So is there any chance waterbox save states could be implemented in the MAME core to eliminate all input lag?
Hello! I’m doing some measurements right now (RetroArch, Windows 10, LCD…) with a custom LED-modded SNES Classic controller + raphnet adapter, my NES test ROM, and Xperia 960fps HD video. I will give you my conclusions later (French speaker using Google Translate, sorry).
Hello! I made a short comparison video (only with favorable input timing). Details in descriptions and pinned comment.
https://youtu.be/NyrcPyZtfMg
and another to illustrate the concept of favorable and unfavorable input timing:
https://youtu.be/YOMIV6PAyR0
Well, of course RetroArch is going to be the fastest with Run Ahead = 1.
Question is, how fast is it without it and only GPU sync ON?
In theory, without Run Ahead it’s +16 frames on the Xperia video, without Frame Delay +12.5, without Hard GPU Sync +32.
Vulkan with “max swapchain images” set to 2 should provide the same latency as Hard GPU Sync set to 0 in GL, without the increased CPU cost (I think it was just under 20%).
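For anyone who wants those numbers in milliseconds, here is a minimal sketch of the conversion. It assumes the "frames Xperia" are frames of the 960fps capture mentioned earlier; the function name is made up for illustration.

```python
def capture_frames_to_ms(frames, capture_fps=960):
    """Convert a count of high-speed camera frames to milliseconds."""
    return frames * 1000.0 / capture_fps


# Theoretical penalties quoted above, in capture frames (assumed 960fps).
penalties = {
    "no Run Ahead": 16,
    "no Frame Delay": 12.5,
    "no Hard GPU Sync": 32,
}

for name, frames in penalties.items():
    print(f"{name}: +{capture_frames_to_ms(frames):.1f} ms")
```

Under that assumption, 16 capture frames is about one 60Hz game frame and 32 is about two.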
It would be interesting to see if that’s working.
I can’t seem to modify “max swapchain”. It is locked to 3 on my Android phone. Is that better or worse than 2? Thanks!
3 is worse, 1 additional frame of lag.
That sucks. Sticking to GL hard sync then.
I’d like to share some input latency tests I did recently, with the focus on which video renderer shows better results: D3D11 vs GL. Unfortunately I can’t test Vulkan…
I used a 120FPS smartphone camera (8.3ms/frame… yeah, the results are not super accurate) and took the average value of 5-10 tries (jumps) per test. The test game was Little Samson with the Nestopia core and Run-Ahead enabled. V-Sync is ON for all tests but the last one (#5), resulting in constant frame pacing and perfectly smooth scrolling.
#1 GL = 117ms
#2 GL + Hard GPU Sync = 67ms
#3 D3D11 - mode A = 67ms
#4 D3D11 - mode B = 100ms
#5 D3D11 + RTSS Sync = 67ms
Conclusion
#2, #3 and #5 have the same latency of 67ms. #4 is 2 frames behind and #1 is the worst with 3 frames behind.
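The "frames behind" figures can be checked with a quick sketch like the one below. It assumes a 60Hz game frame (16.67ms); the dictionary and variable names are only for illustration.

```python
FRAME_MS = 1000.0 / 60.0  # assumption: one game frame at 60 Hz

# Average latencies from the list above, in milliseconds.
results_ms = {
    "#1 GL": 117,
    "#2 GL + Hard GPU Sync": 67,
    "#3 D3D11 - mode A": 67,
    "#4 D3D11 - mode B": 100,
    "#5 D3D11 + RTSS Sync": 67,
}

best = min(results_ms.values())

# Express each result as whole 60 Hz frames behind the fastest setup.
frames_behind = {
    name: round((ms - best) / FRAME_MS) for name, ms in results_ms.items()
}

for name, frames in frames_behind.items():
    print(f"{name}: {frames} frame(s) behind")
```

Keep in mind that a 120FPS camera only resolves about half a game frame, so rounding to whole frames is about as precise as this data allows.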
Performance Test
I was also interested to see how performance intensive the “low latency modes” #2, #3 and #5 are. For that, I limited my max CPU clock speed to 80% (via Windows energy options), which led to 2.1GHz. I used Beetle PSX HW for this test, and “fast forward” shows that it can run at 79FPS with this setup. Now the results:
#2 and #3 drop to 40FPS while #5 can keep the 60FPS. This shows that #2 and #3 need quite some CPU overhead to properly run at the 60 FPS target. On the other hand, RTSS can keep the 60FPS even with a small “fast forward” overhead of 63FPS and it doesn’t drop to 40/30/20FPS when the CPU can’t maintain fullspeed. So this would be the best solution for weaker systems that can barely run a core/game at fullspeed and don’t want to miss out on some lower input lag.
Specs
Thank you for those tests Ortega, exactly what I was searching for.
Do you know if mode B is the default setup that you get under Windows 10 (with desktop composition always on)?
Have you tested Frame Delay to gain some additional milliseconds?
Also, can you confirm that turning off Run-Ahead increases the lag by multiples of 16ms?
I’m not familiar with Windows 10 or whether it has a similar input lag issue to Windows 7. Run-Ahead should work as intended, as you can confirm with a quick RetroArch test using this method:
pause game > press & hold action/jump button > press frame advance until the action is displayed
That’s a good idea to test Frame Delay in addition, so I just did that with the above mentioned “low input lag modes” #2, #3 and #5. The requirement for an acceptable frame delay value is absolutely no stutter/audio crackling.
The first test is with Nestopia, a core with low CPU demands. Run-Ahead = 1 and Second Instance = ON. The results show barely a difference:
There is a noticeable difference when using a much more CPU-intensive core, Beetle PSX HW (in software render mode). I ran the test in a game scene where the fast-forward overhead was 88FPS:
So combined with Frame Delay, the RTSS scanline sync method can offer the lowest average input lag on my system.
What is this story about DirectX 11 mode A and mode B? I play under Windows 7 with Aero always deactivated. Is that what you call mode A?
The 2 fullscreen modes that I refer to as “mode A” and “mode B” are only available on Windows 7 with Aero disabled and when using the D3D10/11 video renderer in RetroArch.
“Mode A” is the default fullscreen mode, with low input latency but with a visible tear-line at the top of the screen. Now if you press ALT+ENTER on your keyboard while a game is running, it switches to “mode B”. This mode has higher input latency (~2 more frames) but has no visible tear-line.
I assume that pressing ALT+ENTER actually switches from “Windowed Fullscreen Mode” to “Exclusive Fullscreen Mode”, which has a different v-sync behavior. Though when I check “Settings > Video > Fullscreen mode”, the option “Windowed Fullscreen Mode” is always ON… and changing it to OFF doesn’t make a difference either.
Alt+Enter is a de-facto standard shortcut in many applications to toggle between windowed and fullscreen mode. Works in many games too.
Let’s necro this…
Got a new shiny phone so I can do high framerate recording.
My PC is an old i5-3570k@4GHZ with an nvidia GTX770 on win7 x64.
Monitor is LG 32gk850g.
Xbox one Gamepad in USB.
RA is using Exact Sync, Gsync is On.
Vulkan is using max swapchain 2 (supposed to be the fastest).
240fps recording (1 frame = 4.167ms)
FCEUMM runahead 1 in smb
glcore hard sync 7 5 6 5 8 8 9 7 6 7
6.8 = 28ms
no hard sync 7 7 5 7 7 9 6 7 7 7
6.9 = 29ms
vulkan 9 7 4 9 7 8 10 6 7 6
7.3 = 30ms
vulkan no shader 7 5 5 8 5 8 7 8 6 9
6.8 = 28ms
So, I guess the explanation for the lack of difference is I’m using gsync.
It’s recorded on the bottom of my LG monitor tested for 6.4ms lag (so worst lag case),
for the xbone gamepad in usb I see 6.9ms from a test.
That adds up nicely: 6.9 + 16.67 + 6.4 = 30ms average if you want to think like that.
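A small sketch of the arithmetic behind these numbers, in case anyone wants to redo it with their own recordings. The sample lists are copied from the results above; the helper name is hypothetical, and the 16.67ms term assumes one 60Hz game frame of internal lag with run-ahead 1.

```python
from statistics import mean

CAPTURE_FPS = 240
MS_PER_CAPTURE_FRAME = 1000.0 / CAPTURE_FPS  # 4.167 ms

# Capture-frame counts per button press, as listed above.
runs = {
    "glcore hard sync": [7, 5, 6, 5, 8, 8, 9, 7, 6, 7],
    "no hard sync": [7, 7, 5, 7, 7, 9, 6, 7, 7, 7],
    "vulkan": [9, 7, 4, 9, 7, 8, 10, 6, 7, 6],
    "vulkan no shader": [7, 5, 5, 8, 5, 8, 7, 8, 6, 9],
}


def average_latency_ms(samples):
    """Mean number of capture frames, converted to milliseconds."""
    return mean(samples) * MS_PER_CAPTURE_FRAME


for name, samples in runs.items():
    print(f"{name}: {mean(samples):.1f} frames = {average_latency_ms(samples):.0f} ms")

# Rough latency budget: gamepad + one 60 Hz game frame + monitor.
budget_ms = 6.9 + 1000.0 / 60.0 + 6.4
print(f"expected: {budget_ms:.1f} ms")
```

With only 10 samples per setup and one capture frame being about 4ms, differences of 1-2ms between setups are within the noise.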
I also tried to check the MAME main core in RA, to see whether it has 1 extra frame of lag compared to stand-alone.
Test is Unibios boot settings menu (supposed to react in 1 frame in MAME), bottom of the monitor.
windowed (aero enabled):
MAME 6 10 10 10 7 12 10 10 10
9.4 = 39ms
RA 13 13 13 11 13 10 12 11 11 10
11.7 = 49ms
fullscreen:
MAME 7 7 7 8 10 7 4 8 9 5 10 8
7.5 = 31ms
RA (hard sync 0) 12 8 10 7 10 8 8 11 8 7
8.9 = 37ms
gl no hard sync 10 11 11 11 9 8 10 10 10 9
9.9 = 41ms
vulkan 10 10 11 10 11 11 10 8 10 8
9.9 = 41ms
So, remember it’s with G-sync too for both RA-MAME and stand-alone (except for the windowed tests), lowlatency enabled for both in mame.ini.
FCEUMM didn’t show a difference for hard sync 0 or nothing, so I would ignore the slight advantage for it enabled here.
+10ms of lag for RA it is then, something that could be improved in theory.
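Here is a rough sketch of where the +10ms comes from, comparing stand-alone MAME in fullscreen against the RA runs without hard sync. The sample lists are copied from above; the variable names are just for illustration, and the 60Hz frame length used for the comparison is an assumption.

```python
from statistics import mean

MS_PER_CAPTURE_FRAME = 1000.0 / 240  # 240fps recording

# Fullscreen results from above, in capture frames.
mame_standalone = [7, 7, 7, 8, 10, 7, 4, 8, 9, 5, 10, 8]
ra_gl_no_hard_sync = [10, 11, 11, 11, 9, 8, 10, 10, 10, 9]
ra_vulkan = [10, 10, 11, 10, 11, 11, 10, 8, 10, 8]

mame_ms = mean(mame_standalone) * MS_PER_CAPTURE_FRAME
ra_ms = mean(ra_gl_no_hard_sync + ra_vulkan) * MS_PER_CAPTURE_FRAME

extra_ms = ra_ms - mame_ms
extra_game_frames = extra_ms / (1000.0 / 60.0)  # assumes a 60 Hz game frame

print(f"MAME: {mame_ms:.1f} ms, RA: {ra_ms:.1f} ms")
print(f"RA extra lag: {extra_ms:.1f} ms ({extra_game_frames:.2f} game frames)")
```

So the gap is closer to 0.6 of a frame than to a full extra frame, at least in this small sample.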
A bit more testing with FCEUMM and smb still. (run ahead 1 frame)
gsync frame delay 12 glcore 6 6 7 10 9 9 7 4 8 8
7.4 = 31ms
So, frame delay doesn’t seem to do much with gsync…
Now Testing with gsync turned off, 120hz vsync swap interval 2.
gsync off glcore (hard sync off) 20 22 21 19 20 19 20 23 18 19
20.1 = 84ms
gsync off glcore (hard sync 0 frame) 11 9 8 9 9 9 9 10 7 6
8.7 = 36ms
gsync off vulkan swapchain1 10 12 9 11 11 9 7 10 7 9
9.5 = 40ms
gsync off vulkan swapchain2 9 11 8 7 8 11 10 7 6 9
8.6 = 36ms
gsync off vulkan swapchain3 12 9 12 12 6 9 9 5 10 9
9.3 = 39ms
Did some more tests.
In short, d3d11 is similar to glcore and vulkan with gsync, around 30ms.
More surprising are the windowed mode results, measured at the bottom of the screen. This is on Win7 with Aero enabled, G-Sync enabled for fullscreen mode only, and the same FCEUmm SMB test with run ahead at 1 frame (to get a minimal internal lag of 1 frame):
windowed glcore default 11 10 9 11 9 8 10 9 12 11
10 = 42ms
windowed hardsync 0 frame (in case it does anything for nvidia drivers) 7 12 8 8 8 9 10 8 11 11
9.2 = 38ms
windowed vulkan 9 12 7 9 9 10 9 10 9 7
9.1 = 38ms
Around 40ms while fullscreen without gsync was giving 84ms (probably triple buffering with default nvidia drivers settings)…
Aero isn’t so bad in the end, most probably because my default monitor refresh is 120hz (or gsync helping with screen composition).
So, does that mean that with my G-Sync monitor, I don’t need Frame Delay?