An input lag investigation

I’ll echo the kudos to Brunnis for a great investigation. :slight_smile:

That’s where I am, as well. I do have a CRT and an NES and SNES, and I have a PS360+-modded arcade stick that can connect to both NES and SNES, but I haven’t gotten around to wiring an LED in-line yet (I have an LED+resistor set aside, I just haven’t found the time to actually do it). I have access to a 240-fps GoPro camera, but it only records in 480p at that speed, which sucks but should be good enough for testing, I guess. I’ll try to get off my ass and actually do these modifications and tests ASAP.

I agree with vanfanel that it’s probably the triple buffering that’s taking a frame away from KMS/EGL, and I, too, would be interested in seeing results from a hard-rt kernel.

I have actually got an older CRT TV, SNES and an iPhone that can record videos at 240fps myself. Might give it a try on Sunday, most likely.

Anyone tested the Vulkan video driver on Linux and Windows to see if it’s an improvement over hardsynced GL?

Thanks for the further testing, Brunnis. Would having triple buffering and v-sync disabled have any effect on the results? I have always turned them off because they “felt” like they added to the input lag, though I have no way at all of testing this. I would also be interested in seeing the results of testing done on other cores like Genesis GX and PSX.

Thanks for the kind words everyone! It was a lot of work, so it’s great that it’s appreciated. :slight_smile:

Just a couple of comments now, since I need to run (I’m off to spend the weekend by the sea). I’ll read and comment more thoroughly when I get back.

I just checked this by running the same test on Super Mario World on Windows 10 and I got exactly the same input lag results as on Yoshi’s Island (5/6/7 frames min/avg/max). So Yoshi’s Island does not appear to have any unusually laggy input.
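For anyone repeating the camera test, the arithmetic for turning counted camera frames into emulated frames can be sketched as below. This is only an illustration: it assumes a 240 fps recording (the rate mentioned earlier in the thread) and a 60 Hz display, and the function names and sample counts are made up for the example, not taken from the actual measurements.

```python
CAMERA_FPS = 240.0  # recording rate of the high-speed camera (assumption)
DISPLAY_HZ = 60.0   # assumed refresh rate of the console/emulator output


def camera_frames_to_display_frames(camera_frames,
                                    camera_fps=CAMERA_FPS,
                                    display_hz=DISPLAY_HZ):
    """Convert a count of camera frames into display (emulated) frames."""
    return camera_frames * display_hz / camera_fps


def camera_frames_to_ms(camera_frames, camera_fps=CAMERA_FPS):
    """Convert a count of camera frames into milliseconds."""
    return camera_frames * 1000.0 / camera_fps


def summarize(samples):
    """Return (min, avg, max) lag in display frames for a list of counts."""
    lags = [camera_frames_to_display_frames(s) for s in samples]
    return min(lags), sum(lags) / len(lags), max(lags)


if __name__ == "__main__":
    # Hypothetical counts of camera frames between the LED lighting up
    # and the first visible reaction on screen. Not real measurements.
    example_counts = [16, 20, 24]
    print(summarize(example_counts))
    print(camera_frames_to_ms(example_counts[0]))
```

At 240 fps each camera frame covers a quarter of a 60 Hz frame, so that is also the best resolution this method can give.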

That’s awesome! I just rebuilt RA from the RetroPie setup script and it works as expected. :slight_smile:

Time for some more interesting stuff! :slight_smile:

The emulator lag figures of 2 frames for NES and 3-4 frames for SNES bothered me. So, I started looking some more at the source code of snes9x-next. Turns out that frame rendering is performed first, followed by reading the input towards the end of the frame (in the vblank interval, just after having completed rendering of the visible scanlines). From what I can gather, this means that the emulator begins every loop (frame interval) by rendering what was prepared in the previous frame. It then proceeds to read the input and run the game logic, BUT at that point the main loop exits, which means that it won’t be able to render what it just calculated until the next iteration. And so on. In other words, given no other delays within the emulator or game, it takes a minimum of two frames to produce a visible reaction to input.

That got me thinking. How about simply changing the point at which the main loop starts? Why not kick off the emulator right before input is read? By doing that, we instead start by reading the input, then run the game logic and finally render the visible output, all within the same call to the emulator’s main loop. This should mean that we’d get just one frame between input and visible reaction.

Here’s a simple sketch I made:

So, I set about doing the necessary changes in snes9x-next’s cpuexec.c source file. I just needed a few extra lines in the main loop and a couple more in the event handling function. All in all, less than 20 lines of code. It’s not nice looking code, but I just threw it together as a proof of concept.
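To illustrate the idea, here is a toy model in Python of the two loop orderings described above. It is not the actual change to cpuexec.c: the function names and the "standing"/"jumping" state are hypothetical, and the model assumes the game reads input and runs its logic during vblank with no other delays.

```python
def run_stock(state, pad):
    """One call to the main loop in the stock order: render, then input."""
    # Visible scanlines: draw whatever the previous call prepared.
    picture = state["player"]
    # Vblank: the game reads the pad and updates its state, but the loop
    # exits before this new state can be drawn.
    if pad:
        state["player"] = "jumping"
    return picture


def run_reordered(state, pad):
    """One call to the main loop when it starts right before input is read."""
    # Vblank first: the game reads the pad and updates its state.
    if pad:
        state["player"] = "jumping"
    # Visible scanlines: the new state is drawn within the same call.
    return state["player"]


def frames_until_visible(run_frame, max_frames=10):
    """Hold the button and count main-loop calls until the reaction shows."""
    state = {"player": "standing"}
    for frame in range(1, max_frames + 1):
        if run_frame(state, pad=True) == "jumping":
            return frame
    return None


if __name__ == "__main__":
    print("stock:", frames_until_visible(run_stock))
    print("reordered:", frames_until_visible(run_reordered))
```

In this model the stock ordering needs two calls before the reaction is visible, while the reordered loop needs one. Any extra delay inside the game itself would add to both.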

With the code done, I compiled it for Windows and tested it out. The result? Have a look:

The chart below compares data from the previous camera tests (the first six bars) with results from the newly compiled core with my fix.

I also tested the new core using the frame advance method (test result for the newly compiled core is to the right):

Believe it or not, but one frame of lag was removed. I have only tried a few games so far, but haven’t seen any adverse effects. That said, I’m no emulator programmer, so I can’t say for sure that I haven’t messed anything up by doing this. Given the fact that I made this change in only a few hours, having never looked at the code (or any emulator code) before, I’m sort of thinking that I’ve overlooked something… I definitely think it’s time for the developers to chime in on this one.

Below is a link to the modified source code file. You can search for “finishedFrame” and you’ll easily spot the few places where I made changes. Let me know if you want to try the compiled core (for Windows 64-bit) and I’ll e-mail it to you.

http://pastebin.com/5TfgnzLG

I haven’t investigated yet if the same kind of change can be made in Nestopia.

Finally, a quick note before I head off to bed: During my testing today, I discovered that Yoshi’s Island actually has one frame of input lag less when in the main menu (showing the island) compared to when you’re actually playing…

That’s it for tonight!

Nice, it feels faster here. Mario World isn’t far from being perfectly fine now.

I wanted to test Shubibinman Zero on Satellaview but that’s something the Next version of Snes9x can’t do it seems.

Made a DDL for it hope that’s ok with you:

Snes9x Next Brunnis test (win x64)

edit: Just played some Super Aleste, quite sure that’s not placebo. (I’m comparing against standard bsnes-mercury balanced.) Great improvement. :slight_smile:

Hi Brunnis. Great investigation! Are you able to do a test with Vulkan? Also, where is your snes9x-next-libretro fork repository? Can you put it on github?

Thx @Brunnis for the deeper investigation and the first solution. And thx to @Tatsuya79 for the DLL.

It really seems faster by around 1 frame when I test with 240pSuite. With the Dell U2715 I get around 1.8 frames on average.

Really great effort, @Brunnis, keep going!

Hope some of the other devs also get into deeper investigation to get rid of some of the lag frames.

I tried to apply the same trick to snes9x (not “Next”). I did the change on Github if anyone wants to check that, should be easier to read.

Here is the core for win x64:

snes9x_libretro (Brunnis test input lag fix)

I always feel Snes9X is slightly slower than Next or Bsnes… Is this real? Anyway that’s quite close now. :slight_smile:

[QUOTE=Tatsuya79;41391]I tried to apply the same trick to snes9x (not “Next”). I did the change on Github if anyone wants to check that, should be easier to read.

Here is the core for win x64:

snes9x_libretro (Brunnis test input lag fix)

I always feel Snes9X is slightly slower than Next or Bsnes… Is this real? Anyway that’s quite close now. :)[/QUOTE]

That’s awesome! Do you want me to test it before/after using the camera method as well? If the fix works as expected, I’ll create a bug report on GitHub. I just made one for snes9x-next here:

I’ll see if I can test with Vulkan as well, but I don’t really expect any improvement. Given the figures I’ve collected so far, it appears that less than one frame of delay (maybe even less than half a frame) is added between the finished frame being output from the emulator and the monitor starting to scan out the frame at the top left. Both the tested graphics card and the display seem to be exceptionally fast in getting a frame out…

EDIT: Ohh, and I don’t have a fork repository set up. I’ll see if I can do something about that.

great work!

honestly, i think if you want dev feedback i’d create a pull request on github rather than an issue. it would give a nice easy view of your diff rather than having to download the source code and compare manually, plus people can add comments in-line for any concerns.

[QUOTE=dankcushions;41394]great work!

honestly, i think if you want dev feedback i’d create a pull request on github rather than an issue. it would give a nice easy view of your diff rather than having to download the source code and compare manually, plus people can add comments in-line for any concerns.[/QUOTE]

Thanks, dankcushions! I’ll see if I can try that approach instead.

Yes, that’s definitely something I’d like to know if you feel like doing more testing.

Quick comment:

FYI regarding additional input lag in RetroArch running on Linux: https://github.com/libretro/RetroArch/issues/3100 (modification shaves half a frame off on average)

Also, Lakka has now been modified upstream to apply the rt-kernel patch for more consistent scheduling with 1 kHz tick rate, preemptible etc. (frame delay setting can be increased further)

The above two changes make the input lag in Linux as low as seen on Win10.

It would be interesting to test how many games this fix of mine actually works on… If the game polls input early, i.e. before the vblank interval, the fix is not effective (well, if I’ve understood things correctly, that is). However, the result should still not be worse than without the fix. Anyone care to help out by testing snes9x-next with and without the fix, using the frame advance method? I’d like to test myself, but I don’t have time right now (and might not until next week).
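The reasoning about early versus late polling can be sketched as a toy model. Everything here is an assumption made for illustration: a frame is reduced to two phases, the picture is treated as a snapshot taken at the start of the visible scanlines, and the game is assumed to update its state in the same phase in which it polls. The names are hypothetical and this says nothing about how any particular game behaves.

```python
VISIBLE, VBLANK = "visible", "vblank"


def lag_in_frames(loop_starts_at, game_polls_in, max_frames=10):
    """Count main-loop calls until a held button shows up in the picture.

    loop_starts_at: VISIBLE models the stock loop, VBLANK models the fix.
    game_polls_in:  phase in which the game reads the pad (VISIBLE = early).
    """
    if loop_starts_at == VISIBLE:
        phases = [VISIBLE, VBLANK]
    else:
        phases = [VBLANK, VISIBLE]

    reacted = False  # has the game state been updated for the input yet?
    for frame in range(1, max_frames + 1):
        picture_shows_reaction = False
        for phase in phases:
            if phase == VISIBLE:
                # Snapshot of the game state at the start of rendering.
                picture_shows_reaction = reacted
            if phase == game_polls_in:
                # The button is held, so the game sees it and reacts.
                reacted = True
        if picture_shows_reaction:
            return frame
    return None


if __name__ == "__main__":
    for polls_in in (VBLANK, VISIBLE):
        stock = lag_in_frames(VISIBLE, polls_in)
        fixed = lag_in_frames(VBLANK, polls_in)
        print(f"game polls in {polls_in}: stock={stock}, with fix={fixed}")
```

Under these assumptions a game that polls in vblank goes from two calls to one, while a game that polls during the visible scanlines stays at two either way, so the fix never makes things worse in this model.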

[QUOTE=larskj;41434]Quick comment:

FYI regarding additional input lag in RetroArch running on Linux: https://github.com/libretro/RetroArch/issues/3100 (modification shaves half a frame off on average)

Also, Lakka has now been modified upstream to apply the rt-kernel patch for more consistent scheduling with 1 kHz tick rate, preemptible etc. (frame delay setting can be increased further)

The above two changes make the input lag in Linux as low as seen on Win10.[/QUOTE]

Very interesting! I’ll definitely want to test this. :slight_smile:

How could we do this for bsnes-mercury balanced?

I think perhaps in this part here:

But I don’t understand the code there…

Bsnes is an LLE hardware emulator, so it polls exactly when the SNES would poll, right? I also don’t fully understand the source, but I skimmed through a bunch of files including timing.cpp and scheduler.cpp and that’s what it seemed like. I might be totally wrong here.

I’m having a pretty bizarre discussion (or lack thereof) in my thread on byuu’s forum at the moment… The thread starts with some old stuff regarding my first test results. The discussion about the code change in snes9x-next begins here: http://board.byuu.org/phpbb3/viewtopic.php?f=8&t=1058&start=10#p26925

They either misunderstand the intent of the fix, ignore the test results or simply state something along the lines of “we’ve discussed input lag before and there’s nothing to be done in the emulator”. Maybe this supposed fix is so stupid that I don’t even deserve to be told why it won’t work? I just wish someone would tell me what I’m missing. Looks like byuu’s just ignoring me now…

In the meantime, I’ve started testing a list of games with and without my fix in snes9x-next by using the frame-advance method:

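[TABLE=“width: 283”]
[TR][TD]Game[/TD][TD]Stock (frames)[/TD][TD]With fix (frames)[/TD][/TR]
[TR][TD]SMW2: Yoshi’s Island[/TD][TD]4[/TD][TD]3[/TD][/TR]
[TR][TD]Super Metroid[/TD][TD]3[/TD][TD]2[/TD][/TR]
[TR][TD]Super Mario World[/TD][TD]4[/TD][TD]3[/TD][/TR]
[/TABLE]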

So far, things look good. Interesting that Super Metroid has one frame of lag less than SMW and SMW2.

Keep up the good work, luckily the source code is available so we can apply these kinds of improvements ourselves.

It would be very interesting to apply the same to bsnes mercury when we figure out how.