An input lag investigation

Thanks for the pause method, that’s great. I forgot to ask about it when you mentioned it before.

So with that scheduler change only:

Super Aleste: 2 frames vs 3 before
Kirby’s Dreamland 3: 3 vs 4 (my previous method was working, I know it now! yeah who cares… lol)
Super Mario World: 3 vs 4
Super Pang: 2 vs 3
F-Zero: 2 vs 3

No bad registration in RPM racing or Seiken Densetsu.

You’re the boss. :slight_smile:

Glad I linked those technical docs (that didn’t make much sense to me).

Here is the new download link:

bsnes_mercury_balanced_libretro (Brunnis test input lag fix) win x64

Beware as I have a testing method now. :slight_smile:

Super Swiv (Japan): 4 frames on bsnes-mercury balanced default core / 3 frames with my previous modified core / 3 frames with the new method Brunnis just gave above

Late is the best setting for fast response. You should gain a bit from each: at the emulator (core) level and with Retroarch timings.

Wow, I can’t thank you enough for your contribution, Brunnis! I can confirm that my results are in line with what Tatsuya79 is reporting: there is a consistent reduction of 1 frame of latency across all games I have tested with the pause-method.

All games were tested on a Win7 x64 build, with Hard GPU Sync ON, Exclusive Fullscreen and V-Sync ON:

Super Mario World: 3 frames with the latest fix vs. 4 frames with the original bsnes-mercury core
Super Metroid: 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
Akumajou Dracula (Japanese version of Super Castlevania IV): 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
Aladdin: 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
Donkey Kong Country: 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
The Legend of Zelda ALttP: 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
Rockman X (Japanese version of Megaman X): 3 frames with the latest fix vs. 4 frames with the original bsnes-mercury core

The only interesting case was Super Bomberman, which features 2 frames of latency on both versions regardless of any fix, although I suspect that Brunnis’ fix is a bit less latent than 2 full frames.

Anyway, this got me thinking. So far we can say that Brunnis’ fix brings the SNES cores to the same level of responsiveness as other cores (I’m thinking of mgba, which is astounding in that regard). The lowest input lag we’re getting is 2 frames, but I wonder if something can be done on a Retroarch-wide level now, maybe with the OpenGL video driver (some swapchain / triple buffering modification?) in order to achieve true next-frame latency…

Twinaphex merged the snes9x-next PR and made the same changes to snes9x mainline. I opened a PR for bsnes-mercury but it may not go anywhere, since it breaks on mid-frame overscan changes. I don’t think any commercial games actually do that but some demos do, IIRC, and bsnes is more accuracy-focused than snes9x.

Could it be implemented as a Core Option feature maybe? That way the priority of accuracy would still be preserved in the code, while allowing the user to manually activate this lag fix whenever playing normal commercial games.

Hunterk, does that mean we can use the patch on the Android version of Snes9x-next?

@Tromzy Yeah, just fetch the newest one from the online updater and it should have it.

@Geomancer Possibly. It’s hard to make core options out of some things, but I’ll look into it.

More games tested with the latest modified bsnes-balanced code, using the pause-method:

Aero the Acro-bat: 2 frames (Brunnis’ fix) vs. 3 frames (original bsnes-mercury)
Earthbound: 2 frames (Brunnis’ fix) vs. 3 frames (original bsnes-mercury)
Kirby Super Star: 3 frames (Brunnis’ fix) vs. 4 frames (original bsnes-mercury)
Megaman & Bass: 2 frames (Brunnis’ fix) vs. 3 frames (original bsnes-mercury)
Super Ghouls and Ghosts: 2 frames for attacking, 3 frames for jumping (Brunnis’ fix) vs. 3 frames for attacking, 4 frames for jumping (original bsnes-mercury)
Super Mario All-Stars: 2 frames (Brunnis’ fix) vs. 3 frames (original bsnes-mercury)
Zombies ate my Neighbours: 3 frames (Brunnis’ fix) vs. 4 frames (original bsnes-mercury)

FYI, the bsnes-mercury lag fix patch has now been added and merged to Lakka as well.

If you update Lakka to try this, make sure to delete any previous bsnes .so file in /Storage/Cores/ (only necessary if you have previously updated bsnes-mercury from within Lakka).

Super Mario World is actually playable now :slight_smile:

Geomancer - Out of curiosity, which game did you test within Super Mario All-Stars? I noticed you got 2 frames for All-Stars but 3 for the standalone Super Mario World.

I have so far tested Super Mario Bros. 1 and The Lost Levels within All-Stars and they both exhibit 2 frames of latency with the new fix. Bear in mind that the version I’ve tried is the one without SMW.

I also confirm that standalone Super Mario World features 3 frames of lag (4 with original bsnes-mercury).

It’s how the games are programmed.

Just tried several Mario World versions (USA, Japan, All-Stars pack, Arcade); they all have 3 frames of lag. Mario 1, 2 and 3 have 2 frames.

This is so great for SNES! Finally Super Mario World and Yoshi’s Island feel in RA like they do on the real console! I can almost start practicing speed running on emulator! The only thing stopping it is the vsync issue with audio sync (drops many frames every few seconds instead of just 1 every ~10 seconds as it should running at 60.099Hz on a 60Hz display). I’ll mention that issue somewhere else. It might be an Nvidia Vulkan driver issue.
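The “1 every ~10 seconds” figure follows from the two rates quoted above. Here is a rough sketch of the arithmetic, assuming the core produces exactly 60.099 frames per second, the display shows exactly 60, and every surplus frame is simply dropped (the function name is made up for illustration):

```python
def seconds_between_dropped_frames(core_hz: float, display_hz: float) -> float:
    """Seconds until the core has produced one more frame than the display can show."""
    surplus_per_second = core_hz - display_hz
    if surplus_per_second <= 0:
        raise ValueError("the core must run faster than the display for frames to be dropped")
    return 1.0 / surplus_per_second


interval = seconds_between_dropped_frames(60.099, 60.0)
print(f"one dropped frame every {interval:.1f} s")
```

Anything that drops frames much more often than this would point at something other than the plain refresh-rate mismatch.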

Requested cores to investigate for applicability of this awesome “Brunnis lag fix”, with some games I like that I think would benefit greatly from it:

Nestopia (Chip and Dale: Rescue Rangers)
mGBA (Mother 3, Castlevania: Aria of Sorrow)
mupen (Super Smash Bros.)
desmume (Castlevania: Portrait of Ruin)

I agree, but it would be nice to know for sure if there’s an extra frame of input latency and that the feeling isn’t just a placebo.

I noticed that a while ago: Super Mario World stutters every 10 seconds or so when you’re running around.

With the Pause method:

mGBA Super Mario Advance = 2 frames
Nestopia Chip n Dale = 2 frames

No problem there it seems.

Nestopia Batman = 2 frames until he gets ready to jump; he is only in the air on the 9th frame. :slight_smile:

But that’s just the way it is. I can finish it in less than half an hour anyway, one of my all-time favourites.

To be clear for anyone confused, the input lag you feel is something like:

game code / design choices
+ emulator (core) processing time (the pause-method figures we give here)
+ Retroarch dealing with all these “operating system dependent timings”
+ your joypad response time
+ your monitor/TV response time

Brunnis’ fix is at the emulator / core level. Beyond that, you can hope for Vulkan? Faster standards for peripherals? Retroarch magic?
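To make the chain above concrete, here is a small sketch that just adds the pieces up. It assumes roughly 60 Hz (so one frame is about 16.7 ms) and treats the pause-method count as covering game code plus core. Only the Super Mario World frame counts (4 before, 3 after) come from this thread; the frontend, joypad and display values are hypothetical placeholders, not measurements.

```python
FRAME_MS = 1000.0 / 60.0  # assumption: roughly 60 Hz, so one frame is about 16.7 ms


def total_lag_ms(pause_method_frames, frontend_ms, joypad_ms, display_ms):
    """Add up the chain: game + core (in frames), then frontend, joypad and display (in ms)."""
    return pause_method_frames * FRAME_MS + frontend_ms + joypad_ms + display_ms


# Hypothetical placeholder values for everything outside the core; only the
# frame counts (Super Mario World: 4 before, 3 after) come from the thread.
OTHER = dict(frontend_ms=20.0, joypad_ms=8.0, display_ms=16.0)

before = total_lag_ms(4, **OTHER)
after = total_lag_ms(3, **OTHER)
print(f"saved by the core fix: {before - after:.1f} ms")
```

Whatever the other terms really are on a given setup, the one-frame saving at the core level stays the same, since the terms simply add.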

Yeah, my mGBA results seem indeed consistent with yours, Tatsuya. It’s actually one of my favourite emulators out there: lightweight and responsive, with extremely low audio latency, great accuracy while not too taxing on the CPU. :slight_smile:

I think Brunnis’ fix has brought the SNES core back in line with the others as far as processing time is concerned. Now I wonder if it’s possible to further reduce emulator latency and get to the point of seeing a reaction time of just one frame.

It is just pure speculation on my part and I don’t have any programming competency either, but the fact that the lowest latency we’re seeing on all cores (even the fastest ones) is 2 frames makes me wonder if we could look at something at a Retroarch-wide level to decrease it further.

The reason I’m making this assumption is that there is another libretro frontend that actually appears to be even more reactive than what we have measured so far. I’m talking about Alcaro’s ZMZ, which couples the old ZSNES interface with the SNES libretro cores. You can find it here: http://www.smwcentral.net/?p=section&a=details&id=5681 Here is its github page: https://github.com/Alcaro/ZMZ

Back when I made the other thread on SNES latency, hunterk advised me to try it and I immediately noticed a huge improvement, although the timings were not as buttery-smooth as in Retroarch. Hunterk references this difference with actual measurements on his blog page: http://filthypants.blogspot.it/2015/06/latency-testing.html

I have made a quick test by using the latest Brunnis-fixed cores within ZMZ and it seems even more responsive, to the point of starting to really resemble actual SNES hardware. I wouldn’t really know where to start honestly, but maybe Brunnis might find something relevant by comparing the two programs. :slight_smile:

Another interesting resource, mentioned also by hunterk, might be to look at Calamity’s GroovyMAME, available here: http://forum.arcadecontrols.com/index.php/board,52.0.html It’s a special distribution of MAME that diverges from baseline thanks to several features aimed at CRT usage (but it also works on common LCD screens) and minimizing latency. As far as I can understand, they have implemented two things that make it great from an input-lag perspective: a d3d9ex backend which supposedly skips all sorts of driver overhead and a “frame_delay” function. They claim to achieve next-frame latency on Windows x64, so once again I wonder if we can look at it as a reference.

I tested those programs before, but ZMZ seemed very hacky and I couldn’t see myself using this regularly. About GroovyMAME I didn’t get much out of it, don’t know if I missed some settings. ShmupMame looked interesting to me too but it can’t do shaders and is a bit outdated. I don’t remember if I tested it in the end.

The latency tests Hunterk made are strange to me, particularly on the shader part within Retroarch. I’m using CRT geom all the time and it’s just impossible it would give twice the response time vs stock.

I can switch between both with the N & M keys while making Mario jump on bsnes: the difference is hard to tell. It can’t be 86 vs 148ms when I can feel the 16ms difference Brunnis’ fix makes. I put it down to some kind of problem, AMD drivers with Cg shaders or something.
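To put those numbers on the same scale as the frame counts in this thread, here is a quick conversion sketch. It assumes a 60 Hz frame time, and the 86 and 148 ms values are simply the figures quoted above from hunterk’s tests:

```python
FRAME_MS = 1000.0 / 60.0  # assumption: 60 Hz, about 16.7 ms per frame


def ms_to_frames(ms: float) -> float:
    """Express a latency in milliseconds as a number of 60 Hz frames."""
    return ms / FRAME_MS


gap_ms = 148 - 86
print(f"{gap_ms} ms is about {ms_to_frames(gap_ms):.1f} frames")
```

A gap of several frames should be far easier to notice than the single frame the core fix removes, which is why the shader result looks suspicious.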

And then there is “RetroArch - fullscreen Hard GPU Sync = 0 bsnes-compatibility core xbr-lvl4-multipass” at 80ms average vs. ZMZ fullscreen at 78ms, which also looks strange compared to the other cases.

With Brunnis’ fix and using frame delay of 4, everything feels instant to me (w/ 1 frame of LCD lag). Have you tried fiddling with that setting?

Yes, I actually pushed it all the way to 7 and I agree, with the Brunnis-fixed cores it feels almost instant, at least subjectively.
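As I understand it, Retroarch’s frame delay waits the given number of milliseconds after V-Sync before running the core, so input gets polled later and the core has less time to finish the frame. A rough sketch of that trade-off, assuming a 60 Hz frame time (the function name is made up for illustration):

```python
FRAME_MS = 1000.0 / 60.0  # assumption: 60 Hz, about 16.7 ms per frame


def frame_budget_ms(frame_delay_ms: int) -> float:
    """Time left for the core to emulate one frame after waiting frame_delay_ms."""
    if not 0 <= frame_delay_ms < FRAME_MS:
        raise ValueError("frame delay must be shorter than one frame")
    return FRAME_MS - frame_delay_ms


for delay in (0, 4, 7):
    print(f"frame delay {delay}: {frame_budget_ms(delay):.1f} ms left for the core")
```

If the core cannot finish within the remaining budget on your machine, you would expect stutter, so the highest usable value depends on the core and the hardware.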

Don’t get me wrong, the results Brunnis has achieved with the SNES cores are outstanding and make things way more playable and responsive than before. Once again, I can’t be thankful enough. :slight_smile:

I’m just asking myself whether it is possible to theoretically shave one further frame off in the actual emulator processing code, the one we’re measuring through the pause-method, and achieve a minimum latency of just 1 frame there. ZMZ and other solutions out there are surely way more hack-ish and inaccurate than Retroarch as far as timings go, but maybe there’s something useful in there.