An input lag investigation

I built your updated snes9x-next-libretro binary with the input lag “fix” for 32-bit Windows. I haven’t run objective tests, but I can’t discern any input lag in this SNES emulator with the latest 32-bit Retroarch at default settings. I also tested in d3d mode, but I have not re-read the thread to check whether the fix behaves differently among the video modes in Windows (sdl2, d3d, gl).

[QUOTE=Tatsuya79;41525]Made a pull request for Next:

Will be easier to review as I added just the new lines for clarity.[/QUOTE] Great, thanks! Looks like it’s going to be committed. :slight_smile:

[QUOTE=spinningacorn;41536]Thanks for this, it really explains a lot. Thanks especially for the modified snes9x-next :slight_smile: Any chance of getting a similar tweak for Nestopia? Or is that an entirely different case?[/QUOTE] You’re welcome! I’m planning to give Nestopia a go next week.

[QUOTE=Tatsuya79;41540]Can’t rewrite bsnes but I can get rid of the overscan in bsnes-mercury\sfc\system\system.cpp line 285:

if(cpu.vcounter() == 241) scheduler.exit(Scheduler::ExitReason::FrameEvent);

force 225 to test it.

(I think that’s slightly faster but not as much as snes9x next)

edit: between slightly and none… …none?[/QUOTE] Actually, you’re almost there. I just looked some more at the code and finally managed to achieve the same 1 frame reduction in input lag. Here’s how:

bsnes-mercury\sfc\system\system.cpp line 285:

if(cpu.vcounter() == 226) scheduler.exit(Scheduler::ExitReason::FrameEvent);

bsnes-mercury\sfc\cpu\timing\joypad.cpp line 5:

if(vcounter() >= (ppu.overscan() == false ? 227 : 242)) {

This hack makes it so that the frame is always output after line 225. The emulator loop then exits. The next time the main loop is called, it enters at line 227 and proceeds to poll input and finally render the frame. The reason it didn’t work for you is that the emulator has already polled just before exiting the loop. Give this a try and see what you think.

Important: This is just a hack! Always outputting the frame at line 226 will not work in all cases, since some games have 240 lines. The interesting thing here is that snes9x seems to handle this by determining if the game renders 225 or 240 scanlines and outputs the frame as soon as it’s ready. That’s what made the fix so easy to implement on snes9x, i.e. it already output the frame at the correct place, we just needed to make sure it also left the main loop right after that point.
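To make the timing argument a bit more concrete, here is a minimal toy model of the frame loop. It is not bsnes code: the function name, the 262-line frame and the "game reacts on the next frame it starts drawing" rule are all assumptions made for illustration. It only counts how many main loop calls pass before a held button can show up in an output frame, for a given exit line and poll line.

```cpp
#include <cassert>

// Toy model, NOT bsnes code. Assumptions: 262 scanlines per frame, the
// button is already held before the first call, and the game shows its
// reaction in the first frame it starts drawing (line 0) after the poll.
inline int run_calls_until_visible(int exit_line, int poll_line) {
    const int kTotalLines = 262;
    assert(exit_line >= 0 && exit_line < kTotalLines);
    assert(poll_line >= 0 && poll_line < kTotalLines);

    int v = exit_line;            // the core resumes where it last exited
    bool game_saw_press = false;  // set once the emulated joypad is polled
    bool frame_shows_it = false;  // set when a frame is drawn after that poll

    for (int calls = 1; calls <= 10; ++calls) {
        do {
            if (v == poll_line) game_saw_press = true;    // button is held
            if (v == 0) frame_shows_it = game_saw_press;  // drawing starts
            v = (v + 1) % kTotalLines;
        } while (v != exit_line);  // main loop exits, frame is output

        if (frame_shows_it) return calls;
    }
    return -1;  // not reached for the line numbers discussed here
}
```

With the stock values (exit at 241, poll at 225) the model needs 2 calls, and with the hacked values (exit at 226, poll at 227) it needs 1. Real games add their own internal delay on top, so only the one-call difference is meaningful, not the absolute numbers.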

So, the question is: why doesn’t bsnes detect this in the same way? As it is now, it’s hard coded to exit at line 241 and output the frame. Maybe this has something to do with how overscan needs to be handled for maximum accuracy and snes9x foregoes such accuracy? Then again, byuu actually uses a check for overscan in the polling code (see second code block above)… Why not just add the following code to the first code block above and be done with it:

if(cpu.vcounter() == (ppu.overscan() == false ? 226 : 241)) scheduler.exit(Scheduler::ExitReason::FrameEvent);

I could see this as having a side effect if the game changes the overscan value between line 226 and 241. Then we’d get another main loop exit when we reach line 241, which is not what we want. I don’t know if that can happen, though. I’ll compile it anyway and give it a go. Any particular games you recommend to test the 225 vs 240 scanline handling?
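The side effect can be sketched with a small stand-alone model. Again this is not bsnes code: `should_exit` only mirrors the condition proposed above, and the 262-line frame and the single mid-frame flip of the overscan flag are assumptions for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the proposed check in system.cpp.
inline bool should_exit(int vcounter, bool overscan) {
    return vcounter == (overscan ? 241 : 226);
}

// Counts how often the main loop would exit during one 262-line frame
// (assumed frame height) if the game toggles the overscan flag when
// `flip_line` is reached. Pass flip_line = -1 for "never changes".
inline int exits_per_frame(bool initial_overscan, int flip_line) {
    const int kTotalLines = 262;
    assert(flip_line >= -1 && flip_line < kTotalLines);

    bool overscan = initial_overscan;
    int exits = 0;
    for (int v = 0; v < kTotalLines; ++v) {
        if (v == flip_line) overscan = !overscan;  // mid-frame change
        if (should_exit(v, overscan)) ++exits;
    }
    return exits;
}
```

In this model a frame with a stable flag exits exactly once. Turning overscan on at line 230 gives two exits (226 and 241), and turning it off at line 230 gives none at all for that frame, which would be just as unwanted.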

Good to hear. The fix will be equally effective, regardless of video mode.

Nice gonna test that! :slight_smile:

edit: Yes that’s obviously better! Nice work! Gonna use this as I use crop overscan anyway for the SNES (if I remember right there’s rarely interesting stuff there unlike on the Pc-Engine).

Did you measure the input lag against snes9X or Next?

[QUOTE=Brunnis;41566] So, the question is: why doesn’t bsnes detect this in the same way?[/QUOTE] Blind guess: this is how the real SNES works. I can understand that as that’s what made bsnes so accurate.

[QUOTE=Tatsuya79;41568]Nice gonna test that! :slight_smile:

edit: Yes that’s obviously better! Nice work! Gonna use this as I use crop overscan anyway for the SNES (if I remember right there’s rarely interesting stuff there unlike on the Pc-Engine). [/QUOTE]

Would it be possible to have a DLL of Bsnes-mercury-balanced modified with this new code? Unfortunately I don’t have the means to compile it myself right now.

Thanks in advance, much appreciated. :slight_smile:

[QUOTE=Geomancer;41569]Would it be possible to have a DLL of Bsnes-mercury-balanced modified with this new code? [/QUOTE]

Here it is:

edit: updated in a further post.

I get some wrong inputs in R.P.M Racing (512x448 game) in the 2nd menu (green one). I just push up and down and it registers START and launches the game.

[QUOTE=Tatsuya79;41574]I get some wrong inputs in R.P.M Racing (512x448 game) in the 2nd menu (green one). I just push up and down and it registers START and launches the game.[/QUOTE] Interesting. Does it work with the modified snes9x-next?

EDIT: Modified snes9x-next appears fine. The change to bsnes probably mucks up timing in a bad way. Interesting that it reads the wrong button, though.

But I played a standard game like Super Aleste for a while without issue. So it’s perhaps just some high resolution modes? (just 512x448 and RPM racing if we’re lucky)

About games with overscan, no clue what to test. I tried several random games, none had any valid overscan lines.

Found 1: Dragon Quest V.

Yes… annoying game for testing lol. Anyway the overscan is displayed but with redraws (edit: I applied your 227 : 242 case in joypad.cpp and that fixed it). The input is laggy in this game anyway, so I can’t say anything about the lag.

Need to find something better.

Seiken Densetsu 3 has wrong button registration like RPM racing. In game menu and during game in character menu (start button) too. It’s supposedly a 512x224 game. (resolution seems to change when displaying japanese characters)

Ranma 1-2 - Chounai Gekitou Hen (Japan) / Street Combat (USA) (I forgot about that abomination…) are interlaced at 256x448 and the controls are working. Is it the 512 horizontal mode that causes problem?

Kirby’s Dream Land 3 has some 512 wide parts and works OK… So no clue.

Super Pang is 512 and OK.

Jurassic Park does some bad button registration during gameplay.


So, probably some display modes?

(In bsnes-mercury\sfc\alt\ppu-balanced\render\ I saw stuff in render.cpp and at the end of line.cpp that could be relevant.) More important probably : bsnes-mercury\sfc\system\video.cpp

Some documentation. And more.

Only stupid workaround I could find atm in bsnes-mercury\sfc\cpu\timing\joypad.cpp:

if((ppu.hires() && vcounter() >= (ppu.overscan() == false ? 225 : 241)) || vcounter() >= (ppu.overscan() == false ? 227 : 242))

No more input problem in hires mode but… play Kirby’s Dream Land 3 and cry each time you enter an hires level as the input lag comes back.

I think I found it.

The problem comes from mode 5 in hires. I disabled the fix for this case; that’s just some menus as far as I could see.

I hope other hires games got their input lag shortened (I think they did, judging by Super Pang, but it would need a camera test). edit: I think Kirby still lags and so is probably using mode 5 in a particular way (it didn’t register bad controls before). / or this game has a big input lag by default and the fix works, not sure.

Download:

edit: better version next page

Thanks for all your work, Tatsuya79! However, I think we might have gone down the wrong path. I skimmed the emu links you posted (thanks!), regarding input handling. At one of the links, the following can be read:

When enabled, the SNES will read 16 bits from each of the 4 controller port data lines into registers $4218-f. This begins between dots 32.5 and 95.5 of the first V-Blank scanline, and ends 4224 master cycles later.
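As a rough sanity check on that quote, the 4224 master cycles can be converted to scanlines. The 4224 figure and the start window (dots 32.5 to 95.5) come from the quote; the 1364 master cycles per scanline and 4 master cycles per dot are assumptions taken from general SNES timing notes, and the function name is made up.

```cpp
#include <cassert>

const int kAutoReadCycles = 4224;  // from the quoted documentation
const int kCyclesPerLine  = 1364;  // assumption: master cycles per scanline
const int kCyclesPerDot   = 4;     // assumption: holds for dots early in a line

// Returns how many whole scanlines after the first vblank line the
// automatic joypad read finishes, given the dot at which it starts.
inline int lines_until_read_done(double start_dot) {
    assert(start_dot >= 0.0);
    const double end_cycle = start_dot * kCyclesPerDot + kAutoReadCycles;
    return static_cast<int>(end_cycle / kCyclesPerLine);
}
```

Under these assumptions the read finishes 3 lines after it starts for both ends of the start window, so for a 225-line frame it would begin on V=225 and be done around V=228, well before line 241.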

So, the joypad is obviously read within the vblank period, which means that the frame must be ready by then. This means that there must be a possibility of exiting the main loop and outputting the frame before the polling takes place. My thinking here is that we should not need to modify joypad.cpp at all, just the emulator exit point. I actually don’t really think you were so wrong in your first attempt… I have thought about this during the past two days (I’ve been away from home, so couldn’t test until now) and just tried to modify system.cpp line 285 from this:

if(cpu.vcounter() == 241) scheduler.exit(Scheduler::ExitReason::FrameEvent);

to this:

if(cpu.vcounter() == (ppu.overscan() == false ? 225 : 240)) scheduler.exit(Scheduler::ExitReason::FrameEvent);

I have done two quick tests with this code:

  1. Yoshi’s Island emulator lag: down from 3 frames to 2 frames (yay!).
  2. R.P.M. Racing: Does NOT detect a false “Start” press when pressing up/down in the menu.

The code above outputs the frame during the very first line of vblank. From what I read, the actual visible frame starts at V=0 and ends at either V=224 or V=239 (i.e. 225 or 240 lines), so it should be safe to end the main loop and output the rendered frame at V=225 or V=240. I don’t know why byuu chose to output at V=241, but this causes the frame to be output just after the polling event, which in turn causes the extra frame of input lag.
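The relation between the line counts and the exit line can be written down as a few small helpers. This is only a sketch of the arithmetic in this post: the function names are made up, and the 225/240 and 241 values are the ones quoted in this thread, not something checked against the hardware.

```cpp
#include <cassert>

// Number of visible scanlines per this thread: V=0..224 or V=0..239.
inline int visible_lines(bool overscan) { return overscan ? 240 : 225; }

// Exit line of the modified check: the first line after the visible area.
inline int proposed_exit_line(bool overscan) { return visible_lines(overscan); }

// Exit line that is hard-coded in the stock core, per this thread.
inline int stock_exit_line() { return 241; }

// How many scanlines of vblank elapse before the frame is handed over.
inline int lines_into_vblank(int exit_line, bool overscan) {
    assert(exit_line >= visible_lines(overscan));
    return exit_line - visible_lines(overscan);
}
```

For a 225-line game the stock exit lands 16 lines into vblank, which is after the joypad read has started, while the modified exit lands 0 lines in, right on the first vblank line. For a 240-line game the numbers are 1 and 0.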

Could you please try this code and see how it works? Also, to test input lag differences between the stock core and the modified one, please use the frame advance method, i.e. press ‘p’ to pause emulation, press and hold button on controller while pressing ‘k’ repeatedly to run the emulator loop and count how many presses it takes until you see a reaction on screen.

[QUOTE=Tatsuya79;41613]I think I found it.

The problem comes from mode 5 in hires. I disabled the fix for this case; that’s just some menus as far as I could see.

I hope other hires games got their input lag shortened (I think they did, judging by Super Pang, but it would need a camera test). edit: I think Kirby still lags and so is probably using mode 5 in a particular way (it didn’t register bad controls before). / or this game has a big input lag by default and the fix works, not sure.

Changes here.

Download:

bsnes_mercury_balanced_libretro (Brunnis test input lag fix) win x64[/QUOTE]

Hi,

Thanks for all the investigation, very interesting.

BSNES Mercury Balanced is my favourite SNES core latency wise, so I was anxious to try this out and provide some feedback.

I tried it with Super SWIV (FirePower 2000 in USA), a game I know by heart (I also have it on my real SNES for comparison), but the above modified core makes latency worse for me actually. So prefer stock core still.

[QUOTE=Brunnis;41625]Thanks for all your work, Tatsuya79! However, I think we might have gone down the wrong path. I skimmed the emu links you posted (thanks!), regarding input handling. At one of the links, the following can be read: [snip] Could you please try this code and see how it works? Also, to test input lag differences between the stock core and the modified one, please use the frame advance method, i.e. press ‘p’ to pause emulation, press and hold button on controller while pressing ‘k’ repeatedly to run the emulator loop and count how many presses it takes until you see a reaction on screen.[/QUOTE]

If it’s not too much trouble please provide us with the modified DLL also, as then we can get as much “outside testing” as well and provide you with feedback.

How does this all tie in with the “Poll Type Behaviour” setting in Retroarch, btw? This can be found under settings->input->poll type behaviour and can be set to “early”, “normal” or “late”.

Thanks for the pause method, that’s great, forgot to ask about that when you mentioned it before.

So with that scheduler change only:

Super Aleste: 2 frames vs 3 before
Kirby’s Dreamland 3: 3 vs 4 (my previous method was working, I know it now! yeah who cares… lol)
Super Mario World: 3 vs 4
Super Pang: 2 vs 3
F-Zero: 2 vs 3

No bad registration in RPM racing or Seiken Densetsu.

You’re the boss. :slight_smile:

Glad I linked those technical docs (that didn’t make much sense for me).

Here is the new download link:

bsnes_mercury_balanced_libretro (Brunnis test input lag fix) win x64

Beware as I have a testing method now. :slight_smile:

Super Swiv (Japan): 4 frames on bsnes-mercury balanced default core / 3 frames with my previous modified core / 3 frames with the new method Brunnis just gave above

Late is the best setting for fast response. You should gain a bit from each; at the emulator (core) level and with Retroarch timings.

Wow, I can’t thank you enough for your contribution, Brunnis! I can confirm that my results are in line with what Tatsuya79 is reporting: there is a consistent reduction of 1 frame of latency across all games I have tested with the pause-method.

All games were tested on a Win7 x64 build, with Hard GPU Sync ON, Exclusive Fullscreen and V-Sync ON:

Super Mario World: 3 frames with the latest fix vs. 4 frames with the original bsnes-mercury core
Super Metroid: 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
Akumajou Dracula (Japanese version of Super Castlevania IV): 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
Aladdin: 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
Donkey Kong Country: 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
The Legend of Zelda ALttP: 2 frames with the latest fix vs. 3 frames with the original bsnes-mercury core
Rockman X (Japanese version of Megaman X): 3 frames with the latest fix vs. 4 frames with the original bsnes-mercury core

The only interesting case was Super Bomberman, which features 2 frames of latency on both versions regardless of any fix, although I suspect that Brunnis’ fix is a bit less latent than 2 full frames.

Anyway, this got me thinking. So far we can say that Brunnis’ fix brings the SNES cores to an equal level of responsiveness if compared to other cores (I’m thinking of mgba, which is astounding in that regard). The lowest amount of input lag that we’re getting is of 2 frames, but I wonder if something can be done on a Retroarch-wide level now, maybe with the OpenGL video driver (some swapchain / triple buffering modification?) in order to achieve true next-frame latency…

Twinaphex merged the snes9x-next PR and made the same changes to snes9x mainline. I opened a PR for bsnes-mercury but it may not go anywhere, since it breaks on mid-frame overscan changes. I don’t think any commercial games actually do that but some demos do, IIRC, and bsnes is more accuracy-focused than snes9x.

Could it be implemented as a Core Option feature maybe? That way the priority of accuracy would still be preserved in the code, while allowing the user to manually activate this lag fix whenever playing normal commercial games.

Hunterk, does that mean we can use the patch on the Android version of Snes9x-next?

@Tromzy Yeah, just fetch the newest one from the online updater and it should have it.

@Geomancer Possibly. It’s hard to make core options out of some things, but I’ll look into it.