An input lag investigation

[QUOTE=Tatsuya79;41490]Think I found a way:

bsnes_mercury_balanced_libretro (Brunnis test input lag fix)

Tell me if that works for you as sometimes I get tired and I’m not sure of my eyes any more!

That’s one line change.

edit: and it’s causing issues. :frowning: Random button activation can happen or worse depending on games.[/QUOTE] I actually had a look as well. To be honest, it doesn’t look like bsnes has exactly the same issue as snes9x. Here’s the code that exits the main loop:

if(cpu.vcounter() == 241) scheduler.exit(Scheduler::ExitReason::FrameEvent);

It suggests that bsnes actually does exit the main loop right after (or close to right after) a frame has been generated. Snes9x exited at the last line of the vblank interval. The line of code you modified suggests that bsnes polls right before a frame is generated (at V=240), before exiting:

if(vcounter() >= (ppu.overscan() == false ? 225 : 240)) {

The polling seems to be made repeatedly, once every 256 cycles of the SNES CPU, though. The strange thing is that I haven’t been able to get any improvement OR regression by moving around the polling to different places within the frame… The fact remains, though, that bsnes reacts 1 frame slower than snes9x with my fix. Maybe bsnes actually reacts like real hardware and snes9x with my fix is “too fast”? :stuck_out_tongue: Sounds a bit far fetched…

I’ll try to measure this somehow later on, but no, neither standalone bsnes/higan nor its libretro core appear to have the same input responsiveness as real hardware.

Okay, thanks.

And I agree. I just realized the likely reason for why the change doesn’t have any effect on bsnes. It’s one of the things that byuu alluded to in my thread over at his forum:

  • bsnes always exits the main loop after rendering line 240, no matter if the game has 225 or 240 visible lines.
  • Many (most) games have just 225 lines.
  • bsnes polls input from line 225 or line 240 depending on how many visible lines the game in question has.

So, for games with 225 lines, the following might happen:

  1. Frame is ready at line 225, but it’s not output yet.
  2. Still at line 225, the emulator polls the system for input.
  3. The game reads the input.
  4. The emulator continues to run until line 240 and outputs the frame that was generated in step 1.

We will then have to wait until the next run of the main loop before the emulator actually runs the remaining part of the vblank interval (where game logic is run) and outputs the frame. We have introduced a full 16.67 ms delay. Snes9x with my fix doesn’t have this problem. It exits the main loop at the very moment a frame has been finished, be it at line 225 or line 240.
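To make the sequence above concrete, here is a toy model of it (not bsnes code). It assumes a 262-line NTSC frame, a single poll per frame at one line, and a game that draws its reaction in the first frame rendered after the poll. The names `Timing`, `runsUntilReaction` and `kLinesPerFrame` are made up for this sketch.

```cpp
// Toy model only: one poll per frame, reaction drawn in the next rendered frame.
struct Timing {
    int pollLine;  // line at which the emulated console latches input
    int exitLine;  // line at which the main loop exits and the frame is output
};

constexpr int kLinesPerFrame = 262;  // NTSC line count, an assumption of this sketch

// Number of main-loop runs until the output frame shows a reaction,
// assuming the button is pressed (and held) just before the first run.
int runsUntilReaction(Timing t) {
    int line = t.exitLine;          // each run resumes where the previous one exited
    bool inputLatched = false;      // has the game seen the button yet?
    bool frameHasReaction = false;  // does the frame being drawn show the reaction?
    for (int run = 1; run <= 10; ++run) {
        do {
            if (line == 0) frameHasReaction = inputLatched;  // new frame starts rendering
            if (line == t.pollLine) inputLatched = true;     // poll sees the held button
            line = (line + 1) % kLinesPerFrame;
        } while (line != t.exitLine);                        // exit before running exitLine
        if (frameHasReaction) return run;
    }
    return -1;  // not reached with sane timings
}
```

In this model, polling at 225 and exiting at 241 gives `runsUntilReaction({225, 241}) == 2`, while exiting at 225 (so the poll is the first thing the next run does) gives `runsUntilReaction({225, 225}) == 1`. That is the one-frame difference described above; real games add their own internal lag on top.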

So, what we should try to do is to rewrite bsnes to exit the main loop at different points depending on if the game has 225 or 240 scanlines. Anyone up for the challenge? :slight_smile:

EDIT: This is what byuu wrote:

HOWEVER, the SNES gets a bit shafted here. The SNES can run with or without overscan. And you can even toggle this setting mid-frame. So I have to output the frame at V=241,H=0. Whereas input is polled at V=225,H=256+.

I’m still not sure why we couldn’t vary the exit point dynamically, depending on game, if overscan is used or not, etc. We should at least try and see what happens.

Yeah, what I did was random and not working. Too much stuff I don’t understand there, can’t do much I think. :confused:

Made a pull request for Next:

Will be easier to review as I added just the new lines for clarity.

Thanks for this, it really explains a lot. Thanks especially for the modified snes9x-next :slight_smile: Any chance of getting a similar tweak for Nestopia? Or is that an entirely different case?

Can’t rewrite bsnes but I can get rid of the overscan in bsnes-mercury\sfc\system\system.cpp line 285:

if(cpu.vcounter() == 241) scheduler.exit(Scheduler::ExitReason::FrameEvent);

force 225 to test it.

(I think that’s slightly faster but not as much as snes9x next)

edit: between slightly and none… …none?

Built your updated snes9x-next-libretro binary with the input lag “fix” for 32-bit Windows. I haven’t run objective tests, but I can’t discern any input lag in this snes emulator with the latest 32-bit Retroarch at default settings. I also tested in d3d mode, but have not re-read the thread to determine any effect of the fix among different video modes in Windows (sdl2, d3d, gl).

[QUOTE=Tatsuya79;41525]Made a pull request for Next:

Will be easier to review as I added just the new lines for clarity.[/QUOTE] Great, thanks! Looks like it’s going to be committed. :slight_smile:

[QUOTE=spinningacorn;41536]Thanks for this, it really explains a lot. Thanks especially for the modified snes9x-next :slight_smile: Any chance of getting a similar tweak for Nestopia? Or is that an entirely different case?[/QUOTE] You’re welcome! I’m planning to give Nestopia a go next week.

[QUOTE=Tatsuya79;41540]Can’t rewrite bsnes but I can get rid of the overscan in bsnes-mercury\sfc\system\system.cpp line 285:

if(cpu.vcounter() == 241) scheduler.exit(Scheduler::ExitReason::FrameEvent);

force 225 to test it.

(I think that’s slightly faster but not as much as snes9x next)

edit: between slightly and none… …none?[/QUOTE] Actually, you’re almost there. I just looked some more at the code and finally managed to achieve the same 1 frame reduction in input lag. Here’s how:

bsnes-mercury\sfc\system\system.cpp line 285:

if(cpu.vcounter() == 226) scheduler.exit(Scheduler::ExitReason::FrameEvent);

bsnes-mercury\sfc\cpu\timing\joypad.cpp line 5:

if(vcounter() >= (ppu.overscan() == false ? 227 : 242)) {

This hack makes it so that the frame is always output after line 225. The emulator loop then exits. The next time the main loop is called, it enters at line 227 and proceeds to poll input and finally render the frame. The reason it didn’t work for you is that the emulator has already polled just before exiting the loop. Give this a try and see what you think.

Important: This is just a hack! Always outputting the frame at line 226 will not work in all cases, since some games have 240 lines. The interesting thing here is that snes9x seems to handle this by determining if the game renders 225 or 240 scanlines and outputs the frame as soon as it’s ready. That’s what made the fix so easy to implement on snes9x, i.e. it already output the frame at the correct place, we just needed to make sure it also left the main loop right after that point.

So, the question is: why doesn’t bsnes detect this in the same way? As it is now, it’s hard coded to exit at line 241 and output the frame. Maybe this has something to do with how overscan needs to be handled for maximum accuracy and snes9x foregoes such accuracy? Then again, byuu actually uses a check for overscan in the polling code (see second code block above)… Why not just add the following code to the first code block above and be done with it:

if(cpu.vcounter() == (ppu.overscan() == false ? 226 : 241)) scheduler.exit(Scheduler::ExitReason::FrameEvent);

I could see this as having a side effect if the game changes the overscan value between line 226 and 241. Then we’d get another main loop exit when we reach line 241, which is not what we want. I don’t know if that can happen, though. I’ll compile it anyway and give it a go. Any particular games you recommend to test the 225 vs 240 scanline handling?
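A rough sketch of that side effect and one possible guard, again as a toy model and not bsnes code. `FrameExitGuard` and `countExits` are hypothetical names; the model walks the 262 lines of one frame and lets overscan switch on at a chosen line.

```cpp
// Hypothetical helper: allows at most one main-loop exit per emulated frame.
struct FrameExitGuard {
    bool exitedThisFrame = false;

    // Exit line from the one-liner above: 226 without overscan, 241 with.
    static int exitLine(bool overscan) { return overscan ? 241 : 226; }

    // Call once per scanline; returns true when the main loop should exit.
    bool shouldExit(int vcounter, bool overscan) {
        if (vcounter == 0) exitedThisFrame = false;  // a new frame begins
        if (exitedThisFrame) return false;           // already exited this frame
        if (vcounter == exitLine(overscan)) {
            exitedThisFrame = true;
            return true;
        }
        return false;
    }
};

// Counts main-loop exits in one 262-line frame when overscan is switched on
// at toggleOnLine (pass -1 for "never toggled").
int countExits(int toggleOnLine, bool guarded) {
    FrameExitGuard guard;
    bool overscan = false;
    int exits = 0;
    for (int v = 0; v < 262; ++v) {
        if (v == toggleOnLine) overscan = true;
        bool exitNow = guarded ? guard.shouldExit(v, overscan)
                               : v == FrameExitGuard::exitLine(overscan);
        if (exitNow) ++exits;
    }
    return exits;
}
```

With the plain check, `countExits(230, false)` gives 2 exits in one frame; with the guard, `countExits(230, true)` gives 1. The guard does not cover the reverse case (overscan switched off after line 226), where this model would never reach an exit line at all, so it is only a starting point for experiments.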

Good to hear. The fix will be equally effective, regardless of video mode.

Nice gonna test that! :slight_smile:

edit: Yes that’s obviously better! Nice work! Gonna use this as I use crop overscan anyway for the SNES (if I remember right there’s rarely interesting stuff there unlike on the Pc-Engine).

Did you measure the input lag against snes9X or Next?

[QUOTE=Brunnis;41566] So, the question is: why doesn’t bsnes detect this in the same way?[/QUOTE] Blind guess: this is how the real SNES works. I can understand that as that’s what made bsnes so accurate.

[QUOTE=Tatsuya79;41568]Nice gonna test that! :slight_smile:

edit: Yes that’s obviously better! Nice work! Gonna use this as I use crop overscan anyway for the SNES (if I remember right there’s rarely interesting stuff there unlike on the Pc-Engine). [/QUOTE]

Would it be possible to have a DLL of Bsnes-mercury-balanced modified with this new code? Unfortunately I don’t have the means to compile it myself right now.

Thanks in advance, much appreciated. :slight_smile:

[QUOTE=Geomancer;41569]Would it be possible to have a DLL of Bsnes-mercury-balanced modified with this new code? [/QUOTE]

Here it is:

edit: updated in a further post.

I get some wrong input commands in R.P.M Racing (512x448 game) in the 2nd menu (green one). I just push up and down and it registers START and launches the game.

[QUOTE=Tatsuya79;41574]I get some wrong input command in R.P.M Racing (512x448 game) in the 2nd menu (green one). I just push up and down and it registers START and launch the game.[/QUOTE] Interesting. Does it work with the modified snes9x-next?

EDIT: Modified snes9x-next appears fine. The change to bsnes probably mucks up timing in a bad way. Interesting that it reads the wrong button, though.

But I played a standard game like Super Aleste for a while without issue. So it’s perhaps just some high resolution modes? (just 512x448 and RPM racing if we’re lucky)

About games with overscan, no clue what to test. I tried several random games, none had any valid overscan lines.

Found 1: Dragon Quest V.

Yes… annoying game for testing lol. Anyway the overscan is displayed but with redraws (edit: I applied your 227 : 242 case in joypad.cpp and that fixed it). The input is laggy in this game, so I can’t say anything.

Need to find something better.

Seiken Densetsu 3 has wrong button registration like RPM racing. In game menu and during game in character menu (start button) too. It’s supposedly a 512x224 game. (resolution seems to change when displaying japanese characters)

Ranma 1-2 - Chounai Gekitou Hen (Japan) / Street Combat (USA) (I forgot about that abomination…) are interlaced at 256x448 and the controls are working. Is it the 512 horizontal mode that causes problem?

Kirby’s Dream Land 3 has some 512 wide parts and works OK… So no clue.

Super Pang is 512 and OK.

Jurassic Park does some bad button registration during gameplay.


So, probably some display modes?

(In bsnes-mercury\sfc\alt\ppu-balanced\render\ I saw stuff in render.cpp and at the end of line.cpp that could be relevant.) More important probably : bsnes-mercury\sfc\system\video.cpp

Some documentation. And more.

Only stupid workaround I could find atm in bsnes-mercury\sfc\cpu\timing\joypad.cpp:

if((ppu.hires() && vcounter() >= (ppu.overscan() == false ? 225 : 241)) || vcounter() >= (ppu.overscan() == false ? 227 : 242))

No more input problem in hires mode but… play Kirby’s Dream Land 3 and cry each time you enter an hires level as the input lag comes back.
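For readability, the condition above can be split into a "first polling line" lookup. This is only a sketch with plain parameters instead of the real `ppu`/`vcounter()` calls; `pollStartLine`, `shouldPoll`, `shouldPollOriginal` and `formsAgree` are names made up here. The line numbers (including 241 for hires with overscan) are taken unchanged from the workaround.

```cpp
// First line at which polling is allowed, per the workaround above.
int pollStartLine(bool hires, bool overscan) {
    if (hires) return overscan ? 241 : 225;  // hires: poll early again, no lag fix
    return overscan ? 242 : 227;             // otherwise: poll after the new exit point
}

// Readable form of the workaround condition.
bool shouldPoll(bool hires, bool overscan, int vcounter) {
    return vcounter >= pollStartLine(hires, overscan);
}

// The condition exactly as written in the post, with plain parameters.
bool shouldPollOriginal(bool hires, bool overscan, int vcounter) {
    return (hires && vcounter >= (overscan == false ? 225 : 241))
        || vcounter >= (overscan == false ? 227 : 242);
}

// Checks that both forms agree for every line of a 262-line frame.
bool formsAgree() {
    for (int h = 0; h < 2; ++h)
        for (int o = 0; o < 2; ++o)
            for (int v = 0; v < 262; ++v)
                if (shouldPoll(h, o, v) != shouldPollOriginal(h, o, v)) return false;
    return true;
}
```

Written this way it is easier to see why the lag comes back in hires levels: with `hires` set, polling starts at line 225 again, before the modified exit point.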

I think I found it.

The problem comes from mode 5 in hires. I disabled the fix for this case; that’s just some menus as far as I could see.

I hope other hires games got their input lag shortened (I think it does, trying Super Pang, but it would need a camera test). edit: I think Kirby still lags and so is probably using mode 5 in a particular way (it didn’t register bad controls before). / or this game has a big input lag by default and it works, not sure.

Download:

edit: better version next page

Thanks for all your work, Tatsuya79! However, I think we might have gone down the wrong path. I skimmed the emu links you posted (thanks!), regarding input handling. At one of the links, the following can be read:

When enabled, the SNES will read 16 bits from each of the 4 controller port data lines into registers $4218-f. This begins between dots 32.5 and 95.5 of the first V-Blank scanline, and ends 4224 master cycles later.
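To put the quoted numbers in scanline terms, here is a small back-of-the-envelope calculation. The 4224 master cycles and the 32.5/95.5 dot range come from the quote; the 4 master cycles per dot and 1364 master cycles per scanline are general SNES timing figures assumed here (a few dots and lines differ slightly on real hardware), and the constant and function names are made up for this sketch.

```cpp
constexpr double kMasterCyclesPerDot = 4.0;  // assumption: typical dot length
constexpr int kMasterCyclesPerLine = 1364;   // assumption: typical scanline length
constexpr int kAutoReadCycles = 4224;        // from the quoted documentation

// Whole scanlines after the first vblank line at which the auto-read finishes,
// given the dot on that first line where the read starts.
constexpr int autoReadEndLineOffset(double startDot) {
    return static_cast<int>(
        (startDot * kMasterCyclesPerDot + kAutoReadCycles) / kMasterCyclesPerLine);
}
```

Under these assumptions both ends of the start range give an offset of 3, i.e. the auto-read would finish on roughly the fourth vblank line (around V=228 without overscan, V=243 with it).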

So, the joypad is obviously read within the vblank period, which means that the frame must be ready by then. This means that there must be a possibility of exiting the main loop and outputting the frame before the polling takes place. My thinking here is that we should not need to modify joypad.cpp at all, just the emulator exit point. I actually don’t really think you were so wrong in your first attempt… I have thought about this during the past two days (I’ve been away from home, so couldn’t test until now) and just tried to modify system.cpp line 285 from this:

if(cpu.vcounter() == 241) scheduler.exit(Scheduler::ExitReason::FrameEvent);

to this:

if(cpu.vcounter() == (ppu.overscan() == false ? 225 : 240)) scheduler.exit(Scheduler::ExitReason::FrameEvent);

I have done two quick tests with this code:

  1. Yoshi’s Island emulator lag: down from 3 frames to 2 frames (yay!).
  2. R.P.M. Racing: Does NOT detect a false “Start” press when pressing up/down in the menu.

The code above outputs the frame during the very first line of vblank. From what I read, the actual visible frame starts at V=0 and ends at either V=224 or V=239 (i.e. 225 or 240 lines), so it should be safe to end the main loop and output the rendered frame at V=225 or V=240. I don’t know why byuu chose to output at V=241, but this causes the frame to be output just after the polling event, which in turn causes the extra frame of input lag.

Could you please try this code and see how it works? Also, to test input lag differences between the stock core and the modified one, please use the frame advance method, i.e. press ‘p’ to pause emulation, press and hold button on controller while pressing ‘k’ repeatedly to run the emulator loop and count how many presses it takes until you see a reaction on screen.

[QUOTE=Tatsuya79;41613]I think I found it.

The problem comes from mode 5 in hires. I disabled the fix for this case; that’s just some menus as far as I could see.

I hope other hires games got their input lag shortened (I think it does, trying Super Pang, but it would need a camera test). edit: I think Kirby still lags and so is probably using mode 5 in a particular way (it didn’t register bad controls before). / or this game has a big input lag by default and it works, not sure.

Changes here.

Download:

bsnes_mercury_balanced_libretro (Brunnis test input lag fix) win x64[/QUOTE]

Hi,

Thanks for all the investigation, very interesting.

BSNES Mercury Balanced is my favourite SNES core latency wise, so I was anxious to try this out and provide some feedback.

I tried it with Super SWIV (FirePower 2000 in USA), a game I know by heart (I also have it on my real SNES for comparison), but the above modified core actually makes latency worse for me. So I still prefer the stock core.

[QUOTE=Brunnis;41625]Thanks for all your work, Tatsuya79! However, I think we might have gone down the wrong path. I skimmed the emu links you posted (thanks!), regarding input handling. At one of the links, the following can be read: [snip] Could you please try this code and see how it works? Also, to test input lag differences between the stock core and the modified one, please use the frame advance method, i.e. press ‘p’ to pause emulation, press and hold button on controller while pressing ‘k’ repeatedly to run the emulator loop and count how many presses it takes until you see a reaction on screen.[/QUOTE]

If it’s not too much trouble please provide us with the modified DLL also, as then we can get as much “outside testing” as well and provide you with feedback.

How does this all tie in with the “Poll Type Behaviour” setting in RetroArch, btw? This can be found under settings->input->poll type behaviour and can be set to “early”, “normal” or “late”.