An input lag investigation

Concerning RetroArch, are these results similar in other cores?

Do bsnes or Snes9x (a recent version) show different results, for instance?

There are some cores that suffer from high input lag, like Mednafen Saturn, so I would like to find the fastest core for each system.

I wonder what kind of modest hardware setup would allow for the ideal lag-reduction settings minus video_frame_delay=14, given that frame delay is the most expensive of the lag-reduction techniques. The reason this would be the ideal setup in my mind is that it would allow Pixellate.cg or sharp-bilinear.cg to do some interpolation, like the SNES Mini does when non-integer scaling, while still being an affordable piece of hardware to put on a TV without sacrificing a gaming PC budget. Interpolation for non-integer 1080p that limits artifacting is second only to low input lag for me in terms of an enjoyable experience. What a blessing that feature is on the SNES Mini.

Now, I am not familiar with RetroPie, but I do know the “dispmanx” video driver is unique to the Raspberry Pi. I assume that on an x86-based PC there is no similar choice available, since you are “limited”(?) to the OpenGL renderer? A cheap PC would be even better if that meant a free OS.
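For reference, here is a sketch of the retroarch.cfg entries I’m talking about (the shader path is a placeholder and 14 is just the aggressive value from this thread, so treat it as an illustration rather than a recommendation):

```
video_driver = "gl"                  # "dispmanx" is Raspberry Pi only
video_frame_delay = "14"             # the most expensive lag-reduction setting
video_hard_sync = "true"             # hard GPU sync (GL)
video_hard_sync_frames = "0"
video_shader = "/path/to/sharp-bilinear.cgp"   # placeholder path
```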

Can someone do this type of testing for G-SYNC monitors?

I’m always getting conflicting information on what settings you should be using for G-SYNC.

At first, I was under the impression that you just flip v-sync off in RA and that none of the other settings, like hard GPU sync and video frame delay, matter anymore because they rely on v-sync. But then I was told otherwise, so I’m really not sure what the definitive settings are.

Does anyone know? Could we have some testing with G-SYNC? It would ideally be the best way to get the least input lag on a non-CRT monitor. I would test myself but I don’t have a high-speed camera.
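To make the confusion concrete, these are roughly the two setups I keep seeing suggested, written as retroarch.cfg entries (my paraphrase of the advice, not verified recommendations):

```
# Suggestion A: G-SYNC paces the frames, RA v-sync off
video_vsync = "false"

# Suggestion B: keep v-sync on so hard GPU sync and frame delay still apply
video_vsync = "true"
video_hard_sync = "true"
video_frame_delay = "8"    # example value
```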


Is there any chance we could get comparisons between the newly added D3D11/D3D12 drivers and OpenGL/Vulkan on Windows too? D3D11 gives me a much higher maximum FPS than Vulkan/GL on an Intel PC with an NVIDIA GPU.


The bsnes-* and bsnes-mercury-* cores, as well as the regular snes9x core, perform the same latency-wise. They’re more demanding than snes9x-2010, though, so it will be harder to use all the latency-reducing settings to full effect (primarily frame delay).

I’d like to, but I need to stop myself now. I keep coming back to this stuff because I find it interesting, but I really don’t have the time anymore. It’s pretty easy to do the testing for anyone that has a 240 FPS camera (that’s the minimum I’d use), though, so hopefully someone can step up and do it.

@Twinaphex On another forum I read that to achieve minimal latency with D3D11, a so-called “waitable swap chain” has to be implemented. Apparently it can reduce latency by up to a frame. See this link:

Reduce latency with DXGI 1.3 swap chains

Do you know whether this is used with the current D3D11 version on the buildbot?
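For anyone curious, the technique from the article boils down to something like the sketch below. This is a minimal illustration based on the DXGI 1.3 documentation, not RetroArch’s actual code; error handling is omitted and factory, device and hwnd are assumed to already exist:

```cpp
#include <d3d11.h>
#include <dxgi1_3.h>

// Create a flip-model swap chain with the waitable-object flag and return
// the handle the render loop should wait on each frame.
HANDLE create_waitable_swap_chain(IDXGIFactory2 *factory, ID3D11Device *device,
                                  HWND hwnd, IDXGISwapChain2 **out)
{
    DXGI_SWAP_CHAIN_DESC1 desc = {};
    desc.Format           = DXGI_FORMAT_B8G8R8A8_UNORM;
    desc.SampleDesc.Count = 1;
    desc.BufferUsage      = DXGI_USAGE_RENDER_TARGET_OUTPUT;
    desc.BufferCount      = 2;
    desc.SwapEffect       = DXGI_SWAP_EFFECT_FLIP_DISCARD;  // flip model is required
    desc.Flags            = DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT;

    IDXGISwapChain1 *sc1 = nullptr;
    factory->CreateSwapChainForHwnd(device, hwnd, &desc, nullptr, nullptr, &sc1);
    sc1->QueryInterface(IID_PPV_ARGS(out));
    sc1->Release();

    (*out)->SetMaximumFrameLatency(1);   // queue at most one frame
    return (*out)->GetFrameLatencyWaitableObject();
}

// Per frame: block until DXGI can accept a new frame *before* polling input
// and running the core, so the presented frame is built from the freshest input:
//
//   WaitForSingleObjectEx(waitable, 1000, TRUE);
//   /* poll input, emulate, render */
//   swap_chain->Present(1, 0);
```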

Thanks for another great test, Brunnis. I’m impressed by how thoroughly you run these tests and share the results. I have a SNES Mini myself, so it’s really good to know how it stacks up :smiley:

With regards to the D3D11 driver, it seems to put a lot less load on the system compared to GL. I have a small Atom setup which I can now run with CRT-Easymode-Halation using the D3D11 driver, whereas with GL it slows down to an unplayable level. Latency-wise it feels really good (much, much better than the old D3D9 driver). Not sure, though, if it’s fully on par with GL with hard GPU sync on.

Super interested to hear this, as I have a small Atom setup (original Compute Stick) and it would be nice to see performance improvements, since the latency fixes don’t all work due to the weak (but still better than the Raspberry Pi 3) chipset.

I wonder what kind of modest hardware setup would allow for the ideal lag-reduction settings minus video_frame_delay=14, given that frame delay is the most expensive of the lag-reduction techniques

The closest (laziest?) way I’ve found to gauge performance across systems is to search for CPUs or devices on Geekbench and compare single-core scores. Good enough to get you in the ballpark.

A Raspberry Pi 3 is just under 500. An NES/SNES Mini (Allwinner R16) is just above 300.

Today’s high-end CPUs come in around 5000.

Let’s say a score of 5000 is enough to let you use a frame delay of 14 on non-complex emulators (Snes9x? Sure. Higan? No. Moore’s Law failed us. Sorry). That leaves ~2.7 ms of the 16.7 ms frame to actually DO the computation for emulation.

In theory, a frame delay of 12 would then mean ~4.7 ms of time (nearly double) for computation.

A frame delay of 8 gives ~8.7 ms, roughly triple the budget of the maximum setting. In other words, about a third of the CPU power is needed: 5000 × 2.7 / 8.7 ≈ 1550 single-core score in Geekbench.

Of course, there’s lots of overhead and other work the OS needs to do, so this won’t be perfectly linear, not to mention the differing CPU needs of each emulator and console. There’s no magic static setting anyone can point to.
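To spell the arithmetic out, here’s a tiny worked example (assuming a 60 Hz display, so ~16.7 ms per frame, and treating the Geekbench scaling as linear even though it’s only approximately so):

```cpp
#include <cstdio>

int main(void)
{
    const double frame_ms = 1000.0 / 60.0;   // ~16.67 ms per frame at 60 Hz
    const int    delays[] = { 14, 12, 8, 0 };

    // video_frame_delay shaves its value (in ms) off the time left for
    // emulation and rendering within the frame.
    for (int d : delays)
        printf("frame delay %2d -> %5.2f ms left to emulate and render\n",
               d, frame_ms - d);
    return 0;
}
```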

@rafan - I have mentioned that bit about waitable swap chains to aliaspider. He might implement it either through hooking up the swap chain setting or the GPU hard sync setting, whichever of the two is best.

All these measurements are fantastic, but they lack the varying human factor.
I owned an NES back in the early 90s and played Sega at friends’ houses, but we quickly moved to a Pentium 133 MHz and never had another TV console until I acquired a PS2 in 2002, then went back to PC.

So from the late 90s until a year ago, it was all emulation for me, and I did not own a single “retro” console until 2017. Lately I have amassed several retro consoles (NES, SNES, Sega, N64, PS1, PS2), Everdrives, and a Sony CRT to compare with emulation.
Lag, once a non-issue and something I was completely oblivious to for almost 20 years, became something I very much notice, and it expresses itself in how well I play these games.

BUT, before I played these games on real hardware with zero lag, I was so well adapted to them on the emulator that I played like a “pro” without being bothered by the lag, or dying.
After spending some time playing these games on hardware and a CRT, the lag when going back to emulation is very noticeable indeed and clearly diminishes performance; it almost feels like playing inside a dream… if you know what I mean.

BUT (again), and here’s my point: the brain can adapt to the lag of emulation, compensate for the delay, and make you play like a pro again, completely removing the “in a dream” sensation when going back from hardware+CRT to emulation.
In my experience, this adaptation to lag takes maybe a day, or a couple of hours of attentive playing.

Of course, zero lag through the whole loop would be ideal in an emulator, but under 100 ms is totally adaptable, hence playable, thanks to our easily fooled brains. :slightly_smiling_face:

There are some games where you can’t really adapt to the lag if they depend on very fast reactions, like Punch-Out. Beating Mike Tyson with high input lag is near impossible. For the vast majority of games, though, 100 ms can be adapted to.

Only in very specific situations is this true. In an action game, the brain can’t see the future, know when an enemy is going to shoot at you, and have you react before it even happens. There’s no way in hell you’d get through a fast-paced game like Punch-Out, or the super-fast precision platforming of a game like Gimmick!, with input lag.

Only in very specific situations is this true.
There’s no way in hell you’d get through a fast-paced game like Punch-Out

You are right, but games that do not require super reflexes, like platformers, are definitely playable.

Modern games on the PS4, Xbox One, PC, etc. have much more lag than the CRT generation ever had. The games obviously are not the same and require less reflex timing than old NES, SNES, and Sega games, and the developers are very well aware of that.
Modern platformers, or the “new wave of indie 8-bit games”, are less challenging than games of the CRT generation, especially the NES, and also take into consideration the inherent lag of PC gaming on an LCD (or phone, Pi, mini-PC, whatever…).

Emulation will always have lag; that is the nature of it. Super-reflex games that require zero lag, as in competitions or speedruns, are still played on real hardware and a CRT.

Try to play guitar with a 100 ms delay… you’ll be kicked out of the band in no time.
The auditory system in the brain is orders of magnitude more sensitive to timing shifts than the visual system. Lucky us gamers. :smile:

That’s great, thanks.

@Brunnis I noticed something odd in the px68k core when using the single-step frame advance method. If you open the px68k internal emulator menu by pressing “F12” (where you can set FDD1, FDD2, etc.; see image below), pause (“p”), hold down or up, and do a single frame advance (“k”), the cursor in the menu moves within the same frame as the frame advance.

I have not encountered this behaviour with any other core (as far as I know, the quickest response elsewhere comes on the second frame advance, i.e. pressing “k” twice). **

Is this normal/expected behaviour, or is there something odd going on? Does it possibly have any implications for how we’ve been counting the delay for NES and SNES?

[screenshot: px68k internal “F12” menu]

** Note that once a game is running, the px68k core shows “normal” latencies of two or three presses of “k” before movement is shown. So it may be something odd in how the frame stepping interacts with the internal menu?
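For anyone wanting to reproduce this, I’m using the stock RetroArch hotkeys, which correspond to these retroarch.cfg entries (default bindings, as far as I know):

```
input_pause_toggle = "p"       # pause/unpause the core
input_frame_advance = "k"      # advance exactly one frame while paused
```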

That’s completely normal. It simply means the menu has next-frame response. The same can be seen in NES/SNES emulators if you run content that responds on the next frame. For example, try the menu in Mega Man 2: while the game takes two frames to respond during gameplay, it has single-frame response in the menu.

Did you try it?

I’m not getting the same results.

px68k in “F12” menu shown in my previous post:

[down+k] = cursor moves

mega man 2 menu on NES (mesen core):

[down+k] + [k] = cursor moves

One frame difference between the two. It would be great if you could take a closer look / try my example.

I haven’t had the time to test yet, but last time I tested Mega Man 2 in the menu it definitely only needed a single frame advance to show the result. Maybe the response depends on which menu view you’re in. I tested on the view with the blue background where you choose between going to the stage select screen or entering a password. Try pausing there, then pressing/holding the Start button and then ’k’. It should respond immediately.

I just did a quick test: Mega Man 2 takes two frames to respond on the stage select screen, BUT only one frame on the Start/Password select screen. Also, single frame response on the very first screen with text when you start up the game.

Thanks, I could replicate it now!

Hi, I’m a new user here. As far as I know, RetroArch and Canoe should both produce a response on the third frame after receiving input, since that’s how Super Mario World behaves even on a real console. We know for a fact that snes9x2010 doesn’t add any extra lag on top of this, so I think it’s pretty safe to assume that Canoe and RetroArch perform the same here.
