An input lag investigation

Some people have been testing display lag with an Arduino and a photodiode placed on a screen. (Example here.) The Arduino’s ADC can do about 10,000 samples a second which should be fast enough. It should be pretty easy to knock up a simple circuit. You could even have the Arduino detect a button press and start sampling the photodiode to detect a change and report back the time difference. You’d then only need an application running in an emulator that turns a known part of the screen light/dark for you to place the photodiode against.
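Something along these lines would probably do it (a rough, untested sketch only; the pin numbers and threshold are illustrative and depend on the actual photodiode/phototransistor circuit):

```cpp
// Rough sketch of the idea described above (untested, values illustrative):
// a button on digital pin 2 starts the measurement, then the ADC on A0 is
// polled until the light sensor reading changes by more than a threshold,
// and the elapsed time is reported over serial.

const int BUTTON_PIN = 2;    // button wired between pin 2 and GND
const int SENSOR_PIN = A0;   // output of the light sensor voltage divider
const int THRESHOLD  = 200;  // ADC counts; tune for your sensor and screen

void setup() {
  pinMode(BUTTON_PIN, INPUT_PULLUP);
  Serial.begin(115200);
}

void loop() {
  if (digitalRead(BUTTON_PIN) == LOW) {           // button pressed
    unsigned long t0 = micros();
    int baseline = analogRead(SENSOR_PIN);
    int level = baseline;
    // Wait for the patch of screen under the sensor to change brightness.
    while (abs(level - baseline) < THRESHOLD) {
      if (micros() - t0 > 2000000UL) break;       // give up after 2 s
      level = analogRead(SENSOR_PIN);
    }
    Serial.print("lag (ms): ");
    Serial.println((micros() - t0) / 1000.0);
    delay(500);                                   // crude debounce
  }
}
```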

It would certainly be a lot cheaper than obtaining an oscilloscope.

dave j

[QUOTE=dave j;37247]Some people have been testing display lag with an Arduino and a photodiode placed on a screen. (Example here.) The Arduino’s ADC can do about 10,000 samples a second which should be fast enough. It should be pretty easy to knock up a simple circuit. You could even have the Arduino detect a button press and start sampling the photodiode to detect a change and report back the time difference. You’d then only need an application running in an emulator that turns a known part of the screen light/dark for you to place the photodiode against.

It would certainly be a lot cheaper than obtaining an oscilloscope.

dave j[/QUOTE]

That seems like a pretty good idea for testing input lag in general, but it is a bit more limited in that you just get an overall “input lag”. Hunterk was referring to more granular testing of the causes of input lag.

Some of the testing I was planning to do requires the actual oscilloscope (specifically regarding fight stick processing latency) but the arduino/photodiode route is the same concept I used for my previous testing (I used a photoresistor and measured the voltage drop vs a button, but potato potahto) and it would be good enough for most of it. I’ll look into building one. Thanks, dave j :smiley:

I hadn’t realised you were measuring things that needed more accurate timing measurements. Still, if you can do most of it using an Arduino-based solution it might save you from spending all your money hiring oscilloscopes. :slight_smile:

I can shoot 120fps on my phone at lower res, so I might try to measure this with an in-line LED on my old Xbox 360 controller as it’s a mess anyway.

Far and away the best (in latency terms) SNES emulation I’ve had is in Windows 10 with Hard GPU Sync enabled. In my experience the SNES core makes little difference. Furthermore, and this is crucial, if you’re running an Nvidia GPU I highly recommend going into Nvidia Control Panel and reducing the maximum pre-rendered frames to 1. This makes an enormous difference.

Audio latency as low as you can go. It’s a shame there’s no ASIO support (although I completely understand why there isn’t). I’m pretty accustomed to 6-10ms roundtrip audio latency with work and this would be amazing, although in practice ~20ms will feel instantaneous under most circumstances. Generally though, Windows is much better at low latency audio than it used to be and 32ms should be possible on most systems without ASIO.

Using all of the above settings is pretty much the only way I find Super Mario World to be playable. For whatever reason this game is very sensitive to latency; it feels like around 50ms - say 3 frames total, including 1 frame of monitor latency. Manual testing in 240p Suite consistently shows either 0, 1/2 frame or 1 frame.

[QUOTE=markiemarcus;37260]Furthermore, and this is crucial, if you’re running an Nvidia GPU I highly recommend going into Nvidia Control Panel and reducing the maximum pre-rendered frames to 1.[/quote]Is that not what Hard GPU sync does?

[QUOTE=markiemarcus;37260]Audio latency as low as you can go. It’s a shame there’s no ASIO support (although I completely understand why there isn’t). I’m pretty accustomed to 6-10ms roundtrip audio latency with work and this would be amazing, although in practice ~20ms will feel instantaneous under most circumstances. Generally though, Windows is much better at low latency audio than it used to be and 32ms should be possible on most systems without ASIO.[/quote]I don’t understand why there isn’t any ASIO support. Many cores in RetroArch will have audio problems for me below 32ms on hardware that can use 1ms buffers via ASIO. That means the audio is 2 frames behind the video at all times.
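For what it’s worth, the “2 frames behind” figure is just the buffer length divided by the frame time (a quick illustration, assuming a 60 fps core):

```cpp
// Quick illustration (assumes a 60 fps core; values are examples only).
#include <cstdio>

int main() {
    const double frame_ms  = 1000.0 / 60.0;  // ~16.7 ms per emulated frame
    const double buffer_ms = 32.0;           // audio latency setting in ms
    printf("audio trails video by ~%.1f frames\n", buffer_ms / frame_ms);  // ~1.9
    return 0;
}
```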

We support JACK and you get those low buffers via Linux. It would probably be easier to extend that JACK support to Windows than to write an ASIO driver, but either would be welcome if someone wants to submit a PR.

“Is that not what Hard GPU sync does?”

I was under the impression that this is what it did too, but somehow the additional step was still required. Could very well be a recent driver quirk.

What about BFS-patched Linux kernels? Has anybody tried one of these with RetroArch and measured lag? BFS is a scheduler tailored specifically to reduce latency and improve the responsiveness of interactive tasks. It made a GREAT difference for me in the past on x86 for Wine games.

Also, what about realtime-patched Linux kernels? Should that make any difference in an ideal RetroArch environment? (Meaning no X-associated services and junk: RA directly on KMS/DRM, udev, ALSA.)

[QUOTE=vanfanel;37375]What about BFS-patched Linux kernels? Has anybody tried one of these with RetroArch and measured lag? The BFS is an scheduler tailored specifically to avoid latency and improve interactive tasks responsiveness. I did a GREAT difference for me in the past on X86 for Wine games.

Also, what about realtime patched Linux kernels? Should that make any difference in an ideal RetroArch enviroment? (meaning no X associated services and junk, RA directly on KMS/DRM, UDEV, ALSA).[/QUOTE]

I tried this with Lakka when they had KMS builds and it was great (around this time last year). Comfortably as good as a tweaked Windows RA, if not better. As I remember it, they stopped releasing them (falling back on X11) due to a bug that manifested itself with AMD/ATI cards. This was actually the best way of playing for me, but unfortunately the bug crept in and persisted for months so I dropped it. Haven’t checked it recently; will definitely look at it again in the next few days.

Edit: They fixed the AMD bug in November.

[QUOTE=markiemarcus;37381]I tried this with Lakka when they had KMS builds and it was great (around this time last year). Comfortably as a good as a tweaked Windows RA, if not better. As I remember it, they stopped releasing them (falling back on X11) due to a bug that manifested itself with AMD/ATI cards. This was actually the best way of playing for me, but unfortunately the bug crept in and persisted for months so I dropped it. Haven’t checked it recently; will definitely look at it again in the next few days.

Edit: They fixed the AMD bug in November. http://www.lakka.tv/articles/2015/11/26/new-major-version-released/[/QUOTE]

Did you try KMS, or BFS-patched or realtime-patched kernels? What do you mean?

Hi, I am interested in helping to optimize the input lag in RetroArch too. I have therefore tried RetroArch on several platforms.

The emulator functionality itself works great, but I have never been satisfied with the input lag.

I just bought a Skylake i5 NUC for the sole purpose of running Lakka and decreasing the latency by using KMS on integrated Intel graphics, but I’m still not pleased with the input lag, having made similar observations to the OP.

I measured NES and SNES input lag on different platforms using a 120fps iPhone camera. To compare my results with the OP’s, I have converted the measurements to 60fps frames.
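As an illustration of that conversion (the numbers below are just an example, not the actual script used): one 120fps camera frame is 1/120 s ≈ 8.3 ms, so two camera frames correspond to one 60 Hz display frame.

```cpp
// Example conversion from 120 fps camera frames to milliseconds and 60 Hz
// display frames (illustrative values only).
#include <cstdio>

int main() {
    const double camera_fps  = 120.0;
    const double display_fps = 60.0;
    const int camera_frames  = 11;  // e.g. 11 camera frames from press to reaction

    double lag_ms         = camera_frames * 1000.0 / camera_fps;       // ~91.7 ms
    double display_frames = camera_frames * display_fps / camera_fps;  // 5.5 frames
    printf("%.1f ms, i.e. %.1f frames at 60 Hz\n", lag_ms, display_frames);
    return 0;
}
```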

  • Windows 10, i7 920, ATI HD7850 desktop PC with RetroArch v1.3.2 using a wireless Xbox 360 pad -> Terrible input lag (most likely caused by the ATI card/drivers?). I gave up on using the desktop; it was always too much configuration before starting to game and the input lag was terrible. (SnesGT on this system, however, somehow managed to be much more responsive regarding input lag than RetroArch.)

  • Wii with Retroarch v1.3.2 and wii2hdmi converter and wiimote -> Nestopia = 4.5 frames, snes9x = 6 frames.

  • NUC6i5SYH with latest Lakka (10th of April 2016) using wireless x360 -> Nestopia = 5.5 frames, bsnes mercury balanced = 7.5.

Playing platformers on the SNES, you can really feel the latency. The only system I have tried where SNES platformers (like Super Mario World) were really playable was the Wii.

Is it expected to have less input lag running RetroArch on the Wii than on an i5 NUC with Lakka?

I had hoped Lakka on integrated Intel graphics would perform better than the Wii.

What can we do to investigate further?

KMS. Via Lakka. I’ve linked to it.

[QUOTE=larskj;37406]Hi, I am interested in helping to optimize the input lag in retroarch too. I have therefore tried retroarch on several platforms.

The emulator functionality itself works great, but I have never been satisfied with the input lag.

I just bought a skylake i5 NUC for the sole purpose of running lakka and to decrease the latency by using KMS with lakka on integrated intel graphics, but I’m still not pleased with the input lag having similar observations as OP.

I measured NES and SNES input lag on different platforms using a 120fps iphone camera. To compare my results with OP, I have converted the measurement to 60fps.

  • Windows 10, i7 920, ATI HD7850 desktop pc with Retroarch v1.3.2 using wireless xbox360 -> Terrible input lag (most likely caused by the ATI card / drivers?). I gave up on using the desktop, it was always too much configuration before starting to game and input lag terrible. (SnesGT on this system however somehow managed to be much more responsive regarding input lag than retroarch)

  • Wii with Retroarch v1.3.2 and wii2hdmi converter and wiimote -> Nestopia = 4.5 frames, snes9x = 6 frames.

  • NUC6i5SYH with latest Lakka (10th of April 2016) using wireless x360 -> Nestopia = 5.5 frames, bsnes mercury balanced = 7.5.

Playing platformers on the SNES you can really feel the latency. The only system I have tried where platformers on SNES were really playable (like super mario world) was the Wii.

Is it expected to have less input lag running retroarch on the wii than on a i5 NUC with Lakka?

I had hoped Lakka on integrated intel graphics would perform better than the wii.

What can we do to investigate further?[/QUOTE]

That’s disappointing. For reference, the machine I was running this on was pretty old (Core 2) but the experience was definitely better than this. Provided I wasn’t using shaders, performance was similar in SNES games between the Intel onboard graphics and the dedicated HD 4670. Not perfect, but definitely not 7 frames.

Edit: Wired 360 controller BTW. Edit 2: I will test this at some point this week with 120fps on my phone. Still have the old Lakka version installed on it so I’ll check that against the latest build in case there’s a problem there.

I can’t really tell if it is relevant or not, but I played around some more with Lakka settings on my i5 NUC…

If I disable vsync, disable audio sync and leave frame throttle at 0 (which I assume makes the game run as fast as possible), then I get very low input lag:

Nestopia: 4 frames (sometimes 3.5 frames). Bsnes mercury balanced: 4 frames (sometimes 4.5).

Of course it is useless because the game is running way too fast, but it is interesting to see how low the lag can be in this case.

What would be required for normal game speed with vsync disabled?

If I set frame throttle to 1.0x, then the game stutters/jerks a lot like it is constantly changing speed.

I don’t really see any tearing though.

EDIT: Basically the conclusion from the above is that the NES and SNES cores seem to have the same input lag when the game is running very fast, and not the usual 2-frame difference.

[QUOTE=larskj;37423]I can’t really tell if it is relevant or not, but I played around some more with Lakka settings on my i5 NUC…

if I disable vsync, disable audio sync and leave frame throttle at 0 (which makes the game run as fast as possible I assume) - then I get very low input lag:

Nestopia: 4 frames (sometimes 3.5 frames) Bsnes mercury balanced: 4 frames (sometimes 4.5)

Of course it is useless because the game is running way too fast, but it is interesting to see how low the lag can be in this case.

What would be required for normal game speed with vsync disabled?

If I set frame throttle to 1.0x, then the game stutters/jerks a lot like it is constantly changing speed.

I don’t really see any tearing though.

EDIT: Basically the conclusion from above is that the nes and snes core seem to have the same input lag when game is running very fast, and not the usual 2 frames difference.[/QUOTE]

I have done similar tests (although I haven’t compared to NES emulation yet) and also get very good input lag (better than you, even).

Base setup

  • i7-6700K
  • Radeon R9 390 (16.3.2 driver, OpenGL triple buffering enabled)
  • Windows 10 64-bit
  • RetroArch 1.3.2 64-bit + bsnes-mercury-balanced core + “Hard GPU Sync” setting enabled for all tests
  • CIRKA SNES-style USB gamepad
  • HP Z24i (all tests run at native 1920x1200 over DisplayPort)

Test scene:

Super Mario World 2: Yoshi’s Island, first stage. I don’t move at all from the starting position, just stand right there and press jump repeatedly.

Methodology:

Same as previously, i.e. filming the monitor and gamepad at 60 FPS with a Canon EOS 70D and analyzing the video frame by frame afterwards.

Test 1

  • vsync: on
  • audio sync: on

Result: 6 frames of input lag

Test 2

  • vsync: off
  • audio sync: on

Result: 5 frames of input lag

Test 3

  • vsync: off
  • audio sync: off (Note: this is what sets the framerate free, resulting in approximately 220 FPS during the test)

Result: 2-3 frames of input lag

So, turning off vsync seems to have a marginal effect on the input lag (a very consistent 1 frame improvement). However, turning off the audio sync (which is what sets the frame rate free) has a pretty profound effect, improving the input lag by at least 2 frames. During test 3, I never saw more than 3 frames of lag, and approximately 10 of the 30 jumps I did had only 2 frames of lag. The reason I never saw results as quick as this before is that I had only disabled vsync and not audio sync.

The results are interesting, but I’m not sure what we can do with them. One way or another, the syncing is pretty much the single biggest culprit when it comes to input lag (at least as long as the monitor/display is fast). Part of the difference we’re seeing (I believe approx. 8-12 ms) is due to the free-running emulator being able to update the front buffer directly. When vsync is active, the full frame must be complete before it’s scanned out to the display. Without vsync, the emulator can write to the front buffer just before Mario and Yoshi are to be drawn on the screen, saving us the time it otherwise takes to scan from the top of the screen down to Mario and Yoshi’s location.
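As a rough sanity check on the scanout part of that argument (the numbers below are illustrative assumptions: 60 Hz refresh, character drawn roughly halfway down a 1200-line panel):

```cpp
// Rough sanity check of the scanout argument (illustrative assumptions).
#include <cstdio>

int main() {
    const double frame_ms    = 1000.0 / 60.0;  // ~16.7 ms to scan out one frame
    const int    total_lines = 1200;           // vertical resolution of the panel
    const int    target_line = 600;            // character roughly halfway down

    // With vsync, the finished frame is scanned out from the top, so the
    // target line is only reached after everything above it has been sent.
    double delay_ms = frame_ms * target_line / total_lines;
    printf("scanout delay to the target line: %.1f ms\n", delay_ms);  // ~8.3 ms
    return 0;
}
```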

Another part of the difference (~12 ms?) can probably be explained by the free-running emulator being able to poll input much closer to when the frame is actually output to the screen.

EDIT: Personally, I’m very happy with the 4 frames of input lag I get with NES emulation. It’s probably not realistic to expect lower while maintaining vsync. I only wish for two things:

  • For SNES emulation to match this. Still not sure why it can’t.
  • For Linux input lag to match Windows.

If both of the above issues were “fixed”, it would be possible to get a total of 5-6 frames of input lag when using a dedicated Linux emulation machine on a low input lag TV. That’s a pretty far cry from the current situation with, for example, RetroPie, where 9-10 frames of input lag is to be expected even with a low input lag TV.

I don’t think that unlocked framerate tests are very accurate. At least not if you’re using a 60Hz display, and/or a 60 FPS camera.

If you’re rendering a 60 FPS game at 3x speed (180 FPS), then the time between pressing a button and seeing your action should automatically be a third of what it would be, because the game is running 3x faster. You’re still going to have 6 frames of latency; it just takes a third of the time because it’s running at 3x speed.

But since you’re displaying 180 FPS on a 60Hz display, 120 of those frames are being dropped every second. So the end result is that it appears to have saved 4 frames of latency when in reality the latency is the same; it’s just running faster. But you’d only be able to see that if you were running on a 180Hz display.
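Worked numbers for that argument (illustrative, using the 180 FPS example above):

```cpp
// The game still takes the same number of emulated frames to react; each
// emulated frame is simply shorter in wall-clock time when unthrottled.
#include <cstdio>

int main() {
    const int    lag_frames   = 6;       // emulated frames from input to reaction
    const double normal_fps   = 60.0;
    const double unlocked_fps = 180.0;   // ~3x speed

    printf("at %.0f fps: %.0f ms\n", normal_fps,   lag_frames * 1000.0 / normal_fps);    // 100 ms
    printf("at %.0f fps: %.1f ms\n", unlocked_fps, lag_frames * 1000.0 / unlocked_fps);  // 33.3 ms

    // On a 60 Hz display only every third rendered frame is shown, so 120 of
    // the 180 rendered frames per second never reach the screen.
    printf("frames dropped per second: %.0f\n", unlocked_fps - normal_fps);
    return 0;
}
```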

I would like to chime in too. Since using the latest version of RetroArch (1.3.2) I’m also experiencing massive input lag in various emulators. Now, I didn’t do any accurate research, it’s just a gut feeling, but a very strong one. I was using mednafen psx, and I’m absolutely sure there was no input lag I could feel in older versions, but now it feels like the input comes about half a second late.

Regarding inaccurate measurements with the free-running framerate: the measurements should not be treated as a count of the number of emulated frames before we see a response, but as the number of camera frames until we see a response, which is just a measure of the input lag in time, not in emulated frames.

As long as we realise that difference (and that what we are measuring is actually time, and not internally emulated frames), then I guess the measurements are as reliable as with vsync enabled.

I just thought it was interesting that there is such a big difference in reaction time, and it makes you wonder why it takes a couple of emulated frames (in the SNES case) to react to controller input that was clearly received “much” earlier timewise.

As I said in my post, game latency doesn’t change. If it takes 6 frames at 60 FPS, it will still take 6 frames at 180 FPS. It takes a third of the time, but the latency between the input and the game doesn’t change; it’s just running faster.

It’s not a useful test, and you’re introducing a lot of unreliability by rendering many more frames than your screen is capable of displaying.