An input lag investigation

[QUOTE=vanfanel;36635]I don’t know how current standalone MAME works, but in the past it used to have these horrible hiccups all emulators had due to refresh rate differences between the games and the physical screen. Some MAME hacks allowed running them at the physical screen refresh rate, but without a resampler that caused problems like wrong sound pitch, etc. I was tired of that crap, so I was only using FPGAs, but then I discovered Libretro/RetroArch and I found a new “home” :smiley: Also, I have cleared the ship part in Battletoads, the one with the incoming walls, in the Gambatte core, several times. Always on an X-less Raspbian with no unnecessary services running, on the Pi/Pi2 (GLES on Dispmanx rendering context, UDEV, ALSA). I haven’t noticed any more input lag than I would notice on a real Game Boy back in the day, and that part is playable for me.[/QUOTE] Yeah, I love RetroArch and I use it for almost everything nowadays, although I do still prefer standalone MAME. I think the people who bitch about emulation, bring up input lag, and say things like certain games are unplayable either don’t know how to set up their emulators or just have a hatred for emulation and are unwilling to admit that it could ever possibly be just as good as original hardware.

Hi! Well, the fact that some people do not seem to notice input lag doesn’t necessarily mean that it’s not an issue. Of course, I’m not referring to huge lag that is obvious and clearly due to bad hardware/software combinations or configurations. The biggest issue is subtle input lag (on the order of 1/10 of a second) that annoys some people (this is my case) and that others do not even notice, and which does not exist at those levels on the original hardware.

I ran this experiment with a friend. He never admitted there was any lag, whereas I found it completely obvious, and I couldn’t see any on the original hardware.

Just a very small update:

I filmed the screen while doing the 240p manual lag test in the bsnes-mercury-balanced core on RetroArch in Windows 10. I was short on time, so I just filmed the upper left part of the screen where the result is printed and counted the number of frames it took from pressing the button to seeing the result printed out. The answer is that it takes 4 frames (a few took 5 frames, but the majority were 4).

So, it appears to react slightly faster than the actual games I’ve tested so far, by about 1-2 frames, but not as fast as the manual lag test results initially indicated (between 1 and 2 frames of lag). I guess it shows how hard it actually is to not inadvertently “cheat” in this test.

It would be interesting to hear some theories as to why this small difference in lag between the test and actual games exists. Pixel response times should be quicker when testing with a black-and-white screen, but that difference should be on the order of single milliseconds, not 1-2 frames.

EDIT: It would be interesting to get an actual SNES emulator dev in here to shed some light on the situation. There should be a decent number of people who are familiar enough with the inner workings of SNES emulators to provide some hard facts regarding expected lag, yet these kinds of threads are always devoid of any such input. The result is endless speculation regarding what’s caused by the system and what’s caused by the emulator. A pity, really.

1 Like

Great to hear you are still investigating.

It seems that byuu of higan isn’t here. Maybe you could contact him directly on his forum: http://board.byuu.org/phpbb3/viewtopic.php?f=4&t=1037

Snes9x-Next is in @Twinaphex’s hands, I think?

Thanks for starting this topic. It’s 2016 and we still have input lag in emulators of 1990s-era consoles. If I can act entitled and rant a bit, this is the last major “compatibility” issue in modern emulation. Regardless of how accurate, fast, pretty, etc. an emulator is, it’s not a proper emulator unless it emulates input timing to a perfect or near-perfect resemblance of the original console. Jump timing in Super Mario Bros. is still impossible for me, and I have very decent hardware. Yes, I can adjust to the delay after a few deaths, but it is never the same intuitive input that existed on a real NES.

But the major issue that I think is misrepresented is that this is somehow caused by modern displays and controllers. Any such explanation is a red herring. The counterargument is that the Wii and Wii U Virtual Console emulate classic Super Mario Bros. and other games with near-perfect input. It is not my TV, because I have tried it connected by the exact same HDMI cable on both my PC and my Wii (though my PC is generally on VGA). The Wii handles input perfectly without any configuration, stuttering, screen tearing, or anything. (Perhaps the Wii is not perfect for a speedrunner, but I can definitely make my jumps!) For the sake of argument, one can push back against this position and say that it must be my TV, my controller, my settings, my imagination, etc., but many people know exactly what I’m talking about here, and it is an issue with EMULATION in general. I wish I could do more to solve it, and if there’s anything I could do, I would. I know that byuu has introduced extensive sync options into higan to address this issue, and though I haven’t tried them yet, perhaps that is a step forward.

In any case, the fact that the Wii handles things perfectly shows us the gold standard of what emulation should be capable of achieving in 2016. If Nintendo can do it, then the emulation community, which includes some of the most talented programmers in the world, ought to be able to do it. Effortless, perfect input emulation should not be just a pipe dream (though I would put in a ton of effort if that’s what it took).

Of course, the reason I’m posting here is because I LOVE RetroArch. I mean, I “love love” RetroArch. And I wish it could be perfect. I greatly admire the effort that the RetroArch development team has put into it, and hope that this sort of feedback helps to create a better RetroArch. Hopefully this is something that can be solved. After all, as I stated, you are not truly emulating an old console in the strictest sense unless the timing is at least nearly perfectly accurate, which I have yet to experience. If your kids can’t get the same visceral rush of making Mario jump at just the right time, they are never truly “experiencing” the classics.

I’m preparing to try out Lakka in hopes that it will help (I’ve often heard Windows’ graphics implementation blamed for input lag, which is the only explanation that seems reasonable to me), then perhaps following up with a barebones Linux distro if that doesn’t work (I have no idea about the state of Lakka’s progress).

1 Like

Latency is caused by a lot of things. Displays are a big one but not the only one by far. USB polling rates, kernel queues and waits, image buffering and so on all add their little bits of latency. You could get much lower latency running a DOS system, which has direct and exclusive access to hardware, but you would also lose USB and background processes and many of the other things that we’ve come to take for granted in modern computing.

Higan has worse latency than most alternatives (on Windows, at least) because of its lack of exclusive fullscreen mode.

Displays and input polling are certainly part of it. However, the average time to detect an input over USB is only 4 ms (half of the polling period). The average time until the next sampling of the input by the emulator is (I presume) 8.33 ms (half a frame). That’s 12.33 ms in total. Then you have the input latency of the display, which apparently can be as low as 1 ms, but also easily a couple of frames for slower models. My monitor is apparently one of the fast ones, with detectable changes on screen being measurable 0.8 ms after input.

So, for my testing, input and display can only account for approximately one frame of input lag, while the testing I’ve done on SNES emulators suggests 4-6 frames of input lag. In other words, we have 3-5 frames to account for. I’m not doubting that the OS, drivers and associated buffering might account for the lion’s share of that time. However, during my extensive googling, I’ve never once seen any information as to how big the emulator’s part is in this. For example, in my testing NES input lag was 4 frames, while SNES input lag running actual games was 6 frames. Obviously, SNES emulation adds a significant amount of lag that’s not part of the host system’s input/output lag. If the NES emulator in my test needs 1 frame between reading the input and producing the output frame, the SNES emulator actually needs 3 frames. 3 frames would be half of the total measured input lag for SNES emulation in my test, which is pretty significant.
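To make that budget concrete, here’s a little back-of-the-envelope script. The numbers are just the assumptions from this post (8 ms USB polling period, 60 Hz refresh, my monitor’s 0.8 ms response), not measurements of anyone else’s setup:

```python
# Rough input lag budget based on the figures above. All values are assumptions
# from this post, not measurements of any particular setup.

USB_POLL_MS = 8.0          # USB polling interval (125 Hz)
FRAME_MS = 1000.0 / 60.0   # one 60 Hz frame, ~16.67 ms
DISPLAY_MS = 0.8           # measured response time of my monitor

def expected_floor_ms():
    """Average latency before the emulator itself adds anything."""
    usb_wait = USB_POLL_MS / 2       # on average, half a polling period
    core_wait = FRAME_MS / 2         # half a frame until the core samples input
    return usb_wait + core_wait + DISPLAY_MS

def unaccounted_frames(measured_frames):
    """How many frames of a measured result the floor does NOT explain."""
    return measured_frames - expected_floor_ms() / FRAME_MS

floor = expected_floor_ms()
print(f"floor: {floor:.1f} ms (~{floor / FRAME_MS:.1f} frames)")
for frames in (4, 6):                # NES and SNES results from my testing
    print(f"{frames} measured frames -> {unaccounted_frames(frames):.1f} frames unaccounted for")
```

Running that gives a floor of roughly 13 ms (under one frame), which is where the 3-5 unaccounted frames above come from.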

If we knew what kind of lag different emulators incurred by design, we’d also have a much better understanding of the minimum expected input lag. But for some reason, that information is missing. The only info we’ve got is along the lines of “modern OSs and drivers incur lag” and “emulation inherently introduces lag”. What’s with the lack of cold hard facts on this? And, by the way, I’m not in any way worked up over this (I realize my post may give that impression), just really wondering about the lack of facts and seriously interested in getting to the bottom of it.

By the way, I’ve registered on byuu’s forum, but I haven’t had the time to put together a post/message yet. Interesting questions could be:

  • How many frames/milliseconds does the emulator itself add to the total input lag?
  • Is the emulator-induced input lag deterministic, or does it vary depending on what code/game it runs? [In my testing, the 240p test shows 4 frames of input lag when filmed with a camera, while actual games need 5-6 frames.]
  • What is it that makes SNES emulators (such as bsnes and snes9x) slower in terms of input lag than many other emulators (such as NES emulators)? SNES emulators seem to need at least 1-2 additional frames.
1 Like

First off: yes, SNES emulators appear to have ~2 frames of latency more than other emulators (you can compare results from the 240p test suite using bsnes/snes9x/zsnes with results from genplusgx/picodrive). This seemed to be the case in my own testing, though I didn’t know to specifically examine it at the time, so my data is light and inconclusive. I haven’t gotten to test emulators on a CRT nor have I gotten to test physical consoles, so I can’t say whether it’s something inherent to emulation of the SNES or something related to the SNES itself.

There’s currently no hard data on it because nobody has bothered to do proper testing. Cameras need to run at much higher than 60 fps to get any meaningful data, due to Nyquist (which also comes into play with the various polling rates involved in emulation latency discussions). Additionally, if you’re going with a high-speed camera, you need a gamepad with an LED wired in-line so the camera has an exact visual marker for actuation, as finger movement is mushy and inexact. I would be interested in doing more testing but I don’t have $400 to plunk down on an oscilloscope. I’ve considered doing a GoFundMe or something…
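To illustrate the Nyquist point with some quick arithmetic (purely illustrative; it assumes the only error is the frame quantization at each end of the measurement):

```python
# Each event (button actuation, first pixel change) is only known to within one
# camera frame, so a slow camera's quantization error can swamp the thing you're
# trying to measure. Illustrative numbers only.

def worst_case_error_ms(camera_fps):
    # up to one camera frame of uncertainty at the input end plus one at the output end
    return 2 * 1000.0 / camera_fps

for fps in (60, 120, 240, 1000):
    err = worst_case_error_ms(fps)
    print(f"{fps:>4} fps camera -> up to {err:5.1f} ms "
          f"({err / (1000.0 / 60.0):.1f} 60 Hz frames) of error")
```

At 60 fps the worst case is two full frames of error, which is as big as the differences we’re trying to resolve.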

The lack of hard data isn’t confined to this latency discussion, though; it permeates the vast majority of latency discussions online. The fighting game community decides which fight stick to buy based on garbage voodoo data that they think is valid because they performed the same worthless comparison 1,000x.

I tried RetroArch from an Ubuntu live USB and it was much better. So, hunterk, is the reason the Wii is so much better that it has more direct access to the hardware? And presumably Linux is better than Windows for the same reason? Or could it be that the configuration is just harder to tune on Windows than on a default Linux setup?

I ran out of time to fool around last night, but I tried some circa-2014 CRT shaders on Linux and they ran miserably. I’m afraid I’ll have to choose between good input emulation and good CRT simulation.

As for the GoFundMe, I will gladly donate to investigate this issue. I would hope that it would ultimately produce some hard data that, even beyond the input lag question, would elevate the discourse above the conventional voodoo (i.e., teach them how to fish).

Direct access usually helps (Vulkan may provide some benefits here; time will tell as the drivers improve), as does a system that’s built around low-latency communication (buffering is kept to a minimum, resources are focused on synchronizing events, etc.). Calamity told me that they regularly get “next-frame” latency with GroovyMAME on Win7, but I don’t have the required hardware to reproduce that. It’d be an understatement to say I’m skeptical, but he knows his shit, so… /shrug

My best “feel” comes from Linux, using RA via KMS (i.e., direct control over the framebuffer) on a CRT arcade monitor. The GroovyMAME guys claim this is always at least 4 frames of latency (so, ~48 ms more latent than Win7; again, I’m skeptical but have no data to suggest otherwise).

Shaders can indeed increase latency, sometimes significantly so. My testing of them was pretty brief, but crt-hyllian seems okay (didn’t get to test royale or easymode):

1 Like

Did you have video hard sync enabled on Windows? The reason I’m asking is that my results don’t match yours. Linux with KMS was consistently 1 frame slower than Windows 10 with video hard sync enabled in RA. The best result I’ve measured so far is a consistent 4 frames of input lag with RA running Nestopia under Windows 10, and that actually feels very good. I would certainly be content with 4 frames if I could achieve that with SNES emulation (and preferably under Linux). As it is now, SNES emulation under Linux with KMS has an input lag of 7 frames. With the additional frames added by the TV, it becomes a not-so-pleasant experience.

BTW, regarding additional testing: I have access to oscilloscopes, soldering equipment, etc. and could easily do a better test. The problem is time. Doing this right could easily eat a lot of spare time, and I don’t think I can do that without making my fiancé blow a fuse… :stuck_out_tongue:

My testing was not rigorous so I can’t speak with objectivity, but the feel of linux retroarch was better than windows retroarch without hard sync enabled. Hard sync introduces some kind of visual jerkiness on my system in Windows so I dismissed it as a solution. I may investigate further configuration under Windows now that I have established a baseline of playability on Linux. I would much rather run Windows retroarch because my HTPC serves various roles throughout the network and rebooting to Linux is not ideal.

Speaking of testing, can you elaborate on your “30 jump” test? How are you measuring the input lag? I wasn’t able to infer it from your post, and I would like to see if I can reproduce your results. How are you detecting the delay between pressing a button and the screen response?

Okay, interesting. I’ve not noticed any jerkiness on my machine. RetroArch on Windows without hard sync is pretty laggy, so in that case Linux should probably be faster.

I’m just filming the screen and holding up the controller in front of it, so that both the character on the screen and the controller are visible. I then tap the jump button repeatedly 30 times. When done, I import the video into Premiere Pro (any video editing software should do) and analyze the video frame by frame, counting the number of frames between each button press and when the character jumps. Filming in 60 fps is not ideal, since it gives room for a pretty large error. However, with around 30 attempts, a clear trend usually develops, where most of the values are the same and some are +1 frame or -1 frame. Also, if you’re using an LCD screen, analyze the screen closely. Due to pixel response times, you can sometimes just make out the screen starting to change.
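If you’d rather not step through the clip by hand, something like this could automate the counting. It’s just a sketch: it assumes OpenCV is installed, the filename, ROI coordinates and threshold are placeholders you’d have to tune for your own recording, and a real run would need debouncing of repeated triggers:

```python
import cv2
import numpy as np

VIDEO = "lag_test.mp4"              # placeholder filename
BUTTON_ROI = (50, 400, 120, 470)    # (y0, y1, x0, x1) around the jump button
SCREEN_ROI = (100, 300, 200, 500)   # (y0, y1, x0, x1) around the character
THRESHOLD = 12.0                    # mean absolute pixel difference that counts as a change

def roi(frame, box):
    y0, y1, x0, x1 = box
    return cv2.cvtColor(frame[y0:y1, x0:x1], cv2.COLOR_BGR2GRAY).astype(np.float32)

cap = cv2.VideoCapture(VIDEO)
ok, frame = cap.read()
assert ok, "could not read video"
prev_btn, prev_scr = roi(frame, BUTTON_ROI), roi(frame, SCREEN_ROI)

events = []                         # list of (frame_index, "button" or "screen")
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    idx += 1
    btn, scr = roi(frame, BUTTON_ROI), roi(frame, SCREEN_ROI)
    if np.mean(np.abs(btn - prev_btn)) > THRESHOLD:
        events.append((idx, "button"))
    if np.mean(np.abs(scr - prev_scr)) > THRESHOLD:
        events.append((idx, "screen"))
    prev_btn, prev_scr = btn, scr
cap.release()

# Pair each button event with the next screen event and report the frame delta.
# (A press/release and the jump animation will each trigger several consecutive
# events, so these pairings need sanity-checking against the raw footage.)
last_button = None
for frame_idx, kind in events:
    if kind == "button":
        last_button = frame_idx
    elif last_button is not None:
        print(f"button @ frame {last_button} -> screen @ frame {frame_idx}: "
              f"{frame_idx - last_button} frames of lag in the recording")
        last_button = None
```

The frame deltas it prints are in camera frames, so at 60 fps they map directly onto 60 Hz display frames, with the same ±1 frame uncertainty as the manual method.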

Oh, I see, so you are holding up the controller so that you can identify the moment when the button is pressed. Hmm, I wonder if I could do a similar test based on sound? I have an X-Arcade stick that makes a very audible “boing” when the button is pressed, and SMB has a unique jumping sound that occurs at the moment Mario jumps. It should be possible to sample audio at a much higher rate than video, so I could get over the Nyquist rate. Then it would just be a matter of analyzing the spectrum. Since RetroArch syncs audio/video frames, wouldn’t this be viable?
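Something like this is what I have in mind, as a rough sketch. It assumes a single microphone recording that captures both the stick’s clack and the game audio; the filename, thresholds and the simple alternating clack/jump pairing are all made up and would need tuning:

```python
import numpy as np
from scipy.io import wavfile

RATE, data = wavfile.read("jump_test.wav")   # placeholder filename
if data.ndim > 1:                            # mix down to mono if needed
    data = data.mean(axis=1)
data = data / (np.max(np.abs(data)) + 1e-9)  # normalize to [-1, 1]

WINDOW = int(0.002 * RATE)      # 2 ms envelope windows
THRESHOLD = 0.2                 # normalized amplitude that counts as an onset
REFRACTORY_S = 0.2              # ignore re-triggers within 200 ms of an onset

envelope = np.abs(data)
onsets = []
i = 0
while i < len(envelope) - WINDOW:
    if envelope[i:i + WINDOW].max() > THRESHOLD:
        onsets.append(i / RATE)
        i += int(REFRACTORY_S * RATE)        # skip past the rest of this sound
    else:
        i += WINDOW

# Assume the recording alternates: clack, jump sound, clack, jump sound, ...
for press, jump in zip(onsets[::2], onsets[1::2]):
    print(f"button at {press:.4f}s, jump sound at {jump:.4f}s -> "
          f"{(jump - press) * 1000:.1f} ms (includes audio output latency)")
```

The catch, of course, is that last comment: whatever audio buffering RetroArch and the OS add is baked into the number.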

Audio typically runs at a higher latency than video. You can use JACK in linux to get it pretty low but 16-32+ ms is typical (I think 64 ms is the default but that’s generous and can usually be reduced). It’s probably best to compare audio latency and video latency separately, but it’s worthwhile to try dropping your audio latency as far as you can until you get crackling.
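For reference, here’s roughly what those buffer figures work out to in samples and 60 Hz video frames (48 kHz output assumed; the millisecond values are the ones mentioned above):

```python
SAMPLE_RATE = 48_000               # assumed output rate
FRAME_MS = 1000.0 / 60.0           # one 60 Hz video frame

for latency_ms in (16, 32, 64):
    samples = int(SAMPLE_RATE * latency_ms / 1000)
    print(f"{latency_ms:2d} ms buffer ~= {samples:5d} samples "
          f"~= {latency_ms / FRAME_MS:.1f} video frames")
```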

So that’s 1-2 frames, and it will also vary by OS, so it would confound the analysis, then? If latency is a setting, can’t I just subtract the latency setting from the time between events? I.e., (t2 - t1) - 64 ms = input lag?

I don’t think the comparison would be reliable, since they’re processed differently both in RetroArch and in the OS. Again, I think it’s worthwhile to test it and get your audio latency as low as possible, as well, but it’s not going to be a reliable indication of input latency.

Okay, thanks. By the way, I contacted an electronics rental company, and I’m not sure what model you need, but in the price range you mentioned I found this: https://www.microlease.com/us/products/keysight-agilent-technologies/oscilloscopes/u2701a?basemodelid=4656&cond=all They quoted $65/month (plus shipping, I imagine). That seems fairly reasonable compared to other places that had their prices published. Any idea how long you would need it for the tests you’d like to run? A one-month rental beats $400, I suppose, so it’s something to consider.

1 Like

I would probably need to hold onto it for a longer period for continued testing and for verification when the inevitable doubts arise, not to mention the additional questions that subsequent data analysis would bring up. I also don’t have a ton of time to devote to such a project all at once (work, family, etc.). However, repeated rentals could be another possibility.

That one you linked may work for my purposes. I would have to look into it further. My previous testing was conducted with the $1,000 big brother of this one:

The differences between the two models shouldn’t matter for my purposes, and there’s apparently a softmod that can make it functionally identical to the more expensive model (unless they patched it out at some point).

[QUOTE=hunterk;37222]Audio typically runs at a higher latency than video. You can use JACK in linux to get it pretty low but 16-32+ ms is typical (I think 64 ms is the default but that’s generous and can usually be reduced). It’s probably best to compare audio latency and video latency separately, but it’s worthwhile to try dropping your audio latency as far as you can until you get crackling.[/QUOTE]Is there a possibility of ASIO support in RetroArch? My hardware supports 1 ms buffers via ASIO, which would significantly reduce latency. When ASIO support was added to GroovyMAME, it greatly lowered the latency compared to XAudio.