An input lag investigation

re patching higan: i was wrong when i said it would have no effect; what i meant to say is i don’t think it will be significant. i believe there will be a small difference simply because, without your patch, higan holds onto a frame it could otherwise push for a small amount of time. worst case, the emu syncs to realtime within that interval, adding $AUDIOLATENCY ms worth of latency, but those cases aren’t going to happen every frame or even every other frame. my argument is essentially the same as koubiack’s in that reddit thread Brunnis linked to.

regardless, i’m interested in seeing actual results, and i’m so glad someone is finally doing these kinds of tests, thank you! if i can make a suggestion for the tests: set the audio latency as low as you can (i believe setting it to 0 in higan, which works on my system, simply has the audio system assign the lowest value it can) and use the optimal audio driver for your system.

timing-wise, setting maximum runspeed to 1x achieves the same effect as using vsync: it makes 1 frame (active portion + vblank) on the host correspond directly to 1 frame on the guest. when using audio sync this is no longer the case; the points where the host syncs no longer fall on clean frame boundaries on either the host or the guest, which is why at this point it’s easier to think in terms of realtime vs emutime. so it would have to be audio sync only.

it already has it; that bug report and patch address the same issue. the snes is not unique in how it operates. as you can see in the discussion, the main reason the dev was reluctant to do it (that the active frame height of the guest can change mid-frame) is the same reason byuu opposes it. byuu’s argument is that since higan doesn’t use vsync to sync realtime = emutime, it’s not worth the trouble, and that any gain would only be minimal on his end.

As I mentioned on the previous page, higan’s performance in the context of vsync is still worth investigating, both with and without the runloop reorganization, even if the GUI option for vsync is gone.

no, i agree, and i’m not trying to take byuu’s side here. basically all i care about is that we finally got these fixes downstream in the libretro cores, where they’re needed the most, and that we now have the means to test, pinpoint problems, and experiment. i’m very grateful to Brunnis and you and the other core maintainers for this. thank you

also, not that it matters, but i disagree with byuu in that i think there are still improvements out there that haven’t been explored to their full potential. off the top of my head: a leap-frog approach to emulation, that is, emulating 2 frames to push video, then going back 1 and continuing in that fashion (see the sketch below). it’s hacky and can cause video glitches, but for most games i’d imagine it would work fine most of the time. frame_delay like in retroarch is another relatively unexplored option in practice, which reduces the latency associated with using vsync as the primary time sync method. the concept could be improved further by keeping stats that let us figure out how much we can safely push the frame up w/o likely going over the edge. also, bare metal programming on fixed hardware like the raspberry pi isn’t a pipe dream and lies squarely within the realm of possibility, which opens doors to new methods of reducing latency that can’t be realized on a modern os in a portable manner. etc.
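to make the leap-frog idea concrete, here’s a minimal sketch of the run loop i have in mind. everything in it (Core, save_state, run_frame, the helper functions) is a hypothetical stand-in, not any real emulator’s API:

```cpp
#include <cstdint>
#include <vector>

// hypothetical core interface, stand-ins only, not a real emulator API
struct Core {
    std::vector<std::uint8_t> st;                                   // placeholder state
    std::vector<std::uint8_t> save_state() { return st; }           // serialize the core
    void load_state(const std::vector<std::uint8_t>& s) { st = s; }
    void run_frame(bool /*render*/) {}                              // emulate one guest frame
};

void poll_input(Core&) {}   // feed the newest host input to the core
void present_video() {}     // push the rendered frame to the screen
void wait_for_sync() {}     // vsync / audio sync point, once per host frame

void leapfrog_loop(Core& core) {
    for (;;) {
        poll_input(core);
        core.run_frame(false);                // frame N+1: state advances, video discarded
        auto checkpoint = core.save_state();  // remember the state right after N+1
        core.run_frame(true);                 // frame N+2: its video goes out one frame early
        present_video();
        core.load_state(checkpoint);          // "go back 1": N+2 is re-run next iteration
        wait_for_sync();                      // net progress: 1 guest frame per host frame
    }
}
```

the glitches i mentioned come from frame N+2 being rendered with input that may have changed by the time it’s re-emulated for real on the next pass.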

Yep, I considered that as well. On average, the difference will be minimal, though. Probably not even measurable with my method.

You’re welcome! I have already run some tests and have comparison data for 60 ms vs 20 ms. Setting it to 0 ms made the emulation halt… I’ll see if I can get a good comparison up tonight.

Yep, I agree. A question on that: do you enable vsync by simply going into higan’s settings file and setting audio synchronization to false and video synchronization to true?

I only have v097 here, but assuming it hasn’t changed since then, open up settings.bml and in the ‘Video’ block, change ‘Synchronize:false’ to ‘Synchronize:true’.
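For reference, after the change the relevant blocks would look something like this (based on the v097 layout, and assuming the Audio block uses the same Synchronize key; later versions may differ):

```
Video
  Synchronize:true

Audio
  Synchronize:false
```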

I’ve been busy testing higan with and without the lagfix and comparing to RetroArch with the bsnes-accuracy core. A few important notes before we start:

[ul]
[li]higan exhibited stuttery visual performance in all tests, no matter if I was using the default “Synchronize Audio” or the now hidden “Synchronize Video” (vsync) settings. Looking at my recordings, it drops frames frequently, despite running on a Core i7-6700K capable of sustaining a stable 128 FPS in the test scene (if all synchronization is turned off). The behavior looks very similar to what the Vulkan backend in RetroArch produces.[/li]
[li]When using “Synchronize Video” instead of “Synchronize Audio”, performance seemed even less predictable, with a larger swing between minimum and maximum input lag and a few latency spikes.[/li]
[li]No difference was found between the unmodified higan build and the lagfix-enabled one when using “Synchronize Audio”, as expected.[/li]
[li]With “Synchronize Video” (vsync), the version with the lagfix tested slightly worse than the unmodified build. However, the test results of the version with the lagfix contain one nasty latency spike and three slightly less nasty ones, while the unmodified build only has one slightly nasty spike. These spikes are probably random, but more testing would be needed to conclude that. If the latency spikes (outliers) are removed from the test results, both versions (with/without lagfix) once again have the same input lag.[/li]
[li]With no conclusive differences between lagfix/no lagfix and no way of performing the frame advance test to confirm that my code even works (higan doesn’t have that ability), I’ve decided to only include the unmodified higan input lag numbers in the graph below. Perhaps there really is no difference, perhaps my code doesn’t work, perhaps the erratic performance skews the results, perhaps there’s something else that prevents this fix from working in higan even when using vsync instead of audio sync. I consider this part of the testing inconclusive and it will probably stay that way, since I don’t intend to spend any more time on testing this.[/li]
[li]Finally, changing higan’s audio latency down to 20 ms (from the default 60 ms) produced 0.3 frames higher input lag. I decided to leave this result out of the graph below.[/li]
[/ul]

While the testing of the lagfix was inconclusive, it’s still interesting to see how higan compares to RetroArch in terms of input lag:

Test setup

[ul]
[li]Core i7-6700K @ stock frequencies[/li]
[li]Radeon R9 390 8GB (Radeon Software 16.5.2.1, default driver settings)[/li]
[li]HP Z24i monitor (1920x1200)[/li]
[li]Windows 10 64-bit (Anniversary Update)[/li]
[li]RetroArch nightly August 4th 2016 + bsnes-accuracy v094[/li]
[li]higan v101[/li]
[li]Super Mario World 2: Yoshi’s Island[/li]
[/ul]

RetroArch settings:

[ul]
[li]OpenGL video driver[/li]
[li]xaudio audio driver[/li]
[li]Fullscreen (with windowed fullscreen mode disabled)[/li]
[li]Vsync enabled[/li]
[li]GPU hard sync enabled[/li]
[li]HW bilinear filtering disabled[/li]
[/ul]

higan settings:

[ul]
[li]OpenGL video driver[/li]
[li]XAudio2 audio driver[/li]
[li]Fullscreen[/li]
[li]Video Emulation -> Blurring disabled[/li]
[li]Video Shader -> None[/li]
[/ul]

For these tests, 20 measurements were taken per test case. The test procedure was otherwise the same as described in the first post in this thread, i.e. 240 FPS camera and LED-rigged controller.

Results

Comments

Whether using audio sync or vsync, higan has significantly higher input lag than RetroArch. On top of that, neither test configuration of higan performed satisfactorily in terms of smoothness, with frequent, distracting frame drops/stuttering. Audio has major issues when using vsync, but that’s to be expected when using a setting that’s not even exposed in the GUI. I would also like to mention that I did not measure higan with the default Direct3D video driver. I wanted to keep things as similar as possible between RetroArch and higan to minimize the risk of external factors skewing the results. However, I did try just playing SMW2 with Direct3D and it definitely had a similar amount of frame drops.

So, to conclude, higan does not seem fully optimized in terms of input lag, at least not when running in Windows. RetroArch not only shaves off 1.4 to 2.6 frames worth of lag, it does so while producing subjectively perfect scrolling.

So the conclusion: everyone should code for Retroarch. :slight_smile:

[QUOTE=Brunnis;44858] I would also like to mention that I did not measure higan with the default Direct3D video driver. I wanted to keep things as similar as possible between RetroArch and higan to minimize the risk of external factors skewing the results. However, I did try just playing SMW2 with Direct3D and it definitely had a similar amount of frame drops.

So, to conclude, higan does not seem fully optimized in terms of input lag, at least not when running in Windows. RetroArch not only shaves off 1.4 to 2.6 frames worth of lag, it does so while producing subjectively perfect scrolling.[/QUOTE]

Thanks for the test. Forgive me if I’m missing something, but you can’t disable desktop compositing in Win10, can you? Wouldn’t Windows 7 have been a better approach here?

Also, did you leave D3D out because higan has no frame delay feature and you’re using a Radeon card? OpenGL is said to perform worse than D3D (+ frame delay) in this regard, isn’t it?

Better as in producing a better input lag result? Possibly. However, Windows 7 is an outdated OS with no more (mainstream) support from MS, so it’s not all that interesting to test. Besides, both applications are tested on the same OS, so it can’t really be considered “unfair”.

I’m sorry, but I don’t really understand. I haven’t heard that OpenGL performs worse input-lag-wise compared to Direct3D. On the contrary, really. Also, frame delay was disabled in RetroArch during my tests.

That’s pretty similar to my results with higan v094. However, it’s very surprising that the runloop reorganization had either zero or negative impact! Oh well. Null data is still data, so good on you for doing the work :slight_smile:

But one application uses exclusive full screen whereas the other does not. It’s a given that desktop compositing adds lag, so the results were more or less decided beforehand (again, if I’m not missing something).

[QUOTE=Brunnis]I’m sorry, but I don’t really understand. I haven’t heard that OpenGL performs worse input-lag-wise compared to Direct3D. On the contrary, really. Also, frame delay was disabled in RetroArch during my tests.[/QUOTE]

I’m not sure now about OpenGL. I think I read about somebody who did some tests with GroovyMAME, but I can’t find it now… What I did find is that OpenGL apparently behaves the same as D3D in this regard, no matter whether it’s an ATI or an Nvidia card, so forget it, sorry.

Could you test with a Voodoo2 on win98SE now please? (original and Plus! theme)

Only if he has a solid 2D card to bridge it with, like a Diamond Stealth. S3 cards have buggy drivers which could skew the results.

Yep, I would have expected a slight (0.5 frame times) improvement when using video synchronization. The code changes were a bit of a crap shoot, though, and I’m not too keen on putting in the necessary instrumentation to be able to verify them in a fashion similar to how I use the frame advance method in RA.

Just for completeness and future reference, I’ll add a description of my understanding as to why the lagfix doesn’t work in higan with default (audio sync) settings:

The lagfix is only (fully) effective in applications where the emulator main loop is kicked off once for every frame that needs to be generated, generates the frame as fast as possible and then waits until the next time a frame is to be generated. In the case of the bsnes core before the lagfix, it ended each frame generation by reading the input, but wouldn’t use that input until the next call to the emulator main loop. The wait time between reading the input and kicking off the main loop again to actually generate the corresponding frame added unnecessary input lag. The faster the emulator ran, the higher the added input lag would be, with a theoretical maximum of one added frame period. Actually, I believe the bsnes libretro implementation reads the input from the system right before kicking off the emulator main loop (someone correct me if I’m remembering wrong), which causes the worst case to occur, i.e. one full frame of added input lag.
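In pseudo-C++, the two orderings look something like this (just a sketch of the idea; none of these names are actual bsnes/libretro code):

```cpp
// Hypothetical stand-ins, for illustration only.
struct Core {
    void read_input() {}     // poll host input devices
    void emulate_frame() {}  // run the guest for one frame, as fast as possible
};
void wait_for_next_frame() {}  // vsync wait / frame pacing

// Before the lagfix: input read at the END of frame generation sits unused
// through the whole wait, up to one full frame period of added lag.
void run_frame_old(Core& core) {
    core.emulate_frame();    // consumes the input read at the end of the PREVIOUS call
    core.read_input();       // sampled now...
    wait_for_next_frame();   // ...but not used until the next call
}

// After the lagfix: input is read right before emulation, so the generated
// frame reflects the freshest possible input state.
void run_frame_new(Core& core) {
    core.read_input();
    core.emulate_frame();    // input is consumed immediately
    wait_for_next_frame();
}
```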

Now, imagine a case where the emulator main loop doesn’t just generate a frame as fast as possible and then spend the rest of the frame interval waiting before the frame is output. Imagine instead that it’s continuously synchronized so that it runs close to realtime, i.e. generating a frame actually takes 1/60th of a second like on the real hardware. Since there are no large wait times anywhere, there’s no issue of reading the input on the wrong side of such a wait. Immediately after input is read, the emulator continues to execute and the corresponding frame is output approximately one frame interval later (in the case of most SNES games).

With this said, the above case is an idealization. However, it can be used to get the general idea of why the lagfix is not effective in higan with default settings. As mentioned previously in this thread, the lagfix will actually have a really tiny positive effect on higan in many cases, since it usually outputs the frame marginally earlier than without the fix. However, the effect is around 5% of a frame period (or less than 1 ms), so it’s not all that interesting (or even measurable with my method).

[QUOTE=Brunnis;44858]

[ul]
[li]higan exhibited stuttery visual performance in all tests, no matter if I was using the default “Synchronize Audio” or the now hidden “Synchronize Video” (vsync) settings. Looking at my recordings, it drops frames frequently, despite running on a Core i7-6700K capable of sustaining a stable 128 FPS in the test scene (if all synchronization is turned off). The behavior looks very similar to what the Vulkan backend in RetroArch produces.[/li]
[/ul]
[/QUOTE]

Just to note, this is expected when using “sync audio” on audio backends that take large chunks of audio at a time, which is most of them because everything is horrible.

On Linux with OSS I’m capable of pushing the hardware audio buffer size down to 256 samples, which is close enough to an integral divisor of 800 (= 1/60th of a second at 48000 Hz) that it doesn’t act so erratically, but that’s a pretty unstable OS environment (OSS on Linux is a fourth-class citizen). On Windows your best bet is normally the WASAPI backend, but on most machines the chunks that WASAPI shared mode takes out are ~10 ms in size, so you’re still going to get noticeable stutter even though the chunks are smaller than 1/60th of a second. This is one of the drawbacks of higan’s audio output chain being too simple.
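To put rough numbers on those chunk sizes (my own arithmetic from the figures above):

```cpp
#include <cstdio>

// Back-of-the-envelope chunk-size arithmetic for the behaviour described above.
int main() {
    const double rate  = 48000.0;                // sample rate in Hz
    const double frame = rate / 60.0;            // 800 samples per 60 Hz frame

    const double oss    = 256.0;                 // OSS buffer size from above
    const double wasapi = rate * 10.0 / 1000.0;  // ~10 ms WASAPI period = 480 samples

    std::printf("samples per 60 Hz frame: %.0f\n", frame);
    std::printf("256-sample OSS chunk: %.2f ms (%.2f of a frame)\n",
                1000.0 * oss / rate, oss / frame);
    std::printf("10 ms WASAPI chunk: %.0f samples (%.2f of a frame)\n",
                wasapi, wasapi / frame);
    // The coarser the chunk, the longer the emulator can stall waiting for
    // buffer space when it syncs on audio, hence the visible stutter.
    return 0;
}
```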

It should be possible to make audio sync work better for input latency on audio backends that take more than 1/60th of a second at a time, but there’ll still be microstutter.
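To sketch what I mean (illustrative only, not higan’s actual audio chain): decouple the core from the chunky backend with a small ring buffer plus a drain thread, so that only the drain thread ever blocks for a whole backend chunk. All names here are made up:

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical chunky backend: accepts audio only in large fixed chunks and
// blocks for roughly one chunk period per write (simulated here with a sleep).
void backend_write(const std::vector<std::int16_t>&) {
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
}

class AudioPipe {
    std::mutex m;
    std::condition_variable cv;
    std::deque<std::int16_t> ring;
    static constexpr std::size_t cap = 1024;  // small on purpose: audio sync
                                              // must still pace the core
public:
    // Core side: called once per generated sample.
    void push(std::int16_t s) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return ring.size() < cap; });
        ring.push_back(s);
        cv.notify_all();
    }
    // Drain thread: the only code that ever waits on the chunky backend.
    void drain_loop(std::size_t chunk) {
        for (;;) {
            std::vector<std::int16_t> out;
            {
                std::unique_lock<std::mutex> lk(m);
                cv.wait(lk, [&] { return ring.size() >= chunk; });
                out.assign(ring.begin(), ring.begin() + chunk);
                ring.erase(ring.begin(), ring.begin() + chunk);
                cv.notify_all();  // space frees a whole chunk at a time;
            }                     // that quantization is the residual stutter
            backend_write(out);
        }
    }
};
```

The ring still frees space a whole chunk at a time, which is where the remaining microstutter would come from; a frontend would start the drain with something like std::thread(&AudioPipe::drain_loop, &pipe, 480).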

Have you tried different gamepads to see if they cause any input lag themselves? For instance, to see if there is a difference between the pad you’re using and an Xbox 360 wired controller.

I discovered something crazy today. I was playing Paper Mario for GameCube in Dolphin, and I noticed that I didn’t have any difficulty timing super attacks and such like I do when emulating Game Boy Advance and other Mario-themed RPGs (press A at the moment you hit the enemy, press B when the enemy is about to hit you, etc.).

So on a hunch, I loaded up an NES game in Dolphin Virtual Console (something I would never do normally): Super Mario Bros, my game of choice for detecting input lag because I know it by feel. Lo and behold, the input lag is clean. I had no severe problems making jumps in that game. I’ve had the same experience when playing on real Wii hardware, and I’d resigned myself to believing that it was due to PC overhead as discussed in this thread. It’s not as perfect as a real NES, but it is well within playability.

The ramifications of this, if accurate (I’m aware of the tendency for confirmation bias), are huge, I think. First of all, it means that Dolphin has great input code. But more profoundly, it would imply that the input lag problem lies in the emulators themselves. It’s not specific to RetroArch; I have never found a PC emulator that could play Super Mario correctly. But it would be no surprise if Nintendo had perfect NES emulation, right? This would prove that good input emulation is indeed possible under Windows despite all the problems that are covered in this thread, and through another emulator no less. Imagine if that could be replicated in a native PC emulator? It may be damn near perfect at that point.

I would love to see someone independently confirm this, preferably Brunnis with his testing apparatus, but it would help even if someone just tried it themselves (there’s no SMW2, but they have SMW1). Load up your game in Virtual Console on Dolphin and tell me I’m wrong.

I realize a lot of very smart people have worked on these various NES and SNES emulators, but just try it and you’ll see. Is it really outside the realm of possibility that there is just some delay in the emulation cores that persists to this day? (I also realize this implies Dolphin is doing something differently so that it does not produce high input lag; hopefully, if confirmed, that can be instructive!)

Or maybe I’m completely wrong, let me know either way.

Testing environment:

- RetroArch daily build, August 11th 2016 (FCEUmm core)
- Dolphin 5.0 stable (Virtual Console)
- Core i5-4590 / GTX 970
- Wii controller input through DolphinBar
- Windows 10 AU


It may seem counter-intuitive, but I would be surprised if many of Nintendo’s VC emulators were best in class. So many years later, there isn’t likely to be anyone working for Nintendo who is extremely knowledgeable about the old hardware (at least among the people actually working on emulation for Nintendo), unless they’ve poached an elite emu developer like Sony probably did with the mysterious, very much missed pSX Author. I don’t know that Nintendo would do anything so clever.

I agree that they would be the ideal emulators, and it’s not surprising that Wii VC is near-perfect on real hardware, but what is significant is that Dolphin is on PC and is not developed by Nintendo. Doubly so because it’s an emulator inside an emulator. It’s a testament to the quality of Dolphin for sure, but if I’m correct, then it also means they are doing something that reduces input lag on PC that other emulators are not.

I can’t comment either way on native VC vs Dolphin with regard to input lag, but I’m actually trying to disagree about the quality of these ‘official’ emulators of Nintendo’s. :stuck_out_tongue: The NES, SNES, and N64 VC emulators aren’t terribly accurate and just need to run the games Nintendo sells with minimal glitches. Also, you may not have meant to suggest otherwise, but the Wii’s hardware is at a significant disadvantage for emulating past Nintendo systems, as it’s quite weak and only resembles the NGC in architecture.