An input lag investigation

Yep, I would have expected a slight (~0.5 frame periods) improvement when using video synchronization. The code changes were a bit of a crapshoot, though, and I’m not too keen on putting in the necessary instrumentation to be able to verify them in a fashion similar to how I use the frame advance method in RetroArch.

Just for completeness and future reference, I’ll add a description of my understanding as to why the lagfix doesn’t work in higan with default (audio sync) settings:

The lagfix is only (fully) effective in applications where the emulator main loop is kicked off once for every frame that needs to be generated, generates the frame as fast as possible and then waits until the next frame is due. In the case of the bsnes core before the lagfix, it ended each frame generation by reading the input, but wouldn’t use that input until the next call to the emulator main loop. The wait between reading the input and kicking off the main loop again to actually generate the corresponding frame was pure added input lag. The faster the emulator ran, the greater the added input lag, up to a theoretical maximum of one frame period. Actually, I believe the bsnes libretro implementation reads the input from the system right before kicking off the emulator main loop (someone correct me if I remember wrong), which causes the worst case to occur, i.e. one full frame of added input lag.
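To illustrate, here’s a minimal sketch (hypothetical function names, not actual bsnes code) of the two orderings:

void emulate_until_vblank(void);  /* runs the core for one video frame */
void poll_input(void);            /* reads the host input state        */

/* Before the lagfix: input is read at the end of the frame, after which
   the frontend sleeps until the next frame is due. The input sits
   unused for the whole wait, which is pure added lag. */
void run_one_frame_old(void) {
    emulate_until_vblank();
    poll_input();                 /* consumed only on the NEXT call */
}

/* After the lagfix: input is read immediately before generating the
   frame that will react to it. */
void run_one_frame_new(void) {
    poll_input();
    emulate_until_vblank();
}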

Now, imagine a case where the emulator main loop doesn’t just generate a frame as fast as possible and then spend the rest of the frame interval waiting before the frame is output. Imagine instead that it’s continuously synchronized so that it runs close to real time, i.e. generating a frame actually takes 1/60th of a second, like on the real hardware. Since there are no long waits anywhere, there’s no longer any risk of reading the input on the wrong side of such a wait. Immediately after input is read, the emulator continues to execute and the corresponding frame is output approximately one frame interval later (in the case of most SNES games).
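To make the contrast concrete, here’s a minimal sketch (made-up names, not higan source) of a loop paced by blocking audio writes, which is essentially what “sync audio” amounts to:

#include <stddef.h>
#include <stdint.h>

void audio_write_blocking(const int16_t *samples, size_t count);
size_t emulate_some_cycles(int16_t *audio_out);  /* returns samples produced */

void run_paced_by_audio(void) {
    int16_t audio[128];
    for (;;) {
        size_t n = emulate_some_cycles(audio);
        /* Blocks whenever the device buffer is full, spreading the
           emulation work evenly across the frame interval, so no long
           wait can open up between an input read and the frame it
           affects. */
        audio_write_blocking(audio, n);
    }
}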

With this said, the above case is an idealization, but it conveys the general idea of why the lagfix is not effective in higan with default settings. As mentioned previously in this thread, the lagfix will actually have a really tiny positive effect on higan in many cases, since it usually outputs the frame marginally earlier than without the fix. However, the effect is around 5% of a frame period (i.e. less than 1 ms), so it’s not all that interesting (or even measurable with my method).

[QUOTE=Brunnis;44858]

[ul][li]higan exhibited stuttery visual performance in all tests, no matter if I was using the default “Synchronize Audio” or the now hidden “Synchronize Video” (vsync) setting. Looking at my recordings, it drops frames frequently, despite running on a Core i7-6700K capable of sustaining a stable 128 FPS in the test scene (if all synchronization is turned off). The behavior looks very similar to what the Vulkan backend in RetroArch produces.[/li][/ul][/QUOTE]

Just to note, this is expected when using “sync audio” on audio backends that take large chunks of audio at a time, which is most of them, because everything is horrible.

On Linux with OSS I’m capable of pushing the hardware audio buffer size down to 256 samples, which is close enough to dividing evenly into 800 (= 1/60th of a second at 48000 Hz) that it doesn’t act so erratically, but that’s a pretty unstable OS environment (OSS on Linux is a fourth-class citizen). On Windows your best bet is normally the WASAPI backend, but on most machines the chunks that WASAPI shared mode takes are ~10 ms in size, so you’re still going to get noticeable stutter even though the chunks are smaller than 1/60th of a second. This is one of the drawbacks of higan’s audio output chain being too simple.

It should be possible to make audio sync work better for input latency on audio backends that take more than 1/60th of a second at a time, but there’ll still be microstutter.
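For concreteness, here’s the arithmetic behind the stutter claim as a quick standalone check (the ~10 ms WASAPI shared-mode period is the figure quoted above):

#include <stdio.h>

int main(void) {
    const double rate  = 48000.0;             /* samples per second            */
    const double frame = rate / 60.0;         /* = 800 samples per 60 Hz frame */
    const double oss_chunk    = 256.0;        /* OSS buffer size from above    */
    const double wasapi_chunk = 0.010 * rate; /* ~10 ms = 480 samples          */

    printf("samples per frame:   %.0f\n", frame);
    printf("OSS chunks/frame:    %.3f (3 x 256 = 768, close to 800)\n",
           frame / oss_chunk);
    printf("WASAPI chunks/frame: %.3f (doesn't divide evenly, so the audio\n"
           "waits drift against the 16.7 ms video cadence)\n",
           frame / wasapi_chunk);
    return 0;
}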

Have you tried different gamepads to see if they cause any input lag themselves? For instance, to see if there is a difference between the pad you’re using and an Xbox 360 wired controller.

I discovered something crazy today. I was playing Paper Mario for GameCube in Dolphin, and I noticed that I didn’t have any difficulty timing super attacks and such, like I do when emulating Game Boy Advance and other Mario-themed RPGs (press A at the moment you hit the enemy, press B when the enemy is about to hit you, etc.).

So on a hunch, I loaded up an NES game in Dolphin Virtual Console (something I would never do normally): Super Mario Bros., my game of choice for detecting input lag because I know it by feel. Lo and behold, the input lag is clean. I had no severe problems making jumps in that game. I’ve had the same experience when playing on real Wii hardware, and I’d resigned myself to believing that it was due to PC overhead as discussed in this thread. It’s not as perfect as a real NES, but it is well within playability.

The ramifications of this, if accurate (I’m aware of the tendency for confirmation bias), are huge, I think. First of all, it means that Dolphin has great input code. But more profoundly, it would imply that the input lag problem is in the emulation itself. This isn’t specific to RetroArch; I have never found a PC emulator that could play Super Mario correctly. But it would be no surprise if Nintendo had perfect NES emulation, right? This would prove that good input emulation is indeed possible under Windows despite all the problems that are covered in this thread, and through another emulator no less. Imagine if that could be replicated in a native PC emulator? It may be damn near perfect at that point.

I would love to see someone independently confirm this, preferably Brunnis with his testing apparatus, but it would help even if someone just tried it themselves (no SMW2, but they have SMW1). Load up your game in Virtual Console on Dolphin and tell me I’m wrong.

I realize a lot of very smart people have worked on these various NES and SNES emulators, but just try it and you’ll see. Is it really outside the realm of possibility that there is just some delay in the emulation cores that persists to this day? (I also realize this implies Dolphin is doing something differently so that it does not produce high input lag; hopefully, if confirmed, that can be instructive!)

Or maybe I’m completely wrong, let me know either way.

Testing environment:

- 11 August 2016 RetroArch daily (FCEUmm core)
- Dolphin 5.0 stable (Virtual Console)
- Core i5 4590 / GTX 970
- Wii controller input through DolphinBar
- Windows 10 AU


It may seem counter-intuitive, but I would be surprised if many of Nintendo’s VC emulators were best in class. So many years later, there isn’t likely anyone working for Nintendo who is extremely knowledgeable about old hardware (at least out of the people who are actually working on emulation for Nintendo) unless they’ve poached an elite emu developer like Sony probably did with the mysterious, very much missed pSX Author. I don’t know that Nintendo would do anything so clever.

I agree that they would be the ideal emulators, and it’s not surprising that Wii VC is near-perfect on hardware, but what is significant is that Dolphin is on PC and is not developed by Nintendo. Doubly so because it’s an emulator inside an emulator. It’s a testament to the quality of Dolphin for sure, but if I’m correct, then it also means that they are doing something that reduces input lag on PC that other emulators are not.

I can’t comment either way on native VC vs Dolphin with regard to input lag, but I’m actually trying to disagree about the quality of these ‘official’ emulators of Nintendo’s. :stuck_out_tongue: The NES, SNES, and N64 VC emulators aren’t terribly accurate and just need to run the games Nintendo sells with minimal glitches. Also, you may not have meant to suggest otherwise, but the Wii’s hardware is a significant disadvantage for the sake of emulating past Nintendo systems as it’s quite weak and only resembles the NGC in architecture.

This isn’t about VC versus hardware; it’s about RetroArch and other emulators versus VC emulation on Dolphin. You can comment on that, just try it out :grin: Various OS-level delays and USB delays have been cited, by hunterk I believe, as causes of input lag, but if Dolphin is playing NES with small input lag, then OS/USB are ruled out as causes. It doesn’t matter if the emulation is poor/glitched, really, as far as this topic is concerned.

I’ve understood all you’ve said. I disagreed with your words about the quality of the VC emulators, you took that to mean I agreed, I restated my disagreement, and now you’re saying we’re not talking about one part of what we were certainly talking about.

I agree that they would be the ideal emulators, and it’s not surprising that wii vc is near-perfect on hardware

Again, I don’t think the VC titles are near-perfect on Wii hardware. If you think, as you suggest in your first post, that VC titles on Dolphin may be performing with less input lag than, say, ROMs in RetroArch NES or SNES cores, then it’s really on you to test this unlikely possibility, as ‘feel’ isn’t enough for something that, frankly, defies credulity.

I’m not an emu developer, and I’m happy to be proven wrong when I’m negative about something. Few things make me happier than the developments of this thread, but I don’t think there’s anything to be gained with regard to input lag for the general emu scene by studying Dolphin, VC, or VC in Dolphin.

Oh, you’re one of those “input lag is in your head” guys… I think you’re in the wrong thread. I’m here to ask for help with testing. I don’t have the hardware to replicate Brunnis’ tests and am requesting assistance to verify. Your speculation and doubt are adding nothing. Try it yourself or don’t. Gg mate.

What? I’m an “input lag is in your head” guy? Really? Is that why I’ve lavished praise on Brunnis and just told you “Few things make me happier than the developments of this thread”? Is that why I’m critical of byuu’s rejection of the possibility of further improvements to input lag, and of his suggestion that an additional 16ms doesn’t matter?

I’m afraid your emulator lacks I/O accuracy. Nothing to be gained from this. Moving on…

"If anything, the idea, if true, that VC in Dolphin performs better than VC on the Wii is just testament to the Wii’s weakness. " This isn’t the idea at all. I’m sorry if this wasn’t clear enough for you. Yes, please stop discussing you are just confusing the issue. If you aren’t willing to help confirm that’s fine.

as stated in the op he’s using a retrolink usb controller. this is a basic usb hid device, not something you typically have to worry about. the controllers i’d be suspicious of would be ones that require specialized drivers or especially those cheapo console to usb adaptors. the latter are notorious for additional input lag.

also, before ppl bring up usb polling rates keep in mind that the 8ms default usb polling rate != 8ms of additional latency, that just means that your os only polls the controller for input every 8ms. so just ball parking it that’s more like ~4ms of additional latency on the avg (+ ~8ms worst case for just missing an input change, + ~0ms if just caught it). but compared to everything else this is kind of an insignificant factor.
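just to sanity-check that ballpark, here’s a tiny standalone simulation (plain C, nothing emulator specific) of an input change landing at a uniformly random point in an 8ms polling interval:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const double interval_ms = 8.0;  /* default usb poll interval */
    const int trials = 1000000;
    double total = 0.0;
    for (int i = 0; i < trials; i++) {
        /* input change lands somewhere inside the polling interval */
        double arrival = interval_ms * (rand() / (double)RAND_MAX);
        total += interval_ms - arrival;  /* wait until the next poll */
    }
    printf("average added latency: %.2f ms (worst case: %.0f ms)\n",
           total / trials, interval_ms);
    return 0;
}

this prints ~4.00 ms, matching the estimate above.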

according to the lead libretro/retroarch devs the video display stack (os wm / compositor, video drivers, etc) are responsible for nearly all additional latency on pc hardware. if the various software involved didn’t excessively buffer cmds and frames and gave us a universal way to control how/when the hardware draws and pushes frames to the display, then even with vsync & double buffering we’d be able to emulate a 2d raster based system with only ~1 frame of additional latency (that is step emu (guest vblank first, then active) to generate a frame, tell the gfx card to draw said frame immediately into the back buffer, then wait for vsync and flip front/back buffers). vulkan may very well allow us to have such control, but from what i understand at this point the drivers are still very much a wip.
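in code form, the ideal loop described above would look something like this (hypothetical host calls, an idealized sketch rather than anything retroarch currently does):

void poll_input(void);
void emulate_frame(void);        /* guest vblank first, then active video */
void draw_to_back_buffer(void);  /* issued immediately, no queued frames  */
void wait_vsync_and_flip(void);  /* the only place the loop waits         */

void ideal_frame_loop(void) {
    for (;;) {
        poll_input();
        emulate_frame();
        draw_to_back_buffer();
        wait_vsync_and_flip();   /* input-to-display: ~1 frame */
    }
}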

@vookvook re nintendo’s vc emu’s: software running on the wii/u has pretty much direct control over the hardware, add to the fact that nintendo probably knows how to use that hardware better than anyone else and so i’d assume their emu’s probably have only ~1-2 frames of additional latency over the original hardware. i’ve never examined dolphin’s source code in depth so i have no idea how it’s setup to work, but in a very best case scenario gpu cmds from the guest would be translated 1-1 to equiv. cmds on the host resulting in no additional latency. however, even if this is the case as stated above the video display stack on the host is probably going to add a few frames.

while it’s possible i think it’s unlikely that emulation of games thru dolphin would result in less latency than retroarch + nestopia/fceumm. equal maybe, but even then… i think you’d really have to have retroarch configured sub-optimally in order for this to be true. according to tests done by Brunnis he’s been able to achieve as low as ~2 frames of additional latency over orig hardware (even lower when adding a small frame_delay) in retroarch using the following settings, try them and see if things improve at all:


audio_sync = "false"
video_threaded = "false"
video_fullscreen = "true"
video_driver = "gl"
video_vsync = "true"
video_max_swapchain_images = "1"
video_hard_sync = "true"
video_hard_sync_frames = "0"
video_disable_composition = "true"

(note video_max_swapchain_images probably doesn’t apply here, i’d have to double check the source to retroarch to be certain but i set it here anyway, figure it can’t hurt)

[QUOTE=e-tank;45019][…] according to tests done by Brunnis he’s been able to achieve as low as ~2 frames of additional latency over orig hardware (even lower when adding a small frame_delay) in retroarch using the following settings, try them and see if things improve at all […][/QUOTE]

Hey, these settings worked great. I can’t tell the difference between them any more.

Thanks.

[QUOTE=wareya;44919]Just to note, this is expected when using “sync audio” on audio backends that take large chunks of audio at a time […][/QUOTE]

Thanks for this explanation, wareya.

[QUOTE=e-tank;45019]as stated in the op he’s using a retrolink usb controller […][/QUOTE]

Yep, while I don’t like to make assumptions, I wouldn’t expect this particular controller to add any significant delay in itself. I have been asked to look into the effect of de-bouncing circuitry in the controller and, while that would be interesting, I don’t really plan on going into such depth. Besides, with my test configuration, there really isn’t much of the input lag that we can’t already account for given the known quantities. Please see this post where I made a summary:

http://libretro.com/forums/showthread.php?t=5428&p=41748&viewfull=1#post41748

OpenGL on this Radeon R9 390 with the 16.5.2.1 drivers really seems to have a minimal amount of buffering. It seems to start scanning out the framebuffer immediately or almost immediately after swapping buffers, which means that there’s hardly any room for improvement. That said, my recent tests of the latest 16.7.3 driver show how volatile the situation is. That new driver added rather significantly to the input lag, for no apparent reason.

Good to hear!

Hi Brunnis and others,

You can check the polling rate of your game controller by running the tool USBView: http://msdn.microsoft.com/en-us/library/windows/hardware/ff560019(v=vs.85).aspx

The Device Bus Speed and the bInterval value together determine the polling rate in milliseconds, as shown on the following page (second table is for full speed devices):

https://msdn.microsoft.com/en-us/library/windows/hardware/ff539317(v=vs.85).aspx

I guess for minimal lag from the controller polling side you want it to be a Full Speed device with a bInterval value of 1 (resulting in a 1 ms polling rate).

[QUOTE=Dr.Venom;45113]You can check the polling rate of your game controller by running the tool USBView […][/QUOTE]

Thanks for the tip! I just checked my LED-rigged RetroLink controller: the bus speed is “Low” and bInterval is 0x0A. This translates to a polling interval of 8 ms, which is what I expected. As far as I know, it’s what pretty much all gamepads use, but I’d be happy to see examples of ones using a higher polling rate.

Either way, an 8 ms polling rate leads to an average contribution of 4 ms to input lag. Hunting down a gamepad with a higher polling rate should probably not be high on one’s list of priorities…

If I use vsync ON and a frame delay of, let’s say, 10… will there be any improvement in input latency from setting GPU Hard Sync Frames to 0? It seems like using frame delay and GPU Hard Sync together is a little better, but it’s hard to tell. GPU Hard Sync alone is not good enough, though. The reason I ask is that with the CRT-Royale shader I have to lower the frame delay if I have GPU Hard Sync on, or performance drops.

yes, when used in conjunction with vsync, frame_delay should remove that many ms worth of input latency.

that’s to be expected, many shaders are computationally expensive and will reduce the amount of time you’d otherwise be able to safely allocate via frame_delay.
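for reference, here’s a rough sketch of how frame_delay fits into the loop (made-up names, not actual retroarch source):

void sleep_ms(unsigned ms);      /* hypothetical host sleep */
void poll_input(void);
void run_core_and_render(void);  /* core + shaders must finish before the
                                    next vsync                           */
void wait_vsync_and_flip(void);

void frame_loop_with_delay(unsigned frame_delay_ms) {
    for (;;) {
        wait_vsync_and_flip();
        sleep_ms(frame_delay_ms);  /* e.g. 10 */
        poll_input();              /* snapshot is ~frame_delay ms fresher */
        run_core_and_render();     /* only (16.7 - frame_delay) ms left */
    }
}

since the core plus shaders now have only (16.7 - frame_delay) ms to finish, an expensive shader like CRT-Royale forces a lower frame_delay, exactly as described above.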

[QUOTE=Brunnis;44578]Finally, an important observation which could very well compromise the Vulkan results in this post: The rendering does not work as expected in Vulkan mode. I first noticed this when playing around with Yoshi’s Island. As soon as I started moving, I noticed that scrolling was stuttery. It stuttered rapidly, seemingly a few times per second, in a quite obvious and distracting way. I then noticed the very same stuttering when scrolling around in the XMB menu. I tried changing pretty much every video setting, but the stuttering remains until I change back to OpenGL and restart RetroArch.

When analyzing the video clips from my tests, I noticed that the issue is that frames are skipped. When jumping with Mario in Yoshi’s Island, I can clearly see how, at certain points, a frame is skipped.

My system is pretty much “fresh” and with default driver settings. Triple buffering on/off makes no difference. The stuttering appears in both RetroArch 1.3.4 and the nightly build I tested. Same behavior with Radeon Software 16.5.2.1 and 16.7.3.

I’ve seen one other forum member mention the very same issue, but I couldn’t find that post again. I’ve seen no issues on GitHub for this. I doubt that this is a software configuration issue, but I guess it could be specific to the GPU (i.e. Radeon R9 390/390X). Would be great if we could try to get to the bottom of this, because it makes Vulkan unusable for actual gaming on my setup and could also skew the test results.[/QUOTE]

I tested the new Nvidia driver version 372.54, which bumps Vulkan up to version 1.0.13, and it fixes the stuttering I was noticing in Symphony of the Night. I guess they fixed something with Vulkan’s vsync. I still can’t fast forward, though.