An input lag investigation

Thanks for the detailed post Brunnis.

After some testing I noticed that Mednafen Saturn has significantly more lag than most other cores. I tested Sonic on Genesis and the Saturn version in Sonic Jam, using the same settings, and there’s a big difference. I’d say the Saturn version is barely playable, even with the best possible settings.

Is this a known problem with the core? I have a real Saturn and it’s not that bad.

Also, is video_max_swapchain_images = “1” better than “2” if the system can handle it?

@GemaH Game consoles prior to the 5th generation (Saturn, PSX, N64) lacked full video frame buffers and were therefore able to output frames to the display faster. For most 60 fps games, 1 additional frame of latency is to be expected on these newer consoles. Mednafen Saturn has 2 additional frames of latency, though this extra frame may be unavoidable if it’s inherent to the hardware. I’m not sure; for all I know it could be a design decision in Mednafen (a la how the SNES cores used to be), or an error causing additional frame buffering, etc.

From what I know of the Saturn’s hardware, which apart from how VDP1 operates is limited, I always assumed the Saturn wasn’t inherently any laggier than the PSX. But I’m clearly not the expert here and am personally willing to trust the Mednafen author’s call on this one, though it would definitely be nice if she or someone else could confirm it for us. I did try to confirm it myself, but after 30 minutes or so of looking through the source and some tech docs I gave up, as it would’ve required more time than I have available at the moment. Maybe VDP2 adds a frame of latency on top of what VDP1 already adds when it pushes the complete frame to the display; if that’s the case, there’s nothing that can be done about it.

Edit: I’m an idiot, it could just as well be the software (i.e. the game), in which case there’s definitely nothing that can be done about it. As I said many pages back in this thread, to understand where latency is coming from you need to know A. how the hardware works, B. how the emulator works (both core and RetroArch), and C. how the game works. I completely forgot C. I made the above post under the assumption that the source of the 1 extra frame of latency for most games (compared to direct ports on the PSX, like Rockman 8 for example) is either in the hardware or in Mednafen’s implementation, but that’s only true if games made on the Saturn follow the same programming patterns as games on the PSX or other similar hardware. While I think that’s likely, I have no basis for the claim. The standard graphics libraries that Sega provided (which I’d assume most games used) could be adding a frame of latency over what similar games on the PSX had. However, as of right now I’ve yet to come across a game with a minimum response time of less than 4 frames when frame stepping Mednafen Saturn, whereas on the PSX I know for a fact that many 60 fps games were programmed in a way that gave them a minimum response time of 3 frames, both on actual hardware and when frame stepping Mednafen PSX.

TL;DR: no clue, but it’s definitely worth investigating, or at least prodding some more knowledgeable people about.

Also, I just checked Sonic Jam and wow, you’re right, it’s pretty laggy. Sonic 1 has a minimum response time of 2 frames, both on hardware and via frame stepping in genesis_plus_gx (via RetroArch). Sonic 1 in Sonic Jam, frame stepped in Mednafen Saturn (via RetroArch), has a minimum response time of 5 frames! I can’t say for certain, but I believe 1 of the extra frames is from the emulated hardware (double buffering), 1 is from either the emulated hardware (maybe VDP2?), the emulator (vblank/active ordering or extra buffering? this is the only case that could possibly be eliminated), or the software (graphics library?), and 1 is from the software. Basically, no matter what happens, the game is always going to be quite a bit laggier on the Saturn. Now add in the extra frames of latency associated with emulating a Saturn on PC hardware and you’re getting into no man’s land lag territory.
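Just to make the accounting explicit, the observations above add up like this. This is only a sketch of the reasoning; the labels on the extra frames are guesses, not confirmed facts:

```python
# Rough latency accounting for Sonic 1 in Sonic Jam on Mednafen Saturn.
# The frame counts come from the frame-stepping observations above; the
# labels for the extra frames are guesses, not confirmed facts.
FRAME_MS = 1000 / 60  # duration of one frame at 60 Hz

genesis_min_response = 2  # Sonic 1 on Genesis (hardware and genesis_plus_gx)

extra_saturn_frames = {
    "emulated hardware (double buffering)": 1,
    "VDP2 / emulator ordering / gfx lib (unclear)": 1,
    "game software": 1,
}

saturn_min_response = genesis_min_response + sum(extra_saturn_frames.values())
print(f"Saturn minimum response: {saturn_min_response} frames "
      f"(~{saturn_min_response * FRAME_MS:.1f} ms)")
```

So even before any emulator-side latency, the Saturn version starts roughly 50 ms behind the Genesis original.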

Hey e-Tank, thanks for the detailed response.

I’m not very knowledgeable about the inner workings of these things; I’m just an end user with a CRT PC monitor, a CRT TV + real consoles, an LCD TV and a lot of free time for testing.

It’s just that after Brunnis’ latest post, the lag problem in Mednafen Saturn became even more noticeable to me. Although, to be fair, Mednafen Saturn is probably the most demanding core, so I can’t use the same options with all cores; I have to make some sacrifices, like using GPU Sync Frames “1” instead of “0” (I have an i5 4670). The minimum input lag I could manage is almost as bad as it was with SSF. I remember Yabause being a bit better at this and I need to re-test it; it’s just that I can never run it at full speed in software mode for some reason, I always have to use frame skip.

[QUOTE=lordmonkus;49147]Obviously these settings are for with V-Sync On but what if any settings would be changed if you have V-Sync Off like I do because of having a G-Sync monitor ? The way I understand it (and hopefully understand correctly) is the video frame delay has zero effect when running V-Sync Off.[/QUOTE] I actually don’t know enough about how that’s implemented to be able to say for sure. Maybe someone else can chime in on that?

A video_max_swapchain_images setting of 2 will actually force vsync on, as RetroArch will be made to wait until the buffer flip has been performed before generating another frame. I’ve added a note about that in my previous post.
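For reference, the relevant retroarch.cfg lines look something like this. These are example values for the low-latency setup being discussed, not the only valid combination:

```ini
# retroarch.cfg -- example low-latency settings (adjust to your hardware)
video_vsync = "true"
video_max_swapchain_images = "2"   # wait for the buffer flip before emulating the next frame
video_frame_delay = "0"            # optional extra; raise carefully on fast hardware
```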

Sure, go ahead and post it over there (and preferably link back to the original post). :slight_smile:

[QUOTE=Tatsuya79;49149]Nice summarization.

I would tell to ignore the video_frame_delay setting at first, this is too unreliable and gives less gain. Just for advanced users who want to experiment (and make per game config).[/QUOTE] Thanks. I’ve added a note about that to my previous post.

You’re welcome! Giving hardware recommendations is pretty hard, actually. One really needs to test both hardware and the GPU driver to know for sure. For minimum input lag (to be able to use video_max_swapchain_images = 2 and a bit of frame delay), I would go for an Intel x86 system and probably use the integrated graphics (mainly to get a low power system that’s durable and easy to cool). Either a Core model or perhaps the upcoming Apollo Lake based systems (some info here). I’m kind of in the process of evaluating hardware for my next build, so I will get back to you on that front. My next step will be testing input lag performance of my Core i7-6700K under Linux KMS/DRM.

I’ve not really read up on the nitty-gritty technical details of G-Sync and FreeSync, but I’d assume they need two framebuffers as well. The difference is that when rendering to a framebuffer has been completed, it can be scanned out immediately without waiting for sync. In the case of RetroArch, where we want (need) to output a consistent 60 FPS, I don’t really see the benefit from an input lag perspective. If someone has a different view on this, I’d be happy to hear it.

It probably depends on the GPU driver on Linux as well, but OpenGL on my Broadwell laptop with integrated Intel graphics has one frame lower input lag than OpenGL (with the closed source driver) on the Pi. In other words, it matches the input lag performance of Dispmanx on the Pi. Setting video_max_swapchain_images = 2 on the Broadwell system removes another frame of input lag, making it even faster than the Pi with Dispmanx.

The reason for Dispmanx being one frame slower than the Broadwell system with OpenGL and video_max_swapchain_images = 2 is that Dispmanx on the Pi is hardcoded to use three framebuffers (i.e. video_max_swapchain_images = 3); RetroArch’s Dispmanx driver doesn’t support the video_max_swapchain_images setting. I rewrote the Dispmanx driver to support it, but it turns out even the Raspberry Pi 3 is too slow to run every SNES game at full speed with video_max_swapchain_images = 2, so I decided not to push the updated code.

Yes, KMS is already supported with the experimental open source driver called VC4. I’ll write a separate post about that soon.

As I wrote further up in this post, I’m not sure. If I were to guess, I’d say that it helps even with G-Sync, but I will let someone else confirm that.

I’d prefer to hold on to any preliminary info for now. :slight_smile: Hopefully I’ll have something soon.

No idea what the current situation with Lakka is, sorry.

Well, it’s definitely faster from a processing performance point of view. So it should be better equipped to use video_max_swapchain_images = 2 and maybe frame delay. But it also depends on the GPU driver, so testing would be needed to confirm that.

No, that will not have any effect when using OpenGL. Any setting below 2 is the same as 2 and any setting above 3 is the same as 3.

I’ve been reading some things about FreeSync vs G-Sync, and each apparently has a small latency advantage over the other in certain circumstances. It seems G-Sync also has a buffer inside the monitor that can be a source of latency in some cases. I think the consensus suggests FreeSync might be preferable for emulation purposes, while G-Sync is better for typical asynchronous gaming (i.e., where game logic is not synced to frame timing).

Frame delay should still matter on these variable refresh monitors because you’re still trying to hit the ~60 fps target. If audio sync were disabled and it were just running as fast as possible, it wouldn’t matter, but as long as there’s still a set interval on the frames, you’ll want to use frame delay to get as close to that interval as possible.
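As mentioned earlier in the thread, frame delay is unreliable enough that it works best as a per-game override rather than a global setting. A hypothetical override might look like this (the file path and value are illustrative; RetroArch loads game overrides from config/&lt;core name&gt;/&lt;game name&gt;.cfg):

```ini
# config/Genesis Plus GX/Sonic The Hedgehog (USA).cfg -- hypothetical per-game override
# Delay emulation of each frame by N ms after vsync to shave input lag;
# too high a value causes audio crackle and slowdown.
video_frame_delay = "8"
```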

@Brunnis Thanks again, I have copied and pasted your post here over to a thread on the Launchbox forums while linking back directly to your post and giving you 100% credit for the information.

@hunterk Thanks for your info as well. Good to know that frame delay still is a useful setting even with V-Sync off.

I wish I had the hardware to accurately measure and test with G-Sync. I know it’s one of those things that not many people have in their setups and right now there is little information on it when it comes to emulation. All I can say from my experience with it so far is that it is very good even while still trying to work out 100% optimized settings.

@hunterk

Does max_swapchain_images have any effect in Windows? The blog post about it only mentions Linux.

I don’t actually know for sure, but it looks like it only applies to OpenGL in a KMS/DRM context, which isn’t available in Windows.

EDIT: oh yeah, from maister: “Only KMS. It is not possible to control this in regular GL. GPU Hard Sync is similar, but the hacky version.”

I tried changing it to 2 but couldn’t notice a difference. I know there are games that let you switch between double and triple buffering, so it should be possible. The 1 frame input lag reduction would be nice to have.

Edit: saw your edit. Hmm, maybe I’ll make a Lakka flash drive.

Do you have access to a camera that can record video at a high frame rate (> 60 fps)? If so, would you be willing to run some tests on Sonic 1 in Sonic Jam on your Saturn, in order to determine the game’s actual average response time? While having an LED wired up to a controller (like Brunnis has been doing) certainly helps in getting accurate readings quickly, it can be done without one too. If you hold the controller in front of the display you’re recording and get enough samples (say 20-30 or so), I bet we could get an accurate enough reading to help determine where the problem lies, i.e. whether it’s the emulator or not. If you plan on doing this, please let me know, as I’ll give you some tips that might help you get more accurate readings.
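If you go the camera route, the counting itself is simple: step through the recording frame by frame, and for each sample note the frame where the button is visibly pressed and the frame of the first on-screen reaction. A quick sketch of the arithmetic (the capture rate and frame indices below are made-up illustration data, not real measurements):

```python
# Estimate a game's average input-to-response time from high-speed footage.
# Each sample is (press_frame, first_reaction_frame), read off by stepping
# through the recorded video. Sample values are invented for illustration.
CAPTURE_FPS = 240       # frame rate of the recording camera (assumption)
GAME_FRAME_MS = 1000 / 60

samples = [
    (12, 32), (105, 126), (201, 220), (307, 328), (412, 431),
]

delays_ms = [(react - press) * 1000 / CAPTURE_FPS for press, react in samples]
avg_ms = sum(delays_ms) / len(delays_ms)

print(f"average response: {avg_ms:.1f} ms "
      f"(~{avg_ms / GAME_FRAME_MS:.1f} game frames @ 60 Hz)")
```

With 20-30 real samples, the average should be stable enough to tell a 3-frame game from a 5-frame one.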

Edit: although I can tell you I don’t have high hopes that there’s really any room to remove latency from Mednafen Saturn, besides general optimizations that may eventually reduce the CPU load and allow you to run it with more aggressive settings. Of the Saturn emulators out there that allow frame stepping (Yabause via libretro, and MAME), Mednafen Saturn appears to be the quickest to show a response to inputs. My guess is that it’s already doing everything it can and that the problem lies either in how the hardware works or in the software itself… but again, anyone who’s able to do testing on real hardware could at least confirm this for us.

@brunnis : wow, that’s a comprehensive post, it’s so much clearer in my mind now, thanks :slight_smile: So a NUC seems to be a great solution, meanwhile.

@all: do you all use TN monitors, or do some of you use (A-M)VA / IPS monitors? The latter have better contrast, better colours, better viewing angles etc., but not as good latency. Does it matter for oldschool 2D gaming?

I have an IPS monitor, but it also supports G-Sync, so I guess the lag is compensated for by the G-Sync function.

I’d say pixel response time is less of an issue than input lag (I tend to view them as different things). A longer pixel response time blurs the picture slightly, but generally doesn’t affect playability much (unless it’s really bad). It’s true that VA and IPS monitors usually have a few milliseconds longer pixel response time than TN, but whether the extra blurring is distracting varies from person to person. It’s also important to note that the monitor’s input lag, i.e. the processing delay between input and start of pixel change, isn’t worse on VA/IPS than TN. Input lag depends on the design of the specific monitor model, not on the panel technology.

I personally use an IPS monitor (HP Z24i) which appears to have very low input lag. I also use a plasma TV, which has pretty high input lag (~2.5 frames) but quick pixel response time.

What SNES core and what games are you talking about? Only SuperFX games, or others as well?

Can’t wait to see your results testing various setups on the Pi, it’s very much appreciated!

I used the snes9x2010 core. Primarily tested with Yoshi’s Island, so, yes, SuperFX. Ran some tests with Super Mario World as well and believe I had some slight hiccups but nothing too serious.

Come to think of it, I might as well upload the code and people can tinker with the setting if they want to. It will probably work fine for NES and most SNES games on a Pi 3 and emulator and game specific settings can be used to just apply it where it works.

Thanks!

@Brunnis Forgot to ask you, hopefully you’ll see this question before you log off :slight_smile: Are you doing your tests with audio_sync enabled? If you are, might I suggest disabling it in future tests; it’s possible that it could be skewing results slightly / be responsible for some of the hiccups you experience. Probably not a big deal, but it’s one less variable to worry about.

Created a pull request for the updated DispManX driver: https://github.com/libretro/RetroArch/pull/3815

[QUOTE=e-tank;49299]@Brunnis Forgot to ask you, hopefully you’ll see this question before you log off :slight_smile: Are you doing your tests with audio_sync enabled? If you are, might I suggest disabling it in future tests; it’s possible that it could be skewing results slightly / be responsible for some of the hiccups you experience. Probably not a big deal, but it’s one less variable to worry about.[/QUOTE] Thanks, I’ll keep that in mind. I’ve had audio_sync enabled so far.

I don’t have that type of equipment.

All I can share is some words here on this forum. Also, I don’t have Sonic Jam on the real Saturn, but I do have Duke Nukem and it’s the first game I usually test. There is a noticeable difference between it and Mednafen Saturn with the highest settings possible (still, I can’t manage 0 GPU sync frames). For Sonic, I just compared Mednafen Saturn with Genesis Plus GX and there was a huge difference in lag (I didn’t even have to use the real Mega Drive).

I really hope some improvements can be made to Mednafen Saturn on that front. It’s the best Saturn emulator right now, and input lag seems to be its only weakness. SSF is also pretty bad. Not many options here.

The pull request has been merged. Compile RetroArch from source on your Raspberry Pi and set video_driver = “dispmanx” and video_max_swapchain_images = 2 to test. Be prepared for anything demanding to stutter or run slow, though. Raspberry Pi 3 highly recommended.

EDIT: I also updated my previous post with settings recommendations to include info about using video_max_swapchain_images = 2 on the Pi.

I’m sorry if this has been addressed (I couldn’t find it), but currently, which cores have these input lag improvements applied on their main branches? I know Snes9x was being worked on, but now that there’s a Snes9x 2010 and a Snes9x on the updater, and no longer a Snes9x Next… I’m a bit lost.