Let's talk about Latency (and Game Mode)

The way runahead works is: savestates are being made constantly and held in memory. Any time you change your inputs (e.g., you press a button or let go of a button that’s currently held), it rolls back X frames (where X is your runahead value), applies your input change in the past, and then re-emulates the frames between then and the current frame with your input change applied, all within the time it takes to show the next frame.

So, depending on your perspective, it’s either showing you the future based on your current inputs (i.e., running ahead) or sending your inputs into the past and then retroactively applying them to the present.

As long as the runahead frames do not exceed the game’s internal latency, you will not see any of this. It all happens invisibly, under the hood. However, if you exceed the game’s internal latency, you will see rollback artifacts in the form of skipped animation frames, choppy motion, etc., just like you would see during rollback-based netplay on a dodgy connection.
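To make the rollback idea concrete, here is a minimal toy sketch of the description above. It is not RetroArch's actual code: the "game" (`step`), its 3-frame internal latency (`LAG`), and the `play` helper are all made up for illustration.

```python
from typing import List, Tuple

LAG = 3  # assumed internal latency of this toy game, in frames

# Toy state: (inputs still "in flight" inside the game, player x position)
State = Tuple[Tuple[bool, ...], int]


def step(state: State, pressed: bool) -> State:
    """Emulate one frame: the oldest queued input finally moves the player."""
    pending, x = state
    x += 1 if pending[0] else 0
    return (pending[1:] + (pressed,), x)


def play(inputs: List[bool], runahead: int) -> List[int]:
    """Return the player x position shown on screen after each frame."""
    states: List[State] = [((False,) * LAG, 0)]  # states[i] = savestate before frame i
    used: List[bool] = []                        # used[i] = input applied on frame i
    shown: List[int] = []
    for frame, pressed in enumerate(inputs):
        if runahead > 0 and used and pressed != used[-1]:
            # Input changed: roll back, rewrite the past inputs, re-emulate.
            start = max(0, frame - runahead)
            del states[start + 1:]
            for i in range(start, frame):
                used[i] = pressed
                states.append(step(states[i], pressed))
        used.append(pressed)
        states.append(step(states[frame], pressed))
        shown.append(states[-1][1])
    return shown


inputs = [False] * 5 + [True] * 5  # button goes down on frame 5
for n in (0, 3, 5):
    print(f"runahead={n}: {play(inputs, n)}")
```

With runahead 0 the toy player only starts moving on the 4th frame after the press. With runahead 3 (equal to the toy game's internal latency) it moves on the very frame of the press and the motion stays smooth. With runahead 5 the position jumps from 0 straight to 3, which is the skipped-frames artifact described above.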

To make the latency match the original hardware, you first need to measure the total latency of the original hardware setup (we’ll call this A), then measure the total latency of your emulation setup (we’ll call this B). Subtract A from B and the remainder is what you need to shave off, either with runahead or with whatever other mitigation strategy you prefer (frame delay, hard GPU sync, etc.). Since latency is experienced as a total of all contributing factors, reducing it via any of the available means feels the same. That is, you reduce the time between your finger pressing a button and your eyeballs seeing the effects produced on-screen.
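The A/B subtraction can be written out as a tiny helper. This is only a sketch: the function name and the example measurements (50 ms and 100 ms) are made-up placeholders, not real measurements of any setup.

```python
def latency_to_shave(original_ms: float, emulated_ms: float,
                     refresh_hz: float = 60.0) -> tuple:
    """Return (excess in ms, excess in frames) of the emulation setup (B)
    over the original hardware setup (A)."""
    excess_ms = emulated_ms - original_ms   # B - A
    frame_ms = 1000.0 / refresh_hz          # duration of one frame
    return excess_ms, excess_ms / frame_ms


# Hypothetical measurements: A = 50 ms on original hardware, B = 100 ms emulated.
excess_ms, excess_frames = latency_to_shave(50.0, 100.0)
print(f"shave off {excess_ms:.0f} ms, about {excess_frames:.1f} frames at 60 Hz")
```

However many frames come out, they can be split between runahead, frame delay and the other options in whatever combination your machine handles well, since only the total matters.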

3 Likes

In addition to everything hunterk said, you can check this too:
https://docs.libretro.com/guides/runahead/

It teaches you the correct way to find how many frames of internal latency a game has.

3 Likes

An interesting game to try out and really feel the difference is King of Dragons for the arcade. It probably has an intended input delay to give your character more “weight”. My current setup is running on Vulkan with max swapchain images set to 2. With runahead OFF, the character reacts on the 4th frame (tested using the frame advance function with the K key; the core used is FBNeo, since MAME doesn’t support runahead so far, though it still has even lower natural input latency in some cases). Turning runahead on and setting it to 3 makes the character react on the next frame. The difference is huge, but Capcom games often have this delay, maybe one frame less on real hardware, which I suppose is intended. If you run this test yourself, I’d say that once you get used to the more responsive controls these functions provide, it will be a bit hard to play like we used to back then, or even on older hardware that can’t afford such features.
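The counting in that test boils down to "frame on which the character first reacts, minus one". A small sketch of that bookkeeping follows; the function names are made up, and the list of observations is something you would note down by hand while stepping with K.

```python
from typing import Sequence


def first_reaction_frame(reacted: Sequence[bool]) -> int:
    """1-based index of the first frame-advance step on which the character
    visibly changed while the button was held."""
    for n, changed in enumerate(reacted, start=1):
        if changed:
            return n
    raise ValueError("the character never reacted in the recorded steps")


def suggested_runahead(reacted: Sequence[bool]) -> int:
    """Runahead value that makes the reaction land on the first frame."""
    return first_reaction_frame(reacted) - 1


# The King of Dragons observation above: nothing on steps 1 to 3, reaction on step 4.
steps = [False, False, False, True]
print(suggested_runahead(steps))
```

For the observation above this suggests 3, the same value used in the test.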

1 Like

The feeling of weight could be given to the characters with longer animations if desired. If those 4 frames of delay are present for all actions and moves in the game, it’s probably an indication of a technical limitation of the game engine and not intentional. The fact that multiple games from the same company have the same number of frames of latency probably means those games use the same engine.

Your PC and RetroArch settings (other than runahead) don’t change anything in the game’s internal latency, which is what you’re measuring with K. Even with the worst settings for latency, you will always measure the same 4 frames of internal latency on that game.

1 Like

Yes, Capcom arcade games, at least in emulation, have 3 to 4 frames of latency in my experience. Interesting points; so basically only runahead really makes a difference there. Still, with it ON and at 3, most Capcom games feel great to play. The only downside shows up once you play official compilations such as the Genesis Collection on the Xbox Series X: playing Shadow Dancer, or basically any other fast-paced game, makes it clear that on the PC, using RetroArch or even standalone emulators that are now incorporating runahead natively, the games are much more responsive. So it can make us a bit spoiled, especially for action-oriented games.

1 Like

This explanation was just perfect! I got it entirely.

Thank you! I’ll check this, that’s what I needed.

1 Like

Not only do official compilations lack run-ahead support (that I know of), but they’re also typically terrible when it comes to input lag.

More often than not, those compilations use emulation with no attention paid to latency at all. I don’t ever buy any of those compilations because of those and other issues.

3 Likes

If I can have a better experience playing on RetroArch instead of the original console, I guess I’ll go for it. I’m crazy enough to try to maximize competitiveness and fun at the same time. So I should buy some refurbished or FPGA-based consoles if I want to record some authentic gameplay, and use RetroArch functions to make some really hard games as painless as possible.

1 Like

I agree, and that’s why I said the only downside is that we can’t unnotice that. I tried it on my brother’s Xbox and told him that it was hard for me to play Shadow Dancer, a game I’m used to playing. For him, the games were probably good enough, but since I’m more into PC emulation than he is, I’m spoiled, or just a bit more aware of this difference. About runahead: do you notice whether it adds sound delay compared to the same test with it off? I’m pretty sure that some games add a bit of delay in that regard.

Absolutely, especially if you’re into shooters and action games in general, it’s that much better. Be aware that FPGA, while an interesting approach, isn’t better by default just because someone said it is. It’s still emulation and it still requires good coding. Or, as the article below states, any inherent input delay comes from the OS and not from the code itself. Then again, I’m not expert enough to really explain why that is.

Near wrote this article (archived from 2018) which explains it better than I could. I really like both approaches, I just think this should be clarified, though.

3 Likes

Yeah, after one is spoiled by low latency, going to another system with higher latency is immediately noticeable. There’s this feeling of disconnection from the actions in the game.

Not sure, as I don’t use runahead myself, but you can test sound synchronization with the 240p Test Suite for the SNES, for example.

This. It’s very important to note that an FPGA might, or might not, beat the latency of a beefy PC. It depends.

3 Likes

Has anyone noticed that by overclocking, depending on the game, input lag is halved or can be even lower? I was always fascinated by PC-to-console ports and I just tried the Quake II release for the PS1. The game runs fine, mostly at 30 FPS, and looks great, especially for the hardware. I noticed that the game feels a bit heavy on the controls, but what I wanted to do was see how far I could smooth out the gameplay, so I started overclocking from 200% up to 1000%. From 300% onwards, the game runs at 60 FPS most of the time, but I suddenly noticed the controls were much more responsive too. I measured 7 frames of delay at the stock CPU speed, 4 frames at 300% overclock, and 3 frames from 500% up. The sweet spot for this game is around 500%; I couldn’t notice any improvements beyond that.

By setting Swanstation’s internal runahead to 1 (Core Options), I could reduce the input lag even further, so that the character reacts on the second frame, where 3 frames from the overclock alone is already quite impressive. I didn’t fine-tune it extensively, though, as the overclock and the runahead together started to make the game hiccup a bit; I was pushing it too hard for my old i7-2600.

Still, I found it very interesting to see how input lag can be inherent to the game’s own performance limitations, depending on how it was coded, and how it can be basically eliminated with modern enhancements, while making the game look stunning with a tweaked shader (which still needs more time to reach that goal, by the way) and even in widescreen. Very impressive stuff. If you try this approach with other cores/games to see improvements in input lag across different systems, I’m sure the info will be useful to others as well. Here’s a short one showing the overclocked performance: Edit: I’ve uploaded a much better video:

5 Likes

Higher frame rates always bring lower input lag, that’s natural.

3 Likes

I was going to say that since you’re doubling the framerate from 30 to 60 fps, you would naturally get lower latency. But the expected latency reduction should be only half a frame at 30 fps (~16 ms), and you got a reduction of 3 frames at 60 fps (50 ms).
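For anyone who wants to check the numbers, here is the arithmetic as a quick sketch. It assumes the delays reported above (7, 4 and 3 frames) were counted in 60 Hz frame-advance steps; the helper name is made up.

```python
def frames_to_ms(frames: float, fps: float = 60.0) -> float:
    """Convert a number of frames at a given framerate to milliseconds."""
    return frames * 1000.0 / fps


# Delays reported in the thread for Quake II on PS1, in 60 Hz frames (assumed).
measured = {"stock": 7, "300% overclock": 4, "500%+ overclock": 3}
for label, frames in measured.items():
    print(f"{label}: {frames} frames = {frames_to_ms(frames):.1f} ms")

# Expected gain from 30 -> 60 fps alone: half of one 30 fps frame.
expected_gain_ms = frames_to_ms(0.5, fps=30.0)
# Observed gain between stock and 300% overclock.
observed_gain_ms = frames_to_ms(measured["stock"] - measured["300% overclock"])
print(f"expected ~{expected_gain_ms:.1f} ms, observed {observed_gain_ms:.1f} ms")
```

So the framerate doubling alone would explain roughly 17 ms, while the measured drop from stock to 300% is about 50 ms.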

It’s impressive how that game benefits so much from overclocking.

2 Likes

That makes sense. The benefits from getting less latency were unexpected, though, as I was focused on framerate performance. You’d expect the game to control better just by making it run consistently, that’s for sure.

True, I wasn’t really even thinking about going that far, but the way the game is designed really benefits from this. I’ll say that even at its stock performance, if I still had a PS1 console around, the game itself is very enjoyable. I’m used to how games worked and actually find them fascinating in one way or another. I don’t recall once thinking SNES games were slow, or any drops in framerate being an annoyance. As a kid and a teen, hooking a Power Base Converter to the Mega Drive to play Master System games was fascinating, and it still is. Also, each piece of hardware behaved so differently, and so did the same games across these systems.

Don’t take my word for it, though, if you can test yourself and confirm the benefits from this game or others that can benefit from increased internal performance, that would be very interesting.

PS: Another improvement shown in the video above that I forgot to mention is the ability of the core/frontend to preload the CHD into memory, along with the increased CD read speed and faster seek times. I probably could go even further to eliminate loading times, as the game is split into small sections that need to load here and there. One might say “are you still playing a PS1 game or ‘insert retro game console/game here’?” Yes, it’s still a PS1 game. The design is so good that it shows how much more these games can offer when enhanced; it’s like they have so much potential to be discovered and enjoyed.

3 Likes

There were definitely some games with lots of slowdown on the 8 and 16 bit consoles, but yeah, from my experience, framerate consistency started going down from the PS1 onwards.

2 Likes

Wait, this question is off-topic for my own thread, but I must know: what core are you using, and does this work in any PS1 game? I tried to eliminate the long loading times of Diablo (this is the only annoying downside of this port, in my opinion) but I had no success at all.

If you use normal Bin+Cue images or ISOs, make sure to check “Preload CD-ROM Image to RAM”. If you use CHD, the best option is the highlighted one. While preloading games into RAM doesn’t have much to do with the read/seek speed, it can definitely help avoid hiccups if you’re using a normal HDD. In Swanstation, and probably in Mednafen too, you can shorten load times by using the other highlighted options. Be aware that, depending on the game, it can hang, while other games benefit a lot from the increased read/seek speeds. So if you notice a game misbehaving, try lowering the speed or setting it back to default altogether. I think Diablo won’t have issues, but you should try it to make sure.

1 Like

Note that pre-cache/preload is mostly useful for unreliable storage devices (like when you have your CD images on some network device that can suddenly become very slow or completely drop-out from time to time.)

On a normal HDD (let alone an SSD) it’s not really useful. The 10x read speed option for example only requires 6MB/s from your storage device. Through a network you might not get that at all times, but from your local storage, it’s pretty much guaranteed.

2 Likes

Another benefit of caching, especially for external USB HDDs, is that it avoids hiccups, since these devices go to sleep after a few minutes. (You can use a third-party program to prevent that, but it’s an extra layer/step to take, and the Windows power options to keep the drive ON all the time never worked for me; it always turns off. I had a few of these drives in the past and it can be annoying.) Caching can probably make HDDs live a bit longer too: the games need random access like the real thing, so the HDD is accessed frequently, and with long games and extended play sessions, having the image in RAM is the better option.

3 Likes

On my Windows machine I am using RTSS scanline sync and I am pleased with the result.

Basically, what you do is turn vsync off in RetroArch and in your GPU control panel; RTSS then injects variable amounts of delay into the OGL/Vulkan/DX pipeline to keep the horizontal tear line in a fixed position at the top/bottom of the frame. Actually, it’s hidden completely in the blanking area so you don’t even see it, and you can also control its position manually with hotkeys while you’re playing the game.

The catch is the game has to have consistent render times between frames otherwise you’ll briefly see the tear line appearing before RTSS hides it away again.

I use it with RetroArch (Nestopia core) and standalone Dolphin, on a GTX 1070 under Windows 7. I also use it for other PC games.

1 Like