Let's talk about Latency (and Game Mode)

Wait, this question is off-topic for my own thread, but I have to know: which core are you using, and does this work in any PS1 game? I tried to eliminate the long loading times of Diablo (the only annoying downside of this port, in my opinion) but had no success at all.

If you use normal BIN+CUE or ISO images, make sure to check 'Preload CD-ROM Image to RAM'; if you use CHD, the best option is the highlighted one. While preloading games into RAM doesn’t have much to do with read/seek speed, it can definitely help avoid hiccups on a normal HDD. In Swanstation (and probably in Mednafen too) you can shorten load times with the other highlighted options. Be aware that, depending on the game, it can hang, while other games benefit a lot from the increased read/seek speeds, so if you notice a game misbehaving, try lowering the speed or setting it back to default altogether. I don’t think Diablo will have issues, but you should try it to make sure.

1 Like

Note that pre-cache/preload is mostly useful for unreliable storage devices (like when you have your CD images on some network device that can suddenly become very slow or completely drop out from time to time).

On a normal HDD (let alone an SSD) it’s not really useful. The 10x read speed option for example only requires 6MB/s from your storage device. Through a network you might not get that at all times, but from your local storage, it’s pretty much guaranteed.

2 Likes

Another benefit of the caching, especially for external USB HDDs, is that it also avoids hiccups, since these devices go to sleep after a few minutes. (You can use a third-party program to prevent that, but it’s an extra layer/step to take, and the Windows power options to keep the drive on all the time never worked for me; it always turns off. I had a few of these drives in the past and it can be annoying.) It can probably make HDDs live a bit longer too: since the games need random access like the real thing, the drive is accessed frequently, and with long games and extended play sessions, having everything in RAM is the better option as well.

3 Likes

On my Windows machine I am using RTSS scanline sync and I am pleased with the result.

Basically what you do is turn vsync off in retroarch and in your GPU control panel, then RTSS injects variable amounts of delay into the OGL/Vulkan/DX pipeline to keep the horizontal tear line in a fixed position at the top/bottom of the frame. Actually it’s hidden completely in the blanking area so you don’t even see it, and you can control the position of it manually as well with hotkeys while you’re playing the game.
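For anyone curious how that works mechanically, here's a rough conceptual sketch (not RTSS's actual code) of a scanline-synced present loop. The helper names are placeholders; on Windows the real raster query would be something like D3DKMTGetScanLine, and here the beam position is just simulated from the clock:

```python
import time

REFRESH_HZ  = 60.0
TOTAL_LINES = 1125   # visible + blanking scanlines of a 1080p60 mode (assumed timing)
SYNC_LINE   = 1100   # target line inside the blanking interval (RTSS lets you move this)

def get_current_scanline() -> int:
    """Stand-in for a real raster-position query (something like
    D3DKMTGetScanLine on Windows); here it's just faked from the wall clock."""
    t = time.perf_counter() % (1.0 / REFRESH_HZ)
    return int(t * REFRESH_HZ * TOTAL_LINES)

def present_frame() -> None:
    """Stand-in for the actual swap/present call, done with vsync disabled."""
    pass

def scanline_synced_present() -> None:
    # Spin until the (simulated) beam reaches the target line inside the
    # blanking area, then swap immediately with vsync off. The tear line
    # lands off-screen, so the picture looks vsynced even though no driver
    # vsync queue is involved.
    while get_current_scanline() < SYNC_LINE:
        pass
    present_frame()

if __name__ == "__main__":
    scanline_synced_present()
    print("presented at line", get_current_scanline())
```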

The catch is the game has to have consistent render times between frames otherwise you’ll briefly see the tear line appearing before RTSS hides it away again.

I use it with Retroarch NEStopia and standalone Dolphin, GTX1070 Windows 7. Also use it for other PC games.

1 Like

What is the benefit of RTSS over regular Vsync? In theory, doesn’t it increase latency, as it waits for the frame? Just like Vsync would do.

From what I’ve read it sounds like vsync, but if the game can’t deliver the frames in time, it stops behaving like vsync and shows tearing instead. So no slowdown caused by syncing. I guess that’s the benefit, right?

1 Like

In theory I’d imagine a latency-optimised implementation of vsync with no prerender queue would only add up to a maximum of 1 frame of latency (≈16.7 ms at 60 Hz), as that is the maximum time to wait until the current frame’s refresh interval is finished and it can be swapped with the next frame.

Anyway, scanline sync feels to me identical to vsyncless in terms of latency. So if you can’t feel any diff between vsync on or off, then you don’t need to bother with scanline sync.

I suppose the amount of latency added by scanline sync might depend on how much delay it has to add to shift the tear line to the edge and out of sight. For instance, if your tear line is already near the edge, it might only need to delay by 2ms to move it that final bit.

I’ve seen the option for “GPU hard sync” in retroarch and will have a play around with that later tonight as it sounds like it might potentially be as good as scanline sync.

The Nvidia driver also has something called “Fast Sync” which might work in a similar way, but I think it only kicks in when the game is rendering at an fps above the refresh rate, which obviously isn’t the case with emulation.

2 Likes

For reference here are the RTSS settings I’m using (it’s the version that comes packaged with MSI Afterburner)

Alright, had a play with this. I found both vsync and GPU hard sync must be enabled for it to work, and yes, it definitely feels to me like a big improvement in latency vs hard sync disabled. I have the hard sync frames set to 0.
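In case anyone wants to flip the same switches outside the menus, these should be the relevant retroarch.cfg keys (names from memory, so double-check them against your own config); here's a rough Python sketch that patches them with RetroArch closed:

```python
from pathlib import Path

# retroarch.cfg keys for the settings discussed above. Key names are from
# memory -- verify them against your own config before trusting this.
SETTINGS = {
    "video_vsync": "true",          # vsync has to be on for hard sync to apply
    "video_hard_sync": "true",      # "GPU Hard Sync"
    "video_hard_sync_frames": "0",  # "GPU Hard Sync Frames"
}

def patch_config(path: str) -> None:
    cfg = Path(path)
    lines = cfg.read_text(encoding="utf-8").splitlines()
    seen = set()
    for i, line in enumerate(lines):
        key = line.split("=", 1)[0].strip()
        if key in SETTINGS:
            lines[i] = f'{key} = "{SETTINGS[key]}"'
            seen.add(key)
    # Append any keys that were not already present in the file.
    lines += [f'{k} = "{v}"' for k, v in SETTINGS.items() if k not in seen]
    cfg.write_text("\n".join(lines) + "\n", encoding="utf-8")

# patch_config(r"C:\RetroArch-Win64\retroarch.cfg")  # run with RetroArch closed
```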

Subjectively I would say scanline sync still feels “just noticeable difference” more responsive than hard sync. But it’s very close, and hard sync has the advantage of not suffering the possibility of tearing if render times become inconsistent. In that scenario I would expect hard sync might drop a frame?

In any case for now I’ll switch to hard sync and see if there are any other issues with it. I am only using NEStopia core at the moment.

edit: after further playtesting with SMB1 and Zelda2 I’m feeling like hard sync and scanline sync are pretty much neck and neck.

edit2: I noticed there’s a RetroArch setting under Power Management called “Frame Rest” which says it’s meant to be used with scanline sync, but I don’t really understand it - it says it reduces vsync CPU usage, yet scanline sync is meant to be used with vsync disabled.

2 Likes

@pneumatic I don’t think RTSS or other tools can reduce the RetroArch latency any lower. Have you measured your results, for example with a high-speed phone camera? Differences of 1 frame of latency are too small to be consistently measured by the naked eye.

A few years ago I made some measurements (without “Frame Delay”) of what I believe is the best-case scenario for RetroArch. You can see the measurements here: An input lag investigation

Other than the “Frame delay” feature, one thing that should, in theory, reduce input lag slightly further is enabling VRR.

3 Likes

I wish that I had the tools to do this, but all I can do right now is jump around in mario and “feel”, which isn’t very accurate.

I have gsync on my pc monitor but I’m using a HTPC TV for retro games and that only supports 60hz fixed.

My EVGA 1070 just died yesterday, puff of smoke and PSU auto shutdown. I’m still salty about it but at least it didn’t destroy other parts. Put an old R9 380 in there and it feels like the same amount of latency.

The TV itself doesn’t have much internal latency (Samsung 768p plasma) and I’m happy with the feel even without game mode; it feels quite responsive to me as long as I run hard sync or scanline sync.

I will have to read the thread you linked and get up to speed before I can comment further.

If I had to guess I’d say you’re right and that Retroarch latency with hard sync is already maximally optimised.

Ouch! Sorry about your GPU.

You don’t need to read the whole thread. It’s huge, and earlier posts are full of outdated information.

Many modern smartphones have slow-motion modes in their cameras. As long as they go as fast as 240fps it should be enough to measure latency between the button press and the action on screen. Try to see if your phone camera has a slow-motion mode. Don’t forget that Super Mario Bros has 2 frames (I think) of internal input lag though. EDIT: It’s only 1 frame of lag for SMB after all.

Regardless, if I were you I would play in game mode. And if you use Vulkan or other drivers, you should set Max Swapchain Images to 2.

1 Like

Hey it does, I didn’t know that. But it’s only 120fps…so I guess that means it will be inaccurate by around 8ms or something? Well it might still be useful… I’ll have to play around with it - thanks for the tip.

I would, but the colours in that mode are garish and it feels quite responsive in Movie mode. If I recall correctly, the lower-end Samsungs used a cheaper MediaTek processor which, for whatever reason, has less lag than the higher-end models. I think the Windows mouse cursor has something like hard vsync and it doesn’t feel rubber-bandy in Movie mode.

1 Like

Yes, with a 120fps recording and a careful frame-by-frame analysis you can get a pretty good ballpark for the input lag I think.

1 Like

I believe nyquist is involved here :smiley:

2 Likes

The results are in!

Keep in mind this is with my old R9 380, which might be adding latency because it’s slower with the CRT shader - I’ll retest next week when the 3060 arrives. The wireless WaveBird may be adding some too, but these are the conditions I play under, so I want to know “buttons to pixels”.

I start counting from 0 when the button is fully depressed, i.e. the first frame with the button depressed = frame 0. I repeated this 6 times and took the average (frame counts are frames of the 120fps recording).

Vsyncless
9F, 9F, 8F, 8F, 7F, 7F = 66ms

Scanline sync
11F, 11F, 10F, 10.5F*, 10F, 11F = 88ms
* because image contained a blend of 2 frames due to plasma subfields

Hard vsync
14F, 13F, 12F, 12F, 13F, 12F = 105ms

Normal vsync (off in Radeon Settings, on in Retroarch)
14F, 15F, 15F, 15F, 16F, 16F = 126ms

Trivia: the little white specular reflection on the A-button was critical to knowing when the button was actually depressed, otherwise it was too ambiguous
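For anyone who wants to check the arithmetic: each frame of the 120fps recording spans about 8.3ms, so the totals above are just the average frame count times 1000/120. A quick sketch:

```python
CAMERA_FPS = 120                    # slow-motion recording speed
MS_PER_FRAME = 1000 / CAMERA_FPS    # ~8.33 ms resolution per counted frame

def avg_latency_ms(frame_counts):
    """Average the per-trial camera-frame counts and convert to milliseconds."""
    return sum(frame_counts) / len(frame_counts) * MS_PER_FRAME

print(f"{avg_latency_ms([9, 9, 8, 8, 7, 7]):.1f} ms")        # vsyncless  -> ~66.7 ms
print(f"{avg_latency_ms([14, 13, 12, 12, 13, 12]):.1f} ms")  # hard vsync -> ~105.6 ms
```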

3 Likes

Interesting that RTSS is giving you lower input lag than Hard Vsync. Not sure why that happens.

Why not test Vulkan with Max Swapchain images set to 2? That should give you the lowest possible input lag without the performance penalty of Hard Vsync.

I’ve never tested OpenGL much because I never had reason to use it, so I don’t know much about its lag.

Here are my results for Vulkan with max swapchain = 2.

Vsyncless
8F, 7.5F, 8F, 9F, 7F, 7F = 64ms

Scanline sync
11F, 10.5F, 10F, 11F, 10.5F, 11F = 88ms

Normal vsync (off in Radeon Settings, on in Retroarch)
10F, 11F, 10F, 11F, 11F, 10F = 87ms

Normal vsync but with “Radeon Anti-Lag” enabled in Radeon Settings
11F, 10.5F, 10F, 10F, 11F, 10F = 86ms

Looks like Vulkan vsync is the way to go!

I’ll have to retest with the 3060 next week but this is looking pretty promising. 88ms would mean my TV has 55ms of latency in Movie mode after deducting the 2 frames of SMB internal lag, which seems about right for a low end Samsung TV of that era (2013). I know I know, I should use game mode… but I hate it, the colours are just not to my liking. I’d definitely use it for competitive gaming though.

The main takeaway for me personally is that as long as I keep button-to-pixels latency at no more than 100ms, I’m generally satisfied. I can still feel some latency at 100ms, but it feels OK to me for casual play. Adding an extra frame or two on top of that is where I start to feel it intruding into the experience even for casual games. If the game doesn’t involve any timing challenges I might tolerate more, if that were the only way to get a smooth framerate.

2 Likes

Yeah, this is the important thing. As long as you get it below your personal threshold, you’re all set.

I’d be curious to see your game mode results. Also, don’t forget about runahead, since it’ll shave off entire frames at a time :wink:
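If you end up scripting your config like earlier in the thread, the runahead keys in retroarch.cfg should be roughly the ones below (names from memory, so the menu is the safer route):

```python
# Runahead keys as they should appear in retroarch.cfg (names from memory --
# verify against your own config). One frame matches SMB's internal input lag.
RUNAHEAD_SETTINGS = {
    "run_ahead_enabled": "true",
    "run_ahead_frames": "1",                 # match the game's internal lag frames
    "run_ahead_secondary_instance": "true",  # second core instance hides the rollback, costs RAM/CPU
}
```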

2 Likes

I have done the same test as proposed here: https://www.youtube.com/watch?v=tnD56BI-ZGA

But with RetroArch, instead of a MiSTer. I’ve gotten 3 frames of lag with a desktop PC and regular TV and 3~4 frames with my notebook. I used:

  • Video driver: Vulkan
  • Max Swapchain Images: 3
  • Input Poll Type: Early
  • Game mode: Enabled
  • A wired (not wireless + cable) USB controller

Everything else was default. If I set the max swapchain images to 2, I reduce the input lag by a whole frame on both devices.
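In retroarch.cfg terms, the settings above should map to roughly these keys (names from memory; treat this as a sketch and double-check against your own config):

```python
# Approximate retroarch.cfg equivalents of the settings listed above
# (key names from memory -- verify against your install).
TEST_SETTINGS = {
    "video_driver": "vulkan",
    "video_max_swapchain_images": "3",  # set to "2" to shave off one more frame
    "input_poll_type_behavior": "0",    # 0 = Early, 1 = Normal, 2 = Late
}
```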

3 Likes