Input Lag Compensation to compensate for game's internal lag?

Edit:

Visitors finding this thread, you can download builds implementing Lag Reduction from the Nightly Builds page. (Link goes to Windows 64-bit, but you can find other builds around there too).

Original post:

For a game, there are several critical timings: The time of the joystick input, the time the game has decided what to display on the screen, and the time the image actually appears on the screen.

For the Atari 2600, there is zero input lag. Input is sampled and the game logic runs during vblank time, then the screen image is generated as the screen renders using the new information about the game’s state.

But for the NES and later 2D consoles, the game logic runs during render time. So as the game is reacting to your input, it is displaying the previous frame on your TV, and you won't see the effects of your joystick input until the next frame. This is one frame of internal lag.

Sonic the Hedgehog games on the Genesis happen to have two frames of internal lag.
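The difference can be shown with a toy frame loop (just an illustration, not real console code): the game logic latches the input immediately, but the image on screen trails the logic by however many frames of internal lag the game has.

```c
#include <assert.h>

/* Toy model of internal lag. Each frame the "game" reacts to input at
 * once, but the picture shown on screen is from an earlier frame's state.
 * All names here are made up for illustration. */
typedef struct {
    int state;        /* what the game logic currently knows */
    int pipeline[4];  /* rendered images queued before reaching the screen */
    int lag_frames;   /* internal lag: 1 for NES-style, 2 for Sonic */
} toy_console;

/* Run one frame: sample input, update logic, shift the video pipeline.
 * Returns the value that actually appears on screen this frame. */
int toy_run_frame(toy_console *c, int input)
{
    c->state = input;                            /* logic reacts immediately */
    int shown = c->pipeline[c->lag_frames - 1];  /* but this is what you see */
    for (int i = c->lag_frames - 1; i > 0; i--)
        c->pipeline[i] = c->pipeline[i - 1];
    c->pipeline[0] = c->state;                   /* reaches the screen later */
    return shown;
}
```

With `lag_frames` set to 1, a button press shows up one frame after it happened; with 2, two frames after, which is the Sonic case.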

Anyway, the “gens-rerecording” emulator has an example of how to deal with internal lag, as seen in the basic update_frame code:


int Update_Frame_Adjusted()
{
	if(disableVideoLatencyCompensationCount)
		disableVideoLatencyCompensationCount--;

	if(!IsVideoLatencyCompensationOn())
	{
		// normal update
		return Update_Frame();
	}
	else
	{
		// update, and render the result that's some number of frames in the (emulated) future
		// typically the video takes 2 frames to catch up with where the game really is,
		// so setting VideoLatencyCompensation to 2 can make the input more responsive
		//
		// in a way this should actually make the emulation more accurate, because
		// the delay from your computer hardware stacks with the delay from the emulated hardware,
		// so eliminating some of that delay should make it feel closer to the real system

		disableSound2 = true;
		int retval = Update_Frame_Fast();
		Update_RAM_Search();
		disableRamSearchUpdate = true;
		Save_State_To_Buffer(State_Buffer);
		for(int i = 0; i < VideoLatencyCompensation-1; i++)
			Update_Frame_Fast();
		disableSound2 = false;
		Update_Frame();
		disableRamSearchUpdate = false;
		Load_State_From_Buffer(State_Buffer);
		return retval;
	}
}

So, to compensate for N frames of latency, what it does is this:

  • Run one frame quickly without outputting video or sound
  • Save State
  • Run N - 1 frames quickly without outputting video or sound
  • Run one frame with outputting video and sound
  • Load State
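The net effect of those steps can be checked with a minimal model where the "emulator" is just a frame counter and save/load state copy that counter (the function names are placeholders, not the real gens-rerecording APIs):

```c
#include <assert.h>

/* Minimal model of the run-ahead sequence. */
static int emu_frame = 0;     /* emulated frame the core is really on */
static int shown_frame = -1;  /* frame last presented with video+sound */
static int saved_frame = 0;   /* one-slot savestate buffer */

static void run_frame_fast(void) { emu_frame++; }  /* no video, no sound */
static void run_frame(void)      { emu_frame++; shown_frame = emu_frame; }
static void save_state(void)     { saved_frame = emu_frame; }
static void load_state(void)     { emu_frame = saved_frame; }

/* One run-ahead step compensating for n frames of internal latency. */
void run_ahead_step(int n)
{
    run_frame_fast();              /* run one frame without A/V */
    save_state();                  /* remember where we really are */
    for (int i = 0; i < n - 1; i++)
        run_frame_fast();          /* run n-1 more frames without A/V */
    run_frame();                   /* the one frame you see and hear */
    load_state();                  /* rewind to the real position */
}
```

Each call advances the real emulated position by exactly one frame, while the frame presented on screen is always n frames ahead of it, which is what cancels out the game's internal lag.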

edit: fixed the typo on third bullet point, thanks Brunnis

11 Likes

I modified RetroArch to force it to do this sequence every time it draws a frame:

  • Run one frame without outputting the video or sound
  • Save State
  • Run one frame normally
  • Load State

It is not optimized at all yet. Since it runs the core twice per frame, it runs half as fast as the normal frontend. Right now it serves as a stress test to see if the emulator cores can handle saving and loading state 60 times per second without screwing up.

Many cores are messing up here. Most cores will have audio problems due to loading state 60 times per second, other cores (such as QuickNES) have weird graphical problems in addition to the audio problems. The Genesis Plus GX core and BSNES cores seem to work alright.

I would really like to see someone measure the input lag properly with a high-speed camera and LED on the controller, comparing the stock version to the hacked version.

Download link for Experimental Input Lag test version, Windows x64, requires RetroArch 1.7.1 installed.

2 Likes

Any cores that work well with netplay should theoretically work well for this, as well, since it’s a similar workload. Snes9x and FBA are good candidates. FCEUmm should be good, too, I think.

Snes9x and FCEUmm are having severe audio problems from repeatedly saving and loading state 60 times per second.

I’d say most of the cores are having issues with saving state and loading state 60 times per second.

Gambatte works fine however.

This is ingenious! :smiley: I’ve made a few tests on two systems:

  • Core i5-5300U (laptop)
  • Core i7-6700K @ 4.4 GHz

Both systems are able to maintain framerate (reported as 120 FPS with this hack) with no drops when running Super Mario Bros (Nestopia) and Super Mario World (Snes9x). Both systems had Hard GPU Sync enabled as well.

However, as you say, sound is borked. I’m guessing this is not for performance reasons, but rather due to something that happens when the state is loaded. Is the sound buffer emptied on load state? It sounds similar to buffer underruns… If so, is it possible that we need to have a custom load state function that omits some stuff that the normal load state function does?

As for the effect on input lag: I have not measured with a camera yet. However, doing the pause + single frame step test, I can confirm that your build removes one full frame of lag. I tested it in Super Mario World and Mario jumps on frame two instead of frame three. Very promising! If this pans out, it’s possible that emulation could have lower input lag than a real console. :astonished:

Just a quick note: I think bullet point number three should say “without outputting video or sound” as well, right?

1 Like

Sounds like an exciting development :smiley:

I think Dwedit’s statement is correct, basically because you want to output the entire “future frame” within the frame and then return to previous state.

Edit: you changed Dwedit’s “with” to “without”, so I’m not sure what exactly you’re pointing at, the bolded video or audio, or the with or without…

Yes, but you do that in bullet point number four. His bullet point number three does not match the code he quoted above it.

Ah, sorry I misread the bullets. Looks like you’re right.

It might be worth mentioning that, at least on the NES, there are cases where games react on the next frame. One such example is the selection screen you get to after you press Start on the Mega Man 2 title screen. That particular case will not cause any issues with this fix, but there might be other cases. Would perhaps be worth investigating.

I had toyed with a similar concept a couple of years ago but in my concept, it only loaded states when the input actually changes (which doesn’t happen as often as you would think):

The rest of the time, the game would be running one frame ahead, assuming you would still be pressing (or not) the same buttons one frame in the future (which is usually true).
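A minimal sketch of that idea, with a toy "core" whose state is just an accumulator and where every name is made up for illustration: the game always runs one frame ahead on the last-seen input, and a rollback (load state plus re-run) only happens when the real input turns out to differ from that guess.

```c
#include <assert.h>

static int core_state  = 0;  /* pretend game state: sum of inputs applied */
static int saved_state = 0;  /* checkpoint of the last confirmed frame */
static int predicted   = 0;  /* input we guessed for the speculative frame */
static int rollbacks   = 0;  /* how many times we had to re-run a frame */

static void core_run(int input) { core_state += input; }

void predictive_frame(int input)
{
    if (input != predicted) {
        core_state = saved_state;  /* guess was wrong: rewind ... */
        core_run(input);           /* ... and re-run with the real input */
        rollbacks++;
    }
    saved_state = core_state;      /* checkpoint the confirmed frame */
    core_run(input);               /* speculate one frame ahead */
    predicted = input;             /* assume input stays the same */
}
```

Since held buttons rarely change from one frame to the next, the rollback path runs only occasionally, so most frames cost a single `core_run` instead of two.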

For testing and debugging, I think we still need the save state/load state 60 times a second, just so we can identify bugs in the cores, especially audio bugs, and knock them out.

Then later on, for performance, we can compare input state and emulate fewer frames.

How would your method scale to running more than one frame ahead? How would it handle game audio? Seems that the best way to handle game audio is to output audio only from frames where we output video.

You sometimes need to handle two frames of input lag. Sonic the Hedgehog has 2 frames of lag. Punch-Out, famous for being one of the most lag-sensitive games, has a minimum of 2 frames of internal input lag for everything, including the main menu, and pressing start to start a match. Additional lag is added to most actions for game difficulty.

As for games which are able to react without any lag frames, such as in a menu, you’d end up missing one frame of animation, and possibly one frame worth of audio, but it’s not a big deal, since the game action part would still have a minimum of 1 frame of input lag.

It would function just like your method, AFAICT. The only difference is that it tries to predict input and uses fewer rollbacks when that prediction is correct.

I believe the rewind function does some special stuff to preserve (and reverse) the audio samples, so you might want to copy it or see if you can hook into the rewind function altogether, since it does half of what you want, anyway (that is, state every frame, put it in a rolling buffer).

Here is what I’m actually doing right now:


   //replacement for core_run here
   {
	  video_driver_set_stub_frame();
	  audio_driver_suspend();
	  core_run();
	  video_driver_unset_stub_frame();
	  audio_driver_resume();

	  retro_ctx_serialize_info_t serial_info;
	  retro_ctx_size_info_t info;
	  core_serialize_size(&info);
	  serial_info.data_const = NULL;
	  serial_info.size = info.size;
	  void * stateBuffer = malloc(info.size);
	  serial_info.data = stateBuffer;
	  core_serialize(&serial_info);
	  serial_info.data_const = stateBuffer;
	  serial_info.data = NULL;
	  core_run();

	  core_unserialize(&serial_info);

	  free(stateBuffer);
   }

To suspend video, it calls “video_driver_set_stub_frame” and “video_driver_unset_stub_frame”.

To suspend audio, it calls two new functions, “audio_driver_suspend” and “audio_driver_resume”. They stop the functions “audio_driver_sample” and “audio_driver_sample_batch” from doing anything, so the samples from skipped frames never even get put into any buffers.
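The gating pattern being described might look roughly like this (a sketch, not RetroArch’s actual implementation): a flag makes the sample callbacks discard everything, so skipped frames never push audio into any buffer.

```c
#include <assert.h>
#include <stddef.h>

static int    audio_suspended  = 0;
static size_t samples_buffered = 0;  /* stand-in for the real audio FIFO */

void audio_driver_suspend(void) { audio_suspended = 1; }
void audio_driver_resume(void)  { audio_suspended = 0; }

/* Called by the core for every stereo sample pair it outputs. */
void audio_driver_sample(short left, short right)
{
    (void)left; (void)right;
    if (audio_suspended)
        return;                      /* skipped frame: drop the samples */
    samples_buffered += 2;           /* real driver would enqueue here */
}
```

The point of dropping the samples this early is that nothing downstream (resampler, output buffer) ever sees audio from the speculative frames, so there is nothing to flush or rewind afterwards.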

What happens if you don’t do this part? Is audio messed up in a different way?

If you skip this step, it runs at half speed and plays all audio it generated.

Just fixed QuickNES, there was one tiny bug in the savestate code where it forgot to save the phase of the triangle wave.

edit: only fixed audio, still other graphical glitches.

1 Like

Good progress, though. If there’s graphical glitches, I guess there’s probably some other problem with the save state code in QuickNES, right? I couldn’t spot any visual glitches in Nestopia or Snes9x.

Cool idea, just wondering if you think the constant save/load state would add wear and tear to your hard drives. I don’t know much about drives, just wondering.

From what I can tell, it just keeps the state in memory and frees it immediately after using it to load state. So it never touches disk.

1 Like

Just fixed the bug in QuickNES that caused all the graphical corruption. It had nothing at all to do with savestates, and also happened in vanilla RetroArch.