Strange behavior in MAME (lag/control)

It sounds very hacky and i have a feeling it would break all kinds of raster effects, but feel free to send a PR if you feel confident about this, we’ll look into it.

I figured a visual signal was needed to know exactly when a button is pressed. “jstest” shows all the button mappings, and the AppImage lets me run several instances of the same installation. The signal is recorded directly on the computer, which I think is more reliable than filming the monitor with an external camera. I mapped pause to L3, which is shown as [9], and frame-by-frame advance to R3, which is shown as [10].

Some test videos

These are some results:

            ra-mame   fbneo   mame
sf2ce
punch          5        5       4
move           4        4       3

kof98
punch          7        6       6
special        3        2       2
move           4        3       3

mk1
punch          5        5       4
blocking       4        3       3

mk2
punch          5        5       4
blocking       5        5       4

1941
shoot          4        4       3
special        3        3       2

galaga
shot           5        3       4
move           4        3       3

mslug
all moves   5-6-7, variable in all three. MAME stays mostly at 5.

mslug 5
all moves   3-5, variable in all three. MAME stays mostly at 3.

i can assure you it’s not hacky at all, and if anything it is “the right way” to do things in an emulator that runs a full frame at a time. if done correctly nothing will be broken, other than maybe save state compatibility? (not sure about that…)

now, my quick and dirty patch implementation here is another matter - this may break some shit. (you’ve been warned) i’m not familiar enough with fbneo’s code nor its neogeo driver to know if what i’ve done is kosher, but at first glance nothing seems broken and it does what i said it would, that is reduce input latency in a lot of games:

patch file here (i can do a proper pr at a later time if need be, but i’m not so great at using git so you’ll need to bear with me):

md5sum hash of patch file:
03315e32c2c6650c04007e109a650f2a  FBNeo-reduceneogeolatency.patch
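
Before applying the patch, the download can be checked against the hash above. This is a minimal sketch in Python using only the standard library; the helper names `md5_of_file` and `patch_is_intact` are made up for illustration, and the path you pass in is whatever location you saved the patch to.

```python
import hashlib

# Hash quoted in the post above.
EXPECTED_MD5 = "03315e32c2c6650c04007e109a650f2a"

def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def patch_is_intact(path, expected=EXPECTED_MD5):
    """True if the file at `path` matches the expected MD5 hash."""
    return md5_of_file(path) == expected
```

Usage would look like `patch_is_intact("/path/to/FBNeo-reduceneogeolatency.patch")`; only run `patch` if it returns True.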

apply with:

cd /path/to/src/FBNeo/
patch --strip=1 < /path/to/FBNeo-reduceneogeolatency.patch

it’s a start and i figure this should be enough to convince people on the technical merits so that it can be put on urs and/or other devs radar. (cave driver could probably use a similar fix, i couldn’t help but notice for dodonpachi someone just fudged the numbers on vblank to get the latency down, which has the effect of pushing when the game polls input into the next emulated frame slice - which if you want to talk about hacky solutions…) anyway, if you want an explanation of why this reduces latency and is the “proper” (imo, i know…) way to structure an emulator frame loop, keep reading.

tl;dr time:

the vast majority of games will read input either during vblank or shortly after (that is, near the start of active frame scan out). sure, they can read it whenever, but most follow this incredibly common programming pattern, so why not accommodate it? with that in mind here’s a pseudo code example of one such common gameloop pattern on 2d raster based hw like the neogeo:

while (running) {
    wait_for_vblank()
    read_input()   // runs in vblank.
    setup_gfx()    // runs in vblank.  setup sprites, transfer tile data, etc.
    step_world()   // runs in active.
    prepare_gfx()  // runs in active.  buffer sprite attributes, etc.
}

notice how the graphics are pipelined, setup_gfx() sets up the hw to draw the previous frame, which is scanned out while step_world() and prepare_gfx() work on the current one. now consider how the above runs on the following emulators:

emulator_a (how neogeo w/fbneo currently works):

while (running) {
    read_input()
    emulate_vblank_last_half()
    emulate_active()
    emulate_render_gfx()
    emulate_vblank_first_half()
    display_and_throttle()
}

emulator_b (how the patch changes it):

while (running) {
    read_input()
    emulate_vblank()
    emulate_active()
    emulate_render_gfx()
    display_and_throttle()
}

(hint: start by looking at emulate_vblank_first_half() in the former and emulate_vblank() in the latter and what corresponding game pseudo code executes there, then keep going until the first reaction frame gets rendered and then displayed)

this pseudo code game, which on actual hw would have a minimum response time of 2 frames (thus an average response time of ~2.5 frames), shows a response after 3 frames when frame stepping emulator_a, but after 2 frames on emulator_b. the patch fixes viewpoint (and many other games like it) so that it responds on the 2nd frame. magician lord is unaffected since it reads input during active frame scan out (or the last half of vblank, can’t remember), so it responds in 2 frames on both. this should also stabilize the metal slug games so that their response time varies by at most 1 frame (due to them running game logic at 30 Hz), so long as no slowdown is going on (looking at you, metal slug 2)
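
The frame counts above can be reproduced with a toy model. The sketch below is not FBNeo code: `Game` and `frames_to_response` are hypothetical names, and it assumes the game does both `read_input()` and `setup_gfx()` in the first half of vblank, so `emulate_vblank_last_half()` is left out. The button is held from before the first stepped frame, and the function counts frames until the reaction is scanned out.

```python
class Game:
    """Toy game using the pipelined loop: latch input and program the video
    hardware in vblank, then step the world during active scan out."""

    def __init__(self):
        self.latched = False        # input as seen by the game's read_input()
        self.world_reacted = False  # has step_world() acted on the input yet?
        self.buffered = False       # prepare_gfx() output, waiting for vblank
        self.hw = False             # what the video hardware will scan out

    def vblank(self, host_input):
        self.latched = host_input   # read_input()
        self.hw = self.buffered     # setup_gfx(): uses the previous frame's buffer

    def active(self):
        if self.latched:
            self.world_reacted = True        # step_world()
        self.buffered = self.world_reacted   # prepare_gfx()


def frames_to_response(vblank_first, max_frames=10):
    """Count stepped frames until the reaction is visible on screen."""
    game = Game()
    for frame in range(1, max_frames + 1):
        host_input = True           # emulator's read_input(): button is held
        if vblank_first:            # emulator_b: vblank, then active, then render
            game.vblank(host_input)
            game.active()
            shown = game.hw
        else:                       # emulator_a: active, render, then vblank
            game.active()
            shown = game.hw
            game.vblank(host_input)
        if shown:
            return frame
    return None


print("emulator_a:", frames_to_response(vblank_first=False))
print("emulator_b:", frames_to_response(vblank_first=True))
```

In emulator_a the host input polled at the top of the loop only reaches the game at the very end of the iteration, which is where the extra frame comes from in this model.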

As i was suspecting, this change broke raster effects (look at river position in turf masters attract mode).

Well, i’ll see with other FBN devs if they can think of a way to use your idea without breaking gfx, however if we can’t i’m not too concerned about this since we have runahead support (standalone included).

ah, i see now. i’m fairly certain this is just because i didn’t compensate for the change by adjusting the various irqoffset logic properly. it may be worth trying to fix (i’ll try when i have time, though i’m hoping one of the devs can quickly spot the issue), as this makes runahead more usable as well. for example, with this change metal slug x has a solid response time of 3 to 4 frames; w/o it, it’s unstable at 3 to 5, since the exact point where it reads input oscillates between emulated frame slices. so if you follow the general recommendation of only setting runahead to the minimum response time you can get, this reduces input latency with runahead by a frame as well.
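
The runahead arithmetic for the metal slug x figures can be sketched as follows. Two things are assumptions here, not something stated in the post: that runahead of N frames simply removes N frames from every observed response time, and that the setting is picked as the minimum observed response time minus one (leaving one frame, since a game cannot respond in zero frames). The function names are hypothetical.

```python
def recommended_runahead(observed_frames):
    """ASSUMPTION: rule of thumb of minimum observed response time minus one."""
    return max(min(observed_frames) - 1, 0)


def residual_latency(observed_frames, runahead):
    """ASSUMPTION: runahead hides `runahead` frames from each response time."""
    return [max(frames - runahead, 0) for frames in observed_frames]


unpatched = [3, 4, 5]   # metal slug x without the change: unstable at 3 to 5
patched = [3, 4]        # with the change: 3 to 4

for name, observed in (("unpatched", unpatched), ("patched", patched)):
    runahead = recommended_runahead(observed)
    left = residual_latency(observed, runahead)
    print(f"{name}: runahead={runahead}, remaining latency {min(left)} to {max(left)} frames")
```

The runahead setting is the same in both cases because the minimum is the same; what changes is the worst case that is left over.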

As a GroovyMAME user: yes, MAME inherited a special option originally available in GM, called low latency mode, which shaves a frame of lag off games.

It was added roughly a year ago.


Thanks for confirming, i had a guess something had changed in MAME’s frontend code; i remember reading the MAME libretro port had the same input lag as standalone (using the high-speed camera method at the very least).

Now i’m wondering if their method is more reliable than runahead, and if it could be implemented in RetroArch as an alternative or complement to runahead.


I believe e-tank’s solution there is essentially the same thing brunnis did with snes9x and a few other cores a while back.


i grep’d the source code for mame’s lowlatency option (src/emu/video.cpp:234), here’s what it effectively does:

while (running) {
    read_input()
    emulate_frame()
    if (low_latency) {
       display()
       throttle()
    } else {
       throttle()
       display()
    }
}
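
To see why the order of `display()` and `throttle()` matters, here is a rough timing sketch of one loop iteration. It is a simplification under stated assumptions, not MAME's actual timing: `throttle()` is modeled as sleeping until one frame period has passed since the top of the loop, vsync and driver buffering are ignored, and the 4 ms emulation time is a made-up number.

```python
def input_to_display_ms(low_latency, frame_ms=1000 / 60, emulate_ms=4.0):
    """Time from read_input() to display() within one loop iteration."""
    t = 0.0                  # read_input() at the top of the loop
    t += emulate_ms          # emulate_frame()
    if low_latency:
        return t             # display() runs before throttle()
    t = max(t, frame_ms)     # throttle(): wait out the rest of the frame period
    return t                 # display() runs after the wait


normal = input_to_display_ms(low_latency=False)
low = input_to_display_ms(low_latency=True)
print(f"throttle then display: {normal:.2f} ms")
print(f"display then throttle: {low:.2f} ms")
```

In this model the saving is the idle time spent in `throttle()`, which approaches a full frame period when emulation is fast relative to the frame rate.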

i also looked into the libretro core implementation of mame and can confirm it’s effectively structured like so:

while (running) {
   emulate_frame()
   read_input()
   display_and_throttle()
}

the above being a gross simplification, so fixing it isn’t exactly trivial. a consequence of trying to fit a square peg (mame frontend interface classes) into a round hole (libretro api)
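
The cost of polling input after emulating can be shown with a small counter. This sketch only models the loop ordering from the two pseudo code blocks; the function name and parameters are hypothetical, and nothing here reflects the real MAME or libretro interfaces.

```python
def iterations_until_input_used(read_before_emulate, press_iteration=0, max_iterations=5):
    """Count loop iterations, starting with the one in which the button is
    first held, until a frame is emulated with that input."""
    latched = False                      # last value returned by read_input()
    for i in range(max_iterations):
        pressed = i >= press_iteration   # host button state during iteration i
        if read_before_emulate:
            latched = pressed            # read_input()
            used = latched               # emulate_frame() sees fresh input
        else:
            used = latched               # emulate_frame() sees last iteration's input
            latched = pressed            # read_input()
        if used:
            return i - press_iteration + 1
    return None


print("read, then emulate:", iterations_until_input_used(True))
print("emulate, then read:", iterations_until_input_used(False))
```

With the emulate-first ordering, the input polled in one iteration is only consumed by the frame emulated in the next one.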

once my eyes stop bleeding from having stared into the abyss that is mame’s code base and interface design in order to just determine this fact, maybe i’ll try and think of a way to address it… definitely no promises from me though


Some time ago I thought of something that may be absurd.

Can MAME be dissected?

It could be an advantage to have separate cores per arcade board. It would be easier to update and maintain both the emulator and the romset, and additional features such as widescreen and HD could be added. The actual “video games” in MAME are updated very rarely. In the last year, 3 games and 10 clones (5 of them bootlegs) would be a generous count.
The same could be done with FBNeo, using whichever emulator works best for each board.

(Promo: read with enthusiastic background music.)
Play SFII with an exclusive core for the CPS-2 (based on FBNeo?). 0 lag with runahead, a widescreen hack and an HD texture pack are things that only happen in RetroArch. :slightly_smiling_face:

Feel free to create your FBNeo fork, don’t name it FBNeosomething though, call it beetle-cps2 or whatever, i don’t want to deal with that shit. The best thing that happened from renaming FBAlpha to FBNeo is that fbalpha2012 bug reports don’t end up in my repo anymore.


I understand your frustration, but that does not answer my question.

Yes it can, standalone even has a modular build system to do this.


I do not know the technical challenges; I only bring it up because it could be beneficial for the RetroArch ecosystem.

I don’t think so, just more cores that would need to be maintained, so who will do it? Not you apparently, and if there is no maintainer then it’s just more outdated cores using more buildbot resources and confusing more people about which core they should pick (note: currently there are people who think fbalpha2012-neogeo is the core they must use to play neogeo on desktop computers…). FYI, current mame core is not actively maintained and already has quite a lot of issues.


I understand that it is difficult to maintain many cores; for SNES alone there are 17 (plus MAME, fbalpha and FBNeo).

The idea is not to depend on MAME: split it off, make it independent and, if possible, add features in the future. Zinc and Modeler are still used (HD, 1080p). Obviously it would not be one core per board, since there are hundreds, but it could be done by generation or by type: CPSX, SystemX…

It could also be an option for consoles that have no core: a single emulator for generation 1, with a fixed romset, names, screenshots, etc…

I will lay out the idea better later, so as not to derail the main topic…

That is not the users’ fault. RetroArch lacks documentation, and what exists is difficult to access.

I do not know how to do it. If I did, I would, but the result would surely be… “shit”?

I always think things through before saying them, and I have thought a lot about MAME: ease of use, accessibility, maintenance, organization, future direction, etc. I always focus on the user experience, beyond the technical side that I do not know.

I am simply convinced that “it’s easier to break than to repair.”

I hope the analogy is understood.

This I can’t agree with. Devs should not be expected to hold users’ hands. I could probably count on one hand the times the Docs section has not given me enough information, and that is where the forums come into play. It would come down to “work on the core, or work on the documentation to cover every nook and cranny of the cores and which ones to use in this or that situation”. Doing that, I believe, would simply wear a dev out to the point of being “done”.


I am not saying that it is the developers’ fault, nor that they should lead users by the hand. But there must be at least a minimal description of what the emulator does, because otherwise the effort is lost.

Without going far: there is nothing that explains why the fbalpha2012 core exists. There is no information in the core or in the docs.

The N64DD is a mystery; if it is not explained here, it cannot be used. Widescreen is another hidden secret.

I am someone who searches the Internet a lot before asking, so as not to waste people’s time. If @hunterk had not explained each MAME core to me, I would never have found out, because at that time there was nothing about it on the Internet.

There is more, but yes, the documentation has had a lot of work put into it and is getting better. What I said, explicitly, is that it is “lacking or difficult to access.”

I think we are getting a bit off topic but…

Just because MAME can be dissected doesn’t mean it would be easy to start adding fancy features. If it were, MAME would have accelerated 3D video. “Dissecting” really just means only a single driver is compiled. AFAIK a lot of other stuff, not really needed for the driver, is also compiled so there isn’t even a lot of space saved. Standalone full compile is 432,446,689 bytes and compile with only DEC PDP1 driver is 430,394,386 bytes. (An old version but you get my point. :grin:)
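
To put the two build sizes quoted above in perspective, the difference can be worked out directly. The variable names are just for illustration; the two byte counts are the ones from the post.

```python
full_build = 432_446_689      # standalone full compile, in bytes
single_driver = 430_394_386   # compile with only the DEC PDP1 driver, in bytes

saved = full_build - single_driver
percent = 100 * saved / full_build

print(f"{saved:,} bytes saved ({percent:.2f}% of the full build)")
```

That is roughly 2 MB out of more than 430 MB, well under one percent.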

The MAME core is just barely a core at all. Only the minimum things to make it work have been done and we are very lucky it is updated at all.

All things considered it works very well.


This is the description which i believe is shown in the online updater for fbalpha2012-neogeo: https://github.com/libretro/libretro-core-info/blob/master/fbalpha2012_neogeo_libretro.info#L20. It sounds pretty clear to me.

If you think the documentation from https://docs.libretro.com/ is lacking, contributions are welcome, yet again i don’t think the documentation is the problem for picking the right core (because very few users read it anyway), only the overwhelming amount of cores (cores like fbalpha2012-neogeo shouldn’t be available on systems without heavy memory limitation, period), and your idea can only make things worse. Also, your reasons for wanting cpsX or systemX split cores are very lacking, because i doubt widescreen is achievable for those systems (aside from the games where it’s already supported), let alone HD…
