Premise
Within any given MAME or FB Alpha ROM set collection, the same title can exist as any one of four distinct, equally valid zip archives, or as any one of four equally valid 7z archives. CRC scanning the romset zip and 7z files as whole files doesn’t make sense in that context: it’s too different from the ‘native’ validation approach used by MAME. MAME’s own validation is characterized below.
This thread is my attempt to start a specification for arcade ROM scanning based on the ‘native’ validation method employed by MAME and FB Alpha. I am making this effort in the interest of science.
Overview and terminology
Arcade games are packaged as zip files, most of which contain more than one individual ‘ROM’ file. In MAME and FB Alpha parlance, a ZIP file containing each of the ROM files needed to emulate one game is called a “ROM set”. Some resources refer to an individual arcade game as a ROM (the way people describe a zipped game cartridge ROM, which is actually one ROM file inside the zip), while other resources refer to an individual arcade game as a ROM set or romset.
I will follow mamedev convention and use the term romset to refer to a zip or 7z archive containing the ROM files for one game.
ROM set version and formats
Each version of an arcade emulator must be used with ROM sets that have the same exact version number. For example, MAME 0.37b5 sets are required by the mame2000 core, but will not work correctly with the mame2010 core, which requires MAME 0.139 ROM sets. MAME validates ROM sets by checking the CRCs of individual ROM files within a ROM set against its internal database. This database changes with each MAME release and can be generated by running the MAME executable with the flag -listxml.
Four Arcade Romset File Formats
Full Non-merged: All ROMs can be used standalone because each zip contains all the files needed to run that game, including any ROMs from ‘parent’ ROM sets and BIOS sets. (ClrMamePro users: access through the “Advanced” button in the Rebuild and Scanner menus, then deselect “Separate BIOS sets”.)
Non-merged ROMs: Except for romsets which require a BIOS archive, all romsets can be used standalone because each zip contains all the files needed to run that game, including any files from ‘parent ROMs’. BIOS romsets are ‘split’ from the game romsets and must be placed in the same folder as the game romset.
Split: Some ROMs that are considered clones, translations, or bootlegs also require a “parent ROM” to run. The parent ROM is often the first or most common variant of a game. In some cases, however, the parent is not the most popular or best working version of the game. For example, in a Split set pacman.zip (a clone) will not work without puckman.zip (its parent). BIOS romsets are also ‘split’ from the game romsets and must be placed in the same folder as the game romset.
Merged: Clones are merged into the parent romset zip, meaning that more than one game is stored per file. Merged romsets are not well supported in the libretro arcade emulator cores as of this time.
Finally, necessary game content is sometimes distributed in the form of an additional Sample ZIP file composed of individual audio samples, or a CHD file with game data that was originally stored on an internal hard drive, CD-ROM, DVD, laserdisc, or other media.
If RetroArch were to add native arcade scanning support to the playlist generator, the most straightforward way would be to support “Full Non-Merged” sets only. I advocate for Full Non-Merged as a standard but I’m ready to help work through the requirements for any and all of the above.
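To make the difference between the formats concrete, here is a rough Python sketch of which files a clone's own zip would be expected to hold. It assumes the DAT marks files shared with a parent or BIOS set with a merge attribute on the rom entry; the function name and the format labels are hypothetical, and the Merged and plain Non-merged formats are left out because they need information about the parent and BIOS sets that this sketch does not model.

```python
def expected_files(dat_roms, set_format):
    """List the file names a clone's own ZIP should hold for a given format.

    dat_roms:   list of (name, is_merged) pairs; is_merged is True when the
                DAT rom entry carries a merge attribute (assumed to mean the
                file is shared with a parent or BIOS set).
    set_format: "split" or "full-non-merged" (hypothetical labels).
    """
    if set_format == "split":
        # Shared files live in the parent (or BIOS) ZIP, not in the clone.
        return [name for name, is_merged in dat_roms if not is_merged]
    if set_format == "full-non-merged":
        # Every file listed for the game is stored in the game's own ZIP.
        return [name for name, _ in dat_roms]
    raise ValueError(f"unsupported format: {set_format}")


if __name__ == "__main__":
    roms = [("4136.bin", False), ("41_09.rom", True)]
    print(expected_files(roms, "split"))            # ['4136.bin']
    print(expected_files(roms, "full-non-merged"))  # ['4136.bin', '41_09.rom']
```

Full Non-Merged is the simplest case for a scanner because the expected list is just everything the DAT lists for that game.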
An Example
To demonstrate how this works, here is the output of unzip -v for a Full Non-Merged 1941j.zip (1941 - Counter Attack (Japan)). I believe an equivalent of the -v (verbose) listing is also commonly available in standard zip libraries, although I’m not sure what RetroArch uses for zip functionality.
Being able to examine the CRC values of the individual files within the ZIP without decompressing them is the cornerstone of this approach.
Archive: 1941j.zip
TORRENTZIPPED-4E6AC678
Length Method Size Ratio Date Time CRC-32 Name
-------- ------ ------- ----- ---- ---- ------ ----
131072 Defl:X 37719 71% 12/24/96 23:32 7fbd42ab 4136.bin
131072 Defl:X 30694 77% 12/24/96 23:32 c6464b0b 4137.bin
131072 Defl:X 65674 50% 12/24/96 23:32 c7781f89 4142.bin
131072 Defl:X 64180 51% 12/24/96 23:32 440fc0b5 4143.bin
65536 Defl:X 18355 72% 12/24/96 23:32 0f9d8527 41_09.rom
131072 Defl:X 117804 10% 12/24/96 23:32 d1f15aeb 41_18.rom
131072 Defl:X 82385 37% 12/24/96 23:32 15aec3a6 41_19.rom
524288 Defl:X 87410 83% 12/24/96 23:32 4e9648ca 41_32.rom
524288 Defl:X 269308 49% 12/24/96 23:32 ff77985a 41_gfx1.rom
524288 Defl:X 186206 65% 12/24/96 23:32 983be58f 41_gfx3.rom
524288 Defl:X 270331 48% 12/24/96 23:32 01d1cb11 41_gfx5.rom
524288 Defl:X 187229 64% 12/24/96 23:32 aeaa3509 41_gfx7.rom
-------- ------- --- -------
3473408 1417295 59% 12 files
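As an illustration of reading those CRC values without decompressing anything, here is a minimal sketch using Python's standard zipfile module, which exposes the CRC-32 stored in the archive directory as ZipInfo.CRC. The function name zip_crc_manifest is my own, and this says nothing about how RetroArch's zip code works.

```python
import io
import zipfile


def zip_crc_manifest(zip_source):
    """Return {file name: (size, 8-digit lowercase hex CRC-32)} for a ZIP.

    Only the archive's directory is read; no member is decompressed.
    """
    manifest = {}
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            manifest[info.filename] = (info.file_size, f"{info.CRC:08x}")
    return manifest


if __name__ == "__main__":
    # Build a tiny in-memory ZIP so the example runs without a real romset.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("demo.bin", b"123456789")
    buffer.seek(0)
    print(zip_crc_manifest(buffer))  # {'demo.bin': (9, 'cbf43926')}
```

Passing a path such as "1941j.zip" instead of the in-memory buffer should yield the same name, size, and CRC columns as the unzip -v listing.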
1941j in the MAME 0.78 DAT file:
<game name="1941j" cloneof="1941" romof="1941">
<description>1941 - Counter Attack (Japan)</description>
<year>1990</year>
<manufacturer>Capcom</manufacturer>
<rom name="4136.bin" size="131072" crc="7fbd42ab" sha1="4e52a599e3099bf3cccabb89152c69f216fde79e"/>
<rom name="4137.bin" size="131072" crc="c6464b0b" sha1="abef422d891d32334a858d49599f1ef7cf0db45d"/>
<rom name="4142.bin" size="131072" crc="c7781f89" sha1="7e99c433de0c903791ae153a3cc8632042b0a90d"/>
<rom name="4143.bin" size="131072" crc="440fc0b5" sha1="e725535533c25a2c80a45a2200bbfd0dcda5ed97"/>
<rom name="41_09.rom" merge="41_09.rom" size="65536" crc="0f9d8527" sha1="3a00dd5772f38081fde11d8d61ba467379e2a636"/>
<rom name="41_18.rom" merge="41_18.rom" size="131072" crc="d1f15aeb" sha1="88089383f2d54fc97026a67f067d448eee5bd0c2"/>
<rom name="41_19.rom" merge="41_19.rom" size="131072" crc="15aec3a6" sha1="8153c03aba005bab62bf0e8b3d15ec1c346326fd"/>
<rom name="41_32.rom" merge="41_32.rom" size="524288" crc="4e9648ca" sha1="d8e67e6e3a6dc79053e4f56cfd83431385ea7611"/>
<rom name="41_gfx1.rom" merge="41_gfx1.rom" size="524288" crc="ff77985a" sha1="7e08df3a829bf9617470a46c79b713d4d9ebacae"/>
<rom name="41_gfx3.rom" merge="41_gfx3.rom" size="524288" crc="983be58f" sha1="83a4decdd775f859240771269b8af3a5981b244c"/>
<rom name="41_gfx5.rom" merge="41_gfx5.rom" size="524288" crc="01d1cb11" sha1="621e5377d1aaa9f7270d85bea1bdeef6721cdd05"/>
<rom name="41_gfx7.rom" merge="41_gfx7.rom" size="524288" crc="aeaa3509" sha1="6124ef06d9dfdd879181856bd49853f1800c3b87"/>
</game>
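For reference, a DAT entry like this can be loaded with Python's standard xml.etree.ElementTree. This is only a sketch: the sample is trimmed to two rom lines from the entry above with the sha1 attributes dropped, the datafile wrapper element is an assumption about the file's root, and load_dat_games is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Two rom lines taken from the 1941j entry; the wrapper element is assumed.
SAMPLE_DAT = """
<datafile>
<game name="1941j" cloneof="1941" romof="1941">
<description>1941 - Counter Attack (Japan)</description>
<rom name="4136.bin" size="131072" crc="7fbd42ab"/>
<rom name="41_09.rom" merge="41_09.rom" size="65536" crc="0f9d8527"/>
</game>
</datafile>
"""


def load_dat_games(xml_text):
    """Map each game name to a list of ROM records from the DAT."""
    games = {}
    root = ET.fromstring(xml_text)
    for game in root.iter("game"):
        roms = []
        for rom in game.findall("rom"):
            roms.append({
                "name": rom.get("name"),
                "size": int(rom.get("size", "0")),
                "crc": (rom.get("crc") or "").lower(),
                # Assumption: a merge attribute marks a file shared with
                # the set named in romof.
                "merged": rom.get("merge") is not None,
            })
        games[game.get("name")] = roms
    return games


if __name__ == "__main__":
    for rom in load_dat_games(SAMPLE_DAT)["1941j"]:
        print(rom["name"], rom["crc"], rom["merged"])
```

The name and crc fields are what a scanner would compare against the zip listing; size gives a cheap extra check.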
To implement Full Non-Merged arcade ROM scanning that works across MAME versions, here is some pseudo-code:
PlaylistScanner() {
  ...
  if (ROM_file.extension == ".zip") { // or whatever other factor triggers scanning the arcade DATs
    DAT_entry = SearchArcadeDATs(ROM_file.name_no_extension)
    if (!DAT_entry) {
      return false // the file being scanned can't be found in the DAT
    }
    DAT_entry_canonical_contents[] = ParseArcadeROMContents(DAT_entry)
    ZIP_contents[] = ParseZIPManifest(ROM_file)
    // pass 1: every file the DAT expects must be present in the ZIP
    for (index = 0; index < DAT_entry_canonical_contents.length; index++) {
      if (!CompareArcadeCRCs(ZIP_contents, DAT_entry_canonical_contents[index])) {
        return false // an expected file is missing from the ZIP
      }
    }
    // pass 2: every file in the ZIP must be expected by the DAT, or be a BIOS file
    for (index = 0; index < ZIP_contents.length; index++) {
      if (!CompareArcadeCRCs(DAT_entry_canonical_contents, ZIP_contents[index])) {
        if (!isBIOS(DAT_entry_canonical_contents, ZIP_contents[index])) {
          return false // the ZIP file has extra files that are not expected by the DAT
        }
      }
    }
    return true // all expected files are present and nothing unexpected was found
  }
  ...
}
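The same two-pass check can be written as a small runnable Python sketch. It is a simplification under stated assumptions: files are matched by CRC only (names are used just for reporting), both inputs are plain name-to-CRC dictionaries, and bios_crcs is a hypothetical stand-in for however the scanner would recognize BIOS files.

```python
def check_full_non_merged(zip_manifest, dat_roms, bios_crcs=frozenset()):
    """Return (ok, problems) for one romset.

    zip_manifest: {file name: crc hex} read from the ZIP directory.
    dat_roms:     {file name: crc hex} expected by the DAT entry.
    bios_crcs:    lowercase CRCs of BIOS files tolerated as extras
                  (hypothetical input).
    """
    zip_crcs = {crc.lower() for crc in zip_manifest.values()}
    dat_crcs = {crc.lower() for crc in dat_roms.values()}
    problems = []
    # Pass 1: every file the DAT expects must be present in the ZIP.
    for name, crc in dat_roms.items():
        if crc.lower() not in zip_crcs:
            problems.append(f"missing: {name} ({crc})")
    # Pass 2: every file in the ZIP must be expected, or be a BIOS file.
    for name, crc in zip_manifest.items():
        crc = crc.lower()
        if crc not in dat_crcs and crc not in bios_crcs:
            problems.append(f"unexpected: {name} ({crc})")
    return (not problems, problems)


if __name__ == "__main__":
    expected = {"4136.bin": "7fbd42ab", "4137.bin": "c6464b0b"}
    print(check_full_non_merged(dict(expected), expected))
    # (True, [])
    print(check_full_non_merged({"4136.bin": "7fbd42ab"}, expected))
    # (False, ['missing: 4137.bin (c6464b0b)'])
```

Returning the list of problems rather than a bare boolean makes it easier to tell the user why a romset was rejected.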
References
Note that the Logiqx site refers to ‘classic’ DAT format but the tags are the same as the newer XML format.
- DAT creation: http://www.logiqx.com/DatFAQs/DatCreation.php
- DAT layout: http://www.logiqx.com/DatFAQs/CMPro.php