Did some more tests.
In short, d3d11 is similar to glcore and vulkan with gsync, around 30ms.
More surprising are the results in windowed mode, measured at the bottom of the screen. This is on win7 with aero enabled and gsync enabled for fullscreen mode only, same fceumm smb test with run ahead set to 1 frame (to get a minimal internal lag of 1 frame):
windowed glcore default: 11 10 9 11 9 8 10 9 12 11
avg 10 = 42ms
windowed hardsync 0 frame (in case it does anything for nvidia drivers): 7 12 8 8 8 9 10 8 11 11
avg 9.2 = 38ms
windowed vulkan: 9 12 7 9 9 10 9 10 9 7
avg 9.1 = 38ms
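For anyone who wants to redo the math, here is a rough sketch of how the averages and ms figures above can be reproduced. It assumes the samples are frame counts from a 240 fps camera; the capture rate isn't stated here, but it is the value that makes 10 = 42ms work out. `CAMERA_FPS`, `samples` and `summarize` are just illustrative names.

```python
from statistics import mean

# Assumption: capture rate is not stated in the post; 240 fps is inferred
# from "10 = 42ms" (10 * 1000 / 240 is about 41.7 ms).
CAMERA_FPS = 240

# Frame counts copied from the windowed-mode runs above.
samples = {
    "windowed glcore default": [11, 10, 9, 11, 9, 8, 10, 9, 12, 11],
    "windowed hardsync 0 frame": [7, 12, 8, 8, 8, 9, 10, 8, 11, 11],
    "windowed vulkan": [9, 12, 7, 9, 9, 10, 9, 10, 9, 7],
}


def summarize(frames, camera_fps=CAMERA_FPS):
    """Return (average frame count, average latency in ms)."""
    avg = mean(frames)
    return avg, avg * 1000.0 / camera_fps


results = {name: summarize(frames) for name, frames in samples.items()}

for name, (avg, ms) in results.items():
    print(f"{name}: {avg:.1f} = {ms:.0f}ms")
```

If the camera actually ran at a different rate, only `CAMERA_FPS` needs changing; the averages stay the same.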
So around 40ms in windowed mode, while fullscreen without gsync was giving 84ms (probably due to triple buffering with the default nvidia driver settings)…
Aero isn’t so bad in the end, most probably because my default monitor refresh rate is 120hz (or because gsync is helping with screen composition).