Citra core FAR slower than Standalone

I’ve been trying out the Citra core in Retroarch and noticed some severe speed issues not present in the standalone emulator. Citra standalone generally runs between 2 and 4 times as fast as the core.

Someone else noticed this as well, but j-selby believed the problem to be caused by a slow PC. The issue was brought up here on the github page for the core-

I commented on the issue as well, having the same problem despite my PC being significantly more powerful (4670K @ 4GHz, 8GB DDR3 RAM and a GTX 1060 6GB). I’ve done my best to ensure my comparisons use identical settings and were tested in identical areas with the same conditions onscreen. I also disabled vsync. BTW, having vsync enabled seems to further break performance in 30fps 3DS games, so that’s another issue.

This happens with other games as well, though I need to do some more testing before I can compile a thorough comparison list. For this particular test I used Kingdom Hearts 3D, which is a native 30fps game.

1x Resolution (native, 400x240)- Citra standalone- 107fps, Libretro core- 55fps

2x Resolution (800x480)- Citra standalone- 107fps, Libretro core- 28fps

8x Resolution (3200x1920)- Citra standalone- 75fps, Libretro core- 25fps

10x Resolution (4000x2400)- Citra standalone- 58fps, Libretro core- 24fps

I’m not trying to run this at 8 or 10 times resolution, but I still noted and documented the speed discrepancies. They’re quite apparent even at native res. And even at a very conservative 2x resolution bump, the Libretro core is failing to achieve full speed, while the standalone emulator chews through it with nearly 4 times the performance. The funny part is that at 10x resolution (4000x2400), Citra standalone is still achieving almost twice the game’s full speed, and it’s even slightly faster than the Libretro core is at native 1x resolution…
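For anyone who wants to check the ratios, here is a small Python sketch that works them out from the figures listed above. The frame rates are copied from the post. The `results` table, the `speedup` helper and the `TARGET_FPS` constant are just illustrative names, and 30fps is taken as full speed because the post describes the game as native 30fps.

```python
# Frame rates reported in the post (Kingdom Hearts 3D, vsync off).
# Key: internal resolution multiplier; value: (standalone fps, Libretro core fps).
results = {
    1: (107, 55),
    2: (107, 28),
    8: (75, 25),
    10: (58, 24),
}

TARGET_FPS = 30  # the game's native frame rate, per the post


def speedup(standalone_fps, core_fps):
    """Return how many times faster standalone is than the core."""
    return standalone_fps / core_fps


for scale, (standalone, core) in results.items():
    ratio = speedup(standalone, core)
    full_speed = "yes" if core >= TARGET_FPS else "no"
    print(f"{scale:>2}x: standalone is {ratio:.2f}x the core; "
          f"core at full speed: {full_speed}")
```

With these numbers the gap works out to roughly 1.9x at native resolution and about 3.8x at 2x, and the core only holds full speed at 1x.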

After sharing my results, j-selby stated that he believes the problem to be related to overhead in Retroarch itself and that there was nothing that could be done on the core’s end.

So what (if anything) should my next step be? I’m open to any advice on the matter. If this is an issue that should be reported, I’m unsure where to do it.

The first thing I do when I see a core struggling to go full speed is to check the Hard GPU Sync option. This needs to be OFF on highly demanding cores. It helps with input lag but it also adds a lot of overhead. Check the Latency options in general and see if there’s something else there that slows down the core as well.

After sharing my results, j-selby stated that he believes the problem to be related to overhead in Retroarch itself and that there was nothing that could be done on the core’s end.

RetroArch cannot add such overhead to the extent you describe. Any overhead for x64 PIC cores is entirely negligible. There must be another explanation for this.

I have also heard opposite conclusions, where Citra Libretro was actually faster than standalone.

I did notice one thing though when I last tried Citra Libretro. It was for some reason far faster in windowed mode than fullscreen. I think there must still be some GL implementation issues going on somewhere.

Thanks for the responses so far. Sorry if I’m not being super helpful, I’ll provide any other information requested.

I mentioned in passing earlier that there was a separate performance issue with vsync in Citra. 30fps games are capped at 20fps. This is not present when running 30fps (or below) games in other cores like Desmume, Mupen64plus, Beetle PSX etc. So that’s also a weird issue in and of itself. But it seems unrelated to the real performance issues…
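One possible (unconfirmed) explanation for a 30fps game landing on exactly 20fps with vsync on is frame-time rounding. With plain double-buffered vsync, a finished frame has to wait for the next screen refresh, so its time on screen rounds up to a whole number of refresh intervals. The sketch below assumes a 60Hz display, which the thread does not state, and `vsynced_fps` is a made-up helper name. It illustrates the arithmetic and is not a diagnosis of the Citra core.

```python
import math

REFRESH_HZ = 60  # assumed display refresh rate; not stated in the thread


def vsynced_fps(render_ms, refresh_hz=REFRESH_HZ):
    """Effective frame rate when every frame waits for the next refresh.

    The frame's on-screen time is rounded up to a whole number of
    refresh intervals (16.67 ms each at 60 Hz).
    """
    interval_ms = 1000.0 / refresh_hz
    # Tiny epsilon so a frame taking exactly N intervals is not bumped
    # to N + 1 by floating-point error.
    intervals = max(1, math.ceil(render_ms / interval_ms - 1e-9))
    return refresh_hz / intervals


# A frame that fits in two refreshes (under 33.3 ms) is shown at 30fps;
# one that overshoots even slightly needs three refreshes.
for ms in (16.0, 33.0, 34.0):
    print(f"{ms:.1f} ms per frame -> {vsynced_fps(ms):.0f}fps")
```

Under these assumptions, a frame that takes just over 33.3ms from start to presentation drops from 30fps straight to 20fps, with nothing in between.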

Hard GPU Sync is definitely off in all menus, I know it’s a demanding setting. I only use that for cores for old retro systems like SNES, Genesis and GBA.

Windowed mode and fullscreen yield the same performance for me.

I’ve been going through all the menus and options I can think of to try to find a setting that could fix this. I even did a full wipe of my Retroarch directory and started over to make sure there wasn’t a setting I ticked somewhere that was causing the issue.

I also went into the Nvidia control panel and changed all of the settings to favor performance.

I don’t THINK this issue is affecting other cores, or that there’s anything wrong with my PC. I’ve not tested the Dolphin core extensively yet, but I seem to be getting pretty excellent performance out of the other cores, even the apparently demanding Beetle PSX with quite a few enhancements like high resolution, PGXP and overclocking (though the GL driver is faster than Vulkan, and I dunno if that’s how it’s supposed to be).

FZero GX definitely works.

I’ll try to see what’s going wrong on my end then with F Zero, thanks for the confirmation that it SHOULD work. Theoretically if it works in standalone Dolphin, should it also work in Retroarch? Indicating some sort of config problem with Retroarch on my end?

I’m fairly sure my PC is fine. I actually did a complete and clean reformat and reinstall of Windows about a month or so ago. Drivers seem okay and I’ve done some clean reinstalls for the Nvidia stuff too. I’m trying to diagnose what is causing my speed issues.

BTW, I’m going to try to remove as much of Retroarch as possible from my PC and start from scratch, except for saves and shader presets (which I’ll back up somewhere). I want to see if I can remove any traces of potential problems or bugs that may be causing my apparent performance issues.

I currently have Retroarch running in portable mode inside a directory of my own choice. Besides deleting the folder and files there, do I need to do anything to the registry as well to remove any traces of it?

No, it doesn’t touch the registry or anything like that.

Just deleting configs and any overrides is typically all it takes to go back to approximately clean.

Thanks, deleting the config files is what I’ve been doing to get Retroarch back to a stock state.

I tried a completely fresh install of Retroarch on a different hard drive (I usually keep it on a storage drive separate from my Windows installation). I used the installer for 1.7.4 and made no alterations beyond downloading the Citra core. Performance problems still persist.

I’m really stumped as to what the problem is. My PC works perfectly well and fast on everything else…

I’m tempted to make a topic on the Retroarch subreddit about this too in the hopes I might get some more info and tips there. Is that okay to do?

Sure, but I’ll warn you in advance that it’s just the same half-dozen people answering questions on all of our communication platforms. It’s pretty much just me and Twinaphex answering questions on r/RetroArch.

Ah okay. I was hoping that a reddit page might attract a larger number of people sharing issues and solutions.

None of the Nvidia control panel options really do anything to help either btw, I know someone suggested that in the github issue. Might squeeze out 2-3 more FPS than normal but nothing to remotely close the performance gap.

The Dolphin core seems to be around 20-30% slower than standalone in most games I’ve tested so far, though it does vary. Mario Kart Double Dash at native res gets about 220fps in standalone vs 165fps in Retroarch.
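To put the Dolphin figure in the same terms, a quick Python sketch can convert frame rates into a percentage slowdown and into extra milliseconds per frame. The helper names are made up for illustration, and the inputs are the Mario Kart numbers above plus the native-resolution Kingdom Hearts 3D numbers from the first post.

```python
def frame_time_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def slowdown_percent(standalone_fps, core_fps):
    """How much lower the core's frame rate is, as a percentage."""
    return (1 - core_fps / standalone_fps) * 100


def added_ms_per_frame(standalone_fps, core_fps):
    """Extra time the core spends on each frame compared to standalone."""
    return frame_time_ms(core_fps) - frame_time_ms(standalone_fps)


# (game, standalone fps, core fps), all at native resolution per the posts
samples = [
    ("Mario Kart Double Dash (Dolphin)", 220, 165),
    ("Kingdom Hearts 3D (Citra)", 107, 55),
]

for name, standalone, core in samples:
    print(f"{name}: {slowdown_percent(standalone, core):.1f}% slower, "
          f"+{added_ms_per_frame(standalone, core):.2f} ms per frame")
```

For Mario Kart this gives a 25% drop, or about 1.5ms extra per frame. Looking at milliseconds rather than percentages can help show whether the two cores lose a similar fixed amount of time per frame or whether Citra's loss is of a different order.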

Is there something wrong with the buildbot and the citra core?

The last build for Citra and Citra Canary was May 19th 2018 for Linux x86_64.

The latest libretro Citra commit was August 12th.

That typically means there’s a build failure somewhere. I’ll check the buildbot logs and see what’s up.

I wanted to bring up something that may or may not be related.

I stumbled across something that makes me wonder if the performance disparity is related to a PR possibly implemented in the standalone Canary builds that hasn’t made it into the Libretro core yet.

I accidentally downloaded the Nightly variant of Citra (standalone) yesterday when I was updating. I meant to download Canary, which is what I have been using and had been my basis for comparing to the Libretro core. Upon booting my game up in the Nightly build, I noticed a large performance decrease. VERY similar to the performance drop I get in the libretro core. Upon realizing I had downloaded the Nightly version accidentally, I went and downloaded Canary instead and the problem was fixed.

So, I wondered if there had been a change made to the standalone Canary build that had not yet been added to the Libretro Canary core that might cause this performance disparity. This led me to find something on Citra’s GitHub called “Ignore format reinterpretation hack” (pull request #4089). The PR was made in August. I’m not sure if it’s in the current Canary standalone builds, as it says “not ready for merge”, but I assume this means not ready for Nightly. I presume it’s already implemented into Canary. Here’s the PR I’m referring to-

Several people in the comments for this hack have mentioned that for some games they achieved anywhere from 2-4 times the performance compared to before. This includes my aforementioned test game Kingdom Hearts 3D, which someone else tested and noted performance increases similar to mine. On an Intel 8700 at 3.2GHz they went from 18-25fps in previous Citra versions (which is similar to what I get in the Libretro core and older Nightly Citra builds at 2x resolution or greater) all the way up to 59-60fps with this new hack (which is similar to what I get in the standalone Canary build, even with the resolution bumped way up).

So assuming this hack explains why standalone Citra Canary is so much faster than standalone Nightly, the final question would be: has this hack been implemented into the Libretro Canary core yet? Because if it has not, perhaps this is the root cause of my performance problems.

Okay, scratch that entire spiel then. j-selby just replied on the GitHub issue and said the hack was already implemented into the Libretro core. So I’m back at square one on how to fix this. He asked me to provide a profile for the issue, but I regret that I’m completely ignorant of that process. I don’t have any clue what programs to use, where to even begin etc…

Thanks hunterk. Citra is building for Windows and is current on the buildbot since the last commit, but the Linux x86_64 core still has not been built since May even though there have been a significant number of commits since then.

I wasn’t sure if I should open an issue on Github or just mention it here in the forum.

I’ll take a look at the buildbot logs and see if I can spot the issue.

I believe a current build of the Linux Citra core can be extracted from the Lakka x86_64 nightly image.

After this github commit Lakka was able to build it.