Per-core Max CPU Clock speed, wasted cycles, optimising battery life

This issue came to mind while using FreeIntv, but it also applies to many other cores.

I believe RetroArch runs at the highest clock speed on the Nintendo New 3DS. On Original 3DS it has no choice but to run at a slower clock speed.

But for “simple” cores like this Intellivision one, which run easily on the original (slower) 3DS, I think running the N3DS at any higher MHz is overkill. More importantly, it also drains the battery very quickly when using RetroArch.

So I was wondering how much CPU the Intellivision core is using during a game? Is there a way to see this? It achieves 60fps easily on N3DS, even original 3DS, so any extra MHz is just eating battery.

Maybe the CPU clock speed could be dialled down on less demanding cores, and/or some optimisations could be made to throttle the CPU?

By contrast, when I play the official Intellivision Lives! emulation ROM on my N3DS, the battery lasts much longer. That cartridge even runs on an original DS at seemingly full speed.

This optimisation would also benefit other battery-powered platforms: PS Vita, iOS, Android, etc.

Any thoughts appreciated.

DS programs which use the tilemap hardware are far more efficient than 3DS programs which use a framebuffer.

CPU management should be handled by your OS/jailbreak. i don’t see how retroarch could communicate with the system at that level, right? it’s just a program, not an OS.

it is just a program, but it is configured to run the system CPU at the highest possible clock speed (at least on New 3DS) when there are situations where that is not necessary

right, but even if you wanted to, it can’t, in the same way that some random windows program can’t change your current CPU speed. that’s an OS responsibility.

Maybe some slight crossed wires here: I am talking about Max CPU Clock Speed, not current CPU usage.

On New 3DS the Max CPU Clock Speed can be set by:

  • an install-time flag on the applet
    • Max CPU Clock Speed is changed when the app is launched
  • at runtime, by writing to a system register
    • Max CPU Clock Speed is changed on demand

There are 3 different Max CPU Clock Speeds:

  • 268MHz (Original 3DS and New 3DS, x1)
  • 536MHz (New 3DS only, x2, said to require kernel hack - needs further investigation)
  • 804MHz (New 3DS only, x3)

https://3dbrew.org/wiki/Hardware#Common_hardware

So it’s more like software CPU overclocking. Same goes for PSP, PS Vita, iOS, Android; even my Apple Mac with an Intel Core i7 CPU can have its Max CPU Clock Speed changed through software.

So my question can be rephrased:

Why does RetroArch overclock the CPU on every core when simple cores like Intellivision, etc, do not need it?

simple. device just not fast enough…

Easily said @wertz ! You are no doubt correct with regards to some cores. But which ones? Obviously, the really demanding cores. But what about the less demanding cores? How low can we go?

That gave me an idea… use BootNTR homebrew on N3DS to add the ability to adjust Max CPU Clock Speed on-the-fly using a key combo/custom menu.

Now, I don’t know whether the RetroArch cores are optimised on the assumption that they will run at 804MHz, so take these results with a pinch of salt.

Core                    fps @ 268MHz  fps @ 804MHz
RetroArch menu (rgui)   60            60
Beetle NGP              ~12           ~45
EightyOne               ~12           ~45
FBA (CPS2)              ~25           60
FCEUmm                  ~35           60
fMSX                    ~30           60
FreeIntv                ~35           60
Gambatte                60            60
Genesis Plus GX (GG)    ~50           60
Genesis Plus GX (MS)    ~30           60
Genesis Plus GX (MD)    ~15           60
G&W                     ~40           60
Mednafen PCE            ~25           60
Mednafen WS             ~25           ~70
NEStopia                ~20           60
NXengine                60            60
PicoDrive               ~40           60
ProSystem               ~40           60
QuickNES                60            60
SNES9x 2002             ~30           ~50
Stella                  ~30           60
2048                    ~15           ~35

~ = variable fps

I will edit the above table as I see how other cores run at 268MHz Max CPU Clock Speed.
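
To make the idea concrete, here is a minimal sketch in C of a per-core policy that picks the lowest measured clock that still reached 60fps. The struct, the function name and the choice of rows are all hypothetical and are not part of RetroArch; the numbers are copied from my measurements above. On a real N3DS the result would still have to be passed to whatever call actually changes the clock (I believe libctru has osSetSpeedupEnable for this, but treat that as an assumption).

```c
#include <stddef.h>

#define TARGET_FPS 60

/* One row of the measurements above (hypothetical struct, for illustration). */
typedef struct {
    const char *core;
    int fps_at_268;   /* approximate fps at 268MHz */
    int fps_at_804;   /* approximate fps at 804MHz */
} core_result;

/* A few rows copied from the measurements in this post. */
static const core_result results[] = {
    { "Gambatte",  60, 60 },
    { "QuickNES",  60, 60 },
    { "FreeIntv",  35, 60 },
    { "PicoDrive", 40, 60 },
};

/* Return the lowest measured clock (in MHz) at which the core reached
 * the target frame rate. Falls back to 804 when 268 was not enough,
 * because 536MHz has not been measured yet. */
static int lowest_sufficient_mhz(const core_result *r)
{
    if (r->fps_at_268 >= TARGET_FPS)
        return 268;
    return 804;
}
```

With these rows, Gambatte and QuickNES would stay at 268MHz while FreeIntv and PicoDrive would still need the higher clock.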

Now to try to figure out how I can run it at 536MHz to get those results.

I wonder if the old NDS emulators (SnesDS, SnezziDS, SnemulDS) could have their CPU cores reused in the retroarch SNES emulators? By necessity, they were written in ARM7/9 assembly to run on a 66MHz ARM processor. Obviously, this would be ARM specific.

That’s an interesting idea: here’s a list of all emulators that were on original Nintendo DS

In general, when you need to optimize stuff, you profile it. Profiling tells you where the ‘hot’ functions are and what percentage of the run time each one takes.

From there, you can examine the functions and see if there are any obvious improvements to make, and peek at the assembly code to see if the compiler did anything stupid.
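
If no real profiler is available on the target, a crude first step is to time the per-frame work yourself and compare it with the 60fps budget of roughly 16.7ms. The sketch below uses only the standard C clock() function; fake_frame_work is a made-up stand-in for "emulate one frame", not any real core's code. It tells you how much headroom there is, not which functions are hot.

```c
#include <time.h>

/* Hypothetical stand-in for "emulate one frame"; replace with real work. */
static unsigned long fake_frame_work(unsigned long seed)
{
    volatile unsigned long acc = seed;   /* volatile keeps the loop from being optimised away */
    for (int i = 0; i < 100000; i++)
        acc = acc * 1664525UL + 1013904223UL;
    return acc;
}

/* Average CPU time in milliseconds per call of frame_fn over `frames` calls. */
static double average_frame_ms(unsigned long (*frame_fn)(unsigned long), int frames)
{
    unsigned long sink = 0;
    clock_t start = clock();
    for (int i = 0; i < frames; i++)
        sink += frame_fn(sink);
    clock_t end = clock();
    (void)sink;
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC / frames;
}

/* Fraction of the 60fps frame budget used: 1.0 means no headroom left. */
static double budget_used(double frame_ms)
{
    return frame_ms / (1000.0 / 60.0);
}
```

A core whose budget_used value stays well below 1.0 at the highest clock is a candidate for trying a lower one.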

I was looking at the main Snes9x core and saw some bad compiler-generated code in the “check interrupts” function, so I swapped the logical boolean operators (&&, ||) for bitwise ones (&, |) and immediately got an 11% speedup from getting rid of so many branches. Logical operators short-circuit, so the compiler tends to emit a conditional branch for each one rather than a plain AND or OR instruction.
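
Here is a small illustration of that kind of change. This is not the actual Snes9x code; the struct and field names are invented for the example. Both functions return the same result for every input, but the second one evaluates all of its operands unconditionally, which usually lets the compiler emit straight-line code. This is only safe when the operands are already 0 or 1 and have no side effects.

```c
#include <stdbool.h>

/* Hypothetical interrupt state, invented for this example. */
typedef struct {
    bool nmi_pending;
    bool irq_pending;
    bool irq_masked;
} cpu_flags;

/* Short-circuit version: each || and && may become a conditional branch. */
static bool needs_service_logical(const cpu_flags *f)
{
    return f->nmi_pending || (f->irq_pending && !f->irq_masked);
}

/* Bitwise version: every operand is evaluated, so the compiler can
 * typically use plain AND/OR instructions with no branches. */
static bool needs_service_bitwise(const cpu_flags *f)
{
    return (f->nmi_pending | (f->irq_pending & !f->irq_masked)) != 0;
}
```

Whether this actually helps depends on the compiler and the target, so it is worth checking the generated assembly and measuring before and after.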

Now for Snes9x 2010. I noticed that under some settings, MSVC will generate really stupid code for Snes9x 2010’s APU:

         if ( 1 && !--clocks_remain )
            break;

This wasn’t just skipping the "1 && " part, it was actually evaluating it. It was setting EAX to zero, then comparing it with 1, then doing a conditional branch. WTF. And optimizations were enabled, and set to “optimize for speed” at the time. Turning on “whole program optimizations” fixed that.

This is great, nice work.

I’m happy to profile some cores. Do you think it’s worth trying to organise a concerted effort from the community on this?