Shader Branching Performance Impact

Could you elaborate on this, as I have a lot of if statements in my shader?

It seems to depend on how they are used. The ones causing trouble were being executed on every sampling function call. Elsewhere I have other ifs that just test a parameter value from global or params to shut off chunks of code (e.g., so the bezel isn’t added), and these seem to work as I’d intuitively expect.

So basically if it’s calling textures it tends to tank speeds, but if it’s just cutting code on and off it tends to be fine?

Sorry if that’s not what you’re saying, lol.

If you’re just checking a parameter and then shutting off chunks of code, it’s not a big hit (especially if you can throw an early return to make sure it doesn’t calculate the other side just for fun), but if your branch is based on a calculation, it hurts more, especially if that calculation involves sampling (e.g., based on the color or brightness of the current pixel).

For example:

vec3 result = (params.test > 0.5) ? vec3(0.0) : vec3(1.0);

is pretty fast.

vec3 result = (texture(Source, vTexCoord.xy).r > 0.5) ? vec3(0.0) : vec3(1.0);

is probably slower, and

vec3 result = (texture(Source, vTexCoord.xy).r > texture(Source, vTexCoord.xy).g) ? vec3(0.0) : vec3(1.0);

is probably a lot slower, still.

Thanks, that clears things up for me.

So running a calculation without sampling is still a decent-ish performance hit?

In general, branches are bad and you want to avoid them whenever possible. However, if you’re shutting off a big, expensive chunk of code, it’s better than doing both sides every time. Ultimately, it depends on the GPU/compiler what’s fastest, so it’s a good idea to test a few different options.
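As a rough sketch of the "shut off a big, expensive chunk" case: the fragment shader below skips a placeholder bezel function with an early return when a parameter is off. The Params block, bezel_on, and add_bezel() are made-up names for illustration, and whether the skipped code is really free depends on the GPU and compiler.

```glsl
#version 330 core

// Hypothetical stand-in for the shader's parameter block.
layout(std140) uniform Params {
    float bezel_on; // 0.0 = off, 1.0 = on
} params;

uniform sampler2D Source;
in vec2 vTexCoord;
out vec4 FragColor;

// Placeholder for a big, expensive chunk of code (assumption, not a real bezel).
vec3 add_bezel(vec3 color, vec2 uv)
{
    float dist_to_edge = min(min(uv.x, uv.y), min(1.0 - uv.x, 1.0 - uv.y));
    return color * smoothstep(0.0, 0.05, dist_to_edge);
}

void main()
{
    vec3 color = texture(Source, vTexCoord).rgb;

    // Every pixel sees the same params.bezel_on, so they all take the same path.
    if (params.bezel_on < 0.5)
    {
        FragColor = vec4(color, 1.0);
        return; // early return: the bezel code below is not needed
    }

    FragColor = vec4(add_bezel(color, vTexCoord), 1.0);
}
```

The check uses 0.5 rather than an exact 0.0 so that small precision differences in the parameter can't flip the result.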

In my experience, (params.test > 0.5) ? a : b is both faster and easier to read than mix(b, a, params.test), though I’ve seen people say mix() should be faster because it’s not a true branch (ditto for creative uses of abs(), clamp(), min(), max(), etc.). I’m sure there are cases where that’s true.
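For anyone who wants to time the two styles against each other, here is a minimal sketch with both written as functions. The Params block and function names are assumptions for illustration. Note the argument order: mix(x, y, t) returns x at t = 0.0 and y at t = 1.0, so the "on" value goes second.

```glsl
#version 330 core

// Hypothetical stand-in for the shader's parameter block.
layout(std140) uniform Params {
    float test; // 0.0 or 1.0
} params;

out vec4 FragColor;

// Ternary select: a when the parameter is on, b otherwise.
vec3 select_ternary(vec3 a, vec3 b, float t)
{
    return (t > 0.5) ? a : b;
}

// Same selection without a conditional.
// step(0.5, t) is 0.0 for t < 0.5 and 1.0 for t >= 0.5.
vec3 select_mix(vec3 a, vec3 b, float t)
{
    return mix(b, a, step(0.5, t));
}

void main()
{
    vec3 a = vec3(0.0);
    vec3 b = vec3(1.0);

    // Swap in select_mix(a, b, params.test) to compare timings.
    vec3 result = select_ternary(a, b, params.test);

    FragColor = vec4(result, 1.0);
}
```

The two only disagree when the parameter is exactly 0.5, which doesn't come up for a 0/1 toggle. Which one is faster is something to measure on your own hardware.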

I’m just going to start sending you PMs when I have questions. I feel bad that you constantly have to split off threads because of me. I mean, this is the second time this week…

Thanks for making this thread :)

It’s good information to have, and it’s a bit of a tricky one to figure out without trial and error sometimes :)

GPUs are essentially SIMD architectures, so if different streams within an execution group take different paths through the code, both branches have to be executed, and the results are ignored for the streams that did not take that branch. There’s a good explanation in “How does a GPU shader core work?”; look for “What about branches?” starting from page 43.

If all streams in an execution group take the same branch, the code in the other branch can be skipped, but it depends on the exact circumstances and the GPU architecture. This explains why tests on parameters will be fast: all the streams will take the same branch.

Isn’t that what forums are for? Spreading knowledge so other people can resolve the same doubts.

Yeah, I don’t mind splitting off the threads. I just didn’t want to make the many people following HSM’s releases have to read our technical discussion :)

I just felt bad because I know it’s bad form to go off topic, and you’ve had to make two threads this week because of me doing just that, lol.

Don’t worry, it’s not a big deal. I wasn’t going to fool with it until I counted that we were 6+ posts deep in the tangent :P

Sorry to jump topic but it’s semi-related.

I’m trying to use an if statement, but not sure how to go about it.

The issue is I need it to be triggered if the value is less than 0.0, and it also needs to be triggered if the value is greater than 0.0. Basically, it needs to be triggered if the value isn’t exactly 0.0.

Would I just do this?

if (whatever >< 0.0)

Seems odd, and I’ve never seen it anywhere.

I believe it’s (value != 0.)

Hmm, I wouldn’t have expected that. I’ll try it.

Keep in mind that, because of rounding errors, floating point numbers frequently aren’t exact. Comparing with zero (value != 0.0) might not work if rounding errors meant value was very close to, but not exactly, zero. More info. here.

Despite what that says you can probably get away with if (abs(a-b) < 0.0001)
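A minimal sketch of that tolerance check, including the "anything but zero" test asked about earlier. The helper names and the Params block are made up for illustration, and 0.0001 is just the tolerance quoted above; it may need to be larger for bigger values or lower precision.

```glsl
#version 330 core

// Hypothetical stand-in for the shader's parameter block.
layout(std140) uniform Params {
    float value;
} params;

out vec4 FragColor;

// Tolerance from the post above; tune it for your value range.
const float EPSILON = 0.0001;

// True when a and b are within EPSILON of each other.
bool nearly_equal(float a, float b)
{
    return abs(a - b) < EPSILON;
}

// "Less than 0.0 or greater than 0.0", but ignoring tiny rounding errors.
bool is_nonzero(float x)
{
    return !nearly_equal(x, 0.0);
}

void main()
{
    vec3 color = is_nonzero(params.value) ? vec3(1.0) : vec3(0.0);
    FragColor = vec4(color, 1.0);
}
```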

One more thing to watch out for when doing those comparisons is floating point precision in OpenGL ES shaders. mediump floats are only required to have a relative precision of 2^-10 (1/1024), so the minimum mediump is roughly equivalent to a 16-bit float.

Most GPUs just use 32 bits for mediump, but some don’t; e.g., the Mali-400 only supports the minimum. Also, highp is optional in fragment shaders, so you might not be able to just declare your variables as highp.
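One way to cope with optional highp is to ask for it only when the fragment stage advertises support. The sketch below is a GLSL ES 1.00 fragment shader that relies on the GL_FRAGMENT_PRECISION_HIGH macro from the GLSL ES spec; the Source and vTexCoord names are just borrowed from the examples earlier in the thread.

```glsl
#version 100

// GL_FRAGMENT_PRECISION_HIGH is defined by GLSL ES when the fragment
// stage supports highp; otherwise fall back to mediump.
#ifdef GL_FRAGMENT_PRECISION_HIGH
precision highp float;
#else
precision mediump float;
#endif

uniform sampler2D Source;
varying vec2 vTexCoord;

void main()
{
    // With a minimum-spec mediump, values near 1.0 are only about 0.001 apart,
    // so a tolerance like 0.0001 is finer than the format can resolve there.
    gl_FragColor = texture2D(Source, vTexCoord);
}
```

On hardware that only offers the minimum mediump, it's probably safer to widen any comparison tolerance than to assume highp is there.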

Yeah, that’s why I typically avoid using > 0.0 or == 0.0 for 0.0/1.0 parameter checks and compare against 0.5 instead. No precision error is going to be anywhere near that large, but it could potentially break an exact 0.0 check.

EDIT: for parameter checks, since they’re always floats and subject to precision variance, I will sometimes cast them to int when I need to do a multi-branch decision. That way I can use (int_value == 1, 2, 3 … whatever) without worrying about variance breaking the evaluation.

That is, I’ll have:

#pragma parameter float_value "Float Value" 5.0 0.0 10.0 1.0
// int() truncates, so add 0.5 first; otherwise a value like 4.9999 would become 4.
int int_value = int(params.float_value + 0.5);

and then later use:

if (int_value == 0) ...
else if (int_value == 1) ...
etc.

Should I change the value that I was using?