Dogway's grading shader (slang)

hunterk · 31 March 2020 14:40

You can just change the LUTs in the preset to point to the existing passthru LUTs. You’ll have the (minor) performance impact, but the image won’t change.

Syh · 31 March 2020 14:47

That’s why I was asking about removing them (not from his main branch, just a personal version), I mean if I don’t have to do anything complicated I could just remove them myself. (I probably could have worded everything better.)

So really the main questions are: do I have to do anything special to compensate for removing them, and how would I write the FragColor at the end, with them removed, to get correct gamma?

If I know this I can probably just do it myself, lol.

Dogway · 31 March 2020 14:52

Sorry, I entertained myself debugging a possible chroma banding issue, anyways here it is:

grade_(noLUT).slang

#version 450

layout(push_constant) uniform Push
{
    float gamma_out;
    float gamma_in;
    float gamma_type;
    float vignette;
    float str;
    float power;
    float temperature;
    float luma_preserve;
    float sat;
    float dulvibr;
    float lum;
    float size;
    float cntrst;
    float mid;
    float black_level;
    float blr;
    float blg;
    float blb;
    float r;
    float g;
    float b;
    float rg;
    float rb;
    float gr;
    float gb;
    float br;
    float bg;
} params;

/*
   Grade
   > Ubershader grouping some color related monolithic shaders like color-mangler, vignette, white_point,
   > and the addition of vibrance, black level, sigmoidal contrast and proper gamma transforms.

   Author: hunterk, Guest, Dr. Venom, Dogway
   License: Public domain
*/

#pragma parameter gamma_out "Display Gamma" 2.20 0.0 3.0 0.05
#pragma parameter gamma_in "CRT Gamma" 2.40 0.0 3.0 0.05
#pragma parameter gamma_type "CRT Gamma (POW = 0, sRGB = 1)" 1.0 0.0 1.0 1.0
#pragma parameter vignette "Vignette Toggle" 1.0 0.0 1.0 1.0
#pragma parameter str "Vignette Strength" 40.0 10.0 40.0 1.0
#pragma parameter power "Vignette Power" 0.20 0.0 0.5 0.01
#pragma parameter temperature "White Point" 9311.0 1031.0 12047.0 72.0
#pragma parameter luma_preserve "WP Preserve Luminance" 1.0 0.0 1.0 1.0
#pragma parameter sat "Saturation" 0.0 -1.0 2.0 0.02
#pragma parameter dulvibr "Dullness/Vibrance" 0.0 -1.0 1.0 0.05
#pragma parameter lum "Brightness" 1.0 0.0 2.0 0.01
#pragma parameter cntrst "Contrast" 0.0 -1.0 1.0 0.05
#pragma parameter mid "Contrast Pivot" 0.5 0.0 1.0 0.01
#pragma parameter black_level "Black Level" 0.0 -0.5 0.5 0.01
#pragma parameter blr "Black-Red Tint" 0.0 0.0 1.0 0.005
#pragma parameter blg "Black-Green Tint" 0.0 0.0 1.0 0.005
#pragma parameter blb "Black-Blue Tint" 0.0 0.0 1.0 0.005
#pragma parameter r "White-Red Tint" 1.0 0.0 2.0 0.01
#pragma parameter g "White-Green Tint" 1.0 0.0 2.0 0.01
#pragma parameter b "White-Blue Tint" 1.0 0.0 2.0 0.01
#pragma parameter rg "Red-Green Tint" 0.0 -1.0 1.0 0.005
#pragma parameter rb "Red-Blue Tint" 0.0 -1.0 1.0 0.005
#pragma parameter gr "Green-Red Tint" 0.0 -1.0 1.0 0.005
#pragma parameter gb "Green-Blue Tint" 0.0 -1.0 1.0 0.005
#pragma parameter br "Blue-Red Tint" 0.0 -1.0 1.0 0.005
#pragma parameter bg "Blue-Green Tint" 0.0 -1.0 1.0 0.005

layout(std140, set = 0, binding = 0) uniform UBO
{
    mat4 MVP;
    vec4 SourceSize;
    vec4 OriginalSize;
    vec4 OutputSize;
} global;

#pragma stage vertex
layout(location = 0) in vec4 Position;
layout(location = 1) in vec2 TexCoord;
layout(location = 0) out vec2 vTexCoord;

void main()
{
    gl_Position = global.MVP * Position;
    vTexCoord = TexCoord;
}

#pragma stage fragment
layout(location = 0) in vec2 vTexCoord;
layout(location = 0) out vec4 FragColor;
layout(set = 0, binding = 2) uniform sampler2D Source;


// White Point Mapping function
//
// From the first comment post (sRGB primaries and linear light compensated)
//      http://www.zombieprototypes.com/?p=210#comment-4695029660
// Based on the Neil Bartlett's blog update
//      http://www.zombieprototypes.com/?p=210
// Inspired itself by Tanner Helland's work
//      http://www.tannerhelland.com/4435/convert-temperature-rgb-algorithm-code/

vec3 wp_adjust(vec3 color){

    float temp = params.temperature / 100.;
    float k = params.temperature / 10000.;
    float lk = log(k);

    vec3 wp = vec3(1.);

    // calculate RED
    wp.r = (temp <= 65.) ? 1. : 0.32068362618584273 + (0.19668730877673762 * pow(k - 0.21298613432655075, - 1.5139012907556737)) + (- 0.013883432789258415 * lk);

    // calculate GREEN
    float mg = 1.226916242502167 + (- 1.3109482654223614 * pow(k - 0.44267061967913873, 3.) * exp(- 5.089297600846147 * (k - 0.44267061967913873))) + (0.6453936305542096 * lk);
    float pg = 0.4860175851734596 + (0.1802139719519286 * pow(k - 0.14573069517701578, - 1.397716496795082)) + (- 0.00803698899233844 * lk);
    wp.g = (temp <= 65.5) ? ((temp <= 8.) ? 0. : mg) : pg;

    // calculate BLUE
    wp.b = (temp <= 19.) ? 0. : (temp >= 66.) ? 1. : 1.677499032830161 + (- 0.02313594016938082 * pow(k - 1.1367244820333684, 3.) * exp(- 4.221279555918655 * (k - 1.1367244820333684))) + (1.6550275798913296 * lk);

    // clamp
    wp.rgb = clamp(wp.rgb, vec3(0.), vec3(1.));

    // Linear color input
    return color * wp;
}

vec3 sRGB_to_XYZ(vec3 RGB){

    const mat3x3 m = mat3x3(
    0.4124564, 0.3575761, 0.1804375,
    0.2126729, 0.7151522, 0.0721750,
    0.0193339, 0.1191920, 0.9503041);
    return RGB * m;
}


vec3 XYZtoYxy(vec3 XYZ){

    float XYZrgb = XYZ.r+XYZ.g+XYZ.b;
    float Yxyg = (XYZrgb <= 0.0) ? 0.3805 : XYZ.r / XYZrgb;
    float Yxyb = (XYZrgb <= 0.0) ? 0.3769 : XYZ.g / XYZrgb;
    return vec3(XYZ.g, Yxyg, Yxyb);
}


vec3 XYZ_to_sRGB(vec3 XYZ){

    const mat3x3 m = mat3x3(
    3.2404542, -1.5371385, -0.4985314,
   -0.9692660,  1.8760108,  0.0415560,
    0.0556434, -0.2040259,  1.0572252);
    return XYZ * m;
}


vec3 YxytoXYZ(vec3 Yxy){

    float Xs = Yxy.r * (Yxy.g/Yxy.b);
    float Xsz = (Yxy.r <= 0.0) ? 0.0 : 1.0;
    vec3 XYZ = vec3(Xsz,Xsz,Xsz) * vec3(Xs, Yxy.r, (Xs/Yxy.g)-Xs-Yxy.r);
    return XYZ;
}

// This shouldn't be necessary but it seems some undefined values can
// creep in and each GPU vendor handles that differently. This keeps
// all values within a safe range
vec3 mixfix(vec3 a, vec3 b, float c)
{
    return (a.z < 1.0) ? mix(a, b, c) : a;
}


vec4 mixfix_v4(vec4 a, vec4 b, float c)
{
    return (a.z < 1.0) ? mix(a, b, c) : a;
}


float SatMask(float color_r, float color_g, float color_b)
{
    float max_rgb = max(color_r, max(color_g, color_b));
    float min_rgb = min(color_r, min(color_g, color_b));
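    float sum_rgb = max_rgb + min_rgb;
    // Pure black gives 0/0, which is undefined; treat it as zero saturation
    float msk = (sum_rgb == 0.0) ? 0.0 : clamp((max_rgb - min_rgb) / sum_rgb, 0.0, 1.0);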
    return msk;
}


float moncurve_r( float color, float gamma, float offs)
{
    // Reverse monitor curve
    color = clamp(color, 0.0, 1.0);
    float yb = pow( offs * gamma / ( ( gamma - 1.0) * ( 1.0 + offs)), gamma);
    float rs = pow( ( gamma - 1.0) / offs, gamma - 1.0) * pow( ( 1.0 + offs) / gamma, gamma);

    color = ( color > yb) ? ( 1.0 + offs) * pow( color, 1.0 / gamma) - offs : color * rs;
    return color;
}


vec3 moncurve_r_f3( vec3 color, float gamma, float offs)
{
    color.r = moncurve_r( color.r, gamma, offs);
    color.g = moncurve_r( color.g, gamma, offs);
    color.b = moncurve_r( color.b, gamma, offs);
    return color.rgb;
}


float moncurve_f( float color, float gamma, float offs)
{
    // Forward monitor curve
    color = clamp(color, 0.0, 1.0);
    float fs = (( gamma - 1.0) / offs) * pow( offs * gamma / ( ( gamma - 1.0) * ( 1.0 + offs)), gamma);
    float xb = offs / ( gamma - 1.0);

    color = ( color > xb) ? pow( ( color + offs) / ( 1.0 + offs), gamma) : color * fs;
    return color;
}

vec3 moncurve_f_f3( vec3 color, float gamma, float offs)
{
    color.r = moncurve_f( color.r, gamma, offs);
    color.g = moncurve_f( color.g, gamma, offs);
    color.b = moncurve_f( color.b, gamma, offs);
    return color.rgb;
}


//  Performs better in gamma encoded space
float contrast_sigmoid(float color, float cont, float pivot){

    cont = pow(cont + 1., 3.);

    float knee = 1. / (1. + exp(cont * pivot));
    float shldr = 1. / (1. + exp(cont * (pivot - 1.)));

    color = (1. / (1. + exp(cont * (pivot - color))) - knee) / (shldr - knee);

    return color;
}


//  Performs better in gamma encoded space
float contrast_sigmoid_inv(float color, float cont, float pivot){

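    // pow() is undefined for a negative base in GLSL and (cont - 1.) is negative here, so cube by hand
    cont = (cont - 1.) * (cont - 1.) * (cont - 1.);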

    float knee = 1. / (1. + exp (cont * pivot));
    float shldr = 1. / (1. + exp (cont * (pivot - 1.)));

    color = pivot - log(1. / (color * (shldr - knee) + knee) - 1.) / cont;

    return color;
}



void main()
{

//  Pure power was crushing blacks (eg. DKC2). You can mimic pow(c, 2.4) by raising the gamma_in value to 2.55
    vec3 imgColor = texture(Source, vTexCoord.xy).rgb;
    imgColor = (params.gamma_type == 1.0) ? moncurve_f_f3(imgColor, params.gamma_in + 0.15, 0.055) : pow(imgColor, vec3(params.gamma_in));


//  Saturation agnostic sigmoidal contrast
    vec3 Yxy = XYZtoYxy(sRGB_to_XYZ(imgColor));
    // moncurve_r encodes linear luminance to gamma space, where the sigmoid performs better
    float toGamma = moncurve_r(Yxy.r, 2.40, 0.055);
    float sigmoid = (params.cntrst > 0.0) ? contrast_sigmoid(toGamma, params.cntrst, params.mid) : contrast_sigmoid_inv(toGamma, params.cntrst, params.mid);
    vec3 contrast = vec3(moncurve_f(sigmoid, 2.40, 0.055), Yxy.g, Yxy.b);
    vec3 XYZsrgb = clamp(XYZ_to_sRGB(YxytoXYZ(contrast)), 0.0, 1.0);
    contrast = (params.cntrst == 0.0) ? imgColor : XYZsrgb;


//  Vignetting & Black Level
    vec2 vpos = vTexCoord*(global.OriginalSize.xy/global.SourceSize.xy);

    vpos *= 1.0 - vpos.xy;
    float vig = vpos.x*vpos.y * params.str;
    vig = min(pow(vig, params.power), 1.0);
    contrast *= (params.vignette == 1.0) ? vig : 1.0;

    contrast += (params.black_level / 20.0) * (1.0 - contrast);


//  RGB related transforms
    vec4 screen = vec4(max(contrast, 0.0), 1.0);
    float sat = params.sat + 1.0;

                   //  r               g           b  alpha ; alpha does nothing for our purposes
    mat4 color = mat4(params.r, params.rg,  params.rb,  0.0,  //red tint
                     params.gr,  params.g,  params.gb,  0.0,  //green tint
                     params.br, params.bg,   params.b,  0.0,  //blue tint
                    params.blr, params.blg, params.blb, 0.0); //black tint

    mat4 adjust = mat4((1.0 - sat) * 0.2126 + sat, (1.0 - sat) * 0.2126, (1.0 - sat) * 0.2126, 1.0,
                       (1.0 - sat) * 0.7152, (1.0 - sat) * 0.7152 + sat, (1.0 - sat) * 0.7152, 1.0,
                       (1.0 - sat) * 0.0722, (1.0 - sat) * 0.0722, (1.0 - sat) * 0.0722 + sat, 1.0,
                        0.0, 0.0, 0.0, 1.0);

    screen = clamp(screen * ((params.lum - 1.0) * 2.0 + 1.0), 0.0, 1.0);
    screen = color * screen;
    float sat_msk = (params.dulvibr > 0.0) ? clamp(1.0 - (SatMask(screen.r, screen.g, screen.b) * params.dulvibr), 0.0, 1.0) : clamp(1.0 - abs(SatMask(screen.r, screen.g, screen.b) - 1.0) * abs(params.dulvibr), 0.0, 1.0);
    screen = mixfix_v4(screen, adjust * screen, sat_msk);


//  Color Temperature
    vec3 adjusted = wp_adjust(screen.rgb);
    vec3 base_luma = XYZtoYxy(sRGB_to_XYZ(screen.rgb));
    vec3 adjusted_luma = XYZtoYxy(sRGB_to_XYZ(adjusted));
    adjusted = (params.luma_preserve == 1.0) ? adjusted_luma + (vec3(base_luma.r, 0.0, 0.0) - vec3(adjusted_luma.r, 0.0, 0.0)) : adjusted_luma;
    adjusted = clamp(XYZ_to_sRGB(YxytoXYZ(adjusted)), 0.0, 1.0);


    FragColor = vec4(moncurve_r_f3(adjusted, params.gamma_out + 0.20, 0.055), 1.0);
}

Syh · 31 March 2020 15:03

You’re good, man; you weren’t obligated to do anything anyway.

At least I don’t have to do it now, lol.

Really appreciate this!

Syh · 1 April 2020 05:39

Thanks again for the shader, looks great from my testing so far. Black level management is much better than in your earlier version (less drastic).

The only thing I was having issues with was the dull/vibrance setting. I wasn’t really noticing any changes… (Maybe I need to take some screenshots for myself to compare it at -1 and 1?)

Looks good though, everything else was working great for me.

EDIT: Does the shader need to be run with srgb_framebuffer set to true?

Dogway · 1 April 2020 13:47

Thanks! Yes, I refined some settings so there’s more granularity, or so they go from negative to positive, as for me that’s easier to grasp. EDIT: Set srgb_framebuffer to false; we are doing gamma inside the shader.

Dullness/Vibrance is actually a mask for Saturation. It won’t do anything if Saturation is 0. In my opinion this is a better solution because it also allows you to desaturate low-saturation areas (akin to bleach bypass if used with contrast), or to push already saturated areas even further. More control with fewer settings; think of it as a 2D plot where Y is Saturation and X is the saturation mask.
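To make the mask idea concrete, here is a minimal Python sketch of the SatMask() and sat_msk expressions from the shader posted above. It is only a port for checking numbers outside RetroArch: the names sat_mask and sat_weight and the two sample colors are illustrative assumptions, and pure black is treated as zero saturation to avoid a division by zero.

```python
def sat_mask(r, g, b):
    """Port of SatMask(): (max - min) / (max + min), clamped to [0, 1]."""
    hi, lo = max(r, g, b), min(r, g, b)
    if hi + lo == 0.0:
        # Assumption: treat pure black as unsaturated instead of dividing 0 by 0.
        return 0.0
    return min(max((hi - lo) / (hi + lo), 0.0), 1.0)


def sat_weight(rgb, dulvibr):
    """Port of the sat_msk expression: how strongly Saturation is mixed in."""
    mask = sat_mask(*rgb)
    if dulvibr > 0.0:
        weight = 1.0 - mask * dulvibr
    else:
        weight = 1.0 - abs(mask - 1.0) * abs(dulvibr)
    return min(max(weight, 0.0), 1.0)


near_grey = (0.50, 0.45, 0.40)  # hypothetical low-saturation pixel
vivid_red = (0.90, 0.10, 0.10)  # hypothetical high-saturation pixel

# Print the mix weight for both pixels at the extremes and at the default.
for dulvibr in (-1.0, 0.0, 1.0):
    print(dulvibr, sat_weight(near_grey, dulvibr), sat_weight(vivid_red, dulvibr))
```

With a positive Dullness/Vibrance the weight is larger for the near-grey pixel, so Saturation acts mostly on low-saturation areas; with a negative value it is the other way round. At 0 the weight is 1 everywhere and Saturation applies uniformly.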

The chroma banding issue (and partly the overexposure) is due to the Brightness setting, despite working in 16-bit half float. I integrated a rolled-off gain for Brightness so it works in a gradual manner in the high range. The code is dirty as it is, but it works fine. I also implemented a Hotspot fix to bring down some overexposed highlights and recover some shading in those areas, but I’m not happy at all with the code. You can toggle that off though.

Another big issue I was having was posterization or banding in the low range. I thought this was a quantization side effect of chaining several gamma transforms, and although I optimized how the transforms take place (check the presets below), the real culprit was crt-royale’s beam-min-sigma: at its default of 0.02 it was darkening the image; set it to 0.10 and the clipping goes away (EDIT: as I found out, this is only an issue with > 240px input sizes). Still, I would really like the ntsc passes to work in linear space (seemingly they are designed for gamma-encoded inputs), because it could save us 2 transformations, which translates to better performance and less quantization.

On hunterk’s advice I also modified the variable names. As usual, the updates are in the repo. I’m also sharing the full presets with recommended default settings because it’s not easy stuff, for both slang and glsl and for 240 and 320 inputs.

Dogway · 1 April 2020 16:30

OMG, it was the atan() function in include\compat_macros.inc. I completely overlooked it since it’s not in the crt folder and my diff comparison software was skipping minor changes. Finally, after all these years, I can get rid of glsl at last; thanks for making me look twice.

hunterk · 1 April 2020 19:20

oh lol yep, I felt the same way when I figured out what was wrong… What a dumb problem to have caused us so much grief.

Syh · 2 April 2020 16:24

@Dogway

Man I feel dumb, I read your post about the dull/vibrance setting when I was half asleep, and was so confused. Re-read it today, looked at the graph and was fuuuuuu that’s what it does, lol. (I mean damn, you drew me a picture and everything, I’m not stupid I swear, rofl.)

Your grade shader is great, man; the white point lum preserve does a lot for the image.

I need to bring the copy you made without the LUTs up to date, which I can certainly do; I’ll just cherry-pick the update from the repo.

I did make two alterations to the shader though: I split the saturation into separate RGB saturation controls (this may be slightly pointless given that dull/vibrance is a thing, but I like that granular control), and added the X/Y Modifiers from image-adjustment (mainly so I don’t need image-adjustment for it, lol).

All in all this is my favorite video management shader.

Syh · 6 April 2020 18:05

@Dogway

Jesus, you’ve been busy over the last couple of days.

Decided I was going to start cherry picking that update for grade, opened up your repo and you went ham on updating grade. Also interesting update for the signal-bandwidth shader, will have to check it out.

Dogway · 6 April 2020 18:35

oh well, many are cosmetics. I mainly implemented corner size since crt-royale doesn’t have that one. Ideally this fits better into image-adjustment (flip, mirror, zoom, translate, etc) but no way I’m going into that territory haha. The other thing was slang versions for lut_x2, gdapt stripes and signal-bandwidth.

BTW, I forgot to say that signal bandwidth for the ntsc passes is emulated through hardware resizing.

Those:

scale_type_x1 = "source"
scale_x1 = "4.000000"
scale_type_x2 = "source"
scale_x2 = "0.500000"

I’m not sure if this is a requirement for ntsc to work or an option, in which case I would replace that “hack” with signal-bandwidth. I guess after the ntsc passes.

Syh · 6 April 2020 18:43

That’s cake to do personally.

I haven’t updated grade since you posted the non-LUT version of it for me.

Could you elaborate more on this?

Dogway:

BTW, I forgot to say that signal bandwidth for the ntsc passes is emulated through hardware resizing.

Those:
scale_type_x1 = "source"
scale_x1 = "4.000000"
scale_type_x2 = "source"
scale_x2 = "0.500000"
I’m not sure if this is a requirement for ntsc to work or an option, in which case I would replace that “hack” with signal-bandwidth. I guess after the ntsc passes.

Dogway · 6 April 2020 19:18

Yes, well, I really don’t want to mess with my personal presets anymore. I personally don’t use image-adjustment but I guess people do.

On the ntsc bit, if you inspect ntsc-320px-svideo.glslp:

shaders = 2
shader0 = shaders/ntsc-pass1-svideo-2phase.glsl
shader1 = shaders/ntsc-pass2-2phase-gamma.glsl

filter_linear0 = false
filter_linear1 = false

scale_type_x0 = absolute 
scale_type_y0 = source
scale_x0 = 1280
scale_y0 = 1.0
frame_count_mod0 = 2
float_framebuffer0 = true

scale_type1 = source
scale_x1 = 0.5
scale_y1 = 1.0

You’ll find that the image is upscaled to 1280 pixels wide before the ntsc (blur) pass. If you upscale more, the blurring will be less prominent. So this is an inaccurate way of simulating signal bandwidth (my interpretation).
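The width arithmetic behind that can be sketched in Python. This assumes, as in RetroArch presets, that an "absolute" scale sets the width outright while a "source" scale multiplies the width coming out of the previous pass; pass_widths and the sample input widths are made up for illustration.

```python
def pass_widths(input_width, passes):
    """Return the output width after each pass.

    Each pass is (scale_type, value): "absolute" sets the width outright,
    "source" multiplies the width produced by the previous pass.
    """
    widths = []
    width = float(input_width)
    for scale_type, value in passes:
        width = float(value) if scale_type == "absolute" else width * value
        widths.append(int(width))
    return widths


# Horizontal scaling of the two passes in ntsc-320px-svideo.glslp.
ntsc_320 = [("absolute", 1280), ("source", 0.5)]
# The "source" x4 then x0.5 variant quoted in this thread.
source_x4 = [("source", 4.0), ("source", 0.5)]

for input_width in (256, 320):
    print(input_width, pass_widths(input_width, ntsc_320), pass_widths(input_width, source_x4))
```

For a 320-pixel input both variants give the blur pass 4 samples per source pixel. For a 256-pixel input the absolute 1280 version gives 5 samples per pixel while the x4 version keeps 4, which is one way to see how the upscale factor changes how prominent the blur is.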

Syh · 6 April 2020 19:29

You can use this method for the first pass of ntsc, I’m pretty sure, as @ProfessorBraun was discussing it in the CRT shader show-off thread. (It’s about 100-200 posts back.)

scale_type_x1 = "source"
scale_x1 = "4.000000"
scale_type_x2 = "source"
scale_x2 = "0.500000"

Ohh I was saying you can pull that stuff out of image-adjustment and place it into grade. (Already have done it myself and then removed the stuff I didn’t need.) Most of it happens in an earlier part of the shader before the void main()

Not sure if you were saying you didn’t want to add image-adjustment as a pass (to the shader preset chain), or that you didn’t want to add any code from image-adjustment.

Personally the only thing I kept from image-adjustment that I ported over into grade was the X/Y Modifiers.

Regardless, look forward to messing with the new grade update, and checking out your updates to the signal bandwidth shader.

Nesguy · 7 April 2020 19:42

I’m a bit late to the party. Is there a write-up or summary describing the new shaders and how they improve on existing shaders? Looks like good stuff from what I’ve seen but info seems a bit scattered.

Dogway · 7 April 2020 20:35

Basically it’s in the shader header: it’s an ubershader that combines a set of existing shaders so they work better in conjunction and are easier to use. But I also included new code.

It’s all color/tone related that’s why “grade”. I implemented color-mangler, vignette, lut_x2 and white_point, and added vibrance, black level, corner size, rolled gain, sigmoidal contrast and proper gamma transforms.

This should be your first shader on the stack for all color-related work; after it you can do ntsc emulation, blurring, scanlines…

If you can test and give some feedback that would be very welcome since I consider it pretty much done. The only thing I don’t like (and hence default to false) is the HotSpot fix. I need to come up with something better once I have time.

Syh · 7 April 2020 20:39

Could you elaborate on the sigmoidal contrast and proper gamma transformation?

Like what is the difference with sigmoidal contrast, and what is different with how you’re doing gamma vs how it was originally handled.

Dogway · 7 April 2020 21:10

TV sets (and therefore many shaders here) consider “contrast” as black lift (in color grading terms).

[Image: tone-curve diagram illustrating Lift and Gain]

For me that’s Lift (called Black Levels in grade.glsl). In the pic you can also see Gain; people commonly call that luminance or brightness. I kept the name Brightness, but instead of a straight line I applied a soft curve (roll-off) towards the end, so highlights are not clipped harshly but roll off gently.
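As a rough illustration of Lift and a straight-line Gain, here is a Python sketch of the two expressions used in the non-LUT shader posted earlier in the thread (the black level line and the Brightness multiplier). The function names lift and gain are assumptions, and the rolled-off gain itself is not reproduced because its code is not shown in this thread.

```python
def lift(value, black_level):
    """Port of: contrast += (black_level / 20.0) * (1.0 - contrast)."""
    return value + (black_level / 20.0) * (1.0 - value)


def gain(value, lum):
    """Port of: clamp(screen * ((lum - 1.0) * 2.0 + 1.0), 0.0, 1.0)."""
    return min(max(value * ((lum - 1.0) * 2.0 + 1.0), 0.0), 1.0)


# Lift raises black the most and leaves white untouched.
for value in (0.0, 0.5, 1.0):
    print("lift", value, lift(value, 0.5))

# Straight-line gain: anything pushed past 1.0 is clipped by the clamp.
for value in (0.5, 0.9, 1.0):
    print("gain", value, gain(value, 1.1))
```

The hard clamp in gain is the harsh clipping that a roll-off towards the top of the range is meant to soften.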

In comparison, sigmoidal contrast is a non destructive way of increasing contrast by drawing an S curve shape on the function transform. What you get is brighter brights and darker darks (or the inverse) without clipping.
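For anyone who wants to see the S curve in numbers, this is a Python port of contrast_sigmoid() and contrast_sigmoid_inv() from the shader posted earlier in the thread. It is a sketch for experimenting outside RetroArch, not a replacement for the shader code; the sample inputs are arbitrary.

```python
import math


def contrast_sigmoid(color, cont, pivot):
    """S curve normalized so that 0 maps to 0 and 1 maps to 1."""
    cont = (cont + 1.0) ** 3
    knee = 1.0 / (1.0 + math.exp(cont * pivot))
    shldr = 1.0 / (1.0 + math.exp(cont * (pivot - 1.0)))
    return (1.0 / (1.0 + math.exp(cont * (pivot - color))) - knee) / (shldr - knee)


def contrast_sigmoid_inv(color, cont, pivot):
    """Inverse curve, used by the shader when Contrast is negative."""
    cont = (cont - 1.0) ** 3
    knee = 1.0 / (1.0 + math.exp(cont * pivot))
    shldr = 1.0 / (1.0 + math.exp(cont * (pivot - 1.0)))
    return pivot - math.log(1.0 / (color * (shldr - knee) + knee) - 1.0) / cont


# Darks get darker and brights get brighter, while the ends stay put.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, contrast_sigmoid(x, 0.5, 0.5))

# The inverse with Contrast -0.5 undoes the forward curve with Contrast 0.5.
print(contrast_sigmoid_inv(contrast_sigmoid(0.3, 0.5, 0.5), -0.5, 0.5))
```

Because the curve is pinned at 0 and 1, nothing is pushed out of range, which is the "without clipping" part.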


On gamma: after some research I read that while CRTs employed a pure power gamma, LCDs abide by sRGB gamma. They are different: sRGB gamma has a linear part at the low end, which is better at avoiding crushed blacks. This was my first attempt, but the blacks kept being crushed nonetheless due to the initial pure power linearization. Going against the literature, I decided to use sRGB gamma for the CRT part as well; after some adjustment of the values, what you get is the same image but with better dynamic range (brighter, if you like) in the low range. You can check this in real time on a dark frame: toggle Gamma Type on and off and decide which one you prefer. Most shaders in RetroArch use pure power gamma.
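The difference near black can be checked with a small Python port of moncurve_f() from the shader posted earlier in the thread, compared against a pure power curve. The values 2.55 (the default CRT Gamma of 2.40 plus the 0.15 the shader adds) and 0.055 come from that shader; the sample inputs are arbitrary and the script is only a sketch for comparing numbers.

```python
def moncurve_f(color, gamma, offs):
    """Forward monitor curve: linear segment near black, power curve above it."""
    color = min(max(color, 0.0), 1.0)
    fs = ((gamma - 1.0) / offs) * (offs * gamma / ((gamma - 1.0) * (1.0 + offs))) ** gamma
    xb = offs / (gamma - 1.0)
    if color > xb:
        return ((color + offs) / (1.0 + offs)) ** gamma
    return color * fs


def pure_power(color, gamma):
    """Plain pow(color, gamma) linearization."""
    return min(max(color, 0.0), 1.0) ** gamma


# Compare both linearizations from near black up to white.
for c in (0.01, 0.02, 0.05, 0.25, 0.5, 1.0):
    print(c, moncurve_f(c, 2.55, 0.055), pure_power(c, 2.40))
```

Near black the sRGB-style curve returns noticeably larger linear values than the pure power curve, while around mid-grey the two are close, which fits the description of the same image with a better low range.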

Syh · 11 April 2020 05:46

@Dogway

Hey, I don’t know if it’s my setup specifically or what, but the black_rgb tint settings seem pointless. If any of them goes to 0.01 it immediately makes black that color. I know that is sorta its purpose, but it’s fairly drastic (like too strong too fast to be of use, imo).

If you move them in tandem you get a way stronger version of the black level setting, which isn’t anywhere near as nice imo.

I mean it’s your decision, but you may want to review the black tint settings and see if you want to phase them out, black lift (level) is great though, the tints are just weird.

I don’t want you to think I’m complaining, as this is an issue with color-mangler as well. I just thought you’d be interested in checking it out.

Shader’s great regardless, lol.

Dogway · 11 April 2020 13:19

It’s color-mangler behavior untouched. I agree stepping is too high so I refined it, thanks for the observation.
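One possible reading of why 0.01 already looks drastic, based on the non-LUT shader posted earlier in the thread: the black tint is added to the pixel in linear light, and the output encoding (moncurve_r with the default Display Gamma of 2.20 plus 0.20) expands small linear values a lot. The Python sketch below ports moncurve_r() to compare a tint of 0.01 with a Black Level of 0.01, which the shader divides by 20 first. It ignores the saturation and white point stages in between, so treat the numbers as approximate.

```python
def moncurve_r(color, gamma, offs):
    """Reverse monitor curve: encodes linear light for the display."""
    color = min(max(color, 0.0), 1.0)
    yb = (offs * gamma / ((gamma - 1.0) * (1.0 + offs))) ** gamma
    rs = ((gamma - 1.0) / offs) ** (gamma - 1.0) * ((1.0 + offs) / gamma) ** gamma
    if color > yb:
        return (1.0 + offs) * color ** (1.0 / gamma) - offs
    return color * rs


gamma_out = 2.20 + 0.20  # default Display Gamma plus the shader's offset

# A black pixel with a tint of 0.01 ends up at 0.01 in linear light.
tint_encoded = moncurve_r(0.01, gamma_out, 0.055)
# A Black Level of 0.01 lifts a black pixel by only 0.01 / 20.
level_encoded = moncurve_r(0.01 / 20.0, gamma_out, 0.055)

print("tint 0.01  ->", tint_encoded, "on a 0-1 scale")
print("level 0.01 ->", level_encoded, "on a 0-1 scale")
```

Under these assumptions the tint lands at roughly 0.1 of full scale (about 25 out of 255) while the same Black Level value stays under 2 out of 255, so a smaller step for the tints seems like a reasonable fix.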