Dogway's grading shader (slang)

I normally use slang anyway, just messing with glsl for a project lol. Just wanted to give you a heads up.

Btw Hue is working fine for me, so idk what’s going on. Maybe it’s a card specific issue?

I can’t get the HUE controls to work even with a CRT gamut selected (tried all of them). So, that might be related to the black screen issue. Seems like a lot of code is getting skipped regardless of what CRT gamut is set to.

Thanks for all the suggestions, that third shot you posted is looking pretty close.

I’m too lazy to reconvert all my shots to sRGB, but here’s the third shot converted from my display profile to sRGB, so it might be closer to yours.

I’ll have a look at the issues tomorrow, I’m too tired today lol

lol no worries. Just looking at that discussion you posted makes me want to lie down and take a nap for the rest of the day.

Sometimes I wonder if I’m making things more complicated than necessary.

Food for thought:

“The sRGB colour space was developed in the 1990s to facilitate consistent colour reproduction of images on the Internet. Its RGB primaries are based on the most common CRT display phosphor primaries and the encoding of the gamma function, which largely governs the displayed image contrast, is based on the default CRT display gamma (see Chapter 10).”

(Langford’s Advanced Photography, 7th ed, p. 338)

If I’m reading this right, this is saying that we don’t really need to do anything to the color gamut for accurate color reproduction. All we need is to get the gamma function right and to adjust our display gamma based on the viewing environment and display conditions. If we also want composite video colors we also need to choose the appropriate palette in the emulator and/or a shader that emulates composite video. If we do this and if we calibrate our displays correctly then we will have technically accurate color reproduction. The CRT gamut stuff is just making things even more accurate.

I agree, it’s just not good.

I think the Conrac phosphors (the ones based on actual measurements) are pretty weird too. Maybe the ideal values are the ones you should actually use because of potential error in measurement and because they look better.

edit: I really need to remember to check the obvious. Revisiting that SMB3 shot, I realized that the problem was probably the emulator color palette and the lack of composite video colors. The photo I posted was of a North American Sharp NES TV. The Japanese version used an RGB connection, but the NA version used a high-quality composite video connection. Once I changed the color palette the colors were close enough to what I saw in the photo.

@Dogway

This is semi-unrelated but I thought you might find it interesting nonetheless, found this while doing some nonsense.

They’re selective color shaders for ReShade (both made by the same person and both updated within the last month) that, from my limited understanding, use Photoshop as the basis for their color manipulation.

https://github.com/prod80/prod80-ReShade-Repository/blob/master/Shaders/PD80_04_Selective_Color.fx

https://github.com/prod80/prod80-ReShade-Repository/blob/master/Shaders/PD80_04_Selective_Color_v2.fx

Not really suggesting you incorporate anything from them, I just thought they’d be interesting reading material lol.

Here’s where the basis of the math for it is coming from.

http://blog.pkh.me/p/22-understanding-selective-coloring-in-adobe-photoshop.html

It’s wayyyyy over my head, but as someone interested in color management I thought you’d enjoy this.

@Nesguy, sRGB is, as many standards are, an average of many things. I explained above that the sRGB primaries were a mishmash of the primaries of the three main color spaces (SMPTE, EBU and Japan), so that no one would complain, even though there was some dispute over one of the coefficients.

It’s the same with other standards: P3 is an average of measured theater gamuts despite having a white point that is too green, ACES is based on film negatives and Pointer’s Gamut, and so on…

Now the question is, do you want to play an average of everything? Then select “g_gamut = 0”. If you want something more accurate, or closer to a specific region, select one of the 3 gamuts (#3 #4 #5). Want to go further and emulate a specific TV set? Select the Sony or the Conrac gamut (I might add more in the future). And since we are way past the 90s and can emulate so many things, why not emulate the gamut of the region the game was developed in? This way we can see colors as the developer intended…

Also I wanted to clear this up. I’m making grade your one-stop shader for color management, correction and grading, and it checks all the boxes in those terms. First we have the composite emulation (YIQ/YUV) in proper full or legal range, so there’s no need for other shaders unless you also want the artifacts. We have analogue controls, yesterday I added YCC space conversion, and together with phosphor emulation I believe you can’t get any closer to the real thing color-wise. Does it look like the real thing? Probably not, but we are closer, and so we have an array of grade settings to play with and fine-tune. Mind you, your emulated room lighting also affects the result (warm reflection on the glass, or the opposite), so I might add a tint option for the glass reflection in said shader. Current HDTVs are not very reflective, so we have to emulate that part of CRTs.

That’s my take. To be honest I wouldn’t deviate much from the posted presets; here are the settings I recommend playing with and the ones I don’t recommend changing much in terms of grading. Monitor calibration is a must. The most important thing IMO is a balanced tint over a stepped greyscale ramp; D65 looks slightly warm to the untrained eye, but if you have a calibration instrument, all the better. In the future I will work on this area more.

> Preset  * Tinker

> g_space_out "LCD Color Space" 0.0 0.0 2.0 1.0 (0.0 -sRGB- for most) 
* g_gamma_out "LCD Gamma" 2.20 0.0 3.0 0.05 ( between 2.00 -daylight environment- and 2.80 -PAL dark environment-)
> g_gamma_in "CRT Gamma" 2.40 0.0 3.0 0.05 (2.30 - 2.55)
> g_gamma_type "CRT Gamma (POW:0, sRGB:1, SMPTE-C:2)" 1.0 0.0 2.0 1.0 (1.0 -sRGB- mostly -can ignore SMPTE gamma for the time being)
g_vignette "Vignette Toggle" 1.0 0.0 1.0 1.0
g_vstr "Vignette Strength" 40.0 0.0 50.0 1.0
g_vpower "Vignette Power" 0.20 0.0 0.5 0.01
> g_crtgamut "Gamut (3:NTSC-U 4:NTSC-J 5:PAL)" 4.0 0.0 7.0 1.0 (4.0 -NTSC-J- for most)
g_hue_degrees "Hue" 0.0 -360.0 360.0 1.0
* g_I_SHIFT "I/U Shift" 0.0 -1.0 1.0 0.02
* g_Q_SHIFT "Q/V Shift" 0.0 -1.0 1.0 0.02
g_I_MUL "I/U Multiplier" 1.0 0.0 2.0 0.1
g_Q_MUL "Q/V Multiplier" 1.0 0.0 2.0 0.1
> wp_temperature "White Point" 9305.0 1621.0 12055.0 50.0 (7200 < x < 9305 for most, even NTSC-U games)
* g_sat "Saturation" 0.0 -1.0 2.0 0.02
g_vibr "Dullness/Vibrance" 0.0 -1.0 1.0 0.05
* g_lum "Brightness" 0.0 -0.5 1.0 0.01
g_cntrst "Contrast" 0.0 -1.0 1.0 0.05
g_mid "Contrast Pivot" 0.5 0.0 1.0 0.01
* g_lift "Black Level" 0.0 -0.5 0.5 0.01
blr "Black-Red Tint" 0.0 0.0 1.0 0.01
blg "Black-Green Tint" 0.0 0.0 1.0 0.01
blb "Black-Blue Tint" 0.0 0.0 1.0 0.01
wlr "White-Red Tint" 1.0 0.0 2.0 0.01
* wlg "White-Green Tint" 1.0 0.0 2.0 0.01 ( I like to lower this a bit on most systems)
wlb "White-Blue Tint" 1.0 0.0 2.0 0.01
* rg "Red-Green Tint" 0.0 -1.0 1.0 0.005
rb "Red-Blue Tint" 0.0 -1.0 1.0 0.005
gr "Green-Red Tint" 0.0 -1.0 1.0 0.005
* gb "Green-Blue Tint" 0.0 -1.0 1.0 0.005 ( a bit of this too)
br "Blue-Red Tint" 0.0 -1.0 1.0 0.005
* bg "Blue-Green Tint" 0.0 -1.0 1.0 0.005 ( and a bit more -> 0.05 - 0.10 on this)
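
To illustrate how the two gamma settings above interact, here is a minimal sketch that assumes plain power-law curves (the POW gamma type) and is not the shader's actual code. The signal is linearized with the CRT gamma and then re-encoded for the LCD gamma. The function name `regamma` is hypothetical.

```python
def regamma(v, crt_gamma=2.40, lcd_gamma=2.20):
    """Decode a [0, 1] signal with the CRT gamma, then re-encode it for the LCD.

    Assumes pure power-law curves; the defaults mirror the g_gamma_in and
    g_gamma_out defaults listed above.
    """
    v = min(max(v, 0.0), 1.0)          # keep the input in the legal range
    linear = v ** crt_gamma            # CRT signal -> linear light
    return linear ** (1.0 / lcd_gamma) # linear light -> LCD signal


if __name__ == "__main__":
    for v in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{v:.2f} -> {regamma(v):.4f}")
```

With a CRT gamma above the LCD gamma, midtones come out slightly darker; with both set to the same value the function is an identity.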

Thanks for providing that explanation, very informative. I must have missed that part somehow. I figured you probably already knew all that stuff; I’m learning a lot as I go and sometimes I think out loud. I think the effort you’ve put into this shader has definitely been worth it!

Just downloaded the recent update and everything appears to be working fine now.

Can you provide a bit of info on LCD color space?

Here are your suggested settings. Very nice! POW gamma 2.4, LCD gamma 2.2 and did the recommended tint adjustments. “Composite direct capture” in Nestopia, NTSC-J preset.

Yeah, looks great. I’m not sure if Nestopia composite is doing something to the colors though. Did the issues from yesterday get resolved?

To be honest I’m learning as I go too, that’s why you see me correct things over and over, but I do know where to look and had some background as well.

The last update fixed (now for real) all the coefficients. I used a more exact D65 white point up to 6 decimals to minimize rounding errors.

Also, it seems the SMPTE-C TRC might not have been correct at all, who knows, but now I’m using 0.099 as the alpha component, so it shouldn’t be as bright. In addition, there are now matching output TRCs to conform with the LCD color spaces.
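
For reference, here is a sketch of the textbook SMPTE 170M / Rec.601-style transfer curve, where 0.099 is the offset term. It assumes the standard published constants (0.45 exponent, 4.5 linear slope, 0.018 breakpoint) and is not necessarily how the shader implements it. The function names are hypothetical.

```python
ALPHA = 0.099          # offset ("alpha") term discussed above
BREAK_LINEAR = 0.018   # linear-light breakpoint of the standard curve
SLOPE = 4.5            # slope of the linear toe segment


def smpte_encode(l):
    """Linear light in [0, 1] -> encoded signal (standard-form curve)."""
    if l < BREAK_LINEAR:
        return SLOPE * l
    return (1.0 + ALPHA) * l ** 0.45 - ALPHA


def smpte_decode(v):
    """Encoded signal in [0, 1] -> linear light (inverse of smpte_encode)."""
    if v < SLOPE * BREAK_LINEAR:
        return v / SLOPE
    return ((v + ALPHA) / (1.0 + ALPHA)) ** (1.0 / 0.45)


if __name__ == "__main__":
    for l in (0.0, 0.01, 0.018, 0.18, 0.5, 1.0):
        v = smpte_encode(l)
        print(f"L={l:.3f} -> V={v:.4f} -> L={smpte_decode(v):.4f}")
```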

Since we are doing so much color work, it won’t be rare to deviate into illegal sRGB values, so for those that have P3 monitors they can use the P3 color space (also for future proofing). All internal processing is done in this space so it’s not only an output thing. I don’t know much about P3 displays, are they usually tuned to DCI white point? D65 instead?

I also discovered a trick to clamp illegal YUV values: it’s a coefficient of the conversion matrix, depending on the position (the YIQ, YUV and YCC primaries are each rotated). This area is the hardest one and the last one to get right, since there’s so little information. I still need to run more tests but wanted to release something today to get some feedback.

One thing I would like to know from someone more emulation-code savvy (maybe @hunterk knows): are the system palettes (say the SNES ones) extracted RAW into RGB, or are they somehow mapped to sRGB? I ask because RAW should be Rec.601 while sRGB is Rec.709 compliant; in the former case I need a conversion.

@Dogway I don’t know if the hotspot thing is still bothering you, but I’ve found that the mask/scanlines can do a pretty good job of smoothing that out depending on how they’re configured. In fact, having that extra headroom at the high end is probably a good thing; lets you use stronger mask settings while maintaining good contrast in highlights.

Only a few of the masks are beneficial, though. A lot of them can have a pretty bad effect on gamma/dynamic range/etc.

I tested the masks in guest-dr-venom and the good ones are the CGWG masks, the built-in BW slotmask, and the rotated Lottes mask. The rest should be avoided if you’re interested in objective image quality. I plan on testing Royale next and posting my findings then.

Edit: issues from yesterday appear to be gone now. I probably just mucked up the code in the test version I was using.

Nice, no issues lol pheww

Yes, sometimes I waver between royale and venom. Royale BW works great for Sega Genesis; for SNES, though, I’m now using venom. I will try the masks you mentioned, they are in-code masks, right?

I haven’t had time to check the hotspot issue, I decided to think about it when everything else was correct.

I’m pretty much done with both grade and glass. There’s one thing I want to add to glass: rolling shutter. The beam scan goes from top to bottom, so if you go very fast in Sonic or other games, the upper portion of the screen might be rendering a new “frame” while the bottom is still showing the previous one. Since I can’t do anything below 16.6ms within the shader, I will use skewing, masking and blending to simulate a rolling shutter effect. I’m not sure I know enough GLSL yet to do that, but it’s worth a try.
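
One possible reading of the skewing part, as a rough sketch rather than shader code: shift each row horizontally in proportion to how late in the frame the beam reaches it. The function name `skew_rows` and the per-frame velocity input are assumptions for illustration; a real shader would have to estimate motion from previous frames.

```python
def skew_rows(frame, velocity_px):
    """Shift each row in proportion to its scan time within the frame.

    frame: list of rows (each a list of pixel values), top row first.
    velocity_px: assumed horizontal motion in pixels per frame.
    """
    height = len(frame)
    out = []
    for y, row in enumerate(frame):
        # Fraction of the frame time elapsed when the beam reaches row y.
        t = y / height
        shift = round(velocity_px * t)
        width = len(row)
        # Sample with edge clamping so every row keeps its original width.
        out.append([row[min(max(x - shift, 0), width - 1)] for x in range(width)])
    return out


if __name__ == "__main__":
    frame = [[0, 1, 2, 3] for _ in range(4)]
    for row in skew_rows(frame, velocity_px=4):
        print(row)
```

The top row is left untouched and lower rows are shifted progressively further, which gives the slanted look of a rolling shutter on fast horizontal motion.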

For the rolling shutter, you may want to reference mame_hlsl’s humbar code. (I think that’s the same as, or similar to, what you’re suggesting; at the very least it could be a nice reference. You may have to flip the effect though idk.)

mmmm humbar is a bit different I think, it’s that black line going upwards… I was referencing the motion blur shader from hunterk; it has examples for calling previous frames and blending, and also for luma difference: the higher the difference, the further I can skew the image (some algebra here for the skewing). The mask could be sine-based, maybe.

The humbar in mame isn’t a black line, it’s different than that. You should probably load up mame hlsl and try it, as that would be easier than explaining a motion effect, lol. (I don’t think they named it correctly lol)

It’s doing this brightness peak/fade thing that travels up the screen (hence me saying you may want to flip it.)

What is this??? lol

Ok we had slightly different definitions of a black line lol. I took you literally, lolol.

Yeah it’s sorta like that but more blending happens from top to bottom of the effect, just ignore me lol.

Here ya go!

https://www.shadertoy.com/view/ws3SRN

It’s a slow version and stuff, but it might be a nice reference.

Here are the ones in guest-dr-venom that can actually improve the image:

Mask type 0 (cgwg aperture) 
Mask type 3 (rotated Lottes) 
The built-in BW slotmask with separate parameter settings 

I’m quickly becoming a fan of the BW slotmask. Set width and size to 1 for 1080p.

A caveat with masks: any pattern more than 1px tall needs an even integer scale for optimum results. Otherwise you can get an optical illusion that looks like inconsistent scanlines, like you would get when using a non-integer scale on the vertical axis. The effect might not be noticeable, though.
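
A quick way to check this for a given setup, as a sketch under my reading of the caveat: the vertical scale must be an integer, and for patterns taller than 1px it should also be a multiple of the pattern height (that generalization beyond 2px patterns is an assumption). The function name `mask_aligns` is hypothetical.

```python
def mask_aligns(output_height, source_height, pattern_height):
    """Return True if a mask pattern should line up with the scanlines.

    output_height / source_height is the vertical scale factor.
    pattern_height is the mask pattern height in output pixels.
    """
    # A non-integer vertical scale already gives uneven scanlines.
    if output_height % source_height != 0:
        return False
    scale = output_height // source_height
    # Patterns 1px tall repeat on every row, so any integer scale works.
    if pattern_height <= 1:
        return True
    # Taller patterns need the scanline period to be a multiple of their height.
    return scale % pattern_height == 0


if __name__ == "__main__":
    print(mask_aligns(960, 240, 2))   # 4x scale, 2px pattern
    print(mask_aligns(1200, 240, 2))  # 5x scale, 2px pattern
    print(mask_aligns(1080, 240, 1))  # 4.5x scale, 1px pattern
```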

EDIT:

Also, here’s a version of guest-dr-venom that replaces the CGWG aperture grille with the CGWG dotmask if you want to give that a try. It’s also a good one to use for improving the image quality.

#version 450

/*
   CRT - Guest - Dr. Venom
   
   Copyright (C) 2018-2020 guest(r) - [email protected]

   Incorporates many good ideas and suggestions from Dr. Venom.
   
   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License
   as published by the Free Software Foundation; either version 2
   of the License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
   
*/

layout(push_constant) uniform Push
{
	float TATE, IOS, OS, BLOOM, brightboost, brightboost1, gsl, scanline1, scanline2, beam_min, beam_max, beam_size,
      h_sharp, s_sharp, csize, bsize, warpX, warpY, glow, shadowMask, masksize, vertmask,
      slotmask, slotwidth, double_slot, mcut, maskDark, maskLight, CGWG, gamma_out, spike, inter;
} params;

layout(std140, set = 0, binding = 0) uniform UBO
{
	mat4 MVP;
   vec4 SourceSize;
	vec4 OriginalSize;
	vec4 OutputSize;
	uint FrameCount;
	float bloom;
	float interm;
	float scans;
	float slotms;
} global;

#pragma parameter TATE "TATE Mode" 0.0 0.0 1.0 1.0
#define TATE         params.TATE     // Screen orientation
#pragma parameter IOS "Smart Integer Scaling: 1.0:Y, 2.0:'X'+Y" 0.0 0.0 2.0 1.0
#define IOS          params.IOS     // Smart Integer Scaling
#pragma parameter OS "R. Bloom Overscan Mode" 1.0 0.0 2.0 1.0
#define OS           params.OS     // Do overscan
#pragma parameter BLOOM "Raster bloom %" 0.0 0.0 20.0 1.0
#define BLOOM        params.BLOOM     // Bloom overscan percentage
#pragma parameter brightboost "Bright Boost Dark Pixels" 1.40 0.50 4.00 0.05
#define brightboost  params.brightboost     // adjust brightness
#pragma parameter brightboost1 "Bright Boost Bright Pixels" 1.15 0.50 3.00 0.05
#define brightboost1  params.brightboost1     // adjust brightness
#pragma parameter gsl "Scanline Type" 0.0 0.0 2.0 1.0
#define gsl          params.gsl      // Alternate scanlines
#pragma parameter scanline1 "Scanline beam shape low" 6.0 0.0 15.0 1.0
#define scanline1    params.scanline1      // scanline param, vertical sharpness
#pragma parameter scanline2 "Scanline beam shape high" 8.0 0.0 23.0 1.0 
#define scanline2    params.scanline2      // scanline param, vertical sharpness
#pragma parameter beam_min "Scanline dark" 1.35 0.5 2.5 0.05
#define beam_min     params.beam_min     // dark area beam min - narrow
#pragma parameter beam_max "Scanline bright" 1.05 0.5 2.5 0.05
#define beam_max     params.beam_max     // bright area beam max - wide
#pragma parameter beam_size "Increased bright scanline beam" 0.70 0.0 1.0 0.05
#define beam_size    params.beam_size     // increased max. beam size
#pragma parameter h_sharp "Horizontal sharpness" 5.25 1.5 20.0 0.25
#define h_sharp      params.h_sharp     // pixel sharpness
#pragma parameter s_sharp "Substractive sharpness (relative)" 0.40 0.0 1.0 0.05
#define s_sharp      params.s_sharp     // substractive sharpness
#pragma parameter csize "Corner size" 0.0 0.0 0.07 0.01
#define csize        params.csize     // corner size
#pragma parameter bsize "Border smoothness" 600.0 100.0 600.0 25.0
#define bsize        params.bsize     // border smoothness
#pragma parameter warpX "CurvatureX (default 0.03)" 0.0 0.0 0.125 0.01
#define warpX        params.warpX     // Curvature X
#pragma parameter warpY "CurvatureY (default 0.04)" 0.0 0.0 0.125 0.01
#define warpY        params.warpY     // Curvature Y
#pragma parameter glow "Glow Strength" 0.02 0.0 0.5 0.01
#define glow         params.glow     // Glow Strength
#pragma parameter shadowMask "CRT Mask: 0:CGWG, 1-4:Lottes, 5-6:'Trinitron'" 0.0 -1.0 7.0 1.0
#define shadowMask   params.shadowMask     // Mask Style
#pragma parameter masksize "CRT Mask Size (2.0 is nice in 4k)" 1.0 1.0 2.0 1.0
#define masksize     params.masksize     // Mask Size
#pragma parameter vertmask "PVM Like Colors" 0.0 0.0 0.25 0.01
#define vertmask     params.vertmask     // Vertical mask
#pragma parameter slotmask "Slot Mask Strength" 0.0 0.0 1.0 0.05
#define slotmask     params.slotmask     // Slot Mask ON/OFF
#pragma parameter slotwidth "Slot Mask Width" 2.0 1.0 6.0 0.5
#define slotwidth    params.slotwidth     // Slot Mask Width
#pragma parameter double_slot "Slot Mask Height: 2x1 or 4x1" 1.0 1.0 2.0 1.0
#define double_slot  params.double_slot     // Slot Mask Height
#pragma parameter slotms "Slot Mask Size" 1.0 1.0 2.0 1.0
#define slotms  global.slotms     // Slot Mask Size
#pragma parameter mcut "Mask 5-7 cutoff" 0.25 0.0 0.5 0.05
#define mcut         params.mcut     // Mask 5-7 cutoff
#pragma parameter maskDark "Lottes&Trinitron maskDark" 0.5 0.0 2.0 0.05
#define maskDark     params.maskDark     // Dark "Phosphor"
#pragma parameter maskLight "Lottes&Trinitron maskLight" 1.5 0.0 2.0 0.05
#define maskLight    params.maskLight     // Light "Phosphor"
#pragma parameter CGWG "Mask 0&7 Mask Str." 0.3 0.0 1.0 0.05
#define CGWG         params.CGWG     // CGWG Mask Strength
#pragma parameter gamma_out "Gamma out" 2.4 1.0 3.5 0.05
#define gamma_out    params.gamma_out     // output gamma
#pragma parameter spike "Scanline Spike Removal" 1.0 0.0 2.0 0.10
#define spike params.spike
#pragma parameter inter "Interlace Trigger Resolution :" 400.0 0.0 800.0 25.0
#define inter         params.inter     // interlace resolution
#pragma parameter interm "Interlace Mode (0.0 = OFF):" 1.0 0.0 3.0 1.0
#define interm         global.interm     // interlace mode 
#pragma parameter bloom "Bloom Strength" 0.0 0.0 2.0 0.1
#define bloom         global.bloom     // bloom effect
#pragma parameter scans "Scanline 1&2 Saturation" 0.5 0.0 1.0 0.1
#define scans         global.scans     // scanline saturation

#define COMPAT_TEXTURE(c,d) texture(c,d)
#define TEX0 vTexCoord
#define InputSize SourceSize
#define TextureSize SourceSize

#define SourceSize global.SourceSize
#define OutputSize global.OutputSize
#define gl_FragCoord (vTexCoord * OutputSize.xy)

#pragma stage vertex
layout(location = 0) in vec4 Position;
layout(location = 1) in vec2 TexCoord;
layout(location = 0) out vec2 vTexCoord;

void main()
{
   gl_Position = global.MVP * Position;
   vTexCoord = TexCoord * 1.00001;
}

#pragma stage fragment
layout(location = 0) in vec2 vTexCoord;
layout(location = 0) out vec4 FragColor;
layout(set = 0, binding = 2) uniform sampler2D Source;
layout(set = 0, binding = 3) uniform sampler2D LinearizePass;
layout(set = 0, binding = 4) uniform sampler2D AvgLumPass;
layout(set = 0, binding = 5) uniform sampler2D GlowPass;

#define Texture Source
#define PassPrev5Texture AvgLumPass
#define PassPrev4Texture LinearizePass
#define PassPrev2Texture GlowPass

#define eps 1e-10 

float st(float x)
{
	return exp2(-10.0*x*x);
} 
   
vec3 sw0(vec3 x, vec3 color, float scanline)
{
	vec3 tmp = mix(vec3(beam_min),vec3(beam_max), color);
	vec3 ex = x*tmp;
	return exp2(-scanline*ex*ex);
} 

vec3 sw1(vec3 x, vec3 color, float scanline)
{	
	float mx = max(max(color.r, color.g),color.b);
	x = mix (x, beam_min*x, max(x-0.4*mx,0.0));
	vec3 tmp = mix(vec3(1.2*beam_min),vec3(beam_max), color);
	vec3 ex = x*tmp;
	float br = clamp(0.8*beam_min - 1.0, 0.2, 0.45);
	vec3 res = exp2(-scanline*ex*ex)/(1.0-br+br*mx);
	mx = max(max(res.r,res.g),res.b);
	float scans1 = scans; if (vertmask > 0.0) scans1=1.0;
	return mix(vec3(mx), res, scans1);		
}    

vec3 sw2(vec3 x, vec3 color, float scanline)
{
	float mx = max(max(color.r, color.g),color.b);
	vec3 tmp = mix(vec3(2.5*beam_min),vec3(beam_max), color);
	tmp = mix(vec3(beam_max), tmp, pow(abs(x), color+0.3));
	vec3 ex = x*tmp;
	vec3 res = exp2(-scanline*ex*ex)/(0.6 + 0.4*mx);
	mx = max(max(res.r,res.g),res.b);
	float scans1 = scans; if (vertmask > 0.0) scans1=0.75;	
	return mix(vec3(mx), res, scans1);	
} 

// Shadow mask (1-4 from PD CRT Lottes shader).
vec3 Mask(vec2 pos, vec3 c)
{
	pos = floor(pos/masksize);
	vec3 mask = vec3(maskDark, maskDark, maskDark);
	
	// No mask
	if (shadowMask == -1.0)
	{
		mask = vec3(1.0);
	}       
	



	// Phosphor.
	else if (shadowMask == 0.0)
	{
		pos.x = pos.x + pos.y;
		pos.x = fract(pos.x*0.5);
		float mc = 1.0 - CGWG;
		if (pos.x < 0.5) { mask.r = 1.1; mask.g = mc; mask.b = 1.1; }
		else { mask.r = mc; mask.g = 1.1; mask.b = mc; }
	}


    
   
	// Very compressed TV style shadow mask.
	else if (shadowMask == 1.0)
	{
		float line = maskLight;
		float odd  = 0.0;

		if (fract(pos.x/6.0) < 0.5)
			odd = 1.0;
		if (fract((pos.y + odd)/2.0) < 0.5)
			line = maskDark;

		pos.x = fract(pos.x/3.0);
    
		if      (pos.x < 0.333) mask.r = maskLight;
		else if (pos.x < 0.666) mask.g = maskLight;
		else                    mask.b = maskLight;
		
		mask*=line;  
	} 

	// Aperture-grille.
	else if (shadowMask == 2.0)
	{
		pos.x = fract(pos.x/3.0);

		if      (pos.x < 0.333) mask.r = maskLight;
		else if (pos.x < 0.666) mask.g = maskLight;
		else                    mask.b = maskLight;
	} 

	// Stretched VGA style shadow mask (same as prior shaders).
	else if (shadowMask == 3.0)
	{
		pos.x += pos.y*3.0;
		pos.x  = fract(pos.x/6.0);

		if      (pos.x < 0.333) mask.r = maskLight;
		else if (pos.x < 0.666) mask.g = maskLight;
		else                    mask.b = maskLight;
	}

	// VGA style shadow mask.
	else if (shadowMask == 4.0)
	{
		pos.xy = floor(pos.xy*vec2(1.0, 0.5));
		pos.x += pos.y*3.0;
		pos.x  = fract(pos.x/6.0);

		if      (pos.x < 0.333) mask.r = maskLight;
		else if (pos.x < 0.666) mask.g = maskLight;
		else                    mask.b = maskLight;
	}
	
	// Alternate mask 5
	else if (shadowMask == 5.0)
	{
		float mx = max(max(c.r,c.g),c.b);
		vec3 maskTmp = vec3( min( 1.25*max(mx-mcut,0.0)/(1.0-mcut) ,maskDark + 0.2*(1.0-maskDark)*mx));
		float adj = 0.80*maskLight - 0.5*(0.80*maskLight - 1.0)*mx + 0.75*(1.0-mx);	
		mask = maskTmp;
		pos.x = fract(pos.x/2.0);
		if  (pos.x < 0.5)
		{	mask.r  = adj;
			mask.b  = adj;
		}
		else     mask.g = adj;
	}    

	// Alternate mask 6
	else if (shadowMask == 6.0)
	{
		float mx = max(max(c.r,c.g),c.b);
		vec3 maskTmp = vec3( min( 1.33*max(mx-mcut,0.0)/(1.0-mcut) ,maskDark + 0.225*(1.0-maskDark)*mx));
		float adj = 0.80*maskLight - 0.5*(0.80*maskLight - 1.0)*mx + 0.75*(1.0-mx);
		mask = maskTmp;
		pos.x = fract(pos.x/3.0);
		if      (pos.x < 0.333) mask.r = adj;
		else if (pos.x < 0.666) mask.g = adj;
		else                    mask.b = adj; 
	}
	
	// Alternate mask 7
	else if (shadowMask == 7.0)
	{
		float mc = 1.0 - CGWG;
		float mx = max(max(c.r,c.g),c.b);
		float maskTmp = min(1.6*max(mx-mcut,0.0)/(1.0-mcut) , mc);
		mask = vec3(maskTmp);
		pos.x = fract(pos.x/2.0);
		if  (pos.x < 0.5) mask = vec3(1.0);
	}    
	
	return mask;
} 

float SlotMask(vec2 pos, vec3 c)
{
	if (slotmask == 0.0) return 1.0;
	
	pos = floor(pos/slotms);
	float mx = pow(max(max(c.r,c.g),c.b),1.33);
	float mlen = slotwidth*2.0;
	float px = fract(pos.x/mlen);
	float py = floor(fract(pos.y/(2.0*double_slot))*2.0*double_slot);
	float slot_dark = mix(1.0-slotmask, 1.0-0.80*slotmask, mx);
	float slot = 1.0 + 0.7*slotmask*(1.0-mx);
	if (py == 0.0 && px <  0.5) slot = slot_dark; else
	if (py == double_slot && px >= 0.5) slot = slot_dark;		
	
	return slot;
}   
 
// Distortion of scanlines, and end of screen alpha (PD Lottes Curvature)
vec2 Warp(vec2 pos)
{
	pos  = pos*2.0-1.0;    
	pos *= vec2(1.0 + (pos.y*pos.y)*warpX, 1.0 + (pos.x*pos.x)*warpY);
	return pos*0.5 + 0.5;
} 

vec2 Overscan(vec2 pos, float dx, float dy){
	pos=pos*2.0-1.0;    
	pos*=vec2(dx,dy);
	return pos*0.5+0.5;
} 


// Borrowed from cgwg's crt-geom, under GPL

float corner(vec2 coord)
{
	coord *= SourceSize.xy / InputSize.xy;
	coord = (coord - vec2(0.5)) * 1.0 + vec2(0.5);
	coord = min(coord, vec2(1.0)-coord) * vec2(1.0, OutputSize.y/OutputSize.x);
	vec2 cdist = vec2(max(csize, max((1.0-smoothstep(100.0,600.0,bsize))*0.01,0.002)));
	coord = (cdist - min(coord,cdist));
	float dist = sqrt(dot(coord,coord));
	return clamp((cdist.x-dist)*bsize,0.0, 1.0);
}

vec3 declip(vec3 c, float b)
{
	float m = max(max(c.r,c.g),c.b);
	if (m > b) c = c*b/m;
	return c;
}

void main()
{
	float lum = COMPAT_TEXTURE(PassPrev5Texture, vec2(0.1,0.1)).a;

	// Calculating texel coordinates
   
	vec2 texcoord = TEX0.xy;
	if (IOS > 0.0){
		vec2 ofactor = OutputSize.xy/InputSize.xy;
		vec2 intfactor = round(ofactor);
		vec2 diff = ofactor/intfactor;
		float scan = mix(diff.y, diff.x, TATE);
		texcoord = Overscan(texcoord*(SourceSize.xy/InputSize.xy), scan, scan)*(InputSize.xy/SourceSize.xy);
		if (IOS == 1.0) texcoord = mix(vec2(TEX0.x, texcoord.y), vec2(texcoord.x, TEX0.y), TATE);
	}
   
	float factor  = 1.00 + (1.0-0.5*OS)*BLOOM/100.0 - lum*BLOOM/100.0;
	texcoord  = Overscan(texcoord*(SourceSize.xy/InputSize.xy), factor, factor)*(InputSize.xy/SourceSize.xy);
	vec2 pos  = Warp(texcoord*(TextureSize.xy/InputSize.xy))*(InputSize.xy/TextureSize.xy);
	vec2 pos0 = Warp(TEX0.xy*(TextureSize.xy/InputSize.xy))*(InputSize.xy/TextureSize.xy);
   
	vec2 coffset = vec2(0.5, 0.5);
	if ((interm == 1.0 || interm == 2.0) && inter <= mix(SourceSize.y, SourceSize.x, TATE)) coffset = vec2(0.5, 0.0);	
	
	vec2 ps = SourceSize.zw;
	vec2 OGL2Pos = pos * SourceSize.xy - coffset;
	vec2 fp = fract(OGL2Pos);
	
	vec2 dx = vec2(ps.x,0.0);
	vec2 dy = vec2(0.0, ps.y);
   
	// Reading the texels
	vec2 x2 = 2.0*dx;
	vec2 y2 = 2.0*dy;

	vec2 offx = dx;
	vec2 off2 = x2;
	vec2 offy = dy;
	float fpx = fp.x;
	if(TATE > 0.5)
	{
		offx = dy;
		off2 = y2;
		offy = dx;
		fpx = fp.y;
	}
	float  f = (TATE < 0.5) ? fp.y : fp.x;
	
	vec2 pC4 = floor(OGL2Pos) * ps + 0.5*ps;
	
	float zero = exp2(-h_sharp);   
	float sharp1 = s_sharp * zero;
	
	float wl3 = 2.0 + fpx;
	float wl2 = 1.0 + fpx;
	float wl1 =       fpx;
	float wr1 = 1.0 - fpx;
	float wr2 = 2.0 - fpx;
	float wr3 = 3.0 - fpx;

	wl3*=wl3; wl3 = exp2(-h_sharp*wl3);	
	wl2*=wl2; wl2 = exp2(-h_sharp*wl2);
	wl1*=wl1; wl1 = exp2(-h_sharp*wl1);
	wr1*=wr1; wr1 = exp2(-h_sharp*wr1);
	wr2*=wr2; wr2 = exp2(-h_sharp*wr2);
	wr3*=wr3; wr3 = exp2(-h_sharp*wr3);
	
	float fp1 = 1.-fpx;

	float twl3 = max(wl3 - sharp1, 0.0);
	float twl2 = max(wl2 - sharp1, mix(0.0,mix(-0.17, -0.025, fp.x),float(s_sharp > 0.05)));
	float twl1 = max(wl1 - sharp1, 0.0);
	float twr1 = max(wr1 - sharp1, 0.0);	
	float twr2 = max(wr2 - sharp1, mix(0.0,mix(-0.17, -0.025, 1.-fp.x),float(s_sharp > 0.05)));
	float twr3 = max(wr3 - sharp1, 0.0);
	
	float wtt = 1.0/(twl3+twl2+twl1+twr1+twr2+twr3);
	float wt  = 1.0/(wl2+wl1+wr1+wr2);
	bool sharp = (s_sharp > 0.05);
	
	vec3 l3 = COMPAT_TEXTURE(PassPrev4Texture, pC4 -off2).xyz;
	vec3 l2 = COMPAT_TEXTURE(PassPrev4Texture, pC4 -offx).xyz;
	vec3 l1 = COMPAT_TEXTURE(PassPrev4Texture, pC4      ).xyz;
	vec3 r1 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +offx).xyz;
	vec3 r2 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +off2).xyz;
	vec3 r3 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +offx+off2).xyz;
	
	vec3 sl2 = COMPAT_TEXTURE(Texture, pC4 -offx).xyz;
	vec3 sl1 = COMPAT_TEXTURE(Texture, pC4      ).xyz;
	vec3 sr1 = COMPAT_TEXTURE(Texture, pC4 +offx).xyz;
	vec3 sr2 = COMPAT_TEXTURE(Texture, pC4 +off2).xyz;
	
	vec3 color1 = (l3*twl3 + l2*twl2 + l1*twl1 + r1*twr1 + r2*twr2 + r3*twr3)*wtt;
	
	vec3 colmin = min(min(l1,r1), min(l2,r2));
	vec3 colmax = max(max(l1,r1), max(l2,r2));
	
	if (sharp) color1 = clamp(color1, colmin, colmax);
	
	vec3 gtmp = vec3(gamma_out*0.1); 
	vec3 scolor1 = color1;
	
	scolor1 = (sl2*wl2 + sl1*wl1 + sr1*wr1 + sr2*wr2)*wt;
	scolor1 = pow(scolor1, gtmp);	vec3 mcolor1 = scolor1;
	scolor1 = mix(color1, scolor1, spike);
	
	pC4+=offy;
	
	l3 = COMPAT_TEXTURE(PassPrev4Texture, pC4 -off2).xyz;
	l2 = COMPAT_TEXTURE(PassPrev4Texture, pC4 -offx).xyz;
	l1 = COMPAT_TEXTURE(PassPrev4Texture, pC4      ).xyz;
	r1 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +offx).xyz;
	r2 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +off2).xyz;
	r3 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +offx+off2).xyz;
	
	sl2 = COMPAT_TEXTURE(Texture, pC4 -offx).xyz;
	sl1 = COMPAT_TEXTURE(Texture, pC4      ).xyz;
	sr1 = COMPAT_TEXTURE(Texture, pC4 +offx).xyz;
	sr2 = COMPAT_TEXTURE(Texture, pC4 +off2).xyz;
	
	vec3 color2 = (l3*twl3 + l2*twl2 + l1*twl1 + r1*twr1 + r2*twr2 + r3*twr3)*wtt;
	
	colmin = min(min(l1,r1), min(l2,r2));
	colmax = max(max(l1,r1), max(l2,r2));
	
	if (sharp) color2 = clamp(color2, colmin, colmax);

	vec3 scolor2 = color2;
	
	scolor2 = (sl2*wl2 + sl1*wl1 + sr1*wr1 + sr2*wr2)*wt;
	scolor2 = pow(scolor2, gtmp);	vec3 mcolor2 = scolor2;
	scolor2 = mix(color2, scolor2, spike);
	
	vec3 color0 = color1;

	if ((interm == 1.0 || interm == 2.0) && inter <= mix(SourceSize.y, SourceSize.x, TATE))
	{
		pC4-= 2.*offy;
	
		l3 = COMPAT_TEXTURE(PassPrev4Texture, pC4 -off2).xyz;
		l2 = COMPAT_TEXTURE(PassPrev4Texture, pC4 -offx).xyz;
		l1 = COMPAT_TEXTURE(PassPrev4Texture, pC4      ).xyz;
		r1 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +offx).xyz;
		r2 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +off2).xyz;
		r3 = COMPAT_TEXTURE(PassPrev4Texture, pC4 +offx+off2).xyz;
	
		color0 = (l3*twl3 + l2*twl2 + l1*twl1 + r1*twr1 + r2*twr2 + r3*twr3)*wtt;
	
		colmin = min(min(l1,r1), min(l2,r2));
		colmax = max(max(l1,r1), max(l2,r2));
	
		if (sharp) color0 = clamp(color0, colmin, colmax);
	}
	
	// calculating scanlines
	
	float shape1 = mix(scanline1, scanline2, f);
	float shape2 = mix(scanline1, scanline2, 1.0-f);	
	
	float wt1 = st(f);
	float wt2 = st(1.0-f);

	vec3 color00 = color1*wt1 + color2*wt2;
	vec3 scolor0 = scolor1*wt1 + scolor2*wt2;
	vec3 mcolor  = (mcolor1*wt1 + mcolor2*wt2)/(wt1+wt2);
	
	vec3 ctmp = color00/(wt1+wt2);
	vec3 sctmp = scolor0/(wt1+wt2);
	
	vec3 tmp = pow(ctmp, vec3(1.0/gamma_out));
	mcolor = clamp(mix(ctmp, mcolor, 1.5),0.0,1.0);
	mcolor = pow(mcolor, vec3(1.4/gamma_out));
	
	vec3 w1 = vec3(0.0), w2 = vec3(0.0);	// initialize both weights, not just w2
	
	vec3 cref1 = mix(sctmp, scolor1, beam_size);
	vec3 cref2 = mix(sctmp, scolor2, beam_size);
	
	vec3 shift = vec3(-vertmask, vertmask, -vertmask);
	
	vec3 f1 = clamp(vec3(f) + shift*0.5*(1.0+f), 0.0, 1.0); 
	vec3 f2 = clamp(vec3(1.0-f) - shift*0.5*(2.0-f), 0.0, 1.0);
	
	if (gsl == 0.0) { w1 = sw0(f1,cref1,shape1); w2 = sw0(f2,cref2,shape2);} else
	if (gsl == 1.0) { w1 = sw1(f1,cref1,shape1); w2 = sw1(f2,cref2,shape2);} else
	if (gsl == 2.0) { w1 = sw2(f1,cref1,shape1); w2 = sw2(f2,cref2,shape2);}
	
	vec3 color = color1*w1 + color2*w2;
	color = min(color, 1.0);
	
	if (interm > 0.5 && inter <= mix(SourceSize.y, SourceSize.x, TATE)) 
	{
		float line_no  = floor(mod(mix(  OGL2Pos.y,  OGL2Pos.x, TATE),2.0));		
		float frame_no = floor(mod(float(global.FrameCount),2.0));
		
		if (interm == 1.0)
		{
			vec3 icolor1 = mix(color1, color0, abs(line_no-frame_no));
			vec3 icolor2 = mix(color1, color2, abs(line_no-frame_no));
			color = mix(icolor1, icolor2, f);
		} 
		else if (interm == 2.0)
		{
			float v0 = exp2(-2.25*2.25);			
			float v1 = exp2(-2.25*(0.5+f)*(0.5+f)) - v0;
			float v2 = exp2(-2.25*(0.5-f)*(0.5-f)) - v0;
			float v3 = exp2(-2.25*(1.5-f)*(1.5-f)) - v0;
			color = (v1*color0 + v2*color1 + v3*color2)/(v1+v2+v3);
		}
		else color = mix(color1, color2, f);
	}
	
	ctmp = 0.5*(ctmp+tmp);
	color*=mix(brightboost, brightboost1, max(max(ctmp.r,ctmp.g),ctmp.b));
   
	// Apply Mask
	
	vec3 orig1 = color;
	float pixbr = max(max(ctmp.r,ctmp.g),ctmp.b);
	vec3 orig = ctmp;
	w1 = w1+w2;
	float w3 = max(max(w1.r,w1.g),w1.b);
	vec3 cmask = vec3(1.0);
	vec3 cmask1 = cmask;
	vec3 one = vec3(1.0);
	
	cmask*= (TATE < 0.5) ? Mask(gl_FragCoord.xy * 1.000001,mcolor) :
		Mask(gl_FragCoord.yx * 1.000001,mcolor);
	
	color = color*cmask;
	
	color = min(color,1.0);
	
	cmask1 *= (TATE < 0.5) ? SlotMask(gl_FragCoord.xy * 1.000001,tmp) :
		SlotMask(gl_FragCoord.yx * 1.000001,tmp);		
	
	color = color*cmask1; cmask = cmask*cmask1; cmask = min(cmask, 1.0);
	
	vec3 Bloom = COMPAT_TEXTURE(PassPrev2Texture, pos).xyz;
   
	vec3 Bloom1 = 2.0*Bloom*Bloom;
	Bloom1 = min(Bloom1, 0.75);
	float bmax = max(max(Bloom1.r,Bloom1.g),Bloom1.b);
	float pmax = 0.825;
	Bloom1 = min(Bloom1, pmax*bmax)/pmax;
	
	Bloom1 = mix(min( Bloom1, color), Bloom1, 0.5*(orig1+color));
	
	Bloom1 = bloom*Bloom1;
	
	color = color + Bloom1;
	
	color = min(color, 1.0);
	if (interm < 0.5 || inter > mix(SourceSize.y, SourceSize.x, TATE)) color = declip(color, pow(w3,0.6));	
	color = min(color, mix(cmask,one,0.5));

	color = color + glow*Bloom;
		
	color = pow(color, vec3(1.0/gamma_out));
	
	FragColor = vec4(color*corner(pos0), 1.0);
}
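
For anyone following the interlacing part: the `interm == 2.0` branch above blends three lines (`color0`, `color1`, `color2`) with Gaussian-like weights. Here is a minimal Python sketch of just that weighting, so the numbers can be checked outside the shader. The function name `interlace_weights` is made up for illustration, and reading `color0` as the previous line and `color2` as the next line is my interpretation of the sampling offsets, not something the shader states.

```python
def interlace_weights(f):
    """Normalised weights for (color0, color1, color2) at sub-line position f in [0, 1].

    Mirrors the v0..v3 arithmetic of the interm == 2.0 branch; exp2(x) is 2.0 ** x.
    """
    v0 = 2.0 ** (-2.25 * 2.25)                  # falloff value at distance 1.5
    v1 = 2.0 ** (-2.25 * (0.5 + f) ** 2) - v0   # weight for color0 (assumed: previous line)
    v2 = 2.0 ** (-2.25 * (0.5 - f) ** 2) - v0   # weight for color1 (assumed: current line)
    v3 = 2.0 ** (-2.25 * (1.5 - f) ** 2) - v0   # weight for color2 (assumed: next line)
    total = v1 + v2 + v3
    return v1 / total, v2 / total, v3 / total


if __name__ == "__main__":
    for f in (0.0, 0.25, 0.5, 0.75, 1.0):
        w0, w1, w2 = interlace_weights(f)
        print(f"f={f:.2f}  color0={w0:.4f}  color1={w1:.4f}  color2={w2:.4f}")
```

Subtracting `v0` makes a line's weight drop to exactly zero once it is 1.5 lines away, so at `f = 0` only `color0` and `color1` contribute, and at `f = 1` only `color1` and `color2` do.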

Quick comparison of Nestopia’s “canonical” color palette vs “composite direct FBx”

Greens and reds are much better using composite direct FBx.

EDIT: the last SMB3 shot I posted was actually using “NTSC hardware.”

Oh yeah, I guess I should also mention that my backlight is at 100% for these settings. Otherwise the mask strength needs to come down quite a bit and/or mask cutoff needs to be increased.

Canonical:

Composite Direct FBx:

@Dogway

I think I spoke too soon… Something might be up with the LCD color space.

Does this seem correct…? Getting similar clipping with all gamma modes.

shaders = "1"
shader0 = "shaders_slang/misc/grade.slang"
filter_linear0 = "true"
wrap_mode0 = "clamp_to_border"
mipmap_input0 = "false"
alias0 = "WhitePointPass"
float_framebuffer0 = "false"
srgb_framebuffer0 = "false"
scale_type_x0 = "source"
scale_x0 = "1.000000"
scale_type_y0 = "source"
scale_y0 = "1.000000"
parameters = "g_space_out;g_gamma_out;g_gamma_in;g_gamma_type;g_vignette;g_vstr;g_vpower;g_crtgamut;g_hue_degrees;g_I_SHIFT;g_Q_SHIFT;g_I_MUL;g_Q_MUL;wp_temperature;g_sat;g_vibr;g_lum;g_cntrst;g_mid;g_lift;blr;blg;blb;wlr;wlg;wlb;rg;rb;gr;gb;br;bg;LUT_Size1;LUT1_toggle;LUT_Size2;LUT2_toggle"
g_space_out = "0.000000"
g_gamma_out = "2.200000"
g_gamma_in = "2.400000"
g_gamma_type = "1.000000"
g_vignette = "1.000000"
g_vstr = "40.000000"
g_vpower = "0.200000"
g_crtgamut = "4.000000"
g_hue_degrees = "0.000000"
g_I_SHIFT = "0.000000"
g_Q_SHIFT = "0.000000"
g_I_MUL = "1.000000"
g_Q_MUL = "1.000000"
wp_temperature = "9305.000000"
g_sat = "0.000000"
g_vibr = "0.000000"
g_lum = "0.000000"
g_cntrst = "0.000000"
g_mid = "0.500000"
g_lift = "0.000000"
blr = "0.000000"
blg = "0.000000"
blb = "0.000000"
wlr = "1.000000"
wlg = "1.000000"
wlb = "1.000000"
rg = "0.000000"
rb = "0.000000"
gr = "0.000000"
gb = "0.000000"
br = "0.000000"
bg = "0.000000"
LUT_Size1 = "16.000000"
LUT1_toggle = "0.000000"
LUT_Size2 = "64.000000"
LUT2_toggle = "0.000000"
textures = "SamplerLUT1;SamplerLUT2"
SamplerLUT1 = "shaders_slang/crt/shaders/guest/lut/sony_trinitron1.png"
SamplerLUT1_linear = "true"
SamplerLUT1_wrap_mode = "clamp_to_border"
SamplerLUT1_mipmap = "false"
SamplerLUT2 = "shaders_slang/crt/shaders/guest/lut/sony_trinitron2.png"
SamplerLUT2_linear = "true"
SamplerLUT2_wrap_mode = "clamp_to_border"
SamplerLUT2_mipmap = "false"
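
A rough sanity check on the gamma values in the preset above, as a small Python sketch. It assumes the shader does a plain power-law decode with `g_gamma_in` and a plain power-law encode with `g_gamma_out`. That is an assumption, since `g_gamma_type` may select a different curve, and `net_gamma` is a made-up helper, not a function from grade.slang.

```python
def net_gamma(value, gamma_in=2.4, gamma_out=2.2):
    """Decode with gamma_in, then re-encode with gamma_out (plain power laws only)."""
    linear = value ** gamma_in            # assumed decode step
    return linear ** (1.0 / gamma_out)    # assumed encode step


if __name__ == "__main__":
    # Net exponent is gamma_in / gamma_out, roughly 1.09 for 2.4 and 2.2.
    for v in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{v:.2f} -> {net_gamma(v):.4f}")
```

Under that assumption the gamma pair only darkens midtones slightly and leaves 0 and 1 where they are, so it should not clip on its own. If so, the clipping would more likely come from another stage, possibly the gamut or white point handling, but I have not verified that.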