New CRT shader from Guest + CRT Guest Advanced updates

rafan · 18 November 2020 11:08

Question about games / cores that run at double horizontal resolution. As we talked about previously, this happens sometimes in the PSX core, for example, and always in the PUAE core when resolution is set to Automatic. Let’s forget about the PSX core for now, as it -dynamically- switches between these resolutions (which is difficult to create a solution for).

I only focus on PUAE core, as it can be set to a fixed / predictable situation. I’ll explain below.

Because of shader picture quality I previously set PUAE core option Resolution to “Low 360px”.

So now I noticed that interlace screens don’t work well when I set the core option for video resolution from Automatic to Low 360px. But I would like interlace screens to work well.

For interlace screens to work properly I need to set the following core options:

Video > Resolution: Automatic
Video > Line Mode: Automatic

If you look closely, in most situations the above is equivalent to:

Video > Resolution: High 720px
Video > Line Mode: Single

and only when interlace happens does it switch the second to: Video > Line Mode: Double

Everything fine for interlace. But because Resolution is now set to “Automatic” it effectively is at “High 720px” for everything and then shader output is too sharp for most games, as we discussed. But I need to have this at automatic for interlace to work properly.

So now my question: your shader is interpolating 5 or 6 pixels horizontally for effects? Could you tell me what I need to adjust in the shader code to have it interpolate the double amount of pixels (for example 10-12 pixels), such that it works the same when PUAE core is set to resolution Automatic (effectively High 720px)? Since it is always at 720px, never dynamically switching like PSX core, this should work?

I would then like to keep this version / patch separately from the normal shader, so no need to worry about adding extra code to the main shader. I keep it purely as a patched shader for this core; all other cores use your normal shader. Or maybe, if it would be no burden to the shader, it’s even possible to add a switch that, internally in the shader, doubles the number of pixels it takes for the horizontal interpolation effect? That would be the most charming solution! I would only switch it on for the PUAE core and off for all other cores, and everything would be fine.

Please tell me it is possible to create this patch to accommodate the fixed double horizontal / 720px resolution.

guest.r · 18 November 2020 12:28

It’s 6 texels, increased from 5 to get better filtering with high res content and even some increase in speed related with auxiliary stuff. It’s not going to change though, because i recently managed to get the shader performance similar to the old version. And the filtering is different, would make most users angry.

But you can try the NTSC version first (has 12 texel filtering) and i can adapt it for you (without the ntsc passes). This could work out even nicer since the NTSC version has a nice horizontal deconvergence implemented. Need some feedback first lol.

Edit: almost forgot, you should replace the first two ntsc passes with the stock shader and set stock scalings to 1.0.

Edit: At the beginning of the crt-guest-dr-venom2-ntsc.slang main code part this line

vec4 SourceSize = global.OriginalSize * vec4(2.0, 1.0, 0.5, 1.0);

should be replaced with:

vec4 SourceSize = global.OriginalSize;
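
To see what the two lines differ by, here is a small Python sketch of the arithmetic. It assumes the usual slang convention that a size vec4 holds (width, height, 1/width, 1/height), and that the stock NTSC preset hands the final pass an image twice as wide as the original (4.0 then 0.5 horizontal scaling). The 360x240 frame and the helper names are made up for illustration.

```python
def size_vec4(width, height):
    """Build a slang-style size vector: (w, h, 1/w, 1/h)."""
    return (width, height, 1.0 / width, 1.0 / height)


def mul(a, b):
    """Component-wise vec4 multiply, like `a * b` in GLSL."""
    return tuple(x * y for x, y in zip(a, b))


original_size = size_vec4(360, 240)  # hypothetical core output

# With the NTSC passes in place the source is assumed to be 2x wider,
# so width is doubled and 1/width is halved.
ntsc_source = mul(original_size, (2.0, 1.0, 0.5, 1.0))

# With stock passes at scale 1.0 the source has the original size.
stock_source = original_size

print("with ntsc passes :", ntsc_source[0], "px wide")
print("with stock passes:", stock_source[0], "px wide")
```

If the multiplier were left in while the passes are stock, the shader would compute texel positions for a 720px source that is really 360px wide, which is presumably why the line has to change.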

rafan · 18 November 2020 23:23

Thank you this works superb! I made a before and after screenshot comparison, and the difference is practically negligible. This is with same sharpness setting (5.20) for both situations, great!

Note: The “vec4 SourceSize” adjustment does not seem necessary, as it makes the picture at sharpness 5.20 a bit too blurry. EDIT: I think I need to test that a bit further by raising sharpness and see how it then compares.

Would it be possible to add this as a switch to your NTSC shader, or call it “RGB” mode (composite/s-video/rgb) that effectively disables the first two passes? Or does it need a separate preset?

Regardless of the above: could you maybe add the following two features from your main shader to the NTSC shader?

Gamma correct
“Darken Scanline ‘edges’”

I guess it can’t hurt, only make things better?

EDIT: been playing around a bit more with the ntsc shader without the two ntsc passes for use with the PUAE core, and I think the default settings are looking pretty great, well done. I only changed the vertical glow radius by half, as I think the 2:1 pixel ratio also created this ratio for the glow, and with the V radius halved it looks nicely 1:1 hor/vert glow. Also interlaced screens look pretty great. All around very nice!

I’m not sure whether it would be possible, but of course the nicest thing would be if you could detect the source horizontal resolution and switch between the 6 and 12 texel interpolation accordingly (and maybe halve vert glow if in 12 texel mode), that could make it work for the dynamical resolution switching in the PSX core too?

EDIT2: did a test with the NTSC shader, replacing both NTSC passes with stock, but leaving the scaling factors for both of them as configured (4 and 0.5), and tried this on the SNES core. Just to compare with the regular shader, and I have to say I like the “NTSC-stripped” one quite a bit as an alternative to the regular shader. It is capable of giving a softer image without becoming blurry. Quite nifty.

Syh · 18 November 2020 14:16

Most likely needs a separate preset, afaik custom versions of the ntsc passes would have to be done for what you’re asking, mainly just a switch in the passes making them do nothing.

Nesguy · 18 November 2020 23:54

I think this is progress, still working on it though. Gray bars are still a bit weird. I like how the image stays nicely bright even with the increased mask strength, though. Still working without grade for now, since I broke it somehow and can’t figure out how to get it working again.

As always, scaling of the image has a pretty drastic effect on color/brightness/etc, only way to really judge settings is by firing up a game and seeing how it looks.

shaders = "10"
shader0 = "shaders_slang/crt/shaders/guest/lut/lut.slang"
filter_linear0 = "false"
wrap_mode0 = "clamp_to_border"
mipmap_input0 = "false"
alias0 = ""
float_framebuffer0 = "false"
srgb_framebuffer0 = "false"
scale_type_x0 = "source"
scale_x0 = "1.000000"
scale_type_y0 = "source"
scale_y0 = "1.000000"
shader1 = "shaders_slang/crt/shaders/guest/color-profiles.slang"
filter_linear1 = "true"
wrap_mode1 = "clamp_to_border"
mipmap_input1 = "false"
alias1 = ""
float_framebuffer1 = "false"
srgb_framebuffer1 = "false"
scale_type_x1 = "source"
scale_x1 = "1.000000"
scale_type_y1 = "source"
scale_y1 = "1.000000"
shader2 = "shaders_slang/crt/shaders/guest/d65-d50.slang"
filter_linear2 = "true"
wrap_mode2 = "clamp_to_border"
mipmap_input2 = "false"
alias2 = "WhitePointPass"
float_framebuffer2 = "false"
srgb_framebuffer2 = "false"
scale_type_x2 = "source"
scale_x2 = "1.000000"
scale_type_y2 = "source"
scale_y2 = "1.000000"
shader3 = "shaders_slang/crt/shaders/guest/afterglow.slang"
filter_linear3 = "true"
wrap_mode3 = "clamp_to_border"
mipmap_input3 = "false"
alias3 = "AfterglowPass"
float_framebuffer3 = "false"
srgb_framebuffer3 = "false"
scale_type_x3 = "source"
scale_x3 = "1.000000"
scale_type_y3 = "source"
scale_y3 = "1.000000"
shader4 = "shaders_slang/crt/shaders/guest/avg-lum.slang"
filter_linear4 = "true"
wrap_mode4 = "clamp_to_border"
mipmap_input4 = "true"
alias4 = "AvgLumPass"
float_framebuffer4 = "true"
srgb_framebuffer4 = "false"
scale_type_x4 = "source"
scale_x4 = "1.000000"
scale_type_y4 = "source"
scale_y4 = "1.000000"
shader5 = "shaders_slang/crt/shaders/guest/linearize.slang"
filter_linear5 = "true"
wrap_mode5 = "clamp_to_border"
mipmap_input5 = "false"
alias5 = "LinearizePass"
float_framebuffer5 = "true"
srgb_framebuffer5 = "false"
scale_type_x5 = "source"
scale_x5 = "1.000000"
scale_type_y5 = "source"
scale_y5 = "1.000000"
shader6 = "shaders_slang/crt/shaders/guest/linearize_scanlines.slang"
filter_linear6 = "true"
wrap_mode6 = "clamp_to_border"
mipmap_input6 = "false"
alias6 = "ScanPass"
float_framebuffer6 = "true"
srgb_framebuffer6 = "false"
scale_type_x6 = "source"
scale_x6 = "1.000000"
scale_type_y6 = "source"
scale_y6 = "1.000000"
shader7 = "shaders_slang/crt/shaders/guest/blur_horiz2.slang"
filter_linear7 = "true"
wrap_mode7 = "clamp_to_border"
mipmap_input7 = "false"
alias7 = ""
float_framebuffer7 = "true"
srgb_framebuffer7 = "false"
scale_type_x7 = "absolute"
scale_x7 = "800"
scale_type_y7 = "source"
scale_y7 = "1.000000"
shader8 = "shaders_slang/crt/shaders/guest/blur_vert2.slang"
filter_linear8 = "true"
wrap_mode8 = "clamp_to_border"
mipmap_input8 = "false"
alias8 = "GlowPass"
float_framebuffer8 = "true"
srgb_framebuffer8 = "false"
scale_type_x8 = "absolute"
scale_x8 = "800"
scale_type_y8 = "absolute"
scale_y8 = "600"
shader9 = "shaders_slang/crt/shaders/guest/crt-guest-dr-venom2.slang"
filter_linear9 = "true"
wrap_mode9 = "clamp_to_border"
mipmap_input9 = "false"
alias9 = ""
float_framebuffer9 = "false"
srgb_framebuffer9 = "false"
scale_type_x9 = "viewport"
scale_x9 = "1.000000"
scale_type_y9 = "viewport"
scale_y9 = "1.000000"
parameters = "TNTC;CP;CS;WP;wp_saturation;SW;AR;PR;AG;PG;AB;PB;sat;lsmooth;GAMMA_INPUT;SIZEH;GLOW_FALLOFF_H;SIZEV;GLOW_FALLOFF_V;glow;bloom;TATE;IOS;OS;BLOOM;brightboost;brightboost1;gsl;scanline1;scanline2;beam_min;beam_max;beam_size;decon;vertmask;scans;scansub;spike;h_sharp;s_sharp;csize;bsize;warpX;warpY;shadowMask;masksize;maskDark;maskLight;CGWG;mcut;slotmask;slotwidth;double_slot;slotms;inter;interm;gamma_out"
TNTC = "0.000000"
CP = "0.000000"
CS = "0.000000"
WP = "-50.000000"
wp_saturation = "1.000000"
SW = "1.000000"
AR = "0.070000"
PR = "0.050000"
AG = "0.070000"
PG = "0.050000"
AB = "0.070000"
PB = "0.050000"
sat = "0.100000"
lsmooth = "0.800000"
GAMMA_INPUT = "2.400000"
SIZEH = "4.000000"
GLOW_FALLOFF_H = "0.300000"
SIZEV = "4.000000"
GLOW_FALLOFF_V = "0.300000"
glow = "0.100000"
bloom = "0.000000"
TATE = "0.000000"
IOS = "0.000000"
OS = "1.000000"
BLOOM = "0.000000"
brightboost = "1.000000"
brightboost1 = "1.000000"
gsl = "2.000000"
scanline1 = "1.000000"
scanline2 = "20.000000"
beam_min = "2.000000"
beam_max = "1.000000"
beam_size = "0.000000"
decon = "0.000000"
vertmask = "0.000000"
scans = "0.500000"
scansub = "0.000000"
spike = "0.000000"
h_sharp = "3.000001"
s_sharp = "1.000000"
csize = "0.000000"
bsize = "600.000000"
warpX = "0.000000"
warpY = "0.000000"
shadowMask = "2.000000"
masksize = "1.000000"
maskDark = "0.500000"
maskLight = "1.000000"
CGWG = "0.300000"
mcut = "1.150000"
slotmask = "0.000000"
slotwidth = "2.000000"
double_slot = "1.000000"
slotms = "1.000000"
inter = "400.000000"
interm = "1.000000"
gamma_out = "2.200000"
textures = "SamplerLUT1;SamplerLUT2;SamplerLUT3"
SamplerLUT1 = "shaders_slang/crt/shaders/guest/lut/sony_trinitron1.png"
SamplerLUT1_linear = "true"
SamplerLUT1_wrap_mode = "clamp_to_border"
SamplerLUT1_mipmap = "false"
SamplerLUT2 = "shaders_slang/crt/shaders/guest/lut/sony_trinitron2.png"
SamplerLUT2_linear = "true"
SamplerLUT2_wrap_mode = "clamp_to_border"
SamplerLUT2_mipmap = "false"
SamplerLUT3 = "shaders_slang/crt/shaders/guest/lut/other1.png"
SamplerLUT3_linear = "true"
SamplerLUT3_wrap_mode = "clamp_to_border"
SamplerLUT3_mipmap = "false"
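
A preset this long is easier to inspect with a few lines of script. Below is a minimal Python sketch that reads `key = "value"` lines and prints each pass with its horizontal scale type. The excerpt is copied from the preset above; the function name `parse_preset` is just for illustration and is not part of RetroArch.

```python
import re

# A few lines copied from the preset above.
PRESET_EXCERPT = '''
shaders = "10"
shader7 = "shaders_slang/crt/shaders/guest/blur_horiz2.slang"
scale_type_x7 = "absolute"
scale_x7 = "800"
shader9 = "shaders_slang/crt/shaders/guest/crt-guest-dr-venom2.slang"
scale_type_x9 = "viewport"
scale_x9 = "1.000000"
h_sharp = "3.000001"
'''


def parse_preset(text):
    """Parse `key = "value"` (or unquoted `key = value`) lines into a dict."""
    entries = {}
    for line in text.splitlines():
        match = re.match(r'\s*(\w+)\s*=\s*"?([^"]*)"?\s*$', line)
        if match:
            entries[match.group(1)] = match.group(2).strip()
    return entries


preset = parse_preset(PRESET_EXCERPT)

# Walk the declared pass count and report the passes present in the excerpt.
for index in range(int(preset["shaders"])):
    path = preset.get(f"shader{index}")
    if path is not None:
        file_name = path.rsplit("/", 1)[-1]
        print(index, file_name, preset.get(f"scale_type_x{index}"))
```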

raise scanline shape low to 5.00:

guest.r · 19 November 2020 16:32

I worked on the crt-guest-dr-venom shaders some more and i think they are complete or very close to completion (alas regressions) for now.

The changes are mostly nice or useful, nothing that would intentionally mess things up.

Notable changes (compared with previous version):

afterglow reworked, from humbleness to something normal.
decent speed increase compared with the initial new release, 30% faster than early versions, 20% faster from last posted, for NTSC and standard version. Is on par with regular versions now.
hires version added, for hires content with horizontally doubled pixels and high resolution games in general. Has horizontal deconvergence feature due to nature of filtering.
brightboost has influence over mask clipping (couldn’t do this before).
brightboost dark can now better distribute brightness (reference gamma up to 1.4 from 1.25).
some default parameter tweaks, like sharpness, interlace resolution lowered to 350.
some cleanup in the ‘guest folder’
perhaps some minor tweaks.
The “Scanline darken ‘edges’” effect can be uncommented at parameters and later in shader.

Download link:

https://mega.nz/file/5xwhwSRR#ipaipUOxYR36LFzpMnCtOZ9yW9uteT-N5Knlfcbc3As

Feedback is welcome.

rafan · 19 November 2020 22:23

Quick question, what happened to the LUT pass in the presets?

My custom LUT doesn’t seem to work anymore, and I see you removed the LUT pass from the presets?

EDIT: ah I see you moved it to the “pre-shaders-…slang” pass. I just needed to set my lut size there. Back to some testing.

guest.r · 19 November 2020 23:21

Hey, i knew you’d figure it out!

Meanwhile i added 2 more ‘interlace’ modes 4&5 to the hi-res and ntsc versions. That was still on my mind to add. Also did some parameter range tweaking and cleanup.

Download link:

https://mega.nz/file/g0oXVAgb#7rwEzGqAbTBBxyxwfJ2-QwzVxUMmj8JOfVeJ37VjxWY

Edit: 4k support added for new interlaced modes (untested yet).

BendBombBoom · 20 November 2020 06:16

With Venom 2 it looks like any passes you add to the top of the stack don’t work.

Syh · 20 November 2020 06:26

Might have to do with how it works now, what are you trying to put at the end of the chain?

guest.r · 20 November 2020 08:01

Good catch! I was already on this, will add a stock as the first pass, which can be replaced and/or previous passes added.

rafan · 20 November 2020 09:46

Could you tell us the difference between the five types of interlace modes?

guest.r · 20 November 2020 15:05

Mode0 disables ‘interlacing’. Mode1 is the core mode, but only works with 50/60 fps content. Mode2 is a ‘baked’ Mode1 with no flickering, nice for 30fps content, screenshots or when convenient. Mode3 is ‘vertical linear filtering mode’, for when Mode2 is too blurry. Mode4-Mode12: best to test them, preferably on high res content.

Fixed now. It’s now possible to add custom prior passes or/and replace the first stock pass. ‘Interlace modes 6-12’ added (for different output resolutions convenience).

Download link

https://mega.nz/file/o9Qn1aBQ#Ir1pWTAAiAcddS8wmVsWXGBTm4Fa8vgMgWLRIiRRAmk

Edit: updated

rafan · 20 November 2020 14:03

Noticed a minor thing with the hires shader, the gamma out range is missing the 1.0 value in the range. This breaks gamma out as soon as you change the parameter in the shader.

#pragma parameter gamma_out "Gamma out" 1.8 5.0 0.05
#define gamma_out    params.gamma_out     // output gamma
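
As far as I know, slang parameter lines follow the pattern `#pragma parameter NAME "Description" INITIAL MIN MAX [STEP]`. Under that assumption the line above is read as initial 1.8, minimum 5.0, maximum 0.05, which would explain why the parameter breaks once touched. A small Python check along those lines (the function names are mine, and the minimum of 1.0 in the corrected line is taken from the remark above, not from the shader source):

```python
import shlex


def parse_parameter(line):
    """Split a `#pragma parameter` line into name, description and numbers."""
    tokens = shlex.split(line)  # keeps the quoted description as one token
    if tokens[:2] != ["#pragma", "parameter"]:
        raise ValueError("not a parameter line")
    name, description = tokens[2], tokens[3]
    numbers = [float(token) for token in tokens[4:]]
    return name, description, numbers


def range_problem(numbers):
    """Return a short description of a broken range, or None if it looks sane."""
    if len(numbers) < 3:
        return "need at least INITIAL MIN MAX"
    initial, minimum, maximum = numbers[:3]
    if not minimum <= initial <= maximum:
        return f"initial {initial} is outside [{minimum}, {maximum}]"
    return None


broken = '#pragma parameter gamma_out "Gamma out" 1.8 5.0 0.05'
fixed = '#pragma parameter gamma_out "Gamma out" 1.8 1.0 5.0 0.05'

for line in (broken, fixed):
    name, _, numbers = parse_parameter(line)
    print(name, numbers, "->", range_problem(numbers) or "ok")
```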

Been playing a bit more with the ntsc shader and quite like it.

Could you explain a bit what the scaling settings for the two passes below actually achieve (first scaling x by 4.0 and then halving it with 0.5) in combination with the 12 pixel filtering? Just curious how you came about it, as the effect is great; it gives a certain softness to the image without it becoming blurry.

shader3 = shaders/guest/crt-gdv-new/ntsc/ntsc-pass1.slang
shader4 = shaders/guest/crt-gdv-new/ntsc/ntsc-pass2.slang

filter_linear3 = false
filter_linear4 = false

scale_type_x3 = source
scale_type_y3 = source
scale_x3 = 4.0
scale_y3 = 1.0
frame_count_mod3 = 2
float_framebuffer3 = true

scale_type4 = source
scale_x4 = 0.5
scale_y4 = 1.0 
alias4 = NtscPass
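
For reference, the arithmetic of those two factors as a short Python sketch. It assumes both scales are relative to the previous pass’s output (scale type `source`), and the 256px input width is only a hypothetical SNES-like example.

```python
def chain_widths(source_width, x_scales):
    """Return the image width after each pass for source-relative x scales."""
    widths = []
    width = source_width
    for scale in x_scales:
        width = width * scale
        widths.append(width)
    return widths


# Pass 3 scales x by 4.0, pass 4 scales x by 0.5.
widths = chain_widths(256, [4.0, 0.5])
print(widths)                      # intermediate and final widths
print(widths[-1] / 256, "x net")   # net horizontal factor
```

The net factor of 2 matches the `vec4(2.0, 1.0, 0.5, 1.0)` multiplier on `SourceSize` quoted earlier in the thread, so the final pass sees twice as many horizontal texels as the core produced.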

With regards to gamma, I noticed you changed the default from 2.4/2.4 to 1.8/1.8, so I wondered why 1.8/1.8 is darker, whereas previously I assumed this gamma_input / gamma out was purely a gamma neutral operation “decoding/encoding” but apparently it’s not.

So my question below relates to gamma correction for “vintage game developer CRT gamma” versus my current LCD gamma. Suppose I have a gamma neutral output of the shader, but I would like to correct for the difference between the developer’s CRT monitor gamma and mine.

First case would not need adaptation: the developer was on a pro monitor (PVM), which is studio calibrated at gamma 2.2, versus my sRGB LCD monitor, which is also about 2.2.

But now suppose second case: Game developer developed end of the 80’s on a vintage CRT monitor, which has a gamma between 2.35 - 2.55 (depending on make/model)

So I would like to make an adjustment that would accurately describe the gamma correction of that developers CRT gamma versus my monitor. I.e. in extreme case I want to do a “2.2 / 2.55” correction or in least extreme case “2.2 / 2.35” gamma correction. So how do I do this in your shader?

I have four options:

Lower or raise gamma_input and gamma_out in tandem (the image brightens for in tandem up or darkens for in tandem down)
Raise or lower only gamma_input (darkens or brightens the image)
Raise or lower only gamma_out (brightens or darkens the image)
Use “Gamma correction” parameter (default is at 1.0, raising brightens image and vice versa)

Do you have an idea what is the most correct way to do an accurate gamma adjustment for the use case I put in bold above? I don’t want to end up with a correction which I think is theoretically correct, but in effect is actually breaking the gamma curve.

guest.r · 20 November 2020 15:21

A very useful find indeed! Maintaining 3 versions sometimes lets some bugs slip. It should be caught by the shader compiler though.

The ntsc passes don’t linearize input pixels by default, resulting in output which looks like it should under these circumstances. I was holding on to this look a bit by lowering the gamma. It should be increased to 2.2 though, since i thought about it a bit more.

Gamma is neutral, but there are more circumstances which involve color interpolation in the process. Glow has less influence with lower gamma, and the general look is affected by scanlines calculation - is interpolation with lower gamma, which produces a darker image. Also ‘color edges are darker’ etc.
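
A quick numeric illustration of that point, as a Python sketch rather than anything taken from the shader: a single colour survives a decode/encode round trip unchanged, but a 50/50 mix of black and white done in “linear” space comes out darker the lower the gamma is. The function name is made up for the example.

```python
def blend_in_gamma(a, b, t, gamma):
    """Decode a and b with `gamma`, mix them linearly, encode with the same gamma."""
    linear = (1.0 - t) * a ** gamma + t * b ** gamma
    return linear ** (1.0 / gamma)


for gamma in (1.8, 2.2, 2.4):
    untouched = blend_in_gamma(0.3, 0.3, 0.5, gamma)   # no real interpolation
    mixed = blend_in_gamma(0.0, 1.0, 0.5, gamma)       # black/white midpoint
    print(f"gamma {gamma}: flat colour {untouched:.3f}, midpoint {mixed:.3f}")
```

The flat colour stays at 0.3 for every gamma, while the midpoint rises with gamma, so 1.8/1.8 gives darker interpolated pixels than 2.4/2.4 even though both are neutral on paper.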

You can’t break the gamma curve this simply in the shader. In a pass-through situation you get (color^(2.4))^(1.0/2.2) = color^1.091 etc. Interpolation makes things less predictable, so it’s more a matter of taste. You should use input gamma of your liking and tweak output gamma together with other variables. Without interpolation and shader parameter influence it would be much more simple though.
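
The pass-through case can be checked in a couple of lines of Python. This only models the idealized situation with no interpolation, glow or mask in between; the function name is made up for the example.

```python
def pass_through(color, gamma_input, gamma_out):
    """Decode with gamma_input, then encode with gamma_out, nothing in between."""
    return (color ** gamma_input) ** (1.0 / gamma_out)


mid_grey = 0.5
for gamma_input, gamma_out in ((2.4, 2.4), (2.4, 2.2), (2.2, 2.55)):
    net_exponent = gamma_input / gamma_out
    result = pass_through(mid_grey, gamma_input, gamma_out)
    print(f"{gamma_input}/{gamma_out}: net exponent {net_exponent:.3f}, 0.5 -> {result:.3f}")
```

A net exponent above 1 darkens mid-tones and one below 1 brightens them, so 2.4/2.2 is slightly darker than neutral and 2.2/2.55 is noticeably brighter.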

It’s IMO a way to let later passes to do some interpolation, since you typically still can do a 2-3x resizing afterwards. And later shaders have acceptable circumstances and don’t have to downsample themselves.

The 12 pixel interpolations with accommodated coefficients is a bit different, but it gives very nice results with the ntsc preset indeed. That’s why i gave myself some work, because former approach could be improved out of the box.

Anyway, shader is updated at above dl. link.

rafan · 20 November 2020 22:24

Out of interest what is your reasoning currently for putting both at 2.2? Is it to be as gamma neutral to sRGB gamma monitors as possible?

guest.r · 20 November 2020 23:08

It’s not obligatory, but it’s more in the line with the standard ntsc preset character regarding horizontal filtering, but still very similar to the standard ntsc preset with other gamma related calculations.

Otherwise it’s tricky to assume direct comparisons with a shader that does a lot of stuff. But, to answer a previous question regarding different input/output gamma combinations - it will mess with saturation and it’s very hard to tell what the monitor will do, since it considers the gamma output neutral, like 2.2/2.2 or 2.4/2.4 etc. In other words, if you are happy with non-shaded output presentation of a game in Retroarch, then combinations like 2.2/2.55 will look much too bright and de-saturated.

rafan · 21 November 2020 09:18

I was thinking maybe one last thing about gamma could be considered.

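If I look at the wiki page for sRGB (https://en.wikipedia.org/wiki/SRGB) I see that the transfer function is actually only power gamma (2.4) from a certain luminance threshold, below which it is linear (you already are aware of these functions, but just for discussion purpose).

The actual sRGB forward transformation (gamma encoding step, linear to sRGB) is:

C_srgb = 12.92 * C_linear                      if C_linear <= 0.0031308
C_srgb = 1.055 * C_linear^(1/2.4) - 0.055      otherwise

and the reverse transformation (sRGB to linear) is:

C_linear = C_srgb / 12.92                      if C_srgb <= 0.04045
C_linear = ((C_srgb + 0.055) / 1.055)^2.4      otherwise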

So I was thinking since the user’s LCD displays these days use sRGB gamma, it may be more correct if the shader encoding function (gamma_out) doesn’t use the current pure power gamma, but uses above sRGB transfer function?

Since these sRGB transfer functions seem trivial to implement, it may be an interesting exercise to implement a switch for both the “gamma correction” and the “gamma_out”, making both switchable between pure power and sRGB gamma? That way we can observe how both power and sRGB transfer functions impact color fidelity in the low range.

I guess gamma_input (the decoding to linear space) should always use pure power, since that is what it was originally “encoded” at?

guest.r · 21 November 2020 10:01

There are reasons for such procedures indeed, also rooting a bit in 32 bit per color buffers and interpolation of very dark colors. But with libretro float frame buffers can be used, which opens a nice way for different gamma combinations. Sure, there are some philosophical issues with representable numbers in a 16 bit floating point environment. For example, value A is quite small and representable, value B is large and representable, but it happens that A + B = B (A+B is not representable) in a limited floating point environment, because of the numerical nature of the implementation.
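
The A + B = B effect is easy to reproduce. The Python sketch below rounds values to IEEE half precision with the standard `struct` module (format code `e`); the specific values 0.0001 and 1.0 are just picked for illustration and say nothing about what the shader’s buffers actually hold.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest 16 bit float and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


a = to_half(0.0001)   # small, but representable (non-zero) in half precision
b = to_half(1.0)      # large by comparison, exactly representable
total = to_half(a + b)

print("A     =", a)
print("B     =", b)
print("A + B =", total, "(same as B)" if total == b else "")
```

Near 1.0 the spacing between neighbouring half floats is about 0.001, so adding 0.0001 rounds straight back to 1.0.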

But the story itself is quite old even for Libretro shaders.

If i would use sRGB buffers after linearization, then i should use the transformation, otherwise floating framebuffers do quite fine.

rafan · 21 November 2020 10:48

Thanks for the answer. Only I seem to miss your point.

You seem to be saying that it doesn’t matter if you use the power function for encoding gamma at the end of your shader or use sRGB function (which is what LCD user monitor decodes at)?

So are you saying that it doesn’t matter if you do your current conversion/encoding at end of shader with the power function:

u = (Color / 255)
Power function: Color = (u^(1/Gamma_out))*255

Or you would do the sRGB gamma conversion (situation 2) at end of shader

u = (Color / 255)

If u <= 0.0031308
Color = (12.92u)*255
Else
Color = (1.055u^(1/2.4) - 0.055)*255
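
To put numbers on the difference between the two encodings, here is a small Python sketch working on normalized values (0..1). The sRGB constants are the standard ones; the choice of 2.2 for the pure power encode is an assumption for the comparison, not the shader’s fixed setting.

```python
def encode_power(u, gamma_out=2.2):
    """Pure power-law encode of a linear value u in 0..1."""
    return u ** (1.0 / gamma_out)


def encode_srgb(u):
    """Piecewise sRGB encode of a linear value u in 0..1."""
    if u <= 0.0031308:
        return 12.92 * u
    return 1.055 * u ** (1.0 / 2.4) - 0.055


for u in (0.001, 0.01, 0.1, 0.5, 1.0):
    power = encode_power(u)
    srgb = encode_srgb(u)
    print(f"linear {u:<5}: power {power * 255:6.1f}  sRGB {srgb * 255:6.1f}")
```

From mid-tones upward the two curves are within a couple of code values of each other; the gap is in the darkest range, where the linear segment of sRGB produces clearly lower values than the pure power curve.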

I think grade.slang has a similar switch, but I think it would be more accurate if this switch would be implemented at the stages in your shader, as mentioned in my previous post.

So are we not understanding each other? In my mind this is simple Color manipulation to accurately match the different gammas. I.e. raw emulator values are power gamma encoded (since created in the era of CRT), but what you send to the user’s LCD display is decoded with the sRGB gamma function. So what goes out to the user’s monitor should be encoded as such.

Is Dogway with his grade shader wrong on this also then?

Could you in light of the above give some more explanation why you think these conversions are not relevant in the context of your shader?