A new little shader I did (GLSL)

DariusG · 1 May 2023 06:01

It’s GLSL for the moment, in the crt folder.

DariusG · 1 May 2023 06:02

I was thinking: hey, what is missing from the library? A cool shader for playing at night.

DariusG · 4 May 2023 07:45

There are some devices, old phones etc., that have a 480p resolution. Let’s see which GLSL shaders work well with that screen (e.g. scanlines appear normally) at 2x scaling.

In bold are the ones these devices could most likely run:

crt-aperture
crt-nes-mini
crt-potato-warm
fake-crt-geom-potato
fake-crt-geom
fakelottes no mask no curvature
grits
phosphorlut
zfast-composite

I have such an old phone around that I tweak here and there just to see what can be done, and I created this. A mask is not actually needed on such a screen.

github.com

metallic77/shaders_glsl-slang/blob/main/htc_desire_hd.glsl

/*
    A hack of crt-pi - A Raspberry Pi friendly CRT shader.
    By DariusG 	
    Copyright (C) 2015-2016 davej

    This program is free software; you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by the Free
    Software Foundation; either version 2 of the License, or (at your option)
    any later version.

*/

#pragma parameter MASK "Mask brightness" 0.70 0.0 1.0 0.05
#pragma parameter SCANLINE_WEIGHT "Scanline weight" 0.5 0.0 1.0 0.05
#pragma parameter BLOOM "Bloom factor" 0.7 0.0 1.5 0.05

#define pi 6.28306


#ifdef GL_ES

This file has been truncated.

kokoko3k · 4 May 2023 17:35

Watch out for sign(), it is dog slow on my IGP.

Another trick you can try is dy = (dy * dy) * (dy * dy), which may trigger parallel computation.
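For anyone who wants to try a sign()-free variant, here is a minimal sketch. `signNoZero` is a hypothetical helper name, not something taken from the shader above, and unlike the built-in sign() it returns 1.0 (not 0.0) when x is exactly 0.0. Whether it is actually faster depends on the GPU and driver, so it needs benchmarking on the target device.

```glsl
// Hypothetical replacement for sign(x), built from step().
// step(0.0, x) is 0.0 for x < 0.0 and 1.0 otherwise,
// so the result is -1.0 for negative x and +1.0 for x >= 0.0.
float signNoZero(float x) {
    return step(0.0, x) * 2.0 - 1.0;
}
```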

DariusG · 4 May 2023 18:07

Thanks, I will check that! zfast avoids that by multiplying 3 times. Actually it is even smarter, using a “MAD” (multiply and add) to define the position. Smart coding there, he knows what he is doing.

kokoko3k · 4 May 2023 19:19

Yeah, I misread the code, sorry for that.

But you can still do (dy * dy) * (dy * 8.0), which should be equivalent to that specific code. Generally speaking, you can pack several variables into a vecN and operate on it against another vector, and there is a good chance the operations will be executed in parallel.
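A minimal sketch of both ideas, assuming four independent distances that all need the same math. The function names `cube8` and `pow4` are made up for illustration, and whether the vector form really runs in parallel depends on the GPU and the shader compiler.

```glsl
// Scalar form from the post: 8 * dy^3 written as two independent
// multiplies followed by one more, (dy * dy) * (dy * 8.0).
float cube8(float dy) {
    return (dy * dy) * (dy * 8.0);
}

// Vector form: pack four values into a vec4 and let the
// component-wise multiply handle all of them at once.
vec4 pow4(vec4 d) {
    vec4 d2 = d * d;   // square of each component
    return d2 * d2;    // (d * d) * (d * d), i.e. d^4 per component
}
```

Usage would look like `vec4 r = pow4(vec4(dy0, dy1, dy2, dy3));`, where the four inputs are whatever scalars the shader already has at hand.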

Also, I see you count cycles even in the vertex shader. That isn’t really important, since it is executed far fewer times (once per vertex) than the fragment shader (once per ‘pixel’). So the more things you move there (nothing that needs per-pixel granularity, of course), the more you gain.

Just yesterday I benchmarked dozens of native functions (that’s where I discovered how slow sign() is).

Jump into #programming-shaders on the RetroArch Discord server if you wish; coding for slow machines is so much fun.
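As a rough illustration of moving work out of the fragment stage, here is a vertex shader sketch that computes a per-texel step once per vertex and hands it over as a varying. The attribute, uniform and varying names follow common RetroArch GLSL conventions but are assumptions here, not taken from any specific shader in this thread.

```glsl
// Vertex stage only. All names below are assumed for illustration.
attribute vec4 VertexCoord;   // vertex position
attribute vec4 TexCoord;      // texture coordinate
uniform mat4 MVPMatrix;       // model-view-projection matrix
uniform vec2 TextureSize;     // size of the source texture in texels

varying vec2 vTexCoord;       // interpolated for the fragment shader
varying float pixel;          // width of one texel, constant across the quad

void main() {
    gl_Position = MVPMatrix * VertexCoord;
    vTexCoord = TexCoord.xy;
    // This division runs once per vertex instead of once per fragment.
    pixel = 1.0 / TextureSize.x;
}
```

The fragment shader then declares the same two varyings and uses `pixel` directly, with no division of its own.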

DariusG · 4 May 2023 19:17

Yeah, that’s really fun, trying to squeeze out as much as you can.

DariusG · 5 May 2023 05:34

Reduced to 7 cycles with this. pixel is 1.0/TextureSize.x, computed in the vertex shader. Filter set to nearest.

    vec3 color1 = 0.50*COMPAT_TEXTURE(Source,vTexCoord).rgb;
    vec3 color2 = 0.25*COMPAT_TEXTURE(Source,vTexCoord + vec2(pixel*blur,0.0)).rgb;
    vec3 color3 = 0.25*COMPAT_TEXTURE(Source,vTexCoord - vec2(pixel*blur,0.0)).rgb;
    vec3 color = (color1 + color2 + color3);

DariusG · 5 May 2023 05:45

Reduced to 3, and gamma fixed too. EDIT: actually 5 cycles.

vec3 color1 = 0.75*COMPAT_TEXTURE(Source,vTexCoord).rgb;
vec3 color2 = COMPAT_TEXTURE(Source,vTexCoord + vec2(pixel*blur,0.0)).rgb;
vec3 color3 = COMPAT_TEXTURE(Source,vTexCoord - vec2(pixel*blur,0.0)).rgb;
vec3 color = (color1 + 0.25*(color2 * color3)); //MAD

860 fps vs 900 stock. That’s a speed loss of under 5% (40 of 900 fps) with the filter and gamma fixed.

DariusG · 7 May 2023 15:53

Did a “light” version of crt-geom, check the differences. It keeps the scanlines of the original with the much faster filter by hyllian (about the same look, a bit sharper), etc.

crt-geom-light

crt-geom

Comparison with “fast” shaders

DariusG · 12 May 2023 06:01

It seems this line fixes “sine” scanlines for good (the brightness problem and the scanline alignment).

WEIGHT is the scanline parameter, 0.0 to 1.0.

Multiply res by:

WEIGHT * sin(fract(vTexCoord.y*SourceSize.y)*3.141593) + (1.0-WEIGHT);
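To make the behaviour of that factor easier to see, here is the same expression wrapped in a small function. `sineScanline` and its parameter names are hypothetical; only the formula comes from the post.

```glsl
// Scanline factor for one fragment.
// texY:         vertical texture coordinate (vTexCoord.y)
// sourceHeight: height of the source image in lines (SourceSize.y)
// weight:       scanline strength, 0.0 (off) to 1.0 (full)
float sineScanline(float texY, float sourceHeight, float weight) {
    // f runs from 0.0 to 1.0 across each source line.
    float f = fract(texY * sourceHeight);
    // sin(f * pi) is 0.0 at the line edges and 1.0 at the line centre,
    // so the factor is (1.0 - weight) at the edges and exactly 1.0
    // at the centre: full brightness is kept where the line peaks.
    return weight * sin(f * 3.141593) + (1.0 - weight);
}
```

In the fragment shader this would be used as `res *= sineScanline(vTexCoord.y, SourceSize.y, WEIGHT);`.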

DariusG · 16 May 2023 05:55

How tiny can a shader be and still be convincing?

    float OGL2Pos = vTexCoord.y*SourceSize.y;
    float cent = floor(OGL2Pos)+0.5;
    float ycoord = cent*SourceSize.w;

    vec3 res = texture2D(Source, vec2(vTexCoord.x, ycoord)).rgb;
    res *= SCANLINE_WEIGHT*sin(fract(OGL2Pos*0.999)*pi) + 1.0-SCANLINE_WEIGHT;

    float lum = dot(vec3(0.22,0.7,0.08), res);
    res *= mix(0.85,BLOOM, lum);

HyperspaceMadness · 16 May 2023 11:37

That’s pretty impressive! How does it look at low resolution, like 600p or 400p?

DariusG · 16 May 2023 12:12

It looks fine on my netbook at 2x integer scale and runs at about 100 fps with snes9x2005, and that has a 9 GFLOPS GPU. Faster than absolutely anything else. And it still has a filter, gamma correction and bloom.

DariusG · 25 May 2023 12:09

Another formula that creates PVM-look scanlines (I will probably inject this into crt-beam GLSL).

    #define SCANLINE 1.0
    // f is not defined in this snippet; it is assumed to be the fractional
    // position inside the current scanline (0.0 to 1.0).
    vec3 res = texture2D(Source, coords).rgb*vec3(1.0,0.93,1.18); // NTSC-like colors
    float lum = dot(res,vec3(0.15,0.55,0.10));
    res *= 1.0-(f-0.5)*(f-0.5)*45.0*(SCANLINE*(1.0-lum));

30-40% scanlines look better on low-brightness monitors.

kokoko3k · 25 May 2023 17:38

Nice and simple, but that formula doesn’t scale the scanline width linearly; it jumps a lot at the upper values. I’d try raising lum to a power of around 0.1.

It all depends on the intended input lum use case, of course.

gizmo98 · 25 May 2023 17:33

This looks really cool! Because well-defined scanlines cause a loss of brightness, a high-brightness panel (or an OLED) should be used. But that’s OK.

Just one question: white produces the biggest vertical spread, while 100% R, G or B on their own look smaller. Is this right? If I look at CRT screenshots, 100% white and a single 100% R, G or B have the same vertical spread. Maybe make lum a vec3?

DariusG · 25 May 2023 17:49

I can change lum to max r/g/b, and that will solve the width issue.

@kokoko3k I am open to suggestions/improvements.

DariusG · 25 May 2023 18:00

Here it is with the max r/g/b value in play. The weights are used to have some control, or else white (or full r/g/b) has no scanlines.

float lum = max(max(res.r * weightr,res.g * weightg),res.b * weightb);

kokoko3k · 25 May 2023 18:21

Why don’t you just make vec3 lum = res?

That way, every subpixel would have an independent height.

Next, for the width (height) thing, I’d try lum = pow(lum, vec3(0.1)) and see what it gives.

Edit:

You can feed the shader a linear black-to-white gradient and see if the apparent gamma is preserved (it could be altered by how the scanline width varies with lum, that’s the thing).
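Putting those suggestions together, a rough sketch could look like the following. The function names are hypothetical, `f` is assumed to be the fractional position inside the current scanline (0.0 to 1.0), and the clamp is an addition that is not in the original formula; it only keeps the factor from going negative on dark pixels.

```glsl
#define SCANLINE 1.0

// Per-channel variant of the PVM-style formula: each of r, g, b
// gets its own scanline height instead of sharing one lum value.
vec3 scanlinePerChannel(vec3 res, float f) {
    // Exponent 0.1 is the starting value suggested in the thread;
    // treat it as something to tune.
    vec3 lum = pow(res, vec3(0.1));
    float d = f - 0.5;
    vec3 dark = d * d * 45.0 * SCANLINE * (1.0 - lum);
    // Clamp added for this sketch only.
    return res * clamp(1.0 - dark, 0.0, 1.0);
}

// Test input: a linear black-to-white ramp from left to right,
// to be used in place of the texture fetch while checking gamma.
vec3 testGradient(vec2 uv) {
    return vec3(uv.x);
}
```

For the check, something like `vec3 res = scanlinePerChannel(testGradient(vTexCoord), f);` would replace the normal sampling, and the output ramp can then be compared by eye against the unprocessed gradient.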