Converting cgwg's CRT-Geom to .glsl or .cg

Maister · 15 January 2017 04:33

Does the shader in question have “half” types in it, by any chance? That is known to kill nVidia drivers with Cg. If that’s the case, it is a bug in the shader at any rate.

hunterk · 15 January 2017 04:33

Yeah, seems to have quite a few, in fact. Doing a find-and-replace of ‘half’ with ‘float’ in each of the individual shaders doesn’t seem to hurt their functioning at all. Does that keep it from crashing for you, xadox?

Maister · 15 January 2017 04:33

It’s only an issue on the input variables to the vertex shader, btw. Otherwise, half can be used.
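
For illustration, here is a minimal sketch of what that looks like in a RetroArch-style Cg shader. It is not taken from the CRT shader itself; the names oPosition and oTexCoord are placeholders. The vertex inputs are declared as float, while half is used only inside the fragment shader, where it is reportedly safe:

```cg
/* Minimal pass-through shader: float on vertex inputs, half elsewhere. */

void main_vertex
(
    float4 position : POSITION,      // float4, not half4
    float2 texCoord : TEXCOORD0,     // float2, not half2

    uniform float4x4 modelViewProj,

    out float4 oPosition : POSITION, // placeholder output names
    out float2 oTexCoord : TEXCOORD0
)
{
    oPosition = mul(modelViewProj, position);
    oTexCoord = texCoord;
}

float4 main_fragment
(
    float2 texCoord : TEXCOORD0,
    uniform sampler2D decal : TEXUNIT0
) : COLOR
{
    // half away from the vertex inputs should not trigger the driver problem
    half4 c = tex2D(decal, texCoord);
    return c;
}
```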

xadox · 15 January 2017 04:33

After replacing all “half” words with “float” it looks like it is working.

xadox · 15 January 2017 04:34

Does the shader also work on PS3?

Hyllian · 15 January 2017 04:35

I’ve got the latest crt-interlaced-halation shader in cg format.

But before putting it in the repo, I’d like some tests to be done to settle on the best default parameters.

For now, I don’t think I’m getting correct parameters on PS3 (I’m using the default ones).

Download here: https://anonfiles.com/file/85d04174fbd475e638d93670aa266430

It’s a 3-pass shader. The scale factors must be 1x, 1x and 3x (or higher).

hunterk · 15 January 2017 04:35

If you would like to make a flat version, comment out #define CURVATURE and then set corner size to something very small, like 0.001.
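
As a sketch, assuming the parameter block looks like the one in the pass2 shader posted further down (where cornersize defaults to 0.03), the two changes would be roughly:

```cg
// Enable screen curvature.
//#define CURVATURE                 // commented out: render a flat image

// size of curved corners
static float cornersize = 0.001;    // was 0.03; small enough that corners stay nearly square
```

Everything else in the shader stays as it is; with CURVATURE undefined, the fragment shader takes the texture coordinate directly instead of running it through bkwtrans().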

EDIT: Here’s a cgp for it:

shaders = 3
shader0 = crt-interlaced-halation-pass0.cg
shader1 = crt-interlaced-halation-pass1.cg
shader2 = crt-interlaced-halation-pass2.cg

filter_linear0 = false
scale_type0 = source

filter_linear1 = false
scale_type1 = source

filter_linear2 = false
scale_type2 = viewport

And here’s a version of the shader with the interlacing detection ifdefed, since stacking it with other shaders really wrecks the detection:

/* COMPATIBILITY
   - HLSL compilers
   - Cg   compilers
*/

/*
    CRT-interlaced-halation shader - pass2

    Like the CRT-interlaced shader, but adds a subtle glow around bright areas
    of the screen.

    Copyright (C) 2010-2012 cgwg, Themaister and DOLLS

    This program is free software; you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by the Free
    Software Foundation; either version 2 of the License, or (at your option)
    any later version.

    (cgwg gave their consent to have the original version of this shader
    distributed under the GPL in this message:

        http://board.byuu.org/viewtopic.php?p=26075#p26075

        "Feel free to distribute my shaders under the GPL. After all, the
        barrel distortion code was taken from the Curvature shader, which is
        under the GPL."
    )
*/

        // Comment the next line to disable interpolation in linear gamma (and
        // gain speed).
        #define LINEAR_PROCESSING

        // Enable screen curvature.
        #define CURVATURE

        // Enable 3x oversampling of the beam profile
        #define OVERSAMPLE

        // Use the older, purely gaussian beam profile
        //#define USEGAUSSIAN
        
        // Use interlacing detection; may interfere with other shaders if combined
        //#define INTERLACED

        // Macros.
        #define FIX(c) max(abs(c), 1e-5)
        #define PI 3.141592653589

        #ifdef LINEAR_PROCESSING
        #       define TEX2D(c) pow(tex2D(ORIG.texture, (c)), float4(CRTgamma))
        #else
        #       define TEX2D(c) tex2D(ORIG.texture, (c))
        #endif

                // START of parameters

                // gamma of simulated CRT
                static float CRTgamma = 2.4;
                // gamma of display monitor (typically 2.2 is correct)
                static float monitorgamma = 2.2;
                // overscan (e.g. 1.02 for 2% overscan)
                static float2 overscan = float2(1.00,1.00);
                // aspect ratio
                static float2 aspect = float2(1.0, 0.75);
                // lengths are measured in units of (approximately) the width
                // of the monitor
                // simulated distance from viewer to monitor
                static float d = 2.0;
                // radius of curvature
                static float R = 1.5;
                // tilt angle in radians
                // (behavior might be a bit wrong if both components are
                // nonzero)
                const static float2 angle = float2(0.0,-0.0);
                // size of curved corners
                static float cornersize = 0.03;
                // border smoothness parameter
                // decrease if borders are too aliased
                static float cornersmooth = 80.0;

                // END of parameters


        float intersect(float2 xy, float2 sinangle, float2 cosangle)
        {
                float A = dot(xy,xy)+d*d;
                float B = 2.0*(R*(dot(xy,sinangle)-d*cosangle.x*cosangle.y)-d*d);
                float C = d*d + 2.0*R*d*cosangle.x*cosangle.y;
                return (-B-sqrt(B*B-4.0*A*C))/(2.0*A);
        }

        float2 bkwtrans(float2 xy, float2 sinangle, float2 cosangle)
        {
                float c = intersect(xy, sinangle, cosangle);
                float2 point = float2(c)*xy;
                point -= float2(-R)*sinangle;
                point /= float2(R);
                float2 tang = sinangle/cosangle;
                float2 poc = point/cosangle;
                float A = dot(tang,tang)+1.0;
                float B = -2.0*dot(poc,tang);
                float C = dot(poc,poc)-1.0;
                float a = (-B+sqrt(B*B-4.0*A*C))/(2.0*A);
                float2 uv = (point-a*sinangle)/cosangle;
                float r = R*acos(a);
                return uv*r/sin(r/R);
        }

        float2 fwtrans(float2 uv, float2 sinangle, float2 cosangle)
        {
                float r = FIX(sqrt(dot(uv,uv)));
                uv *= sin(r/R)/r;
                float x = 1.0-cos(r/R);
                float D = d/R + x*cosangle.x*cosangle.y+dot(uv,sinangle);
                return d*(uv*cosangle-x*sinangle)/D;
        }

        float3 maxscale(float2 sinangle, float2 cosangle)
        {
                float2 c = bkwtrans(-R * sinangle / (1.0 + R/d*cosangle.x*cosangle.y), sinangle, cosangle);
                float2 a = float2(0.5,0.5)*aspect;
                float2 lo = float2(fwtrans(float2(-a.x,c.y), sinangle, cosangle).x,
                             fwtrans(float2(c.x,-a.y), sinangle, cosangle).y)/aspect;
                float2 hi = float2(fwtrans(float2(+a.x,c.y), sinangle, cosangle).x,
                             fwtrans(float2(c.x,+a.y), sinangle, cosangle).y)/aspect;
                return float3((hi+lo)*aspect*0.5,max(hi.x-lo.x,hi.y-lo.y));
        }

        // Calculate the influence of a scanline on the current pixel.
        //
        // 'distance' is the distance in texture coordinates from the current
        // pixel to the scanline in question.
        // 'color' is the colour of the scanline at the horizontal location of
        // the current pixel.
        float4 scanlineWeights(float distance, float4 color)
        {
                // "wid" controls the width of the scanline beam, for each RGB
                // channel. The "weights" lines basically specify the formula
                // that gives you the profile of the beam, i.e. the intensity as
                // a function of distance from the vertical center of the
                // scanline. In this case, it is gaussian if width=2, and
                // becomes nongaussian for larger widths. Ideally this should
                // be normalized so that the integral across the beam is
                // independent of its width. That is, for a narrower beam
                // "weights" should have a higher peak at the center of the
                // scanline than for a wider beam.
        #ifdef USEGAUSSIAN
                float4 wid = 0.3 + 0.1 * pow(color, float4(3.0));
                float4 weights = float4(distance / wid);
                return 0.4 * exp(-weights * weights) / wid;
        #else
                float4 wid = 2.0 + 2.0 * pow(color, float4(4.0));
                float4 weights = float4(distance / 0.3);
                return 1.4 * exp(-pow(weights * rsqrt(0.5 * wid), wid)) / (0.6 + 0.2 * wid);
        #endif
        }

struct orig
{
    float2 tex_coord;
    uniform float2 video_size;
    uniform float2 texture_size;
    uniform float2 output_size;
    uniform sampler2D texture;
};

struct input
{
    float2 video_size;
    float2 texture_size;
    float2 output_size;
    float frame_count;
    float frame_direction;
    float frame_rotation;
};

struct out_vertex {
    float4 position : POSITION;
    float4 color : COLOR;
    float2 texCoord : TEXCOORD0;
        float2 one;
        float mod_factor;
        float2 ilfac;
        float3 stretch;
        float2 sinangle;
        float2 cosangle;
};

/* VERTEX_SHADER */
out_vertex main_vertex
(
    float4 position : POSITION,
    float4 color : COLOR,
    float2 texCoord : TEXCOORD0,

    uniform float4x4 modelViewProj,
    orig ORIG,
    uniform input IN
)
{

    out_vertex OUT;

    OUT.position = mul(modelViewProj, position);
    OUT.color = color;


                // Precalculate a bunch of useful values we'll need in the fragment
                // shader.
                OUT.sinangle = sin(angle);
                OUT.cosangle = cos(angle);
                OUT.stretch = maxscale(OUT.sinangle, OUT.cosangle);
    OUT.texCoord = texCoord;


                OUT.ilfac = float2(1.0,floor(IN.video_size.y/200.0));

                // The size of one texel, in texture-coordinates.
                OUT.one = OUT.ilfac / ORIG.texture_size;

                // Resulting X pixel-coordinate of the pixel we're drawing.
                OUT.mod_factor = texCoord.x * ORIG.texture_size.x * IN.output_size.x / ORIG.video_size.x;

    return OUT;
}

/* FRAGMENT SHADER */
float4 main_fragment(in out_vertex VAR, uniform sampler2D decal : TEXUNIT0, orig ORIG, uniform input IN) : COLOR
{

/*        float2 transform(float2 coord)
        {
                coord *= ORIG.texture_size / ORIG.video_size;
                coord = (coord-float2(0.5))*aspect*stretch.z+stretch.xy;
                return (bkwtrans(coord)/overscan/aspect+float2(0.5)) * ORIG.video_size / ORIG.texture_size;
        }

        float corner(float2 coord)
        {
                coord *= ORIG.texture_size / ORIG.video_size;
                coord = (coord - float2(0.5)) * overscan + float2(0.5);
                coord = min(coord, float2(1.0)-coord) * aspect;
                float2 cdist = float2(cornersize);
                coord = (cdist - min(coord,cdist));
                float dist = sqrt(dot(coord,coord));
                return clamp((cdist.x-dist)*cornersmooth,0.0, 1.0);
        }
*/

                // Here's a helpful diagram to keep in mind while trying to
                // understand the code:
                //
                //  |      |      |      |      |
                // -------------------------------
                //  |      |      |      |      |
                //  |  01  |  11  |  21  |  31  | <-- current scanline
                //  |      | @    |      |      |
                // -------------------------------
                //  |      |      |      |      |
                //  |  02  |  12  |  22  |  32  | <-- next scanline
                //  |      |      |      |      |
                // -------------------------------
                //  |      |      |      |      |
                //
                // Each character-cell represents a pixel on the output
                // surface, "@" represents the current pixel (always somewhere
                // in the bottom half of the current scan-line, or the top-half
                // of the next scanline). The grid of lines represents the
                // edges of the texels of the underlying texture.

                // Texture coordinates of the texel containing the active pixel.
        #ifdef CURVATURE
                float2 cd = VAR.texCoord;
                cd *= ORIG.texture_size / ORIG.video_size;
                cd = (cd-float2(0.5))*aspect*VAR.stretch.z+VAR.stretch.xy;
                float2 xy =  (bkwtrans(cd, VAR.sinangle, VAR.cosangle)/overscan/aspect+float2(0.5)) * ORIG.video_size / ORIG.texture_size;

        #else
                float2 xy = VAR.texCoord;
        #endif
                float2 cd2 = xy;
                cd2 *= ORIG.texture_size / ORIG.video_size;
                cd2 = (cd2 - float2(0.5)) * overscan + float2(0.5);
                cd2 = min(cd2, float2(1.0)-cd2) * aspect;
                float2 cdist = float2(cornersize);
                cd2 = (cdist - min(cd2,cdist));
                float dist = sqrt(dot(cd2,cd2));
                float cval = clamp((cdist.x-dist)*cornersmooth,0.0, 1.0);

                float2 xy2 = ((xy*ORIG.texture_size/ORIG.video_size-float2(0.5))*float2(1.0,1.0)+float2(0.5))*IN.video_size/IN.texture_size;
                // Of all the pixels that are mapped onto the texel we are
                // currently rendering, which pixel are we currently rendering?
                #ifdef INTERLACED
                    float2 ilfloat = float2(0.0,VAR.ilfac.y > 1.5 ? fmod(float(IN.frame_count),2.0) : 0.0);
                    float2 ratio_scale = (xy * IN.texture_size - float2(0.5) + ilfloat)/VAR.ilfac;
                #else
                    float2 ratio_scale = xy * IN.texture_size - float2(0.5);
                #endif
                
        #ifdef OVERSAMPLE
                float filter = (IN.video_size / (IN.output_size * IN.texture_size)) * ratio_scale;
        #endif
                float2 uv_ratio = frac(ratio_scale);

                // Snap to the center of the underlying texel.
                #ifdef INTERLACED
                    xy = (floor(ratio_scale)*VAR.ilfac + float2(0.5) - ilfloat) / IN.texture_size;
                #else
                    xy = (floor(ratio_scale) + float2(0.5)) / IN.texture_size;
                #endif

                // Calculate Lanczos scaling coefficients describing the effect
                // of various neighbour texels in a scanline on the current
                // pixel.
                float4 coeffs = PI * float4(1.0 + uv_ratio.x, uv_ratio.x, 1.0 - uv_ratio.x, 2.0 - uv_ratio.x);

                // Prevent division by zero.
                coeffs = FIX(coeffs);

                // Lanczos2 kernel.
                coeffs = 2.0 * sin(coeffs) * sin(coeffs / 2.0) / (coeffs * coeffs);

                // Normalize.
                coeffs /= dot(coeffs, float4(1.0));

                // Calculate the effective colour of the current and next
                // scanlines at the horizontal location of the current pixel,
                // using the Lanczos coefficients above.
    float4 col  = clamp(mul(coeffs, float4x4(
                    TEX2D(xy + float2(-VAR.one.x, 0.0)),
                    TEX2D(xy),
                    TEX2D(xy + float2(VAR.one.x, 0.0)),
                    TEX2D(xy + float2(2.0 * VAR.one.x, 0.0)))),
            0.0, 1.0);
    float4 col2 = clamp(mul(coeffs, float4x4(
                    TEX2D(xy + float2(-VAR.one.x, VAR.one.y)),
                    TEX2D(xy + float2(0.0, VAR.one.y)),
                    TEX2D(xy + VAR.one),
                    TEX2D(xy + float2(2.0 * VAR.one.x, VAR.one.y)))),
            0.0, 1.0);


        #ifndef LINEAR_PROCESSING
                col  = pow(col , float4(CRTgamma));
                col2 = pow(col2, float4(CRTgamma));
        #endif

                // Calculate the influence of the current and next scanlines on
                // the current pixel.
                float4 weights  = scanlineWeights(uv_ratio.y, col);
                float4 weights2 = scanlineWeights(1.0 - uv_ratio.y, col2);
        #ifdef OVERSAMPLE
                uv_ratio.y =uv_ratio.y+1.0/3.0*filter;
                weights = (weights+scanlineWeights(uv_ratio.y, col))/3.0;
                weights2=(weights2+scanlineWeights(abs(1.0-uv_ratio.y), col2))/3.0;
                uv_ratio.y =uv_ratio.y-2.0/3.0*filter;
                weights=weights+scanlineWeights(abs(uv_ratio.y), col)/3.0;
                weights2=weights2+scanlineWeights(abs(1.0-uv_ratio.y), col2)/3.0;
        #endif
                float3 mul_res  = (col * weights + col2 * weights2).rgb;
                mul_res += pow(tex2D(decal, xy2).rgb, float3(monitorgamma))*0.1;
                mul_res *= float3(cval);

                // dot-mask emulation:
                // Output pixels are alternately tinted green and magenta.
                float3 dotMaskWeights = lerp(
                        float3(1.0, 0.7, 1.0),
                        float3(0.7, 1.0, 0.7),
                        floor(fmod(VAR.mod_factor, 2.0))
                    );

                mul_res *= dotMaskWeights;

                // Convert the image gamma for display on our output device.
                mul_res = pow(mul_res, float3(1.0 / monitorgamma));

                // Color the texel.
                return float4(mul_res, 1.0);
}

xadox · 15 January 2017 04:35

Great that the filter will come to PS3

Hyllian · 15 January 2017 04:35

Ok, managed to tweak some parameters and released on common-shaders.

xadox · 15 January 2017 04:35

Thx. The filter is looking great. Will there also be a variant with less halation?

Hyllian · 15 January 2017 04:35

Hmm, the level of halation is set in the first two passes.

Maybe I need to put some ifdefs there to control that level.

xadox · 15 January 2017 04:35

Just to be clear: I am using the filter preset in \common-shaders-master\crt\crt-interlaced-halation, since I haven’t found any other new CRT filter.

GPDP · 15 January 2017 04:39

So I’m in the process of switching from the deprecated XML version of this shader to the Cg one, but there’s one thing holding me back. On the XML version, I can artificially make it “sharper” by multiplying the Texture Size through the following little hack:

uniform vec2 rubyTextureSize;
vec2 TextureSize = vec2(rubyTextureSize.x, rubyTextureSize.y);

Now, I figure it could be a bit more elegant by adding a define or something, but whatever. In any case, I can’t seem to port this code over to the Cg version, since I am still pretty unfamiliar with its syntax, and I’m pretty much a noob at this as it is. Any ideas as to how I could achieve the same thing on the Cg shader? I’ve tried various things, but again, I don’t know the proper syntax, and the shader just fails to load.

Hyllian · 15 January 2017 04:39

You just need this line:


float2 TextureSize = IN.texture_size;

The uniform IN is already passed to vertex and fragment shaders, so there’s no need to declare it as in the xml version.

hunterk · 15 January 2017 04:39

I would need a bit more context to be sure, but you should be able to replace each vec2 with float2 and each rubyTextureSize with IN.texture_size. You don’t need to declare the uniform because it’s part of the input struct already.

GPDP · 15 January 2017 04:39

Bah, I just realized I screwed up the code I posted and didn’t make my objective clear. This is what the code I inserted in the XML version actually looks like:

vec2 TextureSize = vec2(2.0 * rubyTextureSize.x, rubyTextureSize.y);

What I want to do is multiply the Texture Size ONLY on the x axis by two, which makes the shader look sharper, while leaving the y axis alone. This works wonderfully on the XML shader, but I don’t know how to manage the same thing on the Cg one.

Hyllian · 15 January 2017 04:39


float2 TextureSize = float2(2.0*IN.texture_size.x, IN.texture_size.y);
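
If you want to tie that to a define, a sketch could look like this. SHARPER is a made-up name for illustration, and the code assumes texture_size is declared as float2 in the input struct:

```cg
// Hypothetical switch: comment out to get the unmodified texture size.
#define SHARPER

#ifdef SHARPER
    // Double the texture size on the x axis only; y is left alone.
    float2 TextureSize = float2(2.0 * IN.texture_size.x, IN.texture_size.y);
#else
    float2 TextureSize = IN.texture_size;
#endif
```

This has to sit inside main_vertex or main_fragment (or anywhere else IN is in scope), and the rest of the shader then uses TextureSize instead of IN.texture_size.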

GPDP · 15 January 2017 04:39

Tried it, got the following errors:

CRT-Geom-Flat.cg(78) : error C1031: swizzle mask element not present in operand "y"
CRT-Geom-Flat.cg(150) : error C1031: swizzle mask element not present in operand "y"
CRT-Geom-Flat.cg(169) : warning C7011: implicit cast from "float2" to "float"

Here’s the pastebin of the code as it currently stands:

http://pastebin.com/pAgR3ztN

A few things to note: on this shader, I deleted a bunch of code pertaining to things I don’t use, such as curvature, distortion, phosphor emulation, etc. so as to simplify it. I then tried to tie the sharpness setting to a define. If I comment out the define, the shader works like it should, without increased sharpness, which means I set up the define correctly. So the issue lies elsewhere with the code that you gave me, but I have no idea what the error log means.

Hyllian · 15 January 2017 04:39

There’s an error in this struct:


struct input      // These are things that can attach to 'IN' to achieve similar function to ruby* stuff in existing GLSL shaders. NOTE: the ruby* prefix is deprecated but still works in GLSL for legacy compatibility purposes.
{
  float2 video_size;
  float2 texCoord_size;
  float2 output_size;
  float frame_count;
  float frame_direction;
  float frame_rotation;
  float texture_size;
  sampler2D texture : TEXUNIT0;
};

texture_size should be float2 and not float.
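
With texture_size declared as a scalar float, there is no y component to select, which is what the "swizzle mask element not present" errors are complaining about. A corrected version of the struct, with only that one type changed and every other member copied as posted, would be:

```cg
struct input
{
  float2 video_size;
  float2 texCoord_size;
  float2 output_size;
  float frame_count;
  float frame_direction;
  float frame_rotation;
  float2 texture_size;            // was "float"; must be float2 so .x and .y exist
  sampler2D texture : TEXUNIT0;
};
```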

hunterk · 15 January 2017 04:39

>.< I’ve gotten that error before and never knew what was causing it. I’ve reused that bit in a lot of shaders, too…

FAKEDIT: I just fixed it in the conversion thread where I copy that boilerplate stuff from, so I don’t forget and make the same mistake again later.