There are some gamma correction bits in the gaussian blur fragments I stole from cgwg, but I think they get pretty much negated further down the line. With that in mind, I would recommend adding the gamma-corrected NTSC filter in this case, though it may not be necessary at very large scale factors, which is where this shader should really shine (along with the 480pvert LUT).
I see. I’ll tinker with it later.
It’s a shame that the shader requires such high resolutions, although I was able to get to 6x scale by combining two displays:
Sadly, I can’t go higher than 7x with aspect correction disabled, and I suspect it needs to go higher still. 480pvert.png doesn’t seem to work that well even at 6x or 7x. I want to at least get to 8x, but RetroArch doesn’t seem to want to expand the window size beyond the desktop resolution, which I guess is understandable, but dammit, I want to make a hi-res mock up of what the future looks like!
Another thing I want to bring up: what about scanlines? The shader doesn't appear to produce them. I know a lot of older, lower-quality shadow mask TVs hardly showed any at all, particularly small ones, so I guess this shader reproduces that look on a bigger scale, but for authenticity's sake I think there should be scanlines present. Could adding a pass that generates simple black scanlines before the shader does anything else work?
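Something along these lines is what I'm picturing, for what it's worth: just a rough sketch of a pre-pass that darkens every other source line with a sine wave (it assumes the default vertex passthrough fills in gl_TexCoord[0], and the darkening amount is a number I pulled out of the air):
<fragment filter="nearest" outscale="1.0"><![CDATA[
// Hypothetical scanline pre-pass: not part of the shader, just an illustration.
uniform sampler2D rubyTexture;
uniform vec2 rubyTextureSize;
void main()
{
   vec4 color = texture2D(rubyTexture, gl_TexCoord[0].xy);
   // One full sine period per source line; the troughs darken the gaps between lines.
   float scan = 0.5 + 0.5 * sin(gl_TexCoord[0].y * rubyTextureSize.y * 2.0 * 3.1415926);
   // 0.7 is an arbitrary darkening floor; tweak to taste.
   gl_FragColor = color * mix(0.7, 1.0, scan);
}
]]></fragment>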
Oh, and just for shits, I made an aperture grille LUT for use with this shader. This uses 3 pixels for each phosphor color, so this should work nicely at 3x, 6x, and 9x scale, I think. I made one that used 10 pixels like your other LUTs, but I decided replicating the grille was kinda redundant, so I went with this instead. I could go overboard and include the usual two stabilizing wires for max authenticity, but nah, I think not.
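Just to illustrate the layout: each group of three horizontal pixels in the LUT is one pure red, one pure green, and one pure blue column. If you generated the same pattern procedurally instead of reading it from a texture, it would look roughly like this (a throwaway sketch, not something I'm actually using):
<fragment filter="nearest"><![CDATA[
// Hypothetical procedural aperture grille, equivalent in spirit to the 3-pixel LUT.
uniform sampler2D rubyTexture;
void main()
{
   vec4 color = texture2D(rubyTexture, gl_TexCoord[0].xy);
   // Which of the three grille columns does this output pixel land on?
   int column = int(mod(gl_FragCoord.x, 3.0));
   vec3 grille = (column == 0) ? vec3(1.0, 0.0, 0.0)
               : (column == 1) ? vec3(0.0, 1.0, 0.0)
               :                 vec3(0.0, 0.0, 1.0);
   gl_FragColor = vec4(color.rgb * grille, 1.0);
}
]]></fragment>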
hey, 6x looks pretty good
At these low resolutions, I think leaving aspect uncorrected (i.e. 8:7) looks a lot better than stretching to 4:3, since it distorts the grid a bit, but that shouldn’t matter at the full 10x or whatever, I don’t think.
Sweet aperture grille! I was hoping people would want to make their own. I look forward to trying it out.
EDIT: oh yeah, I forgot to mention: yes, adding in scanlines at some point early in the chain would probably be a good idea. I’ll see if I can make that happen.
I'm thinking RetroArch is gonna need an option to pick the LUT and tell the shader to load it, rather than having to edit the shader manually to point it at the right file. That would probably mean another addition to the XML spec, though. Oh well, no biggie.
In any case, I cooked up another aperture grille-esque LUT, only this one is geared toward 4x scale at 4:3. I basically took CRT-Geom's concept of tinting pixels green and magenta and used the information in the shader to approximate it, although it's not perfect and the colors don't match between this shader and the other. Not quite sure where the difference between the two lies, but whatever. It's fine as is.
I added scanlines into the first pass and it looks quite nice with your crtgeom and aperture2 LUTs. Oddly, it doesn’t look any different with the LUTs I made, except for mangling some colors… As usual, I think that will be a different story at very large scales. Here’s the code for the scanline version (set to work with your crtgeom LUT by default):
<?xml version="1.0" encoding="UTF-8"?>
<!--
PhosphorLUT-scanlines v1.0
This shader uses an external lookup texture (LUT) to create a shadow mask with individual RGB phosphor lenses.
You can swap out the LUT by changing the 'file' attribute of the <texture> element below (line 11).
You can also uncomment the '//#define LOWRES' line in the final fragment if you're using this shader on a low-resolution display (i.e., 1080p or below) to fix the colors a bit.
Author: hunterk
License: GPL (contains code from other GPL shaders).
-->
<shader language="GLSL">
<texture id="phosphorLUT" file="crtgeom.png" filter="linear"/>
<vertex><![CDATA[
uniform vec2 rubyTextureSize;
uniform vec2 rubyInputSize;
uniform vec2 rubyOutputSize;
varying vec2 omega;
void main()
{
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
gl_TexCoord[0].xy = gl_MultiTexCoord0.xy;
gl_TexCoord[1].xy = gl_MultiTexCoord1.xy;
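// omega holds the sine frequencies for the scanline effect: roughly a two-output-pixel
// period horizontally and one full period per source line vertically.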
omega = vec2(3.1415 * rubyOutputSize.x * rubyTextureSize.x / rubyInputSize.x, 2.0 * 3.1415 * rubyTextureSize.y);
}
]]></vertex>
<fragment filter="linear" outscale="2.0"><![CDATA[
uniform sampler2D rubyTexture;
uniform sampler2D phosphorLUT;
uniform vec2 rubyTextureSize;
uniform vec2 rubyInputSize;
uniform vec2 rubyOutputSize;
varying vec2 omega;
const float base_brightness = 0.95;
const vec2 sine_comp = vec2(0.05, 0.85);
void main()
{
vec4 frame = texture2D(rubyTexture, gl_TexCoord[0].xy);
// Subtracting the inverted frame from the LUT sample lights up the phosphor cells
// according to the source color.
vec4 inverse = 1.0 - texture2D(rubyTexture, gl_TexCoord[0].xy);
vec4 screen = texture2D(phosphorLUT, gl_TexCoord[1].xy);
vec4 final = screen - inverse;
// Modulate by sines in x and y (weighted by sine_comp) to lay scanlines on top.
vec4 scanline = final * (base_brightness + dot(sine_comp * sin(gl_TexCoord[0].xy * omega), vec2(1.0)));
gl_FragColor = clamp(scanline, 0.0, 1.0);
//gl_FragColor = screen - inverse;
}
]]></fragment>
<fragment filter="linear" outscale="1.0"><![CDATA[
uniform sampler2D rubyTexture;
uniform vec2 rubyTextureSize;
uniform vec2 rubyInputSize;
uniform vec2 rubyOutputSize;
#define CRTgamma 2.5
#define display_gamma 2.0
#define TEX2D(c) pow(texture2D(rubyTexture,(c)),vec4(CRTgamma))
void main()
{
vec2 xy = gl_TexCoord[0].st;
float oney = 1.0/rubyTextureSize.y;
float wid = 2.0;
float c1 = exp(-1.0/wid/wid);
float c2 = exp(-4.0/wid/wid);
float c3 = exp(-9.0/wid/wid);
float c4 = exp(-16.0/wid/wid);
float norm = 1.0 / (1.0 + 2.0*(c1+c2+c3+c4));
vec4 sum = vec4(0.0);
sum += TEX2D(xy + vec2(0.0, -4.0 * oney)) * vec4(c4);
sum += TEX2D(xy + vec2(0.0, -3.0 * oney)) * vec4(c3);
sum += TEX2D(xy + vec2(0.0, -2.0 * oney)) * vec4(c2);
sum += TEX2D(xy + vec2(0.0, -1.0 * oney)) * vec4(c1);
sum += TEX2D(xy);
sum += TEX2D(xy + vec2(0.0, +1.0 * oney)) * vec4(c1);
sum += TEX2D(xy + vec2(0.0, +2.0 * oney)) * vec4(c2);
sum += TEX2D(xy + vec2(0.0, +3.0 * oney)) * vec4(c3);
sum += TEX2D(xy + vec2(0.0, +4.0 * oney)) * vec4(c4);
gl_FragColor = pow(sum*vec4(norm),vec4(1.0/display_gamma));
}
]]></fragment>
<fragment filter="linear" outscale="1.0"><![CDATA[
uniform sampler2D rubyTexture;
uniform vec2 rubyTextureSize;
uniform vec2 rubyInputSize;
uniform vec2 rubyOutputSize;
#define CRTgamma 2.5
#define display_gamma 2.0
#define TEX2D(c) pow(texture2D(rubyTexture,(c)),vec4(CRTgamma))
void main()
{
vec2 xy = gl_TexCoord[0].st;
float oney = 1.0/rubyTextureSize.y;
float wid = 6.0;
float c1 = exp(-1.0/wid/wid);
float c2 = exp(-4.0/wid/wid);
float c3 = exp(-9.0/wid/wid);
float c4 = exp(-16.0/wid/wid);
float norm = 1.0 / (1.0 + 2.0*(c1+c2+c3+c4));
vec4 sum = vec4(0.0);
sum += TEX2D(xy + vec2(0.0, -4.0 * oney)) * vec4(c4);
sum += TEX2D(xy + vec2(0.0, -3.0 * oney)) * vec4(c3);
sum += TEX2D(xy + vec2(0.0, -2.0 * oney)) * vec4(c2);
sum += TEX2D(xy + vec2(0.0, -1.0 * oney)) * vec4(c1);
sum += TEX2D(xy);
sum += TEX2D(xy + vec2(0.0, +1.0 * oney)) * vec4(c1);
sum += TEX2D(xy + vec2(0.0, +2.0 * oney)) * vec4(c2);
sum += TEX2D(xy + vec2(0.0, +3.0 * oney)) * vec4(c3);
sum += TEX2D(xy + vec2(0.0, +4.0 * oney)) * vec4(c4);
gl_FragColor = pow(sum*vec4(norm),vec4(1.0/display_gamma));
}
]]></fragment>
<vertex><![CDATA[
attribute vec2 rubyOrigTexCoord;
varying vec2 orig_tex;
void main()
{
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
orig_tex = rubyOrigTexCoord;
gl_TexCoord[1].xy = gl_MultiTexCoord1.xy;
}
]]></vertex>
<fragment filter="linear" outscale="1.0"><![CDATA[
uniform sampler2D rubyOrigTexture;
uniform sampler2D phosphorLUT;
varying vec2 orig_tex;
void main()
{
vec4 frame = texture2D(rubyOrigTexture, orig_tex);
vec4 inverse = 1.0 - texture2D(rubyOrigTexture, orig_tex);
vec4 screen = texture2D(phosphorLUT, gl_TexCoord[1].xy);
gl_FragColor = screen - inverse;
}
]]></fragment>
<vertex><![CDATA[
attribute vec2 rubyPass1TexCoord;
attribute vec2 rubyPass2TexCoord;
varying vec2 pass1_tex;
varying vec2 pass2_tex;
void main() {
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
gl_TexCoord[0] = gl_MultiTexCoord0;
pass1_tex = rubyPass1TexCoord;
pass2_tex = rubyPass2TexCoord;
}
]]></vertex>
<fragment filter="linear"><![CDATA[
uniform sampler2D rubyPass1Texture; // Result from Pass 1.
uniform sampler2D rubyPass2Texture; // Result from Pass 2.
uniform sampler2D rubyTexture; // Result from Pass 3 (previous pass).
varying vec2 pass1_tex;
varying vec2 pass2_tex;
//#define LOWRES
void main() {
vec4 pass1 = texture2D(rubyPass1Texture, pass1_tex);
vec4 pass2 = texture2D(rubyPass2Texture, pass2_tex);
vec4 pass3 = texture2D(rubyTexture, gl_TexCoord[0].xy);
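// Combine the passes with a screen blend, 1.0 - (1.0 - a) * (1.0 - b), which brightens
// the result; the non-LOWRES path subtracts from 0.85 instead of 1.0 to rein the whites in.
// (pass3 is sampled above but not actually used in either blend.)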
#ifdef LOWRES
gl_FragColor = 1.0 - (1.0 - pass1) * (1.0 - pass2) * (1.0 - pass2) * (1.0 - pass2);
#else
gl_FragColor = 0.85 - (1.0 - pass1) * (1.0 - pass2);
#endif
}
]]></fragment>
</shader>
Scanlines aside, I find the shader looks a bit too dark on default settings. Judging from your blog post, I assume that's intentional. I tried messing with the gamma options and changing the final pass to subtract from pass3 rather than pass2, which lifted the blacks to a more acceptable level for me, but then the white level becomes a little overbearing. What do you suggest?
Yeah, that's a tough nut to crack. I've generally been tacking on additional screen operations (the part that goes '* (1.0 - pass2)'), which brightens things up, but that can still lead to washed-out colors.
Eventually, I would like to be able to fine-tune the colors some, like I did with the Photoshop renders, as well as add some real gamma correction, but I'll have to work on that.
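For anyone who wants to experiment in the meantime, the final combine pass is where most of the brightness behavior lives. Here's a rough sketch of the kind of knobs I mean (the WHITE_LEVEL and OUT_GAMMA defines are placeholders I made up for illustration; they aren't in the released shader):
<fragment filter="linear"><![CDATA[
uniform sampler2D rubyPass1Texture;
uniform sampler2D rubyPass2Texture;
varying vec2 pass1_tex;
varying vec2 pass2_tex;
// Placeholder tweakables, not part of the actual shader.
#define WHITE_LEVEL 0.85  // lower = darker whites
#define OUT_GAMMA 1.1     // > 1.0 lifts the midtones a bit
void main()
{
   vec4 pass1 = texture2D(rubyPass1Texture, pass1_tex);
   vec4 pass2 = texture2D(rubyPass2Texture, pass2_tex);
   // Screen-style blend brightens; WHITE_LEVEL caps how hot the whites can get.
   vec4 blended = WHITE_LEVEL - (1.0 - pass1) * (1.0 - pass2);
   gl_FragColor = pow(clamp(blended, 0.0, 1.0), vec4(1.0 / OUT_GAMMA));
}
]]></fragment>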
I personally like how stuff looks overall on CRT-Geom. Colors look pretty accurate, with the only real distortions coming from the gamma correction and the phosphor emulation. The image doesn’t look too bright or too dark, and even if it does, you can adjust gamma correction and the scanline “weights” to brighten or darken the image as you wish.
Speaking of scanlines, I also like how CRT-Geom does them. It doesn't seem to just plaster lines on the screen; instead it does some fancy math so they emerge naturally. At least, that's what I gather.
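From poking at the source, the heart of it seems to be the scanlineWeights() function: the beam is a gaussian-ish falloff whose width grows with brightness, so bright pixels spill over the dark gap and the lines fall out of the math instead of being painted on. Roughly, paraphrasing the USEGAUSSIAN branch (so don't hold me to the constants):
// Intensity a scanline contributes at vertical distance 'dist' (in source lines),
// given that scanline's color at the current horizontal position.
vec4 scanlineWeights(float dist, vec4 color)
{
   vec4 wid = 0.3 + 0.1 * pow(color, vec4(3.0)); // brighter color = wider beam
   vec4 w = vec4(dist) / wid;
   // Dividing by wid keeps the total light per scanline roughly constant.
   return 0.4 * exp(-w * w) / wid;
}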
I hope this doesn’t come across as “make your shader work exactly like CRT-Geom with a few tweaks”. But damn, it did set the bar quite high.
Yeah, I totally agree. Really, my shader isn’t going to be worth using below like 10x scale (definitely not at the 4x scale I normally use; there’s too much weird distortion), and even then it will need more work to be a comprehensive solution, since all it really does is the phosphors / shadow mask, rather than all of the other good stuff that cgwg’s already does. Caligari’s scanline/rgb phosphor shader will be another one to revisit at large scales, since it can do phosphors and also does the whole ‘emergent scanlines’ effect like you described.
Basically, my shader does one thing theoretically well at some point in the future, while cgwg’s does a whole bunch of things demonstrably well right now.
For what it’s worth, I did manage to get an older version of RetroArch, which let me go into 10x scale. This is the result:
I think some weird scaling issue is happening near the top, but whatever.
Could there perhaps be a way to like, modify cgwg’s shader so that it uses your LUT for its phosphor emulation? Might be worth checking out. You can disable CRT-Geom’s phosphor emulation by commenting out the following lines, btw:
vec3 dotMaskWeights = mix( vec3(1.0, 0.7, 1.0), vec3(0.7, 1.0, 0.7), floor(mod(mod_factor, 2.0)) );
mul_res *= dotMaskWeights;
Ah, sweet! Thanks for posting that. It’s still a bit dark, of course, but I’m glad it actually looks pretty much like I had hoped.
Yes, it should be possible to add LUT phosphors to cgwg's shader. That's a good idea. I'll see what I can come up with.
If you get it working, I’ll be sure to test it. I imagine it would look pretty damn good at 10x scale.
It was surprisingly easy to get cgwg’s shader hooked in and it looks nice:
Here’s the code:
<?xml version="1.0" encoding="UTF-8"?>
<!--
PhosphorLUT-cgwg v1.1
This shader uses an external lookup texture (LUT) to create a shadow mask with individual RGB phosphor lenses.
cgwg's CRT shader code does most of the heavy lifting and my LUT code puts the shadow mask on top.
You can swap out the LUT for your own by changing the 'file' attribute of the <texture> element in line 11.
Author: hunterk
License: GPL (contains code from other GPL shaders).
-->
<shader language="GLSL">
<texture id="phosphorLUT" file="240pvert.png" filter="linear"/>
<vertex><![CDATA[
varying float CRTgamma;
varying float monitorgamma;
varying vec2 overscan;
varying vec2 aspect;
varying float d;
varying float R;
varying float cornersize;
varying float cornersmooth;
varying vec3 stretch;
varying vec2 sinangle;
varying vec2 cosangle;
uniform vec2 rubyInputSize;
uniform vec2 rubyTextureSize;
uniform vec2 rubyOutputSize;
varying vec2 texCoord;
varying vec2 one;
varying float mod_factor;
varying vec2 ilfac;
#define FIX(c) max(abs(c), 1e-5);
float intersect(vec2 xy)
{
float A = dot(xy,xy)+d*d;
float B = 2.0*(R*(dot(xy,sinangle)-d*cosangle.x*cosangle.y)-d*d);
float C = d*d + 2.0*R*d*cosangle.x*cosangle.y;
return (-B-sqrt(B*B-4.0*A*C))/(2.0*A);
}
vec2 bkwtrans(vec2 xy)
{
float c = intersect(xy);
vec2 point = vec2(c)*xy;
point -= vec2(-R)*sinangle;
point /= vec2(R);
vec2 tang = sinangle/cosangle;
vec2 poc = point/cosangle;
float A = dot(tang,tang)+1.0;
float B = -2.0*dot(poc,tang);
float C = dot(poc,poc)-1.0;
float a = (-B+sqrt(B*B-4.0*A*C))/(2.0*A);
vec2 uv = (point-a*sinangle)/cosangle;
float r = R*acos(a);
return uv*r/sin(r/R);
}
vec2 fwtrans(vec2 uv)
{
float r = FIX(sqrt(dot(uv,uv)));
uv *= sin(r/R)/r;
float x = 1.0-cos(r/R);
float D = d/R + x*cosangle.x*cosangle.y+dot(uv,sinangle);
return d*(uv*cosangle-x*sinangle)/D;
}
vec3 maxscale()
{
vec2 c = bkwtrans(-R * sinangle / (1.0 + R/d*cosangle.x*cosangle.y));
vec2 a = vec2(0.5,0.5)*aspect;
vec2 lo = vec2(fwtrans(vec2(-a.x,c.y)).x,
fwtrans(vec2(c.x,-a.y)).y)/aspect;
vec2 hi = vec2(fwtrans(vec2(+a.x,c.y)).x,
fwtrans(vec2(c.x,+a.y)).y)/aspect;
return vec3((hi+lo)*aspect*0.5,max(hi.x-lo.x,hi.y-lo.y));
}
void main()
{
// START of parameters
// gamma of simulated CRT
CRTgamma = 2.4;
// gamma of display monitor (typically 2.2 is correct)
monitorgamma = 2.2;
// overscan (e.g. 1.02 for 2% overscan)
overscan = vec2(1.00,1.00);
// aspect ratio
aspect = vec2(1.0, 0.75);
// lengths are measured in units of (approximately) the width of the monitor
// simulated distance from viewer to monitor
d = 2.0;
// radius of curvature
R = 1.5;
// tilt angle in radians
// (behavior might be a bit wrong if both components are nonzero)
const vec2 angle = vec2(0.0,0.0);
// size of curved corners
cornersize = 0.03;
// border smoothness parameter
// decrease if borders are too aliased
cornersmooth = 80.0;
// END of parameters
// Do the standard vertex processing.
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
gl_TexCoord[0].xy = gl_MultiTexCoord0.xy;
gl_TexCoord[1].xy = gl_MultiTexCoord1.xy;
// Precalculate a bunch of useful values we'll need in the fragment
// shader.
sinangle = sin(angle);
cosangle = cos(angle);
stretch = maxscale();
// Texture coords.
texCoord = gl_MultiTexCoord0.xy;
ilfac = vec2(1.0,floor(rubyInputSize.y/200.0));
// The size of one texel, in texture-coordinates.
one = ilfac / rubyTextureSize;
// Resulting X pixel-coordinate of the pixel we're drawing.
mod_factor = texCoord.x * rubyTextureSize.x * rubyOutputSize.x / rubyInputSize.x;
}
]]></vertex>
<fragment><![CDATA[
// Comment the next line to disable interpolation in linear gamma (and gain speed).
//#define LINEAR_PROCESSING
// Enable screen curvature.
//#define CURVATURE
// Enable 3x oversampling of the beam profile
#define OVERSAMPLE
// Use the older, purely gaussian beam profile
//#define USEGAUSSIAN
// Macros.
#define FIX(c) max(abs(c), 1e-5);
#define PI 3.141592653589
#ifdef LINEAR_PROCESSING
# define TEX2D(c) pow(texture2D(rubyTexture, (c)), vec4(CRTgamma))
#else
# define TEX2D(c) texture2D(rubyTexture, (c))
#endif
uniform sampler2D rubyTexture;
uniform vec2 rubyInputSize;
uniform vec2 rubyTextureSize;
uniform int rubyFrameCount;
varying vec2 texCoord;
varying vec2 one;
varying float mod_factor;
varying vec2 ilfac;
varying float CRTgamma;
varying float monitorgamma;
varying vec2 overscan;
varying vec2 aspect;
varying float d;
varying float R;
varying float cornersize;
varying float cornersmooth;
varying vec3 stretch;
varying vec2 sinangle;
varying vec2 cosangle;
float intersect(vec2 xy)
{
float A = dot(xy,xy)+d*d;
float B = 2.0*(R*(dot(xy,sinangle)-d*cosangle.x*cosangle.y)-d*d);
float C = d*d + 2.0*R*d*cosangle.x*cosangle.y;
return (-B-sqrt(B*B-4.0*A*C))/(2.0*A);
}
vec2 bkwtrans(vec2 xy)
{
float c = intersect(xy);
vec2 point = vec2(c)*xy;
point -= vec2(-R)*sinangle;
point /= vec2(R);
vec2 tang = sinangle/cosangle;
vec2 poc = point/cosangle;
float A = dot(tang,tang)+1.0;
float B = -2.0*dot(poc,tang);
float C = dot(poc,poc)-1.0;
float a = (-B+sqrt(B*B-4.0*A*C))/(2.0*A);
vec2 uv = (point-a*sinangle)/cosangle;
float r = FIX(R*acos(a));
return uv*r/sin(r/R);
}
vec2 transform(vec2 coord)
{
coord *= rubyTextureSize / rubyInputSize;
coord = (coord-vec2(0.5))*aspect*stretch.z+stretch.xy;
return (bkwtrans(coord)/overscan/aspect+vec2(0.5)) * rubyInputSize / rubyTextureSize;
}
float corner(vec2 coord)
{
coord *= rubyTextureSize / rubyInputSize;
coord = (coord - vec2(0.5)) * overscan + vec2(0.5);
coord = min(coord, vec2(1.0)-coord) * aspect;
vec2 cdist = vec2(cornersize);
coord = (cdist - min(coord,cdist));
float dist = sqrt(dot(coord,coord));
return clamp((cdist.x-dist)*cornersmooth,0.0, 1.0);
}
// Calculate the influence of a scanline on the current pixel.
//
// 'distance' is the distance in texture coordinates from the current
// pixel to the scanline in question.
// 'color' is the colour of the scanline at the horizontal location of
// the current pixel.
vec4 scanlineWeights(float distance, vec4 color)
{
// "wid" controls the width of the scanline beam, for each RGB channel
// The "weights" lines basically specify the formula that gives
// you the profile of the beam, i.e. the intensity as
// a function of distance from the vertical center of the
// scanline. In this case, it is gaussian if width=2, and
// becomes nongaussian for larger widths. Ideally this should
// be normalized so that the integral across the beam is
// independent of its width. That is, for a narrower beam
// "weights" should have a higher peak at the center of the
// scanline than for a wider beam.
#ifdef USEGAUSSIAN
vec4 wid = 0.3 + 0.1 * pow(color, vec4(3.0));
vec4 weights = vec4(distance / wid);
return 0.4 * exp(-weights * weights) / wid;
#else
vec4 wid = 2.0 + 2.0 * pow(color, vec4(4.0));
vec4 weights = vec4(distance / 0.3);
return 1.4 * exp(-pow(weights * inversesqrt(0.5 * wid), wid)) / (0.6 + 0.2 * wid);
#endif
}
void main()
{
// Here's a helpful diagram to keep in mind while trying to
// understand the code:
//
// | | | | |
// -------------------------------
// | | | | |
// | 01 | 11 | 21 | 31 | <-- current scanline
// | | @ | | |
// -------------------------------
// | | | | |
// | 02 | 12 | 22 | 32 | <-- next scanline
// | | | | |
// -------------------------------
// | | | | |
//
// Each character-cell represents a pixel on the output
// surface, "@" represents the current pixel (always somewhere
// in the bottom half of the current scan-line, or the top-half
// of the next scanline). The grid of lines represents the
// edges of the texels of the underlying texture.
// Texture coordinates of the texel containing the active pixel.
#ifdef CURVATURE
vec2 xy = transform(texCoord);
#else
vec2 xy = texCoord;
#endif
float cval = corner(xy);
// Of all the pixels that are mapped onto the texel we are
// currently rendering, which pixel are we currently rendering?
vec2 ilvec = vec2(0.0,ilfac.y > 1.5 ? mod(float(rubyFrameCount),2.0) : 0.0);
vec2 ratio_scale = (xy * rubyTextureSize - vec2(0.5) + ilvec)/ilfac;
#ifdef OVERSAMPLE
float filter = fwidth(ratio_scale.y);
#endif
vec2 uv_ratio = fract(ratio_scale);
// Snap to the center of the underlying texel.
xy = (floor(ratio_scale)*ilfac + vec2(0.5) - ilvec) / rubyTextureSize;
// Calculate Lanczos scaling coefficients describing the effect
// of various neighbour texels in a scanline on the current
// pixel.
vec4 coeffs = PI * vec4(1.0 + uv_ratio.x, uv_ratio.x, 1.0 - uv_ratio.x, 2.0 - uv_ratio.x);
// Prevent division by zero.
coeffs = FIX(coeffs);
// Lanczos2 kernel.
coeffs = 2.0 * sin(coeffs) * sin(coeffs / 2.0) / (coeffs * coeffs);
// Normalize.
coeffs /= dot(coeffs, vec4(1.0));
// Calculate the effective colour of the current and next
// scanlines at the horizontal location of the current pixel,
// using the Lanczos coefficients above.
vec4 col = clamp(mat4(
TEX2D(xy + vec2(-one.x, 0.0)),
TEX2D(xy),
TEX2D(xy + vec2(one.x, 0.0)),
TEX2D(xy + vec2(2.0 * one.x, 0.0))) * coeffs,
0.0, 1.0);
vec4 col2 = clamp(mat4(
TEX2D(xy + vec2(-one.x, one.y)),
TEX2D(xy + vec2(0.0, one.y)),
TEX2D(xy + one),
TEX2D(xy + vec2(2.0 * one.x, one.y))) * coeffs,
0.0, 1.0);
#ifndef LINEAR_PROCESSING
col = pow(col , vec4(CRTgamma));
col2 = pow(col2, vec4(CRTgamma));
#endif
// Calculate the influence of the current and next scanlines on
// the current pixel.
vec4 weights = scanlineWeights(uv_ratio.y, col);
vec4 weights2 = scanlineWeights(1.0 - uv_ratio.y, col2);
#ifdef OVERSAMPLE
uv_ratio.y =uv_ratio.y+1.0/3.0*filter;
weights = (weights+scanlineWeights(uv_ratio.y, col))/3.0;
weights2=(weights2+scanlineWeights(abs(1.0-uv_ratio.y), col2))/3.0;
uv_ratio.y =uv_ratio.y-2.0/3.0*filter;
weights=weights+scanlineWeights(abs(uv_ratio.y), col)/3.0;
weights2=weights2+scanlineWeights(abs(1.0-uv_ratio.y), col2)/3.0;
#endif
vec3 mul_res = (col * weights + col2 * weights2).rgb * vec3(cval);
// dot-mask emulation:
// Output pixels are alternately tinted green and magenta.
// vec3 dotMaskWeights = mix(
// vec3(1.0, 0.7, 1.0),
// vec3(0.7, 1.0, 0.7),
// floor(mod(mod_factor, 2.0))
// );
// mul_res *= dotMaskWeights;
// Convert the image gamma for display on our output device.
mul_res = pow(mul_res, vec3(1.0 / monitorgamma));
// Color the texel.
gl_FragColor = vec4(mul_res, 1.0);
}
]]></fragment>
<vertex><![CDATA[
attribute vec2 rubyOrigTexCoord;
varying vec2 orig_tex;
void main()
{
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
orig_tex = rubyOrigTexCoord;
gl_TexCoord[1].xy = gl_MultiTexCoord1.xy;
}
]]></vertex>
<fragment filter="linear" outscale="2.0"><![CDATA[
uniform sampler2D rubyPass1Texture;
uniform sampler2D phosphorLUT;
varying vec2 orig_tex;
void main()
{
vec4 frame = texture2D(rubyPass1Texture, orig_tex);
vec4 inverse = 1.0 - texture2D(rubyPass1Texture, orig_tex);
vec4 screen = texture2D(phosphorLUT, gl_TexCoord[1].xy);
gl_FragColor = screen - inverse;
}
]]></fragment>
<vertex><![CDATA[
attribute vec2 rubyPass1TexCoord;
attribute vec2 rubyPass2TexCoord;
varying vec2 pass1_tex;
varying vec2 pass2_tex;
void main() {
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
gl_TexCoord[0] = gl_MultiTexCoord0;
pass1_tex = rubyPass1TexCoord;
pass2_tex = rubyPass2TexCoord;
}
]]></vertex>
<fragment filter="linear"><![CDATA[
uniform sampler2D rubyPass1Texture; // Result from Pass 1.
uniform sampler2D rubyTexture; // Result from Pass 2 (previous pass).
varying vec2 pass1_tex;
void main() {
vec4 pass1 = texture2D(rubyPass1Texture, pass1_tex);
vec4 pass2 = texture2D(rubyTexture, gl_TexCoord[0].xy);
gl_FragColor = 1.0 - (1.0 - pass1) * (1.0 - pass2);
}
]]></fragment>
</shader>
Again, there's a bit of weird scaling at the top, but I'm guessing the older version of RetroArch I'm using for this shot is at fault.
This shot is at 10x, with gamma correction disabled and the scanline weights turned up a bit to lighten the lines, and the RGB filter on top.
Looks… OK. However, the phosphor effect doesn't look as natural as I hoped it would. It's missing something; I can't quite put my finger on it.
From the look of things, it seems like the cgwg portion of the shader is just doing its usual thing without really taking the phosphors into account. In the regular CRT-Geom shader, whites would still be tinted green and magenta, but in this one whites appear pure white, whereas a white pixel should be split into red, green, and blue cells. Also, in your shader the phosphors actually behave naturally, increasing and decreasing in intensity; here, everything is just kinda tinted red, green, and blue, with the exception of whites, as I said before.
This is what I’m getting from my limited testing. Maybe I’m slightly off somewhere.
Could it perhaps be possible to take cgwg's phosphor emulation code and adapt it to use the texture somehow? Maybe then it would work better.
I think you're right. I was taking the phosphorized image and then combining it with the result of cgwg's, and that's not really the right idea. I'll keep working on it.
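If anyone wants to play with it before I get a proper version together, the straightforward thing to try is to use the LUT multiplicatively, the way cgwg's dotMaskWeights works, instead of screen-subtracting a separate phosphorized image. Something like this as a pass right after cgwg's (untested sketch; it assumes gl_TexCoord[1] is carrying the LUT coordinates like in the shader above, and it will darken the picture, which is the brightness problem all over again):
<fragment filter="linear"><![CDATA[
// Hypothetical pass: multiply cgwg's output by the LUT so a white pixel gets
// split into separate R/G/B cells instead of staying pure white.
uniform sampler2D rubyTexture;   // previous pass (cgwg's output)
uniform sampler2D phosphorLUT;
void main()
{
   vec4 crt = texture2D(rubyTexture, gl_TexCoord[0].xy);
   vec4 lut = texture2D(phosphorLUT, gl_TexCoord[1].xy);
   gl_FragColor = crt * lut;
}
]]></fragment>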
The new shader is really looking fantastic.
So I decided to test the scanline variant of the original shader some more, but I found it always looks really dark no matter what, so I had an idea: why not use a pure white LUT to "disable" phosphor emulation and see how the scanlines really fare at 10x scale? (With an all-white LUT, the screen - inverse step just gives back the original frame, so the phosphor mask effectively drops out.)
Believe it or not, your shader's scanlines scale much, MUCH better than cgwg's. At 4x scale they look rather similar, since they both do that "bloom" effect around bright colors and such. At higher scales, though, it turns out cgwg's shader basically looks the same, just bigger, meaning that in that shader pixels either bloom or they don't. Yours, however, is much more dynamic: pixels bloom a little, some, a lot, or a ton. Looking at how this screen looks on my Sony TV, your scanlines behave very similarly to my TV's, definitely a closer match than cgwg's.
Now, if only we could fix the brightness issue…
This is really looking good. If these scanline options are integrated into a future release of RA, would they be included in the Wii build?
This is a user-defined shader; we do not bake things like this in by default (except on e.g. the PS3, where they're bundled in the package, but still not baked in).
No.
The Wii's graphics system has a legacy fixed-function pipeline (not surprising, since it's more or less identical to the GameCube's except for being clocked faster and having more TEV stages).
The PS3 and 360 (and every modern PC GPU, for that matter) have a programmable pipeline, which is what you need to run vertex and fragment shaders like this.
Unlike what tuedji and those other Wii guys think, there is a lot more to a CRT shader than just applying a 'curvature effect' with the TEV engine. So no, it really can't be done on the Wii. The Wii's CPU might be quite good and not really too bad compared against the Cell and Xenon (from a single-core perspective), but when it comes to GPUs there is no question about it: the Wii's is far inferior, not even in the running compared to the PS3/360.