Lcd-grid-v2.cg not working

I was hoping to use an LCD shader for the GBA and other handheld consoles, but any shaders that rely on lcd-grid-v2.cg (handheld/shaders/lcd_cgwg/) don’t work. No other shaders I’ve tried exhibit this problem. If I start with an unaffected shader and then choose lcd-grid-v2.cgp (or -gba-color.cgp, or any of the others), the first shader stays selected in the shader menu, and the image is displayed without any shader until “apply” is selected, which re-applies the original shader. If no shader is initially selected, nothing happens when v2 is chosen. This is the relevant section of the log:

RetroArch [INFO] :: Loading Cg meta-shader: c:\Program Files\Libretro\shaders\shaders_cg\handheld\lcd-grid-v2.cgp
RetroArch [INFO] :: Found #pragma parameter Colour of R subpixel: R (RSUBPIX_R) 1.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Colour of R subpixel: G (RSUBPIX_G) 0.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Colour of R subpixel: B (RSUBPIX_B) 0.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Colour of G subpixel: R (GSUBPIX_R) 0.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Colour of G subpixel: G (GSUBPIX_G) 1.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Colour of G subpixel: B (GSUBPIX_B) 0.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Colour of B subpixel: R (BSUBPIX_R) 0.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Colour of B subpixel: G (BSUBPIX_G) 0.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Colour of B subpixel: B (BSUBPIX_B) 1.000000 0.000000 1.000000 0.010000
RetroArch [INFO] :: Found #pragma parameter Gain (gain) 1.000000 0.500000 2.000000 0.050000
RetroArch [INFO] :: Found #pragma parameter LCD Gamma (gamma) 3.000000 0.500000 5.000000 0.100000
RetroArch [INFO] :: Found #pragma parameter Black level (blacklevel) 0.050000 0.000000 0.500000 0.010000
RetroArch [INFO] :: Found #pragma parameter Ambient (ambient) 0.000000 0.000000 0.500000 0.010000
RetroArch [INFO] :: Found #pragma parameter BGR (BGR) 0.000000 0.000000 1.000000 1.000000
RetroArch [INFO] :: Loading Cg shader: "c:\Program Files\Libretro\shaders\shaders_cg\handheld\shaders/lcd_cgwg/lcd-grid-v2.cg".
RetroArch [ERROR] :: CG error: The compile returned an error.
RetroArch [ERROR] :: Fragment:
c:\Program Files\Libretro\shaders\shaders_cg\handheld\shaders/lcd_cgwg/lcd-grid-v2.cg(141) : error C1115: unable to find compatible overloaded function "texelFetchOffset(sampler2D, int2, int, int2)"
c:\Program Files\Libretro\shaders\shaders_cg\handheld\shaders/lcd_cgwg/lcd-grid-v2.cg(142) : error C1115: unable to find compatible overloaded function "texelFetchOffset(sampler2D, int2, int, int2)"
c:\Program Files\Libretro\shaders\shaders_cg\handheld\shaders/lcd_cgwg/lcd-grid-v2.cg(143) : error C1115: unable to find compatible overloaded function "texelFetchOffset(sampler2D, int2, int, int2)"
c:\Program Files\Libretro\shaders\shaders_cg\handheld\shaders/lcd_cgwg/lcd-grid-v2.cg(144) : error C1115: unable to find compatible overloaded function "texelFetchOffset(sampler2D, int2, int, int2)"


RetroArch [ERROR] :: Failed to load shaders ...
RetroArch [INFO] :: CG: Destroying programs.
RetroArch [INFO] :: [Cg]: Vertex profile: arbvp1
RetroArch [INFO] :: [Cg]: Fragment profile: arbfp1
RetroArch [INFO] :: Loading stock Cg file.

Setup: RetroArch 1.3.6, Windows 8.1, AMD CPU/GPU

I don’t like the way lcd-grid.cgp looks, and it gives each pixel more than three subpixels, so I’d like to figure this out. Does v2 work for anyone else?

Complete log file is attached (I load Advance Wars without any shaders, try to apply lcd-grid-v2.cgp, then quit). log.zip (5.44 KB)

Can confirm, v2 does nothing for me either. On Win 7 with AMD GPU.

From what I’ve heard, lcd-grid-v2.cg will not work with any AMD GPUs, only Nvidia GPUs, apparently. I don’t know if there’s any way around that.

Can you guys try this modification?

#pragma parameter RSUBPIX_R "Colour of R subpixel: R" 1.0 0.0 1.0 0.01
#pragma parameter RSUBPIX_G "Colour of R subpixel: G" 0.0 0.0 1.0 0.01
#pragma parameter RSUBPIX_B "Colour of R subpixel: B" 0.0 0.0 1.0 0.01
#pragma parameter GSUBPIX_R "Colour of G subpixel: R" 0.0 0.0 1.0 0.01
#pragma parameter GSUBPIX_G "Colour of G subpixel: G" 1.0 0.0 1.0 0.01
#pragma parameter GSUBPIX_B "Colour of G subpixel: B" 0.0 0.0 1.0 0.01
#pragma parameter BSUBPIX_R "Colour of B subpixel: R" 0.0 0.0 1.0 0.01
#pragma parameter BSUBPIX_G "Colour of B subpixel: G" 0.0 0.0 1.0 0.01
#pragma parameter BSUBPIX_B "Colour of B subpixel: B" 1.0 0.0 1.0 0.01
#pragma parameter gain      "Gain"                    1.0 0.5 2.0 0.05
#pragma parameter gamma     "LCD Gamma"               3.0 0.5 5.0 0.1
#pragma parameter blacklevel "Black level"            0.05 0.0 0.5 0.01
#pragma parameter ambient   "Ambient"                 0.0 0.0 0.5 0.01
#pragma parameter BGR       "BGR"                     0 0 1 1
#ifdef PARAMETER_UNIFORM
uniform float RSUBPIX_R;
uniform float RSUBPIX_G;
uniform float RSUBPIX_B;
uniform float GSUBPIX_R;
uniform float GSUBPIX_G;
uniform float GSUBPIX_B;
uniform float BSUBPIX_R;
uniform float BSUBPIX_G;
uniform float BSUBPIX_B;
uniform float gain;
uniform float gamma;
uniform float blacklevel;
uniform float ambient;
uniform float BGR;
#else
#define RSUBPIX_R 1.0
#define RSUBPIX_G 0.0
#define RSUBPIX_B 0.0
#define GSUBPIX_R 0.0 // matches the #pragma parameter default above
#define GSUBPIX_G 1.0
#define GSUBPIX_B 0.0
#define BSUBPIX_R 0.0
#define BSUBPIX_G 0.0
#define BSUBPIX_B 1.0
#define gain 1.0
#define gamma 3.0
#define blacklevel 0.05
#define ambient 0.0
#define BGR 0.0
#endif
 
 
#define outgamma 2.2
 
void main_vertex
(
        float4 position : POSITION,
        float2 texCoord : TEXCOORD0,
 
    uniform float4x4 modelViewProj,
 
        out float4 oPosition : POSITION,
        out float2 otexCoord : TEXCOORD
)
{
        oPosition = mul(modelViewProj, position);
        otexCoord = texCoord;
}
 
struct input
{
  float2 video_size;
  float2 texCoord_size;
  float2 output_size;
  float frame_count;
  float frame_direction;
  float frame_rotation;
  float2 texture_size;
  sampler2D texture : TEXUNIT0;
};
 
struct output
{
  float4 col    : COLOR;
};
 
// integral of (1 - x^2 - x^4 + x^6)^2
const float coeffs_x[] = float[](1.0, -2.0/3.0, -1.0/5.0, 4.0/7.0, -1.0/9.0, -2.0/11.0, 1.0/13.0);
// integral of (1 - 2x^4 + x^6)^2
const float coeffs_y[] = float[](1.0,      0.0, -4.0/5.0, 2.0/7.0,  4.0/9.0, -4.0/11.0, 1.0/13.0);
float intsmear_func(float z, float coeffs[7])
{
  float z2 = z*z;
  float zn = z;
  float ret = 0.0;
  for (int i = 0; i < 7; i++) {
    ret += zn*coeffs[i];
    zn *= z2;
  }
  return ret;
}
float intsmear(float x, float dx, float d, float coeffs[7])
{
  float zl = clamp((x-dx*0.5)/d,-1.0,1.0);
  float zh = clamp((x+dx*0.5)/d,-1.0,1.0);
  return d * ( intsmear_func(zh,coeffs) - intsmear_func(zl,coeffs) )/dx;
}
 
output main_fragment(in float2 texCoord : TEXCOORD0,
uniform input IN,
uniform sampler2D texture : TEXUNIT0
)
{
  float2 texelSize = 1.0 / IN.texture_size;
  float2 range;
  range = IN.video_size / (IN.output_size * IN.texture_size);
  //float2 range = sourceSize[0].xy / (targetSize.xy * sourceSize[0].xy);
 
  float3 cred   = pow(float3(RSUBPIX_R, RSUBPIX_G, RSUBPIX_B), float3(outgamma));
  float3 cgreen = pow(float3(GSUBPIX_R, GSUBPIX_G, GSUBPIX_B), float3(outgamma));
  float3 cblue  = pow(float3(BSUBPIX_R, BSUBPIX_G, BSUBPIX_B), float3(outgamma));
 
  int2 tli = int2(floor(texCoord/texelSize-float2(0.4999)));
 
  float3 lcol, rcol;
  float subpix = (texCoord.x/texelSize.x - 0.4999 - float(tli.x))*3.0;
  float rsubpix = range.x/texelSize.x * 3.0;
  lcol = float3(intsmear(subpix+1.0,rsubpix, 1.5, coeffs_x),
              intsmear(subpix    ,rsubpix, 1.5, coeffs_x),
              intsmear(subpix-1.0,rsubpix, 1.5, coeffs_x));
  rcol = float3(intsmear(subpix-2.0,rsubpix, 1.5, coeffs_x),
              intsmear(subpix-3.0,rsubpix, 1.5, coeffs_x),
              intsmear(subpix-4.0,rsubpix, 1.5, coeffs_x));
  if (BGR > 0.5) {
    lcol.rgb = lcol.bgr;
    rcol.rgb = rcol.bgr;
  }
  float tcol, bcol;
  subpix = texCoord.y/texelSize.y - 0.4999 - float(tli.y);
  rsubpix = range.y/texelSize.y;
  tcol = intsmear(subpix    ,rsubpix, 0.63, coeffs_y);
  bcol = intsmear(subpix-1.0,rsubpix, 0.63, coeffs_y);
 
  float3 topLeftColor     = ((pow(float3(gain) * tex2D(texture, texCoord - 0.25 * texelSize * int2(0,0)).rgb + float3(blacklevel), float3(gamma)) + float3(ambient))) * lcol * float3(tcol);
  float3 bottomRightColor = ((pow(float3(gain) * tex2D(texture, texCoord - 0.25 * texelSize * int2(1,1)).rgb + float3(blacklevel), float3(gamma)) + float3(ambient))) * rcol * float3(bcol);
  float3 bottomLeftColor  = ((pow(float3(gain) * tex2D(texture, texCoord - 0.25 * texelSize * int2(0,1)).rgb + float3(blacklevel), float3(gamma)) + float3(ambient))) * lcol * float3(bcol);
  float3 topRightColor    = ((pow(float3(gain) * tex2D(texture, texCoord - 0.25 * texelSize * int2(1,0)).rgb + float3(blacklevel), float3(gamma)) + float3(ambient))) * rcol * float3(tcol);
 
  float3 averageColor = topLeftColor + bottomRightColor + bottomLeftColor + topRightColor;
 
  averageColor = mul(averageColor, mat3x3(cred, cgreen, cblue));
 
  output OUT;
  OUT.col = float4(pow(averageColor,float3(1.0/outgamma)),0.0);
  return OUT;
}

Just tried out the modified shader, and it works on my AMD GPU now. Thanks!

Alright, I pushed it up to the repo, so everyone should be able to fetch it from the online updater soon.

Sorry, but it’s not the same: the subpixels are out of place, and the red ones are especially visible:

comparison

Try changing the offset from 0.25 to 0.125.

Even if there’s a slight quality loss on Nvidia, I think having it at least compile on Intel/AMD is more important.

Just tried 0.125. It doesn’t help here on Nvidia. Everything is misaligned really.
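
One possible reading of the misalignment, offered as a guess rather than a confirmed diagnosis: the compile error in the log shows the original shader calling texelFetchOffset(sampler2D, int2, int, int2), a GLSL-style fetch that reads one whole texel at integer coordinates plus an integer offset, with no filtering. The modified taps instead shift an ordinary tex2D lookup by a fraction of a texel (and in the negative direction), so an offset of int2(1,1) no longer selects the neighbouring texel, whether the fraction is 0.25 or 0.125. Below is a minimal sketch of a closer emulation that uses only tex2D. It assumes the original call fetched from tli at mip level 0 (the original source isn't shown in this thread), and the helper name fetch_texel is made up for illustration. It has not been tested on either vendor's hardware.

```cg
// Hypothetical helper, not part of the posted shader: read the texel at
// integer position (tli + offset) through an ordinary tex2D lookup.
float3 fetch_texel(sampler2D tex, float2 tli, float2 offset, float2 texelSize)
{
  // Texel (i, j) is centred at (i + 0.5, j + 0.5) * texelSize, so sampling
  // exactly at the centre returns essentially that texel's colour under
  // either nearest or bilinear filtering.
  float2 uv = (tli + offset + float2(0.5, 0.5)) * texelSize;
  return tex2D(tex, uv).rgb;
}
```

Under those assumptions, fetch_texel(texture, float2(tli), float2(1.0, 1.0), texelSize) would stand in for tex2D(texture, texCoord - 0.25 * texelSize * int2(1,1)).rgb in the bottom-right tap, and likewise for the other three taps with their own offsets.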

I’d rather keep one version for Nvidia and another for AMD, then.

@Tatsuya79 yeah, that’s fair. Wouldn’t be the first time…

Does this look any closer?

#pragma parameter RSUBPIX_R "Colour of R subpixel: R" 1.0 0.0 1.0 0.01
#pragma parameter RSUBPIX_G "Colour of R subpixel: G" 0.0 0.0 1.0 0.01
#pragma parameter RSUBPIX_B "Colour of R subpixel: B" 0.0 0.0 1.0 0.01
#pragma parameter GSUBPIX_R "Colour of G subpixel: R" 0.0 0.0 1.0 0.01
#pragma parameter GSUBPIX_G "Colour of G subpixel: G" 1.0 0.0 1.0 0.01
#pragma parameter GSUBPIX_B "Colour of G subpixel: B" 0.0 0.0 1.0 0.01
#pragma parameter BSUBPIX_R "Colour of B subpixel: R" 0.0 0.0 1.0 0.01
#pragma parameter BSUBPIX_G "Colour of B subpixel: G" 0.0 0.0 1.0 0.01
#pragma parameter BSUBPIX_B "Colour of B subpixel: B" 1.0 0.0 1.0 0.01
#pragma parameter gain      "Gain"                    1.0 0.5 2.0 0.05
#pragma parameter gamma     "LCD Gamma"               3.0 0.5 5.0 0.1
#pragma parameter blacklevel "Black level"            0.05 0.0 0.5 0.01
#pragma parameter ambient   "Ambient"                 0.0 0.0 0.5 0.01
#pragma parameter BGR       "BGR"                     0 0 1 1
#ifdef PARAMETER_UNIFORM
uniform float RSUBPIX_R;
uniform float RSUBPIX_G;
uniform float RSUBPIX_B;
uniform float GSUBPIX_R;
uniform float GSUBPIX_G;
uniform float GSUBPIX_B;
uniform float BSUBPIX_R;
uniform float BSUBPIX_G;
uniform float BSUBPIX_B;
uniform float gain;
uniform float gamma;
uniform float blacklevel;
uniform float ambient;
uniform float BGR;
#else
#define RSUBPIX_R 1.0
#define RSUBPIX_G 0.0
#define RSUBPIX_B 0.0
#define GSUBPIX_R 0.0 // matches the #pragma parameter default above
#define GSUBPIX_G 1.0
#define GSUBPIX_B 0.0
#define BSUBPIX_R 0.0
#define BSUBPIX_G 0.0
#define BSUBPIX_B 1.0
#define gain 1.0
#define gamma 3.0
#define blacklevel 0.05
#define ambient 0.0
#define BGR 0.0
#endif
 
 
#define outgamma 2.2
 
void main_vertex
(
        float4 position : POSITION,
        float2 texCoord : TEXCOORD0,
 
    uniform float4x4 modelViewProj,
 
        out float4 oPosition : POSITION,
        out float2 otexCoord : TEXCOORD
)
{
        oPosition = mul(modelViewProj, position);
        otexCoord = texCoord;
}
 
struct input
{
  float2 video_size;
  float2 texCoord_size;
  float2 output_size;
  float frame_count;
  float frame_direction;
  float frame_rotation;
  float2 texture_size;
  sampler2D texture : TEXUNIT0;
};
 
struct output
{
  float4 col    : COLOR;
};
 
// integral of (1 - x^2 - x^4 + x^6)^2
const float coeffs_x[] = float[](1.0, -2.0/3.0, -1.0/5.0, 4.0/7.0, -1.0/9.0, -2.0/11.0, 1.0/13.0);
// integral of (1 - 2x^4 + x^6)^2
const float coeffs_y[] = float[](1.0,      0.0, -4.0/5.0, 2.0/7.0,  4.0/9.0, -4.0/11.0, 1.0/13.0);
float intsmear_func(float z, float coeffs[7])
{
  float z2 = z*z;
  float zn = z;
  float ret = 0.0;
  for (int i = 0; i < 7; i++) {
    ret += zn*coeffs[i];
    zn *= z2;
  }
  return ret;
}
float intsmear(float x, float dx, float d, float coeffs[7])
{
  float zl = clamp((x-dx*0.5)/d,-1.0,1.0);
  float zh = clamp((x+dx*0.5)/d,-1.0,1.0);
  return d * ( intsmear_func(zh,coeffs) - intsmear_func(zl,coeffs) )/dx;
}
 
output main_fragment(in float2 texCoord : TEXCOORD0,
uniform input IN,
uniform sampler2D texture : TEXUNIT0
)
{
  float2 texelSize = 1.0 / IN.texture_size;
  float2 range;
  range = IN.video_size / (IN.output_size * IN.texture_size);
  //float2 range = sourceSize[0].xy / (targetSize.xy * sourceSize[0].xy);
 
  float3 cred   = pow(float3(RSUBPIX_R, RSUBPIX_G, RSUBPIX_B), float3(outgamma));
  float3 cgreen = pow(float3(GSUBPIX_R, GSUBPIX_G, GSUBPIX_B), float3(outgamma));
  float3 cblue  = pow(float3(BSUBPIX_R, BSUBPIX_G, BSUBPIX_B), float3(outgamma));
 
  int2 tli = int2(floor(texCoord/texelSize-float2(0.4999)));
 
  float3 lcol, rcol;
  float subpix = (texCoord.x/texelSize.x - 0.4999 - float(tli.x))*3.0;
  float rsubpix = range.x/texelSize.x * 3.0;
  lcol = float3(intsmear(subpix+1.0,rsubpix, 1.5, coeffs_x),
              intsmear(subpix    ,rsubpix, 1.5, coeffs_x),
              intsmear(subpix-1.0,rsubpix, 1.5, coeffs_x));
  rcol = float3(intsmear(subpix-2.0,rsubpix, 1.5, coeffs_x),
              intsmear(subpix-3.0,rsubpix, 1.5, coeffs_x),
              intsmear(subpix-4.0,rsubpix, 1.5, coeffs_x));
  if (BGR > 0.5) {
    lcol.rgb = lcol.bgr;
    rcol.rgb = rcol.bgr;
  }
  float tcol, bcol;
  subpix = texCoord.y/texelSize.y - 0.4999 - float(tli.y);
  rsubpix = range.y/texelSize.y;
  tcol = intsmear(subpix    ,rsubpix, 0.63, coeffs_y);
  bcol = intsmear(subpix-1.0,rsubpix, 0.63, coeffs_y);

  // Leftover test lookup; `test` is never used below.
  float3 test = tex2Dbias(texture, float4(1,1,1,1)).rgb;
 
  float3 topLeftColor     = ((pow(float3(gain) * tex2Dbias(texture, texCoord, int2(0,0)).rgb + float3(blacklevel), float3(gamma)) + float3(ambient))) * lcol * float3(tcol);
  float3 bottomRightColor = ((pow(float3(gain) * tex2Dbias(texture, texCoord, int2(1,1)).rgb + float3(blacklevel), float3(gamma)) + float3(ambient))) * rcol * float3(bcol);
  float3 bottomLeftColor  = ((pow(float3(gain) * tex2Dbias(texture, texCoord, int2(0,1)).rgb + float3(blacklevel), float3(gamma)) + float3(ambient))) * lcol * float3(bcol);
  float3 topRightColor    = ((pow(float3(gain) * tex2Dbias(texture, texCoord, int2(1,0)).rgb + float3(blacklevel), float3(gamma)) + float3(ambient))) * rcol * float3(tcol);
 
  float3 averageColor = topLeftColor + bottomRightColor + bottomLeftColor + topRightColor;
 
  averageColor = mul(averageColor, mat3x3(cred, cgreen, cblue));
 
  output OUT;
  OUT.col = float4(pow(averageColor,float3(1.0/outgamma)),0.0);
  return OUT;
}

Can confirm that this fixes all the shaders on my (AMD) setup. Thanks hunterk!

Actually, scratch that. The red subpixels seem to “straddle” the edges between the pixels in the x-direction. It’s more noticeable with GBC games with sharp edges (see Tetris DX).

It’s harder to notice in GBA games, but it’s there—look at the border between the red star and “WARS” in Advance Wars, or at the letters’ shadows.
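
A possible explanation for the straddling, assuming nearest-neighbour filtering and that the tex2Dbias taps are sampled at the unsnapped texCoord: the subpixel mask is positioned relative to tli, which is computed with a half-texel shift, while a lookup at texCoord picks the texel that contains texCoord itself. The two indices disagree over roughly half of every texel, which would put the colour boundary in the middle of the mask rather than at its edge. The function names below are made up for illustration and do not appear in the shader.

```cg
// Texel index the mask maths uses (same expression as `tli` in the shader).
float2 mask_texel(float2 texCoord, float2 texelSize)
{
  return floor(texCoord / texelSize - float2(0.4999, 0.4999));
}

// Texel index a nearest-filtered lookup at texCoord would read.
float2 sampled_texel(float2 texCoord, float2 texelSize)
{
  return floor(texCoord / texelSize);
}

// Worked example along x, taking texCoord.x / texelSize.x as the input:
//   10.3 -> mask_texel 9,  sampled_texel 10  (indices disagree)
//   10.7 -> mask_texel 10, sampled_texel 10  (indices agree)
```

If this reading is right, snapping the lookup coordinate to the centre of texel tli before applying whole-texel offsets should bring the colours back in line with the mask, but that is an untested assumption.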

Still an improvement overall for the AMD folks, heh.

hunterk wrote: “@Tatsuya79 Does this look any closer?”

Perhaps slightly better but not there yet sadly.

Here are full screen PNG shots to give you a better idea:

OLD LCD v2 / NEW LCD V2 AMD fix

(the RGB/BGR parameter keeps those red bars on the same side, too)

It’s definitely starting to shape up, at least. Before this, we AMD owners couldn’t even use v2 of this shader. Thanks for your hard work!