Accessing any output in multipass shaders

I’m not very used to working with multipass shaders, so there are some concepts I haven’t grasped yet.

After reading the RA Cg README (https://github.com/Themaister/Emulator-Shader-Pack/blob/master/Cg/README) I know that I can access the original texture by using ORIG. I just don’t know how to do it in practice. In this shader realm I’m more used to learning by mimicry (like a monkey, yeah :P)

I’d like a simple shader example that accesses the original texture in a second pass. Can anyone help me?

Here’s an example I made for Sp00kyFox in another thread: http://www.mediafire.com/download/vecac7jzldmbvn4/Halation.zip

As I warned him: there’s something weird going on with the texcoords for the ORIG texture, so I put a hacky multiplier in there that fixes it for SNES. Perhaps you’ll have better luck. Anyway, that’s the basic idea.

Thank you, hunterk! I’ll look into that!

As float_framebuffer doesn’t work on PS3, I’m looking for alternatives for passing info between passes. So, in the first pass I’ll extract all the info I need and output it. In pass two I want to access both the original and that output to complete the filtering. I think xBR will run much faster that way.

hunterk: shouldn’t you use the texcoords from the ORIG structure (ORIG.tex_coord)?

@OV2 Possibly! Do you have any example code you could show? I’m still very much a n00b with Cg, as well, so it’s sort of the blind leading the blind at this point :stuck_out_tongue:

Like this: http://pastebin.com/Bh7QZBnQ

OV2: Using ORIG.tex_coord directly in the fragment shader is wrong. All vertex attribs have to be passed to the fragment shader (just like TEXCOORD0, etc).

I assumed there would be some sort of ratio you could multiply the texcoords by, like (ORIG.texture_size / IN.texture_size), to get it to work on all cores and resolutions, but I haven’t had much luck with it.

Maister: guess I shouldn’t do this blindly at work :wink:

I’ve tried making something using hunterk’s shader ideas, but I failed.

Can anyone point out what I’m doing wrong when fetching the original samples?

https://anonfiles.com/file/eb9635445f3fc88e7e02a12d8af4b6c0

If you want to know ORIG’s texture size, it’s ORIG.texture_size.

Maister,

Here’s what I’m trying to do:

I’ve only had success with the serial scheme. I’d like to access info in an additive way. I’ve already tried accessing the original through the ORIG struct but, for some reason, nothing happens in my shaders…

Here’s an example of how you can use ORIG:

Remember, you can only access ORIG in the 2nd pass and later.


// orig.cg
/*
   Author: Themaister
   License: Public domain
*/

struct old_pass
{
   float2 tex_coord;
   uniform sampler2D texture;
};

struct vertex_data
{
   float2 tex;
   float2 orig_tex;
};

void main_vertex
(
   uniform float4x4 modelViewProj,
   float4 position : POSITION,
   out float4 oPosition : POSITION,

   float2 tex_coord : TEXCOORD0,

   old_pass ORIG,
   out vertex_data co 
)
{
   oPosition = mul(modelViewProj, position);
   co = vertex_data(tex_coord, ORIG.tex_coord);
}

float4 main_fragment(uniform sampler2D s0 : TEXUNIT0, old_pass ORIG, in vertex_data co) : COLOR
{
   float4 orig_color = tex2D(ORIG.texture, co.orig_tex);
   float4 color = tex2D(s0, co.tex);
   return lerp(orig_color, color, 0.5); // Mix ORIG and previous pass.
}

And a cgp:


shaders = 2
shader0 = dummy.cg
shader1 = orig.cg

scale_type0 = source
scale0 = 1.0
filter_linear0 = true
filter_linear1 = true

Ah, ok, I see what you did with the texcoords. That helps a lot. Thanks!

Thanks, maister! I’ll try that when I get home tonight.

Well, I tried it here, but failed again!

What’s wrong with the ORIG in my second pass? I did exactly what maister did and nothing happens.

1st pass (use it at 1x):


/*
   Hyllian's xBR LVL1 pass0 beta
   
   Copyright (C) 2011/2012 Hyllian/Jararaca - [email protected]

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License
   as published by the Free Software Foundation; either version 2
   of the License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.


   Incorporates some of the ideas from SABR shader. Thanks to Joshua Street.
*/

const static float coef           = 2.0;
const static float4 eq_threshold  = float4(15.0);
const static half y_weight        = 48.0;
const static half u_weight        = 7.0;
const static half v_weight        = 6.0;
const static half3x3 yuv          = half3x3(0.299, 0.587, 0.114, -0.169, -0.331, 0.499, 0.499, -0.418, -0.0813);
const static half3x3 yuv_weighted = half3x3(y_weight*yuv[0], u_weight*yuv[1], v_weight*yuv[2]);
const static float4 delta       = float4(0.3);

const static half4 xbr_info  = half4(0.01, 0.02, 0.04, 0.08);


float4 df(float4 A, float4 B)
{
    return float4(abs(A-B));
}

half c_df(half3 c1, half3 c2)
{
    half3 df = abs(c1 - c2);
    return df.r + df.g + df.b;
}




bool4 eq(float4 A, float4 B)
{
    return (df(A, B) < eq_threshold);
}

float4 weighted_distance(float4 a, float4 b, float4 c, float4 d, float4 e, float4 f, float4 g, float4 h)
{
    return (df(a,b) + df(a,c) + df(d,e) + df(d,f) + 4.0*df(g,h));
}



struct input
{
    half2 video_size;
    float2 texture_size;
    half2 output_size;
};


struct out_vertex {
    half4 position : POSITION;
    half4 color    : COLOR;
    float2 texCoord : TEXCOORD0;
    float4 t1;
    float4 t2;
    float4 t3;
    float4 t4;
    float4 t5;
    float4 t6;
    float4 t7;
};

/*    VERTEX_SHADER    */
out_vertex main_vertex
(
    half4 position    : POSITION,
    half4 color    : COLOR,
    float2 texCoord : TEXCOORD0,

    uniform half4x4 modelViewProj,
    uniform input IN
)
{
    out_vertex OUT;

    OUT.position = mul(modelViewProj, position);
    OUT.color = color;

    float2 ps = float2(1.0/IN.texture_size.x, 1.0/IN.texture_size.y);
    float dx = ps.x;
    float dy = ps.y;

    //    A1 B1 C1
    // A0  A  B  C C4
    // D0  D  E  F F4
    // G0  G  H  I I4
    //    G5 H5 I5

    OUT.texCoord = texCoord;
    OUT.t1 = texCoord.xxxy + half4( -dx, 0, dx,-2.0*dy); // A1 B1 C1
    OUT.t2 = texCoord.xxxy + half4( -dx, 0, dx,    -dy); //  A  B  C
    OUT.t3 = texCoord.xxxy + half4( -dx, 0, dx,      0); //  D  E  F
    OUT.t4 = texCoord.xxxy + half4( -dx, 0, dx,     dy); //  G  H  I
    OUT.t5 = texCoord.xxxy + half4( -dx, 0, dx, 2.0*dy); // G5 H5 I5
    OUT.t6 = texCoord.xyyy + half4(-2.0*dx,-dy, 0,  dy); // A0 D0 G0
    OUT.t7 = texCoord.xyyy + half4( 2.0*dx,-dy, 0,  dy); // C4 F4 I4

    return OUT;
}


/*    FRAGMENT SHADER    */
half4 main_fragment(in out_vertex VAR, uniform sampler2D decal : TEXUNIT0, uniform input IN) : COLOR
{
    bool4 edr, edr_left, edr_up, px; // px = pixel, edr = edge detection rule
    bool4 interp_restriction_lv1, interp_restriction_lv2_left, interp_restriction_lv2_up;
    bool4 nc, nc30, nc60, nc45; // new_color
    float4 fx, fx_left, fx_up, final_fx; // inequations of straight lines.
    half3 res1, res2, pix1, pix2;
    float blend1, blend2;


    half3 A1 = tex2D(decal, VAR.t1.xw).rgb;
    half3 B1 = tex2D(decal, VAR.t1.yw).rgb;
    half3 C1 = tex2D(decal, VAR.t1.zw).rgb;

    half3 A  = tex2D(decal, VAR.t2.xw).rgb;
    half3 B  = tex2D(decal, VAR.t2.yw).rgb;
    half3 C  = tex2D(decal, VAR.t2.zw).rgb;

    half3 D  = tex2D(decal, VAR.t3.xw).rgb;
    half3 E  = tex2D(decal, VAR.t3.yw).rgb;
    half3 F  = tex2D(decal, VAR.t3.zw).rgb;

    half3 G  = tex2D(decal, VAR.t4.xw).rgb;
    half3 H  = tex2D(decal, VAR.t4.yw).rgb;
    half3 I  = tex2D(decal, VAR.t4.zw).rgb;

    half3 G5 = tex2D(decal, VAR.t5.xw).rgb;
    half3 H5 = tex2D(decal, VAR.t5.yw).rgb;
    half3 I5 = tex2D(decal, VAR.t5.zw).rgb;

    half3 A0 = tex2D(decal, VAR.t6.xy).rgb;
    half3 D0 = tex2D(decal, VAR.t6.xz).rgb;
    half3 G0 = tex2D(decal, VAR.t6.xw).rgb;

    half3 C4 = tex2D(decal, VAR.t7.xy).rgb;
    half3 F4 = tex2D(decal, VAR.t7.xz).rgb;
    half3 I4 = tex2D(decal, VAR.t7.xw).rgb;

    float4 b = mul( half4x3(B, D, H, F), yuv_weighted[0] );
    float4 c = mul( half4x3(C, A, G, I), yuv_weighted[0] );
    float4 e = mul( half4x3(E, E, E, E), yuv_weighted[0] );
    float4 d = b.yzwx;
    float4 f = b.wxyz;
    float4 g = c.zwxy;
    float4 h = b.zwxy;
    float4 i = c.wxyz;

    float4 i4 = mul( half4x3(I4, C1, A0, G5), yuv_weighted[0] );
    float4 i5 = mul( half4x3(I5, C4, A1, G0), yuv_weighted[0] );
    float4 h5 = mul( half4x3(H5, F4, B1, D0), yuv_weighted[0] );
    float4 f4 = h5.yzwx;

    interp_restriction_lv1      = ((e!=f) && (e!=h)  && ( !eq(f,b) && !eq(f,c) || !eq(h,d) && !eq(h,g) || eq(e,i) && (!eq(f,f4) && !eq(f,i4) || !eq(h,h5) && !eq(h,i5)) || eq(e,g) || eq(e,c)) );

    edr      = (weighted_distance( e, c, g, i, h5, f4, h, f) < weighted_distance( h, d, i5, f, i4, b, e, i)) && interp_restriction_lv1;

    E.x = dot(half4(edr), xbr_info);

    return half4(E, 1.0);
}




2nd pass (use it at 4x or ‘don’t care’):


/*
   Hyllian's xBR LVL1 pass1 beta
   
   Copyright (C) 2011/2012 Hyllian/Jararaca - [email protected]

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License
   as published by the Free Software Foundation; either version 2
   of the License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.


   Incorporates some of the ideas from SABR shader. Thanks to Joshua Street.
*/

const static float coef           = 2.0;
const static float4 eq_threshold  = float4(15.0, 15.0, 15.0, 15.0);
const static half y_weight        = 48.0;
const static half u_weight        = 7.0;
const static half v_weight        = 6.0;
const static half3x3 yuv          = half3x3(0.299, 0.587, 0.114, -0.169, -0.331, 0.499, 0.499, -0.418, -0.0813);
const static half3x3 yuv_weighted = half3x3(y_weight*yuv[0], u_weight*yuv[1], v_weight*yuv[2]);
const static float4 delta       = float4(0.4, 0.4, 0.4, 0.4);

float4 df(float4 A, float4 B)
{
    return float4(abs(A-B));
}

half c_df(half3 c1, half3 c2)
{
    half3 df = abs(c1 - c2);
    return df.r + df.g + df.b;
}




bool4 eq(float4 A, float4 B)
{
    return (df(A, B) < eq_threshold);
}

float4 weighted_distance(float4 a, float4 b, float4 c, float4 d, float4 e, float4 f, float4 g, float4 h)
{
    return (df(a,b) + df(a,c) + df(d,e) + df(d,f) + 4.0*df(g,h));
}

struct orig
{
    float2 tex_coord;
    uniform float2 texture_size;
    uniform sampler2D texture;
};

struct input
{
  half2 video_size;
  float2 texture_size;
  half2 output_size;
  float frame_count;
  float frame_direction;
  float frame_rotation;
};


struct out_vertex {
    half4 position : POSITION;
    half4 color    : COLOR;
    float2 texCoord : TEXCOORD0;
    float4 t1;
    float4 t2;
    float4 t3;
    float4 t4;
    float4 t5;
    float4 t6;
    float4 t7;
    float2 orig_tex;
};

/*    VERTEX_SHADER    */
out_vertex main_vertex
(
    half4 position    : POSITION,
    half4 color    : COLOR,
    float2 texCoord : TEXCOORD0,

    uniform half4x4 modelViewProj,
    orig ORIG
)
{
    out_vertex OUT;

    OUT.position = mul(modelViewProj, position);
    OUT.color = color;

    float2 ps = float2(1.0/ORIG.texture_size.x, 1.0/ORIG.texture_size.y);
    float dx = ps.x;
    float dy = ps.y;

    //    A1 B1 C1
    // A0  A  B  C C4
    // D0  D  E  F F4
    // G0  G  H  I I4
    //    G5 H5 I5

    OUT.texCoord = texCoord;
    OUT.orig_tex = ORIG.tex_coord;
    OUT.t1 = ORIG.tex_coord.xxxy + half4( -dx, 0, dx,-2.0*dy); // A1 B1 C1
    OUT.t2 = ORIG.tex_coord.xxxy + half4( -dx, 0, dx,    -dy); //  A  B  C
    OUT.t3 = ORIG.tex_coord.xxxy + half4( -dx, 0, dx,      0); //  D  E  F
    OUT.t4 = ORIG.tex_coord.xxxy + half4( -dx, 0, dx,     dy); //  G  H  I
    OUT.t5 = ORIG.tex_coord.xxxy + half4( -dx, 0, dx, 2.0*dy); // G5 H5 I5
    OUT.t6 = ORIG.tex_coord.xyyy + half4(-2.0*dx,-dy, 0,  dy); // A0 D0 G0
    OUT.t7 = ORIG.tex_coord.xyyy + half4( 2.0*dx,-dy, 0,  dy); // C4 F4 I4

    return OUT;
}


/*    FRAGMENT SHADER    */
half4 main_fragment(in out_vertex VAR, uniform sampler2D decal : TEXUNIT0, orig ORIG, uniform input IN) : COLOR
{
    bool4 edr, edr_left, edr_up, px; // px = pixel, edr = edge detection rule
    bool4 interp_restriction_lv1, interp_restriction_lv2_left, interp_restriction_lv2_up;
    bool4 nc, nc30, nc60, nc45; // new_color
    float4 fx, fx_left, fx_up, final_fx; // inequations of straight lines.
    half3 res1, res2, pix1, pix2;
    float blend1, blend2;

    float2 fp = frac(ORIG.tex_coord*ORIG.texture_size);


    half3 B  = tex2D(ORIG.texture, VAR.t2.yw).rgb;
    half3 D  = tex2D(ORIG.texture, VAR.t3.xw).rgb;
    half3 E  = tex2D(ORIG.texture, VAR.t3.yw).rgb;
    half3 F  = tex2D(ORIG.texture, VAR.t3.zw).rgb;
    half3 H  = tex2D(ORIG.texture, VAR.t4.yw).rgb;

    float4 b = mul( half4x3(B, D, H, F), yuv_weighted[0] );
    float4 e = mul( half4x3(E, E, E, E), yuv_weighted[0] );
    float4 d = b.yzwx;
    float4 f = b.wxyz;
    float4 h = b.zwxy;

    float4 Ao = float4( 1.0, -1.0, -1.0, 1.0 );
    float4 Bo = float4( 1.0,  1.0, -1.0,-1.0 );
    float4 Co = float4( 1.5,  0.5, -0.5, 0.5 );

    // These inequations define the line below which interpolation occurs.
    fx      = (Ao*fp.y+Bo*fp.x); 

    float4 fx45 = smoothstep(Co - delta, Co + delta, fx);


    half3 INFO  = tex2D(decal, VAR.texCoord).xyz;
    float w = (INFO.x)*100.0;

    edr.x = bool(fmod(w, 2)); w = floor(w/2.0);
    edr.y = bool(fmod(w, 2)); w = floor(w/2.0);
    edr.z = bool(fmod(w, 2)); w = floor(w/2.0);
    edr.w = bool(fmod(w, 2));

    nc45 = ( edr && bool4(fx45));

    px = (df(e,f) <= df(e,h));

    nc = (nc45);

    float4 final45 = nc45*fx45;

    float4 maximo = final45;

         if (nc.x) {pix1 = px.x ? F : H; blend1 = maximo.x;}
    else if (nc.y) {pix1 = px.y ? B : F; blend1 = maximo.y;}
    else if (nc.z) {pix1 = px.z ? D : B; blend1 = maximo.z;}
    else if (nc.w) {pix1 = px.w ? H : D; blend1 = maximo.w;}
    else {pix1 = E; blend1 = 0.0;}

    half3 res = lerp(E, pix1, blend1);

    return half4(res, 1.0);
}
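
For reference, a preset that wires the two passes above together at the scales mentioned (1x for the first pass, 4x for the second) might look like the sketch below. The file names are hypothetical. Nearest filtering is assumed for both passes so that neither the packed info in the red channel nor the original texels get blended with their neighbours.

```
shaders = 2
shader0 = xbr-lv1-pass0.cg
shader1 = xbr-lv1-pass1.cg

scale_type0 = source
scale0 = 1.0
scale_type1 = source
scale1 = 4.0

filter_linear0 = false
filter_linear1 = false
```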




ORIG.texture_size isn’t declared as a uniform in the struct.

Thanks, that changes something, though the output is still wrong. Well, I guess there are more bugs to find…
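
One more candidate, offered as a guess rather than a diagnosis: if the target between the passes is an ordinary 8-bit-per-channel framebuffer (float_framebuffer being unavailable), the value packed by pass 0 is quantized on the way. For instance 0.02 would be stored as 5/255, roughly 0.0196, so INFO.x*100.0 arrives as about 1.96 and fmod(w, 2) is nonzero even though the lowest flag was clear. A sketch of a decode that rounds first (unpack_edr is a hypothetical helper name, not part of the original shader):

```cg
// Hypothetical helper. Assumes pass 0 wrote
// dot(half4(edr), half4(0.01, 0.02, 0.04, 0.08)) into the red channel
// and that the value went through an 8-bit render target.
bool4 unpack_edr(float packed)
{
    // Snap to the nearest integer code (0..15) before extracting bits,
    // so that 1.96 is read as 2 and not as "odd".
    float w = floor(packed * 100.0 + 0.5);

    bool4 edr;
    edr.x = (fmod(w, 2.0) > 0.5); w = floor(w / 2.0);
    edr.y = (fmod(w, 2.0) > 0.5); w = floor(w / 2.0);
    edr.z = (fmod(w, 2.0) > 0.5); w = floor(w / 2.0);
    edr.w = (fmod(w, 2.0) > 0.5);
    return edr;
}
```

In the second pass this would stand in for the four fmod lines, e.g. edr = unpack_edr(INFO.x). The worst-case quantization error after scaling by 100 is about 0.2, which is why rounding to the nearest integer should be enough to recover the code.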

Should I bind “texture” to TEXUNIT1 and “tex_coord” to TEXCOORD1 in the struct?

There is no need to bind TEXUNITs explicitly (except for input texture TEXUNIT0). Input TEXCOORD1 is reserved for LUT tex coords. All tex coords other than TEXCOORD0 and 1 are looked up by name.
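
Tying this back to the earlier point that vertex attribs have to be passed on to the fragment shader: the second pass above still reads ORIG.tex_coord inside main_fragment when it computes fp, which may be one of the remaining problems. Below is a minimal debug sketch, modeled on the orig.cg example, that computes fp from the interpolated coordinate and shows it as a colour. The struct and variable names other than ORIG and its members are arbitrary.

```cg
struct orig
{
   float2 tex_coord;
   uniform float2 texture_size;
   uniform sampler2D texture;
};

struct vertex_data
{
   float2 tex;
   float2 orig_tex;   // looked up by name, no explicit semantic
};

void main_vertex
(
   uniform float4x4 modelViewProj,
   float4 position : POSITION,
   out float4 oPosition : POSITION,
   float2 tex_coord : TEXCOORD0,
   orig ORIG,
   out vertex_data co
)
{
   oPosition = mul(modelViewProj, position);
   // ORIG.tex_coord is a vertex attrib, so forward it here.
   co = vertex_data(tex_coord, ORIG.tex_coord);
}

float4 main_fragment(uniform sampler2D s0 : TEXUNIT0, orig ORIG, in vertex_data co) : COLOR
{
   // Position inside the current ORIG texel, from the interpolated coordinate.
   float2 fp = frac(co.orig_tex * ORIG.texture_size);
   return float4(fp, 0.0, 1.0);
}
```

If the coordinates are right, each source pixel should show the same red/green gradient; in the xBR pass the equivalent change would be using VAR.orig_tex instead of ORIG.tex_coord in the fp line.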

I think I figured out why this is happening!

I’ve run some tests here and discovered that the ORIG coords end up one line below the original texture. They are misaligned. Maybe that’s the origin of all my problems.

Make a three-shader test. The first shader works as a stock shader, just returning its input. For the second pass, make two variants: one takes its input from the first shader (the serial way) and outputs it; the other takes its input from ORIG and outputs it. To run the test, use just two shaders at a time (the first plus one of the second-pass variants) and you’ll see that the ORIG variant shifts the game screen down by one line.

This is why I can’t split my xBR into two shaders. The passes get misaligned, so the info read in the last pass is always wrong for the pixel I’m filtering.
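
For anyone who wants to reproduce the test, the ORIG variant of the second pass could be as small as the sketch below. It is modeled on the orig.cg example earlier in the thread; the file name orig_passthrough.cg is hypothetical.

```cg
// orig_passthrough.cg: outputs the ORIG texture and ignores the previous pass.

struct old_pass
{
   float2 tex_coord;
   uniform sampler2D texture;
};

struct vertex_data
{
   float2 tex;
   float2 orig_tex;
};

void main_vertex
(
   uniform float4x4 modelViewProj,
   float4 position : POSITION,
   out float4 oPosition : POSITION,
   float2 tex_coord : TEXCOORD0,
   old_pass ORIG,
   out vertex_data co
)
{
   oPosition = mul(modelViewProj, position);
   co = vertex_data(tex_coord, ORIG.tex_coord);
}

float4 main_fragment(uniform sampler2D s0 : TEXUNIT0, old_pass ORIG, in vertex_data co) : COLOR
{
   // The serial variant would instead be: return tex2D(s0, co.tex);
   return tex2D(ORIG.texture, co.orig_tex);
}
```

A matching preset, reusing dummy.cg as the stock first pass and assuming nearest filtering so that a one-line shift is easy to spot:

```
shaders = 2
shader0 = dummy.cg
shader1 = orig_passthrough.cg

scale_type0 = source
scale0 = 1.0
filter_linear0 = false
filter_linear1 = false
```

Swapping the return line for the serial one and comparing the two outputs should show whether the shift really comes from the ORIG coordinates.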