Vertical downsampler for CRT shaders and AA shaders

It’s well known that CRT shaders with scanlines have their issues on common (1080p) displays if the scaling factor is not an integer. I tried to mitigate the problem with supersampling, but the downsampling algorithm in general use tends to give results that don’t take full advantage of higher-resolution shading. At least in my case.

My first aim was to create a vertical supersampling downsampler which uses very high viewport multipliers, like 8k. Surprisingly, my testing revealed that it’s much more convenient to use internal integer scaling (scale type = source), and the results were very comparable. It’s still recommended to supersample, like 1.5x, but the speed loss is much lower.

The shader used for this is quite simple: it calculates the average of the subpixel samples over the “perfect pixel” range. It should work well enough. The drawbacks I found are that the scanlines can lose some edginess and that it can’t make hi-res content shine.

I’ll post the shader and a test preset here, so please test it a bit. :face_with_monocle:

scanline_downsampler.glsl (for the test preset, it goes in …crt/shaders/guest):

/*
   Vertical Downsampling Shader (for even scanlines)
   
   Copyright (C) 2019 guest(r) - [email protected]
   
   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License
   as published by the Free Software Foundation; either version 2
   of the License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
   
*/


#pragma parameter steps "Interpolation Steps" 4.0 1.0 12.0 1.0
#pragma parameter srange "Sampling Range" 0.75 0.05 2.0 0.05

#if defined(VERTEX)

#if __VERSION__ >= 130
#define COMPAT_VARYING out
#define COMPAT_ATTRIBUTE in
#define COMPAT_TEXTURE texture
#else
#define COMPAT_VARYING varying 
#define COMPAT_ATTRIBUTE attribute 
#define COMPAT_TEXTURE texture2D
#endif

#ifdef GL_ES
#define COMPAT_PRECISION mediump
#else
#define COMPAT_PRECISION
#endif

COMPAT_ATTRIBUTE vec4 VertexCoord;
COMPAT_ATTRIBUTE vec4 COLOR;
COMPAT_ATTRIBUTE vec4 TexCoord;
COMPAT_VARYING vec4 COL0;
COMPAT_VARYING vec4 TEX0;

vec4 _oPosition1; 
uniform mat4 MVPMatrix;
uniform COMPAT_PRECISION int FrameDirection;
uniform COMPAT_PRECISION int FrameCount;
uniform COMPAT_PRECISION vec2 OutputSize;
uniform COMPAT_PRECISION vec2 TextureSize;
uniform COMPAT_PRECISION vec2 InputSize;

void main()
{
    gl_Position = MVPMatrix * VertexCoord;
    COL0 = COLOR;
    TEX0.xy = TexCoord.xy * 1.00001;
}

#elif defined(FRAGMENT)

#if __VERSION__ >= 130
#define COMPAT_VARYING in
#define COMPAT_TEXTURE texture
out vec4 FragColor;
#else
#define COMPAT_VARYING varying
#define FragColor gl_FragColor
#define COMPAT_TEXTURE texture2D
#endif

#ifdef GL_ES
#ifdef GL_FRAGMENT_PRECISION_HIGH
precision highp float;
#else
precision mediump float;
#endif
#define COMPAT_PRECISION mediump
#else
#define COMPAT_PRECISION
#endif

uniform COMPAT_PRECISION int FrameDirection;
uniform COMPAT_PRECISION int FrameCount;
uniform COMPAT_PRECISION vec2 OutputSize;
uniform COMPAT_PRECISION vec2 TextureSize;
uniform COMPAT_PRECISION vec2 InputSize;
uniform sampler2D Texture;
COMPAT_VARYING vec4 TEX0;

#ifdef PARAMETER_UNIFORM
// All parameter floats need to have COMPAT_PRECISION in front of them
uniform COMPAT_PRECISION float steps;
uniform COMPAT_PRECISION float srange;
#else
#define steps        4.00
#define srange       0.75
#endif


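void main()
{
	// Half-width of the sampling window in source texels:
	// srange output pixels times the vertical downsampling factor.
	float ratio = srange*InputSize.y/OutputSize.y;
	vec3 color = vec3(0.0);
	// Height of one source texel in texture coordinates.
	float dy = 1.0/TextureSize.y;
	float wsum = 0.0;
	// Spacing between neighbouring taps, in source texels.
	float dif = ratio/steps;
	
	// Evenly spaced taps from -ratio to +ratio around the current coordinate.
	for (float i = -ratio; i <= ratio; i = i + dif)
	{
		color += COMPAT_TEXTURE(Texture, TEX0.xy + vec2(0.0, dy * i)).rgb;
		wsum = wsum + 1.0;
	}
	
	// Unweighted average of all taps.
	color = color / wsum;
	
	FragColor = vec4(color,1.0);
}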
#endif

Test preset:

shaders = 3

shader0 = shaders/guest/d65-d50.glsl
filter_linear0 = false
scale_type0 = source
scale0 = 1.0

shader1 = shaders/guest/crt-guest-sm.glsl
scale_type_x1 = viewport
scale_type_y1 = source
scale_x1 = 1.0
scale_y1 = 8.0

shader2 = shaders/guest/scanline_downsampler.glsl
filter_linear2 = true
scale_type_x2 = viewport
scale_type_y2 = viewport
scale_x2 = 1.0
scale_y2 = 1.0

You can play around with the scaling factor; higher values tend to bring better results. If someone wants to improve the algorithm and/or create a better one, please be my guest. :upside_down_face:

Edit: small bugfix

This seems to work really well. I tried some of the more temperamental scanline shaders in place of crt-guest-sm and it does a really good job of minimizing the unevenness.

Thanks for bothering with it. :slight_smile: I find it interesting that it can work on a minimal integer scale too:

5x scale sampled to viewport: [screenshot: corrections]

5x scale sampled with the shader to viewport: [screenshot: corrections]

But it won’t stop me from buying a new display. :smiley:

Merry Christmas everyone!

Dunno about everyone else, but I’m getting vertical scanlines with this.

Me too, I got vertical scanlines for some reason.

Hard to say, but I’ve encountered situations where a preset with different scaling types for one shader caused problems. Does it help if you change the CRT part of the preset to something like this?

shader1 = shaders/guest/crt-guest-sm.glsl
scale_type_x1 = viewport
scale_type_y1 = viewport
scale_x1 = 1.0
scale_y1 = 2.0

Same result unfortunately

I changed the vertex part of the shader; such things usually happen there. It’s hard to tell what’s going on, because it works with some configurations.

Is there a slang version?

Sure, I can make one. The idea is that you can use thinner/stronger scanlines with non-integer display output. But even in theory the scanlines can’t become even with that ‘helper’, only visually more appealing. :grinning:

scanline_downsampler.slang

#version 450

/*
   Vertical Downsampling Shader (for even scanlines)
   
   Copyright (C) 2019-2020 guest(r) - [email protected]
   
   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License
   as published by the Free Software Foundation; either version 2
   of the License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
   
*/ 


layout(push_constant) uniform Push
{
	vec4 SourceSize;
	vec4 OriginalSize;
	vec4 OutputSize;
	uint FrameCount;
	float steps, srange;
} params;

#pragma parameter steps "Interpolation Steps (0.0-OFF)" 5.0 0.0 12.0 1.0
#pragma parameter srange "Sampling Range" 1.0 0.05 2.5 0.05 

#define steps params.steps
#define srange params.srange

layout(std140, set = 0, binding = 0) uniform UBO
{
	mat4 MVP;
} global;

#pragma stage vertex
layout(location = 0) in vec4 Position;
layout(location = 1) in vec2 TexCoord;
layout(location = 0) out vec2 vTexCoord;

void main()
{
   gl_Position = global.MVP * Position;
   vTexCoord = TexCoord;
}

#pragma stage fragment
layout(location = 0) in vec2 vTexCoord;
layout(location = 0) out vec4 FragColor;
layout(set = 0, binding = 2) uniform sampler2D Source;


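void main()
{
	// Source lines per output line (vertical downsampling factor).
	float scale = params.SourceSize.y/params.OutputSize.y;
	// Half-width of the sampling window, in source texels.
	float ratio = srange*scale;
	vec3 color = vec3(0.0);
	// Height of one source texel in texture coordinates.
	float dy = 1.0/params.SourceSize.y;
	float wsum = 0.0;
	// Spacing between taps; the guard avoids a division by zero when steps is 0.
	float dif = ratio;
	if (steps > 0.0) dif = ratio/steps;
	
	// Evenly spaced taps from -ratio to +ratio around the current coordinate.
	for (float i = -ratio; i <= ratio; i = i + dif)
	{
		color += texture(Source, vTexCoord + vec2(0.0, dy * i)).rgb;
		wsum = wsum + 1.0;
	}
	
	// Unweighted average of all taps.
	color = color / wsum;
	
	// steps = 0 turns the filter off: use a single plain lookup instead.
	if (steps == 0.0) color = texture(Source, vTexCoord).rgb;
	
	FragColor = vec4(color,1.0);
}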

And a sample preset:

shaders = 3

shader0 = shaders/guest/d65-d50.slang
filter_linear0 = false
scale_type0 = source
scale0 = 1.0

shader1 = shaders/guest/crt-guest-sm.slang
scale_type_x1 = viewport
scale_type_y1 = source
scale_x1 = 1.0
scale_y1 = 8.0

shader2 = shaders/guest/scanline_downsampler.slang
filter_linear2 = true
scale_type_x2 = viewport
scale_type_y2 = viewport
scale_x2 = 1.0
scale_y2 = 1.0

It should look OK like this, but increasing the y-scale usually gives better results.

Edit: new version, works better.

Sorry but this isn’t working for me.

I placed “scanline_downsampler.slang” in the “RetroArch\shaders\shaders_slang\crt\shaders\guest” folder and the preset, which I named “crt-guest-downsample”, in “RetroArch\shaders\shaders_slang\crt”.

Thanks for mentioning it. The “web formatting” of the posted shader seemed wrong; now it should work.

Thanks! Works great now.

BTW, can you explain what “Interpolation Steps” and “Sampling Range” do? And when you talk about y scaling, are you referring to “scale_y1 = 8.0” in your sample preset? How high can it go?

> BTW, can you explain what “Interpolation Steps” and “Sampling Range” do?

In my understanding, pure linear vertical downsampling done by the hardware samples 2 texels (maybe 4 in theory: tl, tr, bl, br, but the influence of 2 of them is zero) and linearly interpolates between them to get the output texel. I don’t know how I should feel about this: it’s very fast, but it doesn’t use the benefits that supersampling offers.
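
To make the difference concrete, here is a small Python model (not shader code) of one pixel column. It assumes the hardware filter behaves like textbook linear interpolation between the two nearest texel centers, and it compares that with averaging several taps over a window of ±0.5 output pixels. The function names, the 16-texel test column and the window size are made up for the illustration.

```python
import math

def sample_linear(column, pos):
    """Linear lookup at pos (texel units, texel centers at i + 0.5), clamped at the edges."""
    x = pos - 0.5
    i0 = math.floor(x)
    frac = x - i0
    lo = min(max(i0, 0), len(column) - 1)
    hi = min(max(i0 + 1, 0), len(column) - 1)
    return column[lo] * (1.0 - frac) + column[hi] * frac

def downsample_linear(column, out_height):
    """One linear lookup per output pixel, taken at the output pixel center."""
    scale = len(column) / out_height
    return [sample_linear(column, (y + 0.5) * scale) for y in range(out_height)]

def downsample_average(column, out_height, srange=0.5, steps=4):
    """Average of 2*steps + 1 evenly spaced linear lookups per output pixel."""
    scale = len(column) / out_height
    ratio = srange * scale      # half-width of the window, in source texels
    dif = ratio / steps         # spacing between taps
    result = []
    for y in range(out_height):
        center = (y + 0.5) * scale
        taps = [sample_linear(column, center + k * dif) for k in range(-steps, steps + 1)]
        result.append(sum(taps) / len(taps))
    return result

# A 16-texel column with one bright line, downsampled to 4 output pixels.
column = [0.0] * 16
column[3] = 1.0

print(downsample_linear(column, 4))   # the bright line falls between the lookups
print(downsample_average(column, 4))  # the bright line contributes to the output
```

In this toy case the single lookups land on texels 1/2, 5/6, 9/10 and 13/14, so the bright line at texel 3 disappears completely, while the averaged version still picks it up.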

“Interpolation Steps” and “Sampling Range” try to improve this weak approach. Ideally the sampling range should be 0.5 (it works like -0.5 to 0.5 from the current coordinate), but a slightly greater value compensates for the fact that we don’t hit the center of the pixel in most cases. Long story short, the target pixel is the “average” value over the sampling range, probed by (2*steps + 1) lookups.

A higher number of steps adds precision, while a greater sampling range gives better results when needed. I also discovered that higher internal scales need a higher range. The shader will be updated shortly.
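
As a rough illustration of how the two parameters interact, the following Python sketch (a model of the loop, not the shader itself) lists the tap offsets for one output pixel. The function name and the example heights (224px content at 8x source scale, shown on a 1080-line viewport) are assumptions. It indexes the taps with an integer counter, whereas the shader steps a float, so the shader's last tap could in principle be lost to rounding.

```python
def tap_offsets(srange, steps, source_height, output_height):
    """Vertical tap offsets, in source texels, around the current coordinate."""
    scale = source_height / output_height   # source lines per output line
    ratio = srange * scale                  # half-width of the sampling window
    if steps <= 0:
        return [0.0]                        # steps = 0: single plain lookup (slang version)
    dif = ratio / steps                     # spacing between taps
    return [k * dif for k in range(-steps, steps + 1)]

# Hypothetical setup: 224 * 8 = 1792 source lines on a 1080-line viewport.
offsets = tap_offsets(srange=1.0, steps=5, source_height=1792, output_height=1080)
print(len(offsets))              # 2*steps + 1 taps
print(offsets[0], offsets[-1])   # window edges, about -ratio and +ratio
```

More steps make the taps denser inside the same window; a larger range widens the window without changing the number of taps.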

> And when you talk about y scaling, are you referring to “scale_y1 = 8.0” in your sample preset? How high can it go?

I would say 20x is a reasonable maximum for 224px content, but you really don’t need to go that high, since I can get decent results with 6x “source” scale too.
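
For a sense of scale, this short Python calculation shows how many source lines end up behind each output line for the multipliers mentioned here. The 1080-line viewport is an assumption for the example, and the helper names are made up.

```python
def internal_height(content_height, scale_y):
    """Height of the supersampled pass when scale_type_y is "source"."""
    return content_height * scale_y

def lines_per_output_line(content_height, scale_y, viewport_height):
    """Source lines that fall on each output line after downsampling."""
    return internal_height(content_height, scale_y) / viewport_height

for scale_y in (6, 8, 20):
    height = internal_height(224, scale_y)
    ratio = lines_per_output_line(224, scale_y, 1080)
    print(f"{scale_y}x: {height} lines, {ratio:.2f} source lines per output line")
```

At 6x the supersampled pass is 1344 lines tall, at 8x it is 1792 and at 20x it is 4480, so the work per frame grows quickly with the multiplier.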

I won’t start a new topic for this one, because it’s very similar. FXAA is a very cool anti-aliasing shader, but it has its quirks. One of them, in my experience, is that it lacks proper downsampling logic, even though the results hidden in the buffer can look terrific!

I modified the downsampling algorithm for AA downsampling; 2 passes are needed for this. The shaders have some extra parameters for different looks, as tastes differ.

To use the preset ‘fxaa-downsample.slangp’ properly, the internal resolution should be higher than the displayed one. It’s quite easy to test and set up though, as a download link is provided.

https://mega.nz/#!ltBH2IzT!7kQD3wdeSC0CLjwiwWKQVh0H6VhF5DdVW3tk7jiPNo0

Edit: SMAA preset added.

Wanna show an example of FXAA ‘downsampled’. There are some custom options available in the shaders, so the look can vary depending on the setup.
