Shader to add padding to image

Apologies in advance for the noob question.

I’m trying to write a shader that pads the top edge of the image with black pixels. It could be used to add a cinematic “black bar” for instance. My ultimate goal is to add the bar in the middle of the screen to recreate the DS screen gap, but I’m starting small.

The challenge I face is that this entails changing the resolution and aspect ratio. For example, to pad an 8x8 image with 3 pixels (rows) at the top, the output should be 8x11.

Using the vertex stage, I can horizontally compress the image to reach the desired, taller aspect ratio. (The box.slang shader already does this for integer scales.) Vertical stretching doesn’t work, since it just crops off the top/bottom. But this isn’t really the desired outcome:

  • It leaves the left and right edges black, but doesn’t crop them off, so the image is pillarboxed. Really, the aspect ratio of the framebuffer hasn’t changed at all.
  • The whole image is now stretched—sampling this non-integer-scaled image and reconstructing it will be really hard to get right. Stretching and unstretching just…really seems like the wrong way to accomplish a simple pan.

Since Maister left this unresolved in box.slang, there is probably no hope…but I figured I would ask around anyway. Here are some leads:

  • Each shader pass is given the SourceSize as well as the OutputSize. You can’t just assign a different value to OutputSize from the shader, but if I knew what was driving it, maybe an arbitrary output resolution could be set.
  • The Slang Shader Development docs page, in its “Multiple passes” section, shows a simple filter chain, and alludes to framebuffers and backbuffers at resolutions different from the input (not necessarily arbitrary resolutions, I guess):

(Input) -> [ Shader Pass #0 ] -> (Framebuffer) -> [ Shader Pass #1 ] -> (Backbuffer)

Framebuffer here might have a different resolution than both Input and Backbuffer. A very common scenario for this is separable filters where we first scale horizontally, then vertically.

  • Cropping is a pretty similar operation to padding—cropping a negative number of pixels from the edge, kind of. If a simple “crop image” shader exists, it would probably be pretty easy to rework into a padding shader. Only problem is, I can’t find one… (Masking the edges with black does not count!)

It’s pretty funny, this could be basically a single memmove() in a CPU-based filter. Not so simple with the GPU!
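
For comparison, here’s roughly what that CPU-side version could look like; a minimal sketch in C (the function name and single-channel layout are my own simplification, and a real core would use multi-byte pixels and per-row pitch):

```c
#include <stdint.h>
#include <string.h>

/* Pad the top of a w-by-h single-channel image with `pad` black rows,
 * writing a w-by-(h + pad) image into dst. Just two bulk memory ops. */
static void pad_top(const uint8_t *src, uint8_t *dst, int w, int h, int pad)
{
    memset(dst, 0, (size_t)w * (size_t)pad);             /* black bar on top */
    memcpy(dst + (size_t)w * (size_t)pad, src,           /* original image below */
           (size_t)w * (size_t)h);
}
```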


So, are you trying to zoom out, essentially? If so, that’s pretty easy: just move the image so the center is at the origin (texcoord.xy - 0.5) and then multiply it by something to make it smaller. Then, put the texcoords back (+0.5) and you’re all set.

if you want to do the DS screen gap, you’ll have to duplicate the image and then mask out each half to avoid overlapping, but otherwise, the process would be similar.


Thanks for the quick reply!

So…not exactly. Here’s my implementation of your suggested “zoom” approach:

test.slang
#version 450

layout(std140, set = 0, binding = 0) uniform UBO
{
	vec4 SourceSize;
	vec4 OutputSize;
	mat4 MVP;
	float gap_px;
} global;

#pragma parameter gap_px "Gap height (px)" 64.0 0.0 4096.0 1.0

#pragma stage vertex
layout(location = 0) in vec4 Position;
layout(location = 1) in vec2 TexCoord;
layout(location = 0) out vec2 vTexCoord;

void main()
{
	gl_Position = global.MVP * Position;

	float newHeight = global.SourceSize.y + global.gap_px;
	float yScale = newHeight / global.SourceSize.y;  // yScale > 1

	vec2 centered = TexCoord - vec2(0.5);  // center image
	vec2 scaled = centered * vec2(yScale);  // scale X and Y
//	vec2 scaled = centered * vec2(1, yScale);  // scale Y only (stretch)
	vec2 shifted = scaled - vec2(0, global.gap_px / 2 / global.SourceSize.y);
	vTexCoord = shifted + vec2(0.5);
}

#pragma stage fragment
layout(location = 0) in vec2 vTexCoord;
layout(location = 0) out vec4 FragColor;
layout(set = 0, binding = 2) uniform sampler2D Source;

void main()
{
	FragColor = texture(Source, vTexCoord);
}

And the result, with a 64-pixel gap:

Screenshot is taken with integer scaling set to ON (3x scale due to the window size). It looks correct at first glance, but I ran into basically the same issues as before:

  • Non-integer scaling throughout the image. Especially obvious with the text at the bottom; pixels are 3x3, 3x2, 2x3, or 2x2 on the display. This makes sense, since we took an image scaled to exactly 3x and shrunk it. The bar also has non-integer scaling.
  • Black bars on the left and right. In the screenshot above, I sized the window to exactly the minimum size, such that the integer scale will step down to 2x if I shrink the window at all—I can’t hide the pillarboxing unless I use the “integer scale overscale” option, which lets me use an arbitrarily-thin window with the same 3x scale.

Scaling the image only in y preserves the integer scale in x and fills the full width of the framebuffer. However, now we’ve wrecked the aspect ratio, so that’s not the answer.

There is a workaround: turn integer scale off, and resize the window to exactly 64px taller (or 64x3 in my case, since I have a 3x scale). Then make the window wide enough that the image spans from top to bottom. I verified that the pixel counts are exactly correct for the black bar and for the game image below it. Screenshot is cut off on the right a little bit due to my screenshot tool:

So…success?

Going back to the simple 2-shader pipeline from before:

(Input) -> [ Shader Pass #0 ] -> (Framebuffer) -> [ Shader Pass #1 ] -> (Backbuffer)

What we’re really doing here is controlling the backbuffer resolution by resizing the window. For fullscreen, maybe this could work using a custom viewport, but that’s a pretty inflexible solution. The viewport settings would have to change depending on the height of the bar and the original content resolution. Maybe feasible for a DS-only setup, but I don’t like it.

So let’s say we do two shader passes: the first is the bar-insert shader, and the second is box-max.slang or something. This way, the bar-insert shader can render into a framebuffer independent of both the input and the backbuffer. Instead of resizing the window, can we set the intermediate framebuffer dimensions in the shader, or in the shader preset?

You can use ‘absolute’ scaling to set the exact number of pixels x and y, but that doesn’t work for a variable bar size.


Thank you so much! I had forgotten about the preset files entirely.

I rewrote the shader file to remove the “gap height” parameter, and just operate off of the standard SourceSize / OutputSize arguments. It adds lines right in the center of the image—however many are needed to bring the source height up to equal the output height. This way, the separation can be controlled from the absolute Y scale in the preset file, without editing the shader file. Making an X version for side-by-side/rotated content should be easy.

I reverted the vertex shader to stock, and moved all the hard work to the pixel-shader step. The logic was easier than I expected. I don’t have any test patterns to run (no 240p test suite for DS), but I think it’s pixel-perfect.

I have to run a pass of box-max.slang (or some other auto-box shader) to preserve the aspect ratio. Without that, the image gets vertically squashed down to the original aspect ratio. {1}

Preset file
shaders = "2"
shader0 = "shaders/splitScreenV.slang"
shader1 = "../auto-box/box-max.slang"

scale_type_x0 = "source"
scale_x0 = "1.0"
scale_type_y0 = "absolute"
# 448 = (192px * 2) + 64px
scale_y0 = "448"

scale_type1 = "viewport"
scale1 = "1.0"
Shader file (splitScreenV.slang)
#version 450

// Split Screen Vertical
// Changes an image to a taller resolution by inserting a monochromatic bar in
//     the center of the image. Aspect and resolution of the top and bottom
//     "screens" is not affected.
// Intended for use with Nintendo DS video.
// Control the size of the gap by modifying the output Y scale in the shader preset.

layout(std140, set = 0, binding = 0) uniform UBO
{
	vec4 SourceSize;
	vec4 OutputSize;
	mat4 MVP;
	float mask_r;
	float mask_g;
	float mask_b;
} global;

#pragma parameter mask_r "Mask color (R)" 0.0 0.0 1.0 0.00390625
#pragma parameter mask_g "Mask color (G)" 0.0 0.0 1.0 0.00390625
#pragma parameter mask_b "Mask color (B)" 0.0 0.0 1.0 0.00390625

#pragma stage vertex
layout(location = 0) in vec4 Position;
layout(location = 1) in vec2 TexCoord;
layout(location = 0) out vec2 vTexCoord;

void main()
{
	gl_Position = global.MVP * Position;
	vTexCoord = TexCoord;  // null vertex stage
}

#pragma stage fragment
layout(location = 0) in vec2 vTexCoord;
layout(location = 0) out vec4 FragColor;
layout(set = 0, binding = 2) uniform sampler2D Source;

void main()
{
	float oldH_px = global.SourceSize.y;
	float newH_px = global.OutputSize.y;
	float gap_px = newH_px - oldH_px;

	float topScrEdge = (oldH_px / 2) / newH_px;  // bottom edge of top screen, normalized
	float botScrEdge = 1.0 - topScrEdge;  // top edge of bottom screen, normalized
	float scrH = topScrEdge;  // height of each screen, normalized (just for code clarity)

	if (vTexCoord.y < topScrEdge) {  // top screen
		float topScrY = vTexCoord.y / scrH;  // pos within top screen
		float y = topScrY / 2;  // pos in source img
		vec2 coord = vec2(vTexCoord.x, y);
		FragColor = texture(Source, coord);
	} else if (vTexCoord.y > botScrEdge) {  // bottom screen
		float botScrY = (vTexCoord.y - botScrEdge) / scrH;  // pos within bottom screen
		float y = (botScrY / 2) + 0.5;  // pos in source img
		vec2 coord = vec2(vTexCoord.x, y);
		FragColor = texture(Source, coord);
	} else {  // screen gap
		FragColor = vec4(global.mask_r, global.mask_g, global.mask_b, 1);
	}
}
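
The y-mapping above is easy to sanity-check on the CPU. Here is the same fragment-stage logic transcribed into C (the function name and the -1.0 “gap” sentinel are my own), using the DS numbers from the preset: a 384 px tall source and a 448 px tall output:

```c
/* Map a normalized output y in [0, 1) to a normalized source y,
 * or return -1.0 for the gap. Mirrors the splitScreenV.slang fragment. */
static double map_y(double y, double srcH, double outH)
{
    double topEdge = (srcH / 2.0) / outH;  /* bottom of top screen */
    double botEdge = 1.0 - topEdge;       /* top of bottom screen */

    if (y < topEdge)
        return (y / topEdge) / 2.0;                    /* top half of source */
    else if (y > botEdge)
        return ((y - botEdge) / topEdge) / 2.0 + 0.5;  /* bottom half */
    return -1.0;                                       /* screen gap */
}
```

Note that the top screen maps onto source [0, 0.5) and the bottom screen onto (0.5, 1], with no overlap, which is where the pixel-perfect behavior comes from.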

There is one remaining issue, which is that “margins” are still added to the image with this method. For instance, this is the smallest window that can display a 1x-scaled output:

In this example, the window is exactly 512x768px, which is an exact 2x scale of the native DS resolution with no gap, as output by the core (256x384px). It appears that the absolute scaling mode just scales the source framebuffer up by an integer, then zooms/centers the output in this “canvas”. So I guess the original aspect ratio is kind of inescapable, huh? I can’t help but question this—doesn’t exactly seem like the expected behavior, and it’s clearly limiting.

Integer scale overscale sometimes fixes this, but not always, and sometimes actually overscales the image, so it’s only a workaround if you’re lucky.

Still, I’m really happy with the progress here. It might be possible to make this work with a custom viewport. Thanks again for your help so far—let me know if you see a solution for the issue above.

{1} Side note: border shaders

Some (maybe all) of the “border” shaders (e.g. sgb.slangp) skip this step, so their aspect ratio is actually incorrect! This also causes non-integer scaling—check out the pixels in the screencap below. However, this spares these shaders from the “margins” described above.

If the absolute scaling mode actually sized the output framebuffer to that exact size, I think these issues would all go away.

shaders = "1"
shader0 = "../shaders/imgborder-sgb.slang"

scale_type_x0 = "absolute"
scale_x0 = "256"
scale_type_y0 = "absolute"
scale_y0 = "224"

parameters = "box_scale;location;in_res_x;in_res_y"
box_scale = "1.000000"
location = "0.500000"
in_res_x = "160.000000"
in_res_y = "144.000000"

textures = "BORDER"
BORDER = "sgb.png"


eyyyy, lookin’ good!

I would guess that it’s the auto-box shader that’s screwing with your margins.

Re: the border shaders, they are indeed affected by the aspect ratio setting. Most are intended to use the ‘full’ aspect to give the most draw area.


Aaaahhhh! That was it!

Setting the scaling mode to “Full” with integer scale off is the magic bullet. I guess those settings affect the whole shader chain, not just the final scale up to viewport resolution.

It’s…perfect 🥲

I’m still using the same box-max.slang pass to preserve integer scaling of the actual content, so we can have our cake and eat it too.

FYI, I tried a border preset (no auto-box) with the Full aspect and couldn’t get correct behavior. The aspect gets stretched to fit the window, and the X-axis is non-integer scaled, even with integer scale on. Maybe I’m missing something? My auto-box method works, but commonality with the other shaders would be preferable I guess.

I want to do a little cleanup and add XY functionality, but after that I can work on a PR if you’re interested. Thank you so much for the help hunterk!


My pleasure! I’m glad you got it working perfectly 🙂

And yeah, I didn’t realize you had aspect/integer scaling settings going on in settings > video. That indeed takes the final output and then makes it conform to its settings.

For the other border shaders, I set them to default to a roughly 4:3-ish aspect (64:49, IIRC?) but you can change it to whatever you want in the shader parameters.

And sure, a PR would be great. Probably good to put it in either the ‘handheld’ or ‘misc’ directory.


Oh, I see. I think I’ll leave my preset as-is, in that case—calculating the aspect per-shader seems like kind of a pain. (I guess it makes sense if the console doesn’t have 1:1 PAR.) I’ll note the recommended scaling settings in the preset file.

XY split is done. I think I’m getting better at this. :wink:

Last issue is that box-max and friends center the image in the viewport, so it overlaps the button overlay on my phone. I’m going to work on a version of box-max with adjustable anchoring to deal with that.


sounds good. Also, since you’re using a preset, ‘handheld’ is probably the best place for it, now that I think of it.


Okay, one last roadblock: mouse/touch input isn’t correctly mapped to the video.

On a touchscreen device, I have to tap far below the bottom of the screen to tap anywhere on the bottom half in-game. The Y-coordinates converge around the top of the bottom screen, but it’s still not exact. I think the X-axis is okay. On a desktop with a mouse, the sensitivity is just really high.

I suspect that upscaling with a shader (box-whatever.slang) instead of letting RA handle it after the last shader pass is the cause. Is there any way to manually control the region of the viewport that’s treated as the “mouse area”?

ooh, that’s a bummer, yeah. I don’t know of any way to modify that, no.

Yeah, this might be the end unfortunately…

It looks like video_driver_translate_coord_viewport() is doing the work here. It maps the mouse x/y pixel coords into normalized [-1, +1] coords within the viewport/window. But instead of either the viewport or the window, we want these coords calculated relative to some arbitrary selection within the viewport. (And for touch input, I guess we want the absolute pixel coordinates offset by the same, too.)

What’s worse is that the geometry of this “selection” is driven by the shader. I don’t think there’s any way to handle it programmatically; maybe for the vertex stage, but not for the pixel stage, with big discontinuities in the middle.

The best I can come up with is to add 4 options to the config file (selection x, y, width, height). Then these get fed into video_driver_translate_coord_viewport(), which outputs normalized “selection” coords, in addition to viewport and window coords. But this is such a niche use-case, it doesn’t really seem appropriate to add all that complexity IMO. Not to mention the user would have to tune these values by hand, so it sucks as a solution, too.
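
For what it’s worth, the remap itself would just be a linear transform per axis. A hypothetical sketch in C (this function is my invention, not an existing RetroArch API; it mirrors what video_driver_translate_coord_viewport() does, but against a user-configured rectangle instead of the whole viewport):

```c
/* Hypothetical helper: remap a pointer coordinate (in viewport pixels)
 * into [-1, +1] relative to a selection rectangle given by its origin
 * and size on that axis. One axis shown; y would work the same way. */
static double translate_coord_selection(double px, double sel_origin, double sel_size)
{
    return 2.0 * (px - sel_origin) / sel_size - 1.0;
}
```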

Maybe it just has to be implemented as a core option after all ¯\_(ツ)_/¯

oof, yeah, 4 core options sucks but that sounds like the only way to do it. There’s no way to pass that sort of information from a core to the frontend to the shader pipeline 🙁