The Scanline Classic Shader

The bezel shader consumes more power than the entire rest of the shader chain combined, so if you have performance issues, start there.

The next heaviest shader is scanline-advanced, largely because of the anisotropic filtering used for masks and geometry.

Each shader has a bypass parameter that eliminates most processing for that shader stage. I made it for debugging and fun, but it also has the benefit of helping to find performance bottlenecks.
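As a rough illustration (not the actual shader code), the bypass is conceptually just an early-out at the top of each pass:

    // Hypothetical GLSL sketch: a bypass parameter short-circuits one pass.
    // Names (run_stage, bypass, source) are illustrative, not the shader's own.
    vec3 run_stage(sampler2D source, vec2 uv, float bypass)
    {
        vec3 col = texture(source, uv).rgb;
        if (bypass > 0.5)
            return col;    // pass the previous stage through untouched
        // ... expensive scanline/mask/geometry work would go here ...
        return col;
    }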

While I’m developing this on a modern rig, I have a laptop with an iGPU that I’ll use to develop the scanline-basic shader. Originally scanline-basic was just a scanline shader without geometry or phosphor simulation (and it is the one, ported to GLSL, that I still use to this day for 86Box). I’m not sure what scope the basic shader will expand to cover, probably a simplified composite/S-video chain (e.g. notch filter only and a single YUV filter stage). Work on that probably won’t happen for a while, though.

The plan right now is to make presets for all the major consoles from the NES to the N64. This will cover the majority use case of RetroArch and give enough of a sample to work through a lot of bugs. (I seem to have already fixed the scaling issue we discovered, so with the next release you can use an arbitrary scaling factor like 3x instead of 4x and get some performance back.)

Most of the screenshots of the composite/S-video shader you see, the ones from before I did the bezel, use a single-pass composite filter. It was a clever implementation, but I soon ran into limitations based on the degree of analog modeling I wanted. Still, I think the single-pass shader gives good results, and I may salvage it for scanline-basic if it does indeed perform better than the two-pass method.

4 Likes

I have come up with something of a sharpening method. I have found the usual sharpening methods to be … clinical. Unsharp mask is probably the most aesthetically pleasing but lacks the analog goofiness I’m trying to discover.

The screenshots here show a completely unsharpened image and a sharpened image with a user sharpness of 0.5, both using the notch filter. As the sharpness increases you can see the subcarrier start to come out. Sharpening is applied to luma only, and I am pretty certain it will be a composite/RF-only feature of my shader, for simplicity. The moire patterns you see are interesting; I believe they are due to scanline-advanced’s geometry and scanlines interfering with the subcarrier. You can also, just barely, see a jailbar effect from the sharpening filter itself.
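For illustration only, luma-only sharpening is conceptually something like this (a generic GLSL sketch with hypothetical names, not my actual filter):

    // Hypothetical GLSL sketch of luma-only sharpening: a simple horizontal
    // high-boost on Y while U and V pass through.  This is a generic
    // illustration, not the filter described above.
    vec3 sharpen_luma(sampler2D yuv_tex, vec2 uv, vec2 texel, float sharpness)
    {
        vec3 c   = texture(yuv_tex, uv).xyz;                     // (Y, U, V)
        float yl = texture(yuv_tex, uv - vec2(texel.x, 0.0)).x;  // left neighbour
        float yr = texture(yuv_tex, uv + vec2(texel.x, 0.0)).x;  // right neighbour

        // High-boost: add back the difference between the centre sample and
        // the local mean.  Any residual subcarrier left in luma is amplified
        // along with the detail, which is the effect described above.
        c.x += sharpness * (c.x - 0.5 * (yl + yr));
        return c;                                                // chroma untouched
    }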

Now, in motion you won’t notice the effect as strongly because the subcarrier shift helps cancel it out, and I think the 0.5 value does look better than the unfiltered image. The ‘correct’ way to calibrate sharpness is to generate a sweep pattern and set the sharpness so that you can see as many lines as possible without causing brightness shifts, ringing, etc. I can’t really do that because the SNES 240p Test Suite doesn’t have that kind of pattern, and it would be too low resolution anyway, but the 0.5 value seems more correct than the unfiltered image.

I may have to finagle with it more to keep the subcarrier out. The issue is that it’s hard to sharpen so broadly while also keeping the subcarrier attenuated.

3 Likes

I’ve been playing around with masking methods. In order to avoid the pixel grid look, we have to do this:

  1. Filter the sampled input before applying the mask
  2. Apply the mask
  3. Filter again on upscale (or downscale)

This essentially creates two separate upscaling stages, which makes the overall picture blurrier. However, we can at least control which filters we use at each stage. For now I am using bicubic filtering for both, though it may be better to switch the second upscaler to Lanczos3 for more sharpness. The bicubic filter has been well received as mimicking the beam spot.
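For reference, one common ‘soft’ bicubic kernel is the cubic B-spline, which has a rounded, beam-spot-like profile (shown here as a GLSL sketch of the general kernel shape, not necessarily the exact weights used in the shader):

    // GLSL sketch of the cubic B-spline weight, one common "soft" bicubic
    // kernel with a rounded, beam-spot-like profile.  The shader's exact
    // kernel weights are not specified here, so treat this as illustrative.
    float bspline_weight(float x)
    {
        x = abs(x);
        if (x < 1.0)
            return (2.0 / 3.0) - x * x * (1.0 - 0.5 * x);
        if (x < 2.0)
        {
            float t = 2.0 - x;
            return t * t * t / 6.0;
        }
        return 0.0;
    }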

The design parameter of the mask is TVL, but TVL is not well defined in relation to a shadow mask, so I tried to keep it simple: 1 TVL is equal to one RGB cluster horizontally. I think this is the most intuitive definition because the patterns for the three mask types (shadow mask, slot mask, aperture grille) are the same horizontally.

To get a better mask, black space is inserted around each dot in a pattern according to the mask type. This is the spatial equivalent of ‘black frame insertion’ and plays on our perception to increase apparent definition and clarity. The ‘black’ does not need to match the black level; simply being dark enough relative to the bright dots is good enough. A feathering technique reduces the black a bit and blends the dots slightly, which essentially acts as an anti-aliasing filter.

Aperture grille and shadow mask look okay, but the slot mask has some issues I need to work out. Scanlines are present here but reduced to a 50% weight to give more clarity to the mask. The TVL was set to 325 to represent a low-TVL screen, doable at 4K. A high-TVL screen would need more resolution, but I am working on a grayscale mask that can support high TVL at a lower resolution (even if I had an 8K screen, I don’t think my video card could support development of Scanline Classic on it). The shadow mask being darker is the correct behavior, but the slot mask is a little too light, and its horizontal lines are too thin.

Shadow Mask

Slot Mask

Aperture Grille

5 Likes

Interesting shadow mask: normally each dot in the mesh is a single color (R, G, or B), but here each dot contains a whole RGB triad. Is this intentional?

Also, have you considered how the subpixels will interact with the mask?

2 Likes

Each dot is a single color. The black space between dots creates a honeycomb effect because aligned R, G, and B dots blend together as light more than R blends with B. If the feathering is reduced, the dots become more distinct, but at the cost of more moire.

Subpixel alignment requires assumptions about the colorspace and gamma of the output display. The use of dark regions is intended to reduce, as much as possible, any effect subpixel alignment would have.

1 Like

I don’t think you need a higher resolution screen to simulate a high TVL.

I find the dots to be too tall and large in this mask. There’s no need to reinvent the wheel when it comes to mask implementation. There has been a lot of science, research, discussion and development on that front and different approaches have been successfully taken by CRT-Geom, CRT-Guest-Advanced, Koko-AIO, Sony Megatron Colour Video Monitor, CRT-Royale and more.

Please zoom in and take a look at the masks in these screenshots:

1 Like

This is just a WIP. The exact number of dots and the shape of the dots need to be adjusted for the aspect ratio. When I post some more screenshots tomorrow, you’ll see the shadow mask dots come out circular.

The goal for this mask technique is to be independent of input and output resolution. The only true-to-life parameters aside from the mask type are the TVL and the dot order (RGB, BGR, etc.). I could have used dot pitch, but that’s messy because it scales with the size of the actual screen, and it’s fiddly to have to enter physical sizes. So I chose to make TVL correspond to the number of triads horizontally. This is somewhat arbitrary; different manufacturers have different definitions of TVL, so I went with Sony’s, where TVL is simply the number of horizontal phosphor triads on the screen.

Each triad requires six samples to resolve, purely because there are black samples in between the dots, so for a 4K screen at 4:3 that gives 2880 / 6 = 480 TVL, a medium TVL. However, we can reach a higher TVL if we allow the dots to blend, so I implemented that. The dot blending also reduces brightness loss, which I think is very good for people still on SDR. Scanline Classic will remain an SDR shader, and even if I get to implementing HDR down the road, I still want it to be accessible in SDR and 1080p use cases.
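As a sanity check on that arithmetic (a throwaway helper, not part of the shader):

    // Throwaway GLSL helper for the arithmetic above: the highest TVL that
    // resolves without dot blending, assuming 6 horizontal samples per triad
    // (a dot plus a black gap for each of R, G, B) and a 4:3 active area.
    float max_unblended_tvl(float display_height_px)
    {
        float active_width_px = display_height_px * 4.0 / 3.0;  // 2160 -> 2880
        return active_width_px / 6.0;                           // 2880 / 6 = 480
    }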

There is also the marriage with the geometry technique. Geometry has to be applied both before and after the mask, so the mask has to be able to cope with that without breaking down. (This is another reason why subpixel rendering has to be ignored: with screen curvature, the subpixel anti-aliasing would have to be calculated for each individual pixel.)

3 Likes

Here is the almost-complete pipeline rendering for RGB, missing only color correction and the bezel. The geometry needs some tuning, and the phosphor tails are maybe too strong…

2 Likes

Interesting. Is it that you’re not trying to emulate phosphors down to the subpixel level but are capping it at the pixel level, so one emulated phosphor colour can be represented by one entire display pixel?

Yes, we can’t make use of the subpixels as phosphor dots because of the screen curvature and the phosphors being a slightly different color.

2 Likes

The original Scanline Classic shader directly transforms color coordinates to sRGB and clamps the result. The new version will have a chromatic adaptation transform and gamut compression available. First, the chromatic adaptation. The test input is SMPTE C bars simulating an NTSC-J monitor (NTSC-J primaries, D93 white point). Zebra stripes mark out-of-gamut colors (this is one of the many new debug tools available).

# Test NTSC-J
R_X = "0.618"
R_Y = "0.350"
G_X = "0.280"
G_Y = "0.605"
B_X = "0.152"
B_Y = "0.063"
R_WEIGHT = "0.2243"
G_WEIGHT = "0.6742"
B_WEIGHT = "0.1015"

No chromatic adaptation:

In this test pattern we see that red is the only strongly blown-out color (there is slight oversaturation in cyan and magenta, visible as faint zebra stripes). That’s good. The drawback of a direct conversion is a loss of dynamic range. We are also fighting against the inherent non-linearity of the display: the closer we are to the white point, the more linear and accurate the display is. Chromatic adaptation allows us to estimate the ‘adapted’ response, where our eyes compensate for the difference in white point, and we get back dynamic range and linearity.
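For reference, the zebra overlay is conceptually just this (a simplified GLSL sketch, not the exact debug code):

    // Simplified GLSL sketch of the zebra-stripe indicator: overlay diagonal
    // stripes wherever any component leaves [0, 1].  The pattern and names
    // are illustrative, not the actual debug code.
    vec3 zebra_out_of_gamut(vec3 rgb, vec2 frag_coord)
    {
        bool oog = any(lessThan(rgb, vec3(0.0))) ||
                   any(greaterThan(rgb, vec3(1.0)));
        if (!oog)
            return rgb;

        // Alternate light/dark bands along the diagonal every 8 pixels.
        float stripe = step(0.5, fract((frag_coord.x + frag_coord.y) / 16.0));
        return mix(rgb, vec3(stripe), 0.5);
    }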

Bradford Transform

The traditional Bradford transform includes a nonlinear transformation on blue. I have retained this because tests show a significant enough difference between the nonlinear and linear versions of Bradford. We can see clearly that a chromatic adaptation transform does not fix our issues with out-of-gamut colors, and it even introduces yellow and green blow-outs; a separate gamut mapping process is required to fix this. The white point is successfully transformed, but we can see how blue and cyan are darkened.
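For context, the linearized (von Kries-style) Bradford adaptation looks roughly like this in GLSL; the nonlinear blue term I mentioned is omitted from this sketch:

    // GLSL sketch of a linearized Bradford adaptation (von Kries scaling in
    // Bradford cone space).  The nonlinear blue term mentioned above is
    // omitted here; function and variable names are illustrative.
    vec3 bradford_adapt(vec3 xyz, vec3 src_white_xyz, vec3 dst_white_xyz)
    {
        // Bradford cone-response matrix, written row by row and transposed
        // to match GLSL's column-major constructor.
        mat3 M = transpose(mat3(
             0.8951,  0.2664, -0.1614,
            -0.7502,  1.7135,  0.0367,
             0.0389, -0.0685,  1.0296));

        vec3 lms   = M * xyz;
        vec3 src_w = M * src_white_xyz;
        vec3 dst_w = M * dst_white_xyz;

        // Scale each cone channel by the ratio of destination to source
        // white, then return to XYZ.
        return inverse(M) * (lms * (dst_w / src_w));
    }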

Zhai-Li CAT16 Method

This uses a more up-to-date chromatic adaptation model. I don’t know whether or not it’s more accurate, but the blue channel looks brighter.

2 Likes

Color correction results

Gamut compression scales down luminance for high values and scales down chrominance for negative values.

Uncorrected, Rec. 709 primaries, D65

Corrected, NTSC-J primaries, D93 white point ‘absolute colorimetric’

Corrected, NTSC-J primaries, D93 transformed to D65 with gamut compression ‘perceptual’

5 Likes

How are you performing the gamut compression? One tool that I’ve used is this LUT generator https://github.com/ChthonVII/gamutthingy and I’ve heard of another LUT tool from ReShade.

The simple method scales colors down by the maximum component value whenever at least one component is greater than 1.0.

The advanced method converts to CIELuv space, scales L down until all components (back in RGB) are less than or equal to 1.0, then scales u and v until all components are greater than or equal to 0.

It’s not really compression; it’s controlled clipping. This method allows us to convert any arbitrary RGB colorspace (or RG, or monochrome) to sRGB.
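Roughly, the simple method is just this (a sketch, not the exact shader code):

    // Sketch of the simple method: when any component exceeds 1.0, divide
    // the whole color by its maximum component.  This preserves the ratios
    // between channels, which is why it behaves like controlled clipping.
    vec3 clip_by_max_component(vec3 rgb)
    {
        float peak = max(rgb.r, max(rgb.g, rgb.b));
        return (peak > 1.0) ? rgb / peak : rgb;
    }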

EDIT: It seems the gamutthingy tool addresses a different use case. The color correction shown in these screenshots maps the phosphor gamut to sRGB space; it is not simulating analog conversion circuitry. That will be addressed in a separate stage.

Gamutthingy includes both the phosphor gamut conversion and nonstandard B-Y/R-Y demodulation wrapped together; you can perform one, the other, or both. The compression algorithm is for the phosphor gamut conversion, but it takes the B-Y/R-Y demodulation into account as well, to avoid compressing colors that won’t be output by the circuits.

I don’t see why a generic LUT shader couldn’t be used in place of the color output stage, so you could use your gamutthingy LUTs that way.

Well this is interesting … I may have accidentally discovered a way to easily make up lost brightness while maintaining gamma.

7 Likes

Can you provide details? I’m curious what this technique is.

I basically blend in the unmasked scanlines, starting when the mask is fully saturated.

The mask is made with a series of Gaussian functions. Normally you are supposed to scale the output of a Gaussian by its integral, because the energy is supposed to be distributed across the function, but I just let it run free, so when you increase the sigma of these functions the mask dots blend into each other.
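A rough sketch of that idea (not the exact code): each dot contributes an unnormalized Gaussian lobe, so raising sigma raises the total energy instead of conserving it.

    // Rough GLSL sketch: each dot is an unnormalized Gaussian lobe.  Without
    // the 1/(sigma*sqrt(2*pi)) normalization, increasing sigma raises the
    // total energy, so neighbouring dots blend together and the mask
    // brightens instead of conserving energy.
    float dot_lobe(float x, float dot_center, float sigma)
    {
        float d = x - dot_center;
        return exp(-0.5 * d * d / (sigma * sigma));
    }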

I then changed the User Picture and Brightness settings to operate on (virtual) luminance directly instead of voltage (and being capped by the distortion modeler) so that I could pump in values greater than 1.0 (to test my gamut compression algorithm). These two things essentially turned the CRT half of the pipeline into an HDR renderer. The zebra stripe mode I implemented allowed me to quickly scale the input without needing to measure. A tone mapper can give a little extra range, but if it’s pushed too hard the gamma starts to break down.

I then spent some time on math (I am not good at math). As long as you are working in a linear space and avoid exponentiating your input by itself, it seems it should be possible to blend the image back to the original input level (at the expense of mask sharpness, or whatever else you’re doing that cuts signal level):

https://www.desmos.com/calculator/6nl9xleqjo
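In shader terms the idea is roughly this (a simplified sketch; the ramp here is illustrative and the Desmos link above has the actual math):

    // Simplified GLSL sketch of the recovery blend: work in linear light and,
    // once the masked signal saturates, fade the unmasked scanline value back
    // in so the energy lost to the mask is made up without distorting gamma.
    vec3 recover_brightness(vec3 masked, vec3 unmasked)
    {
        float peak  = max(masked.r, max(masked.g, masked.b));
        float blend = clamp(peak - 1.0, 0.0, 1.0);  // 0 below saturation
        return mix(masked, unmasked, blend);
    }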

6 Likes

There is a new beta available:

https://github.com/anikom15/scanline-classic/releases/tag/v6.0.0-beta2

You will find two kinds of presets: professional and consumer. I have still focused on the SNES only, but I also added an N64 preset as a bonus. I hope comparing the two helps in understanding the settings better. If the moire from the slot mask bothers you, try shadow or aperture instead. I tried to organize the settings as best I could. Any feedback is welcome.

The next update will have RF and the remaining 16-bit consoles.

  • New parameter system
  • Sharpener circuit
  • Improved color correction
  • New tone mapper and gamut compressor
  • New masks
  • 1 wide color gamut preset included
  • Full HDRR pipeline (SDR output only; HDR output TBA)
  • Optimizations
5 Likes