New CRT shader - Any interest?

I have new shader files here (direct link to zip).

I’d say this is a pretty experimental release. It contains a few things.

Mask support

The shaders now have basic mask support. Currently only aperture grille is supported, and only for monitors with RGB or BGR subpixel arrangements. There is no “mask strength” parameter yet. Masks fade out as the area gets brighter so that they do not reduce the total brightness at all. (@kokoko3k, I ended up using the cubic blending as a “safer” default option.)
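As a rough illustration of how a mask can fade out in bright areas without costing any average brightness, here is a minimal sketch. It is not the cubic blend the shader uses. It is a hypothetical linear scheme for one colour channel of an aperture grille, where the channel is "lit" on one of every three output pixels, and it assumes values in linear light. The function names are made up for this example.

```python
def masked_triad(v):
    """Return (lit, unlit) levels for one colour channel of a triad.

    v is the unmasked level in [0, 1]. The channel's own stripe is the
    "lit" pixel; the other two pixels of the triad are "unlit".
    Hypothetical linear scheme, not the shader's cubic blend.
    """
    if not 0.0 <= v <= 1.0:
        raise ValueError("v must be in [0, 1]")
    lit = min(1.0, 3.0 * v)        # the lit stripe carries as much as it can
    unlit = (3.0 * v - lit) / 2.0  # any overflow is shared by the other two
    return lit, unlit


def triad_average(v):
    """Average level over the three pixels of the triad."""
    lit, unlit = masked_triad(v)
    return (lit + 2.0 * unlit) / 3.0
```

Below one third brightness the mask is at full strength (the unlit pixels stay black). Above that the lit stripe saturates and the overflow spills into its neighbours, so the mask has disappeared entirely by full white, while the triad average always equals the input.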

New presets: crt-beans-rgb and crt-beans-svideo

The original shader is now used in two preset files: crt-beans-rgb.slangp and crt-beans-svideo.slangp. They are the same set of shaders, but the S-Video preset has the low pass filter tuned to simulate S-Video signal bandwidth. Different consoles may have output signals with different bandwidths, so I just sort of picked some values that seemed like a reasonable default.

New preset: crt-beans-gaussian

There is a new crt-beans-gaussian.slangp preset. This is similar to the previous RGB preset, but uses a gaussian shape for the CRT spot. I would guess that this is more accurate to real CRTs. The image is slightly blurrier than the RGB preset because the gaussian has fatter tails.

The major advantage of this is that the spot width (and thus the scanline width) can be pushed to 1.2. You can basically get fatter, more overlapping lines. This allows simulating smaller, lower quality CRTs. The normal RGB preset doesn’t work properly with widths above 1.0 (that’s just a mathematical consequence of the spot function it uses).

The disadvantage is that the gaussian preset is much slower than the RGB preset. Because the gaussian’s tails are fatter, each area affects more pixels around it, which means a lot more texture samples are necessary. The scanline pass itself takes about 2x the time of the original, and I’d guess the whole shader takes about 1.6x the time of the original (I don’t have the numbers in front of me, currently).

New preset: crt-beans-monitor (no low pass filter)

There is a new crt-beans-monitor.slangp preset. The goal of this preset is to allow inputs other than 240p/480i or 288p/576i (basically, other than 15 kHz television). Hence the “monitor” name: it can be used for content at VGA/480p and beyond.

This is a little complicated to explain. The original shader can’t be used without the low pass filter. It solves the scanline integral numerically, which requires the low pass filter to resample the input. This new preset solves the scanline integral analytically, which, conversely, requires that there is no resampling of the input. The input is interpreted as a piecewise-constant function and the integral is solved in pieces.
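To make the "piecewise-constant, solved in pieces" idea concrete, here is a one-dimensional sketch. It assumes a Gaussian horizontal profile, which is only an assumption made for illustration (the post does not say which spot function the monitor preset integrates, and the real problem is two-dimensional). With a Gaussian, the integral over each input pixel has a closed form in terms of the error function, so no resampling of the input is needed.

```python
import math


def gaussian_cdf(x, sigma):
    """Cumulative distribution of a zero-mean Gaussian."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def filtered_sample(pixels, x, sigma):
    """Exact convolution of a piecewise-constant line with a Gaussian.

    pixels: input values; pixel i covers the interval [i, i + 1).
    x:      position to evaluate, in input-pixel units.
    sigma:  Gaussian width, in input-pixel units.
    Outside the line the signal is treated as zero.
    """
    total = 0.0
    for i, value in enumerate(pixels):
        # Closed-form integral of the Gaussian over this pixel's interval.
        weight = gaussian_cdf(i + 1 - x, sigma) - gaussian_cdf(i - x, sigma)
        total += value * weight
    return total
```

A real shader would only visit the handful of pixels near x rather than the whole line. The extra math per pixel (an erf evaluation at each edge) is consistent with the trade-off of fewer texture samples but more arithmetic.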

When resampling, I need to pick a number of samples to output for each line (a new image width, basically). The number of samples required changes based on the number of lines there are (the input image height). Unfortunately there is no way to do this within the limits of the slangp preset format. This, and another subtle issue with the presentation of the low pass frequencies parameters, led me to remove the low pass filter as a way to accommodate higher input resolutions.

There are two downsides to this. First is that there is no low pass filter, so you lose that capability. The output is a bit sharper horizontally than it probably should be. Second is that it is slightly slower. The analytical solution generally requires fewer texture samples but significantly more math. The worst performance is for very elongated pixel aspect ratios. For example, a 640x240 input will lead to worse performance than a 640x480 input.

Final thoughts

I may be able to upload some sample images tomorrow (the Streets of Rage screenshots above are pretty representative of the RGB preset).

I don’t know if there is really any interest in the gaussian or monitor presets. They might end up being a bit more confusing. They were interesting to make, though!


No need to reinvent the wheel as we already have Mask Layouts which happen to work with W-OLED/RWGB subpixel structures in CRT-Guest-Advanced and Sony Megatron Color Video Monitor.

You can probably use those as a starting point but of course better, novel implementations are welcomed as well.

So far 4K W-OLED users seem to be covered but this doesn’t seem to carry over to 1440p displays for whatever reason.

I’ll share the posts which led to Sony Megatron Color Video Monitor gaining the RWGB Display Subpixel Layout.


Otoh, sometimes reinventing the wheel may lead to a rounder wheel :wink:


I agree, that’s why I included this part:


Thanks, I will take a look at those posts. There won’t be any reinventing the wheel here—I don’t have an OLED display so I can’t even test any of these. And I assume nobody has any subpixels masks figured out for QD-OLEDs or small pentile OLEDs. It doesn’t really seem possible.


There is a new version available here (direct link here).

The main addition is an option for a dynamically-generated aperture grille. This mask is not dependent on particular subpixel arrangements and can be scaled to arbitrary densities. I have 400-800 phosphor triads per output width as options, which seems reasonable to me. This should hopefully be a good option for weird subpixel arrangements (e.g. QD-OLED or pentile) and TATE mode (where the subpixels will be oriented incorrectly for subpixel masks), but it works well even on normal RGB LCD panels. It really shines at 4k but works reasonably well down to 1080p if you don’t go nuts with the density.
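One way a resolution-independent aperture grille could be generated is sketched below. This is an assumption-laden illustration, not the shader's actual code: it treats the grille as repeating R, G, B stripes of equal width and uses a simple box filter (area coverage) to decide how much of each stripe falls inside an output pixel. The function names are hypothetical.

```python
def stripe_length(s, channel):
    """Total length of `channel` stripes (0=R, 1=G, 2=B) within [0, s).

    Positions are measured in stripe widths, so one triad is 3 units.
    """
    full_triads, remainder = divmod(s, 3.0)
    return full_triads + min(max(remainder - channel, 0.0), 1.0)


def grille_pixel(px, output_width, triads):
    """Mask weights (r, g, b) for output pixel `px`.

    Each weight is the fraction of the pixel covered by that colour's stripe.
    """
    stripes_per_pixel = 3.0 * triads / output_width
    s0 = px * stripes_per_pixel
    s1 = (px + 1) * stripes_per_pixel
    return tuple(
        (stripe_length(s1, c) - stripe_length(s0, c)) / (s1 - s0)
        for c in range(3)
    )
```

With 640 triads on a 1920-pixel-wide output, each stripe lands exactly on one pixel and the result is the same as a plain RGB subpixel mask. At other densities the stripes straddle pixel boundaries and the weights become fractional, which is why the pattern is independent of the panel's subpixel layout.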

I have also reworked the mask blending function that I was using, to fix a bug that caused clipping with certain inputs.

Finally, there are also some performance improvements, most notably for the crt-beans-monitor preset (the analytical scanline method).

@hunterk, the include file for the mask handling is here if you want to take a look. If you are interested in putting this in the repo, I can open a pull request. You may not want the subpixel stuff since you already have many of those masks in your own functions. I may add a mask strength parameter to the blend function, so it might make sense to wait for that (and maybe wait until this is tested a bit).

Here’s an example image at 1080p, with the dynamic aperture grille at 550 phosphor triads:

Even oxipng couldn’t compress the 4k image small enough for this forum, and webp isn’t supported, so you’ll have to see the 4k image at imgbb:

I only have a 4k monitor, so I’ve been testing these at lower resolutions by nearest neighbor scaling them to 200%. It would be great if anyone could let me know how they look on other native resolutions!


Testing on native 1080p. I think it gives pretty good results; congratulations on your work. However, it’s still a pretty heavy shader compared to others that aim for similar results. The problem with 480p content also persists: the shader zooms in, so it can’t be used with that content.


I don’t mind having another mask function in the mix. Probably a good idea to wait until everything settles with testing and whatnot, but yeah, throw us a PR and we’ll get it in there :slight_smile:


Thanks for testing!

What core, operating system, and graphics card are you using when you are seeing the zoomed in 480p problem? It works fine here. I’m using Linux, AMD graphics, and I tested on Flycast because Dolphin is broken on my install. Other cores (bsnes, mupen64, etc) also output something that looks like 480p when they are in interlacing modes or when you use upscaling, and that also works fine. I suspect there might be a Dolphin core or Retroarch bug that is triggering this behavior, but it is hard for me to fix because I can’t replicate it.

Note that I recommend the crt-beans-monitor preset for 480p if you don’t want interlacing. The rgb and svideo presets are tuned for 240p/480i (or 288p/576i, basically 15kHz TVs).

As for the performance, it has become a heavier shader than most, but it is still lighter weight than popular shaders like crt-guest-advanced or crt-royale. Currently the performance breakdown (on my hardware in 4k) is roughly 60% of time doing scanlines, 30% of time doing the glow, and 10% doing other stuff. The glow is already pretty fast for requiring such a wide blur (which is a fundamentally expensive operation). Most (but not all) other shaders have smaller blur radiuses.

The main performance hog is the scanline simulation. From what I can tell, most shaders basically blur the line horizontally, then spread the line vertically (maybe based on the brightness at that location). I actually calculate what a pixel would look like if a round (or round-ish, depending on the preset) spot were to be scanned over the line, varying in brightness and width based on the intensity of the input. I think this approach is in some ways more flexible and is more faithful to the way CRTs actually worked. It is more expensive, though!

A few of the advantages of this approach are:

  • Dark areas of the image have more detail, even horizontally, because the spot is smaller. Brighter areas get blown out in comparison.
  • It doesn’t matter if the input has the pixels doubled horizontally (or tripled, quadrupled, etc). The output will look the same. Some emulators actually do this for various reasons.
  • More generally, the sharpness of the output doesn’t change based on the horizontal resolution of the input. You could actually achieve this without fully simulating the spot, but it can be more expensive and most shaders don’t do it.
  • The scanlines don’t alter the tonality of the image. The dark and bright areas don’t get darker or brighter relative to each other. The average pixel value of the image is only scaled linearly by the maximum spot size parameter. I haven’t worked out the math, but it isn’t clear to me that this is the case with some of the simpler scanline implementations.
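The point about horizontally doubled pixels can be checked with a small one-dimensional sketch. It assumes a Gaussian horizontal profile (an assumption for illustration, not necessarily the spot function the shader uses) and expresses the filter width as a fraction of the line width rather than in input pixels, so the number of input pixels drops out.

```python
import math


def _cdf(x, sigma):
    """Cumulative distribution of a zero-mean Gaussian."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def sample_line(pixels, u, sigma):
    """Gaussian-filtered value at normalised position u in [0, 1].

    sigma is a fraction of the full line width, so the result does not
    depend on how many input pixels the line happens to be split into.
    """
    n = len(pixels)
    return sum(
        value * (_cdf((i + 1) / n - u, sigma) - _cdf(i / n - u, sigma))
        for i, value in enumerate(pixels)
    )


line = [0.1, 0.9, 0.4, 0.7]
doubled = [v for v in line for _ in range(2)]  # each pixel repeated twice
```

Evaluating `sample_line(line, u, 0.05)` and `sample_line(doubled, u, 0.05)` at the same `u` gives the same value up to floating-point rounding, because the doubled line describes exactly the same piecewise-constant function.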

I think it’s valid to ask whether it’s worth it, though. If you favor performance, that’s a reasonable preference. I could make a faster version that works more like other shaders, I’m just not sure that I want to give up what I think makes this shader unique.


Just quickly tried your shader and I like it a lot!

I’m on a 1080p monitor, so I had to set MaskType = "2.000000". Other than that, it is good to go “out of the box”, without changing much else!

Well done! :+1:


It sounds like it will be especially useful for Ares, which also uses our shaders but has problems with some of them due to its doubling/tripling/etc of the horizontal res on some of its cores.

Make a fast version if you like, but I will warn you that it will never be fast enough lol. Someone will always have a weaker device that they want to use it on, and if you just make it a little bit faster, it’ll be perfect, I promise :stuck_out_tongue:


Sorry it took me so long to reply; I was short on time.

Don’t worry much about what I said. Your shader is already good, and produces gorgeous results, definitely among the accurate ones. I’m far from being an expert, so I won’t argue, but I want to explain myself further:

  1. I have three devices (two of them with integrated GPUs) and your shader runs at full speed on all of them. Far from being an “omg so slow” issue, it’s just that there are already other works which emulate the beam dynamics of scanlines and are a bit lighter (like crt-hyllian and crt-royale-fast). I imagine their approach isn’t the same as yours, and the results are different, but perhaps there’s still something that can be improved about your shader’s performance. If there’s not, it’s still not the end of the world and it won’t change the already beautiful picture quality it provides. It’s just that having a slower shader may tip people into thinking “why not use guest’s instead”.

  2. I really forgot to clarify on my second report, but, yes, I was talking about the Dolphin core: your shader does not work on it, whereas many others do (be they simple or complex ones). The issue is still the same: seems like it divides the image into a 2x2 grid and zooms into the top left one. I tested on native resolution only. I also have a Linux + AMD device and Dolphin works fine there, using the AppImage version of RetroArch. Other 480i/p cores are fine indeed (like Flycast or LRPS2).

Those are the only two issues I really found with your shader. Don’t think I’m dissing your work, because I really like it. I’m fully aware the situation is not favorable either: there are already plenty of good CRT shaders around, so what more can be done? I recognize you’re fighting an uphill battle. However, I still believe you bring something interesting to the table: your shader builds upon the (classic, but outdated) GTU and provides an excellent middle ground for accuracy without a swarm of options.

Anyway, since I have a 1080p screen, feel free to ask for some tests, and I’ll do my best to run them when I have the time. From a little less than two hours of testing, I can already say it’s pleasantly usable on 1080p, although I had to switch to the dynamic mask for better results.


I was able to get Dolphin working from the Flatpak and I didn’t encounter the zooming behavior. I’ll try with some more settings and see if I can figure out what’s going on. Is it a Windows/Nvidia device where you see this problem?

This is basically what I’m aiming for. I didn’t necessarily want it to do everything, but I want what it does do to be top-notch with straightforward configuration options, with a focus on faithfulness to the way a CRT works.

I will think some more on the performance aspect and see if I can come up with anything. I think the biggest gains would come from splitting the horizontal and vertical calculations for the scanlines into separate passes, but that doesn’t work with the math as-is. It might be worth comparing to see if the quality difference is noticeable, though.

Thanks for testing, I appreciate it!


All my devices are Linux ones, although with different distributions. I will do further testing with Dolphin soon and see if the issue happens on every one of them. Hopefully, I will be able to pinpoint it or blame it on some goofiness of mine.

By the way, here’s a single comparison of crt-beans against crt-hyllian and crt-royale-fast, all shaders which emulate the beam dynamics of a CRT. The three pictures look great to me, with minor differences that boil down to personal preference. Definitely usable already; I haven’t detected any glaring issues yet. As usual with proper scanline emulation, moire patterns happen, but they can be mitigated by adjusting the phosphor triad count (the default values are fine).

Note: all shaders are using a composite LUT.

crt-beans-monitor (sRGB gamma and dynamic mask)

crt-hyllian

crt-royale-fast (using a slot mask)


Thanks, @md2mcb. I agree that those images look pretty similar. Especially at 1080p with a mask, the scanline probably doesn’t need to be simulated really accurately.

I’ve been thinking about a potential faster version, and I came up with something by splitting the scanline calculations into two passes. It’s a bit of a hack and probably still needs some tuning. The scanline dynamics are not as accurate, but I think it looks reasonable. I removed the glow as well, so it’s somewhere between crt-pi and crt-easymode in terms of speed.

The nice thing is that it keeps the resolution independence (the horizontal resolution still doesn’t affect how sharp the pixel transitions are) and also preserves the tonality of the original image.

I’m away from my computer but might have the code up in a couple days.

sor2-fast


It’s good. Having two options, one of them faster, greatly enlarges your audience. Most of the lighter shaders are stuff from many years ago, and there’s a shortage of lighter shaders that implement modern solutions. However, your default, heavier shader is also another great option for accuracy. Yes, it’s a shame it’s a bit heavier than intended, but the results are satisfactory and that shouldn’t be overlooked. The way it is now, I think crt-beans can easily fit into the official repository. I hope you can upload it there after you feel content with some adjustments.


There is a new version available here (direct link here).

  • The fast shader is available now. I still don’t know if I’m happy with it, but it might be useful. Splitting the scanline simulation into two passes is very fast but it limits how the scanline width can be varied while preserving the rounded edges. The fast shader also has no glow, which is a relatively slow effect.
  • I’ve made the dynamically-generated mask the default. This way it should generally look reasonable on any resolution or subpixel arrangement as long as it’s 1080p or higher. The subpixel masks are still there as a fallback.
  • There is a new interlacing option called “weave” that is available for all presets. This basically draws the even field and the odd field, then combines them. On cores which output 480p (e.g. Dolphin and Flycast), it gives a 480i look without the flickering, rather than a sharper 480p look. On cores which already output “woven” 480i, it also has no flickering but does have combing artifacts. I think this looks good on Dolphin and Flycast, and is also a good option if your monitor can’t handle the flickering of the normal interlacing simulation. Keep in mind that there is a performance cost because it is considering twice as many scanlines per pixel.
  • There is also a new “VGA” interlacing option available on the monitor and fast presets. This isn’t actually interlacing at all—it is VGA line doubling for mode 13h DOS games and similar. Any resolution under 350 pixels tall will have its lines doubled. VGA cards did this automatically. You’ll probably want a reasonably high resolution monitor for this, otherwise it’s hard to accurately draw so many lines. There shouldn’t be a performance cost to this, but I haven’t benchmarked.
  • Banding in the glow was annoying me, so I added a simple blue noise dither to the output. In a dark room, you might be able to see the difference in these two images at 100% (I turned the glow way up).
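The VGA line-doubling rule and the field split behind the “weave” option are simple enough to sketch. The threshold of 350 lines comes from the description above. Everything else here (function names, representing a frame as a list of rows) is made up for illustration, and how the shader actually combines the two rendered fields is not specified in the post.

```python
def vga_scanline_count(input_height, threshold=350):
    """Number of scanlines drawn for a given input height.

    Inputs shorter than `threshold` lines have every line doubled,
    e.g. a 320x200 mode 13h image is drawn as 400 scanlines.
    """
    return input_height * 2 if input_height < threshold else input_height


def split_fields(rows):
    """Split a progressive frame into (even_field, odd_field).

    rows is any sequence of image rows, top to bottom. A weave-style
    renderer would draw scanlines for each field and then combine them.
    """
    return rows[0::2], rows[1::2]
```

For example, `vga_scanline_count(200)` gives 400 while `vga_scanline_count(480)` stays at 480, and splitting a 480-row frame yields two 240-row fields. Drawing both fields for every output pixel is also why weave costs more than ordinary single-field rendering.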

No dither:

With dither:

Or you can compare with imgsli (I can’t see a difference on mobile):

https://imgsli.com/Mzc3NTQ5
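For anyone wondering why dithering helps with banding, here is a minimal sketch of the principle. It uses uniform white noise from Python's `random` module as a stand-in. The shader uses blue noise, which distributes the error more pleasantly across neighbouring pixels but relies on the same idea. The function names and the test value are illustrative only.

```python
import random


def quantize(value, levels=255):
    """Round a value in [0, 1] to the nearest 8-bit code."""
    return round(value * levels) / levels


def quantize_dithered(value, noise, levels=255):
    """Quantize after adding noise in [-0.5, 0.5) quantization steps."""
    q = round(value * levels + noise) / levels
    return min(max(q, 0.0), 1.0)


rng = random.Random(0)  # fixed seed so the example is repeatable
value = 0.1003          # a level that falls between two 8-bit codes
samples = [quantize_dithered(value, rng.random() - 0.5) for _ in range(20000)]
average = sum(samples) / len(samples)
```

Without dither, every pixel at this level snaps to the same code, so a smooth glow gradient turns into visible steps. With dither, neighbouring pixels land on one of the two nearest codes in the right proportion, so the local average stays close to the intended value.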


There is a new version available here (direct link here).

I fixed a bug in the new VGA line doubling that led to incorrect scanline drawing.


Beans, I’ve been testing your shader. I’d like to give some proper feedback, but I don’t like to rush these things, I need to get the feeling of your work first. Just know that I appreciate your efforts and think they’re valid. It’s another good shader for the collection.

I can say two quick things:

  1. The fast version is not bad. Given what you wrote, I thought it would be much worse, but it’s actually satisfactory. Understandable that you may dislike it at first, as it’s just a stripped down version of your main work, but it also gives modest hardware a taste of your skills. Think of it as another way to express your personal view of how a crt shader could look good.

  2. Do you think there could be a way to set proper gamma controls? Normally, when you mess with beam width / spot size, you can end up with a brighter or darker picture, and you could compensate for that with gamma control. Right now, sRGB gamma looks nice on default values, but a tad darker than similar shaders (and messing with the spot size may darken the visuals too much). The 2.2 gamma washes out the picture quite a bit, it doesn’t necessarily brighten things up.

All in all, still testing. I like the approach you took with your masks; it’s very malleable. It’s not easy to make this kind of shader look fine at 1080p.


Thank you for your testing!

I could add a gamma parameter. The reason that I haven’t is that the current settings are basically “correct” from a theoretical point of view.

I am using 2.4 as the CRT gamma, which is basically standard. This does result in a slight darkening of the image as compared to the input without a shader, but it should accurately capture the gamma of an average CRT. CRTs varied a lot and people changed their own settings, so if I added a gamma parameter it would probably be for this.

The output gamma is pretty standard, and it should be set to match your screen. Most screens will be a simple 2.2 gamma, but some are actually calibrated to sRGB (with the linear portion in the shadows). They look very similar except for the dark shadows. If this isn’t set to approximately match your screen, you can get weird effects. The tonality of the image will not be accurate, the dither may not work correctly, and the scanline shapes may even be altered.
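The gamma handling described above can be sketched as follows. The 2.4 decode and the 2.2 power-law encode are taken from the text, and the piecewise sRGB encode is the standard sRGB transfer function. The function names are hypothetical and this is not the shader's code.

```python
def crt_decode(signal, gamma=2.4):
    """Convert a CRT input signal in [0, 1] to linear light."""
    return signal ** gamma


def encode_power(linear, gamma=2.2):
    """Encode linear light for a display with a plain power-law gamma."""
    return linear ** (1.0 / gamma)


def encode_srgb(linear):
    """Encode linear light with the standard piecewise sRGB curve."""
    if linear <= 0.0031308:
        return 12.92 * linear  # linear segment in the deep shadows
    return 1.055 * linear ** (1.0 / 2.4) - 0.055
```

Running a mid-gray signal of 0.5 through `encode_power(crt_decode(0.5))` gives a value a little under 0.5, which is the slight darkening mentioned above. The two output encodings agree closely through the midtones and differ mostly near black, where sRGB switches to its linear segment.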

Regarding the brightness of the shader, if you set the maximum spot size to 1.0 it should be as bright as the original image if it was corrected for the difference between a CRT and modern screen gamma. The scanlines themselves would not actually darken the image at a maximum spot size of 1.0. Any value below that will decrease the brightness proportionally. That’s pretty much the only parameter that currently affects the brightness as a whole. The glow settings can sort of redistribute brightness, but they should have a minor effect on the average brightness of the image. The current mask blending method maintains the brightness of the pre-mask image on average (although, again, it sort of redistributes it). The minimum spot size also will not brighten or dim the image.

The low pass filter in the rgb and svideo presets can also dim small, bright detail. If you turn it up to 6.0MHz or use the monitor preset (which has no low pass filter), you may notice a difference. But this is faithful to the way CRTs worked as well. See @cgwg’s old post here:

Video amplifier. This is somewhat connected with the previous step, since the problem can be treated as an issue of signal processing. Since this applies before the gamma ramp, it can cause darkening of high-contrast areas — this is why gamma charts traditionally use horizontal lines.

The low pass filter is designed to mimic the limited bandwidth of the video amplifier (and other components in the signal path).

One thing that I have been careful about in this shader is not affecting the tonality of the image. The highlights don’t get rolled off and the shadows don’t get crushed or brightened. For example, here’s the output from one of my tests. I generate full screen grayscale images at varying brightness levels (corrected for CRT gamma) and run them through the shader. The x axis is the input brightness and the y axis is the output brightness. The solid, light blue line represents no change in brightness.

In this case, I used a maximum spot size of 0.9, so the actual values are shifted down slightly, representing a slight overall darkening. However, the curve stays linear. Changing the gamma will make this nonlinear, and this would change the tonality of the image.
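The shape of that test can be imitated on a toy model. The sketch below is not the shader. It assumes, purely for illustration, that each scanline is drawn as a Gaussian vertical profile whose area is proportional to the input level times the maximum spot size, and whose width grows with brightness. The width range (0.3 to 0.5 line pitches) and all names are made up. Because the area of each line does not depend on its width, the average output stays proportional to the input.

```python
import math


def spot_sigma(v, min_sigma=0.3, max_sigma=0.5):
    """Toy spot width (in line pitches) that grows with brightness."""
    return min_sigma + (max_sigma - min_sigma) * v


def mean_output(v, max_spot=0.9, samples=256, neighbours=4):
    """Average output over one line pitch for a flat gray input level v."""
    if v == 0.0:
        return 0.0
    sigma = spot_sigma(v)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))  # unit-area Gaussian
    total = 0.0
    for j in range(samples):
        y = (j + 0.5) / samples  # vertical position within the pitch
        for k in range(-neighbours, neighbours + 1):
            # Contribution of the scanline centred k pitches away.
            total += max_spot * v * norm * math.exp(-0.5 * ((y - k) / sigma) ** 2)
    return total / samples


levels = [i / 10 for i in range(11)]
curve = [(v, mean_output(v)) for v in levels]
```

In this toy model every point of `curve` lies on a straight line through the origin with slope equal to the maximum spot size, mirroring the linear-but-shifted-down result described above. Applying a gamma adjustment to the output afterwards would bend that line.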

All this is to say that, basically, it is complicated. The brightness of the shader is always at the maximum level that it can be without affecting the tonality of the image, and the current options guarantee linearity. Adding a simple gamma adjustment to brighten the output will cause details to be lost in the bright areas compared to an actual CRT, as it does in other shaders with such a feature. I don’t mean to rule it out as a future option, but that’s why I haven’t included it so far.
