Duke4.net Forums: Scaling sprites by combining waifu2x and xBRZ output - Duke4.net Forums


Scaling sprites by combining waifu2x and xBRZ output

User is offline   MrFlibble 

#31

View PostMrFlibble, on 04 October 2018 - 10:11 AM, said:

It's also worth noting that I tried scaling up Doom sprites, with the results broadly similar to what another user achieved with NVIDIA GameWorks Super Resolution neural network scaling (I doubt that one was trained on anime images).

Further on this, I tried scaling up some Doom textures and I have to say there's a noticeable difference: waifu2x blurs colours a lot, especially in low-contrast areas, whereas the NVIDIA network keeps the image sharp, even though there's some noticeable ringing at times.
1

User is offline   MrFlibble 

#32

Alrighty, following some feedback from the discussion in the NVIDIA GameWorks SuperResolution thread, I'm trying to figure out how to counter the blur produced by waifu2x (other than attempting to retrain it on different images; I'm not feeling ready to try that out yet).

The thing is that the NVIDIA network is indeed capable of creating images that appear more or less redrawn at higher resolution by hand (check out a couple of comparisons I posted here), while waifu2x, despite creating smooth curves, produces a generally blurry image. Adding noise only partially addresses this issue.

There's a filter called Rock in IrfanView which creates a kinda-sorta emboss effect, accentuating the edges of areas with different colours. I quickly ran it on a previous monster test:
Spoiler

The image was scaled up 4x with waifu2x and then the Rock filter was applied at 4x. I then scaled it back down to 2x with Sinc3. Ignore the greenish halo at some of the sprites' edges; this was a quick test without the extra steps to remove those.

I'm not completely happy with the result though. Is there another way to get rid of the blur and achieve a pixel-art-like sharpness? Simply sharpening the image or applying unsharp mask filters doesn't quite cut it. A plain conversion of the resized image to the 8-bit palette also does very little to remove the blur.
0

User is offline   Phredreeke 

#33

The green lines are caused by the surrounding pink background. To get around that I think you would need to upscale only the luma channel and use something else for color. Alternatively you could expand the image onto the background and then apply the original mask later.
1

User is offline   MrFlibble 

#34

View PostPhredreeke, on 16 October 2018 - 11:23 AM, said:

Alternatively you could expand the image onto the background and then apply the original mask later.

Yes, this is roughly how I do it. I was too lazy to process the image above this way for test purposes.

Anyway, concerning the Duke3D sprites, using the Rock filter as described above doesn't seem to contribute positively to image quality. In fact the result is rather similar to what the original unsmoothed image looks like when scaled up.

Another thing to note is that the results of waifu2x scaling with xBRZ smoothing on the sprites' edges do not look much different from what you get if you simply scale up the same shape with xBRZ:
Spoiler

(waifu2x processing above, xBRZ below)

This is a bit disappointing since I was kinda hoping that waifu2x with the smoothed input would handle various curves and straight lines at odd angles better, but it only seems to work well with images baked into a background.
0

User is offline   Phredreeke 

#35

still would like to see it in-game :P
0

User is offline   MusicallyInspired 

  • Buy Mage's Initiation!

#36

Worst-case scenario, you have to do a bit of pixel editing on the frames. It's a tall order to get supreme results from mere filters alone anyway.
0

User is offline   MrFlibble 

#37

View PostMrFlibble, on 04 October 2018 - 10:11 AM, said:

I tried scaling up Doom sprites, with the results broadly similar to what another user achieved with NVIDIA GameWorks Super Resolution neural network scaling (I doubt that one was trained on anime images).

View PostMrFlibble, on 09 October 2018 - 09:07 AM, said:

Further on this, I tried scaling up some Doom textures and I have to say there's a noticeable difference: waifu2x blurs colours a lot, especially in low-contrast areas, whereas the NVIDIA network keeps the image sharp, even though there's some noticeable ringing at times.

hidfan, who posted the original Doom scaling results, explained that what he has is not the raw output of NVIDIA GameWorks but a combination of several iterations created by different methods.

I just checked hidfan's sources, and apparently the results of simple NVIDIA texture scaling are pretty bland. The illusion of extra pixel detail apparently stems from the NVIDIA PhotoHallucination process, which to me looks like some kind of artistic Photoshop filter when applied to textures (it is my understanding that photohallucination is actually intended to add imaginary detail to photographs when upscaling, similar to what Let's Enhance offers).

By which I mean to say that it might be quite possible to achieve similar results without relying on NVIDIA software altogether.

In the meantime I used a slightly different approach and got some sprites that are arguably more similar to hidfan's results.
0

User is offline   leilei 

#38

View PostHendricks266, on 03 October 2018 - 04:23 PM, said:

I presume waifu2x has been trained on anime-like drawings, and I suspect this is why your images look like oil paintings. I would bet the result would be better if you trained the neural network with a custom data set.

Yes, it's trained on manga illustrations. There's also a model trained on photographs, but it seems to look too much like Lanczos scaling to be worth it. There are also denoising models involved which will kill off those details as well.

The whole "it's made for anime" thing is generally why I've never encouraged waifu2x for sprites except as a sarcastic joke. It is designed for thin anti-aliased linework in illustrations at around 500x500+ resolution. If it makes anything else look good, it's just a lucky coincidence. Waifu2x is not a silver bullet.

This post has been edited by leilei: 20 October 2018 - 06:05 PM

1

User is offline   MrFlibble 

#39

Well, the pre-rendered sprites look pretty decent at 2x, and a monster sprite upscale for Blood is already a thing. Whether this is an intended effect or a coincidence is largely irrelevant as long as such functionality has practical applications.

The thing is, to my knowledge there's no better alternative at this moment, and at any rate it is interesting to explore what can be done with existing tools. hidfan did some pretty cool stuff with the Doom textures, but they, too, are not a result of simply feeding original art to the NVIDIA neural networks and getting the output.
0

User is offline   MrFlibble 

#40

Phredreeke gave me a good idea to experiment with blending layers, here's one result (4x scaling):
https://i.imgur.com/t6K0RP6.png
This was done by blending three layers:
  • sprites on black background, softened as described here, processed with waifu2x in Y-channel mode (2 passes, TTA4)
  • sprites on white background, also softened and processed with waifu2x in RGB mode (2 passes, TTA4)
  • sprites on bright green background from the Duke3D palette, scaled to 4x with xBRZ

I used the Blend [median] function in GIMP's G'MIC plugin to combine these. The black and white backgrounds cancel each other, leaving the green background from the third layer unaltered.
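The median blend itself is simple enough to reproduce outside G'MIC. Here's a minimal NumPy sketch of the three-layer trick, with tiny solid-colour arrays standing in for the actual upscales:

```python
import numpy as np

def median_blend(layers):
    """Per-pixel, per-channel median of equally sized RGB arrays.
    With a black-background layer, a white-background layer and a
    green-background layer, the black and white extremes cancel out
    and the green background from the third layer survives."""
    stack = np.stack([np.asarray(l, dtype=np.uint8) for l in layers])
    return np.median(stack, axis=0).astype(np.uint8)

black = np.zeros((4, 4, 3), np.uint8)                        # layer 1 stand-in
white = np.full((4, 4, 3), 255, np.uint8)                    # layer 2 stand-in
green = np.zeros((4, 4, 3), np.uint8); green[..., 1] = 255   # layer 3 stand-in
blended = median_blend([black, white, green])
```

On the background pixels the channel values are (0, 255, green), so the median picks the green layer's value every time, which is exactly the cancelling effect described above.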

I then converted the result to the Duke3D palette using mtPaint with the Dithered (effect) dithering type. To achieve the best results I set Colour space to RGB (the default is sRGB), Reduce colour bleed to Strongly, Error propagation to 85% and Selective error propagation to Separate/Sum.

I then removed the green background colour from the image.

Running waifu2x with two passes results in a pronounced sharpening effect that accentuates detail, but also contributes to the emergence of black outlines on the sprite edges. I'm not sure I'm 100% happy with the result, but mind this is still WIP. At least this is a workable model for getting sprite edges processed properly with waifu2x without the need for previous combination methods.

Here's the same stuff as above with the sprite outline cropped by 1 pixel (I'm not sure this is a good solution but at least worth a try):
https://i.imgur.com/RFHPWsx.png
2

User is offline   Commando Nukem 

  • Judge Mental

#41

A lot of these look quite good. I think if someone were interested in actually going over these with a finer hand once the process was done, and giving them some more detail manually, it would be perfect. Some of the detail just doesn't come through much, but other parts almost look like the ol' 640x480 renders of the original models, which is pretty cool.

Has this been tried on the HUD weapons?
2

User is offline   Phredreeke 

#42

I think a 2x scale is the sweet spot as of now TBH. I do like the idea of having an artist give a final pass adding detail that didn't come through on the upscale.

The problem of how to deal with palswaps/PLUs still remains. The way I dealt with it for Blood results in duplicated sprites for each colour variant. At least Duke3D doesn't use them to as high an extent.
2

User is offline   MrFlibble 

#43

View PostCommando Nukem, on 22 November 2018 - 06:52 AM, said:

Has this been tried on the HUD weapons?

Not yet.

View PostPhredreeke, on 22 November 2018 - 07:37 AM, said:

I think a 2x scale is the sweet spot as of now TBH. I do like the idea of having an artist give a final pass adding detail that didn't come through on the upscale.

I agree with this.

On a side note, it turns out that someone else also came up with the xBRZ softening method:
https://github.com/n...mment-341915147

That discussion also reveals that if you do both upscaling and noise reduction you can get a pretty good result with the outlines, at the cost of losing some detail; but this might actually be a good thing, considering that 8-bit graphics have a lot of sharp contrast between neighbouring colours.
0

User is offline   MrFlibble 

#44

In the meantime I had a go at scaling pre-rendered images with noise reduction on, as suggested in the discussion at GitHub. Here's what I got (ignore the text part):
https://i.imgur.com/y49coFE.png
This is a blend of:
  • original 320x200 image (no softening or other edits) scaled in Y-channel mode with noise reduction to 4x, TTA2, noise level 1
  • same as above but with two passes
  • original 320x200 image scaled to 4x using xBR (not xBRZ) from the ScalerTest utility

The result was downscaled to 640x480 using Sinc3 in GIMP, then converted to the original palette in mtPaint with limited Stucki dithering. I only used the same colours as on the original image, not the full range of the palette (accomplished in mtPaint by first saving the 320x200 original as a 24 bit RGB image, then converting to indexed as exact match).
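The exact-match part, i.e. restricting the output to only those colours the 320x200 original actually uses, can also be done directly with a nearest-colour search. A small NumPy sketch (no dithering, unlike the mtPaint route, so it's only an approximation of the above):

```python
import numpy as np

def restrict_to_source_colors(upscaled, original):
    """Map every pixel of `upscaled` to the nearest colour (by squared
    RGB distance) among the colours actually present in `original`."""
    palette = np.unique(np.asarray(original, np.int32).reshape(-1, 3), axis=0)
    up = np.asarray(upscaled, np.int32)
    dist = ((up.reshape(-1, 1, 3) - palette[None, :, :]) ** 2).sum(axis=2)
    return palette[dist.argmin(axis=1)].reshape(up.shape).astype(np.uint8)

original = np.array([[[0, 0, 0], [255, 255, 255]]], np.uint8)   # 2 colours used
upscaled = np.array([[[10, 10, 10], [250, 250, 250]]], np.uint8)
mapped = restrict_to_source_colors(upscaled, original)
```

Note the distance table is pixels-by-colours, so for a full screen you'd want to process the image in strips, but the principle is the same.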

For comparison, here's the same image scaled to 4x by nearest neighbour and xBRZ, then converted to 640x480 8-bit image using the same steps as above:
Spoiler

0

User is offline   MrFlibble 

#45

Can't edit my post already, so double posting (sorry).

Here's the loading screen and one frame of the Earth episode's tally screen animation. This time I used the RGB scaled image (with noise reduction) as the third layer instead of an xBR upscale; otherwise the steps are as described above. I also tried out two palette conversion modes (for parameters see this post): Stucki and Dithered (effect), both via mtPaint. I kinda like the Dithered version more.

Loading screen (dithered):
https://i.imgur.com/Y0KBzeS.png

Stucki:
https://i.imgur.com/5XtHyif.png

Tally screen (dithered):
https://i.imgur.com/VkemW2e.png

Stucki:
https://i.imgur.com/sJc3gkg.png

This post has been edited by MrFlibble: 27 November 2018 - 06:36 AM

1

User is offline   MrFlibble 

#46

A while ago Phredreeke notified me of a new neural network called ESRGAN having been developed for the Single Image Super-Resolution (SISR) problem, and sometime later it surfaced at the Daggerfall Unity Forums. Folks at yet another forum have been playing around with ESRGAN and SFTGAN for a while, with pretty interesting results. A user even trained their own model on Manga images for better results.

So I decided to try this out, combining it with the xBRZ softening technique that makes it possible to produce smooth lines from 8-bit input images. Here's a comparison between the same set of monster frames scaled to 4x with waifu2x (RGB mode scaling) and the Manga109 model:
https://i.imgur.com/MgKwN54.png
https://i.imgur.com/6ip3WLs.png
The Manga result looks sharper and more detailed, but it also appears to reproduce JPEG compression artifacts; it seems the model was trained on JPEG images.
2

User is offline   MusicallyInspired 

  • Buy Mage's Initiation!

#47

I just figured it was added noise to give the appearance of finer detail.

EDIT: Just checked out the thread. Man those results are incredible!! The upscaled Monkey Island 2 backgrounds are especially gorgeous. Imagine having a filter like that working in real-time for programs like ScummVM and DOSBox! I can't get over how great that looks.

This post has been edited by MusicallyInspired: 08 January 2019 - 11:28 AM

0

User is offline   MrFlibble 

#48

View PostMusicallyInspired, on 08 January 2019 - 11:15 AM, said:

I just figured it was added noise to give the appearance of finer detail.

The noise might be the network replicating the appearance of printed image scans, as the model was trained on the Manga 109 dataset. But whatever the case, it does produce neat results that look better than adding noise to waifu2x output (at least, with HSV noise in GIMP). As I mentioned above, though, this model also apparently reproduces JPEG compression artifacts, having been trained on JPEG images.

I did some scaling of full static screens, you can compare to the previous results with waifu2x above:
Spoiler

These are scaled down to 640x480 using Sinc interpolation.

However, I'm noticing that mtPaint starts to have problems with correctly converting the result back into the original palette. The colours are often off because of alterations and noise, presumably due to JPEG noise in the training data.

I tried countering this with Gaussian blur (1 pixel radius) or Selective Gaussian blur (2 pixel radius, threshold 15) before scaling down to 640x480, and using Stucki or Floyd-Steinberg dithering with reduced colour bleeding for the indexed palette conversion, but it doesn't work quite as well. Selective Gaussian blur seems to fare a bit better though; here are the results for two screens (with Stucki dithering):
Spoiler

Whenever I convert the loading screen image with anything other than the Stucki or Floyd-Steinberg dithering methods, the yellow colour on the nuke symbol gets terribly off.
1

User is offline   Phredreeke 

#49

Would you please release something though, you're giving me blue balls 😂
0

User is offline   Jim Rockford 

  • Banned on Rigel

#50

I'd recommend using Image Analyzer to convert full color images to palettes. Just trust me, I've never found a program that does it better.
2

User is offline   MusicallyInspired 

  • Buy Mage's Initiation!

#51

Thanks for introducing that to me. Looks rather interesting.
1

User is offline   MrFlibble 

#52

So I played around with some network interpolation; here are the results of combining Manga 109 with RandomArt:
https://i.imgur.com/sIYGmvW.png
RandomArt + Manga 109 (alpha = 0.5)

https://i.imgur.com/I3wAzV2.png
RandomArtJPEGFriendly + Manga 109 (alpha = 0.5)

Not sure which I like more out of these two but both look cleaner and sharper than pure Manga 109 (see the post above).
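For anyone curious, the network interpolation used here is just a linear blend of two checkpoints' parameters. Sketched below with NumPy arrays standing in for torch tensors (the parameter name is made up):

```python
import numpy as np

def interpolate_models(theta_a, theta_b, alpha=0.5):
    """theta = alpha * theta_a + (1 - alpha) * theta_b, key by key.
    Both dicts are assumed to share the same parameter names."""
    return {k: alpha * theta_a[k] + (1 - alpha) * theta_b[k]
            for k in theta_a}

# toy "checkpoints" with a single parameter each
a = {"conv.weight": np.zeros(3)}
b = {"conv.weight": np.ones(3)}
mid = interpolate_models(a, b, alpha=0.5)
```

Varying alpha trades off between the two models' characters, which is why alpha = 0.5 above sits halfway between the clean RandomArt look and the sharp Manga 109 look.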

Also here's a texture test:
https://i.imgur.com/FMYGzWo.png
RandomArtJPEGFriendly + Manga 109

Pure Manga result I posted previously:
https://i.imgur.com/sE1ACdr.png
0

User is offline   NightFright 

  • The Truth is in here

#53

I like the texture samples more than the sprites, tbh. It's hard to guess what it would look like ingame, but it would be interesting to see e.g. Hollywood Holocaust with textures resampled with this filtering method.
0

User is offline   MusicallyInspired 

  • Buy Mage's Initiation!

#54

It's interesting to me how so far none of the upscaled sprites look nearly as good as the Doom upscaled sprites. There's something about Duke's sprites that just translates into these warpy, warbly, smeared-looking creatures. I guess it's because they're based on 3D models while Doom's sprites were heavily pixel-edited?
0

User is offline   MrFlibble 

#55

View PostMusicallyInspired, on 21 January 2019 - 07:07 AM, said:

It's interesting to me how so far none of the upscaled sprites look near as good as the Doom upscaled sprites.

I can't tell you how much these comparisons with the Doom upscale annoy me.

First off, don't forget that the 4x images posted above are not the final result. Here are the same monster sprites scaled down to 2x from the Random Art JPEG Friendly/Manga 109 processing and converted to the original palette, then zoomed in 2x using nearest neighbour for better visibility (some stray pixels at the edges might need cleanup):
https://i.imgur.com/8UUkuTs.png
And here's the original sprites, zoomed 4x with nearest neighbour, for comparison:
https://i.imgur.com/vafqIOd.png
I think that the results are quite true to the original art.

Here's the set of the upscaled sprites above at their original size:
https://i.imgur.com/Z4rFcks.png
0

User is offline   Phredreeke 

#56

My first releases of the Blood upscale pack were full of stray pixels; worse, they were PINK.

I did send you a set of the assault trooper/commander with in game colours before didn't I?

Still eagerly waiting to see this in game...
0

User is offline   MrFlibble 

#57

View PostPhredreeke, on 21 January 2019 - 09:10 AM, said:

Still eagerly waiting to see this in game...

I'm sorry, I'm doing this in my spare time (also, I've only got CPU-powered ESRGAN, as my distro is not supported by CUDA). It also doesn't seem like a good idea to me to process a lot of images before there's at least some idea of what the optimal method would be, considering the unconventional tools we're using.
0

User is offline   Phredreeke 

#58

Holy crap, I didn't realise you were stuck processing on the CPU. Once you're ready to process the full set, upload it somewhere with the model and send it to me, and I'll process it and send it back.
1

User is offline   MusicallyInspired 

  • Buy Mage's Initiation!

#59

I don't know. The downscale to 2x looks alright. But there's just still something...asymmetrical and watery about it. I can't put my finger on it but it feels wrong.
0

User is offline   MrFlibble 

#60

View PostMusicallyInspired, on 21 January 2019 - 02:19 PM, said:

The downscale to 2x looks alright. But there's just still something...asymmetrical and watery about it. I can't put my finger on it but it feels wrong.

It is quite possible that a model trained specifically to upscale renders would produce better results. In fact, I recently discussed possible authentic sources for a data set to train such a model at the Daggerfall Workshop forums. However, it is my understanding that training takes quite a while even with a GPU, and is not recommended with CPU only, so ATM I can't test this avenue.

On the other hand, the sprites in Duke3D were almost certainly touched up after they were scaled down from angle views of Chuck Jones' original models. In this respect, the sprites are not lacking detail in the same way as scaled-down photographs, which are the primary subject matter of ESRGAN in the first place. Rather, the sprites are low-resolution, and are appropriately detailed for the resolution they're in (at least, that's the desired result; remember that comparison of digitised actor sprites in ROTT and TekWar in Cage's Ion Maiden dev blog entry). It is not implausible that some other underlying principle is required for a network to properly handle the task of scaling up pre-rendered images.

That, or the results that we have now could be used as a basis for an artist's work driving the scaled-up image home, similarly to how the small originals were refined after being scaled down. But I'm not an artist, and there's not much I can do or even suggest to improve the quality of the sprites.
0

© 2019 Voidpoint, LLC