The ESRGAN AI Upscale non-Duke thread

MusicallyInspired

The Sarien Encounter

#31 Posted 16 January 2019 - 05:51 PM

Pixels do add an element of perceived finer and grittier detail that the brain fills in. ScummVM added a feature for Sierra parser+mouse EGA games that changed dithered colours to averaged colours. It's a controversial setting, especially when they made it the default setting (I fought long and hard to get them to change it back from default, that was a monster of a thread let me tell you). It makes everything look flat and bright where my brain would actually fill in darker details. I hated it. It's still an option but I never use it. This is kind of a similar argument to the downsides of the HRP that people have always gotten into, though. It's all preference.

This post has been edited by MusicallyInspired: 16 January 2019 - 05:52 PM

Forge

Speaker of the Outhouse

#32 Posted 16 January 2019 - 07:07 PM

Has anybody pulled some tapestries from the game Loom? Those I'd be interested in. Sentimental reasons.

MusicallyInspired

The Sarien Encounter

#33 Posted 16 January 2019 - 08:36 PM

It doesn't seem to do well with digital art made with Deluxe Paint like Loom and other early LucasArts adventures. Especially with dithering.

Forge

Speaker of the Outhouse

#34 Posted 16 January 2019 - 09:24 PM

They look nice to me. Thank you.

MrFlibble

#35 Posted 19 January 2019 - 04:02 AM

In the meantime I had an idea of how we could evaluate ESRGAN-Manga 109 performance with respect to 8-bit video game art.

I took a selection of textures from the PC version of Wolfenstein 3-D and scaled them up with prior xBRZ softening, then compared to the counterparts of the same textures from the Macintosh version (hand-made upscales of the PC textures):
https://imgur.com/a/sPopSXd

I didn't convert the Manga 109 results to any indexed palette, just ran a quick surface blur to remove the simulated JPEG noise and scaled down to 2x the original size with Sinc interpolation.

It's actually not bad but you can see how much the images changed due to manual editing by an artist.
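
A side-by-side comparison like this can also be given a rough number. The sketch below computes PSNR between a reference texture and an upscaled one; it is not something done in the post. It assumes both images are greyscale, the same size, and stored as plain lists of rows, and the pixel values are made up. Since the Mac textures were edited by hand, a pixel metric like this will penalise the artist's deliberate changes as well as upscaler errors, so treat it as a loose indicator only.

```python
import math


def psnr(reference, candidate, max_value=255):
    """Peak signal-to-noise ratio between two equally sized greyscale images.

    Both images are lists of rows of pixel values in the range 0..max_value.
    """
    if len(reference) != len(candidate) or any(
        len(ref_row) != len(cand_row)
        for ref_row, cand_row in zip(reference, candidate)
    ):
        raise ValueError("images must have the same dimensions")

    squared_error = 0
    count = 0
    for ref_row, cand_row in zip(reference, candidate):
        for ref_px, cand_px in zip(ref_row, cand_row):
            squared_error += (ref_px - cand_px) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Toy 2x2 stand-ins (made-up values): a hand-made reference and an upscale.
mac_texture = [[10, 200], [30, 120]]
upscaled = [[12, 198], [30, 121]]
score = psnr(mac_texture, upscaled)
```

Higher values mean the two images are closer pixel for pixel; identical images give infinity.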

MrFlibble

#36 Posted 20 January 2019 - 06:11 AM

So I played a bit with waifu2x-caffe, which has a model called UpRGB that produces sharper results than the regular RGB model I had used before. So I ran an image with scale + noise reduction (at level 1) to make a small comparison to ESRGAN-Manga 109:
Posted Image

waifu2x-caffe

Posted Image

Manga 109

Both images were created from the same input, a Warcraft briefing screenshot softened with xBRZ. Both were upscaled to 4x the original size then resized to 640x480 in GIMP with Sinc interpolation. No conversion to indexed palette or other edits.

As you can see, both methods handle large shapes in more or less the same way, but ESRGAN/Manga 109 really shines when it comes to accentuating small detail like the teeth of the Orc on the right or the wool trimming of the other Orc's belt. The same tendency also produces erroneous results though, e.g. the sword handle of the Orc on the left is obviously supposed to be decorated with what seems like a dragon's head, but ESRGAN created a very odd configuration out of it, unlike waifu2x.

Forge

Speaker of the Outhouse

#37 Posted 20 January 2019 - 06:18 AM

Looks like manga 109 also took liberties with shading and brightened things up - unless that was user input.

MrFlibble

#38 Posted 20 January 2019 - 06:22 AM

Actually it's the waifu2x result (top) which seems brighter to me. Manga 109 does alter colours, not least because it introduces JPEG-like noise, I suppose.

Forge

Speaker of the Outhouse

#39 Posted 20 January 2019 - 06:38 AM

MrFlibble, on 20 January 2019 - 06:22 AM, said:

Actually it's the waifu2x result (top) which seems brighter to me. Manga 109 does alter colours, not least because it introduces JPEG-like noise, I suppose.

Maybe I need my eyes checked.

To me, things like the red banner, and the leg armor & buckle on the left orc look brightened in the bottom picture.

This post has been edited by Forge: 20 January 2019 - 06:39 AM

MrFlibble

#40 Posted 20 January 2019 - 07:09 AM

I decided to try out the network interpolation thing with ESRGAN. As suggested in the ResetEra thread, I interpolated the default ESRGAN model with Manga 109 at alpha = 0.2. It gives some sharper results and noticeably decreases JPEG noise as expected. The Conjurer's ear is not fixed - it does get better with an interpolation of Manga 109 and PSNR but otherwise the image gets lots of ringing.

Here are some results (in each case the image was softened by scaling up with xBRZ then applying the pixelise filter in GIMP; going straight back to 320x200 with Sinc interpolation results in an overly sharp image). I converted each to the original palette with Stucki dithering in mtPaint for a more authentic feel.
Posted Image
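
For anyone unfamiliar with Stucki dithering, here is a minimal sketch of the error-diffusion idea. It is a simplification, not mtPaint's implementation: it works on a greyscale image stored as a list of rows and quantises to a short list of allowed grey levels, whereas the conversion described above maps colour images to an indexed palette. The function name and the sample data are made up for illustration.

```python
def stucki_dither(image, levels):
    """Quantise a greyscale image to `levels` using Stucki error diffusion.

    image:  list of rows of numbers in 0..255
    levels: the grey values allowed in the output
    Returns a new list of rows containing only values from `levels`.
    """
    # (dx, dy, weight) offsets of the Stucki kernel; the weights sum to 42.
    kernel = [
        (1, 0, 8), (2, 0, 4),
        (-2, 1, 2), (-1, 1, 4), (0, 1, 8), (1, 1, 4), (2, 1, 2),
        (-2, 2, 1), (-1, 2, 2), (0, 2, 4), (1, 2, 2), (2, 2, 1),
    ]
    height, width = len(image), len(image[0])
    # Working copy in floats; quantisation error accumulates here.
    work = [[float(value) for value in row] for row in image]
    out = [[0] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = min(levels, key=lambda level: abs(level - old))
            out[y][x] = new
            error = old - new
            # Push the error onto pixels that have not been visited yet.
            for dx, dy, weight in kernel:
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    work[ny][nx] += error * weight / 42
    return out


# A flat mid-grey patch turns into a mix of black and white pixels.
flat_grey = [[128] * 8 for _ in range(8)]
dithered = stucki_dither(flat_grey, levels=[0, 255])
```

Error that would fall outside the image is simply dropped in this sketch, so the average brightness near the edges is only approximately preserved.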

MusicallyInspired

The Sarien Encounter

#41 Posted 20 January 2019 - 09:04 AM

MrFlibble, on 20 January 2019 - 07:09 AM, said:

Can you explain this process further? I don't understand how to accomplish utilizing more than one model.

Phredreeke

#42 Posted 20 January 2019 - 09:07 AM

MusicallyInspired, on 16 January 2019 - 05:51 PM, said:

Forgive me if I'm mistaken, but I thought the graphics were stored undithered in the game's resources, and the game's engine itself added the dithering.

MusicallyInspired

The Sarien Encounter

#43 Posted 20 January 2019 - 09:21 AM

No, that's misleading and exactly what I was arguing against. Dithered colours are a palette entry in themselves and act as "one colour" when drawing in the engine's background vector picture resources. So technically, SCI0 games can have an overall palette of more than 16 "colours" by dithering (including some overlap with the same colours dithering in the opposite pattern). It makes painting in dithering easier, but not as fine-tuned as in Deluxe Paint with the likes of Mark Ferrari's incredible by-hand dithering in early LucasArts games.

But however you want to interpret how the engine interprets colours in SCI0 (because palette entries in code are just a value and don't care about dithering or not), there have only ever been 16 total colours on-screen both in-game and with whatever tools Sierra's artists were using. Changing dithered colours to averaged colours changes the authentic feel of what both game designers and players saw. There isn't a driver that's interpreting a greater colour depth down to 16 colours. You even choose which two colours you want to dither as a palette entry manually with the engine tools. Ken Williams did say that it was an attempt to make it look like more colours on screen at once, but there were never any more colours than 16. And when an artist is drawing images in 16 colours he'll make different decisions than he would if there were actually more than 16 colours with averaging palette entries instead of dithered ones.
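
To make the difference concrete, here is a small sketch using the standard 16-colour EGA palette. It is not ScummVM's or Sierra's code: the checkerboard pattern and the plain RGB average are simplifying assumptions, and the helper names are made up. It shows that a dithered fill only ever puts two of the 16 hardware colours on screen, while the averaged version produces an RGB value that EGA could not display.

```python
# Standard 16-colour EGA palette as RGB triples, indexed 0..15.
EGA = [
    (0, 0, 0), (0, 0, 170), (0, 170, 0), (0, 170, 170),
    (170, 0, 0), (170, 0, 170), (170, 85, 0), (170, 170, 170),
    (85, 85, 85), (85, 85, 255), (85, 255, 85), (85, 255, 255),
    (255, 85, 85), (255, 85, 255), (255, 255, 85), (255, 255, 255),
]


def checkerboard(colour_a, colour_b, width, height):
    """Fill an area with a 50/50 checkerboard of two palette indices."""
    return [
        [colour_a if (x + y) % 2 == 0 else colour_b for x in range(width)]
        for y in range(height)
    ]


def averaged(colour_a, colour_b):
    """Blend two EGA colours into one flat RGB value (the undithered look)."""
    return tuple((a + b) // 2 for a, b in zip(EGA[colour_a], EGA[colour_b]))


# Blue (1) dithered with light grey (7).
patch = checkerboard(1, 7, width=4, height=2)
colours_on_screen = {index for row in patch for index in row}

flat = averaged(1, 7)                # (85, 85, 170)
flat_is_hardware_colour = flat in EGA

# Unordered pairs of 16 colours, counting a colour paired with itself.
possible_pairs = 16 * 17 // 2        # 136
```

The pair count is just arithmetic on 16 colours; it does not claim that the SCI0 tools exposed every combination.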

MrFlibble

#44 Posted 20 January 2019 - 09:24 AM

MusicallyInspired, on 20 January 2019 - 09:04 AM, said:

Can you explain this process further? I don't understand how to accomplish utilizing more than one model.

It's in the readme:

Quote

Network interpolation demo

You can interpolate the RRDB_ESRGAN and RRDB_PSNR models with alpha in [0, 1].

Run python net_interp.py 0.8, where 0.8 is the interpolation parameter and you can change it to any value in [0,1].
Run python test.py models/interp_08.pth, where models/interp_08.pth is the model path.

You can interpolate any two models if you edit net_interp.py.

By interpolating the Manga model with both the pre-trained ESRGAN and PSNR at alpha = 0.5 I fixed the Conjurer's ear at once:
Posted Image

ESRGAN + Manga 109 (alpha = 0.5)

Posted Image

Manga 109 + PSNR (alpha = 0.5)

The PSNR interpolation gives a more blurry, softer image. There are some other small differences as well.
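
For those wondering what the interpolation actually does, the idea is a per-parameter weighted average of two models that share the same architecture. The sketch below is not the repository's net_interp.py: it uses plain floats in place of tensors so it runs without PyTorch, the layer names and values are made up, and which of the two models alpha favours is an assumption here, so check the script before relying on a particular value.

```python
from collections import OrderedDict


def interpolate_weights(weights_a, weights_b, alpha):
    """Blend two models parameter by parameter.

    Returns (1 - alpha) * weights_a + alpha * weights_b for every key.
    Values are floats here; in a real checkpoint they would be tensors.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if weights_a.keys() != weights_b.keys():
        raise ValueError("both models must have the same parameter names")

    blended = OrderedDict()
    for name, value_a in weights_a.items():
        blended[name] = (1 - alpha) * value_a + alpha * weights_b[name]
    return blended


# Toy stand-ins for two checkpoints (hypothetical names and values).
manga109 = OrderedDict([("conv_first.weight", 0.20), ("conv_first.bias", -1.0)])
psnr_model = OrderedDict([("conv_first.weight", 0.60), ("conv_first.bias", 1.0)])

half = interpolate_weights(manga109, psnr_model, alpha=0.5)
```

With alpha = 0 the result equals the first model and with alpha = 1 the second, which is why intermediate values trade sharpness against smoothness.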

MrFlibble

#45 Posted 21 January 2019 - 02:47 AM

I also interpolated the Manga 109 and the Random Art models at alpha = 0.5. The result seems smoother than ESRGAN + Manga 109 and sharper than Manga 109 + PSNR:

Spoiler

The problem is that interpolation not only reduces Manga 109's inherent JPEG noise but also removes or weakens its ability to blend areas of colour with sharp contrasts. Here's a good example: a simple render (Duke3D loading screen) processed without any prior softening with pure Manga 109 and RandomArt + Manga 109:

Spoiler

It seems that interpolated models generally produce a noticeable sharpening effect, so softening the input beforehand is probably advisable with them.

Also a general observation is that whatever models are used, if someone seriously wanted to create high-res art with them the output would require manual touch-up at any rate.

UPD: You know what, just for completeness' sake I also interpolated Manga 109 and RandomArtJPEGFriendly at the same alpha = 0.5, and the results aren't half as bad as I expected them to be:

Spoiler

As a matter of fact, I like these results more than the other stuff.

This post has been edited by MrFlibble: 21 January 2019 - 03:44 AM

leilei

#46 Posted 21 January 2019 - 06:07 PM

If there's anything that really needs some ESRGAN it should be about 90% of the N64 games out there. The texture quality was always lacking and only got by with "no pixels!! die pixels" texture filter hype through the 90s.

This post has been edited by leilei: 21 January 2019 - 06:07 PM

MusicallyInspired

The Sarien Encounter

#47 Posted 21 January 2019 - 10:28 PM

None of these were done with Manga109.

Zelda OOT Kokiri Forest before/after:

Spoiler

Zelda OOT Camera Locations

Spoiler

Mario 64 Paintings

Spoiler

This post has been edited by MusicallyInspired: 21 January 2019 - 10:29 PM

MrFlibble

#48 Posted 22 January 2019 - 09:33 AM

Felt like testing some Full Throttle stuff, looks mostly neat:
https://imgur.com/a/0WwhZvl

This is the RandomArt JPEG Friendly/Manga 109 model, each image converted to the original palette without any dithering in mtPaint.

All screenshots come from LucasArts (MoyGames mirror).
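
Converting to the original palette without dithering boils down to replacing every pixel with its closest palette entry. A minimal sketch follows, assuming plain Euclidean distance in RGB; mtPaint's actual colour matching may well differ, and the three-colour palette and pixel values are made up.

```python
def nearest_palette_index(pixel, palette):
    """Index of the palette colour closest to `pixel` (squared RGB distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((p - q) ** 2 for p, q in zip(pixel, palette[i])),
    )


def to_indexed(image, palette):
    """Map every RGB pixel of `image` (list of rows) to a palette index."""
    return [[nearest_palette_index(px, palette) for px in row] for row in image]


# Hypothetical 3-colour palette and a 2x2 truecolour image.
palette = [(0, 0, 0), (170, 0, 0), (255, 255, 255)]
truecolour = [
    [(10, 5, 0), (160, 20, 10)],
    [(250, 240, 245), (90, 0, 0)],
]
indexed = to_indexed(truecolour, palette)
```

Without dithering, smooth gradients in the upscale collapse into flat bands wherever neighbouring pixels share the same nearest palette entry.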

MusicallyInspired

The Sarien Encounter

#49 Posted 22 January 2019 - 10:22 AM

I don't know how I feel about converting back down to the original palette at all really. I understand the point for games like Duke3D because of the palette swapping effects and whatnot, but upscaling something like Full Throttle and then downgrading again seems to be missing the point to me.

This post has been edited by MusicallyInspired: 22 January 2019 - 10:23 AM

Altered Reality

#50 Posted 24 January 2019 - 04:27 PM

Wow, I'm genuinely impressed. I was wondering: if the source picture does not look like a painting, is there a way to make it so that the enhanced picture does not look like a painting either?

MusicallyInspired

The Sarien Encounter

#51 Posted 24 January 2019 - 04:30 PM

You could use or train a new model if you had the high res version of the image already. Or you could train one that works better with photographs or whatever. As I said before, though, I'd be interested in seeing a model trained so that it can interpret dithering as fine gradient shading.

I've been wanting to sit down and experiment with training my own models but I've got my hands full mastering the Mage's Initiation complete soundtrack in time for the game's release in a week.

This post has been edited by MusicallyInspired: 24 January 2019 - 04:31 PM

Altered Reality

#52 Posted 24 January 2019 - 05:03 PM

The fact is, I don't have the higher resolution version of the images I want to enhance.
I wanted to try and enhance the screenshots of the never released PC version of Damocles, as well as the Syndicate Wars textures. Of course, without downgrading them back to the original palette.

MusicallyInspired

The Sarien Encounter

#53 Posted 24 January 2019 - 06:03 PM

The trick would be to come up with some images with similar type graphics in higher res, scale them down, and use them together to train a model how to upscale images with a similar outcome.

EDIT: I see now that some of those screenshots are blurry scans from magazines. In that case, if you could come up with similar-type graphics but in clean high res and scale those down and apply a similar noise filter of some kind that looks like those bad scans, it might do a semi-decent job.
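
As a rough illustration of that idea, the sketch below builds one low-res/high-res training pair: it shrinks a clean image by averaging blocks and then adds noise to imitate a bad scan. Everything here is an assumption for illustration (greyscale images as lists of rows, a box filter, Gaussian noise, the function names and default values); real magazine scans have blur and halftone patterns that plain noise does not capture.

```python
import random


def box_downscale(image, factor):
    """Shrink a greyscale image by averaging factor x factor blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    small = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [
                image[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def add_noise(image, sigma, rng):
    """Add Gaussian noise to every pixel and clamp the result to 0..255."""
    return [
        [min(255.0, max(0.0, value + rng.gauss(0, sigma))) for value in row]
        for row in image
    ]


def make_training_pair(high_res, factor=4, sigma=8.0, seed=0):
    """Return (degraded low-res input, clean high-res target)."""
    rng = random.Random(seed)
    low_res = add_noise(box_downscale(high_res, factor), sigma, rng)
    return low_res, high_res


# Made-up 8x8 gradient standing in for a clean high-res picture.
high_res = [[(x * 16 + y * 8) % 256 for x in range(8)] for y in range(8)]
low_res, target = make_training_pair(high_res)
```

A training set would repeat this over many images, ideally with a degradation that looks as close as possible to the material you want to restore.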

This post has been edited by MusicallyInspired: 24 January 2019 - 06:06 PM

MrFlibble

#54 Posted 26 January 2019 - 04:10 AM

There's another thing about training models that might be relevant. Models trained on different data sets obviously produce different results, but the underlying structure of ESRGAN is still designed to accomplish the original super-resolution task, that is, to produce high-resolution images from their downscaled counterparts, and it was created primarily with photographic images in mind.

As I discussed elsewhere, low-resolution video game art appears to differ in certain respects from scaled-down photos: rather than lacking detail, it may contain detail that exists on a different level altogether than in higher-resolution images. For example, if you scale down a photograph, some smaller detail will inevitably become a handful of pixels if not a single pixel. A low-resolution video game image, however, may be intentionally cleaned of such noise (or created without it altogether if made from scratch), while meaningful detail may be enhanced, or specifically created from an arrangement of pixels that won't occur in scaled-down photographs. I think it is not a stretch to assume that dealing with this kind of art requires different methods from the super-resolution problem for photographs or other high-res images that were simply scaled down.

For example, I ran some tests with pre-rendered sprites of an Orc from Daggerfall:
Posted Image

If you look closely you can see that the Orc's skin in the original image is intended to look "scaly" but this is smoothed in the ESRGAN result: the effect produced by a specific arrangement of a handful of pixels is completely lost.

The upscale also makes it very obvious that the original model was low-detailed. There are hardly any facial features and no individual fingers (clearly visible at this angle). The low-res sprite worked fine but is evidently lacking when blown up fourfold.

It's almost like if you zoom in on a printed image in a magazine too much it will fall apart into individual colour ink dots.

I think we could learn more about the idiosyncrasies of low- vs. high-res video game art if we compare sets of images that originally came in two resolutions, e.g. the credits sequence stills from Red Alert, low- and high-res menu screens from other games, PC vs. Mac Wolf3D textures etc. I'm saying this because if you simply scale down high-res images and train the model on this data, it will not be much different from existing results based on other similarly treated images, except perhaps better suited to produce images that look like they were created in an indexed palette.

As for scaling up screenshots this is probably a separate problem altogether, especially if there are some true 3D elements in the image.

Avoozl

#55 Posted 27 January 2019 - 01:15 AM

Has this been tried with Doom 64 sprites?

MrFlibble

#56 Posted 27 January 2019 - 02:37 AM

Altered Reality, on 24 January 2019 - 04:27 PM, said:

I was wondering: if the source picture does not look like a painting, is there a way to make it so that the enhanced picture does not look like a painting either?

I just tried a different network called SFTGAN (which is an earlier attempt by ESRGAN devs). It doesn't scale up images on its own but tries to recover texture for images that have been scaled by other means. So I fed some Command & Conquer screenshots that were scaled with waifu2x to it, with pretty interesting results (compare with Manga 109 result below):

Spoiler

As you can see, SFTGAN sharpens the image and reduces that "oily" look everyone is complaining about with neural upscales.

Altered Reality, on 24 January 2019 - 05:03 PM, said:

I wanted to try and enhance the screenshots of the never released PC version of Damocles

Out of curiosity I picked one image from that set (not scanned, original quality) and ran it through waifu2x + SFTGAN. Not really impressive:
Posted Image

Avoozl, on 27 January 2019 - 01:15 AM, said:

Has this been tried with Doom 64 sprites?

Here you go:
Posted Image

ESRGAN_4x/Manga109 interpolation at 0.5, scaled back down to 2x with Sinc and zoomed in 2x for better viewing

Posted Image

waifu2x+SFTGAN, scaled back down to 2x with Sinc and zoomed in 2x for better viewing

And this is a blend of the above using G'MIC's Blend [median] function:
Posted Image

Marphy Black

#57 Posted 27 January 2019 - 03:52 AM

Would anyone be so kind as to try an upscale of Ken's Labyrinth title screen image for the good of all Ken-kind?

Posted Image

MrFlibble

#58 Posted 27 January 2019 - 08:00 AM

I just processed the Ken's Labyrinth art with the ESRGAN/Manga109 interpolated model and converted it back to the original indexed palette (the 24-bit output file is over 3 MiB so Imgur will auto-convert it to JPEG):

Spoiler