You know, I was thinking. I know the audio medium never gets the attention the visual medium does (people care more about graphics than sound on average), but if the likes of ESRGAN can upscale low-res images and fill in missing detail, couldn't a similar neural net approach work for raising the sample rate and bit depth of low quality sound files? Say, taking the 8-bit 22 kHz Duke3D one-liners and bringing them up to 16-bit 44 kHz? Or better yet, 32-bit 96 kHz?
To put it another way if the concept isn't clicking: you can resample a low quality audio file and raise its bit depth, but it's still going to sound the same (similar to resizing a low-res image with no filtering... the pixels just get bigger). Actually filling in the missing detail, the way ESRGAN and other AI upscalers do, is what's worth looking into. You could train a model on high quality recordings of Jon St. John himself and use it to "upscale" the Duke3D lines. Or do the same for the weapons and other sound effects we don't have high quality versions of. Or for ANY other older game with low quality files.
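To make the difference concrete, here's a rough, purely illustrative sketch in PyTorch (not an existing tool, and the toy model, sample rates, and training loop are my own assumptions, not how ESRGAN or any published audio network actually works). The first half is the "naive" path: resampling up just interpolates and creates nothing above the old Nyquist limit. The second half is the "learned" path: a tiny 1-D conv net trained on paired low/high quality audio to predict the missing detail.

```python
# Minimal sketch (not an existing tool): naive resampling vs. a tiny learned
# "audio upscaler". All layer sizes and rates are illustrative assumptions.
import torch
import torch.nn as nn
import torchaudio

HIGH_SR, LOW_SR = 44100, 11025  # target rate vs. a Duke3D-like source rate

# --- naive path: resample only; nothing above LOW_SR/2 gets recreated ---
down = torchaudio.transforms.Resample(HIGH_SR, LOW_SR)
up   = torchaudio.transforms.Resample(LOW_SR, HIGH_SR)

# toy "recording": one second of a harmonic-rich waveform at the high rate
t = torch.arange(HIGH_SR) / HIGH_SR
clean = sum(torch.sin(2 * torch.pi * f * t) / k
            for k, f in enumerate([440, 880, 1760, 3520, 7040, 14080], start=1))
clean = clean.unsqueeze(0)            # shape: (channels=1, samples)
degraded = up(down(clean))            # same length, but the high frequencies are gone

# --- learned path: a small 1-D conv net predicts the missing detail as a residual ---
class ToyUpscaler(nn.Module):
    def __init__(self, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, width, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(width, width, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(width, 1, kernel_size=9, padding=4),
        )

    def forward(self, x):             # x: (batch, 1, samples), already resampled up
        return x + self.net(x)        # residual: start from the naive upsample, add detail

model = ToyUpscaler()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = degraded.unsqueeze(0), clean.unsqueeze(0)  # one (low quality, high quality) pair

for step in range(200):               # a real model would train on many paired clips
    optim.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optim.step()
```

The key point the sketch is meant to show: the "low quality" input is generated by degrading known high quality audio, which is exactly how you'd get training pairs from new Jon St. John recordings without needing high quality versions of the original game files.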
There are other uses for this as well. For instance, reconstructing the detail lost to lossy codecs like MP3/OGG/etc. and getting something much closer to the original WAV or FLAC back.
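Same trick for the training data there, in principle: round-trip clean files through the lossy codec so the network sees (compressed, original) pairs. A quick hypothetical sketch using ffmpeg (the folder names and 64k bitrate are just placeholders I made up for illustration):

```python
# Sketch only: build (degraded, target) training pairs for the "undo lossy
# compression" idea by round-tripping known-good WAVs through MP3 with ffmpeg.
import subprocess
from pathlib import Path

SRC = Path("clean_wavs")       # hypothetical folder of original, uncompressed WAVs
OUT = Path("degraded_wavs")    # where the MP3-damaged copies end up
OUT.mkdir(exist_ok=True)

for wav in SRC.glob("*.wav"):
    mp3 = OUT / (wav.stem + ".mp3")
    back = OUT / (wav.stem + "_mp3.wav")
    # encode at a low bitrate so the artifacts the model has to learn to remove are obvious
    subprocess.run(["ffmpeg", "-y", "-i", str(wav),
                    "-codec:a", "libmp3lame", "-b:a", "64k", str(mp3)], check=True)
    # decode back to WAV; (back, wav) is now one (degraded, target) training pair
    subprocess.run(["ffmpeg", "-y", "-i", str(mp3), str(back)], check=True)
```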
Anyone know if something like this exists in some form?
EDIT: Found this Reddit post in the GameUpscale subreddit. Half the people there don't seem to understand the benefits of this vs. manual resampling, though.
https://www.reddit.c...resample_audio/
EDIT 2: Looks like the Game Upscale Discord server has a channel for audio networks. Going to peruse it.