Show HN: Fixing Google Nano Banana Pixel Art with Rust

184 points by HugoDz 5 days ago

krisoft a day ago

It feels weird to me that on the before/after comparison they felt the need to zoom in on the “before” but not on the “after”.

Either both should have the magnifying glass or neither. This just makes it hard to see the difference.

thih9 10 hours ago

There are more details in the fixed version too, e.g. an extra detailed dark line within the right leg (tibia) that is not present in the original; where do these details come from?
im3w1l 9 hours ago

The purpose of the zoomed-out comparison is to show the quality reduction from applying this tool. The purpose of the zoomed-in before picture is to show what a typical pixel misalignment looks like. Aligned pixels can be easily imagined.
- krisoft 6 hours ago
  
  > The purpose of zoomed out comparison is to show the quality reduction of applying this tool.
  Reduction? Shouldn't the tool be improving the quality of the image? If it is reducing the quality then why do it?
  > The purpose of zoomed in before picture is to show how a typical pixel misalignment.
  Okay, but how does this supposed "misalignment" look on the picture? Would I even notice it? If not, does it matter? Did they just zoom in and draw a misaligned grid over the zoomed-in image? Or are the grid fault lines visible in the gestalt?
  > Aligned pixels can be easily imagined.
  Everything can be easily imagined. Misaligned pixels can be imagined. They could just write "our processed images look better" and let me imagine how much nicer they are. The purpose of a comparison is to prove that they are nicer/better/crisper whatever they want to claim.
  - coldtea 5 hours ago
    
    >Okay, but how does this supposed "misalignment" look on the picture?
    People who are the target audience for this tool already know.
    >Would I even notice it?
    Yes.
    >The purpose of a comparison is to prove that they are nicer/better/crisper whatever they want to claim.
    They don't need to prove it to their target users. They already know the problem (for which several tools exist).
  - im3w1l 4 hours ago
    
    The way I see it, converting something to pixel art is akin to lossy compression or quantization. The goal is to retain as much detail as possible given the constraints.
    The exact way that pixels are misaligned is a feature of the specific AI models that generated the almost-pixel art.

vunderba a day ago

Nice. There's a couple of these (unfake [1], which uses pixel snapping/palette reduction, sd-palettize [2], which uses k-means to palette reduce, etc.) that I've used in the past in a Stable Diffusion -> Pixel Art pipeline.

I think it'd be worth calling out the differences.

[1] - https://github.com/jenissimo/unfake.js

[2] - https://github.com/Astropulse/sd-palettize
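
For readers unfamiliar with the k-means palette reduction mentioned above, here is a minimal sketch of the general idea. It is not taken from sd-palettize or unfake.js; the function names, the deterministic seeding, and the list-of-tuples pixel format are all assumptions made for illustration.

```python
def nearest(color, centers):
    """Index of the palette entry closest to `color` (squared RGB distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(color, centers[i])),
    )


def kmeans_palette(pixels, k, iterations=10):
    """Reduce a list of (r, g, b) tuples to at most k colors with plain k-means.

    Hypothetical helper: real tools typically work in a perceptual color
    space and use smarter initialization.
    """
    unique = sorted(set(pixels))
    k = min(k, len(unique))
    # Deterministic start: k colors spread evenly over the sorted unique colors.
    centers = [unique[i * len(unique) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in pixels:
            clusters[nearest(p, centers)].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            tuple(round(sum(c[ch] for c in cluster) / len(cluster)) for ch in range(3))
            if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    reduced = [centers[nearest(p, centers)] for p in pixels]
    return reduced, centers


# Two slightly different reds and two slightly different blues collapse to two colors.
pixels = [(250, 0, 0), (255, 5, 0), (0, 0, 250), (5, 0, 255)]
reduced, palette = kmeans_palette(pixels, 2)
```

Palette reduction of this kind only addresses the color-depth problem; it does nothing about pixels that sit off the grid.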

jasonjmcghee a day ago

I can't explain it, but it's like uncanny valley pixel art. Like the artist hasn't done the final polish pass maybe?

Maybe it's the inconsistent lights/shadows?

Maybe a pixel artist has the proper words to explain the issues

SXX a day ago

Not pixel artist, but game dev working with pixel art:
1 - AI just tries to compress too many details into so few pixels.
When artists create pixel art they usually add details along the way, and only the important ones, because otherwise it will look like rubbish on some screens.
Also it's easier to e.g. add different hats or heads or weapons on the same body. AI-generated ones are always too unique.
2 - AI tries to mimic realistic poses that look like the art is supposed to be animated in 3D.
For a real game, if you make let's say an isometric tactical game, you'll never make tiles larger than 64x64 because of how much labour they will take to animate. Each animation at 8fps takes hours of work.
So pixel art is usually either high-fidelity and static or low-fi and animated in very basic ways.
smusamashah a day ago

The skeleton has issues, and the floor tiles are very inconsistent, for example. I haven't looked more carefully. We probably notice something wrong subconsciously but it takes time to point those out.
Generated pixel art for now is in an 80-90% done state. To use it in prod, the issues should be fixed, which seem to be the palette and some semantic issues. If you only generate small parts of the big picture with AI, it will be perfectly usable.
doctorpangloss 16 hours ago

The borders of shapes are all wrong. It’s not too complicated. There is a small vocabulary of valid border patterns (e.g. a line rising one pixel up and two pixels right) that none of these generative models adhere to.
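
One simplified way to read the "vocabulary of valid border patterns" above is that a sloped edge should advance in runs of equal length, e.g. two pixels right for every one pixel up. The sketch below checks only that one rule; the function names and the edge representation (a list of (x, y) pixels ordered by x) are assumptions for illustration, and real pixel-art conventions cover more cases than this.

```python
from itertools import groupby


def run_lengths(edge):
    """Length of each horizontal run in an edge given as (x, y) pixels ordered by x."""
    return [len(list(group)) for _, group in groupby(edge, key=lambda p: p[1])]


def is_uniform_line(edge):
    """True if every interior run has the same length.

    The first and last runs are ignored because they may be clipped by
    whatever the line connects to.
    """
    runs = run_lengths(edge)
    inner = runs[1:-1] if len(runs) > 2 else runs
    return len(set(inner)) <= 1


# A clean 2:1 line: every step is two pixels across, one pixel over.
clean = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2)]
# A jagged line: runs of 2, 1, 3, 2.
jagged = [(0, 0), (1, 0), (2, 1), (3, 2), (4, 2), (5, 2), (6, 3), (7, 3)]
```

Here `is_uniform_line(clean)` is true and `is_uniform_line(jagged)` is false.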

lxgr 21 hours ago

I'd love this, but for removing "transparent background" checkerboards.

Nano Banana beats it on many other dimensions, but this is one thing that gpt-image-1 usually does much better.

IgorPartola 20 hours ago

This is perfect! I have had such a time with Nano Banana asking it to generate some very simple pixel art. One of the worst things is that it cannot seem to generate transparent backgrounds or even solid ones. It’s always some blotchy cloud of off-white pixels or a simulated fuzzy grid that shows up in some places. I will need to give this a try to clean up some of what I had to try by hand.

forgotoldacc 10 hours ago

I simply cannot understand people who'll spend forever trying to get AI to generate basic art that any amateur with a bit of practice could do in a minute.

andai 13 hours ago

At last! I have been dreaming about such a tool for years. I often find pixel art that has been scaled or poorly compressed. So it's a bunch of fuzzy squares. Can't wait to try this.

threeducks a day ago

Could you explain a bit how the code works? For example, how does it detect the correct pixel size and how does it find out how to color the (potentially misaligned) pixels?
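
The thread does not describe how the tool does this, so the following is only a generic sketch of one possible approach, not the project's algorithm: guess the cell size from the lengths of same-color runs, then collapse each cell to its most common color. All names are hypothetical, and the GCD estimate assumes a cleanly upscaled image; noisy AI output would need a more robust estimator.

```python
from collections import Counter
from functools import reduce
from itertools import groupby
from math import gcd


def estimate_cell_size(image):
    """Guess the cell size as the GCD of all horizontal same-color run lengths.

    Assumes a cleanly upscaled image; a single stray pixel drags the GCD to 1.
    """
    runs = [len(list(group)) for row in image for _, group in groupby(row)]
    return reduce(gcd, runs)


def downsample(image, cell):
    """Collapse each cell x cell block to its most common color (majority vote)."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height, cell):
        row = []
        for x in range(0, width, cell):
            block = [
                image[yy][xx]
                for yy in range(y, min(y + cell, height))
                for xx in range(x, min(x + cell, width))
            ]
            row.append(Counter(block).most_common(1)[0][0])
        out.append(row)
    return out


# A 2x2 checker upscaled 3x, then damaged with one wrong pixel.
small = [["a", "b"], ["b", "a"]]
big = [[small[y // 3][x // 3] for x in range(6)] for y in range(6)]
noisy = [row[:] for row in big]
noisy[0][0] = "b"
```

On the clean image the estimate is 3, and majority voting with that cell size recovers `small` from the noisy copy. The vote is what makes the color choice tolerant of a few misplaced pixels per cell.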

westoque 21 hours ago

> Current AI image models can't understand grid-based pixel art.

sounds like a good use case to fix this problem from the model layer. an image gen model that is trained to make pixel perfect art.

razster a day ago

That's an actually nice setup. Have you looked at Z-Image and the Pixel LoRA that was released? I've found it works fairly well at keeping the pixels matched with the grid.

vunderba a day ago

The Z-image turbo model is pretty heavily distilled. I can't imagine using it for any marginally complicated prompts.
Are you talking about the LoRA by LuisaP?
Somewhat ironically, that LoRA's showcase images themselves exhibit the exact issues (non-square pixels, much higher color depth than pixel art, etc) that stuff like this project / unfake.js / etc. are designed to fix.
https://imgur.com/a/vfvARkt

LorenDB 21 hours ago

How is the "with Rust" part relevant?

Zecc 19 hours ago

For what it's worth, it's what caught my attention. I wouldn't have found it so captivating if it had only said "Fixing Google Nano Banana Pixel Art". To be clear, it's not because of Rust in particular. It would have been the same if it said "with C#", or "with Python", or even just "programmatically". And on that note: I feel disappointed. I thought I would be reading about the development process, and not just a product presentation.
baq 10 hours ago

As a Rust fan I consider this a very valid question. Rust projects should be able to defend their worth without piggybacking onto the love Rust receives from programmers anymore. ‘Not written in js/ts/golang/python’ works for me, too, but it’s a mouthful.
dymk 20 hours ago

this is a site where people discuss programming languages and tools
rust is a programming language
people interested in rust may find a tool written in rust relevant to their interests where they otherwise might not
Svoka 20 hours ago

I guess writing something in Rust is cool. I believe that wanting to be cool is a fundamental human desire.

29athrowaway a day ago

Another annoyance of Nano Banana (and its Pro version) is that it cannot generate transparent pixels. When it wants to, it creates a hallucinated checkerboard background that makes it worse.

vunderba a day ago

Yep. Your best bet is to ask for "solid white/black background" and then feed it into something like rembg [1]. It's an extra step but it'll get you partly there.
On the OpenAI side, the gpt-image-1 model has actually had the ability to produce true alpha transparent images for a while now. Too bad quality-wise they're lagging pretty badly behind other models.
[1] - https://github.com/danielgatis/rembg
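
As a much cruder stand-in for the "solid white background, then remove it" step, near-white pixels connected to the image border can be keyed out with a flood fill. This is not what rembg does (it uses a learned segmentation model); the function name, the threshold, and the rows-of-tuples image format are assumptions for illustration.

```python
from collections import deque


def key_out_background(pixels, threshold=240):
    """Make near-white pixels connected to the border fully transparent.

    pixels: rows of (r, g, b) tuples. Returns rows of (r, g, b, a) tuples.
    Near-white regions enclosed by the sprite are left opaque.
    """
    height, width = len(pixels), len(pixels[0])

    def is_background(x, y):
        # "Near white" means every channel is at or above the threshold.
        return min(pixels[y][x]) >= threshold

    # Seed the fill with every near-white pixel on the image border.
    queue = deque(
        (x, y)
        for y in range(height)
        for x in range(width)
        if (x in (0, width - 1) or y in (0, height - 1)) and is_background(x, y)
    )
    seen = set(queue)
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and (nx, ny) not in seen and is_background(nx, ny)):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return [
        [(*pixels[y][x], 0 if (x, y) in seen else 255) for x in range(width)]
        for y in range(height)
    ]


W, K = (255, 255, 255), (0, 0, 0)
sprite = [
    [W, W, W, W, W],
    [W, K, K, K, W],
    [W, K, W, K, W],
    [W, K, K, K, W],
    [W, W, W, W, W],
]
cut = key_out_background(sprite)
```

Filling from the border rather than thresholding every pixel keeps white details inside the sprite, such as the center pixel in the example, while the off-white haze around it is removed.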
SXX a day ago

Ask it for just a white background. Works well for both art and to-be-3D models.

jtfrench 16 hours ago

Go Hugo!

cipehr a day ago

Is it possible that some of the reason pixels are messed up is because of the watermarking? https://deepmind.google/models/synthid/

Or is it purely because the models just don't understand pixel art?

skavi a day ago

I wonder if this would be a simple (limited) example of defeating the watermarking? Surely there's no way SynthID is persisting in what is now a handful of pixels.
29athrowaway a day ago

They also don't understand spritesheets.