
Playbook
Tasteful Skills
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.
Carl Lapierre tried to force an OSS 20B model to emit pixel art one color token at a time. Instead of using DALL·E-style image generation, he mapped tokens like R, G, B, Y, O, P, W, K, and . to palette colors so he could control every pixel directly.
Constrained decoding works cleanly for banning tokens like em dashes, but gets messy fast for art generation. Carl shows how setting an unwanted token's logit to negative infinity removes outputs like the em dash, then explains why token-level pixel constraints break at scale: tokens vary in length and often bundle dots and newlines together.
The failed decoding experiment turned into an accidental benchmark of LLMs for pixel art — after testing multiple models, he found some odd behaviors like 'fur mini' making circular pixel grids, while most models simply weren’t reliable enough for usable sprite generation.
Gemini 3 was the breakthrough: it generated 'pixel perfect' sprites in about six seconds — Carl demos hearts, ducks, and a health potion, and says Gemini 3's low-reasoning mode suddenly made this feel like a different paradigm because it 'gets it right all the time.'
The real advantage over diffusion models is editability, not just generation — once the sprite exists as constrained text tokens, Carl can directly swap colors, like turning a health potion into poison, which he says normal diffusion pipelines don’t give him.
Prompting still matters even under constraints — to keep the model oriented in a 'sea of points,' he injects a schema-like prompt explaining which token maps to which color, uses few-shot examples, and notes weird tokenizer artifacts like O and K repeatedly appearing because the model wants to say 'okay.'
Carl opens with a very specific itch: he wants lots of pixel-art assets for simulation games, but tools like DALL·E or 'nano banana' only give him finished PNGs, not control over individual pixels. So he goes one layer deeper, to token generation itself, with the hope that each token could stand for a color in a sprite.
Before getting to art, he gives a clean demo of constrained decoding using an em dash as the villain. Since LLMs generate logits over tokens, you can effectively ban a token by setting its score to negative infinity — perfect, he jokes, if you write generic LinkedIn posts and don’t want people thinking 'you’re a robot.'
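The banning trick can be sketched in a few lines. This is a minimal illustration rather than Carl's actual code: the four-token vocabulary, the logit values, and the helper names `ban_tokens` and `softmax` are all made up for the example.

```python
import math

def ban_tokens(logits, banned_ids):
    """Return a copy of the logits with every banned token id set to -inf."""
    return [-math.inf if i in banned_ids else score
            for i, score in enumerate(logits)]

def softmax(logits):
    """Turn logits into probabilities; a -inf logit becomes probability 0."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary: the em dash sits at index 2 and has the top score.
vocab = ["hello", ",", "\u2014", "world"]
logits = [1.0, 0.5, 3.0, 0.2]

masked = ban_tokens(logits, banned_ids={2})
probs = softmax(masked)  # the em dash can no longer be sampled
```

In a real decoder the same mask would be applied at every generation step, before sampling. The sketch assumes at least one token stays unbanned; masking the whole vocabulary leaves nothing to sample from.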
Carl then remaps a tiny token set into colors: R, G, B, Y, O, P, W, K, plus . for transparent, which he says worked better than using T. On a small example like a heart, it looks magical: token by token, the model emits colored cells and starts resembling real pixel art.
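A rough sketch of what that mapping might look like once the text comes back from the model. The letters and the use of . for transparent come from the talk; the RGBA values, the meaning assigned to each letter, the `decode_sprite` helper, and the tiny heart are assumptions for illustration.

```python
# Assumed palette: the talk names the letters, not the exact colors.
PALETTE = {
    "R": (255, 0, 0, 255),      # red
    "G": (0, 160, 0, 255),      # green
    "B": (0, 0, 255, 255),      # blue
    "Y": (255, 220, 0, 255),    # yellow
    "O": (255, 140, 0, 255),    # orange
    "P": (160, 0, 200, 255),    # purple
    "W": (255, 255, 255, 255),  # white
    "K": (0, 0, 0, 255),        # black
    ".": (0, 0, 0, 0),          # transparent
}

def decode_sprite(text):
    """Convert rows of palette characters into a grid of RGBA tuples."""
    rows = text.strip().splitlines()
    width = len(rows[0])
    for row in rows:
        if len(row) != width:
            raise ValueError("sprite rows must all have the same width")
    # An unknown character raises KeyError, which flags invalid model output.
    return [[PALETTE[ch] for ch in row] for row in rows]

# Hypothetical 5x5 heart, similar in spirit to the small demo.
HEART = """
.R.R.
RRRRR
RRRRR
.RRR.
..R..
"""

pixels = decode_sprite(HEART)
```

The resulting grid can be handed to any image library to write a PNG; the point is that the sprite stays plain text until the last step.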
Then the demo hits reality: at larger scales, the output turns into junk. His key point is that token constraints are not like regex over characters — tokens have uneven sizes, sometimes bundling multiple dots or newlines, so you can’t just limit the model to a tiny clean alphabet and expect stable structured images.
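The mismatch between characters and tokens is easy to see with a toy vocabulary. Everything below is hypothetical: real tokenizers have many thousands of entries, and `TOY_VOCAB` only imitates the kinds of bundled tokens Carl describes.

```python
# Characters the sprite format allows, including the row separator.
ALPHABET = set("RGBYOPWK.\n")

# Hypothetical vocabulary fragment with bundled dots and newlines.
TOY_VOCAB = ["R", "G", ".", "..", "....", ".\n", "\n\n", "OK", "okay", " the"]

def allowed_tokens(vocab, alphabet):
    """Keep tokens whose characters all belong to the sprite alphabet."""
    return [tok for tok in vocab if tok and set(tok) <= alphabet]

def pixels_per_token(tok):
    """Count how many pixels (non-newline characters) one token emits."""
    return sum(1 for ch in tok if ch != "\n")

allowed = allowed_tokens(TOY_VOCAB, ALPHABET)
widths = {tok: pixels_per_token(tok) for tok in allowed}

# Every token here passes the character filter, yet the rows come out ragged.
sample = "".join(["..", "R", ".\n", "....", "OK", "\n\n", "G"])
row_lengths = [len(row) for row in sample.split("\n")]
```

A character-level filter accepts `"...."` and `"\n\n"` just as readily as `"."`, so one token can emit zero, one, or several pixels. That is why restricting the alphabet alone does not guarantee a rectangular grid.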
That failure became a new project: comparing models on pixel generation. Carl says one model, 'fur mini,' hilariously kept making circles — it somehow understood pixel grids, but only in round form — and for a while there just wasn’t a model that could produce consistently good sprites.
Then came Gemini 3, 'this week,' and the room wakes up with him. Using its low-reasoning mode, he demos a heart, mentions a duck he made with a friend, and shows a health potion, all generated in around six seconds; the crowd claps, and he calls it 'pixel perfect' and his 'new toy.'
The big payoff is control after generation. Because the image is represented as constrained tokens, he can tweak colors directly — like flipping a health potion into a poison potion — which he says is exactly the kind of granular editing normal diffusion models don’t really offer.
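Because the sprite is just text, a recolor is a character substitution. The potion below is a made-up stand-in for the one in the demo, and `recolor` is a hypothetical helper, assuming R is red and G is green.

```python
def recolor(sprite, mapping):
    """Swap palette characters in a sprite, e.g. {"R": "G"} turns red into green."""
    table = str.maketrans(mapping)
    return sprite.translate(table)

# Hypothetical 6x6 potion: black outline (K), white cork (W), red liquid (R).
POTION = "\n".join([
    "..KK..",
    "..WW..",
    ".KRRK.",
    "KRRRRK",
    "KRRRRK",
    ".KKKK.",
])

# Health potion to poison: only the liquid changes, the outline is untouched.
poison = recolor(POTION, {"R": "G"})
```

The edit touches exactly the pixels that carried the old color and nothing else, which is the granular control the talk contrasts with diffusion pipelines.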
In Q&A, Carl explains that constraints alone aren’t enough; like structured JSON outputs, the model still needs a schema-like prompt telling it what each token means and what a valid drawing looks like, plus few-shot examples. He also shares a very tokenizer-specific bug: O and K often appeared at the start because the model kept trying to say 'okay,' and when he first banned em dashes, it simply switched to every other dash-looking character it could find.
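Both Q&A points can be sketched in code. The prompt wording, the color names in `LEGEND`, and the helpers `build_prompt` and `dash_like_token_ids` are assumptions for illustration, not the prompt or mask Carl used. The second helper relies on Unicode's dash punctuation category (`Pd`) to catch em dash lookalikes in one pass.

```python
import unicodedata

# Assumed legend: the talk gives the letters, the color names are guesses.
LEGEND = {
    "R": "red", "G": "green", "B": "blue", "Y": "yellow", "O": "orange",
    "P": "purple", "W": "white", "K": "black", ".": "transparent",
}

def build_prompt(subject, size, examples):
    """Assemble a schema-like prompt: task, token legend, then few-shot examples."""
    lines = [
        f"Draw a {size}x{size} pixel-art sprite of: {subject}.",
        f"Output exactly {size} rows of {size} characters and nothing else.",
        "Each character is one pixel:",
    ]
    lines += [f"  {token} = {color}" for token, color in LEGEND.items()]
    for name, art in examples:
        lines += ["", f"Example ({name}):", art]
    lines += ["", "Sprite:"]
    return "\n".join(lines)

def dash_like_token_ids(vocab, keep=("-",)):
    """Ids of tokens containing any Unicode dash (category Pd) not listed in keep."""
    return {
        i for i, tok in enumerate(vocab)
        if any(ch not in keep and unicodedata.category(ch) == "Pd" for ch in tok)
    }

heart = ".R.R.\nRRRRR\nRRRRR\n.RRR.\n..R.."
prompt = build_prompt("a health potion", 5, [("heart", heart)])

# Hypothetical vocabulary: em dash, en dash, hyphen, figure dash, minus sign.
vocab = ["a", "\u2014", "\u2013", "-", "x\u2012y", "\u2212"]
banned = dash_like_token_ids(vocab)
```

The ordinary hyphen is kept by default because it also belongs to category `Pd`, and banning it would break normal writing. The minus sign falls under a different category (`Sm`), so a stricter ban would need its own rule.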
Asked to draw something abstract like 'loneliness,' Carl jokes, 'Should I show a picture of us?' Then the model produces what he calls 'a gray mountain,' which feels like the perfect ending: half demo, half comedy, and a reminder that constrained decoding is both a real science and a playground for strange model behavior.