Vibe coding

I like to think of myself as an “old-skool” dev - though let’s be honest, that’s just a hip way of saying I’m an old dev who’s been writing code since another millennium.

It wasn’t that long ago that professionals working on the web would describe their work using the language of a “craftsman”. We were digital artisans, building pixel-perfect designs, creating delightful digital experiences and shipping hand-crafted code that we were proud of.

These days developers have a new language that captures the zeitgeist of the modern AI engineer: Vibe coding.

@karpathy on Vibe Coding

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

Karpathy goes on to describe vibe coding in practice as some kind of slider.

@karpathy

All the way on the left you have programming as it existed ~3 years ago. All the way on the right you have vibe coding.

I like the slider analogy and see this as a transition from craft to vibes. It’s the modern day equivalent of when textile work shifted from a highly skilled craft to an entirely automated process. The transition didn’t occur overnight, but one Spinning Jenny here and a Jacquard Loom there, and the next thing you know the craft of textile working is no more. The march of progress is inexorable, and we’ve seen that transition repeat many times through history.

But where exactly are we on this slider? And if I can stretch the analogy even further, if our slider is a DJ’s cross-fader, are we in for a quick chop, or is this going to be a long fade mix?

Vibe slider

Current state of play

I recently spent a few days with a company that are cranking that vibe slider as far as they can. They’re building their own agent tooling that will take a boilerplate app and spec, and use AI to create a fully functioning app in 15 to 30 minutes.

It’s really impressive to watch an agent loop over the spec and iterate on the build step by step. What they’ve built so far is great, and one of their generated apps has been out in the wild for a while and is already generating revenue.

But… their generated apps are still what you might consider (and I’m quoting their CEO here), “shitty little apps”. We’re talking single-feature apps - a small database, user auth, a handful of routes and a couple of endpoints. For a competent dev, it’s the kind of thing you could bash out in a day or two.

I can pick other bones too. When the agents struggle with a more complex feature, the team typically hand-writes the code and wraps it in a function, so the AI-generated code can implement the feature just by calling that function. That’s a totally pragmatic thing to do, but we’re already edging that slider back towards craft coding.

I think what this company is doing is really cool, and their ambition is far greater than shitty little apps. But I mention all this because I think it reflects the messy reality of vibe coding. Anyone who’s used Cursor for anything more complex than a one-shot Flappy Bird clone will know there’s a lot of effort involved in prompting, providing context, and testing and validating outputs. Without a skilled human pulling the strings, the vibes can get ugly, quickly.

Prompting and context challenges

Prompting isn’t just about describing a task or instruction. Agents need to know everything about your codebase in quite a lot of detail. They need to know the stack, the key dependencies, and they need to see lots of examples of how you actually want them to code.

And as a project and its architecture evolve, so too must the prompts and fragments of prompts that your agents depend on. Managing all of this is quite a lot of effort. Being able to prompt clearly and effectively is a skill that not every developer is going to be blessed with. But for vibe coding, it’s essential.

There are sites popping up for users to share .cursorrules files for different stacks, and some developer tools and libraries are starting to share prompt fragments in their documentation. This is all great to see and does help the vibe coder find their way in all this.

Context management poses another challenge. In principle, agents work better when they are shown the right context at the right time, and not necessarily all the context all the time. Ideally, IDEs would make this just work™, but in practice this is tricky stuff to get right and another messy aspect of vibe coding that developers need to grapple with.
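To make “the right context at the right time” a bit more concrete, here’s a minimal sketch of one naive approach: score each snippet of the codebase against the task and pack the most relevant ones into a fixed token budget. Everything here is an assumption for illustration - the `Snippet` type, the word-overlap scoring and the four-characters-per-token estimate are mine, not how Cursor or any other IDE actually does it.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    path: str
    text: str


def estimate_tokens(text: str) -> int:
    # Very rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def relevance(task: str, snippet: Snippet) -> int:
    # Naive score: how many distinct words the task and the snippet share.
    task_words = set(task.lower().split())
    snippet_words = set(snippet.text.lower().split())
    return len(task_words & snippet_words)


def select_context(task: str, snippets: list[Snippet], token_budget: int) -> list[Snippet]:
    # Most relevant snippets first; sorted() is stable, so ties keep input order.
    ranked = sorted(snippets, key=lambda s: relevance(task, s), reverse=True)
    chosen: list[Snippet] = []
    used = 0
    for snippet in ranked:
        if relevance(task, snippet) == 0:
            break  # everything after this point is irrelevant too
        cost = estimate_tokens(snippet.text)
        if used + cost <= token_budget:
            chosen.append(snippet)
            used += cost
    return chosen
```

Even this toy version shows where the mess comes from: a crude relevance score will happily pick the wrong files, and a tight budget forces you to leave out things the agent might have needed.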

Validation challenges

When we’re writing code ourselves, we inherently validate our work as we go. But with vibe coding, validation becomes a critical challenge that can make or break the entire approach. We can either fly blind, or we can implement robust validation strategies and take advantage of mathematics and probability.

@jonas on Probability

If an AI agent gets a task (say, building an app) right only 1/10 times, it means that with enough money it can get it right 99.99% of the time.

For that to work, though, the agent needs good validators that tell it whether it did the right thing.

The idea here is that if an agent succeeds only 10% of the time and you run enough of those agents (88 I think, but someone brainier than me can explain the maths), then there’s a 99.99% chance at least one of them succeeds.
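For what it’s worth, the maths is short enough to sketch. Assuming every run is independent and succeeds with the same probability p (a big assumption - in practice agent failures are likely to be correlated), the chance that at least one of n runs succeeds is 1 - (1 - p)^n. The function names below are mine, purely for illustration.

```python
import math


def chance_at_least_one(success_rate: float, runs: int) -> float:
    # Probability that at least one of `runs` independent attempts succeeds.
    return 1 - (1 - success_rate) ** runs


def agents_needed(success_rate: float, target: float) -> int:
    # Smallest n with 1 - (1 - success_rate)**n >= target,
    # i.e. n >= log(1 - target) / log(1 - success_rate).
    if not (0 < success_rate < 1 and 0 < target < 1):
        raise ValueError("success_rate and target must be strictly between 0 and 1")
    return math.ceil(math.log(1 - target) / math.log(1 - success_rate))
```

With a 10% success rate, `agents_needed(0.1, 0.9999)` comes out at 88 runs, and `agents_needed(0.1, 0.999)` at 66. None of that helps unless something can tell which of those runs was the successful one.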

All that depends on having good tests and checks in place to determine when an agent gets it right. There’s some low-hanging fruit here: your agents should definitely be seeing your TypeScript compile errors and linter warnings for a start.
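As a rough sketch of that low-hanging fruit, the snippet below runs a set of check commands and collects the output of any that fail into a single message you could paste back to the agent. The helper names (`run_checks`, `feedback_for_agent`) are hypothetical, and the example commands assume a Node project with TypeScript and ESLint installed locally; swap in whatever your stack uses.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: dict[str, list[str]], cwd: str = ".") -> list[CheckResult]:
    # Run each command and record whether it exited cleanly, plus what it printed.
    results = []
    for name, command in checks.items():
        try:
            proc = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
            passed = proc.returncode == 0
            output = (proc.stdout + proc.stderr).strip()
        except FileNotFoundError as err:
            # The tool isn't installed; treat that as a failed check.
            passed, output = False, str(err)
        results.append(CheckResult(name, passed, output))
    return results


def feedback_for_agent(results: list[CheckResult]) -> str:
    # Empty string means every check passed; otherwise return the failures verbatim.
    failures = [r for r in results if not r.passed]
    return "\n\n".join(f"[{r.name}] failed:\n{r.output}" for r in failures)


# Assumed setup: a TypeScript project with ESLint configured.
TYPESCRIPT_CHECKS = {
    "typecheck": ["npx", "tsc", "--noEmit"],
    "lint": ["npx", "eslint", ".", "--max-warnings", "0"],
}
```

Calling `feedback_for_agent(run_checks(TYPESCRIPT_CHECKS))` after each agent step gives you either an empty string or a block of compiler and linter output to feed into the next prompt. The `--max-warnings 0` flag is there so that warnings, not just errors, count as a failure.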

And then there’s testing. In a vibe coding maximalist world the AI will write the tests too, but for now this is a role for developers whilst they push the vibe slider as far as they’re comfortable with.

UI and UX challenges

If moving the vibe coding slider to the right comes at a cost, by far the greatest cost, in my view, is the detrimental effect on the quality of design and user experience.

Whilst agents can do a reasonable job throwing components together using frameworks like shadcn and similar, let’s be honest, AI-gen design is always very, very average. We’re talking functional but uninspired layouts, predictable typography and color schemes that you know you’ve seen a thousand times over. It’s bland, generic, just “meh” kinda stuff.

Sometimes that’s fine. If it’s a private or internal project, we can make these kinds of compromises. But for a real, consumer-facing, commercial app, I think the vibe coder falls well short of any craft coder with good design chops.

There’s a missing piece here: not just models that have visual understanding, but models that are specifically trained on decades of examples of great design and can review and improve their own code from a UX point of view.

I’m sure somewhere there are some smart people working on exactly that, but for now I think vibe coders just fundamentally don’t have the tools for doing good design and UX.

The times, they are a changin’

The software engineer, as a role, is well and truly in the mix. You can hear the beat - DJ Vibes is moving that slider from left to right. But I don’t think we’re quite as far into the transition as others seem to think.

@shl on Junior Devs

No longer hiring junior or even mid-level software engineers.

Our tokens per codebase:

Gumroad: 2M
Flexile: 800K
Helper: 500K
Iffy: 200K
Shortest: 100K

Both Claude 3.5 Sonnet and o3-mini have context windows of 200K tokens, meaning they can now write 100% of our Iffy and Shortest code if prompted well.

Our new process:

1. Sit and chat about what we need to build, doing research with Deep Research as we go.
2. Have AI record everything and turn it into a spec.
3. Clean up the spec, adding any design requirements / other nuances.
4. Have Devin code it up.
5. QA, merge, (auto-)deploy to prod.

Sahil’s post doesn’t really pass the sniff test. Notwithstanding the obvious fact that if you don’t have junior devs, then it isn’t long until you don’t have senior devs either, I simply don’t think the tooling and models are ready for this. But as a vision for what a maxed out vibe coding environment looks like… maybe?

I still don’t believe coding is dead, and I don’t believe junior or mid-level devs are done for, but I do believe this slide to the right is in play. The role of a developer or software engineer, across all levels, is fundamentally changing. Whilst we can dream of sitting around on beanbags, chatting, and brainstorming while the agents do the coding, the reality is we’re trading writing code for wrestling with prompts, managing context, and writing many, many tests and validators.

Conclusions… any?

The slide from craft to vibes isn’t a clean transition - it’s messy, complex, and full of hard problems that need to be solved - and we’re still very much in the early days of this transition.

The reality is that effective vibe coding today requires a peculiar mix of skills: traditional coding expertise, prompt engineering finesse, and a willingness to test AI outputs far more rigorously than you’d test your own code. It’s less about replacing development skills and more about augmenting them with new tools and approaches.

Just as the industrial revolution was a more gradual transition than the age’s technological advancements suggested - a transition limited by the practical realities of human adaptation - the shift to vibe coding will likely be just as nuanced. The greatest opportunities will come to those who can navigate this gap between AI capability and human readiness.

For now, the slider sits closer to craft than vibes. The future may yet bring Sahil’s vision of developers as spec-writers and AI-wranglers into reality, but the path there isn’t as straight or as short as some might suggest. The beat goes on, the mix continues, and somewhere between craft and vibes, we’ll find our groove.