LLMs: How to Get Past “Meh” Outputs
When I’m doing deep technical work, I sometimes get a nagging feeling that pulls me down a specific troubleshooting route, line of investigation, or coding path. It’s hard to explain, but it feels a lot like what the book “Thinking, Fast and Slow” describes as System 1 thinking and heuristics. My take is that highly experienced professionals build intuition over time, and that intuition gives them a spider-sense-like advantage.
It’s also hard to explain how to develop that spider sense. But it seems tied to deep practice: real reps, under pressure, right at the edge of what you can currently do. As Gemini puts it:
“Deep practice is an intense, focused method of skill development that pushes you to the edge of your current abilities, making and correcting mistakes to build new neural pathways, rather than just mindlessly repeating what you already know.”
And I think the key phrase there is “pushes you to the edge.” Mainly because it sucks. It’s uncomfortable, frustrating, and makes you feel like a fool until you get it. Few people go all the way through that part consistently.
Now I want to pivot to GenAI, specifically coding with it (them?).
The results I’m getting with recent tools and models are nothing short of amazing. But if you look at social media, you’d think AI-generated code is either the best thing in the world or crap that should never make it past a small “Hello World” prototype. At the same time, plenty of staff and principal engineers at top companies are genuinely impressed by what they’re seeing. So why the disconnect?
To me, the disconnect boils down to experience and effort.
Some people treat AI like a vending machine. They throw in a basic prompt, grab the output, and call it done. When the code comes out messy or broken, they blame the tool. It looks like crap because they didn’t guide it.
Others treat AI like a smart but raw intern. You coach it step by step. You iterate on prompts, fix errors in follow-ups, test the code yourself, and tweak until it’s solid. Over time, you build intuition for what works: when to use a certain model for speed, when to use a chain of prompts for complex logic, and when to stop and do it yourself. That’s how you get the kind of results that actually ship fast.
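To make the “intern” loop concrete, here’s a minimal sketch of what that coaching cycle can look like when you script it. It assumes the OpenAI Python SDK; the model name, the slugify task, and the tests/ directory are placeholders for illustration, not recommendations:

```python
import pathlib
import subprocess

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def ask(messages: list[dict]) -> str:
    """Send the running conversation to the model and return its reply."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; pick a model per task (speed vs. depth)
        messages=messages,
    )
    return resp.choices[0].message.content


# Start with a concrete, testable ask, not "write me some code."
messages = [{
    "role": "user",
    "content": "Write a Python module slugify.py with a function "
               "slugify(title: str) -> str. Reply with bare code only.",
}]

for _ in range(3):  # cap the loop; past a few rounds, do it yourself
    code = ask(messages)
    messages.append({"role": "assistant", "content": code})  # keep context

    # Verify instead of trusting: run your own tests against the output.
    pathlib.Path("slugify.py").write_text(code)  # assumes a bare-code reply
    result = subprocess.run(["pytest", "tests/"], capture_output=True, text=True)
    if result.returncode == 0:
        break  # good enough to review, tweak by hand, and ship

    # Feed the actual failure back instead of re-rolling a vague prompt.
    messages.append({
        "role": "user",
        "content": "These tests failed; fix the code:\n" + result.stdout[-2000:],
    })
```

The shape is the point, not the specifics: generate, verify with your own tests, feed the real failure back, and know when to stop and take over.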
And that’s the connection back to deep practice and System 1 thinking. I jumped on the GenAI bandwagon as soon as I could get my hands on ChatGPT in the early days. From the get-go, I put in hours—trying different ways to get things done, tweaking prompts and approaches each time, and consuming everything I could find. As soon as other tools started to appear, I signed up for them. I put in not only my time, but also my money, on multiple fronts, because the technology was that fascinating.
Even back when early LLMs were pretty hit-or-miss, I was still getting enough value out of them to move faster on a bunch of things. The first version of the code they spit out was usually rough, but if I nudged it with a few more prompts, it would get noticeably better. Not perfect, but often good enough to ship, or at least good enough to use as a solid starting point.
They’ve also been awesome for docs, and for cranking out user stories and epics for agile work. And honestly, they’ve been great at reviewing and poking holes in my writing for years now.
Part of my job now is helping my team (and others in my organization) get closer to that same level of effectiveness. There’s no shortage of excitement: many people are actively using the same tools I am, and most are getting great results. But not ground-breaking results.
Only a few people really seem to be getting it. They’re putting in the effort, trying different things, and not just sharing generic articles and videos: they’re actually showing their work and the results.
If there’s a takeaway here, it’s that GenAI isn’t a cheat code. It’s a multiplier. And like every multiplier, it mostly amplifies what you already bring to the table: your taste, your judgment, your ability to spot nonsense, and your willingness to push past the first draft. The people getting “magic” results aren’t blessed by the model gods. They’re doing deep practice: iterating, testing, refining, and slowly building that System 1 intuition for how to steer these tools.
So here’s my ask: don’t use AI like a vending machine this week. Pick one real task you’d normally do the slow way—a refactor, a flaky test, a tricky Terraform module, a gnarly edge case—and commit to working it with an LLM end-to-end. Make it uncomfortable. Push the edge. Keep the receipts: the prompts you tried, what failed, what finally worked, and what you changed by hand.
And if you do it, share it. Post a short write-up, drop your prompt chain, or even just a before/after diff and what you learned. The fastest way for more people to “get it” is for the folks doing the work to show their work.