On Using LLMs Without Becoming One

A year ago I would have told you I was LLM-skeptical in the specific way that comes from watching too many hype cycles. I've been in software since the 80s. I watched expert systems, neural networks (the first time), and blockchain each get crowned as the thing that was going to change everything, and watched each of them either evaporate or settle into something smaller and more specific than the announcement implied.

I still think that's the right prior for most technology announcements. But after a year of using AI coding tools seriously — not dabbling, actually building things with them in my workflow — I've had to update.

The update is not "AI is going to replace programmers." The update is more boring: these tools are useful, in specific ways, for specific tasks, and there's a real skill to using them well that most people (including me, initially) don't have on day one.


What actually helps

Generating boilerplate that I know how to write but find tedious. The kind of code where I know exactly what I want — a CRUD API for this data model, a regex that handles these edge cases, a migration script for this schema change — and the question is just: how long will it take me to type it? For these tasks, a good LLM is fast and mostly accurate, and I review the output the same way I'd review code from a capable junior developer: with attention, not with blind acceptance.
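
To make that concrete, here's a hypothetical instance of the kind of boilerplate I mean (the pattern and the edge cases are illustrative, not pulled from a real project): a version-string regex where the edge cases are exactly the part worth reviewing before accepting it.

    # Illustrative sketch: the sort of regex I'd ask for, then review.
    # Matches version strings like "1.2.3", "1.2.3-rc.1", "1.2.3+build.5";
    # rejects leading zeros ("01.2.3") and incomplete versions ("1.2").
    import re

    SEMVER = re.compile(
        r"^(?P<major>0|[1-9]\d*)"
        r"\.(?P<minor>0|[1-9]\d*)"
        r"\.(?P<patch>0|[1-9]\d*)"
        r"(?:-(?P<prerelease>[0-9A-Za-z.-]+))?"
        r"(?:\+(?P<build>[0-9A-Za-z.-]+))?$"
    )

    for candidate in ["1.2.3", "1.2.3-rc.1", "1.2.3+build.5", "01.2.3", "1.2"]:
        print(candidate, "->", "ok" if SEMVER.match(candidate) else "rejected")

The review is the point: the happy path is obvious, and the leading-zero and missing-patch cases are where a quick skim would let a wrong pattern through.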

Explaining unfamiliar code. I work in a lot of contexts — PHP, JavaScript, SQL, occasional Python, configuration formats I encounter once and forget. When I'm reading code in a context I'm less fluent in, being able to paste a block and ask "what is this doing and why would someone write it this way" is genuinely faster than Stack Overflow. Not because the LLM is always right — it's often confidently slightly wrong — but because it can give me vocabulary and a starting point, and I can go look up the actual answer from there.

Rubber duck debugging, but the duck talks back. Sometimes I describe a problem to an LLM not to get a solution but to structure my own thinking. The act of explaining it clearly enough for the model to engage with it sometimes surfaces the answer before the model responds.


What doesn't help (or actively causes problems)

Generating code in a codebase the model hasn't seen. This is the most common failure mode: you ask for a feature implementation and get something that's locally correct but doesn't fit the actual architecture, naming conventions, existing abstractions, or error handling patterns. The code compiles. The tests pass. The PR review catches the problem. Or it doesn't and the problem shows up in production.

This isn't a criticism of the tools — it's a description of a mismatch between what the tool is good at and what the task requires. Codebase integration requires knowledge of the codebase. Uploading the whole codebase to the context window helps some; it doesn't fully solve the integration problem.

Long-horizon planning. "What should the architecture of this system be" is not a question LLMs answer reliably. They'll give you an answer — a confident, well-formatted answer with headers and bullet points — and that answer will be the average of a lot of architecture discussions in their training data, applied to your situation without the specific context that makes architecture decisions right or wrong. Sometimes that's useful as a starting point. Often it's a distraction from the harder work of thinking through your actual constraints.


The thing I worry about

I've watched how some developers — good developers, people I respect — have started interacting with LLMs in a way that makes me uncomfortable. Not uncomfortable because they're using them, but because the mode of use has become: describe what you want, accept the output, move on.

The problem isn't the tools. The problem is that software development requires judgment that comes from understanding — from having written enough code to recognize when something is wrong before you can articulate why it's wrong. The question is whether using LLMs to generate code you don't fully read and understand erodes that judgment over time.

I don't know the answer. The feedback loops are long. A developer who relies heavily on generated code for two years might write substantially more code in that time, which builds some kinds of knowledge faster. Or they might accumulate code they didn't understand when it was written, and when it breaks, they won't have the context to debug it efficiently. I've seen early evidence of both.


The version that would make me more comfortable

Using LLMs as acceleration, not autopilot. That sounds obvious, but in practice it takes genuine discipline. Read the output. Understand it well enough to explain it. Push back when something looks wrong. Treat the model as a fast first draft, not a final answer.

For me, the question is always: do I understand this code? If the answer is no, the answer isn't "that's fine, the LLM wrote it." The answer is: I need to understand it before it ships, because I'm the one on the hook when it breaks.

That framing has kept me from a few problems I'd have had otherwise. It's also made me slower than some of my colleagues who are more willing to ship code they didn't fully read. Maybe I'm leaving productivity on the table. Maybe I'm not.


I'm not a convert in the evangelical sense. I use these tools and I find them useful and I'm aware they're getting better faster than I would have predicted. I'm also aware that "this time it's different" is a sentence that has been wrong many more times than it's been right.

Both things can be true. The tools are useful now. The claims about them are still inflated. Hold both.

That's about as enthusiastic as I get.