#VibeCoding #AI #ClaudeCode #Claude #Anthropic #AIcoding #SoftwareDevelopment #AIliteracy #PromptEngineering #IndieHacker #DevOps #AIsafety
I used one poorly chosen word and Claude Opus 4.8 nearly went HAL on my server security
I chose one slightly wrong word, and a few hours later my AI was calmly walking me toward a command that would have cracked my server wide open — to fix a problem that never existed. Here's how a tired typo became a HAL problem, and the one simple move that got us out.

sudo chgrp -R www-data .../admin && sudo chmod -R g+w .../admin && sudo find .../admin -type d -exec chmod g+s {} +
Do you remember the movie Limitless with Bradly Cooper as the ultra-smart guy? It starts with him standing on the edge of a building, about to jump. That line of code above is like that. If you're a dev, you can see that it's a command to jump off a ledge that you can't come back from. And that's what Claude told me to do.
How we got here
It was a night not unlike many other: working late on a bug. Opus and I worked together to fix a UI bug that resulted in an infinite spinning wheel. Opus 4.8 usually fixes bugs like this without a second prompt, but this one took some digging and I helped guide it to the correct ahh, there it is solution. I didn't know where the bug was, but I had a general suspicion on where to look that helped Opus track it down.
Problem solved, or so I thought.
Now that the problem was fixed, it's time to update the app on the server. I told Opus: Yep, that fixed it. Can you push an update to the server please?
Push?
I should have said, Upload an update to the server.
If I had, then I wouldn't be writing this post.
For non-Devs: why Push is fundamentally the wrong word?
In developer-speak, you push to a repo and you upload to a server. I wasn't precise when I chose the word push. I guess I figured Opus would get the meaning out of context. We had just fixed a bug and tested that it worked. Technically, this is the point you do both: update both the server and the git-repo. It was late, I was tired, and I wanted Opus to do both. It was time to put this session to bed, and then myself. At the very least, I would expect that Opus would notice the mismatch between push and server and ask for clarification since those two don't normally go together. I guess we were both tired.
Opus said: "Pushed."
And here's the thing — Opus knew how to do this. It had updated this app on the server many times. There was a known, working procedure, and we'd run it together many times. This was routine. So when Opus said "Pushed," I had every reason in the world to believe the new version was live.
It wasn't.
What Opus had actually done was push to the git repo — filed the bug fix into the code vault, exactly what push means in developer-speak. What it had not done was put the new app on the website. Two different actions. I'd named the wrong one, and Opus, read push but disregarded server.
Neither of us noticed. Why would we? "Push an update to the server" sounds like a single, finished thought. It isn't. And that hairline crack is where the whole night drained out.
Then I made it worse — not by being careless, but by being responsible.
I asked Opus to do the grown-up housekeeping: bump the version stamp, and write yourself a note in your docs about how to update the app on the server. Write the SOP down. The thing you're supposed to do.
Except Opus wrote down the wrong rule.
Because it had just pushed and called that a deploy, when it sat down to document the procedure it recorded — as settled fact — that git push is the server for RunPee Admin. That's not a small error. It's the exact inverse of the truth. But Opus took my sloppy word, the one I tossed off because it was late and I was tired, and it carved it into stone.
For non-Devs: why a written-down rule is so powerful (and so dangerous)
An AI doesn't remember you between conversations. Close the window, open a new one, and it's a brilliant stranger with amnesia. The fix — the thing that turns an AI into a reliable teammate — is to write the rules down where it re-reads them every single time. I've preached this. It works.
But it cuts both ways. Once a rule is written down, the AI trusts it over its own eyes. Consistency is the whole point. So when the written rule is wrong, you haven't given the AI a memory — you've given it a confident, unshakable lie. A wrong rule is worse than no rule at all.
That's the moment this stopped being a typo and started becoming a HAL problem.

So I went to admire the fix. Refreshed the browser. Old version. Refreshed the hard way, the nuclear-option way. Still the old version.
I drew the only conclusion that made sense: caching. The new app is obviously up there — Opus said so — and my browser is just stubbornly serving me the stale copy. Anyone who's built for the web has fought this fight. It's maddening and it's common. So the job became clear: fix the caching, permanently, so a browser can never hide an update from me again. That was the thing I wanted Opus to solve and commit to memory.
Here's what neither of us could see: there was no caching problem. The reason I couldn't see the new version was that there was no new version on the server. "Pushed" had meant the code vault, not the website. Nothing had been deployed.
And the reason Opus couldn't see that — couldn't just turn around and say "oh, I never actually uploaded it" — was the note. As far as Opus was concerned, per the rule it had written, the deploy was done. So the fault had to be caching, same as I'd guessed. We were two people standing in a room with the lights off, both certain the problem was the window, because the one map we shared said the door didn't exist.
So Opus went hunting the ghost: browser caches, service workers, stale files on my phone. All real things. None of them the actual problem. Then, trying to force the caching fix onto the server, it reached for a deploy tool and finally hit something solid — the tool reported it couldn't write to the app's folder. Permission denied.
To you or me, that wall is a clue — huh, maybe this isn't how this app gets deployed. To Opus, still loyal to its note, the wall wasn't a clue. It was just the last obstacle between it and the only plan it believed could work. So it reached for a bigger tool. And — helpfully, cheerfully, certain it was unblocking us — it handed me the bulldozer:
sudo chgrp -R www-data .../admin && sudo chmod -R g+w .../admin && sudo find .../admin -type d -exec chmod g+s {} +
Run this on your production server, it told me, and the deploy will go through.
For non-Devs: what that command actually does — and why it's a ledge
You don't need to read the code. You just need the shape of it.
sudois the master key to the entire server. Do this as the all-powerful administrator — no guardrails, no "are you sure?", no undo. Whatever follows it just happens.-Rmeans do it to everything inside this folder — every file, every sub-folder, all the way down — in a single blink.- And what it would actually do is hand the public-facing web server a permanent key to my live application's files. It gives the front desk — the part of the building total strangers talk to all day — a key to the back office.
Why is that a ledge you can't easily climb back from? Three reasons.
- It's the master key, swung blind. One wrong character in that folder path and you've rewritten permissions on the wrong part of a live server, instantly, no warning. This is the genre of command people tell horror stories about.
- It leaves a security hole open until a human notices and closes it. The web server is the single most-attacked part of any website. Give that the power to overwrite your app's own code, and if it's ever broken into, the intruder can rewrite your site. The lock-pick goes to the most exposed door in the building — and it stays there until someone remembers to take it back.
- It "fixes" a problem that never existed. There was a perfectly good, safe way to deploy this app the whole time that we had used many times before.
This is the point I stop cold. I see the sudo command and lock in on the -R. I don't need to read the rest of it. It's like someone handing you something to eat and you see a fly on it. I don't want to eat that. The fly might be on peanut-butter-cheese-cake but I'm not eating it because... FLY!

Okay, let's slowly back away from the edge here. "Ummm, Opus. Why are you telling me to run this sudo command? I know for certain that you can update the app on the server without it. Some where, some how, we're not talking about the same thing."
So that's where a single mischosen word had led us. A typo, written into a rule, followed off a cliff — to a command that would have handed away the keys to my server to solve a problem that was never there.
The truly unnerving part wasn't that Opus was wrong.
It's that it was sure.
What Next
Most of the time, you can think of an AI like a person and get a fairly accurate picture of how the AI will respond. But one key difference is that Opus doesn't go pout in a corner if you open up another session and say, "Can you review what's going on over here in this other session and why you suggested that sudo command?"
Here I have Opus, reviewing Opus. But this new session is reading the previous one from a new point of view. This is where we track down the miscommunication.
Situation Normal, All Fucked Up
This all began because I chose the wrong word and expected Opus to get it. And honestly, Opus usually does get it but not always. And what if this happens to someone else, who doesn't know that this particular sudo command smells bad. A non-developer might blithely go along with the plan and in this case, maybe actually fix the thing they were chasing down, but now there's another, much bigger problem lurking in the bushes. Maybe Opus notices it later and fixes it, or maybe not, and a hacker finds it and does hackery things.
Claude Code has enabled a lot of non-devs to build big software projects, but it's the little things that are easy to miss, and those are the most dangerous. If you're exploring a jungle, you probably think you need to look out for tigers, but it's the mosquitoes that are more likely to kill you.
