April 2024. I’ve just pushed the first working version of naswoim.org to production. The app runs. Users can log in. Data persists in Supabase. On paper, it’s a success.
Under the hood: two competing design systems, seventeen duplicate utility functions scattered across twelve files, and a Supabase RLS policy that silently fails for one specific edge case I won’t discover for three more weeks.
Vibe coding works. But it fails in ways that are invisible — until they aren’t.
Quick context: what I built
Over roughly twelve months, I built three applications using vibe coding — with almost no help from external developers:
- naswoim.org — a platform for property investors: checklists, budgets, documents, expert marketplace, land maps, AI assistant. Web + Android + iOS + admin portal.
- industrverse.com — B2B SaaS for industrial VR training: 7 user roles, 9 dashboards per role, real-time communication, VR session gateway, full backend API.
- marcinpaszkiewicz.com — this site. Astro SSR + WordPress headless. Simpler, but instructive.
I used Claude Code as my primary tool, with occasional help from Cursor. I wrote somewhere between 40,000 and 60,000 lines of production code this way. I shipped everything. It all works.
Here’s what I got wrong.
Mistake #1: I let AI pick the architecture
When I started naswoim.org, I described the project to Claude and asked what stack it recommended. It gave me a solid answer: React 19, Vite, Supabase, Tailwind CSS 4. All excellent choices.
Then I asked about UI components. It suggested MUI 7. I said yes — it was fast, it had everything I needed.
The problem: I was already using Tailwind CSS 4. Now I had two design systems:
Diagram — design system conflict
existing screens (Tailwind CSS 4)
▸ spacing: rem scale
▸ breakpoints: sm/md/lg/xl
▸ tokens: CSS vars
new screen (MUI 7)
▸ spacing: 8px grid
▸ breakpoints: xs/sm/md/lg
▸ tokens: theme object
The lesson: AI doesn’t know your 18-month vision. It optimizes for "working right now." Architecture decisions — especially around design systems, data models, and module boundaries — must come from you. AI implements. You decide what to implement.
What I’d do differently: write a one-page architecture decision record before the first prompt. Not a full spec — just: what’s the single source of styling truth? What’s the state management philosophy? How are we splitting modules? Give AI constraints, not blank permission.
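A sketch of what that one-page record could look like — the specific choices below are illustrative examples for a stack like this one, not recommendations:

```markdown
# Architecture decisions — example skeleton

## Styling
- Single source of truth: Tailwind CSS 4. No second component library
  that ships its own theme system (no MUI, no Chakra).
- Design tokens live in CSS variables; components consume them via Tailwind.

## State management
- Server state: TanStack Query. Client state: Zustand.
- No ad-hoc Context stores for app state.

## Module boundaries
- Features are vertical slices: features/<name>/{ui, api, model}.
- Cross-feature imports only through a feature's public index file.
```

Even a skeleton this thin gives the AI something to check its suggestions against — and gives you something to say no with.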
Mistake #2: AI never says no — and that’s dangerous
By the time I was three months into industrverse, the backend had 13 NestJS modules, 7 user roles, and 9 separate dashboards. Each role had its own data access logic, its own notification system, its own workflow.
None of it was in the original spec.
industrverse — MVP plan vs. reality
▸ backend modules: planned 5 → actual 13
▸ user roles: planned 3 → actual 7
▸ dashboards: planned 3 → actual 9
Here’s what happens: you have an idea at 11pm. You describe it to Claude. It builds it in twenty minutes. It works. You ship it. Three weeks later, you realize that adding this feature broke the mental model for the next feature. But AI doesn’t tell you this — it just builds what you ask.
Every developer on a team has a colleague who says "wait — are we sure we need this?" AI doesn’t say wait. AI says yes.
The lesson: You must be the PM for your AI. Not just the visionary — the person who says no. The question isn’t "can AI build this?" (it can). The question is "should this exist at all?"
I now have a rule before any new feature: write one sentence about what problem this solves for a specific user. If I can’t write that sentence, I don’t prompt it.
Mistake #3: Debugging code you didn’t write is slower than it looks
In naswoim.org, I had a bug in the Supabase Row Level Security policies. Users in one role could occasionally see documents they shouldn’t — but only when a specific sequence of operations had happened first.
It took me three days to find it.
Not because the bug was complex. Because the code was AI-generated and I hadn’t read it carefully enough when it was written. The RLS policy looked right. It was syntactically correct. It passed my basic tests. The edge case was subtle — a combination of two different policy conditions that interacted in a non-obvious way.
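One common way two individually-correct RLS conditions interact badly: Postgres combines multiple permissive policies with OR, so a row is visible if any one policy passes — adding a second policy can silently widen access granted by the first. A minimal model of that mechanism in TypeScript; the roles, fields, and policy logic here are hypothetical illustrations, not the actual naswoim.org policies:

```typescript
// Model of how Postgres combines multiple PERMISSIVE RLS policies:
// a row is visible if ANY policy passes (logical OR).
// All names and conditions below are hypothetical.

type User = { id: string; role: "investor" | "expert"; orgId: string };
type Doc = { ownerId: string; orgId: string; sharedWithExperts: boolean };

// Policy 1: owners can see their own documents.
const ownerPolicy = (u: User, d: Doc) => d.ownerId === u.id;

// Policy 2 (added later): experts can see documents shared with experts.
// Looks correct in isolation — but it forgot to scope by organization.
const expertPolicy = (u: User, d: Doc) =>
  u.role === "expert" && d.sharedWithExperts;

// Postgres OR-combines permissive policies, so adding expertPolicy
// silently widened access for every expert, across organizations.
const visible = (u: User, d: Doc) =>
  ownerPolicy(u, d) || expertPolicy(u, d);

const outsideExpert: User = { id: "u2", role: "expert", orgId: "org-B" };
const doc: Doc = { ownerId: "u1", orgId: "org-A", sharedWithExperts: true };

console.log(visible(outsideExpert, doc)); // true — leaks across organizations
```

Each policy passes review on its own; only the OR-combination leaks. In SQL terms, the fix is to carry the same scoping condition into every permissive policy, or to express must-always-hold conditions as restrictive policies.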
When you write code yourself, you build a mental model of it as you write. When AI writes it, you review it — which is faster, but shallower. The model in your head is less complete. And shallow mental models make debugging slow.
The lesson: Never merge AI code you can’t explain line by line. For anything touching auth, data access, or business-critical logic: read it like a code reviewer, not like someone checking a shopping list.
Mistake #4: Context ends — and AI "forgets" everything
In a long Claude Code session, the AI sees everything you’ve built together. It knows your naming conventions, your patterns, your preferences. It’s coherent.
In the next session, it starts fresh.
Diagram — context memory across sessions. Competing patterns that accumulated in one codebase:
▸ TanStack Query
▸ Zustand atoms
▸ local useState
▸ API calls inline
▸ Context API
▸ Axios interceptors
In naswoim.org, I started a new session after a two-day break and asked Claude to build a new feature component. It generated something that worked — but used completely different patterns from everything else in the codebase. Different state management approach. Different error handling style. Different naming.
By month four, the codebase had three distinct "eras" — each reflecting the conventions of whoever I’d been talking to at that time.
The lesson: A CLAUDE.md file is not optional. Set it up on day one. It should contain: naming conventions, patterns to follow, patterns to avoid, which libraries to use for which problems. This is the persistent memory that bridges sessions.
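A sketch of what such a file could contain, mirroring the categories above — the concrete conventions listed are illustrative, not a template to copy:

```markdown
# CLAUDE.md — example skeleton

## Naming conventions
- Components: PascalCase, one per file. Hooks: useX. Files: kebab-case.

## Patterns to follow
- Server state: TanStack Query. Client state: Zustand.
- Errors: typed result objects from services; no throwing across layers.

## Patterns to avoid
- No new Context providers for app state.
- No inline fetch/axios calls in components — go through the api layer.

## Libraries (which tool for which problem)
- Styling: Tailwind CSS 4 only. Data fetching: TanStack Query. No MUI.
```

The point is not completeness — it is that every new session starts from the same conventions instead of inventing a new "era."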
Mistake #5: Security is invisible until it isn’t
AI generates working code. It doesn’t reliably generate secure code.
In industrverse, I had an API endpoint that was supposed to be accessible only to users with the "trainer" role. The endpoint worked correctly. It returned the right data. It handled errors gracefully.
It also didn’t verify the JWT role claim on one specific HTTP method. A user with any authenticated token could call it.
I found this in a manual security review — not because Claude flagged it, not because my tests caught it. Because I sat down and read through every auth-related endpoint one afternoon.
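This failure mode — a role check wired up handler by handler, where one HTTP method can be missed — is easier to avoid when authorization is declared once per resource and enforced for every method. A framework-agnostic TypeScript sketch; the path, role names, and claim shape are illustrative assumptions:

```typescript
// Hypothetical sketch: declare the required role once per resource,
// so the check applies to every HTTP method instead of being repeated
// (and potentially forgotten) in each handler.

type Method = "GET" | "POST" | "PUT" | "DELETE";
type Claims = { sub: string; role?: string }; // decoded, signature-verified JWT payload

const requiredRole: Record<string, string> = {
  "/training-sessions": "trainer", // applies to ALL methods on the resource
};

function authorize(path: string, _method: Method, claims: Claims): boolean {
  const needed = requiredRole[path];
  if (!needed) return true;       // no role requirement on this path
  return claims.role === needed;  // missing or wrong role claim → deny
}

// Any authenticated token without the "trainer" role is rejected,
// regardless of which method it uses:
console.log(authorize("/training-sessions", "DELETE", { sub: "u1" })); // false
console.log(authorize("/training-sessions", "GET", { sub: "u2", role: "trainer" })); // true
```

With the per-handler approach, forgetting one method produces exactly the bug described above; with a per-resource table, a missing entry fails loudly for every method at once.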
Security review — checklist after every auth feature
▸ Is every endpoint verifying the JWT/session?
▸ Is the user role checked server-side?
▸ Does RLS work for all role combinations?
▸ Is input validated before writing to the database?
▸ Are sensitive fields filtered out of the response?
▸ Are the edge cases (missing role, expired token) handled?
The lesson: After every feature that touches authentication, authorization, or user data — do a manual security review. Not a vibe. A checklist.
What I’d do differently: 6 rules
If I started today, with everything I know now:
What I’m not saying
I’m not saying vibe coding is flawed or that AI tools are unreliable. All three projects I built work. They have real users. They solve real problems. The productivity gain is genuine — I built in one year what would have taken a team of three eighteen months.
What I’m saying is that the failure modes are specific, and they’re not obvious at the start.
The biggest risk in vibe coding isn’t that AI writes bad code. It’s that AI writes code that looks fine — until something goes wrong. And by then, you’re looking at a codebase you half-understand, with a bug you didn’t write, and a mental model that has gaps in exactly the wrong places.
The speed is real. Build the habits that make it safe.