AI-assisted coding needs smaller loops
Notes from Matt Pocock's advice on context limits, vertical slices, TDD, and where human judgment still matters.
Tags: ai-tools, workflow, engineering
Source: Matt Pocock’s video (YouTube).
The more I use AI coding agents, the more obvious one thing becomes: the tool is only as good as the workflow around it.
Matt Pocock’s advice on AI-assisted coding is useful because it avoids the lazy version of the conversation. It is not just “prompt better” or “use a bigger model.” The point is more practical: keep the model inside a small, well-shaped problem with fast feedback, then use human judgment where it still matters.
Stay inside the smart zone
Large context windows are useful, but they are not magic.
A model may advertise 200k tokens, 1M tokens, or more. That does not mean the conversation stays equally sharp as the context fills up. The more tokens you add, the more relationships the model has to track. Eventually it enters what Matt calls the “dumb zone”: the place where it still sounds confident, but starts making worse decisions.
The practical rule is simple. Keep tasks small enough to fit inside the model’s smart zone. In practice, that means keeping active context fresh, focused, and usually well under 100k tokens.
Compaction helps, but it is not the same as a clean reset. Sometimes the correct move is to clear the context entirely and restart from the base state. The Memento analogy is crude but accurate: like the film's protagonist, who cannot form new memories and acts on fragmentary notes, an agent with damaged memory makes bad decisions look coherent.
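As a rough illustration of that rule of thumb, here is a minimal sketch of a context budget check. The four-characters-per-token ratio is only a common heuristic, and the thresholds and function names are assumptions made for this example, not anything Matt or a model vendor prescribes.

```python
# Assumption: ~4 characters per token is a rough heuristic for English text.
# Real tokenizers vary by model; use the provider's tokenizer for exact counts.
CHARS_PER_TOKEN = 4

# Assumption: illustrative thresholds, loosely based on the
# "well under 100k tokens" rule of thumb.
COMPACT_AT = 60_000
RESET_AT = 100_000


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text."""
    return len(text) // CHARS_PER_TOKEN


def context_action(messages: list[str]) -> str:
    """Suggest 'continue', 'compact', or 'reset' for the active context."""
    total = sum(estimate_tokens(m) for m in messages)
    if total >= RESET_AT:
        return "reset"
    if total >= COMPACT_AT:
        return "compact"
    return "continue"
```

The exact numbers matter less than having a trigger at all: a visible threshold turns "this conversation feels long" into a decision you make on purpose.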
Get grilled before code
The pure specs-to-code version of vibe coding sounds clean but usually fails in practice.
The promise is seductive: edit a spec file, hand it to the agent, and let the system compile software from requirements. The problem is that you lose your handle on the codebase. You stop designing the thing and start reviewing whatever the model happened to infer.
A better pattern is to make the agent interview you before it writes code.
Use a “grill me” prompt. Make it ask one question at a time. Force it to clarify the product, architecture, edge cases, constraints, data model, UI expectations, and failure modes before it touches the repository.
Here is how to install the skills collection that includes the “grill me” skill:
npx skills@latest add mattpocock/skills
That process is slower at the beginning, but it gives you and the agent a shared picture of the design. The agent is no longer guessing from a vague spec. It is helping you turn blurry intent into a design you can actually inspect.
Build vertical slices, not horizontal phases
Agents naturally drift toward horizontal implementation plans.
First they want to build all the schemas. Then all the API routes. Then all the UI. It looks organized, but it delays integrated feedback until the end, which is exactly when mistakes become expensive.
Vertical slices work better.
A vertical slice, or tracer bullet, cuts through the whole stack. One issue might include the data shape, service logic, and the smallest useful UI for a single behavior. It does not need to be complete. It needs to prove that the layers work together.
This matters because AI agents need feedback as much as humans do. A vertical slice gives you something real to run, review, and correct. A horizontal phase gives you a pile of parts and a lot of deferred risk.
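To make the idea concrete, here is a minimal sketch of one slice for a single behavior, "add a todo". The todo domain and every name in it are hypothetical, and an in-memory list stands in for a real database. The point is only that data shape, service logic, and a tiny UI all exist and run together.

```python
from dataclasses import dataclass, field


# Data shape: just enough fields for this one behavior.
@dataclass
class Todo:
    title: str
    done: bool = False


# Storage: an in-memory list stands in for a real database (assumption).
@dataclass
class TodoStore:
    items: list[Todo] = field(default_factory=list)


# Service logic: the single rule this slice needs.
def add_todo(store: TodoStore, title: str) -> Todo:
    title = title.strip()
    if not title:
        raise ValueError("title must not be empty")
    todo = Todo(title=title)
    store.items.append(todo)
    return todo


# Smallest useful UI: plain-text rendering of the list.
def render(store: TodoStore) -> str:
    return "\n".join(
        f"[{'x' if t.done else ' '}] {t.title}" for t in store.items
    )


if __name__ == "__main__":
    store = TodoStore()
    add_todo(store, "  write the first slice ")
    print(render(store))
```

Nothing here is complete, but every layer can already be run and reviewed, which is what a horizontal "all the schemas first" phase cannot offer.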
Prefer Kanban over sequential plans
Long sequential plans serialize the work.
Only one agent can really follow “phase one, then phase two, then phase three” without stepping on the rest of the plan. That is a bad shape for parallel execution.
A better structure is a Kanban board of independent issues with explicit blocking relationships. If task C depends on tasks A and B, say that directly. Treat the work like a directed acyclic graph, not a long checklist.
That gives you two advantages. First, it makes the real dependencies visible. Second, it lets multiple agents work on non-blocking tasks at the same time.
The board matters because it separates planning from scheduling. You can still have a product direction, but execution becomes a set of small, independently verifiable units.
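One way to picture the board as a graph is the sketch below, which uses Python's standard `graphlib` module to group issues into waves that could run in parallel. The board contents and the `batches` helper are hypothetical. Task C is blocked by A and B, as in the example above, and D is an unrelated task added for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical board: each issue maps to the set of issues that block it.
board = {
    "A": set(),
    "B": set(),
    "C": {"A", "B"},
    "D": set(),
}


def batches(blocked_by: dict[str, set[str]]) -> list[list[str]]:
    """Group issues into waves; issues in one wave can run in parallel."""
    sorter = TopologicalSorter(blocked_by)
    sorter.prepare()  # raises graphlib.CycleError if the board is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything currently unblocked
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unblock dependents
    return waves
```

Here `batches(board)` puts A, B, and D in the first wave and C in the second, so three agents could start immediately instead of waiting on a phase order.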
Make AFK coding earn the right to exist
Planning should stay interactive. Implementation can sometimes become AFK work.
That split only works if the codebase has strong feedback loops. Without tests, type checks, linting, smoke tests, and clear failure signals, an agent coding alone is just producing text into the dark.
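As a sketch of what a "clear failure signal" can look like, the snippet below runs a list of checks in order and stops at the first one that fails. The commands are placeholders that only run tiny Python snippets; the `CHECKS` list and `run_gate` name are assumptions for this example, to be swapped for a project's real test, type-check, and lint commands.

```python
import subprocess
import sys

# Hypothetical checks: replace with your project's real commands.
# These placeholders only run tiny Python snippets.
CHECKS = [
    ("tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
    ("lint", [sys.executable, "-c", "print('lint ok')"]),
]


def run_gate(checks: list[tuple[str, list[str]]]) -> tuple[bool, str]:
    """Run checks in order; stop at the first failure and report it."""
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return False, f"{name} failed:\n{result.stderr.strip()}"
    return True, "all checks passed"
```

An agent working unattended can call a gate like this after every change and treat a `False` result as a reason to stop and fix, rather than to keep generating.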
TDD is one of the best ways to tighten the loop.
Make the agent write a failing test first. Then make it implement the smallest change that passes. Then run the checks. The test is not just ceremony. It gives the agent a target and gives you evidence that the behavior exists.
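A minimal sketch of that loop, using a hypothetical `slugify` function as the behavior under test. The test was conceptually written first and failed because `slugify` did not exist; the implementation is the smallest change that makes it pass.

```python
import unittest


# Step 2: the smallest implementation that makes the test pass.
# Before this function existed, the test below failed with a NameError.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


# Step 1: the test, written first, pins down the behavior.
class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Smaller Loops Win"), "smaller-loops-win")
```

Saved as a file, this runs with `python -m unittest`. The implementation deliberately ignores punctuation and other edge cases, because no test asks for them yet.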
AFK coding is not magic autonomy. It is automation sitting inside a system that can reject bad work quickly.
Design deep modules for the agent to work inside
Garbage codebases make garbage agents.
A codebase full of tiny, highly coupled files is hard for humans to navigate and even harder for models. The agent has to chase relationships across the dependency graph, keep too many local conventions in memory, and guess which detail is important.
Deep modules are better.
A deep module has a small, stable public interface and hides a lot of internal behavior behind it. That shape gives the human a useful design surface and gives the agent a bounded interior to work on.
This is where the human engineer should spend taste and judgment. Design the interfaces. Decide the module boundaries. Make the external behavior boring and stable. Then delegate implementation inside that box.
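Here is a small sketch of that shape, using a hypothetical price-parsing module. The public surface is one type and one function, and everything prefixed with an underscore is interior that an agent could rewrite freely. A real deep module would hide far more behind its interface; this one is kept tiny for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

__all__ = ["Price", "parse_price"]  # the whole public surface


@dataclass(frozen=True)
class Price:
    cents: int
    currency: str


# Internal details: free to change as long as parse_price keeps its contract.
_SYMBOLS = {"$": "USD", "€": "EUR", "£": "GBP"}
_PATTERN = re.compile(r"^\s*([$€£])\s*(\d+)(?:\.(\d{1,2}))?\s*$")


def _to_cents(whole: str, fraction: Optional[str]) -> int:
    # Pad "5" to "50" so that "12.5" means 12 units and 50 cents.
    return int(whole) * 100 + int((fraction or "0").ljust(2, "0"))


def parse_price(text: str) -> Price:
    """Parse strings like '$12.5' into a Price, or raise ValueError."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a price: {text!r}")
    symbol, whole, fraction = match.groups()
    return Price(cents=_to_cents(whole, fraction), currency=_SYMBOLS[symbol])
```

The human decides that `parse_price` takes a string and returns a `Price` or raises; how the regex and the cents arithmetic work is the bounded interior that can be delegated.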
Do not automate taste away
The final lesson is the least convenient one: human review still matters.
AI can generate code, tests, docs, scaffolding, migrations, and review notes. It can move quickly when the task is shaped well. But manual QA and code review are where taste enters the system.
That is where you notice whether the interaction feels right, whether the abstraction is worth keeping, and whether the code is merely passing checks or actually belongs in the codebase.
The goal is not to remove the engineer from the loop. The goal is to move the engineer to the highest-leverage parts of the loop: designing the work, constraining the agent, reading the output, and deciding what is good enough to keep.
AI-assisted coding works best when it looks less like blind generation and more like disciplined engineering with a faster execution engine.