Deep modules: structuring code for AI agents, not humans
Most codebases are structured for human developers navigating with IDEs. AI agents don't use IDEs — they use tool calls. Every file read costs time. Every cross-file edit risks failure. Here's how I'm restructuring a 33,000-line codebase to be maintained by AI.
I just audited a codebase I built entirely with AI coding agents. 33,000 lines of TypeScript across 135 source files. Two plans, 166 items, 95 sessions. Every line written by Claude, directed by me through Pi.
The code works. The architecture is clean. The types are strict. The conventions are documented.
And it’s structured completely wrong for the thing that maintains it.
The problem with “clean code” for AI
The codebase follows every principle a senior engineer would approve of. Small files. Single responsibility. Separation of concerns. The Zustand store is split into 15 files — one per slice. The contracts package has 12 type files. The server handlers are organized into 8 domain-specific modules.
This is perfect for a human with an IDE. You press Cmd+P, type “connectionSlice”, and you’re there. You navigate with tree views, jump-to-definition, find-all-references. The granularity helps because your navigation tools are instant and free.
AI agents don’t have any of that.
When my coding agent needs to understand the store, it reads files. Each file read is a tool call — a round trip that costs time and adds to the conversation. To understand how a prompt flows from the UI to the AI model, the agent must read:
- Composer.tsx — the input component
- sessionActions.ts — the action that sends the prompt
- wireTransport.ts — the WebSocket wiring
- transport.ts — the WebSocket client
- wsProtocol.ts — the message types
- handlers/session.ts — the server handler
- piProcess.ts — the subprocess bridge
Seven files. Seven tool calls. ~3,400 lines. For one feature.
And that’s just reading. If the agent needs to add a new method to this pipeline, it edits all seven. Each edit needs exact text matching. Each cross-file change is a place where the agent can break something. More files touching the same feature = more opportunities for mistakes.
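To make the cross-file cost concrete, here is a rough sketch of what one new command touches along that pipeline. The command name, method signatures, and bodies are invented for illustration; only the file names come from the list above.

```typescript
// Illustrative only: a hypothetical "interrupt" command spread across the same
// layers as the real pipeline. Every identifier below is made up for the sketch.

// wsProtocol.ts: the message shape the client and server share
type InterruptCommand = { type: "interrupt"; sessionId: string };

// transport.ts: the WebSocket client grows a method that sends it
class Transport {
  constructor(private ws: WebSocket) {}
  interrupt(sessionId: string): void {
    const msg: InterruptCommand = { type: "interrupt", sessionId };
    this.ws.send(JSON.stringify(msg));
  }
}

// sessionActions.ts: the store action the UI calls
export const interruptSession = (transport: Transport, sessionId: string): void =>
  transport.interrupt(sessionId);

// handlers/session.ts: the server handler routes it to the subprocess bridge
export function handleInterrupt(
  msg: InterruptCommand,
  piProcess: { interrupt(id: string): void },
): void {
  if (msg.type === "interrupt") piProcess.interrupt(msg.sessionId);
}
```

Four snippets, four separate files, and that's before counting the UI component and the subprocess bridge.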
Deep modules vs. shallow modules
John Ousterhout introduced the concept in A Philosophy of Software Design. A deep module has a simple interface hiding significant implementation. A shallow module has an interface that’s nearly as complex as what it hides — you don’t gain much by having the abstraction.
His example: the Unix file I/O interface (open, read, write, lseek, close) is deep. Five functions that hide enormous complexity — disk drivers, block allocation, caching, journaling, permissions. A ConnectionSlice that contains a few one-line setters is shallow. It exists for organization, not abstraction.
Here’s what shallow modules look like in practice. This is an actual store slice from my codebase:
```typescript
// connectionSlice.ts — 20 lines total
export const createConnectionSlice = (set) => ({
  connectionStatus: "connecting",
  reconnectAttempt: 0,
  lastError: null,
  setConnectionStatus: (status) => set({ connectionStatus: status }),
  setReconnectAttempt: (attempt) => set({ reconnectAttempt: attempt }),
  setLastError: (error) => set({ lastError: error }),
  clearLastError: () => set({ lastError: null }),
});
```
Twenty lines. Zero logic. No encapsulation. The file exists because someone (an AI agent following human best practices) decided each concern deserves its own file.
I have five slices like this. Five files, five tool calls to read, five files the agent must remember exist. Combined, they’re 121 lines — a single screen of code. They should be one file.
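What the merge could look like, as a minimal sketch: one slice file that absorbs the connection fields above and leaves room for the other tiny slices. It assumes Zustand's StateCreator type and keeps the original field names; the status union values and the slice name are guesses.

```typescript
// Sketch of a merged slice, not the real file. Field names match the
// connectionSlice above; the status union and the slice name are assumptions.
import type { StateCreator } from "zustand";

export interface AppSlice {
  connectionStatus: "connecting" | "connected" | "disconnected";
  reconnectAttempt: number;
  lastError: string | null;
  setConnectionStatus: (status: AppSlice["connectionStatus"]) => void;
  setReconnectAttempt: (attempt: number) => void;
  setLastError: (error: string | null) => void;
  clearLastError: () => void;
  // ...the other tiny slices (UI, updates, notifications) fold in here
}

export const createAppSlice: StateCreator<AppSlice> = (set) => ({
  connectionStatus: "connecting",
  reconnectAttempt: 0,
  lastError: null,
  setConnectionStatus: (status) => set({ connectionStatus: status }),
  setReconnectAttempt: (attempt) => set({ reconnectAttempt: attempt }),
  setLastError: (error) => set({ lastError: error }),
  clearLastError: () => set({ lastError: null }),
});
```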
What “deep” looks like for AI maintenance
The refactoring principle is simple: minimize the number of files an agent must read and edit per task.
This means:
Merge files that always change together. My 8 action files in lib/ all follow the exact same pattern — call the transport, update the store, handle errors. Eight files, one pattern. Five of them can become one appActions.ts. The pattern appears once instead of five times; a sketch of what that could look like follows this list.
Merge files that are always read together. The 12 type files in contracts/ define the Pi RPC protocol — events, commands, responses, base types. To understand the protocol, you read all of them. To add a new command, you edit at least three. Merge them into one piProtocol.ts. One file, one read, one place to edit.
Merge files that add no abstraction. Those five tiny store slices? They’re organizational, not architectural. Combine them by domain: one slice for app-level state (connection, UI, updates, notifications), one for session state (messages, models, active session), one for workspace features (tabs, git, terminal, plugins, projects). Three files instead of fifteen.
Leave files that are already deep. My piProcess.ts at 570 lines encapsulates the entire subprocess lifecycle — spawning, JSONL parsing, command correlation, error recovery, cleanup. That's a deep module. Same for wireTransport.ts at 728 lines — the complete event-to-state mapping from WebSocket to Zustand. Don't touch these. They're already right. A sketch of what that kind of narrow surface looks like follows this list.
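For contrast with the shallow slice earlier, here is a guess at what a deep module's surface looks like from the outside: a handful of methods, with the spawning, JSONL parsing, command correlation, error recovery, and cleanup all hidden behind them. The method names are assumptions, not the actual piProcess.ts API.

```typescript
// Assumed interface, for illustration only. The point is the shape: a narrow
// surface over a 570-line implementation, so callers never see the subprocess
// plumbing described above.
export interface PiProcess {
  start(): Promise<void>;
  // Sends a command and resolves with the correlated response.
  send(command: Record<string, unknown>): Promise<Record<string, unknown>>;
  // Subscribes to events parsed from the JSONL stream; returns an unsubscribe.
  onEvent(listener: (event: Record<string, unknown>) => void): () => void;
  stop(): Promise<void>;
}
```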
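And for the merge of files that always change together, a rough sketch of a consolidated appActions.ts, with the transport call, store update, and error handling written once. The import paths, the transport.request method, and the action names are all hypothetical.

```typescript
// Hypothetical appActions.ts: the shared pattern extracted once, each action
// reduced to a line of intent. transport.request and the import paths are
// assumptions for the sketch, not the real API.
import { transport } from "./transport";
import { useStore } from "../store";

// The call-transport / update-store / handle-errors pattern, written one time.
async function withErrorHandling<T>(label: string, fn: () => Promise<T>): Promise<T | undefined> {
  try {
    return await fn();
  } catch (err) {
    useStore.getState().setLastError(`${label}: ${String(err)}`);
    return undefined;
  }
}

export const appActions = {
  checkForUpdates: () =>
    withErrorHandling("checkForUpdates", () => transport.request("updates.check")),
  dismissNotification: (id: string) =>
    withErrorHandling("dismissNotification", () =>
      transport.request("notifications.dismiss", { id }),
    ),
};
```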
The 1M token recalibration
Here’s the thing that surprised me. My initial instinct was to consolidate everything to “save context.” But Claude’s context window is 1 million tokens. My entire codebase — all 33,000 lines — is roughly 500K tokens. It fits with room to spare.
So the problem isn’t “can the AI fit this in context?” The entire codebase fits. The problem is:
Tool call count. 135 files means up to 135 reads. Each read is latency — a round trip, a function call, tokens for the response. Reducing to 90 files doesn’t save context. It saves dozens of tool calls per session.
Co-change sets. Adding a new WebSocket method currently requires editing 6+ files across 3 packages. With deep modules, it’s 3-4 files. Fewer edits = fewer places to make mistakes = faster, more reliable changes.
Redundancy. I had 7 mandatory context files the agent reads at the start of every session — 870 lines of “who you are, how to behave, what you can do.” Much of it was restated across files. The agent reads three slightly different framings of the same rule and hedges between them. Consolidating to 3 files with no redundancy gives the agent sharper, more consistent instructions.
The plan
Here’s the actual refactoring, phase by phase:
| Phase | What changes | Files before → after |
|---|---|---|
| Context docs | 7 mandatory agent docs → 3 | 15 → 6 |
| Contracts | 12 type files → 4 (one per domain) | 12 → 4 |
| Store + Actions | 15 store slices + 13 action files → 5 + 8 | 28 → 13 |
| Server handlers | 8 handler files → 3 | 10 → 4 |
| Chat components | 10 chat rendering files → 3 | 10 → 3 |
Total: 135 files → ~90 files. A 33% reduction in tool calls per full codebase read.
The interesting thing: no code is being deleted. No features change. The application does exactly what it did before. What changes is the shape of the codebase — optimized for how an AI agent navigates, reads, and edits, rather than how a human browses file trees.
This is a new kind of technical debt
We have a name for code that’s hard for humans to maintain: technical debt. But we don’t have a name for code that’s well-structured for humans but poorly structured for AI agents.
I’d call it agent debt. It’s the cost of organizing code for IDE navigation instead of tool-call navigation. It’s the cost of granularity that helps humans think but slows agents down. It’s the cost of documentation spread across files that should be one source of truth.
Every codebase that was written with AI assistance but structured with human conventions has this debt. As more code is maintained primarily by AI agents — with humans directing rather than typing — the ROI of paying it down increases.
The meta-lesson
This is the same pattern I keep finding: the system that builds the thing should shape the thing.
When humans write code, small files with single responsibilities make sense. The navigation tools are free (Cmd+P, Cmd+click), and the cognitive cost of context-switching between files is low.
When AI agents write code, deep modules with rich interfaces make sense. Each file read costs a tool call, and each cross-file edit is a failure point.
The code should be shaped by who maintains it. For a growing number of codebases, that’s no longer a person with an IDE. It’s an agent with a tool belt.
Build the codebase for the maintainer you actually have.