Meta-tasks: when the agent writes its own plan
CREATE TASKS — plan items where the agent's job isn't to do work, but to read the world, decide what work should exist, and alter the plan. The plan becomes a program that generates more of itself.
The .plan/ protocol gave AI agents memory. Autopilot gave them a loop. But both assumed something: that a human writes the plan and the agent executes it. The plan is fixed. The agent checks boxes.
That works when you know every task upfront. It falls apart when you don’t — when the tasks themselves depend on data the agent hasn’t seen yet.
The problem
I’m building an enrichment pipeline for sgcaselaw.com. Phase 2 scrapes firm websites. Phase 3 extracts facts from the scraped content. But here’s the thing: I can’t write the Phase 3 plan items until Phase 2 runs. I don’t know which firms will have scrapeable content, how many pages each firm’s website has, or which entities have enough material to warrant extraction.
The naive approach: run Phase 1, then sit down and manually write 50 plan items for Phase 3 — one per firm. Then do it again for lawyers. Then again for content generation in Phase 4. That’s exactly the kind of ceremony the autopilot was supposed to eliminate. I’d be the bottleneck again — not doing the work, but deciding what work should exist.
The insight: deciding what work should exist is work. And it’s work the agent can do.
CREATE TASKS
The solution is a new type of plan item. Instead of “do this work,” the instruction is “figure out what work needs to be done, add it to the plan, then exit.”
Here’s what it looks like in PLAN.md:
```markdown
- [ ] 3.4 — CREATE TASKS: Read scraped content from
      pipeline/data/firm-websites/ for Wave 0 firms.
      For each firm, add a plan item below:
      "Extract facts for {firm name}: read
      pipeline/data/firm-websites/{slug}/*.json,
      follow instructions in fact-extraction-prompt-firms.md,
      write output to pipeline/data/extracted-facts/firms/{slug}.json."
      Then EXIT — autopilot will execute the new items.

*(Wave 0 firm extraction items will be inserted here by 3.4)*

- [ ] 3.5 — CREATE TASKS: Same as 3.4 but for Wave 0 lawyers...

*(Wave 0 lawyer extraction items will be inserted here by 3.5)*
```
When autopilot reaches item 3.4, the agent doesn’t extract any facts. It reads the filesystem, sees which firms were scraped in Phase 2, and writes new plan items — concrete, executable tasks — into PLAN.md itself. Then it checks off 3.4 and exits.
The next autopilot iteration reads the freshly updated PLAN.md, finds the first new unchecked item — “Extract facts for Allen & Gledhill” — and executes it. The iteration after that gets Rajah & Tann. And so on, until every generated task is done.
The plan wrote itself.
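To make the mechanics concrete, here is a minimal sketch of the expansion step in Python rather than agent instructions. The template string, marker, and directory-per-firm layout are assumptions modelled on the PLAN.md excerpt above, not the real pipeline code:

```python
# Hypothetical sketch of what a CREATE TASKS iteration produces. The
# template and marker strings mirror the PLAN.md excerpt but are
# illustrative, as is the directory-per-firm layout.
from pathlib import Path

TEMPLATE = (
    '- [ ] 3.4{sub} — Extract facts for {name}: read '
    'pipeline/data/firm-websites/{slug}/*.json, '
    'follow instructions in fact-extraction-prompt-firms.md, '
    'write output to pipeline/data/extracted-facts/firms/{slug}.json.'
)
MARKER = '*(Wave 0 firm extraction items will be inserted here by 3.4)*'

def expand_plan(plan_text: str, scraped_dir: Path) -> str:
    """Insert one concrete, self-contained plan item per scraped firm."""
    firms = sorted(p for p in scraped_dir.iterdir() if p.is_dir())
    items = [
        TEMPLATE.format(sub=chr(ord('a') + i),
                        name=firm.name.replace('-', ' ').title(),
                        slug=firm.name)
        for i, firm in enumerate(firms)
    ]
    return plan_text.replace(MARKER, '\n'.join(items))
```

This is only the mechanical core; the judgment calls discussed later are what the agent adds on top of it.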
The anatomy of a meta-task
A CREATE TASKS item has four parts:
1. A data source to inspect. The agent needs to look at something — a directory of scraped files, a database query, a list of entities — to determine what tasks should exist. It can’t generate tasks from nothing. It reads the world first.
Read scraped content from pipeline/data/firm-websites/
2. A template for the tasks it will create. The meta-task describes the shape of each generated item — what the agent should do, where to read input, where to write output. This is specific enough that each generated task is self-contained and executable by a future agent session that has no memory of the meta-task.
For each firm, add a plan item:
"Extract facts for {firm name}: read pipeline/data/firm-websites/{slug}/*.json,
follow instructions in fact-extraction-prompt-firms.md,
write output to pipeline/data/extracted-facts/firms/{slug}.json."
3. A placement marker. The plan has a comment showing where new items should be inserted. This keeps the plan structured — generated items appear in the right phase, in the right order, not appended randomly at the bottom.
*(Wave 0 firm extraction items will be inserted here by 3.4)*
4. An explicit EXIT instruction. After creating the tasks, the agent exits. It does not start executing them. This is critical for two reasons. First, the agent has just spent its context window on planning, not on the actual work. Second, autopilot’s fresh-context-per-iteration model means each generated task gets a clean, full attention budget.
Then EXIT — autopilot will execute the new items.
Why the agent exits after planning
This confused me at first. If the agent just decided what tasks to do, why not start doing them immediately? The answer is context window economics.
A CREATE TASKS iteration reads directories, counts files, makes decisions about scoping, and writes structured plan items. By the time it’s done, the context window has been used for planning. Starting a complex extraction task in the same window — with all that planning residue still loaded — means less attention budget for the actual work.
The autopilot pattern already solves this: every iteration gets a fresh context window. The meta-task iteration uses its window for planning. The next iteration uses its fresh window for executing the first generated task. No contamination. No degraded attention.
It’s the same principle as the .plan/ protocol itself: write decisions to files, then start fresh. The plan files are the communication channel. The context window is disposable.
The plan as a program
Once you allow meta-tasks, the plan stops being a static checklist and becomes something closer to a program. It has:
- Sequential execution. Items execute top to bottom.
- Conditional expansion. CREATE TASKS items generate different items depending on what data exists at runtime.
- Loops. Autopilot iterates through generated items until they’re all checked.
- Variables. Entity names, file paths, slugs — all determined at generation time, not at plan-authoring time.
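Read this way, the loop itself is small. A sketch, with hypothetical `run_agent`, `run_build`, and `git_commit` hooks standing in for the real harness:

```python
# Sketch of the autopilot loop over a self-expanding plan. The hook
# functions are hypothetical stand-ins for the real harness; only the
# checklist format matches the PLAN.md shown earlier.
import re

def first_unchecked(plan_text: str):
    """Return the first '- [ ]' item's text, or None when the plan is done."""
    m = re.search(r'^- \[ \] (.+)$', plan_text, flags=re.MULTILINE)
    return m.group(1) if m else None

def autopilot(read_plan, run_agent, run_build, git_commit):
    while True:
        task = first_unchecked(read_plan())  # re-read: the plan may have grown
        if task is None:
            return                           # every item, generated or not, done
        run_agent(task)                      # fresh context; may add new items
        if not run_build():
            return                           # build gate: stop, leave evidence
        git_commit(task)                     # checkpoint before next iteration
```

The re-read at the top of each iteration is where the conditional expansion happens: items a meta-task wrote in the previous iteration are just more unchecked lines to the loop.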
The enrichment pipeline plan uses this pattern seven times:
| Item | What it generates |
|---|---|
| 3.4 | Per-firm fact extraction tasks (Wave 0) |
| 3.5 | Per-lawyer fact extraction tasks (Wave 0) |
| 3.8 | Per-firm fact extraction tasks (remaining top 50) |
| 3.9 | Per-lawyer fact extraction tasks (top 30) |
| 4.7 | Per-firm content generation tasks (Wave 0) |
| 4.8 | Per-lawyer content generation tasks (Wave 0) |
| 4.13 | Per-entity content generation tasks (remaining batch) |
Each meta-task reads the output of previous phases and generates work for its own phase. The plan unfolds as execution progresses — early phases produce data, meta-tasks read that data and produce plan items, autopilot executes those items.
If I’d written all these items upfront, the plan would have been hundreds of lines of speculative tasks for entities I hadn’t scraped yet. Half of them might not even apply — maybe a firm’s website is down, or a lawyer has no scrapeable content. The meta-tasks ensure the plan only contains tasks that make sense given what actually happened.
What the generated items look like
After a CREATE TASKS iteration runs, the plan might look like this:
```markdown
- [x] 3.4 — CREATE TASKS: Read scraped content for Wave 0 firms...
- [ ] 3.4a — Extract facts for Allen & Gledhill: read
      pipeline/data/firm-websites/allen-and-gledhill/*.json +
      pipeline/data/competitor-pages/allen-and-gledhill/*.json,
      follow instructions in fact-extraction-prompt-firms.md,
      write output to pipeline/data/extracted-facts/firms/allen-and-gledhill.json.
- [ ] 3.4b — Extract facts for Rajah & Tann: read
      pipeline/data/firm-websites/rajah-and-tann/*.json +
      pipeline/data/competitor-pages/rajah-and-tann/*.json,
      follow instructions in fact-extraction-prompt-firms.md,
      write output to pipeline/data/extracted-facts/firms/rajah-and-tann.json.
- [ ] 3.4c — Extract facts for WongPartnership: read...
- [ ] 3.4d — Extract facts for Drew & Napier: read...
- [ ] 3.4e — Extract facts for Davinder Singh Chambers: read...
- [ ] 3.5 — CREATE TASKS: Same as 3.4 but for Wave 0 lawyers...
```
Each generated item is fully self-contained. It names the input files, references the instruction document, specifies the output path. A fresh agent session with no knowledge of the meta-task can pick up 3.4b and know exactly what to do. The generated task is the briefing.
This self-containment is the key design constraint. Meta-tasks don’t just list entity names — they generate complete, executable work items. The template in the meta-task ensures consistency. Every firm extraction task follows the same pattern. Every content generation task reads the same instruction file. The meta-task is both a planner and a template engine.
The two kinds of agent work
This creates a clean separation between two fundamentally different kinds of work:
Planning work — read data, assess scope, generate tasks. This requires broad awareness (what exists, what’s missing, what’s feasible) but not deep execution. One iteration, moderate context usage.
Execution work — read specific input, follow instructions, produce output. This requires deep focus on one entity’s data but no awareness of the broader plan. One iteration per entity, full context budget for the actual work.
Traditional project management conflates these. The person who decides what to build is also building it, in the same session, with the same mental context. Meta-tasks separate them in time and in context window.
The planning iteration and the execution iterations don’t even need to be the same model. The CREATE TASKS iteration could run on a model that’s good at reasoning and scoping. The extraction iterations could run on a model that’s good at structured output. In practice, they’re the same model with the same autopilot loop — but the architecture doesn’t require it.
Why not just write a script?
The obvious question: if you’re generating repetitive tasks from data, why not write a script that reads the directory and creates a JSON manifest? Why use an AI agent for planning?
Because the planning isn’t purely mechanical. The meta-task might need to:
- Skip entities that don’t have enough scraped content to warrant extraction
- Combine small entities into batch tasks to avoid overhead
- Adjust the template based on what data sources exist (some firms have competitor page data, others don’t)
- Note anomalies in DRIFT.md (“Davinder Singh Chambers website returned 403 — skipping, flagging for manual review”)
These are judgment calls. A script would need conditionals for every edge case. The agent handles them by reading the data and reasoning about it — the same way it handles CAPTCHAs in the browser.
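A hedged sketch of what those conditionals look like if you do hardcode them. The threshold and the three-way outcome are invented for illustration; in the actual pipeline these calls live in the agent's judgment:

```python
# What a script replacing the meta-task would have to hardcode. The
# MIN_PAGES cutoff and the flag/batch/solo outcomes are made up for
# illustration; the agent decides these by reading the data, not by rule.
from pathlib import Path

MIN_PAGES = 3  # hypothetical cutoff for "enough content to extract"

def classify(firm_dir: Path) -> str:
    pages = len(list(firm_dir.glob('*.json')))
    if pages == 0:
        return 'flag'   # anomaly: note in DRIFT.md, skip for manual review
    if pages < MIN_PAGES:
        return 'batch'  # too small for its own item; fold into a batch task
    return 'solo'       # gets its own self-contained plan item
```

And this still handles none of the one-off cases: a 403 response, a firm with only competitor-page data, a site that returned boilerplate instead of content.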
The meta-task pattern is exactly where AI agents shine: the work is structured enough to be repeatable, but variable enough that hardcoding all the logic would be more complex than letting the agent figure it out.
Recursive planning
The pattern composes. A meta-task in Phase 3 generates extraction tasks. A meta-task in Phase 4 reads the output of those extraction tasks and generates content tasks. Each phase’s meta-task depends on the previous phase’s completed work.
```
Phase 2: Scrape firms
   ↓ (data on disk)
Phase 3.4: CREATE TASKS → read scraped data → generate extraction items
   ↓ (extraction items in PLAN.md)
Phase 3.4a-e: Extract facts for each firm
   ↓ (facts on disk)
Phase 4.7: CREATE TASKS → read extracted facts → generate content items
   ↓ (content items in PLAN.md)
Phase 4.7a-e: Generate content for each firm
```
The plan unfolds like a lazy evaluation chain. Nothing is generated until the data it depends on exists. The human writes the meta-structure — “extract facts for whatever firms got scraped” — and the agent fills in the specifics at runtime.
This is why I call it recursive planning: the plan contains instructions for how to expand itself, and those expansions can trigger further expansions. The depth is bounded by the phase structure, but within phases, the plan grows organically based on what the data demands.
Safety considerations
A self-modifying plan sounds dangerous. What stops the agent from generating nonsense tasks, or deleting existing items, or creating an infinite expansion loop?
The template constrains output. The meta-task specifies exactly what kind of items to generate and where to place them. The agent isn’t free-forming — it’s filling in a template with data-derived values.
The build gate still applies. After the meta-task iteration, autopilot runs the build. If the modified PLAN.md somehow breaks something (it won’t — it’s markdown — but the principle matters), the loop stops.
Git checkpoints. The PLAN.md modification is committed before the next iteration runs. If generated tasks are wrong, git diff shows exactly what changed. Rolling back is one command.
Phase boundaries. The human reviews at phase boundaries. If item 3.4 generated bad tasks, you see them before any get executed. In practice, you’d review the diff between “plan before 3.4” and “plan after 3.4” in seconds — the generated items are repetitive and scannable.
EXIT forces separation. The agent can’t generate tasks and start executing them in the same breath. There’s always a commit boundary between planning and execution, which means there’s always a reviewable checkpoint.
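The template constraint is also checkable. Here is a sketch of a reviewer's helper that flags generated items straying from the template; the regex is illustrative and tied to the item format used in this post:

```python
# Sketch of a post-generation check: the diff review described above can
# be partly automated by asserting that every generated item matches the
# template's shape. The pattern below is illustrative, not pipeline code.
import re

ITEM_RE = re.compile(
    r'^- \[ \] 3\.4[a-z] — Extract facts for .+: read '
    r'pipeline/data/firm-websites/[a-z0-9-]+/\*\.json'
)

def validate(generated_lines):
    """Return the generated lines that do NOT match the expected template."""
    return [ln for ln in generated_lines if not ITEM_RE.match(ln)]
```

An empty return value means the diff is the boring, scannable kind; anything else is exactly the line a human should look at before the next iteration runs.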
The safety model is the same as autopilot: don’t prevent mistakes, make them visible and reversible.
The evolution continues
The .plan/ protocol gave agents memory across sessions. Autopilot gave them unattended execution. Meta-tasks give them the ability to plan their own work.
Each step removes a human bottleneck:
| System | What the human used to do | What the agent now does |
|---|---|---|
| .plan/ protocol | Re-explain context every session | Read plan files, orient in seconds |
| Autopilot | Start sessions, confirm scope, close | Loop unattended, fresh context each time |
| Meta-tasks | Write per-entity task lists after each phase | Read data, generate tasks, exit |
The human’s role keeps moving up the abstraction stack. I don’t write individual tasks anymore. I write the rules for generating tasks. I don’t manage execution anymore. I review phase boundaries.
The plan isn’t a checklist I hand to the agent. It’s a program we co-author — I write the structure and the meta-tasks, the agent fills in the concrete work based on reality.
A plan that can’t change is a wishlist. A plan that changes based on what the agent discovers is a system. The meta-task is the hinge between the two — the moment the agent stops being an executor and starts being a collaborator in deciding what to build.