
How I Use AI, Dispatch #9. Updated April 14, 2026

Compound Engineering: a workflow for research projects

Four commands — /ce:brainstorm, /ce:plan, /ce:work, /ce:review — turn a drifting AI session into a structured workflow. This dispatch walks through what each one does, how they chain together on a real research task, and when to use or skip the whole thing.

What it is, and why I use it

Compound Engineering (CE) is a Claude Code plugin that wraps a software-engineering workflow around any non-trivial task. Instead of typing prompts and hoping the agent stays on track, you run the task through four stages: brainstorm, plan, work, and review. Each stage produces a durable artifact — a requirements document, an implementation plan, committed code, a review report — that the next stage builds on.

For research projects, this matters because the work spans weeks or months. A single long session drifts. A fresh session starts without context. CE forces the context to live in files on disk, not in the model's head, so the work survives session boundaries and is auditable after the fact.

The stage I care about most is review. The agent that wrote the code is the least likely to notice what it missed — like an author proofreading their own paper. CE dispatches a second agent with a blank slate, and that agent routinely catches things the building agent rationalized away.

Setup

Install the plugin once. From a Claude Code session:

/plugin marketplace add every-inc/every-marketplace
/plugin install compound-engineering@every-marketplace

After restart, the four commands become available as /ce:brainstorm, /ce:plan, /ce:work, and /ce:review. Artifacts land in docs/brainstorms/ and docs/plans/.
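
For orientation, here is roughly what the repository looks like after one full cycle. The directory names come from the plugin; the individual filenames are hypothetical stand-ins:

docs/
  brainstorms/
    2026-04-10-website-reorg-brainstorm.md
  plans/
    2026-04-12-001-website-reorg-plan.md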

1. Brainstorm — /ce:brainstorm

What it does. Runs a collaborative dialogue to answer what to build. The agent asks one question at a time, challenges assumptions, and proposes 2–3 concrete approaches with tradeoffs. Output: a requirements document in docs/brainstorms/ that captures the problem frame, scope boundaries, and success criteria.

When to use it. The request is vague ("reorganize my website"), the scope is unclear, or multiple plausible directions exist. Skip when the request is already specific and you know exactly what needs to happen.

Example prompt:

/ce:brainstorm I want to reorganize my How-I-Use-AI
page. The current layout is messy. I think a blog-style
dispatches format might work better, but I'm open.
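
The document it leaves behind is short. This is a paraphrase of its shape, not the plugin's exact template, filled in with the website example used later in this dispatch:

Problem: the How-I-Use-AI page is an unstructured list of tricks with no clear entry point.
Scope: restructure into Principles + Dispatches and migrate the existing tricks; beginner content is out of scope.
Success criteria: the hub reads as Principles + Dispatches, and every kept trick has a home in a dispatch.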

2. Plan — /ce:plan

What it does. Turns the requirements doc into an implementation plan. Each unit has a goal, file list, approach, test scenarios, and verification criteria. Output: a plan file in docs/plans/YYYY-MM-DD-NNN-...-plan.md with checkbox-tracked implementation units.
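
A single unit in that file looks roughly like this. The layout is my paraphrase of the format rather than the plugin's exact template, and the file path is a made-up example:

- [ ] Unit 3: Create the sub-page for the first dispatch
      Goal: publish the first dispatch as its own page under the new structure
      Files: site/how-i-use-ai/dispatches/001.md (hypothetical path)
      Approach: move the content out of the old hub page, keep the wording, add a dateline
      Tests: the site builds locally and the hub links to the new page
      Verify: the rendered page matches the scope in the requirements doc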

When to use it. The what is settled; now you need the how. Especially valuable when work spans multiple files or has dependency ordering.

Note on plan mode. Claude Code's built-in plan mode (Shift+Tab) is lighter-weight — it proposes a plan in-session without writing a file. Use it for single-session tasks; use /ce:plan when the plan needs to outlive the session.

3. Work — /ce:work

What it does. Executes the plan systematically. Creates a feature branch (or worktree), builds a task list from the plan's implementation units, and works through them with tests and incremental commits. Output: the actual code changes, committed.

What I watch for. The work phase is where drift happens if the plan is weak. If /ce:work starts asking lots of clarifying questions mid-implementation, that's a signal to stop and tighten the plan first. Pausing to fix the plan is almost always faster than pushing through on a shaky one.
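
One way to remove any ambiguity about which plan gets executed is to name the plan file in the prompt. The path here is the hypothetical one from the setup sketch above; the plugin may well locate the current plan on its own:

/ce:work docs/plans/2026-04-12-001-website-reorg-plan.md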

4. Review — /ce:review

What it does. Dispatches independent reviewer agents, each starting with a blank slate, to audit the diff against the plan, the codebase conventions, and explicit quality criteria (correctness, security, performance, tests, maintainability). Output: a review report with findings graded by severity.

Why this is the most valuable stage. The building agent is invested in its own solution. A fresh agent doesn't know what you tried and rejected; it just sees the code as a reader would. In my experience, most of the useful catches happen here — missed edge cases, convention drift, tests that mock too much, dead code the building agent forgot to remove.
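
Concretely, for the website task walked through below, the findings read roughly like this; the severity labels are my paraphrase of how the report grades them:

High: a relative path on the new hub page breaks after the move
Medium: one migrated dispatch still opens with a beginner-framed sentence the plan said to drop
Low: the LaTeX mirror was not updated to match the new structure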

How they chain in practice

A typical research-adjacent task, start to finish:

  1. I start with a vague itch ("my website section on AI is badly organized") and run /ce:brainstorm. After ten or so exchanges, I have a requirements doc that says: move to a Principles + Dispatches structure, drop beginner content, and migrate existing tricks.
  2. I run /ce:plan on that doc. The plan file lists five implementation units: confirm dispatch lineup, rewrite hub, create sub-page for the first dispatch, rewrite the LaTeX mirror, deploy. Each unit names specific files.
  3. I run /ce:work. The agent creates a feature branch and works through the plan. I'm still in the loop — I confirm the dispatch lineup interactively, spot-check the rendered page, approve the deploy — but the agent is driving.
  4. Before merging, I run /ce:review. The reviewer catches, say, three things the building agent missed: a broken relative path, a dispatch that still contains a beginner-framed sentence, and a LaTeX section I forgot to update.
  5. I fix the review findings, then merge.

For this kind of task, the whole cycle — brainstorm through review — takes about half a day. Without CE I would have spent the same half-day, but with more rewriting and less confidence that I had caught the rough edges.

When to skip or shortcut

CE is overkill for small tasks. My rough rule: I lean on it for work that (a) touches more than a handful of files, (b) spans multiple sessions, or (c) has non-obvious design decisions that I want captured in writing so future-me or a co-author can follow along. Anything smaller gets a plain prompt or Claude Code's built-in plan mode.