Guild
v2.0 Open Source MIT License

Stop re-explaining your project. Give Claude Code a team that remembers.

Guild runs 14 domain specialists through a 7-step lifecycle. Each specialist loads only the context it needs, runs at the model tier the work demands, and hands off results through a typed envelope. Decisions, specs, and artifacts persist under .guild/ — so every run starts informed.

terminal
$ claude plugin marketplace add lookatitude/guild
$ /guild "Build a Stripe subscription flow, add tests, update the docs, draft a launch email."
▶ lifecycle: brainstorm → team-compose → plan → context-assemble → execute → review → verify
▶ dispatching: architect (`powerful` tier)
▶ dispatching: backend (`mid` tier)
▶ dispatching: qa (`cheap` tier)
▶ dispatching: technical-writer (`mid` tier)
▶ dispatching: copywriter (`mid` tier)
✓ 5 handoffs received · review passed · verify green
Guild 7-step lifecycle: brainstorm → team-compose → plan → context-assemble → execute → review → verify, with three user approval gates at brainstorm, team-compose, and plan.
Guild 7-step lifecycle · three approval gates · 14 specialists

Why Guild

Four problems Guild solves

Memory, cost, context, and coordination — the four failure modes of stateless AI sessions.

Memory & Recall

Your project context persists — specialists pick up where you left off.

Every new Claude Code session starts blank. You re-explain the codebase structure, the architectural decisions from last sprint, the conventions your team agreed to. Guild eliminates this. When you run /guild init, Guild scans the repo and writes synthesized knowledge to .guild/wiki/: architecture patterns, active decisions, codebase standards. Before each specialist dispatches on a subsequent run, guild:context-assemble loads the relevant wiki pages into that specialist's context bundle. The architect starts the next run knowing your module boundaries. The backend specialist starts knowing the naming conventions already decided. No re-scanning. No re-explaining.

Concrete example

You run /guild init in an unfamiliar Rails monorepo. Guild writes a codebase-map.json and an architecture-map.md stub, then synthesizes its findings into .guild/wiki/. On the next run, /guild "add OAuth login", the architect specialist loads the architecture map and the backend specialist loads the conventions wiki page. Neither re-reads the whole repo, so each starts informed and takes fewer wrong turns.

Later runs compound on earlier ones. Specialists recall constraints and decisions without you restating them each time.

Memory and Recall flow: /guild init writes synthesized knowledge to .guild/wiki/; guild:context-assemble loads the relevant wiki pages into each specialist's context bundle before dispatch — accumulate, assemble, inform.
guild:context-assemble loads wiki pages per specialist — no full repo re-scan on subsequent runs.
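
As a rough illustration of the assemble step, the sketch below picks wiki pages per specialist from a .guild/wiki/ directory. The page names (architecture.md, conventions.md), the WIKI_PAGES_BY_SPECIALIST mapping, and the load_wiki_pages function are all hypothetical; Guild's actual selection logic is not described on this page.

```python
from pathlib import Path

# Hypothetical mapping from specialist to the wiki pages it cares about.
WIKI_PAGES_BY_SPECIALIST = {
    "architect": ["architecture.md"],
    "backend": ["conventions.md"],
}


def load_wiki_pages(wiki_dir: Path, specialist: str) -> dict[str, str]:
    """Return {page name: text} for the pages mapped to one specialist.

    Pages that do not exist yet are skipped rather than treated as errors.
    """
    pages = {}
    for name in WIKI_PAGES_BY_SPECIALIST.get(specialist, []):
        path = wiki_dir / name
        if path.is_file():
            pages[name] = path.read_text(encoding="utf-8")
    return pages
```

The point of the sketch is the shape of the lookup: each specialist reads a small, named set of pages instead of walking the repository again.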

Cost-Effective Model Tiering

Heavy models only when the work demands it. Routine work runs cheap.

Running every agent at maximum model capacity is expensive and unnecessary. A task that reads and classifies needs far less compute than one that designs a service architecture. Guild automatically assigns each specialist lane to one of three tiers (cheap, mid, or powerful) based on deterministic signals: the work type (read/classify vs. plan/draft vs. design/secure), blast radius, and security sensitivity. The assigned tier is printed before execution, and the defaults work without configuration. When a cheaper-tier specialist hits a sub-problem beyond its tier (say, a nuanced security question during a mid-tier backend lane), it emits status: "escalate" in its handoff envelope, gets one powerful-tier advisor answer for that specific question, and then continues. The lane is not re-run in full.

Concrete example

A five-lane run dispatches: architect → backend → qa → technical-writer → copywriter. Guild auto-scores: architect = powerful (service boundary design), backend = mid (implementation), qa = cheap (assertion checks and test runs), technical-writer = mid (changelog drafting), copywriter = mid (launch email). You see the resolved tier for each lane in the plan — and can override per-lane with model_tier: or globally in settings.json (models.*).

You spend on the model tier each task actually warrants — not a flat heavy rate across every agent in every run.

Cost tiering diagram: three bands — cheap, mid, and powerful — with example work types mapped into each band, and an advisor escalation arrow showing one powerful-tier answer for a targeted sub-problem.
Auto-scored tiers — cheap, mid, powerful — with targeted advisor escalation, not full lane re-runs.
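
To make "deterministic signals" concrete, here is a minimal sketch of a tier scorer. The three tiers and the three signals come from the description above; the Lane type, the BASE_TIER table, and the rule that each risk signal raises the tier by one band are assumptions made for illustration, not Guild's actual scoring rules.

```python
from dataclasses import dataclass

TIERS = ["cheap", "mid", "powerful"]

# Hypothetical base band per work type, following the three groups in the text.
BASE_TIER = {
    "read": 0, "classify": 0,
    "plan": 1, "draft": 1,
    "design": 2, "secure": 2,
}


@dataclass(frozen=True)
class Lane:
    specialist: str
    work_type: str
    wide_blast_radius: bool = False
    security_sensitive: bool = False


def score_tier(lane: Lane) -> str:
    """Map a lane to a tier; the same inputs always give the same tier."""
    level = BASE_TIER[lane.work_type]
    # Assumed rule: each risk signal raises the tier one band, capped at 'powerful'.
    if lane.wide_blast_radius:
        level += 1
    if lane.security_sensitive:
        level += 1
    return TIERS[min(level, len(TIERS) - 1)]
```

Under these assumed rules, score_tier(Lane("qa", "classify")) resolves to "cheap", while a drafting lane flagged as security-sensitive moves up from "mid" to "powerful".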

Context Management

Each specialist gets exactly the context it needs — nothing it doesn't.

Handing an agent your entire project is expensive and counterproductive. Irrelevant context dilutes focus: the copywriter doesn't need the database schema; the security specialist doesn't need the UI wireframes. Before each specialist dispatches, guild:context-assemble builds a per-specialist, per-run context bundle: only the skills and wiki pages relevant to that specialist's lane, with a hard cap of 6k tokens. The bundle is different for every specialist in the same run. This keeps each specialist's context tight, relevant, and within budget.

Concrete example

A run with four specialists (architect, backend, copywriter, seo) produces four different context bundles. Architect receives: architecture wiki pages + system-design skill + the approved spec. Backend receives: TDD fallback skill + conventions wiki + the spec. Copywriter receives: voice-guide wiki + copywriter-email-sequences specialist skill + the spec summary. SEO receives: keyword brief wiki page + seo specialist skill. The 6k-token cap is enforced on every bundle; none exceeds it.

Specialists stay focused on their lane. Context from one specialist's domain doesn't contaminate another's output.

Context management flow: guild:context-assemble builds separate context bundles for each specialist — only the relevant skills and wiki pages — each hard-capped at 6k tokens.
Per-specialist context bundles: 6k-token hard cap, assembled fresh each run.
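
One way to picture a hard cap is a greedy packer that adds items in priority order and skips anything that would push the bundle over budget. This is a sketch under stated assumptions: the 6,000-token cap is from the text, but build_bundle, the four-characters-per-token estimate, and the skip-and-continue policy are hypothetical and may differ from what guild:context-assemble actually does.

```python
TOKEN_CAP = 6_000  # hard cap per bundle, from the text above


def estimate_tokens(text: str) -> int:
    """Crude estimate (about 4 characters per token); an assumption, not Guild's counter."""
    return (len(text) + 3) // 4


def build_bundle(candidates: list[tuple[str, str]], cap: int = TOKEN_CAP) -> list[str]:
    """Pick items in priority order, skipping any that would exceed the cap.

    `candidates` is a list of (name, text) pairs, most relevant first.
    Returns the names of the items included in the bundle.
    """
    included, used = [], 0
    for name, text in candidates:
        cost = estimate_tokens(text)
        if used + cost > cap:
            continue  # would exceed the cap; try smaller items further down
        included.append(name)
        used += cost
    return included
```

Because the cap is checked before each item is added, the bundle can come in under budget but never over it.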

Agent Communication Standards

Specialists hand off work through typed envelopes — not loose text.

Chaining AI agents naively produces coordination failures: one agent's output doesn't reliably become the next agent's useful input. Guild solves this with a structured handoff contract. When a specialist finishes its lane, it returns a guild.handoff.v2 envelope — a versioned receipt with a declared status, a summary, artifact paths, and flagged issues. The lifecycle accumulates these compact envelopes, not raw transcripts. Lanes declare depends-on: ordering in the plan, so a QA lane waits for the backend lane's envelope before it dispatches. Review checks spec-match across the full envelope chain. Every lane's output is attributable.

Concrete example

The backend specialist finishes the Stripe subscription endpoint and returns a handoff: status: done, artifact paths listed, one issue flagged — webhook validation is not yet covered. The QA lane, declared depends-on: backend in the plan, dispatches next. It receives the backend handoff and targets the flagged gap. Review checks the full chain against the approved spec — not isolated lane outputs.

Multi-specialist work composes cleanly. Each specialist's output is structured input for the next — not a wall of text to parse and re-interpret.

Agent communication diagram: specialist lanes handing off guild.handoff.v2 envelopes in sequence — status, summary, artifact paths, flagged issues — with depends-on ordering between lanes.
guild.handoff.v2 typed envelopes — sequenced by depends-on, not loose text chains.
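
The envelope described above can be pictured as a small typed record. The sketch below is illustrative only: the four parts (status, summary, artifact paths, flagged issues) and the guild.handoff.v2 name come from the text, while the Python field names, the Handoff class, and the open_issues helper are assumptions rather than the real schema.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Illustrative envelope; field names are assumptions, not the real schema."""
    specialist: str
    status: str   # e.g. "done" or "escalate", both mentioned in the text
    summary: str
    artifacts: list[str] = field(default_factory=list)
    issues: list[str] = field(default_factory=list)
    schema: str = "guild.handoff.v2"


def open_issues(chain: list[Handoff]) -> list[tuple[str, str]]:
    """Collect (specialist, issue) pairs across the whole envelope chain."""
    return [(h.specialist, issue) for h in chain for issue in h.issues]
```

A downstream lane or a review step can then read flagged issues directly from the chain instead of parsing a transcript, which is the practical difference between a typed envelope and loose text.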

More capabilities

Built for the whole lifecycle

14 specialists, gated skill evolution, scoped permissions — all from a single command surface.

14 Specialists, Three Groups

The right specialist routes to the right work without you wiring agents.

Guild ships 14 domain specialists across three groups — engineering (architect, backend, frontend, devops, qa, mobile, security, researcher), content (copywriter, technical-writer, seo, social-media), and commercial (marketing, sales). Each specialist owns a scoped lane with a defined skill set. For a given run, Guild proposes the relevant subset — you approve, swap, or skip specialists before any work starts.

Specialist teams diagram: three groups — engineering, content, and commercial — with 14 named specialists and an approval gate before the plan is written.
14 specialists across three groups — you approve the team before the plan is written.

Seven-Phase Lifecycle, Three Gates

Full multi-specialist work from a single command — with control at every decision point.

/guild "task" runs brainstorm → team-compose → plan → context-assemble → execute → review → verify. You confirm after brainstorm (spec check), after team-compose (team approval), and after plan (work-breakdown approval). Later phases run from the approved plan with minimal interruption.

Lifecycle diagram: seven phases from brainstorm to verify, with three user approval gates marked at brainstorm, team-compose, and plan.
Seven phases, three approval gates — from a single /guild command.
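
The phase order and gate placement can be sketched as a simple loop. The phase names and the three gated phases are taken from the text; run_lifecycle and its approve callback are hypothetical stand-ins for the real interaction, and the phases' own work is left out.

```python
from typing import Callable

PHASES = ["brainstorm", "team-compose", "plan", "context-assemble",
          "execute", "review", "verify"]
GATED = {"brainstorm", "team-compose", "plan"}  # user approval required after these


def run_lifecycle(approve: Callable[[str], bool]) -> list[str]:
    """Walk the phases in order and stop if the user rejects a gate.

    `approve` is a hypothetical callback that returns True to continue.
    Returns the phases that ran.
    """
    ran = []
    for phase in PHASES:
        ran.append(phase)  # the phase's own work would happen here
        if phase in GATED and not approve(phase):
            break  # rejected at the gate; later phases never start
    return ran
```

With a callback that always approves, all seven phases run. Rejecting at the plan gate stops the run after plan, so nothing is dispatched.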

Skills That Evolve Through Gating

The system improves from real runs — on your terms, with your sign-off.

After each run, Guild proposes skill edits based on what worked and what fell short. Any promotion goes through a 10-step pipeline that includes paired evals to surface regressions, a flip report to confirm the improvement, shadow-mode validation before any change ships, and a promotion gate that requires your explicit approval. Nothing auto-promotes.
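
The "nothing auto-promotes" rule amounts to a conjunction: every check must pass and you must sign off. A minimal sketch, covering only the four checks named above (the other pipeline steps are not described here); the PromotionEvidence record and can_promote function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionEvidence:
    """Hypothetical record of the four checks named in the text."""
    paired_evals_regressed: bool
    flip_report_confirms_improvement: bool
    shadow_mode_passed: bool
    user_approved: bool


def can_promote(e: PromotionEvidence) -> bool:
    """A skill edit ships only when every check passes and the user signs off."""
    return (
        not e.paired_evals_regressed
        and e.flip_report_confirms_improvement
        and e.shadow_mode_passed
        and e.user_approved
    )
```

Note that passing every automated check is not enough on its own in this sketch: without user_approved, can_promote returns False.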

Security and Observability

Scoped permissions, credential redaction, structured cost audit.

security.bypass_permissions_policy scopes what specialists can access. Three-stage secrets redaction keeps credentials out of transcripts and logs. The optional guild-telemetry MCP exposes structured cost rollup by tier, model, and specialist. Audit the full installation at any time with /guild audit.

How it works

From command to verified output

What you see and do at each step — not what Guild executes internally.

Run one command

Run /guild "task" — or /guild with no arguments and let the brainstorm phase prompt for the task. From the first run, Guild writes everything to .guild/: spec, team, plan, context bundles, handoff envelopes, review, telemetry, and reflections.

Review the spec, confirm or redirect

Guild drafts a spec and surfaces blocking questions before any code is written. You read it, answer the questions, and confirm the direction — or redirect. Nothing proceeds until you sign off.

Approve the team

Guild proposes the relevant specialists with a rationale for each. You approve the team, replace a specialist, or skip one. The plan is written only after team approval.

Review the plan, then approve

Each specialist's lane is written with depends-on: ordering so sequenced work runs in the right order and parallel work runs where safe. You review the full plan and approve it — no specialist dispatches until you do.
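
Ordering lanes by their declared dependencies is a topological sort, and lanes with no unmet dependencies can share a wave. The sketch below uses Python's standard graphlib module to show the idea. The dispatch_waves function and the input shape (lane name mapped to the lanes it depends on) are assumptions for illustration, not Guild's plan format or scheduler.

```python
from graphlib import TopologicalSorter


def dispatch_waves(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group lanes into waves; lanes in the same wave have no unmet dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a dependency cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to keep the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For a hypothetical plan where backend depends on architect, and both qa and technical-writer depend on backend, this yields three waves: architect first, then backend, then qa and technical-writer together.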

Specialists execute with lean context

Each specialist receives a tight context bundle (only the skills and wiki pages relevant to its lane, capped at 6k tokens), then executes. Specialists on independent lanes run in parallel where the plan permits.

Review, verify, and reflect

Guild checks every lane's output against the approved spec, then verifies tests, scope, and success criteria. After the run, Guild proposes skill edits based on what went well and what fell short. You decide what gets promoted.

The difference

Structured output. Auditable by design.

Guild is a Claude Code plugin that structures a session into a disciplined team: 14 domain specialists, a 7-step lifecycle, and skills that upgrade through a gated eval pipeline. Where a single-agent session produces ephemeral output, Guild writes every spec, plan, handoff, and review as a versioned artifact under .guild/ — permanent and auditable.

Each specialist receives only the context it needs, runs at the model tier the work demands, and returns a typed guild.handoff.v2 envelope the next lane can act on. Skills improve through paired evals and shadow-mode gating; you approve every promotion before it ships.

Start with a single run.

Guild is open source, MIT licensed, and zero-config to install. The source is on GitHub; the docs cover everything from installation to skill authoring.