agents · 2026-05-23
How to Design a Planner-Executor-Reviewer Agent Workflow
A practical guide to designing AI-agent workflows with separate planning, execution, review, and acceptance gates.
AI-assisted: this article was produced with AI assistance and editorial quality gates.
A useful AI-agent workflow is not just a bigger prompt. It is a system of roles, handoffs, checks, and stop conditions. The planner-executor-reviewer pattern is a practical way to split agent work into three responsibilities: deciding what should be done, doing the work, and deciding whether the result is good enough to ship.
This pattern is especially useful for developer-facing work: writing code, maintaining content sites, running research pipelines, updating documentation, or operating a small automation system. It does not make agents magically correct. It gives you a structure where mistakes are easier to find before they reach production.
AI-assisted disclosure: this guide was produced with AI assistance and reviewed against KnowToAct editorial gates for source quality, practical usefulness, and risk controls.
The core idea
A planner-executor-reviewer workflow separates strategy from implementation and quality control.
- The planner defines scope, constraints, acceptance criteria, and the order of work.
- The executor performs the work using tools, files, APIs, or code.
- The reviewer checks the result against the original requirements and quality standards.
- An optional acceptance gate decides whether to publish, deploy, merge, or escalate to a human.
Anthropic's guide to building effective agents distinguishes predictable workflows from more autonomous agents and recommends simple, composable patterns before adding complexity. That advice matters here: the planner-executor-reviewer pattern is not meant to be a giant swarm. It is a minimal structure for work that is too risky or too complex for a single unchecked agent call.
Workflow diagram
User goal
  |
  v
Planner agent
  |-- writes task brief
  |-- defines constraints
  |-- defines acceptance criteria
  v
Executor agent
  |-- performs implementation or research
  |-- records commands, files, sources, and uncertainty
  v
Reviewer agent
  |-- checks spec compliance
  |-- checks quality and risks
  |-- requests changes or approves
  v
Acceptance gate
  |-- runs deterministic checks
  |-- publishes, deploys, merges, or escalates
The important point is that each stage leaves an artifact. A plan, a diff, a source list, a test log, or a review note is more reliable than relying on a later agent to remember what happened in the conversation.
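To make the artifact idea concrete, here is a minimal Python sketch in which every stage writes its output into a shared record and later stages read only from that record. The names `RunRecord` and `run_workflow` are hypothetical, and the stages are stand-in functions rather than real model calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunRecord:
    """Holds one artifact per stage so nothing depends on chat memory."""
    goal: str
    artifacts: dict[str, str] = field(default_factory=dict)


def run_workflow(
    goal: str,
    planner: Callable[[str], str],
    executor: Callable[[str], str],
    reviewer: Callable[[str, str], str],
) -> RunRecord:
    record = RunRecord(goal=goal)
    # Each stage receives only artifacts that earlier stages wrote down.
    record.artifacts["plan"] = planner(goal)
    record.artifacts["result"] = executor(record.artifacts["plan"])
    record.artifacts["review"] = reviewer(
        record.artifacts["plan"], record.artifacts["result"]
    )
    return record


# Stand-in stages; a real system would call a model or a tool here.
record = run_workflow(
    "update docs",
    planner=lambda goal: f"brief for: {goal}",
    executor=lambda plan: f"diff produced from [{plan}]",
    reviewer=lambda plan, result: "approved" if plan in result else "changes requested",
)
print(record.artifacts["review"])  # approved
```

In practice the record would probably be written to files or an issue tracker so that it survives the run.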
When this pattern is worth using
Use this workflow when the output has external consequences or when quality failures would be costly. Good examples include:
- publishing a technical article to a public website;
- modifying production code;
- creating a GitHub Actions workflow;
- updating Cloudflare Pages deployment settings;
- researching tools where hallucinated sources would damage trust;
- generating docs that developers will follow step by step.
A single-agent workflow is often enough for quick brainstorming, throwaway scripts, or private notes. Adding more agents increases cost, latency, and coordination overhead. The pattern earns its keep only when the review and acceptance gates catch enough errors to justify that overhead.
Role boundaries
The pattern works best when each agent has a narrow job. If every agent can rewrite the goal, change the implementation, approve its own output, and deploy to production, the workflow is just a single agent with extra steps.
| Role | Primary job | Should write down | Should not do |
|---|---|---|---|
| Planner | Convert a goal into a scoped task | Brief, constraints, dependencies, acceptance criteria | Make unreviewed production changes |
| Executor | Produce the artifact | Files changed, commands run, sources used, known uncertainty | Approve its own work |
| Reviewer | Compare the result with the brief | Spec gaps, quality issues, risk notes | Rewrite the goal without saying so |
| Acceptance gate | Decide publish/deploy/merge/escalate | Final checklist and verification output | Skip deterministic checks |
OpenAI's Agents SDK documents concepts such as agents, tools, handoffs, guardrails, and tracing. Those concepts map well to this table: handoffs define how work moves between roles, guardrails constrain behavior, and tracing helps you inspect what happened when a multi-step workflow fails.
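One way to keep the boundaries in the table from eroding is to encode them as data and check them before any action runs. The sketch below is framework-independent and does not use the Agents SDK; the action names and the `can_approve` helper are assumptions made for illustration.

```python
from enum import Enum


class Role(Enum):
    PLANNER = "planner"
    EXECUTOR = "executor"
    REVIEWER = "reviewer"
    GATE = "acceptance_gate"


# Hypothetical action names, loosely following the role table.
ALLOWED_ACTIONS = {
    Role.PLANNER: {"write_brief"},
    Role.EXECUTOR: {"edit_files", "run_commands"},
    Role.REVIEWER: {"write_review", "request_changes", "approve"},
    Role.GATE: {"run_checks", "release", "escalate"},
}


def is_allowed(role: Role, action: str) -> bool:
    """True only if the action is listed for that role."""
    return action in ALLOWED_ACTIONS[role]


def can_approve(reviewer_id: str, author_id: str, role: Role) -> bool:
    # Approval needs the reviewer role and an identity different from the author.
    return is_allowed(role, "approve") and reviewer_id != author_id


print(is_allowed(Role.EXECUTOR, "approve"))               # False
print(can_approve("agent-b", "agent-a", Role.REVIEWER))   # True
```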
Handoff artifacts
A planner-executor-reviewer workflow needs explicit handoffs. For a content article, the planner should not simply say "write a good post." It should produce a task contract.
task: upgrade-agent-workflow-article
owner: writer-agent
section: agents
objective: publish a practical guide to planner-executor-reviewer workflows
audience: developers building AI-agent automation systems
constraints:
- avoid unsupported productivity claims
- include at least five reliable sources
- include human approval rules for risky actions
- no affiliate recommendations in this article
acceptance_criteria:
- 1500-2500 words
- includes workflow diagram, role table, and checklist
- passes content validation and build
- reviewer confirms no hallucinated sources
escalation:
- unclear legal, security, or production-deployment advice
- missing or unreachable primary sources
For software work, the same idea becomes a different contract: files to modify, tests to run, expected failure mode, and rollback plan. For operations work, it may include API scopes, allowed commands, and a manual approval point.
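A contract is only useful if the executor refuses to start without one. As a rough sketch, assuming the contract has already been loaded into a Python dict, a small validator can reject incomplete contracts before execution. The `contract_problems` function and the choice of required fields are illustrative, not a fixed schema.

```python
REQUIRED_FIELDS = (
    "task", "owner", "objective",
    "constraints", "acceptance_criteria", "escalation",
)
LIST_FIELDS = ("constraints", "acceptance_criteria", "escalation")


def contract_problems(contract: dict) -> list[str]:
    """Return a list of problems; an empty list means the contract is usable."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in contract
    ]
    for name in LIST_FIELDS:
        if name in contract:
            value = contract[name]
            # A present but empty list gives the executor nothing to check against.
            if not isinstance(value, list) or not value:
                problems.append(f"{name} must be a non-empty list")
    return problems


draft = {
    "task": "upgrade-agent-workflow-article",
    "owner": "writer-agent",
    "objective": "publish a practical guide to planner-executor-reviewer workflows",
    "constraints": [],
    "escalation": ["missing or unreachable primary sources"],
}
print(contract_problems(draft))
# ['missing field: acceptance_criteria', 'constraints must be a non-empty list']
```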
Reviewer gates should combine AI review with deterministic checks
Reviewer agents are useful, but they are not a guarantee of correctness. They can miss bugs, over-trust the executor, or reinforce a flawed plan. Treat reviewer output as one signal, not the only control.
Stronger gates combine model review with external checks:
- unit tests and integration tests;
- link checks and source validation;
- static analysis or schema validation;
- GitHub Actions CI jobs;
- Cloudflare Pages preview deployments;
- human approval for high-risk changes.
GitHub Actions is useful because it turns acceptance criteria into repeatable workflow steps. A reviewer may say a change looks good, but CI can still catch a broken build. Cloudflare Pages preview deployments provide a visible artifact for final inspection before the change reaches production.
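The combination of model review and external checks can be sketched as a single decision function in which the reviewer's verdict is one input among several. The `CheckResult` type, the `gate_decision` function, and the ordering of the rules are assumptions for illustration; a real gate would read results from CI rather than from hand-built objects.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool


def gate_decision(
    reviewer_approved: bool, checks: list[CheckResult], high_risk: bool
) -> str:
    # Refuse to accept anything if no deterministic check ran at all.
    if not checks:
        return "reject: no deterministic checks ran"
    failed = [check.name for check in checks if not check.passed]
    if failed:
        return "reject: " + ", ".join(failed)
    if not reviewer_approved:
        return "reject: reviewer requested changes"
    # Passing checks and review is still not enough for high-risk changes.
    if high_risk:
        return "escalate: human approval required"
    return "accept"


checks = [CheckResult("build", True), CheckResult("link_check", False)]
print(gate_decision(True, checks, high_risk=False))  # reject: link_check
```

Failed checks are evaluated before the reviewer verdict so that an approving reviewer cannot mask a broken build.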
Example: publishing a technical article
A content workflow can use the pattern like this:
- Planner writes the article brief, search intent, required sources, and quality gates.
- Researcher or executor gathers official docs, standards, and credible engineering sources.
- Writer produces the draft using only the brief and source notes.
- Fact-checker verifies that claims are supported by sources.
- Reviewer checks usefulness, SEO quality, anti-spam risks, and compliance disclosures.
- Acceptance gate runs content validation, build, sitemap/RSS generation, and preview checks.
- The article is published only after all gates pass.
This is slower than asking one model to write a post. It can also be safer for a site that wants long-term trust, because review gates make mistakes easier to catch before publication. A low-quality AI article can hurt both readers and search visibility; a source-backed article with a checklist, examples, and clear limitations is more likely to be useful.
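The fact-checking step in the list above can be approximated with a claim-to-source mapping. The sketch below only checks that each claim cites at least one URL from a verified list; it does not fetch pages or judge whether a source really supports the claim. The function name and the sample claims are hypothetical.

```python
def unsupported_claims(
    claims: dict[str, list[str]], verified_sources: set[str]
) -> list[str]:
    """Return claims with no citation, or with a citation outside the verified list."""
    return [
        claim
        for claim, cited in claims.items()
        if not cited or any(url not in verified_sources for url in cited)
    ]


verified = {
    "https://docs.github.com/en/actions",
    "https://genai.owasp.org/llm-top-10/",
}
claims = {
    "CI can run checks on every push": ["https://docs.github.com/en/actions"],
    "Agents double team productivity": [],
    "Excessive agency is a listed risk": ["https://example.com/unverified"],
}
print(unsupported_claims(claims, verified))
# ['Agents double team productivity', 'Excessive agency is a listed risk']
```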
Failure modes and fixes
| Failure mode | What it looks like | Practical fix |
|---|---|---|
| Vague planning | Executor receives a broad goal with no acceptance criteria | Require a written task contract before execution |
| Reviewer rubber-stamp | Reviewer says "looks good" without checking sources or tests | Use a review checklist and require evidence |
| Excessive agency | Executor can deploy, delete, or spend money without approval | Apply least privilege and manual approval for risky actions |
| Context loss | Later agents do not know why a decision was made | Store decisions in files, issues, or task artifacts |
| Infinite iteration | Agents keep revising without a stop condition | Define max attempts and escalation rules |
| Hallucinated sources | Article cites nonexistent docs or unsupported claims | Require reachable URLs and source-to-claim mapping |
OWASP's LLM application guidance highlights risks such as prompt injection, sensitive information disclosure, insecure output handling, and excessive agency. Those risks become more serious when agents can use tools. The fix is not to avoid automation entirely; it is to limit permissions, validate inputs and outputs, log actions, and require human approval for high-impact steps.
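The "infinite iteration" row in the table is the easiest one to enforce in code. A minimal sketch, assuming the executor and reviewer are plain callables, caps the revision loop and escalates when the cap is reached. The function name, signature, and default of three attempts are illustrative choices.

```python
from typing import Callable, Optional


def revise_until_approved(
    produce: Callable[[str], str],
    review: Callable[[str], tuple[bool, str]],
    max_attempts: int = 3,
) -> tuple[str, Optional[str]]:
    """Return ("approved", draft) or ("escalate", last_feedback)."""
    feedback = ""
    for _ in range(max_attempts):
        draft = produce(feedback)           # executor sees the previous feedback
        approved, feedback = review(draft)  # reviewer returns verdict and notes
        if approved:
            return "approved", draft
    # Stop condition reached: hand the last feedback to a human.
    return "escalate", feedback


# Stand-in executor that only adds sources after being asked to.
status, output = revise_until_approved(
    produce=lambda fb: "draft with sources" if fb else "draft",
    review=lambda draft: ("sources" in draft, "add sources"),
)
print(status, "|", output)  # approved | draft with sources
```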
Human approval rules
Do not let a reviewer agent be the only approval layer for irreversible actions. Require human approval when a task involves:
- production deployments;
- financial transactions;
- legal, medical, tax, or compliance advice;
- security-sensitive code or credentials;
- deleting data;
- contacting users or customers;
- publishing claims about people, companies, or products;
- changing DNS, billing, or account permissions.
The NIST AI Risk Management Framework organizes AI risk work into four functions (Govern, Map, Measure, and Manage) and stresses that risks should be assessed in context. A small developer project does not need enterprise bureaucracy, but it still benefits from clear roles, logs, and escalation rules.
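The approval list above can be turned into a simple lookup that runs before any tool call. In this sketch the tag names are hypothetical labels for the listed categories, and the code assumes that something upstream (the planner or a human) has already tagged the action.

```python
# Hypothetical tags, one per category in the human-approval list.
HUMAN_APPROVAL_CATEGORIES = frozenset({
    "production_deploy",
    "financial_transaction",
    "regulated_advice",
    "security_sensitive",
    "data_deletion",
    "user_contact",
    "public_claims",
    "dns_billing_or_permissions",
})


def approval_reasons(action_tags: set[str]) -> list[str]:
    """Return the tags that trigger human approval, sorted for stable logs."""
    return sorted(action_tags & HUMAN_APPROVAL_CATEGORIES)


reasons = approval_reasons({"edit_docs", "production_deploy", "data_deletion"})
print(reasons)  # ['data_deletion', 'production_deploy']
```

An empty result means no listed category matched, not that the action is safe; untagged actions still need a default policy.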
What this pattern does not solve
This pattern does not eliminate hallucinations, security risks, unclear requirements, or bad judgment. It only creates clearer places to catch and correct them. If the original goal is wrong, if the sources are weak, or if the tools have excessive permissions, adding more agents can make the failure harder to debug. Treat the pattern as a control system, not a correctness guarantee.
Implementation notes for Hermes-style teams
In a Hermes-based workflow, you can represent these roles with profiles, prompts, skills, or explicit task instructions. A practical starting team might be:
- a planner for scope, architecture, and acceptance criteria;
- a researcher for source collection and notes;
- a writer or executor for drafts and implementation work;
- a fact-checker for claim verification;
- a reviewer for quality and risk review;
- an acceptance gate for final checks and release decisions.
Those role names are examples, not required components or endorsements. The important design choice is that planning, execution, review, and final acceptance are not all performed by the same unchecked actor.
Start with one workflow and one repository. Store stable project facts in the repo, not only in chat memory. Put repeatable procedures into checklists or scripts. Use CI as the final gate whenever possible.
Publication readiness checklist
Before publishing an AI-assisted technical artifact, check:
- The planner wrote a task contract with acceptance criteria.
- The executor recorded files changed, commands run, sources used, and uncertainty.
- The reviewer checked both spec compliance and quality.
- Factual claims have reliable sources or are framed as recommendations.
- No agent approved its own work as the final authority.
- Tests, build, source checks, or link checks passed.
- Risky actions have a human approval path.
- The output includes disclosure when AI assistance materially contributed.
- The system has a rollback or correction path.
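If the checklist is run often, it can help to store it as data so that a missing answer counts as a failure rather than being silently skipped. The short keys below are hypothetical names for the nine items above.

```python
# One hypothetical key per checklist item, in the same order as the list.
CHECKLIST = (
    "task_contract_written",
    "executor_log_recorded",
    "spec_and_quality_reviewed",
    "claims_sourced_or_framed",
    "no_self_approval",
    "deterministic_checks_passed",
    "human_approval_path",
    "ai_disclosure_included",
    "rollback_path",
)


def unmet_items(answers: dict[str, bool]) -> list[str]:
    """Return checklist items that are false or were never answered."""
    return [item for item in CHECKLIST if not answers.get(item, False)]


answers = {item: True for item in CHECKLIST}
answers["rollback_path"] = False
del answers["no_self_approval"]  # an unanswered item is treated as unmet
print(unmet_items(answers))  # ['no_self_approval', 'rollback_path']
```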
Start small
The biggest mistake is building a complicated swarm before the basic loop works. Begin with one planner, one executor, one reviewer, and one acceptance checklist. Run the process on a real task. Record what failed. Tighten the handoffs. Only then add specialized agents, cron jobs, memory layers, or analytics-driven planning.
A planner-executor-reviewer workflow is not about making agents look organized. It is about making the work inspectable. When every stage leaves evidence, you can improve the system instead of guessing why the last run went wrong.
Sources
- Anthropic Engineering: Building Effective Agents: https://www.anthropic.com/engineering/building-effective-agents
- Hermes Agent Documentation: https://hermes-agent.nousresearch.com/docs
- OpenAI Agents SDK Documentation: https://openai.github.io/openai-agents-python/
- GitHub Actions Documentation: https://docs.github.com/en/actions
- Cloudflare Pages Git Integration: https://developers.cloudflare.com/pages/configuration/git-integration/
- NIST AI Risk Management Framework: https://www.nist.gov/itl/ai-risk-management-framework
- OWASP Top 10 for LLM Applications: https://genai.owasp.org/llm-top-10/
Original value in this guide
- planner-executor-reviewer workflow diagram
- role and handoff table
- agent task contract template
- publication readiness checklist