agents · 2026-05-26
How to Hand Off Work Between AI Agents Without Losing Context
A practical handoff protocol for AI-agent teams: current state, task cards, truth priority, completion logs, human gates, and resume checks.
AI-assisted: this article was produced with AI assistance and editorial quality gates.
Agent handoff is where otherwise good workflows fall apart.
The first agent understands the plan because it lived through the work. The next agent sees a pile of files, an old chat summary, maybe a half-updated TODO list, and one vague instruction: "continue." That is how drafts get published too early, credentials get mentioned in the wrong place, or the same discovery work gets repeated three times.
A handoff protocol fixes a boring but expensive problem: how does a fresh agent know what is true now, what to do next, and what it must not touch?
This article gives a Git-backed handoff pattern for small AI-agent teams. It fits content sites, documentation projects, and lightweight product repos where agents plan, research, draft, review, test, and monitor work over multiple sessions. It is not a universal standard. It is a practical way to keep context small and decisions inspectable.
AI-assisted disclosure: this article was produced from a planner brief and source-backed research packet, then checked by independent fact-check and editorial/SEO review before human-approved publication.
Why handoffs fail
Most failed handoffs are not dramatic. They look like normal productivity until you inspect the damage.
A new agent might:
- trust a week-old completion log over the current repo state;
- read every historical note and inherit stale assumptions;
- miss a blocker because it was buried in a long transcript;
- treat "ready" as "approved for publication";
- run a deployment command because the previous agent mentioned it as a future option;
- repeat source checks because the last verification result was never recorded.
The root problem is mixed context. Current state, historical evidence, task scope, user approval, and implementation notes all get dumped into one prompt. The model then has to infer what matters.
Some agent frameworks treat handoff as an explicit transfer point. OpenAI's Agents SDK, for example, documents handoffs as a way for one agent to delegate work to another, and its agent model makes instructions and tools part of the agent definition. Anthropic's guidance on agent systems favors simple, composable patterns over sprawling autonomy. The same idea applies at the project level: do not make the next agent reconstruct state from scattered notes and stale transcripts. Give it a small, explicit packet.
The minimum handoff packet
A useful handoff packet separates current state from history. Here is a file-based version for a Git-backed project.
| File | Purpose | Put this here | Keep out |
|---|---|---|---|
| `handoff/README.md` | Entry point | project, startup reads, major risks, active task pointer | full timeline |
| `handoff/STATE.json` | Machine-readable state | active task, stage, production URLs, risk IDs, last verification | long prose, credentials |
| `handoff/CURRENT.md` | Human-readable state | concise status, allowed actions, current risks | old debates |
| `handoff/NEXT.md` | Task queue | active task, ready tasks, blocked tasks | execution logs |
| `handoff/tasks/<id>.md` | Scope control | goal, allowed files, out-of-scope actions, gates, acceptance criteria | unrelated context |
| `handoff/log/<date>-<task>.md` | Evidence | files changed, commands, verification, blockers, next task | secrets, raw noisy transcripts |
The point is not the exact filenames. The point is separation.
The Model Context Protocol architecture is a useful contrast here: it makes the relationship between hosts, clients, servers, tools, and context explicit. A repo-level handoff file is not MCP, but it should follow the same practical habit. Do not hide operational state inside a long narrative when a later agent needs to inspect it quickly.
A fresh agent should be able to read the entry point, current state, next-task queue, and active task card, then start work without reading every old log. Older logs are still useful, but they are evidence. They are not automatically instructions.
This is similar to how CI systems preserve artifacts between workflow steps. GitHub Actions artifacts are not agent handoffs, but the habit is useful: keep the output of a step somewhere the next step can inspect. GitHub Issues are another useful analogy. A good issue carries scope, status, and discussion. A good task card does the same for an agent.
Truth priority when files disagree
Handoff files will disagree eventually. A task card says ready, but production is broken. A log says CI passed, but the current branch fails. A user says continue, but the active task requires human approval before deployment.
Define the conflict order before the conflict happens.
| Priority | Source | Why it wins |
|---|---|---|
| 1 | Latest explicit user instruction | The user can change direction. |
| 2 | Verified repo or production state from tools | Current reality beats written notes. |
| 3 | Machine-readable state file | It is the compact current-state record. |
| 4 | Current human-readable state | It explains the machine state. |
| 5 | Next-task queue | It shows planned continuation. |
| 6 | Active task card | It defines scope for the selected task. |
| 7 | Source-of-truth docs | They define standing policy. |
| 8 | Recent completion logs | They explain what just happened. |
| 9 | Older logs and archived notes | They are history, not current instruction. |
This table prevents a common agent mistake: treating the most detailed text as the most authoritative text. Detail is not authority. A long old log can be useful and still be stale.
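As a rough illustration, the conflict order can be written down as data so that an agent harness resolves disagreements mechanically instead of by prose length. This is a minimal sketch, not part of any framework: the source labels in `PRIORITY`, the `Claim` type, and the `resolve` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Lower number wins, mirroring the truth-priority table.
# The keys are hypothetical labels invented for this sketch.
PRIORITY = {
    "user_instruction": 1,
    "verified_tool_state": 2,
    "state_json": 3,
    "current_md": 4,
    "next_md": 5,
    "task_card": 6,
    "source_of_truth_doc": 7,
    "recent_log": 8,
    "old_log": 9,
}


@dataclass(frozen=True)
class Claim:
    source: str  # one of the PRIORITY keys
    topic: str   # what the claim is about, e.g. "ci"
    value: str   # what the source says, e.g. "pass"


def resolve(claims: list[Claim]) -> dict[str, Claim]:
    """Return one winning claim per topic; the highest-priority source wins."""
    winners: dict[str, Claim] = {}
    for claim in claims:
        current = winners.get(claim.topic)
        if current is None or PRIORITY[claim.source] < PRIORITY[current.source]:
            winners[claim.topic] = claim
    return winners


claims = [
    Claim("recent_log", "ci", "pass"),           # what the log says
    Claim("verified_tool_state", "ci", "fail"),  # what a fresh run says
    Claim("task_card", "deploy", "not allowed"),
]
result = resolve(claims)
print(result["ci"].value)  # prints "fail": the fresh tool result outranks the log
```

Note that the length or detail of a claim plays no part in the comparison; only its source does.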
Progressive disclosure beats context hoarding
The next agent does not need every detail at startup. It needs enough to act safely.
A good startup rule looks like this:
- Read the handoff entry point.
- Read the protocol or operating rules.
- Read the machine-readable state.
- Read the current-state summary.
- Read the next-task queue.
- Read the active task card.
- Stop reading unless the task points somewhere else or verification fails.
That last step matters. Agents often over-read because it feels safer. It is not always safer. Too much old context can contaminate the next decision. If an old log says "deployment is next" but the current task is only allowed to create a draft, the extra context has made the agent more dangerous, not more informed.
Use deeper reads for concrete reasons:
- a source-of-truth doc is referenced by the task;
- verification fails;
- files conflict;
- a blocker needs history;
- the user asks for an explanation of past decisions.
Otherwise, start the task.
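The startup rule and the deeper-read triggers can be combined into one small function that returns a reading plan. This is a sketch under assumptions: the paths follow this article's example layout, `handoff/PROTOCOL.md` is a hypothetical name for the protocol or operating-rules file, and `reading_plan` is an illustrative helper rather than an existing tool.

```python
# Startup reads, in the order listed in the startup rule.
# handoff/PROTOCOL.md is a hypothetical filename for the operating rules.
STARTUP_READS = [
    "handoff/README.md",
    "handoff/PROTOCOL.md",
    "handoff/STATE.json",
    "handoff/CURRENT.md",
    "handoff/NEXT.md",
]


def reading_plan(
    task_file: str,
    *,
    task_references: tuple[str, ...] = (),
    verification_failed: bool = False,
    files_conflict: bool = False,
    blocker_needs_history: bool = False,
    user_asked_about_history: bool = False,
) -> list[str]:
    """Return the files to read, adding deeper reads only for a concrete reason."""
    plan = STARTUP_READS + [task_file]
    # Docs that the task card points to are always in scope.
    plan += list(task_references)
    # History is opened only when one of the listed triggers applies.
    if (verification_failed or files_conflict
            or blocker_needs_history or user_asked_about_history):
        plan.append("handoff/log/")
    return plan


default_plan = reading_plan("handoff/tasks/TASK-0001.md")
deeper_plan = reading_plan(
    "handoff/tasks/TASK-0001.md",
    task_references=("docs/content-workflow.md",),
    verification_failed=True,
)
print(len(default_plan), len(deeper_plan))
```

The default plan never includes the log directory, which is the point: old logs have to be pulled in by a trigger, not read by habit.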
A copy-paste starting template
For a small repo, the first handoff packet can be plain Markdown plus one JSON file.
```text
handoff/
  README.md
  STATE.json
  CURRENT.md
  NEXT.md
  tasks/
  log/
```
Minimal STATE.json:
```json
{
  "project": "Project name",
  "stage": "drafting",
  "active_task": {
    "id": "TASK-0001",
    "title": "Create first draft",
    "status": "ready",
    "task_file": "handoff/tasks/TASK-0001.md",
    "requires_human_approval_before_start": false,
    "requires_human_approval_before_side_effects": true
  },
  "last_verification": {
    "date": "YYYY-MM-DD",
    "ci": "pass",
    "production_health": "not_applicable"
  },
  "open_risks": []
}
```
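Because the state file is machine-readable, a fresh agent (or a CI step) can check its basic shape before trusting it. The following is a minimal sketch that assumes the field names from the example above; `check_state` is a hypothetical helper, and a real project might prefer a proper JSON Schema.

```python
import json

# Keys taken from the minimal STATE.json example; adjust to your own file.
REQUIRED_TOP_LEVEL = ("project", "stage", "active_task", "last_verification", "open_risks")
REQUIRED_TASK_FIELDS = (
    "id",
    "title",
    "status",
    "task_file",
    "requires_human_approval_before_start",
    "requires_human_approval_before_side_effects",
)


def check_state(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the basic shape is present."""
    try:
        state = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(state, dict):
        return ["top level must be a JSON object"]
    problems = [f"missing key: {key}" for key in REQUIRED_TOP_LEVEL if key not in state]
    task = state.get("active_task")
    if isinstance(task, dict):
        problems += [
            f"missing key: active_task.{key}"
            for key in REQUIRED_TASK_FIELDS
            if key not in task
        ]
    elif "active_task" in state:
        problems.append("active_task must be a JSON object")
    return problems


# A deliberately incomplete state: the task file and both gate flags are missing.
sample = json.dumps({
    "project": "Project name",
    "stage": "drafting",
    "active_task": {"id": "TASK-0001", "title": "Create first draft", "status": "ready"},
    "last_verification": {"date": "YYYY-MM-DD", "ci": "pass"},
    "open_risks": [],
})
for problem in check_state(sample):
    print(problem)
```

Treating a missing gate flag as an error, rather than defaulting it to `false`, keeps a half-written state file from silently removing an approval requirement.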
Minimal CURRENT.md:
```markdown
# Current state
- One-line status:
- Repo path:
- Branch:
- What is allowed now:
- What requires human approval:
- Current risks:
- Last verification:
```
Minimal NEXT.md:
```markdown
# Next tasks
## Active task
Task ID:
Title:
Status:
Task file:
Why this is next:
## Backlog
- P1:
- P2:
- Blocked:
```
This is enough to test the system. Start with one task, finish it, update the packet, then ask a fresh agent to resume from only these files. If it cannot safely continue, the handoff is missing something.
Task cards control scope
A task card is the contract between the project and the next agent. It should be boring and explicit.
A compact status model helps too:
| Status | Meaning | Who can usually move it forward |
|---|---|---|
| `ready` | The task is clear and unblocked. | Agent can start if no approval gate blocks it. |
| `in_progress` | Work has started. | Current agent. |
| `blocked` | Work cannot continue without a decision, credential, source, or fix outside scope. | Human owner or a task that resolves the blocker. |
| `needs_review` | Draft or change exists and needs fact-check, editorial review, or acceptance. | Independent reviewer or acceptance agent. |
| `approved` | Required review passed, but publication or deploy may still be gated. | Human owner when the action is gated. |
| `completed` | Acceptance criteria passed and handoff was updated. | Agent after verification. |
| `archived` | No longer active, kept for record. | Human owner or maintainer. |
Do not let status names blur gates. An `approved` status for editorial quality is not the same as "approved to deploy." If the project has human publication gates, keep those gates explicit and separate from the status field.
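One way to keep status and gates from blurring is to check them in separate functions. The sketch below assumes a task dictionary shaped like the `active_task` object in the `STATE.json` example; `may_start` and `may_run_side_effect` are hypothetical helper names, not part of any agent framework.

```python
def may_start(task: dict, *, human_approved: bool = False) -> bool:
    """An agent may start only a ready task whose start gate is satisfied."""
    if task["status"] != "ready":
        return False
    if task["requires_human_approval_before_start"] and not human_approved:
        return False
    return True


def may_run_side_effect(task: dict, *, human_approved: bool = False) -> bool:
    """Publishing, deploying, and similar actions depend on the gate, not the status."""
    if task["requires_human_approval_before_side_effects"]:
        return human_approved
    return True


task = {
    "id": "TASK-0001",
    "status": "approved",  # editorial review passed
    "requires_human_approval_before_start": False,
    "requires_human_approval_before_side_effects": True,
}

print(may_run_side_effect(task))                       # prints False
print(may_run_side_effect(task, human_approved=True))  # prints True
```

The status value never appears in `may_run_side_effect`, so no status name, however reassuring, can unlock a gated action on its own.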
````markdown
# KTA-0007: Example task title
Status: ready
Priority: P2
Owner: agent
Human approval required before start: no
Human approval required before side effects: yes
## Goal
One paragraph describing the intended outcome.
## Scope
Allowed:
- Create or edit `exact/path.md`.
- Run local validation.
- Update handoff files.
Not allowed:
- Publish content.
- Deploy production.
- Modify credentials.
- Schedule recurring automation.
## Acceptance criteria
- Artifact exists at the expected path.
- Required sections are present.
- Verification command passes.
- Handoff is updated.
## Verification commands
```bash
pnpm run ci
git status --short
```
````
The "not allowed" section is as important as the goal. Agents are good at finding ways to be helpful. Scope control tells them which helpful actions would be wrong.
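The file-level part of a task card's scope can also be enforced in code before an agent writes anything. This is a minimal sketch that assumes the example card's scope (one exact file plus the handoff directory); `ALLOWED_FILES`, `ALLOWED_DIRS`, and `in_scope` are hypothetical names for illustration.

```python
from pathlib import PurePosixPath

# Scope copied from the example task card; both names are assumptions.
ALLOWED_FILES = {"exact/path.md"}
ALLOWED_DIRS = ("handoff",)


def in_scope(path: str) -> bool:
    """Return True only for repo-relative paths that the task card allows."""
    p = PurePosixPath(path)
    # Reject absolute paths and any attempt to climb out with "..".
    if p.is_absolute() or ".." in p.parts:
        return False
    if str(p) in ALLOWED_FILES:
        return True
    return any(p.parts[:1] == (directory,) for directory in ALLOWED_DIRS)


for candidate in (
    "exact/path.md",
    "handoff/NEXT.md",
    "content/agents/post.md",
    "handoff/../content/agents/post.md",
):
    print(candidate, in_scope(candidate))
```

A check like this covers only file edits. Gated actions such as deployment or credential changes still need their own explicit approval step.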
For content sites, common gated actions include moving drafts into public content directories, changing `status` to `published`, deployment, cron scheduling, credential changes, domain or canonical changes, analytics verification changes, monetization, and deleting published pages.
Those gates must survive handoff. If agent A knew publication required approval, agent B must know it too. This is especially important for AI-assisted content: Google's helpful-content guidance focuses on usefulness and reliability, not on whether a page was produced quickly. A handoff that drops the review gate can turn a useful draft workflow into low-value publishing by accident.
Completion logs are evidence
A completion log should answer one question: what can the next agent trust without replaying the whole session?
Use a compact structure:
```markdown
# Completion log: YYYY-MM-DD task-name
Task ID:
Task title:
Date:
Status:
Related commits:
## Goal
## Completed work
## Files changed
## Commands run
## Verification
## Acceptance criteria result
## Blockers or risks
## Decisions made
## Next recommended task
```
Do not paste raw transcripts. Do not paste secrets. Do not record every false start unless it explains a current blocker.
The log should include exact commands and whether they passed. "Checked production" is weak. "node scripts/check-production-health.mjs: pass against https://agents.goodwallet.dpdns.org" is useful. The next agent can decide whether that result is still fresh enough or needs to be rerun.
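Recording exact commands and results is easier when the agent captures them at run time instead of summarizing afterwards. The sketch below is one possible approach using Python's standard `subprocess` module; `run_check` and `to_log_lines` are hypothetical helpers, and the two demo commands only simulate a passing and a failing check.

```python
import shlex
import subprocess
import sys
from datetime import date


def run_check(command: list[str]) -> dict:
    """Run one verification command and record exactly what happened."""
    try:
        exit_code = subprocess.run(command, capture_output=True, text=True).returncode
    except FileNotFoundError:
        exit_code = None  # the executable itself was not found
    return {
        "command": shlex.join(command),
        "result": "pass" if exit_code == 0 else "fail",
        "exit_code": exit_code,
        "date": date.today().isoformat(),
    }


def to_log_lines(results: list[dict]) -> list[str]:
    """Format results as Markdown bullets for the Verification section."""
    return [
        f"- `{r['command']}`: {r['result']} (exit code {r['exit_code']}, run {r['date']})"
        for r in results
    ]


# Stand-ins for real checks such as the project's CI command.
checks = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(3)"],
]
results = [run_check(check) for check in checks]
print("\n".join(to_log_lines(results)))
```

Including the run date in each line lets the next agent judge freshness without opening the rest of the log.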
A bad handoff and a safer one
Bad handoff:
We finished most of the setup. Continue with the article stuff. The site was working earlier. Maybe deploy after CI. There were some token issues but should be fine.
This forces the next agent to guess. What setup? Which article? What does "working" mean? Is deployment approved? What token issues?
Safer handoff:
Active task: KTA-0006, create the next agents article.
Allowed now: create brief, research packet, draft under drafts/agents/, run reviews and CI, update handoff.
Not allowed without human approval: move draft to content/agents/, set status to published, deploy production.
Last verification: pnpm run ci passed on 2026-05-26; production health check passed on 2026-05-26.
Current blocker: Cloudflare API token rotation remains open, but this task does not need deployment credentials.
Start by reading handoff/tasks/KTA-0006-next-agents-article.md and docs/content-workflow.md.
The safer version is not much longer. It is just sharper.
Fresh-agent resume checklist
Before a new agent acts, it should run this checklist:
- I know the active task ID.
- I know the task file path.
- I know what is allowed.
- I know what is explicitly out of scope.
- I know which actions require human approval.
- I know the latest verification status.
- I checked the live repo state when it matters.
- I know which files I am expected to update after completion.
- I know the next verification command.
- I have not treated an old log as current instruction.
If any item is unclear, the next action is not implementation. The next action is state discovery.
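The checklist translates directly into a guard that decides between implementation and state discovery. This is a minimal sketch: the item keys in `CHECKLIST` and the `next_action` function are hypothetical names that paraphrase the list above.

```python
# One key per checklist item; the key names are invented for this sketch.
CHECKLIST = (
    "active_task_id",
    "task_file_path",
    "allowed_actions",
    "out_of_scope_actions",
    "approval_gates",
    "latest_verification",
    "live_repo_state_checked",
    "files_to_update",
    "next_verification_command",
    "old_logs_not_treated_as_instructions",
)


def next_action(answers: dict[str, bool]) -> tuple[str, list[str]]:
    """Return the next action and the items that are still unclear.

    A missing answer counts as unclear, so silence never passes the check.
    """
    unclear = [item for item in CHECKLIST if not answers.get(item, False)]
    return ("state_discovery" if unclear else "implementation", unclear)


answers = {item: True for item in CHECKLIST}
answers["approval_gates"] = False  # the agent is unsure what needs approval

action, unclear = next_action(answers)
print(action, unclear)  # prints: state_discovery ['approval_gates']
```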
How this fits the agent content pipeline
Handoff is the record that lets one role resume another role's work without guessing.
A planner-executor-reviewer workflow separates who designs, who implements, and who reviews. A research-writing-review pipeline separates source gathering from drafting and editorial judgment. A draft-safety gate keeps unfinished work out of public output, as described in preventing agents from publishing drafts. A review gate catches unsupported claims and AI-sounding prose before publication, as covered in reviewing AI-written technical articles. The larger AI-agent content pipeline ties those pieces into a publishable workflow.
Handoff is what lets those roles operate across time. It gives the next agent a clean starting point instead of a pile of guesses.
Do that before adding more automation. Create the six handoff files, run one real task through them, then start a fresh session and see whether the next agent can resume without reading the whole transcript. If it cannot, fix the packet before you trust it with publishing, deployment, credentials, or scheduled jobs.
Original value in this guide
- Git-backed handoff packet structure
- truth-priority table for conflicting context
- task-card template with gated actions
- completion-log template
- fresh-agent resume checklist
- bad versus safer handoff example