How to Hand Off Work Between AI Agents Without Losing Context

Agent handoff is where otherwise good workflows fall apart.

The first agent understands the plan because it lived through the work. The next agent sees a pile of files, an old chat summary, maybe a half-updated TODO list, and one vague instruction: "continue." That is how drafts get published too early, credentials get mentioned in the wrong place, or the same discovery work gets repeated three times.

A handoff protocol fixes a boring but expensive problem: how does a fresh agent know what is true now, what to do next, and what it must not touch?

This article gives a Git-backed handoff pattern for small AI-agent teams. It fits content sites, documentation projects, and lightweight product repos where agents plan, research, draft, review, test, and monitor work over multiple sessions. It is not a universal standard. It is a practical way to keep context small and decisions inspectable.

AI-assisted disclosure: this article was produced from a planner brief and source-backed research packet, then checked by independent fact-check and editorial/SEO review before human-approved publication.

Why handoffs fail

Most failed handoffs are not dramatic. They look like normal productivity until you inspect the damage.

A new agent might:

trust a week-old completion log over the current repo state;
read every historical note and inherit stale assumptions;
miss a blocker because it was buried in a long transcript;
treat "ready" as "approved for publication";
run a deployment command because the previous agent mentioned it as a future option;
repeat source checks because the last verification result was never recorded.

The root problem is mixed context. Current state, historical evidence, task scope, user approval, and implementation notes all get dumped into one prompt. The model then has to infer what matters.

Some agent frameworks treat handoff as an explicit transfer point. OpenAI's Agents SDK, for example, documents handoffs as a way for one agent to delegate work to another, and its agent model makes instructions and tools part of the agent definition. Anthropic's guidance on agent systems favors simple, composable patterns over sprawling autonomy. The same idea applies at the project level: do not make the next agent reconstruct state from scattered notes and stale transcripts. Give it a small, explicit packet.

The minimum handoff packet

A useful handoff packet separates current state from history. Here is a file-based version for a Git-backed project.

File	Purpose	Put this here	Keep out
`handoff/README.md`	Entry point	project, startup reads, major risks, active task pointer	full timeline
`handoff/STATE.json`	Machine-readable state	active task, stage, production URLs, risk IDs, last verification	long prose, credentials
`handoff/CURRENT.md`	Human-readable state	concise status, allowed actions, current risks	old debates
`handoff/NEXT.md`	Task queue	active task, ready tasks, blocked tasks	execution logs
`handoff/tasks/<id>.md`	Scope control	goal, allowed files, out-of-scope actions, gates, acceptance criteria	unrelated context
`handoff/log/<date>-<task>.md`	Evidence	files changed, commands, verification, blockers, next task	secrets, raw noisy transcripts

The point is not the exact filenames. The point is separation.

The Model Context Protocol architecture is a useful contrast here: it makes the relationship between hosts, clients, servers, tools, and context explicit. A repo-level handoff file is not MCP, but it should follow the same practical habit. Do not hide operational state inside a long narrative when a later agent needs to inspect it quickly.

A fresh agent should be able to read the entry point, current state, next-task queue, and active task card, then start work without reading every old log. Older logs are still useful, but they are evidence. They are not automatically instructions.
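
A packet like this is easy to check mechanically before an agent starts. The sketch below assumes the filenames from the table above; the helper name `missing_packet_entries` is hypothetical. It only reports what is absent and says nothing about whether the contents are current.

```python
from pathlib import Path

# Assumed layout, taken from the table above; adjust to your own filenames.
REQUIRED_FILES = ("README.md", "STATE.json", "CURRENT.md", "NEXT.md")
REQUIRED_DIRS = ("tasks", "log")


def missing_packet_entries(handoff_dir):
    """Return the required files and directories missing from handoff_dir."""
    root = Path(handoff_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means the packet skeleton exists. A non-empty list is a reason to stop and repair the handoff before reading anything else.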

This is similar to how CI systems preserve artifacts between workflow steps. GitHub Actions artifacts are not agent handoffs, but the habit is useful: keep the output of a step somewhere the next step can inspect. GitHub Issues are another useful analogy. A good issue carries scope, status, and discussion. A good task card does the same for an agent.

Truth priority when files disagree

Handoff files will disagree eventually. A task card says ready, but production is broken. A log says CI passed, but the current branch fails. A user says continue, but the active task requires human approval before deployment.

Define the conflict order before the conflict happens.

Priority	Source	Why it wins
1	Latest explicit user instruction	The user can change direction.
2	Verified repo or production state from tools	Current reality beats written notes.
3	Machine-readable state file	It is the compact current-state record.
4	Current human-readable state	It explains the machine state.
5	Next-task queue	It shows planned continuation.
6	Active task card	It defines scope for the selected task.
7	Source-of-truth docs	They define standing policy.
8	Recent completion logs	They explain what just happened.
9	Older logs and archived notes	They are history, not current instruction.

This table prevents a common agent mistake: treating the most detailed text as the most authoritative text. Detail is not authority. A long old log can be useful and still be stale.
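
The conflict order can be written down as data so that it is applied the same way every time. This is a minimal sketch: the source labels and the function name `resolve_conflict` are hypothetical, and the numbers simply mirror the priority table above.

```python
# Lower number wins, mirroring the priority table above.
# The labels are hypothetical names chosen for this sketch.
SOURCE_PRIORITY = {
    "user_instruction": 1,
    "verified_tool_state": 2,
    "state_json": 3,
    "current_md": 4,
    "next_md": 5,
    "task_card": 6,
    "source_of_truth_docs": 7,
    "recent_logs": 8,
    "old_logs": 9,
}


def resolve_conflict(claims):
    """Pick the claim from the highest-priority source.

    claims maps a source label to what that source says about one fact.
    Returns (winning_source, value). An unknown label raises KeyError so
    that a typo cannot silently win or lose.
    """
    if not claims:
        raise ValueError("no claims to resolve")
    winner = min(claims, key=lambda source: SOURCE_PRIORITY[source])
    return winner, claims[winner]
```

For example, `resolve_conflict({"recent_logs": "ci pass", "verified_tool_state": "ci fail"})` returns `("verified_tool_state", "ci fail")`: the current check beats the written note, however detailed the note is.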

Progressive disclosure beats context hoarding

The next agent does not need every detail at startup. It needs enough to act safely.

A good startup rule looks like this:

1. Read the handoff entry point.
2. Read the protocol or operating rules.
3. Read the machine-readable state.
4. Read the current-state summary.
5. Read the next-task queue.
6. Read the active task card.
7. Stop reading unless the task points somewhere else or verification fails.

That last step matters. Agents often over-read because it feels safer. It is not always safer. Too much old context can contaminate the next decision. If an old log says "deployment is next" but the current task is only allowed to create a draft, the extra context has made the agent more dangerous, not more informed.

Use deeper reads for concrete reasons:

a source-of-truth doc is referenced by the task;
verification fails;
files conflict;
a blocker needs history;
the user asks for an explanation of past decisions.

Otherwise, start the task.
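
The startup rule and the deeper-read triggers can be sketched as two small functions. The assumptions are loud here: `handoff/PROTOCOL.md` is a hypothetical location for the operating rules, the trigger labels are invented for this example, and the state dictionary is expected to carry an `active_task.task_file` field.

```python
# Hypothetical labels for the concrete reasons listed above.
DEEPER_READ_TRIGGERS = frozenset({
    "source_doc_referenced",
    "verification_failed",
    "files_conflict",
    "blocker_needs_history",
    "user_asked_for_history",
})


def startup_reads(state, protocol_file="handoff/PROTOCOL.md"):
    """Return the ordered files a fresh agent reads before starting.

    protocol_file is an assumed path; point it at wherever the project
    keeps its operating rules.
    """
    return [
        "handoff/README.md",
        protocol_file,
        "handoff/STATE.json",
        "handoff/CURRENT.md",
        "handoff/NEXT.md",
        state["active_task"]["task_file"],
    ]


def needs_deeper_read(events):
    """True only if at least one recognized trigger has occurred."""
    return bool(set(events) & DEEPER_READ_TRIGGERS)
```

Anything that is not a recognized trigger, including general curiosity, returns `False`, which is the point: reading more has to be justified by an event.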

A copy-paste starting template

For a small repo, the first handoff packet can be plain Markdown plus one JSON file.

handoff/
  README.md
  STATE.json
  CURRENT.md
  NEXT.md
  tasks/
  log/

Minimal STATE.json:

{
  "project": "Project name",
  "stage": "drafting",
  "active_task": {
    "id": "TASK-0001",
    "title": "Create first draft",
    "status": "ready",
    "task_file": "handoff/tasks/TASK-0001.md",
    "requires_human_approval_before_start": false,
    "requires_human_approval_before_side_effects": true
  },
  "last_verification": {
    "date": "YYYY-MM-DD",
    "ci": "pass",
    "production_health": "not_applicable"
  },
  "open_risks": []
}
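
A state file is only useful if it stays well-formed and free of credentials. The following sketch checks for the keys used in the example above and flags key names that look like secrets. The list of suspicious fragments is an assumption made for illustration, not a complete secret scanner, and `validate_state` is a hypothetical helper name.

```python
import json

TOP_LEVEL_KEYS = ("project", "stage", "active_task", "last_verification", "open_risks")
ACTIVE_TASK_KEYS = (
    "id",
    "title",
    "status",
    "task_file",
    "requires_human_approval_before_start",
    "requires_human_approval_before_side_effects",
)
# Assumption: key names containing these fragments suggest a credential.
SUSPICIOUS_FRAGMENTS = ("token", "secret", "password", "api_key")


def key_paths(node, prefix=""):
    """Yield a dotted path for every key in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield path
            yield from key_paths(value, path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from key_paths(item, f"{prefix}[{index}]")


def validate_state(text):
    """Return a list of problems in STATE.json text; empty means none found."""
    try:
        state = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(state, dict):
        return ["top level must be a JSON object"]

    problems = [f"missing key: {key}" for key in TOP_LEVEL_KEYS if key not in state]
    task = state.get("active_task")
    if isinstance(task, dict):
        problems += [
            f"missing active_task key: {key}"
            for key in ACTIVE_TASK_KEYS
            if key not in task
        ]
    for path in key_paths(state):
        leaf = path.rsplit(".", 1)[-1].lower()
        if any(fragment in leaf for fragment in SUSPICIOUS_FRAGMENTS):
            problems.append(f"possible credential field: {path}")
    return problems
```

Running a check like this in CI is one way to keep the state file honest between sessions; whether a flagged key is a real secret still needs a human or a dedicated scanner to decide.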

Minimal CURRENT.md:

# Current state

- One-line status:
- Repo path:
- Branch:
- What is allowed now:
- What requires human approval:
- Current risks:
- Last verification:

Minimal NEXT.md:

# Next tasks

## Active task

Task ID:
Title:
Status:
Task file:
Why this is next:

## Backlog

- P1:
- P2:
- Blocked:

This is enough to test the system. Start with one task, finish it, update the packet, then ask a fresh agent to resume from only these files. If it cannot safely continue, the handoff is missing something.

Task cards control scope

A task card is the contract between the project and the next agent. It should be boring and explicit.

A compact status model helps too:

Status	Meaning	Who can usually move it forward
`ready`	The task is clear and unblocked.	Agent can start if no approval gate blocks it.
`in_progress`	Work has started.	Current agent.
`blocked`	Work cannot continue without a decision, credential, source, or fix outside scope.	Human owner or a task that resolves the blocker.
`needs_review`	Draft or change exists and needs fact-check, editorial review, or acceptance.	Independent reviewer or acceptance agent.
`approved`	Required review passed, but publication or deploy may still be gated.	Human owner when the action is gated.
`completed`	Acceptance criteria passed and handoff was updated.	Agent after verification.
`archived`	No longer active, kept for record.	Human owner or maintainer.

Do not let status names blur gates. "Approved" for editorial quality is not the same as "approved to deploy." If the project has human publication gates, keep those gates explicit.
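
One way to keep status and gates separate is to check them in code rather than in the agent's head. This sketch assumes a task dictionary with the same fields as the `active_task` object in the STATE.json example; the function name `check_gate` and the two action labels are hypothetical.

```python
def check_gate(task, action, human_approved=False):
    """Decide whether an agent may take an action on a task.

    action is "start" or "side_effect" (labels assumed for this sketch).
    Returns (allowed, reason).
    """
    if action == "start":
        if task["status"] != "ready":
            return False, f"status is {task['status']!r}, not 'ready'"
        gated = task["requires_human_approval_before_start"]
    elif action == "side_effect":
        # Status is deliberately ignored here: an "approved" status
        # records a passed review, not permission to publish or deploy.
        gated = task["requires_human_approval_before_side_effects"]
    else:
        raise ValueError(f"unknown action: {action!r}")

    if gated and not human_approved:
        return False, "human approval required"
    return True, "ok"
```

With this shape, a task whose status is `approved` still returns `(False, "human approval required")` for a side effect until a human has explicitly signed off.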

A minimal task card:

# KTA-0007: Example task title

Status: ready
Priority: P2
Owner: agent
Human approval required before start: no
Human approval required before side effects: yes

## Goal

One paragraph describing the intended outcome.

## Scope

Allowed:
- Create or edit `exact/path.md`.
- Run local validation.
- Update handoff files.

Not allowed:
- Publish content.
- Deploy production.
- Modify credentials.
- Schedule recurring automation.

## Acceptance criteria

- Artifact exists at the expected path.
- Required sections are present.
- Verification command passes.
- Handoff is updated.

## Verification commands

```bash
pnpm run ci
git status --short
```

The "not allowed" section is as important as the goal. Agents are good at finding ways to be helpful. Scope control tells them which helpful actions would be wrong.

For content sites, common gated actions include moving drafts into public content directories, changing `status` to `published`, deployment, cron scheduling, credential changes, domain or canonical changes, analytics verification changes, monetization, and deleting published pages.

Those gates must survive handoff. If agent A knew publication required approval, agent B must know it too. This is especially important for AI-assisted content: Google's helpful-content guidance focuses on usefulness and reliability, not on whether a page was produced quickly. A handoff that drops the review gate can turn a useful draft workflow into low-value publishing by accident.
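
The "allowed" list on a task card can double as an automated scope check. The sketch below assumes the changed paths have already been collected, for example from `git status --short`, and that the task card's allowed paths are expressed as shell-style patterns. The helper name `out_of_scope` is hypothetical.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_paths, allowed_patterns):
    """Return the changed paths that match none of the allowed patterns.

    Patterns use fnmatch rules, where "*" also matches "/", so
    "handoff/*" covers nested files such as "handoff/log/x.md".
    """
    return [
        path
        for path in changed_paths
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]
```

If a draft-only task shows a change under `content/agents/`, the check returns that path, and the agent should stop and ask rather than continue.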

Completion logs are evidence

A completion log should answer one question: what can the next agent trust without replaying the whole session?

Use a compact structure:

```markdown
# Completion log: YYYY-MM-DD task-name

Task ID:
Task title:
Date:
Status:
Related commits:

## Goal

## Completed work

## Files changed

## Commands run

## Verification

## Acceptance criteria result

## Blockers or risks

## Decisions made

## Next recommended task
```

Do not paste raw transcripts. Do not paste secrets. Do not record every false start unless it explains a current blocker.

The log should include exact commands and whether they passed. "Checked production" is weak. "node scripts/check-production-health.mjs: pass against https://agents.goodwallet.dpdns.org" is useful. The next agent can decide whether that result is still fresh enough or needs to be rerun.
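
Deciding whether a logged result is "fresh enough" can be a one-line rule. This sketch assumes verification dates are recorded as ISO `YYYY-MM-DD` strings, as in the STATE.json example, and that one day is an acceptable default age. Both the threshold and the name `is_fresh` are assumptions to tune per project.

```python
from datetime import date


def is_fresh(verified_on, today, max_age_days=1):
    """True if a verification dated verified_on is recent enough to trust.

    verified_on is an ISO date string; today is a datetime.date.
    A date in the future is treated as not fresh, since it signals a
    clock or logging error rather than a valid result.
    """
    age_days = (today - date.fromisoformat(verified_on)).days
    return 0 <= age_days <= max_age_days
```

When the result is not fresh, the safe move is to rerun the exact command recorded in the log instead of trusting the old pass.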

A bad handoff and a safer one

Bad handoff:

We finished most of the setup. Continue with the article stuff. The site was working earlier. Maybe deploy after CI. There were some token issues but should be fine.

This forces the next agent to guess. What setup? Which article? What does "working" mean? Is deployment approved? What token issues?

Safer handoff:

Active task: KTA-0006, create the next agents article.
Allowed now: create brief, research packet, draft under drafts/agents/, run reviews and CI, update handoff.
Not allowed without human approval: move draft to content/agents/, set status to published, deploy production.
Last verification: pnpm run ci passed on 2026-05-26; production health check passed on 2026-05-26.
Current blocker: Cloudflare API token rotation remains open, but this task does not need deployment credentials.
Start by reading handoff/tasks/KTA-0006-next-agents-article.md and docs/content-workflow.md.

The safer version is not much longer. It is just sharper.

Fresh-agent resume checklist

Before a new agent acts, it should run this checklist:

I know the active task ID.
I know the task file path.
I know what is allowed.
I know what is explicitly out of scope.
I know which actions require human approval.
I know the latest verification status.
I checked the live repo state when it matters.
I know which files I am expected to update after completion.
I know the next verification command.
I have not treated an old log as current instruction.

If any item is unclear, the next action is not implementation. The next action is state discovery.
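
The checklist can be expressed as a small gate that refuses to move to implementation while anything is unknown. The item keys and the function name `next_action` are hypothetical labels for the ten statements above; how each answer is established is left to the agent or its harness.

```python
# Hypothetical keys, one per checklist statement above, in the same order.
CHECKLIST = (
    "active_task_id_known",
    "task_file_path_known",
    "allowed_actions_known",
    "out_of_scope_known",
    "approval_gates_known",
    "latest_verification_known",
    "live_repo_state_checked",
    "files_to_update_known",
    "next_verification_command_known",
    "old_logs_not_treated_as_instruction",
)


def next_action(answers):
    """Return ("implementation", []) or ("state_discovery", unclear_items).

    answers maps checklist keys to True or False. A missing key counts
    as unclear, so silence never passes the gate.
    """
    unclear = [item for item in CHECKLIST if not answers.get(item, False)]
    if unclear:
        return "state_discovery", unclear
    return "implementation", []
```

The list of unclear items is the useful part: it tells the agent exactly which piece of state to go and find.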

How this fits the agent content pipeline

Handoff is the record that lets one role resume another role's work without guessing.

A planner-executor-reviewer workflow separates who designs, who implements, and who reviews. A research-writing-review pipeline separates source gathering from drafting and editorial judgment. A draft-safety gate keeps unfinished work out of public output, as described in preventing agents from publishing drafts. A review gate catches unsupported claims and AI-sounding prose before publication, as covered in reviewing AI-written technical articles. The larger AI-agent content pipeline ties those pieces into a publishable workflow.

Handoff is what lets those roles operate across time. It gives the next agent a clean starting point instead of a pile of guesses.

Get the handoff packet working before adding more automation. Create the six handoff files, run one real task through them, then start a fresh session and see whether the next agent can resume without reading the whole transcript. If it cannot, fix the packet before you trust it with publishing, deployment, credentials, or scheduled jobs.
