Human-in-the-Loop Approval AGENTS.md Template

Overview

This AGENTS.md template defines a human-in-the-loop approval system built on AI coding agents. It covers the workflow, agent roster, handoffs, and governance needed for auditable, compliant approvals, and it supports both single-agent operation and multi-agent orchestration under a supervising orchestrator.

It provides concrete operating context for a typical approval pipeline: data extraction, automated assessment, evidence gathering, human review, and final sign-off. It describes roles, responsibilities, memory, source-of-truth rules, and escalation paths to ensure traceability and safe decision-making.

When to Use This AGENTS.md Template

When approvals require human judgement and auditable records.
When you must enforce disciplined handoffs between automated agents and humans.
When you need a shared context and memory for each approval task.
When you want clear tool governance and secret handling in an approval workflow.
When you want a repeatable operating model for single or multi-agent orchestration.

Copyable AGENTS.md Template

# AGENTS.md
Project role: Platform Owner / AI Platform Architect
Agent roster and responsibilities:
- Planner: orchestrates the workflow, maintains memory, coordinates handoffs, ensures visibility to humans.
- Implementer: performs automated inference, data gathering, API calls, and prepares items for review.
- Researcher: fetches external context sources, data provenance, and evidence for decisions.
- Domain Specialist: provides domain-specific judgement and interpretation of outputs.
- Reviewer (Human): performs final judgement and approves or rejects.
- Auditor: logs actions for compliance and audits the decision trail.
Supervisor or orchestrator behavior:
- The Planner maintains the master plan, triggers sub-tasks, collects status, and routes to human review when necessary.
Handoff rules between agents:
- Planner -> Implementer: handoff data payload with context, sources, and required actions.
- Implementer -> Reviewer: present summary, evidence, and risk flags for human sign-off.
- Researcher -> Domain Specialist: supply evidence that needs domain-specific interpretation and flag any contradictions between sources.
Context, memory, and source-of-truth rules:
- All task context is stored in a central memory store keyed by task-id.
- The source of truth is the original source data plus the final decision logs.
- Memory is read/write with strict TTL and access controls.
Tool access and permission rules:
- Implementer may call allowed APIs with scoped tokens.
- Secrets must be retrieved from a secret vault and never logged.
- Production system calls require an approval gate.
Architecture rules:
- Event-driven orchestrator coordinates stateless agents; state is persisted to memory store.
- Agents are deterministic given the same memory state.
File structure rules:
- Keep the project lean; include only folders needed for this workflow.
Data, API, or integration rules:
- Use versioned data sources; ensure provenance and time-stamped evidence.
Validation rules:
- Each handoff includes a validation payload; the reviewer must confirm evidence.
Security rules:
- Secrets never leak into logs; rotate secrets; access limited to required scopes.
Testing rules:
- Unit tests for each agent; integration tests for handoffs; end-to-end tests for the approval path.
Deployment rules:
- Deploy in stages; require human-in-the-loop sign-off for production gating changes.
Human review and escalation rules:
- If AI confidence is below the configured threshold, escalate to the Reviewer within 1 business hour; for high-risk decisions, escalate to the Domain Specialist.
Failure handling and rollback rules:
- If an error occurs, revert to last approved state; retry up to 3 times with backoff; log and alert.
Things Agents must not do:
- Do not bypass human review; do not store secrets in the memory store; do not perform irreversible production changes without sign-off; do not share private data outside allowed contexts.

Recommended Agent Operating Model

The operating model relies on a Planner orchestrating a transparent, auditable multi-agent workflow. The Implementer handles automation, the Researcher and Domain Specialist provide context, and the human Reviewer makes the final call. The model supports escalation to human review and ensures a clear rollback path when a step fails.
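The escalation behaviour in this model can be sketched as a small routing function. This is a minimal sketch under stated assumptions, not a definitive implementation: `Route`, `Assessment`, `route_task`, and the 0.8 threshold are hypothetical names and values, and giving high risk precedence over low confidence is an assumption the template does not spell out.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_PREPARE = "auto_prepare"            # Implementer prepares the item; Reviewer still signs off
    REVIEWER = "reviewer"                    # escalate to the human Reviewer
    DOMAIN_SPECIALIST = "domain_specialist"  # escalate for domain judgement


@dataclass(frozen=True)
class Assessment:
    task_id: str
    confidence: float  # assumed to lie in [0.0, 1.0]
    high_risk: bool


CONFIDENCE_THRESHOLD = 0.8  # hypothetical value; the template leaves it open


def route_task(assessment: Assessment, threshold: float = CONFIDENCE_THRESHOLD) -> Route:
    """Decide where the Planner sends a task next."""
    if not 0.0 <= assessment.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {assessment.confidence}")
    # Assumption: high risk wins over low confidence when both apply.
    if assessment.high_risk:
        return Route.DOMAIN_SPECIALIST
    if assessment.confidence < threshold:
        return Route.REVIEWER
    return Route.AUTO_PREPARE
```

Keeping the decision in one pure function makes the thresholds easy to test and to review alongside the escalation rules.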

Recommended Project Structure

human-in-the-loop-approval/
├── agents/
│   ├── planner/
│   │   └── planner.py
│   ├── implementer/
│   │   └── implementer.py
│   ├── researcher/
│   │   └── researcher.py
│   ├── domain_specialist/
│   │   └── domain_specialist.py
│   ├── reviewer/
│   │   └── reviewer.py
│   └── auditor/
│       └── auditor.py
├── orchestrator/
│   └── orchestrator.py
├── data/
│   ├── sources/
│   ├── memory/
│   └── provenance/
├── tests/
│   ├── unit/
│   └── integration/
└── README.md

Core Operating Principles

Always require human review for high-risk decisions.
Maintain an auditable decision trail with immutable logs.
Use a central memory store for context; keep prompts and data source references.
Enforce least privilege for tool access and secret handling.
Design for idempotence and safe rollback on failures.
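One way to approximate the "immutable logs" principle is a hash-chained, append-only log, where each record commits to the one before it. The sketch below is illustrative only: `AuditLog` and its methods are hypothetical names, and the chain makes tampering detectable rather than impossible, so a real deployment would still need write-once storage.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder hash for the first record's predecessor


def _digest(prev_hash: str, entry: dict) -> str:
    """Hash an entry together with the previous record's hash."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


@dataclass
class AuditLog:
    _records: list = field(default_factory=list)

    def append(self, task_id: str, agent: str, action: str) -> str:
        """Add a record and return its hash."""
        prev_hash = self._records[-1]["hash"] if self._records else GENESIS
        entry = {"task_id": task_id, "agent": agent, "action": action}
        record = {"entry": entry, "prev_hash": prev_hash, "hash": _digest(prev_hash, entry)}
        self._records.append(record)
        return record["hash"]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered record breaks it."""
        prev_hash = GENESIS
        for record in self._records:
            if record["prev_hash"] != prev_hash:
                return False
            if record["hash"] != _digest(prev_hash, record["entry"]):
                return False
            prev_hash = record["hash"]
        return True
```

An Auditor agent could call `verify()` before each sign-off so that a broken chain blocks the approval.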

Agent Handoff and Collaboration Rules

Planner coordinates all agents and defines entry criteria for handoffs.
Implementer acts only after the Planner approves a task, and finishes its automated work before the item goes to the Reviewer for sign-off.
Researcher fetches evidence and sources, then informs Domain Specialist when domain interpretation is needed.
Domain Specialist provides judgement and notes any caveats to the Reviewer.
Reviewer validates evidence, approves or requests changes, and triggers post-approval logging.
Auditor continuously records actions and changes for compliance.
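The handoff rules above can be enforced with a typed payload and a validation step at each boundary. This is a sketch under assumptions: the `Handoff` fields, the lowercase agent identifiers, and `validate_handoff` are hypothetical, and the allowed routes are simply read off the rules in this document.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Handoff:
    task_id: str
    sender: str
    receiver: str
    summary: str
    evidence: Tuple[str, ...] = ()    # references to evidence items
    risk_flags: Tuple[str, ...] = ()


# Routes taken from the handoff rules in this document.
ALLOWED_HANDOFFS = {
    ("planner", "implementer"),
    ("implementer", "reviewer"),
    ("researcher", "domain_specialist"),
    ("domain_specialist", "reviewer"),
}


def validate_handoff(handoff: Handoff) -> List[str]:
    """Return a list of problems; an empty list means the handoff may proceed."""
    problems = []
    if not handoff.task_id:
        problems.append("missing task_id")
    if (handoff.sender, handoff.receiver) not in ALLOWED_HANDOFFS:
        problems.append(f"handoff {handoff.sender} -> {handoff.receiver} is not allowed")
    # Assumption: the Reviewer must always receive at least one evidence reference.
    if handoff.receiver == "reviewer" and not handoff.evidence:
        problems.append("reviewer handoff requires evidence")
    return problems
```

Returning a list of problems rather than raising on the first one lets the Planner log every defect in a rejected handoff at once.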

Tool Governance and Permission Rules

All tool calls must occur within approved scopes and under Planner oversight.
Secrets must be retrieved from a vault and never logged or exposed in outputs.
Production system access requires explicit human approval gates and auditable changes.
External services must be sandboxed in non-production environments until approved.
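These governance rules can be expressed as a single authorization check that runs before every tool call. The sketch below is hypothetical throughout: `ToolCall`, `authorize`, and the scope strings such as `crm:read` are illustrative names, not part of any real API.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


class PermissionDenied(Exception):
    """Raised when a tool call falls outside the approved scopes or gates."""


@dataclass(frozen=True)
class ToolCall:
    agent: str
    tool: str
    required_scope: str
    targets_production: bool = False
    approved_by: Optional[str] = None  # identifier of the human approver, if any


# Hypothetical scope grants; a real system would load these from configuration.
AGENT_SCOPES: Dict[str, FrozenSet[str]] = {
    "implementer": frozenset({"crm:read", "tickets:write"}),
}


def authorize(call: ToolCall, scopes: Dict[str, FrozenSet[str]] = AGENT_SCOPES) -> None:
    """Raise PermissionDenied unless the call is in scope and properly approved."""
    granted = scopes.get(call.agent, frozenset())
    if call.required_scope not in granted:
        raise PermissionDenied(f"{call.agent} lacks scope {call.required_scope}")
    if call.targets_production and not call.approved_by:
        raise PermissionDenied(f"{call.tool} targets production without human approval")
```

Agents with no entry in the scope table are denied by default, which matches the least-privilege principle.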

Code Construction Rules

Write modular, testable code with clear interfaces between agents.
Do not hardcode secrets; fetch from vaults at runtime.
Log actions with task IDs and agent names; avoid leaking sensitive data.
Validate inputs and outputs at each handoff.
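The logging rule can be illustrated with Python's standard `logging` module: a filter redacts secret-looking values and a `LoggerAdapter` attaches the task ID and agent name. This is a rough sketch, and the regular expression is an assumption that covers only simple `key=value` patterns; it is no substitute for keeping secrets out of log calls in the first place.

```python
import logging
import re

# Assumption: secrets appear as key=value pairs with one of these key names.
SECRET_PATTERN = re.compile(r"(?i)\b(token|password|api_key)=\S+")


def redact(text: str) -> str:
    """Replace secret-looking values with a fixed marker."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


class RedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so secrets passed as arguments are caught too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True


def get_agent_logger(agent: str, task_id: str) -> logging.LoggerAdapter:
    """Return a logger that carries agent and task_id on every record."""
    logger = logging.getLogger(f"agents.{agent}")
    if not any(isinstance(f, RedactingFilter) for f in logger.filters):
        logger.addFilter(RedactingFilter())
    return logging.LoggerAdapter(logger, {"agent": agent, "task_id": task_id})
```

With this in place, a formatter string such as `"%(task_id)s %(agent)s %(message)s"` can print both fields on every line.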

Security and Production Rules

Encrypt sensitive data in transit and at rest.
Limit production changes to approved sign-off paths.
Implement anomaly detection for unusual agent activity and escalate.

Testing Checklist

Unit tests for each agent with deterministic seeds.
Integration tests for handoffs and escalations.
End-to-end tests simulating human review and approval path.
Security and access tests for vaults and secrets management.
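As an illustration of the first checklist item, a unit test can pin the random seed and assert that an agent helper returns the same result on every run. The helper `sample_evidence` is hypothetical and stands in for any agent step that involves randomness.

```python
import random
import unittest


def sample_evidence(candidates, k, seed):
    """Hypothetical Implementer helper: pick k evidence items reproducibly."""
    rng = random.Random(seed)  # local generator; leaves global random state alone
    return rng.sample(list(candidates), k)


class TestImplementerDeterminism(unittest.TestCase):
    def setUp(self):
        self.docs = [f"doc-{i}" for i in range(20)]

    def test_same_seed_gives_same_output(self):
        first = sample_evidence(self.docs, 5, seed=42)
        second = sample_evidence(self.docs, 5, seed=42)
        self.assertEqual(first, second)

    def test_output_is_a_duplicate_free_subset(self):
        picked = sample_evidence(self.docs, 5, seed=7)
        self.assertEqual(len(set(picked)), 5)
        self.assertTrue(set(picked) <= set(self.docs))
```

Saved under the `tests/unit/` folder, a file like this can be run with `python -m unittest`.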

Common Mistakes to Avoid

Bypassing human review for critical decisions.
Storing secrets in the memory store or logs.
Overfitting prompts to narrow cases, which leads to inconsistent decisions.
Unclear escalation paths or undefined thresholds.

FAQ

What is the purpose of this AGENTS.md Template?

This template provides a repeatable operating manual for human-in-the-loop approval workflows, detailing agent roles, handoffs, and governance to ensure auditable decisions.

How are approvals escalated to humans?

If AI confidence falls below its threshold or risk exceeds its threshold, the Planner routes the task to a Reviewer or Domain Specialist within defined SLAs, with full context attached.

What constitutes high risk in this template?

High risk includes decisions impacting user safety, regulatory compliance, or critical data integrity where automated decisions require human interpretation.

How do I validate this template before production?

Run unit tests for each agent, simulate end-to-end handoffs, and perform manual testing of the human review path in a staging environment.

How is data provenance handled?

All sources are logged in memory with timestamps and the final decision includes references to provenance data for auditability.
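The provenance flow can be sketched as a task-keyed memory store that timestamps each source and a decision record that copies those references at sign-off. Everything here is an assumption made for illustration: `TaskMemory`, `SourceRecord`, `build_decision`, and the in-process dictionary stand in for whatever store a real deployment uses.

```python
import time
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SourceRecord:
    source_id: str
    version: str
    retrieved_at: float  # seconds since the epoch


class TaskMemory:
    """In-process stand-in for the central memory store, keyed by task ID."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._created: Dict[str, float] = {}
        self._sources: Dict[str, List[SourceRecord]] = {}

    def _expire(self, task_id: str, now: float) -> None:
        created = self._created.get(task_id)
        if created is not None and now - created > self._ttl:
            self._created.pop(task_id, None)
            self._sources.pop(task_id, None)

    def log_source(self, task_id: str, source_id: str, version: str) -> SourceRecord:
        now = self._clock()
        self._expire(task_id, now)
        self._created.setdefault(task_id, now)
        record = SourceRecord(source_id, version, now)
        self._sources.setdefault(task_id, []).append(record)
        return record

    def provenance(self, task_id: str) -> List[SourceRecord]:
        self._expire(task_id, self._clock())
        return list(self._sources.get(task_id, []))


def build_decision(task_id: str, outcome: str, memory: TaskMemory) -> dict:
    """Copy provenance into the decision so it outlives the memory TTL."""
    return {
        "task_id": task_id,
        "outcome": outcome,
        "provenance": [asdict(r) for r in memory.provenance(task_id)],
    }
```

Copying the references into the decision record matters because the memory entries expire under the TTL rule, while the decision log has to stay auditable.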

Target User

Use Cases