Agent Architecture

Every agent definition follows a standard structure:

# Frontmatter
---
name: 06b-Bicep CodeGen
description: Expert Azure Bicep IaC specialist...
model: ["Claude Sonnet 4.6"] # (1)!
tools: [list of allowed tools] # (2)!
handoffs:
  - label: "Step 6: Deploy"
    agent: 07b-Bicep Deploy # (3)!
    prompt: "Deploy the Bicep templates..."
---
# Body (≤ 500 lines)
## MANDATORY: Read Skills First
1. **Read** `.github/skills/azure-defaults/SKILL.md` # (4)!
2. **Read** `.github/skills/azure-artifacts/SKILL.md`

The numbered markers refer to:

  1. Model selection: the Orchestrator can override this based on task complexity
  2. Tool allowlist: agents only access tools they need
  3. Handoff target: the next agent in the workflow
  4. Skills are loaded on demand to preserve context budget

The frontmatter is machine-readable metadata. The body is the agent’s full operating manual — the runtime loads it into the system prompt at invocation.
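
To make the frontmatter/body split concrete, here is a minimal Python sketch. It is not the runtime's loader: `split_agent_file` is a hypothetical helper, and it assumes the file opens with a `---` line and that the next `---` line closes the frontmatter.

```python
def split_agent_file(text: str) -> tuple[str, str]:
    """Return (frontmatter, body) for an agent definition.

    Assumption: the frontmatter sits between two lines containing only '---'.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("agent file must start with a '---' delimiter")
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    raise ValueError("frontmatter is not closed by a second '---' line")


sample = """---
name: 06b-Bicep CodeGen
model: ["Claude Sonnet 4.6"]
---
## MANDATORY: Read Skills First
"""
frontmatter, body = split_agent_file(sample)
print(frontmatter)
print(len(body.splitlines()), "body line(s)")
```

The frontmatter string would then go to a YAML parser, while the body is what ends up in the system prompt.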

Agents interact with external systems through tools — structured interfaces provided by MCP servers and the VS Code runtime. Each agent’s frontmatter declares a tools: allowlist that restricts which tools it can call. Common tool categories:

  • MCP tools: Cloud API wrappers (Azure pricing queries, GitHub operations, Terraform registry lookups)
  • File tools: Read and write workspace files (create artifacts, read prior step outputs)
  • Terminal tools: Execute CLI commands (Bicep build, Terraform validate, Azure CLI)
  • Subagent tools: Delegate to specialised subagents via #runSubagent
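
The allowlist behaviour can be pictured with a short Python sketch. It only illustrates the idea of filtering requested tool calls against the `tools:` list; `partition_tool_calls` is a hypothetical name, and the real enforcement happens in the runtime, not in code like this.

```python
def partition_tool_calls(requested: list[str],
                         allowlist: list[str]) -> tuple[list[str], list[str]]:
    """Split requested tool names into (allowed, rejected), preserving order."""
    permitted = set(allowlist)
    allowed = [name for name in requested if name in permitted]
    rejected = [name for name in requested if name not in permitted]
    return allowed, rejected


# Tool names taken from the frontmatter example later in this page.
agent_tools = ["read_file", "create_file"]
allowed, rejected = partition_tool_calls(
    ["read_file", "run_in_terminal", "create_file"], agent_tools
)
print("allowed:", allowed)
print("rejected:", rejected)
```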

Agents do not communicate directly. Instead, each agent produces artifact files in agent-output/{project}/ that the next agent reads as input. The Orchestrator coordinates this by delegating to one agent at a time, collecting its output, and routing to the next step. At approval gates, the Orchestrator writes a 00-handoff.md summary document that enables session resume.

| Agent | Role | Primary Skills |
| --- | --- | --- |
| 01-Orchestrator | Master orchestrator | workflow-engine, apex-recall |
| 02-Requirements | Captures project requirements | azure-defaults, azure-artifacts |
| 03-Architect | WAF assessment and cost estimation | azure-defaults |
| 04-Design | Diagrams and ADRs | drawio, python-diagrams, azure-adr |
| 04g-Governance | Policy discovery and compliance | azure-defaults |
| 05-IaC Planner | IaC implementation planning (Bicep & Terraform) | azure-bicep-patterns, terraform-patterns |
| 06b-Bicep CodeGen | Bicep template generation | azure-bicep-patterns |
| 06t-Terraform CodeGen | Terraform configuration generation | terraform-patterns |
| 07b-Bicep Deploy | Bicep deployment execution | azure-validate, iac-common |
| 07t-Terraform Deploy | Terraform deployment execution | azure-validate, iac-common, terraform-patterns |
| 08-As-Built | Post-deployment documentation | azure-artifacts, drawio, python-diagrams |
| 09-Diagnose | Azure resource troubleshooting | azure-diagnostics |
| 10-Challenger | Standalone adversarial review | |
| 11-Context Optimizer | Context window audit and optimisation | context-management |
| e2e-orchestrator | Prompt-invoked end-to-end validation driver | workflow-engine, apex-recall |

For a live, always-current roster, see the Architecture Explorer. The count is computed from tools/registry/count-manifest.json and the source of truth is the .github/agents/*.agent.md files on disk.

First-run project decisions (iac_tool, review_depth)

The 01-Orchestrator captures two project-scoped decisions the first time a project boots (or during the first approval gate after apex-recall init). Both are persisted to apex-recall and never re-asked — every downstream agent reads them via apex-recall show <project> --json.

| Decision | Default | When asked | Persistence key |
| --- | --- | --- | --- |
| iac_tool | none (must be chosen) | Step 1 (Requirements), Phase 2 | decisions.iac_tool |
| review_depth | default (single-pass comprehensive, recommended) | Project boot or first gate after init | decisions.review_depth |

review_depth values:

  • default — one comprehensive challenger pass at Steps 1, 2, 4 (plus governance-reconciliation at Step 3.5). Right for most workshops, MVPs, and single-region projects.
  • deep — rotating-lens multi-pass cascade per adversarial-review-protocol.md (Pass 1 security-governance → Pass 2 architecture-reliability → conditional Pass 3 cost-feasibility). Worth the ~3× challenger cost for regulated workloads (HIPAA/PCI), prod migrations, or multi-region designs.
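
As a sketch of how a downstream agent might consume these decisions, the Python below parses the JSON that `apex-recall show <project> --json` is described as returning. The JSON shape (a top-level `decisions` object) is an assumption inferred from the persistence keys, the `bicep` value is illustrative, and `read_decisions` is a hypothetical helper.

```python
import json

VALID_REVIEW_DEPTHS = {"default", "deep"}


def read_decisions(show_output: str) -> dict[str, str]:
    """Extract iac_tool and review_depth from apex-recall JSON output.

    Assumption: the output looks like {"decisions": {"iac_tool": ..., ...}}.
    """
    decisions = json.loads(show_output).get("decisions", {})
    iac_tool = decisions.get("iac_tool")
    if iac_tool is None:
        # iac_tool has no default, so a missing value is an error.
        raise ValueError("iac_tool has not been chosen yet")
    review_depth = decisions.get("review_depth", "default")
    if review_depth not in VALID_REVIEW_DEPTHS:
        raise ValueError(f"unknown review_depth: {review_depth!r}")
    return {"iac_tool": iac_tool, "review_depth": review_depth}


# Illustrative payload, not captured from a real run.
example = '{"decisions": {"iac_tool": "bicep"}}'
print(read_decisions(example))
```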

Changing the value later: 01-Orchestrator writes once and never re-prompts. To switch a project from default to deep (or vice versa) after boot, edit the persisted decision directly:

apex-recall decide <project> --key review_depth --value deep --rationale "Escalated to deep review after Step 2 reliability gap" --json

Alternatively, escalate a single artifact without flipping the whole project by invoking the 10-Challenger user-invocable agent manually — it runs the rotating-lens passes against one file on demand.

Authoritative contract: 01-orchestrator.agent.md → Computing decisions.review_depth.

Per-Step User Gates (Architect, Design, Governance)

Each user-facing step raises a small, predictable set of questions before continuing to the next step. The questions exist to keep creative AI decisions in the user’s hands; the agent never silently assumes a non-default value.

| Step | Gate | Question type | Decision key recorded |
| --- | --- | --- | --- |
| 2 | SKU confirmation (before pricing) | Approve / Revise / Discuss | sku_confirmation_status |
| 2 | Budget gate (after pricing) | Approve / Revise SKUs / Revise requirements | budget_decision |
| 2 | Per-finding decisions | Accept / Skip / Defer (one question per finding, never batched) | decision_log |
| 3 | Diagram tool choice (one-time) | Draw.io (default) / Python diagrams | diagram_tool |
| 3.5 | Phase 2.7 resolution | RG tag keys + casing, allowed locations (two questions) | tag_contract, governance_status |

The full registry of valid decision keys lives at tools/apex-recall/docs/decision-keys.md; the validator tools/scripts/validate-decision-keys.mjs enforces that every apex-recall decide --key reference in an agent file appears in the registry.
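
The registry check described above can be illustrated in a few lines of Python. This is not the actual `validate-decision-keys.mjs` script: the regular expression, the function name, and the in-memory registry set are all assumptions made for the sketch.

```python
import re

# Matches the key in invocations such as:
#   apex-recall decide <project> --key review_depth --value deep
DECIDE_KEY_PATTERN = re.compile(
    r"apex-recall decide\b[^\n]*?--key[ =]+([A-Za-z0-9_.-]+)"
)


def find_unregistered_keys(agent_text: str, registry: set[str]) -> list[str]:
    """Return referenced decision keys that are missing from the registry."""
    referenced = set(DECIDE_KEY_PATTERN.findall(agent_text))
    return sorted(referenced - registry)


agent_text = """
Run: apex-recall decide <project> --key review_depth --value deep --json
Run: apex-recall decide <project> --key made_up_key --value x --json
"""
registry = {"review_depth", "diagram_tool"}
print(find_unregistered_keys(agent_text, registry))
```

An empty result means every `--key` reference in the text is registered.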

Tag schema (greenfield projects): when Governance Discovery finds no tag policy at any inherited scope, projects use the lowercase environment, owner, costcenter, project set per CAF tag-strategy guidance (see .github/skills/azure-defaults/references/tag-strategy.md). The legacy PascalCase 4-tag set (Environment, ManagedBy, Project, Owner) is a deprecated convention retained only for backward compatibility on existing projects.

SKU manifest MD ↔ JSON sync: the human-readable agent-output/{project}/sku-manifest.md is rendered deterministically from sku-manifest.json via node tools/scripts/render-sku-manifest-md.mjs <project>. Agents mutate the JSON; the renderer (wired into lefthook pre-commit and CI) re-emits the MD. Hand-editing the MD is forbidden and reverted on the next commit.

Subagents are not user-invocable. They are delegated to by parent agents for isolated, specific tasks:

| Subagent | Purpose | Invoked By |
| --- | --- | --- |
| challenger-review-subagent | Adversarial review of artifacts | Steps 1, 2, 4, 5, 6 |
| cost-estimate-subagent | Azure Pricing MCP queries | Steps 2, 7 |
| bicep-validate-subagent | Lint + AVM/security code review | Step 5 (Bicep) |
| bicep-whatif-subagent | az deployment what-if preview | Step 6 (Bicep) |
| terraform-validate-subagent | Lint + AVM-TF/security code review | Step 5 (Terraform) |
| terraform-plan-subagent | terraform plan preview | Step 6 (Terraform) |
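
The IaC-specific rows of the table amount to a lookup on step and tool. The Python sketch below encodes only those four rows; the `bicep` and `terraform` strings for `iac_tool` and the function name are assumptions, and how the parent agents actually select a subagent is not shown in this page.

```python
# (step, iac_tool) -> subagent, copied from the table above.
IAC_SUBAGENTS = {
    (5, "bicep"): "bicep-validate-subagent",
    (6, "bicep"): "bicep-whatif-subagent",
    (5, "terraform"): "terraform-validate-subagent",
    (6, "terraform"): "terraform-plan-subagent",
}


def iac_subagent_for(step: int, iac_tool: str) -> str:
    """Return the IaC subagent for a step, or raise if none is listed."""
    try:
        return IAC_SUBAGENTS[(step, iac_tool)]
    except KeyError:
        raise ValueError(
            f"no IaC subagent listed for step {step} / {iac_tool}"
        ) from None


print(iac_subagent_for(5, "bicep"))
print(iac_subagent_for(6, "terraform"))
```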

The challenger-review-subagent implements adversarial review at critical workflow steps. It supports two review modes:

  • 1-pass review (comprehensive): A single review covering all dimensions. This is the default for all steps. Used for requirements (Step 1), architecture (Step 2), deploy (Step 6), and optionally for planning (Step 4) and code (Step 5).
  • Multi-pass review (rotating lenses, opt-in): Multiple separate reviews, each focused on a specific dimension (security, reliability, cost). Available for architecture (Step 2), planning (Step 4), and code (Step 5) when explicitly requested. Recommended for complex projects.

Findings are classified as must_fix (blocking) or should_fix (advisory). Only must_fix findings block workflow progression.

Conditional passes (when multi-pass is opted in): Pass 3 of the rotating lens review is conditional — it only runs if Pass 2 returned ≥1 must_fix finding. If Pass 2 returns zero must_fix items, Pass 3 is skipped entirely, saving approximately 4 minutes per review cycle.
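
The gating rules in the last two paragraphs reduce to counting must_fix findings. The Python sketch below assumes each finding is a dict with a `severity` field; that shape and the function names are illustrative, not the subagent's real output schema.

```python
def count_must_fix(findings: list[dict]) -> int:
    """Count blocking findings in a list of review findings."""
    return sum(1 for finding in findings
               if finding.get("severity") == "must_fix")


def should_run_pass3(pass2_findings: list[dict]) -> bool:
    """Pass 3 runs only when Pass 2 produced at least one must_fix."""
    return count_must_fix(pass2_findings) >= 1


def blocks_progression(findings: list[dict]) -> bool:
    """Only must_fix findings block the workflow; should_fix is advisory."""
    return count_must_fix(findings) > 0


pass2 = [{"id": "R-1", "severity": "should_fix"}]
print(should_run_pass3(pass2))    # advisory only, so Pass 3 is skipped
print(blocks_progression(pass2))  # advisory findings do not block
```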

Context Shredding for Challenger Inputs: The challenger is instructed to apply context compression tiers when loading predecessor artefacts for review:

| Context Usage | Loading Strategy |
| --- | --- |
| < 60% | Full artefact |
| 60–80% | Key H2 sections only (resource list, SKUs, WAF scores, budget) |
| > 80% | Decision summary from 00-session-state.json + resource list |

After each review pass, only the compact_for_parent string (~200 characters) is carried forward — not the full JSON findings. This prevents context bloat across multi-pass reviews and is enforced by the output schema.
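
The tier table and the carry-forward rule can be sketched in Python as follows. The challenger applies these tiers from its instructions rather than from code, so this is only an illustration; the function names and the shape of the review dict are assumptions, apart from the `compact_for_parent` field named above.

```python
def loading_strategy(context_usage_percent: float) -> str:
    """Map context usage to a loading tier, following the table above."""
    if context_usage_percent < 60:
        return "full artefact"
    if context_usage_percent <= 80:
        return "key H2 sections only"
    return "decision summary + resource list"


def carry_forward(review: dict) -> str:
    """Keep only the compact summary; the full findings stay behind."""
    return review["compact_for_parent"]


for usage in (35, 60, 80, 92):
    print(usage, "->", loading_strategy(usage))

review = {
    "findings": [{"severity": "should_fix", "detail": "long text"}],
    "compact_for_parent": "Pass 1: 0 must_fix, 1 should_fix",
}
print(carry_forward(review))
```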

New Challenger Checklists: Two mandatory checklist categories were added:

  • Cost Monitoring: Budget resource, forecast alerts at 80/100/120%, anomaly detection.
  • Repeatability: Parameterised values, multi-tenant deploy, projectName required.

Agents communicate through artefact files, not direct message passing. The Orchestrator delegates to a step agent, which produces output files in agent-output/{project}/. The next agent reads those files as input. This design:

  • Eliminates context leakage between agents
  • Enables resume from any point (artefacts are persistent)
  • Allows human review at every gate (artefacts are human-readable markdown)
  • Supports parallel development of different steps

Phase Handoff Document: At each approval gate, the Orchestrator writes a 00-handoff.md file containing a summary of what was completed, key decisions made, what comes next, and (at Gates 2 and 3) a session break recommendation. This enables resume from any gate without needing to re-read all prior artefacts.
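
A minimal Python sketch of the artefact-and-handoff pattern follows. The headings inside the generated file, the helper names, and the `demo-project` name are hypothetical; only the `agent-output/{project}/` location and the `00-handoff.md` file name come from the description above.

```python
import tempfile
from pathlib import Path


def render_handoff(completed: list[str], decisions: dict[str, str],
                   next_step: str, recommend_break: bool) -> str:
    """Render a handoff summary (the section headings are hypothetical)."""
    lines = ["# Handoff", "", "## Completed"]
    lines += [f"- {item}" for item in completed]
    lines += ["", "## Key decisions"]
    lines += [f"- {key}: {value}" for key, value in sorted(decisions.items())]
    lines += ["", "## Next", f"- {next_step}"]
    if recommend_break:
        lines += ["", "Session break recommended before continuing."]
    return "\n".join(lines) + "\n"


def write_artifact(root: Path, project: str, filename: str,
                   content: str) -> Path:
    """Write one artefact into agent-output/{project}/ and return its path."""
    project_dir = root / "agent-output" / project
    project_dir.mkdir(parents=True, exist_ok=True)
    path = project_dir / filename
    path.write_text(content, encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as workspace:
    handoff = render_handoff(
        completed=["Step 2: architecture assessment"],
        decisions={"iac_tool": "bicep", "review_depth": "default"},
        next_step="Step 3: design",
        recommend_break=True,
    )
    path = write_artifact(Path(workspace), "demo-project",
                          "00-handoff.md", handoff)
    # A resumed session reads this summary instead of every prior artefact.
    resumed = path.read_text(encoding="utf-8")
    print(resumed)
```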


This section walks through creating a new agent from scratch.

| Type | File Location | User-Invocable | Use When |
| --- | --- | --- | --- |
| Top-level agent | .github/agents/{name}.agent.md | Yes | User-facing workflow steps |
| Subagent | .github/agents/_subagents/{name}.agent.md | No | Isolated tasks delegated by parent agents |

Model selection depends on the task. Use tools/registry/agent-registry.json as the source of truth, but the current repo pattern is:

  • Planning agents (accuracy-first) — typically Claude Opus 4.7 at high reasoning effort
  • Orchestrator: MAI-Code-1-Flash, Microsoft’s fast coding model. Standard tier suits handoff-only routing without creative generation; the agent body keeps its outcome-first skeleton (Role / Goal / Success / Constraints / Output / Stop) as a sound routing structure.
  • Design + Code generation: Claude Sonnet 4.6 for Anthropic XML-tagged output contracts and stronger verbatim invariant retention (security baseline, AVM contract, HARD GATE language)
  • Governance + Challenger: GPT-5.5 for balanced execution quality with explicit retrieval budgets and stopping conditions
  • Execution, deploy, and validation subagents — model varies; consult tools/registry/agent-registry.json
  • Adversarial review — use a different model family than the artifact author when possible

Create a .agent.md file with the required frontmatter:

---
name: My Custom Agent
description: >-
  One-line description of what this agent does.
  USE FOR: keyword triggers. DO NOT USE FOR: anti-triggers.
model:
  - GPT-5.5
tools:
  - read_file
  - create_file
  - replace_string_in_file
  - run_in_terminal
  - runSubagent
handoffs:
  - label: "Next Step"
    agent: next-agent-name
    prompt: "Hand off with context..."
---

Required frontmatter fields: name, description, model, tools. Optional: handoffs, user-invocable (defaults to true for top-level).

See .github/instructions/agent-authoring.instructions.md for the complete frontmatter specification.
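
As a rough picture of the required-field rule, the Python sketch below checks an already-parsed frontmatter dict. It is not what `npm run validate:agents` executes. The function names are hypothetical, and the `False` default for subagents is inferred from the statement that subagents are not user-invocable.

```python
REQUIRED_FIELDS = ("name", "description", "model", "tools")


def missing_required_fields(frontmatter: dict) -> list[str]:
    """List required frontmatter fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not frontmatter.get(field)]


def is_user_invocable(frontmatter: dict, top_level: bool = True) -> bool:
    """Apply the default: true for top-level agents unless overridden."""
    return bool(frontmatter.get("user-invocable", top_level))


draft = {
    "name": "My Custom Agent",
    "description": "One-line description of what this agent does.",
    "tools": ["read_file", "create_file"],
}
print(missing_required_fields(draft))
print(is_user_invocable(draft))
```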

The body (below the frontmatter) is the agent’s operating manual:

## MANDATORY: Read Skills First
1. **Read** `.github/skills/azure-defaults/SKILL.md`
## DO (required behaviours)
- Always check for existing session state before starting
- Load skills progressively (SKILL.md first, then references/ on demand)
## DO NOT (prohibited behaviours)
- Do not skip approval gates
- Do not hardcode project-specific values

Keep the body under 350 lines. Use skill references for deep domain knowledge rather than inlining it.

Update the registry file:

  1. tools/registry/agent-registry.json — add the agent’s role, file path, model, and skill list

Validate the agent and the registry:

# Validate frontmatter syntax, body size, and language quality
npm run validate:agents
# Verify registry consistency
npm run validate:agent-registry

| Problem | Cause | Fix |
| --- | --- | --- |
| Agent not appearing in chat | Missing or invalid frontmatter | Run npm run validate:agents |
| Tool not available | Tool not in tools: allowlist | Add the tool name to frontmatter |
| Handoff not triggering | Wrong agent name in handoffs: | Verify target agent file exists |
| Skills not loading | Typo in skill path | Check path matches .github/skills/{name}/SKILL.md |
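
For the last row, a quick way to spot a mistyped skill path is to scan the agent body for backtick-quoted references. The Python sketch below does that under two assumptions: skill folder names are lowercase with hyphens (as in the examples on this page), and `malformed_skill_paths` is a hypothetical helper rather than part of the repo's validators.

```python
import re

# Expected shape: .github/skills/{name}/SKILL.md
SKILL_PATH_PATTERN = re.compile(r"^\.github/skills/[a-z0-9][a-z0-9-]*/SKILL\.md$")
# Any backtick-quoted reference that points into .github/skills/
SKILL_REFERENCE_PATTERN = re.compile(r"`(\.github/skills/[^`]*)`")


def malformed_skill_paths(body: str) -> list[str]:
    """Return skill references that do not match the expected path shape."""
    references = SKILL_REFERENCE_PATTERN.findall(body)
    return [ref for ref in references if not SKILL_PATH_PATTERN.match(ref)]


body_text = """
1. **Read** `.github/skills/azure-defaults/SKILL.md`
2. **Read** `.github/skills/azure-artifacts/SKILLS.md`
"""
print(malformed_skill_paths(body_text))
```

This only checks the shape of the path; confirming that the file exists on disk is a separate step.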