πŸ‘€ Human Review Gates

BaC Principle: Humans Direct, Agents Execute

In a Business as Code workflow, humans are not removed; they are repositioned. Instead of spending 10 hours/week writing emails and researching leads, you spend about 50 minutes/week reviewing what agents produced and approving what’s next.

Manifest Stages: S3, S5 β†’ See 01-Process-Manifest

There are two human gates in this process. Both are deliberate checkpoints β€” not exceptions or escalations. They are where your judgment shapes the entire batch.


🚦 Gate Overview

| Gate | Stage | Trigger | Time Required | Decision |
| --- | --- | --- | --- | --- |
| [[#Gate 1 β€” Lead Batch Review\|Gate #1]] | S3 | After 03-Lead-Sourcing-Agent + 04-Research-Agent complete | ~15 min | Approve / Reject / Defer / Flag |
| [[#Gate 2 β€” Email Template Review\|Gate #2]] | S5 | After 06-Personalization-Agent completes | ~10 min | Approve sample / Edit / Pause / Flag |

πŸ”΅ Gate 1 β€” Lead Batch Review

Triggered by: 03-Lead-Sourcing-Agent + 04-Research-Agent completion
Frequency: Twice per week (Mon + Thu)
Time required: ~15 min
Outcome feeds: 06-Personalization-Agent

What You Review

You receive a sorted list of ~50 leads with:

  • Score (0–100), score tier (hot/warm/cold)
  • Key ICP match signals
  • Research summary (top 2 signals from 04-Research-Agent)
  • Disqualification flags

Your Decision Per Lead

βœ… APPROVE    β†’ include in this batch, proceed to personalization
❌ REJECT     β†’ remove, log reason (feeds [[09-Metrics-and-Self-Improvement]])
⏸️ DEFER      β†’ hold for next batch (wrong timing, not enough context)
🚩 FLAG       β†’ something the agent shouldn't have included β€” note why
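
Each per-lead call is worth capturing as structured data, since every rejection feeds the improvement loop (the JSON shape appears under Feedback Loop below). A minimal Python sketch; the ReviewDecision enum and record_decision helper are illustrative names, not existing tooling:

from datetime import datetime, timezone
from enum import Enum
import json

class ReviewDecision(Enum):
    APPROVE = "APPROVE"
    REJECT = "REJECT"
    DEFER = "DEFER"
    FLAG = "FLAG"

def record_decision(lead_id: str, decision: ReviewDecision,
                    reason: str, feedback_category: str) -> str:
    """Serialize one Gate 1 call into the feedback format used by the improvement loop."""
    return json.dumps({
        "gate": "gate_1",
        "lead_id": lead_id,
        "decision": decision.value,
        "reason": reason,                        # e.g. "competitor", "bad_timing"
        "agent_source": "LeadSourcingAgent",
        "feedback_category": feedback_category,  # e.g. "ICP_gap"
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })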

Decision Heuristics

What to look for in 15 minutes

You’re not re-researching every lead. You’re pattern-matching for things agents miss:

  • Competitor β€” you know your space, agents don’t always catch this
  • Bad-fit company culture β€” sometimes firmographics match but you know it won’t work
  • Personal connection β€” someone you know should be handled manually
  • Recent context β€” you heard about this company at an event or on a podcast

Approval Thresholds

gate_1_rules:
  minimum_approval_rate: 0.65          # if you approve <65% of a batch, trigger ICP review
  maximum_rejection_rate: 0.35         # >35% rejection = [[02-ICP-Definition]] needs update
  auto_escalate_if:
    - batch_size < 20                  # sourcing agent produced too few leads
    - avg_score < 60                   # quality dropped significantly
  action_on_escalate: pause_and_alert_COO
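
A minimal Python sketch of how these rules might be checked after a review session; the function and action strings mirror the YAML above and are illustrative, not a description of real tooling:

def check_gate_1_batch(decisions: list[str], avg_score: float) -> str | None:
    """Map gate_1_rules onto a reviewed batch; returns an action string or None."""
    if len(decisions) < 20 or avg_score < 60:
        return "pause_and_alert_COO"        # auto_escalate_if conditions
    approved = decisions.count("APPROVE") / len(decisions)
    rejected = decisions.count("REJECT") / len(decisions)
    if approved < 0.65 or rejected > 0.35:
        return "trigger_ICP_review"         # 02-ICP-Definition likely miscalibrated
    return None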

If You Reject >35% Consistently

This is a signal that 02-ICP-Definition is miscalibrated, not that agents are failing. Trigger an ICP review via the improvement loop.


🟒 Gate 2 β€” Email Template Review

Triggered by: 06-Personalization-Agent completion on approved batch
Frequency: Twice per week (Mon + Thu, 2–4 hours after Gate #1)
Time required: ~10 min
Outcome feeds: 05-Email-Sequence-Engine

What You Review

You see a 10% random sample of personalized emails, plus any auto-flagged emails:

  • Full email (subject A + B, body)
  • Hook type used
  • Confidence score
  • Lead context summary (why this hook was chosen)
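
A minimal sketch of how that review set might be assembled, assuming each email record carries a hypothetical auto_flagged field:

import random

def build_review_set(batch: list[dict], sample_rate: float = 0.10) -> list[dict]:
    """Auto-flagged emails are always reviewed; the rest are sampled at sample_rate."""
    flagged = [e for e in batch if e.get("auto_flagged")]
    rest = [e for e in batch if not e.get("auto_flagged")]
    k = min(len(rest), max(1, round(sample_rate * len(rest))))
    return flagged + random.sample(rest, k)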

Your Decision

βœ… APPROVE SAMPLE    β†’ release entire batch to sequence
✏️ EDIT + RESAMPLE   β†’ fix template/prompt, re-generate 10% and review again
⏸️ PAUSE BATCH       β†’ something systemic is wrong, investigate before sending
🚩 FLAG EMAIL        β†’ this specific email should not send (remove from batch)
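
Note the asymmetry: approving the sample releases the whole batch, while a flag removes only one email. A sketch of the release step, with hypothetical field names:

def release_approved_batch(batch: list[dict], flagged_ids: set[str]) -> list[dict]:
    """APPROVE SAMPLE releases the entire batch, minus individually flagged emails."""
    return [e for e in batch if e["lead_id"] not in flagged_ids]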

What β€œGood” Looks Like

Approve if all of these are true:

  • Hook references a real, specific signal (not a guess or assumption)
  • Email reads like a human wrote it (no AI tells: bullet lists, excessive politeness)
  • The ask is clear and single (not multiple CTAs)
  • Subject line is curious, not salesy
  • Word count is 80–120 words

Edit or Pause if any of these are true:

  • Hook is generic despite agent claiming β€œhigh confidence”
  • Email is >140 words
  • Subject line is a question ending in a "?" (lowers deliverability)
  • More than 30% of sample uses the fallback hook template
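
Most of the Edit/Pause criteria are mechanical enough to pre-check before your 10 minutes start. A minimal sketch, assuming hypothetical subject, body, and hook_type fields on each email record:

def prescreen_email(email: dict) -> list[str]:
    """Return reasons an email should be auto-flagged before human review."""
    issues = []
    words = len(email["body"].split())
    if not 80 <= words <= 120:
        issues.append(f"word_count_{words}")   # target band is 80–120 words
    if email["subject"].rstrip().endswith("?"):
        issues.append("question_subject")      # lowers deliverability
    return issues

def fallback_hook_rate(sample: list[dict]) -> float:
    """Pause the batch if this exceeds 0.30."""
    return sum(e["hook_type"] == "fallback" for e in sample) / len(sample)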

Template Change Protocol

If you edit an email manually during Gate #2:

1. Edit the specific email β†’ approve manually.
2. If it reflects a systemic issue:
     β€’ Note the issue in [[09-Metrics-and-Self-Improvement]]
     β€’ Propose a prompt update to [[06-Personalization-Agent]]
     β€’ Tag for A/B test before deploying to full batch
3. Never change the prompt directly; always route changes through the [[09-Metrics-and-Self-Improvement]] loop.

πŸ” Feedback Loop to Agents

Every rejection and flag at both gates produces structured feedback:

{
  "gate": "gate_1",
  "lead_id": "lead_042",
  "decision": "REJECT",
  "reason": "competitor",
  "agent_source": "LeadSourcingAgent",
  "feedback_category": "ICP_gap",
  "timestamp": "2026-04-01T10:15:00Z"
}

This data flows to the improvement loop and accumulates into:

  • ICP filter updates
  • Lead scoring recalibrations
  • Personalization prompt improvements
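
A minimal sketch of how that accumulation might be summarized, assuming feedback records in the JSON shape above; the threshold and names are illustrative:

from collections import Counter

def summarize_feedback(records: list[dict], min_count: int = 5) -> list[str]:
    """Surface feedback categories that recur often enough to warrant an update."""
    counts = Counter(r["feedback_category"] for r in records
                     if r["decision"] in ("REJECT", "FLAG"))
    return [cat for cat, n in counts.items() if n >= min_count]

# e.g. ["ICP_gap"] β†’ schedule an ICP filter update via the improvement loop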

πŸ“… Weekly Human Time Budget

Monday:
  09:00 β€” Gate #1: Lead Batch Review     (~15 min)
  13:00 β€” Gate #2: Email Template Review (~10 min)

Thursday:
  09:00 β€” Gate #1: Lead Batch Review     (~15 min)
  13:00 β€” Gate #2: Email Template Review (~10 min)

Friday:
  Weekly metrics review                  (~20 min)
  β†’ [[09-Metrics-and-Self-Improvement]]

Total: ~70 min/week (incl. metrics review)

This is the entire human involvement in the process.

Everything else runs autonomously.