📊 Metrics & Self-Improvement Loop

BaC Core Principle

“A Business-as-Code company doesn’t improve once a quarter. It improves minute by minute.”

This file is where that happens.

Manifest Stage: S8 → See 01-Process-Manifest
Cadence: Weekly analysis + monthly ICP/manifest review
Human time: ~20 min/week (review and approve improvement proposals)

This is the self-improvement engine of the entire outreach process. Every metric here traces back to a specific agent or gate and generates actionable change proposals — not observations.

🔄 Improvement Loop Architecture

flowchart LR
    DATA["Raw Data<br/>sends, opens, replies,<br/>meetings, gate decisions"]

    DATA --> AGG["Weekly Aggregation<br/>MetricsAgent"]

    AGG --> ANALYSIS["Anomaly Detection<br/>+ Trend Analysis"]

    ANALYSIS --> PROP["Improvement Proposals<br/>draft changes to agents / manifest"]

    PROP --> REVIEW["👤 Human Review<br/>COO approves / rejects (~20 min)"]

    REVIEW -->|"Approved"| DEPLOY["Deploy change<br/>to manifest / agent"]
    REVIEW -->|"Rejected"| LOG["Log reason<br/>+ archive"]

    DEPLOY -->|"Next cycle"| DATA

📈 Primary KPI Dashboard

North Star Metric

Meetings booked per week. Every other metric exists to explain why this number is or isn’t where it should be.

Funnel Metrics (Weekly)

Stage	Metric	Target	Current	Status
Sourcing	Leads sourced/batch	50	—	—
Sourcing	Email validity rate	≥85%	—	—
Gate #1	Approval rate	≥65%	—	—
Sequence	Open rate (Step 1)	≥45%	—	—
Sequence	Reply rate (all steps)	≥7%	—	—
Sequence	Positive reply rate	≥3%	—	—
Conversion	Meetings booked/week	≥4	—	—
Conversion	Meeting show rate	≥80%	—	—
Quality	Unsubscribe rate	<0.5%	—	—
Quality	Bounce rate	<2%	—	—

Populate "Current" weekly after the MetricsAgent runs. A/B test results are tracked separately in the A/B Test Registry.
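
The "Status" column can be derived mechanically by comparing each current value with its target. The following is a minimal Python sketch under stated assumptions: the snake_case metric keys, the `status` helper, and the "on target" / "off target" / "no data" labels are hypothetical names that this vault does not define. Only the thresholds come from the funnel table, and the bare "50" for leads sourced per batch is read here as "at least 50".

```python
import operator

# Thresholds copied from the funnel table. The keys are illustrative
# names (assumptions), not identifiers used by the MetricsAgent.
TARGETS = {
    "leads_sourced_per_batch": (operator.ge, 50),   # assumed to mean ">= 50"
    "email_validity_rate": (operator.ge, 0.85),
    "gate_1_approval_rate": (operator.ge, 0.65),
    "open_rate_step_1": (operator.ge, 0.45),
    "reply_rate": (operator.ge, 0.07),
    "positive_reply_rate": (operator.ge, 0.03),
    "meetings_booked_per_week": (operator.ge, 4),
    "meeting_show_rate": (operator.ge, 0.80),
    "unsubscribe_rate": (operator.lt, 0.005),
    "bounce_rate": (operator.lt, 0.02),
}


def status(metric, current):
    """Return a status label for one funnel metric.

    `current` is None until the MetricsAgent has populated the value.
    """
    if current is None:
        return "no data"
    compare, target = TARGETS[metric]
    return "on target" if compare(current, target) else "off target"


if __name__ == "__main__":
    # Hypothetical weekly values, for illustration only.
    week = {"reply_rate": 0.08, "bounce_rate": 0.025, "meeting_show_rate": None}
    for name, value in week.items():
        print(name, status(name, value))
```

Keeping the comparison operator next to each threshold lets "higher is better" and "lower is better" metrics share one code path.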

🧪 A/B Test Registry

All active and archived A/B tests for 05-Email-Sequence-Engine templates and 06-Personalization-Agent prompts.

ab_tests:
  active:
    - test_id: AB-001
      name: "Subject line — question vs. statement"
      variant_a: "Scaling outbound with fewer SDRs"
      variant_b: "Is your outbound process ready to scale?"
      started: 2026-04-01
      min_sends: 100                   # per variant before declaring winner
      metric: reply_rate
      status: running
      current_sends_a: 0
      current_sends_b: 0
      results_a: null
      results_b: null
      winner: null
 
  archived: []

ab_promotion_rules:
  winner_declaration:
    minimum_sends: 100                 # per variant
    minimum_difference: 0.015          # 1.5 percentage points, absolute difference in reply rate
    confidence_level: 0.95             # 95% statistical confidence
  promotion_flow:
    - "MetricsAgent declares winner"
    - "Proposal submitted to COO at [[07-Human-Review-Gates]]"
    - "COO approves → [[05-Email-Sequence-Engine]] updated"
    - "Losing variant archived, test logged here"
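
The winner-declaration rules combine three checks: a minimum sample per variant, a minimum absolute difference, and a confidence level. The following is a minimal Python sketch of how they could be evaluated together. It assumes a two-sided two-proportion z-test, which the source does not specify. The function name `declare_winner` and the use of raw reply counts as inputs are also assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from ab_promotion_rules.winner_declaration.
MIN_SENDS = 100           # per variant
MIN_DIFFERENCE = 0.015    # absolute difference in reply rate
CONFIDENCE_LEVEL = 0.95


def declare_winner(sends_a, replies_a, sends_b, replies_b):
    """Return "a", "b", or None if no winner can be declared yet.

    Assumption: significance is judged with a two-sided
    two-proportion z-test using a pooled standard error.
    """
    if min(sends_a, sends_b) < MIN_SENDS:
        return None  # not enough data per variant

    rate_a = replies_a / sends_a
    rate_b = replies_b / sends_b
    diff = rate_b - rate_a
    if abs(diff) < MIN_DIFFERENCE:
        return None  # difference too small to matter

    pooled = (replies_a + replies_b) / (sends_a + sends_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / sends_a + 1 / sends_b))
    if std_err == 0:
        return None  # no variation at all (e.g. zero replies overall)

    z = diff / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    if p_value > 1 - CONFIDENCE_LEVEL:
        return None  # difference not statistically significant

    return "b" if diff > 0 else "a"


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(declare_winner(1000, 50, 1000, 100))
    print(declare_winner(100, 5, 100, 8))
```

Under this test, 100 sends per variant at reply rates near the 7% target will typically fall well short of 95% confidence for a difference close to the 1.5-point minimum, so a test may need to keep running beyond `minimum_sends` before a winner can be declared.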

🚨 Anomaly Detection Rules

The MetricsAgent monitors these automatically and triggers alerts:

anomaly_rules:
  - metric: reply_rate
    condition: drops_20pct_vs_7d_baseline
    action: alert_COO + pause_new_sends + propose_rollback
    reference: "[[01-Process-Manifest]] rollback protocol"
 
  - metric: bounce_rate
    condition: exceeds_0.03
    action: pause_sequence + alert_COO + audit_[[03-Lead-Sourcing-Agent]]
    severity: critical
 
  - metric: spam_complaint_rate
    condition: exceeds_0.001
    action: pause_ALL_sequences + alert_COO_immediately
    severity: critical
 
  - metric: gate_1_rejection_rate
    condition: exceeds_0.35_for_2_batches_consecutive
    action: propose_ICP_update → [[02-ICP-Definition]]
    severity: medium
 
  - metric: positive_reply_rate
    condition: drops_below_0.01_for_2_weeks
    action: propose_sequence_overhaul + prompt_review → [[06-Personalization-Agent]]
    severity: high
 
  - metric: meetings_booked
    condition: zero_for_1_week
    action: escalate_immediately_to_COO
    severity: critical
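
The rule conditions above can be read as simple threshold checks. The following is a minimal Python sketch of one possible evaluation, not the MetricsAgent's actual logic. It assumes that "drops_20pct_vs_7d_baseline" means a relative drop of at least 20%, and that "2 batches consecutive" and "2 weeks" refer to the two most recent observations. The `Snapshot` class, its field names, and `triggered_rules` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Inputs for one evaluation run (hypothetical structure)."""
    reply_rate: float
    reply_rate_7d_baseline: float
    bounce_rate: float
    spam_complaint_rate: float
    meetings_booked_this_week: int
    # Histories are ordered oldest to newest.
    gate_1_rejection_rates: list = field(default_factory=list)   # per batch
    positive_reply_rates: list = field(default_factory=list)     # per week


def last_two_all(values, predicate):
    """True if there are at least two values and the latest two satisfy predicate."""
    recent = values[-2:]
    return len(recent) == 2 and all(predicate(v) for v in recent)


def triggered_rules(s):
    """Return the names of the metrics whose anomaly rule fires."""
    triggered = []
    baseline = s.reply_rate_7d_baseline
    # Assumption: a 20% drop is measured relative to the baseline.
    if baseline > 0 and (baseline - s.reply_rate) / baseline >= 0.20:
        triggered.append("reply_rate")
    if s.bounce_rate > 0.03:
        triggered.append("bounce_rate")
    if s.spam_complaint_rate > 0.001:
        triggered.append("spam_complaint_rate")
    if last_two_all(s.gate_1_rejection_rates, lambda r: r > 0.35):
        triggered.append("gate_1_rejection_rate")
    if last_two_all(s.positive_reply_rates, lambda r: r < 0.01):
        triggered.append("positive_reply_rate")
    if s.meetings_booked_this_week == 0:
        triggered.append("meetings_booked")
    return triggered


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(triggered_rules(Snapshot(0.05, 0.07, 0.01, 0.0, 3)))
```

Each returned name would then map to the `action` and `severity` listed for that rule.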

💡 Improvement Proposal Format

When the MetricsAgent identifies an improvement, it generates a structured proposal:

proposal_template:
  id: PROP-{{date}}-{{sequence_number}}
  created_at: "{{timestamp}}"
  created_by: MetricsAgent
  type: "{{icp_update|agent_prompt_update|sequence_change|scoring_recalibration}}"   # exactly one of these
 
  problem:
    metric_affected: "{{metric_name}}"
    current_value: "{{value}}"
    target_value: "{{target}}"
    trend: "{{improving|declining|stable}}"
    evidence: "{{data_points}}"
 
  proposed_change:
    file: "[[{{target_file}}]]"
    section: "{{section_name}}"
    current: "{{current_value}}"
    proposed: "{{new_value}}"
    rationale: "{{explanation}}"
 
  test_plan:
    a_b_test_required: "{{true|false}}"
    test_duration_weeks: 2
    minimum_sends: 100
 
  human_decision_required: true
  decision_deadline: "{{date + 48h}}"
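
The identifier and deadline fields of the template can be filled in deterministically. The following is a minimal Python sketch. It assumes the ID format seen in the history log (`PROP-2026-04-01-001`: ISO date plus a three-digit sequence number) and that the 48-hour deadline is counted from `created_at`. The function name `new_proposal_header` is hypothetical.

```python
from datetime import datetime, timedelta


def new_proposal_header(created_at, sequence_number):
    """Build the fixed fields of a proposal.

    Assumptions: IDs follow PROP-YYYY-MM-DD-NNN, and the decision
    deadline is 48 hours after creation.
    """
    return {
        "id": f"PROP-{created_at:%Y-%m-%d}-{sequence_number:03d}",
        "created_at": created_at.isoformat(),
        "created_by": "MetricsAgent",
        "human_decision_required": True,
        "decision_deadline": (created_at + timedelta(hours=48)).isoformat(),
    }


if __name__ == "__main__":
    header = new_proposal_header(datetime(2026, 4, 1, 9, 0), 1)
    print(header["id"])                 # PROP-2026-04-01-001
    print(header["decision_deadline"])  # 2026-04-03T09:00:00
```

The remaining sections (`problem`, `proposed_change`, `test_plan`) depend on the metric in question and are left to the MetricsAgent to populate.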

📅 Review Cadences

Weekly (~20 min, every Friday)

1. Review MetricsAgent summary (5 min)
   - Funnel metrics vs. targets
   - Any anomalies triggered this week
   - A/B test status updates

2. Review improvement proposals (10 min)
   - Approve / reject each proposal
   - Log reasons for rejections

3. Note any qualitative observations (5 min)
   - Anything from INTERESTED reply conversations
   - Patterns you noticed in email quality at Gate #2

Monthly (~45 min)

1. Full funnel review: month-over-month comparison
2. ICP accuracy review: are we booking meetings with the right companies?
3. Sequence structure review: is the 5-step sequence still optimal?
4. Agent prompt audit: read 10 recent emails and grade their quality
5. Update the manifest version if major changes were made → [[01-Process-Manifest]]

🏛️ Improvement History Log

improvement_log:
  - date: 2026-04-01
    proposal_id: PROP-2026-04-01-001
    type: initial_setup
    change: "Process launched — baseline metrics collection begins"
    approved_by: COO
    result: pending

🔗 What Each Agent Reports Here

Agent	Metrics Reported	Frequency
03-Lead-Sourcing-Agent	Batch size, score distribution, email validity	Per batch
04-Research-Agent	Context richness scores, signal types found	Per batch
05-Email-Sequence-Engine	Opens, replies, step performance, A/B results	Daily
06-Personalization-Agent	Hook type, confidence, word count	Per email
07-Human-Review-Gates	Gate approval/rejection rates, flags	Per gate
08-Reply-Classification-Agent	Classification breakdown, response times	Daily

📎 Related Files

01-Process-Manifest — Master manifest (updated when proposals are approved)
02-ICP-Definition — Updated when Gate #1 rejection rate signals miscalibration
03-Lead-Sourcing-Agent — Sourcing quality KPIs
04-Research-Agent — Research quality KPIs
05-Email-Sequence-Engine — Sequence performance + A/B tests
06-Personalization-Agent — Personalization quality + prompt versioning
07-Human-Review-Gates — Gate feedback flows here
08-Reply-Classification-Agent — Reply data + classification accuracy
00-MOC — Back to vault index
