CampaignForge AI — The Journey
Chapter 1: The Idea, the Architecture, and the First Pivot
Status: Raw draft for Content Publisher Agent (11) to format
Date: 2026-05-05
Author: Tim Simeonov (founder) + Claude (AWS Deployer Agent 04)
The Idea
The starting premise was simple and uncomfortable: most digital advertising is a skilled labor problem disguised as a technology problem. A performance marketing team at a Series B startup burns $150k/month on Meta and Google with two people. Their CAC is up 40% year-over-year. They're running the same three creatives from six months ago because there's no bandwidth to test. Their agency takes two weeks to launch a campaign and charges $12k for it. The board is asking hard questions every quarter.
The solution most companies reach for is more tools — better dashboards, smarter attribution, cheaper agencies. CampaignForge AI reached for a different question: what if the entire function — strategy, creative, execution, monitoring, optimization, reporting — was run by a team of specialized AI agents, with the human operator only making approval decisions?
That's the product. Not a dashboard. Not a chatbot wrapper over an ad platform. A directed chain of specialized agents, each owning a discrete domain, each producing structured JSON outputs that feed the next agent in the chain.
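A minimal sketch of what such a handoff could look like, with each hop validated before the next agent runs. The envelope field names (`agent_id`, `pipeline_id`, `payload`) and the two toy agents are assumptions for illustration; the project's actual contracts are defined in its JSON schemas.

```python
import json
from typing import Callable

# Hypothetical envelope fields; the real contract lives in the project's schemas.
REQUIRED_FIELDS = {"agent_id", "pipeline_id", "payload"}


def make_envelope(agent_id: str, pipeline_id: str, payload: dict) -> str:
    """Serialize one agent's output as a JSON envelope."""
    return json.dumps({"agent_id": agent_id, "pipeline_id": pipeline_id, "payload": payload})


def read_envelope(raw: str) -> dict:
    """Parse an envelope and reject it if a required field is missing."""
    envelope = json.loads(raw)
    missing = REQUIRED_FIELDS - set(envelope)
    if missing:
        raise ValueError(f"envelope missing fields: {sorted(missing)}")
    return envelope


def strategist(upstream: dict) -> dict:
    # Toy stand-in for an agent: picks channels and carries the budget forward.
    return {"channels": ["meta", "google"], "budget_usd": upstream["payload"]["budget_usd"]}


def creative(upstream: dict) -> dict:
    # Toy stand-in: one headline per channel chosen upstream.
    return {"headlines": [f"Headline for {c}" for c in upstream["payload"]["channels"]]}


def run_chain(pipeline_id: str, brief: dict,
              agents: list[tuple[str, Callable[[dict], dict]]]) -> dict:
    """Pass a JSON envelope from agent to agent, validating at every hop."""
    raw = make_envelope("brief_intake", pipeline_id, brief)
    for agent_id, agent in agents:
        output = agent(read_envelope(raw))
        raw = make_envelope(agent_id, pipeline_id, output)
    return read_envelope(raw)


final = run_chain("demo-1", {"budget_usd": 500},
                  [("strategist", strategist), ("creative", creative)])
print(final["agent_id"], final["payload"]["headlines"])
```

Because every agent only sees the serialized envelope, any one of them can be replaced without touching its neighbors, as long as the fields it emits stay the same.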
The 11-agent team:
- 1. Product Person — validates the brief, runs market analysis, produces a structured PRD
- 2. Architect — designs the system architecture and selects the technical stack
- 3. Developer — writes all the code and tests
- 4. AWS Deployer — provisions infrastructure and deploys everything live
- 5. Cost Analyst — sets budget rules, bid caps, pacing controls
- 6. Performance Analyst — monitors ROAS/CAC/CTR in real time, triggers optimization
- 7. Strategist — defines targeting, channels, A/B test structure
- 8. Creative — generates ad copy, headline variants, CTAs
- 9. Executor — makes the actual ad platform API calls to launch campaigns
- 10. Orchestrator — manages state, coordinates all agents, routes failures
- 11. Content Publisher — auto-publishes the case study and journey to LinkedIn, Medium, Reddit, X
The meta-idea — the thing that makes this more than a SaaS product — is that the agent team runs its own campaigns. It advertises itself, documents the results publicly, and publishes the P&L. The proof of concept is the system using itself. If it makes money, that's the case study. If it loses money, that's the more interesting story.
Every decision the agents make gets published. Every dollar spent or saved gets documented. The journey of building and running this is the content that markets it.
How We Started Building It: Agents Building Agents
The build process was itself agentic. Rather than writing code directly, each agent in the meta-chain produced the artifacts for the next.
Step 1: Product Person Agent ran first.
Given the raw idea, Agent 01 produced PRD-001 — a structured product requirements document covering the problem, the market, the three target personas, the success metrics, and the MVP scope. Key findings:
- The desperate customer: a performance marketing lead at a Series B startup, 2-person team, $150k/month ad spend, CAC up 40% YoY
- Secondary customer: boutique agency managing 20 client accounts with 8 people, physically can't test fast enough to justify retainer
- Forrester estimate: $150B+ in annual digital ad waste from poor targeting, stale creative, slow iteration
- Root cause always the same: skilled human labor is the bottleneck
- A great performance marketer costs $180k/year. An agency charges $10–20k/month. Neither scales.
PRD-001 was reviewed and approved. Gate 1 passed.
Step 2: Architect Agent designed the system.
Agent 02 produced ADR-001 — the architectural decision record. Key decisions:
- Orchestrator: AWS Step Functions (Standard Workflows), chosen for durable execution with native `waitForTaskToken` for human approval gates that can pause for hours or days without running compute
- Compute: AWS Lambda for stateless agents, ECS Fargate for long-running agents (Developer, Deployer)
- Messaging: SQS FIFO queues for all inter-agent handoffs — no direct Lambda-to-Lambda calls
- Storage: DynamoDB for pipeline state, audit log (append-only), approval gates, campaign data; S3 for artifacts
- LLM abstraction: A shared Lambda Layer so any agent can swap models (Sonnet → Opus → Haiku) by changing a single SSM parameter, with zero code changes
- Human approval gates: Five gates. None auto-approve. If no response in 48 hours, the pipeline halts and escalates.
- Estimated AWS cost: ~$76/month for infrastructure at 10 campaigns/month (Anthropic API costs separate, $15–40/campaign)
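The cost estimate in the last bullet can be turned into an all-in monthly range with a few lines of arithmetic. This sketch assumes the infrastructure figure stays flat at 10 campaigns per month and that the per-campaign API range applies uniformly; both are simplifications of the ADR's estimate.

```python
INFRA_USD_PER_MONTH = 76            # ADR-001 infrastructure estimate
CAMPAIGNS_PER_MONTH = 10            # volume the estimate was made for
LLM_USD_PER_CAMPAIGN = (15, 40)     # ADR-001 range for Anthropic API cost


def monthly_total(llm_usd_per_campaign: float) -> float:
    """Infrastructure plus LLM spend for one month."""
    return INFRA_USD_PER_MONTH + CAMPAIGNS_PER_MONTH * llm_usd_per_campaign


low, high = (monthly_total(cost) for cost in LLM_USD_PER_CAMPAIGN)
print(f"all-in per month: ${low}-${high}")
print(f"all-in per campaign: ${low / CAMPAIGNS_PER_MONTH}-${high / CAMPAIGNS_PER_MONTH}")
```

Under those assumptions the total lands between $226 and $476 per month, so the LLM calls, not the AWS infrastructure, dominate the bill.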
ADR-001 was reviewed and approved. Gate 2 passed.
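The LLM abstraction decision above amounts to reading the model ID from configuration at call time instead of hard-coding it. A minimal sketch, assuming a hypothetical parameter name and placeholder model IDs; the client is injected so the example runs without AWS, and in a real Lambda it would be `boto3.client("ssm")`, whose `get_parameter` call returns the same response shape used here.

```python
import time

# Hypothetical parameter name; the source only says one SSM parameter selects the model.
MODEL_PARAM = "/campaignforge/llm/model_id"


class ModelResolver:
    """Resolve the active model ID from SSM, caching it for a short TTL."""

    def __init__(self, ssm_client, ttl_seconds=300.0, clock=time.monotonic):
        self._ssm = ssm_client
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached = None
        self._fetched_at = 0.0

    def model_id(self) -> str:
        now = self._clock()
        # Re-read the parameter on first use or once the cached value has expired.
        if self._cached is None or now - self._fetched_at >= self._ttl:
            response = self._ssm.get_parameter(Name=MODEL_PARAM)
            self._cached = response["Parameter"]["Value"]
            self._fetched_at = now
        return self._cached


class FakeSSM:
    """In-memory stand-in for the SSM client so the sketch runs offline."""

    def __init__(self, value):
        self.value = value

    def get_parameter(self, Name):
        return {"Parameter": {"Name": Name, "Value": self.value}}


ssm = FakeSSM("model-a")                        # placeholder IDs, not real model names
resolver = ModelResolver(ssm, ttl_seconds=0)    # TTL 0: re-read on every call
print(resolver.model_id())
ssm.value = "model-b"                           # the single parameter change
print(resolver.model_id())
```

The TTL is a design choice this sketch adds: a nonzero value avoids an SSM round trip on every invocation, at the price of a short delay before a model swap takes effect.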
Step 3: Developer Agent built everything.
Agent 03 implemented the full codebase from ADR-001:
- 11 Lambda functions (product person, performance analyst, content publisher, approval notifier, approval handler, escalation checker, brief intake, pipeline status, audit reader, configure spend limits, activate monitoring)
- The LLM client Lambda Layer
- The Step Functions state machine in ASL JSON with 3 campaign pipeline gates
- 10 Terraform modules (storage, secrets, SSM, IAM, messaging, compute, API Gateway, Step Functions, observability, budgets)
- JSON schemas for all agent contracts (brief, envelope, error, Agent 01 output, Agent 06 output, Agent 11 output)
- A test suite covering schema validation, brief intake, agent 01, approval handler
Result: 29/29 tests passing. Gate 3 — deploy sign-off — was next.
The First Real Decision: AWS Deployer Finds Three Bugs
Agent 04 (the current agent writing this) was activated to deploy everything to AWS. Standard operating procedure: read the full build output, run terraform plan, deploy, verify.
Before a single AWS credential was entered, a code review caught three problems that would have caused terraform plan to fail immediately:
Bug 1 and 2: Circular dependencies in Terraform.
The compute module needed the api module's invoke URL (so Lambdas could construct approval callback URLs). The api module needed the compute module's function ARNs (to wire up Lambda integrations). Terraform refuses to plan a cycle — it would have exited with an error before creating a single resource.
Similarly: compute needed the Step Functions state machine ARN, and stepfunctions needed Lambda function ARNs from compute. Another cycle.
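The two cycles can be reproduced outside Terraform by treating the modules as a dependency graph. This is only an analogy using Python's standard `graphlib`; Terraform performs its own graph analysis, and the "after" graph reflects just the dependencies described in this chapter.

```python
from graphlib import CycleError, TopologicalSorter

# Each key depends on the modules in its set.
before_fix = {
    "compute": {"api", "stepfunctions"},   # needs invoke URL and state machine ARN
    "api": {"compute"},                    # needs Lambda function ARNs
    "stepfunctions": {"compute"},          # needs Lambda function ARNs
}

# After the fixes, compute no longer reads anything from api or stepfunctions.
after_fix = {
    "compute": set(),
    "api": {"compute"},
    "stepfunctions": {"compute"},
}


def plan_order(deps):
    """Return a valid creation order, or None if the graph contains a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


print("before:", plan_order(before_fix))
print("after: ", plan_order(after_fix))
```

The first graph has no valid ordering, which mirrors why planning cannot proceed; in the second, compute comes first and the other two modules can follow in either order.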
Fixes applied:
- Computed the state machine ARN as a Terraform local using `data.aws_caller_identity`: deterministic from known values, no module dependency
- Removed the API Gateway URL from Lambda environment variables entirely; moved it to SSM Parameter Store; approval notifier Lambda now reads it at runtime (same pattern it already used for the HMAC secret)
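The first fix works because a Step Functions state machine ARN is fully determined by partition, region, account ID, and name. The original fix is a Terraform local; the Python below only illustrates the same string construction, and the region, account ID, and state machine name are made-up example values.

```python
import re


def state_machine_arn(region: str, account_id: str, name: str, partition: str = "aws") -> str:
    """Build a state machine ARN from values known before anything is created."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are 12 digits")
    return f"arn:{partition}:states:{region}:{account_id}:stateMachine:{name}"


# Example values only; in the Terraform version the account ID comes from
# data.aws_caller_identity rather than being typed in.
print(state_machine_arn("us-east-1", "123456789012", "campaign-pipeline"))
```

Since none of the inputs come from the Step Functions module, the compute module can receive the ARN without depending on it.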
Bug 3: Missing Terraform provider.
The ssm module used random_password to generate the HMAC signing secret, but hashicorp/random was not declared in backend.tf, so terraform init would have failed.
Fix: added random ~> 3.6 to required providers.
Total cost: three file changes, no rework of the underlying design. No AWS spend occurred. The code now passes a dry review; terraform plan should succeed.
The First Strategic Pivot: Questioning the Architecture
With the AWS deployment guide ready, a question came in that changed the direction of the project.
"How was Step Functions selected? Why not CrewAI, Google ADK, or LangChain?"
The honest answer from ADR-001: the 48-hour human approval gate was the deciding constraint. No Python-process-based framework (CrewAI, LangGraph, LangChain) can hold execution state for 48 hours without either keeping a process alive (impossible with Lambda's 15-minute limit) or building a polling loop. Step Functions' waitForTaskToken does this natively: the execution pauses, no compute runs, no cost accrues, and it resumes the moment someone clicks approve.
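A sketch of the resume side of that callback pattern: the handler checks a signature on the approval link, then hands the decision back to the paused execution. The signing scheme (HMAC-SHA256 over the task token) and the function and field names are assumptions for illustration. `send_task_success` and `send_task_failure` are the Step Functions client calls used to resume a task token; a fake client stands in here so the example runs without AWS.

```python
import hashlib
import hmac
import json


def sign(task_token: str, secret: bytes) -> str:
    """Hypothetical scheme: HMAC-SHA256 of the task token, hex-encoded."""
    return hmac.new(secret, task_token.encode(), hashlib.sha256).hexdigest()


def handle_approval(sfn_client, secret: bytes, task_token: str, signature: str,
                    approved: bool, reviewer: str) -> None:
    """Verify the approval link, then resume or fail the paused execution."""
    if not hmac.compare_digest(sign(task_token, secret), signature):
        raise PermissionError("invalid approval signature")
    if approved:
        sfn_client.send_task_success(
            taskToken=task_token,
            output=json.dumps({"approved": True, "reviewer": reviewer}),
        )
    else:
        sfn_client.send_task_failure(
            taskToken=task_token,
            error="ApprovalRejected",
            cause=f"rejected by {reviewer}",
        )


class FakeSFN:
    """Records calls instead of contacting AWS; swap in boto3.client("stepfunctions")."""

    def __init__(self):
        self.calls = []

    def send_task_success(self, taskToken, output):
        self.calls.append(("success", taskToken, output))

    def send_task_failure(self, taskToken, error, cause):
        self.calls.append(("failure", taskToken, error, cause))


secret = b"demo-secret"                 # in the source design this comes from SSM
token = "example-task-token"
sfn = FakeSFN()
handle_approval(sfn, secret, token, sign(token, secret), approved=True, reviewer="tim")
print(sfn.calls[0][0])
```

Nothing runs between the moment the token is issued and the moment this handler is invoked, which is the property the 48-hour gate depends on.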
Google ADK was eliminated before that question even applied. Non-negotiable #6 in the PRD was "AWS only." ADK is built for Google Cloud.
"Is agent learning and evolution required?"
No. The current system is not designed for agent learning. Agents are stateless Lambda functions with fixed system prompts. They don't learn between campaigns. The data schema was designed to support a future RAG flywheel — past campaign performance indexed and retrievable — but that's Phase 2. For the MVP, reproducibility and audit trails matter more than emergent behavior.
"If AWS is flexible and local running is the priority, which framework is best?"
This question surfaced the real strategic tension: the original architecture was optimized for production B2B SaaS. But if the primary deliverable is a documented journey that anyone can replicate, AWS Step Functions is actively hostile to that goal. The barrier to entry — AWS account, Terraform state backend, credential management, IAM roles — filters out most readers before they run a single line of code.
The answer: LangGraph.
- `pip install langgraph` and one API key is the full setup
- State machine is Python, not ASL JSON: readable, forkable, debuggable in a terminal
- Human approval gates are a few lines with `interrupt()`, not a distributed callback architecture
- The system runs entirely on a laptop
- Direct migration path to LangGraph Cloud when production scale is needed
- Native support for the RAG/memory layer when the learning flywheel is ready
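What makes a local approval gate workable is checkpointing: the pipeline state is written to disk when it pauses, so no process has to stay alive while a human decides. The sketch below is not LangGraph's API; it is a standard-library illustration of the checkpoint-then-resume idea, with hypothetical table, function, and status names. In LangGraph the equivalent would be a node that calls `interrupt()` in a graph compiled with a checkpointer.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the checkpoint table; use a file path to survive restarts."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints (thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn


def save(conn: sqlite3.Connection, thread_id: str, state: dict) -> None:
    conn.execute("INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
                 (thread_id, json.dumps(state)))
    conn.commit()


def load(conn: sqlite3.Connection, thread_id: str):
    row = conn.execute("SELECT state FROM checkpoints WHERE thread_id = ?",
                       (thread_id,)).fetchone()
    return json.loads(row[0]) if row else None


def start(conn: sqlite3.Connection, thread_id: str, brief: str) -> dict:
    """Run up to the approval gate, persist the state, and stop."""
    state = {"brief": brief, "status": "awaiting_approval"}
    save(conn, thread_id, state)
    return state


def resume(conn: sqlite3.Connection, thread_id: str, decision: str) -> dict:
    """Reload the paused state and apply the human decision."""
    state = load(conn, thread_id)
    if state is None or state["status"] != "awaiting_approval":
        raise RuntimeError("nothing to resume for this thread")
    state["status"] = "approved" if decision == "approve" else "rejected"
    save(conn, thread_id, state)
    return state


conn = open_store()
print(start(conn, "campaign-1", "Launch self-promotion campaign")["status"])
print(resume(conn, "campaign-1", "approve")["status"])
```

With a file path instead of `:memory:`, the process can exit after `start` and a later invocation can call `resume`, hours or days afterwards.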
"For people to replicate, local first is more appropriate."
Agreed. Decision made.
The Meta-Idea Crystallizes
In the same conversation, the three goals of the project came into focus:
- 1. Use agents to select the architecture and build the agentic team (the system builds itself)
- 2. Document the journey in real time so others can replicate it
- 3. The agent team runs real campaigns advertising itself — and publishes the P&L
The third point is the sharpest one. The system's proof of concept is the system using itself. It doesn't claim to run profitable campaigns — it attempts to run profitable campaigns, in public, with transparent financials, and publishes whether it succeeded or failed. The failure case is arguably the more interesting content.
This reframes what CampaignForge AI actually is: not just a SaaS product, but a media business with a SaaS attached. The documentation is the marketing. The campaigns are the proof. The P&L is the trust mechanism that converts readers into paying customers.
The agents aren't just autonomous — they have skin in the game.
The Repo Decision
One repo, one branch, everything committed.
The AWS Step Functions code is not a dead end. It's Chapter 1. "Agent 02 designed an AWS architecture, Agent 04 prepared the deployment, the human operator questioned the assumptions, and we reconsidered" is more interesting than a clean start. The git history is the documentation. Readers can see the reasoning evolve in real commits rather than a sanitized retrospective.
Tag: v0.1-aws-stepfunctions-architecture
Chapter 2 begins: LangGraph, local-first, same 11-agent architecture, same JSON contracts, same human approval gates — just running on a laptop instead of in AWS.
What Was Built in Chapter 1 (Artifacts)
| Artifact | Location | Status |
|----------|----------|--------|
| PRD-001 | prd/PRD-001-campaignforge-ai.md | Approved |
| ADR-001 | architecture/ADR-001.md | Approved |
| Agent prompts (01–11) | agents/ | Written |
| JSON schemas (brief, envelope, error, agent outputs) | schemas/v1/ | Complete |
| Lambda functions (11) | src/ | 29/29 tests passing |
| Step Functions state machine | step_functions/campaign_pipeline.asl.json | Complete |
| Terraform modules (10) | infrastructure/ | Complete, 3 bugs fixed |
| Deployment guide | Produced by Agent 04 | Ready (not yet executed) |
Open Questions for Chapter 2
- LangGraph local setup: SQLite for checkpointing (pure local) or Postgres (closer to prod)?
- Human approval gates: CLI prompt interrupt, or build a minimal web UI from day one?
- Which campaign does the agent team run first to advertise itself?
- What's the budget for the first live campaign? What does "success" look like?
- How does the content publisher format the journey — raw transcript, narrative, or both?
This document is a raw draft. Content Publisher Agent (11) will format and adapt this for LinkedIn, Medium, and other platforms as part of the first content run.