Building an AI Factory

Beyond the laptop, beyond the subscription

Stanford WebCamp 2026 · May 1, 2026

AI coding was working pretty well
six months ago.

Why does the industry act like
you need the newest model
for every problem?

Stop chasing the model.

Engineer around the one you have.

One developer.
One laptop.
One subscription.

A ceiling, not a strategy.

The Factory Model

  • Agents off the laptop
  • CI/CD as orchestrator
  • Event-driven gates

Forge

A factory in a CLI

Forge pipeline complete: 6 stages done, 44 files generated

YAML in. Production code out.

Go service · tests · Terraform · K8s/KRO · CI · docs

Plain English in.

forge new prompt asking 'describe your service' with plain-English SQS/Kafka/RDS description

The architect asks
until it's unambiguous.

Forge refining spec: clarifying questions about deduplication key, retention window, throughput

Six stages. Six agents.

Architect → Implementer → Infra → CI/CD → Quality → Docs

Forge pipeline running mid-flight: architect/implementer/infra DONE, cicd RUNNING
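
A minimal sketch of the six-stage sequence above, assuming each agent is a plain function that takes the previous stage's artifact and returns the next one. The stage names come from the slide; `run_pipeline`, `Status`, and the stub agents are hypothetical and are not Forge's actual code.

```python
from enum import Enum
from typing import Callable

# Stage order as listed on the slide.
STAGES = ["architect", "implementer", "infra", "cicd", "quality", "docs"]


class Status(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    DONE = "DONE"


def run_pipeline(spec: dict, agents: dict[str, Callable[[dict], dict]]):
    """Run every stage in order, feeding each output to the next stage."""
    status = {name: Status.PENDING for name in STAGES}
    artifact = spec
    for name in STAGES:
        status[name] = Status.RUNNING
        artifact = agents[name](artifact)
        status[name] = Status.DONE
    return artifact, status


def make_stub(name: str) -> Callable[[dict], dict]:
    """Hypothetical stand-in agent: records that the stage ran."""
    return lambda artifact: {**artifact, "trail": artifact["trail"] + [name]}


stub_agents = {name: make_stub(name) for name in STAGES}
result, final_status = run_pipeline({"trail": []}, stub_agents)
print(" -> ".join(result["trail"]))
```

A real run would replace each stub with a model call; the sequencing and status bookkeeping stay the same.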

Direct Anthropic API.

NOT the Claude Code subscription.

1. The CLI is the control plane.

Your laptop IS the orchestrator.

2. Schema-validated handoffs.

The CLI ships its own contracts.
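
One way a schema-validated handoff could look, sketched with the standard library only. The contract fields echo the architect's clarifying questions (deduplication key, retention window, throughput), but the field names, `ARCHITECT_CONTRACT`, and `validate_handoff` are assumptions for illustration. A real tool might use JSON Schema instead of this hand-rolled check.

```python
import json

# Hypothetical contract: required field name -> expected Python type.
ARCHITECT_CONTRACT = {
    "service_name": str,
    "dedup_key": str,
    "retention_days": int,
    "throughput_rps": int,
}


def validate_handoff(raw: str, contract: dict[str, type]) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value must be an object"]
    errors = []
    for field, expected in contract.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    return errors


good = '{"service_name": "orders", "dedup_key": "order_id", "retention_days": 7, "throughput_rps": 500}'
bad = '{"service_name": "orders", "retention_days": "a week"}'
print(validate_handoff(good, ARCHITECT_CONTRACT))
print(validate_handoff(bad, ARCHITECT_CONTRACT))
```

The next stage only runs when the error list is empty, so a malformed output never crosses a stage boundary.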

3. Self-healing is its own agent.

Recovery is agentic too.

Self-healing in action

Forge architect stage in FIXING state: schema validation failed, output-fixer engaged to repair malformed output

architect · FIXING · output-fixer engaged
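
A sketch of the recovery loop the FIXING state implies, assuming the fixer is a separate callable that receives the bad output plus the validation errors. `run_with_fixer`, the bounded retry count, and the toy fixer below are hypothetical; in the real tool the fixer would be another agent, not a string patch.

```python
import json


def run_with_fixer(produce, validate, fix, max_fixes: int = 2):
    """Run a stage; on validation failure hand the output to a fixer."""
    states = ["RUNNING"]
    output = produce()
    errors = validate(output)
    fixes = 0
    while errors and fixes < max_fixes:
        states.append("FIXING")
        output = fix(output, errors)
        errors = validate(output)
        fixes += 1
    states.append("FAILED" if errors else "DONE")
    return output, states


def produce_malformed() -> str:
    # Simulates a model response cut off before the closing brace.
    return '{"service_name": "orders"'


def validate_json(raw: str) -> list[str]:
    try:
        json.loads(raw)
        return []
    except json.JSONDecodeError as exc:
        return [exc.msg]


def toy_fixer(raw: str, errors: list[str]) -> str:
    # Stand-in for the output-fixer agent: closes the object.
    return raw + "}"


fixed, seen_states = run_with_fixer(produce_malformed, validate_json, toy_fixer)
print(seen_states)
```

Capping the number of fix attempts keeps a stubbornly bad output from burning tokens indefinitely.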

4. Patterns baked in.

Killed pattern-judge. Used examples.

Fewer moving parts.

5. Wiring is testable.

--dry-run

Scaffolding has its own test suite.
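
A sketch of how a `--dry-run` flag can make the wiring testable without spending tokens: the flag swaps the model call for a stub, so stage order and plumbing can be asserted in an ordinary test. Only the flag name comes from the slide; the parser, `call_model`, and `stub_model` are hypothetical.

```python
import argparse

STAGES = ["architect", "implementer", "infra", "cicd", "quality", "docs"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="factory-sketch")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="exercise the pipeline wiring without calling a model",
    )
    return parser


def call_model(stage: str, prompt: str) -> str:
    # Placeholder for a real API call; deliberately unavailable here.
    raise RuntimeError("live model calls are not part of this sketch")


def stub_model(stage: str, prompt: str) -> str:
    return f"[stub output for {stage}]"


def run(argv: list[str]) -> list[str]:
    args = build_parser().parse_args(argv)
    model = stub_model if args.dry_run else call_model
    return [model(stage, "spec") for stage in STAGES]


outputs = run(["--dry-run"])
print(len(outputs), "stages wired")
```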

6. Token usage is visible.

Per stage. Per chunk. Real time.
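
A sketch of per-stage token accounting, assuming each streamed chunk reports its own input and output token counts. `TokenMeter` and the numbers fed to it are hypothetical; the point is that a running total per stage is a few lines of bookkeeping.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0


class TokenMeter:
    """Accumulates token counts per stage as chunks arrive."""

    def __init__(self) -> None:
        self.by_stage: dict[str, Usage] = defaultdict(Usage)

    def record(self, stage: str, input_tokens: int = 0, output_tokens: int = 0) -> None:
        usage = self.by_stage[stage]
        usage.input_tokens += input_tokens
        usage.output_tokens += output_tokens

    def report(self) -> list[str]:
        return [
            f"{stage}: in={u.input_tokens} out={u.output_tokens}"
            for stage, u in self.by_stage.items()
        ]


meter = TokenMeter()
meter.record("architect", input_tokens=1200)   # prompt sent
meter.record("architect", output_tokens=300)   # first chunk
meter.record("architect", output_tokens=450)   # second chunk
meter.record("implementer", input_tokens=2000, output_tokens=900)
for line in meter.report():
    print(line)
```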

No platform required.

  • Control plane
  • Contracts
  • Recovery

The factory is itself a system.

Same discipline as production.

Use expensive intelligence
to build contracts.

Use cheap intelligence
to honor them.

Forge: built with Opus (Claude Max, 2.5h).
Runs on direct API. Downshifts to Haiku.

Subsidies are ending.

Copilot → usage-based

June 1, 2026

Token-metered. Monthly credits. Overage billed.

Token budget = invoice line item.

The invoice arrives in 30 days.

What this means

  • Monitor your usage.
  • Cost per unit of work is a metric.
  • The factory is the survival strategy.

sniffly: top for tokens

sniffly dashboard showing 250.4K input / 4.2M output tokens and $1,203.71 in direct API cost over 16 days
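
A small worked example of treating cost as a metric, using the dashboard totals above ($1,203.71 over 16 days). The helper `cost_per_unit` is hypothetical, and "day" stands in for the unit of work here; swapping in merged PRs or generated services would need a count the slide does not give.

```python
def cost_per_unit(total_cost_usd: float, units: int) -> float:
    """Average spend per unit of work (a day, a PR, a pipeline run)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost_usd / units


TOTAL_COST_USD = 1203.71  # from the dashboard screenshot
DAYS = 16                 # from the dashboard screenshot

daily = cost_per_unit(TOTAL_COST_USD, DAYS)
print(f"${daily:.2f} per day")
```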

Be constrained by your budget

Not by hourly rate limits.

"Building software still demands discipline, but the discipline shows up more in the scaffolding than the code."

OpenAI, Harness Engineering

Thanks.

Mark Ferree · Stanford WebCamp 2026