AI & Infrastructure¶
AI & Infra is the evolution of SRE/DevOps at Dapper Labs. It owns the AI infrastructure layer that every other org builds on. If it runs in production, breaks at 3am, or needs to scale, it goes through this team. Goal: zero downtime, zero friction, every engineer unblocked.
Scope: SRE, IT, Security, Data (6 people under Ben Noyce)
This page covers the production systems, AI tooling, data science infrastructure, and the intelligence architecture that ties them together.
Systems Overview¶
| System | What It Does | Status | Owner |
|---|---|---|---|
| Campaign Builder | WYSIWYG campaign editor on Atlas | Staging, 2 PRs pending | Jim Wheaton |
| Heimdall | Data science agent wrapping BigQuery | Plugin built (15 skills), daemon deployment pending | Data team / Product |
| KAAOS Daemon | Knowledge management, morning dossiers, triage | Production (CEO instance) | AI team |
| Morning Brief | Auto-generated standup replacement | Designed, not yet built | AI team |
| ReGgie | Live operations agent for CK: All The Zen | Production (~$20-30/week) | Alan Carr |
| SWE Pipeline | Autonomous coding from Linear issues | Early development | Jim Wheaton |
Campaign Builder¶
The Campaign Builder enables product and marketing teams to create campaigns on Atlas without engineering deploys. Built by Jim Wheaton.
How It Works¶
- User opens the Campaign Builder (VPN + Auth0 required)
- Selects components from the Atlas design system
- Arranges them visually — WYSIWYG editing with real-time preview
- Under the hood, the layout is stored as JSON
- `json-render` renders the JSON with actual React components on the frontend
- Changes are visible immediately (real-time population)
Why It Matters for AI¶
Because the entire page layout is JSON, an AI agent can:
- Generate campaign layouts programmatically
- A/B test different layouts by serving different JSON to different segments
- Iterate on campaign design based on performance data from Heimdall
This is the bridge between the intelligence layer (sensing what's happening) and the execution layer (doing something about it).
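Because the layout is plain JSON, "generate campaign layouts programmatically" reduces to emitting a document in the expected shape. A minimal Python sketch, assuming a hypothetical component schema (the real Atlas design-system component names and props will differ):

```python
import json

def make_hero(title, cta_label, cta_url):
    """Build a hero component entry. Component names and props here are
    illustrative, not the real Atlas design-system schema."""
    return {
        "component": "Hero",
        "props": {"title": title, "cta": {"label": cta_label, "url": cta_url}},
    }

def make_layout(components):
    """Wrap a component list in a top-level layout document."""
    return {"version": 1, "components": components}

layout = make_layout([
    make_hero("Spring Drop", "Mint now", "/mint"),
    {"component": "CardGrid", "props": {"collection": "spring-2026"}},
])

# Serialized for storage; json-render hydrates this into React components.
payload = json.dumps(layout, indent=2)
```

An agent running an A/B test would emit two such payloads and serve each to a different segment.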
Current Status¶
- Working on Atlas staging
- Demoed to Leon Li, Spencer Bogad, Jordan Wilberding
- 2 PRs pending for v1 (FE: `atlas-app#761`, BE: `atlas-api#632`)
- VPN-gated on staging and production
Heimdall — Data Science Agent¶
Heimdall is a Claude Code plugin with 15 skills that wraps BigQuery for analytical work. It adds intelligence on top of raw query execution: question decomposition, statistical reasoning, analytical guardrails, persistent memory, and proactive monitoring.
Skills¶
| Tier | Skill | Function |
|---|---|---|
| 0 (cross-cutting) | `data-guardrails` | 20 pitfalls as executable checks. Fires on every data claim. |
| 1 (seconds) | `quick-answer` | Canonical numbers from insight graph. Routes to deeper skills when not found. |
| 2 (minutes) | `analyze` | 20 parameterized query templates. Auto-caveats. |
| 3 (hours) | `investigate` | Multi-wave analysis (5-8 queries, 2-3 waves). Matched-control for causal claims. |
| 2-3 | `simulate` | Impact projection with historical analogs and confidence bands. |
| 1-3 | `health-scan` | Product health monitoring. Quick mode: ~20 metrics x 3 products, traffic-light dashboard. Exhaustive mode: deep dives + insight graph lint. |
| 3 | `opportunity` | Proactive discovery — OKR-aligned + unknown unknowns. |
| Meta | `benchmark` | 25 known-answer test cases. Regression detection. |
| Meta | `track` | Measurement plan lifecycle — register predictions, checkpoints, retrospectives. |
| Meta | `verify` | Independent audit + adversarial debate layer. |
| Per-user | `user-score` | BQML per-user churn/upgrade scoring for full population. |
| Per-user | `whale-watch` | LLM narrative whale assessment for top ~108 XL users. |
The Insight Graph¶
Heimdall's persistent knowledge store. Unlike conversation history that disappears, the insight graph compounds:
```
research-reports/data-science-insights/
    index.md                  # Catalog of all findings
    log.md                    # Activity log
    findings/                 # 22 seed pages + auto-investigation findings
    health-scans/             # Scan reports + benchmark log
    opportunities/            # CDO-quality opportunity reports
    tracking/                 # Measurement plan lifecycle pages
    numbers/
        canonical-numbers.md  # 100+ verified entries
        superseded.md         # Retired numbers + evolution log
        blacklist.md          # 6 known-wrong numbers
```
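The canonical-numbers pattern is simple enough to sketch. A hypothetical entry with staleness checking in Python (the field names, value, and 30-day window are illustrative, not the real schema):

```python
from datetime import date, timedelta

# Hypothetical in-memory shape of a canonical-numbers.md entry;
# the value and max age are illustrative, not real figures.
ENTRY = {
    "metric": "weekly_active_users",
    "value": 12_400,
    "verified_on": date(2026, 4, 1),
    "max_age_days": 30,
}

def is_stale(entry, today):
    """A canonical number goes stale once its verification date exceeds
    the entry's max age; stale entries get re-verified or retired to
    superseded.md."""
    return today - entry["verified_on"] > timedelta(days=entry["max_age_days"])
```

This is what lets `quick-answer` trust a cached number instead of re-querying BigQuery on every question.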
Daemon Architecture (Target)¶
Heimdall as a persistent daemon on a GCP VM, running on a 30-minute cycle:
- INGEST — pull latest metrics, compare against wiki baselines, detect anomalies
- ANALYZE — auto-investigate any detected anomaly, cross-reference wiki history
- SYNTHESIZE — weekly summaries, monthly baseline updates, quarterly segment refreshes
- LINT — nightly check for contradictions, stale baselines, data source connectivity
Blocker: BQ credentials on the VM. David Wang has agreed to support; deployment scripted and waiting.
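The four-phase cycle can be sketched as a dispatch function deciding what each 30-minute tick runs. A Python sketch with illustrative scheduling windows (the real daemon's nightly and weekly slots may differ):

```python
import datetime

def phases_for(now: datetime.datetime) -> list[str]:
    """Decide which phases a 30-minute tick runs. INGEST and ANALYZE run
    every tick; LINT gets a nightly window; the weekly/monthly/quarterly
    SYNTHESIZE cadences are collapsed into one Monday-morning slot here."""
    phases = ["INGEST", "ANALYZE"]
    if now.hour == 2 and now.minute < 30:                 # nightly lint
        phases.append("LINT")
    if now.weekday() == 0 and now.hour == 6 and now.minute < 30:
        phases.append("SYNTHESIZE")                       # weekly summary
    return phases
```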
KAAOS Daemon¶
The Knowledge Amplification and Autonomous Operations System. Runs on a GCP VM as cron jobs executing headless Claude Code sessions.
What It Does¶
- Hourly monitoring — scans Slack, Linear, GitHub for organizational state changes
- Morning dossier — daily 4:03am PT comprehensive briefing compiled from all sources
- Knowledge management — maintains a persistent knowledge base (`kaaos-knowledge/`) that compounds across sessions
- Triage — prioritizes incoming information and surfaces what needs attention
Architecture¶
```
GCP VM (kaaos-daemon)
├── Cron: hourly monitoring (lightweight model)
├── Cron: daily morning dossier (reasoning model, 4:03am PT)
├── MCP connections:
│   ├── Slack (read channels, send messages)
│   ├── Google Drive (read/write docs)
│   ├── Linear (project tracking)
│   └── GitHub (PR/commit activity)
└── Persistent knowledge base:
    └── kaaos-knowledge/ (git repo)
```
Cost: ~$50-80/day
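The cron-driven headless pattern fits in a few lines. A Python sketch of one stateless tick, using only the `claude -p` invocation the daemon pattern is built on (the prompt path and supervisor wiring are hypothetical):

```python
import pathlib

# Illustrative crontab schedule for the two jobs:
#   0 * * * *   hourly monitoring tick (lightweight model)
#   3 4 * * *   morning dossier at 4:03am PT (reasoning model)

def build_tick_command(prompt_file: str) -> list[str]:
    """Each cron tick launches a fresh headless Claude Code session via
    `claude -p`. The prompt file on disk carries the smarts; the session
    itself is stateless and writes results into kaaos-knowledge/."""
    prompt = pathlib.Path(prompt_file).read_text()
    return ["claude", "-p", prompt]

# A supervisor would hand this to subprocess.run() and log the exit code.
```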
Morning Brief System¶
Designed to replace the daily standup. The system reports to the person, not the person to management.
Five Sections¶
| Section | Source | Human Effort |
|---|---|---|
| SHIPPED | Linear (completed), GitHub (merged PRs), Slack (shipped mentions) | Confirm accuracy |
| METRICS MOVED | Heimdall, BigQuery semantic layer | Flag misleading moves |
| LEARNED | AI-proposed from observed loop closures | Validate or correct (30-60 sec) |
| SHIPPING NEXT | Linear (in-progress), calendar, campaign tools | Correct priority |
| BLOCKERS | Human-written | Only section requiring generation |
Build Sequence¶
| Phase | Scope | Timeline |
|---|---|---|
| 1 | SHIPPED + METRICS MOVED + SHIPPING NEXT (auto-generated) | 2 weeks |
| 2 | LEARNED section (loop-closure detection + interpretation) | 6 weeks |
| 3 | Twice-monthly synthesis + feedback loop + per-person config | 10 weeks |
Cadence¶
- Daily when agents are active and producing output
- Event-driven when agents are sparse (no empty briefs)
- Minimum: if no brief in 5 business days, one line: "No loops closed this week"
- Twice-monthly synthesis on 1st and 15th for leadership
Status: Designed (April 8, 2026). Not yet built.
ReGgie — Live Operations Agent (Production)¶
The most mature proof-of-concept for autonomous operations. Runs on CryptoKitties: All The Zen. Built by Alan Carr.
Architecture¶
Three cron jobs running headless Claude Code (claude -p) on a GCP VM:
| Schedule | Model | Function |
|---|---|---|
| Every 20 min | Sonnet | Game state monitoring — queries Supabase, detects anomalies, auto-investigates before alerting |
| Every hour | Sonnet | Community pulse — reads Slack + Telegram, classifies messages, responds in character ("Felis") |
| Twice daily | Opus | Deep analysis — canonical metrics, player profiles, daily digest |
MCP connections: Supabase, Slack, Telegram (configured in .mcp.json)
Knowledge pattern: Persistent insights directory on disk. Each cron session is stateless; files are the memory. Canonical numbers with staleness tracking, investigation findings with provenance, player profiles with behavioral notes.
Cost: ~$20-30/week
Key Learnings¶
- Prompt files over conversation context. Each cron tick is a fresh session. Smarts live in prompt + insights directory.
- Auto-investigate before alerting. One extra query turns "metric spiked" into "metric spiked because X."
- Community management is the killer app. Response time goes from "whenever a human checks" to "within the hour."
- Sonnet for frequency, Opus for depth. Don't run expensive models on every tick.
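The auto-investigate pattern deserves pinning down, since it is the difference between a useful alert and a noisy one. A Python sketch with stubbed query functions standing in for Supabase calls (the 25% threshold is illustrative):

```python
def monitor_tick(read_metric, investigate):
    """One monitoring tick: compare a metric to its baseline, and on an
    anomaly spend one extra query on the likely cause before alerting,
    so the alert reads "spiked because X" rather than just "spiked".
    read_metric and investigate stand in for Supabase queries."""
    current, baseline = read_metric()
    if abs(current - baseline) / baseline < 0.25:   # illustrative threshold
        return None                                  # nothing to report
    cause = investigate(current, baseline)           # the one extra query
    return f"metric moved {current} vs baseline {baseline}: {cause}"
```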
The Intelligence Architecture¶
All of these systems are components of a broader intelligence architecture. The target is a company-wide system that senses what's happening, understands what it means, decides what to do, builds the response, and runs it in production.
Five Layers¶
```mermaid
graph TD
    S[Sensing Layer] -->|raw signals| C[Context Layer]
    C -->|enriched understanding| E[Evaluation Layer]
    E -->|decisions + plans| X[Execution Layer]
    X -->|deployed changes| O[Operations Layer]
    O -->|measured outcomes| S
```
| Layer | Function | Current Implementation |
|---|---|---|
| Sensing | Detect what's happening — metrics, anomalies, signals | Heimdall health scans, ReGgie monitoring |
| Context | Understand what it means — market research, competitive intel, organizational state | KAAOS knowledge base, Slack/Linear integration |
| Evaluation | Decide what to do — synthesize signals, compose responses, recommend actions | Not yet unified. Fragments in CPO plugin, Delphi plugin. |
| Execution | Build and deploy — campaigns, code, content | Campaign Builder (staging), SWE Pipeline (early) |
| Operations | Manage live products — community, marketing, targeting, support | ReGgie on CK:ATZ (production) |
Loop Levels (Canonical -- 5 Levels)¶
Every loop in the system operates at a level that graduates based on demonstrated competence:
| Level | The Person Does... | The Loop Does... | Graduation Signal |
|---|---|---|---|
| L1 | You do the work | Assists | Can describe "good." Knows failure modes. |
| L2 | You and the loop collaborate | Drafts, you refine | Minor edits >80% of the time |
| L3 | Loop works, you review daily | Plans and executes, escalates exceptions | Intervenes on <20% of outputs |
| L4 | Loop runs, you review weekly | Full cycle autonomously | Decisions match human's >90% |
| L5 | Fully autonomous, you audit monthly | Self-improving | Sustained performance, no drift |
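The quantitative graduation signals can be expressed as simple threshold checks. A Python sketch (the signal keys are illustrative; L1 and L5 signals are qualitative and stay with human review):

```python
def qualifies_for_next(level: int, signals: dict) -> bool:
    """Check a loop's graduation signal at its current level, using the
    thresholds from the table. Only the quantitative signals (L2-L4)
    are encoded here. Rates are fractions over a review window; real
    graduation would also require sustained history, not one snapshot."""
    if level == 2:
        return signals["minor_edit_rate"] > 0.80
    if level == 3:
        return signals["intervention_rate"] < 0.20
    if level == 4:
        return signals["decision_match_rate"] > 0.90
    return False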
Six Agents (Target Architecture)¶
| Agent | Function | Current State |
|---|---|---|
| Analytics (Heimdall) | Quantitative sensing — metrics, anomalies, opportunities | Plugin built, daemon pending |
| Research (Frigg) | Qualitative sensing — market, competitors, customer empathy | Fragments (X search MCP) |
| Intelligence | Synthesis — composes capabilities into solutions | Does not exist as unified agent |
| Build (Valkyrie) | Execution — campaigns, code, deploys | Campaign Builder on staging, SWE pipeline early |
| Live Ops (Loki) | Operations — community, marketing, support | ReGgie on CK:ATZ (production) |
| Personal Orchestrator (Thor) | Your interface to the system — triage, orchestration, morning digest | CEO instance mature (KAAOS) |
Three Learning Loops (Target)¶
The mechanism that makes the system get smarter, not just bigger:
- Generation Loop (minutes) — generate N variants of any task, score against a matrix, present the best. Not yet implemented.
- Preference Loop (days/weeks) — when a human picks variant 3 over variant 1, the scoring matrix learns. Override tracking exists in Heimdall but doesn't feed back into scoring.
- Impact Loop (weeks/months) — market outcomes calibrate the scoring matrix against reality. Track skill captures predictions vs outcomes, but feedback doesn't flow back to modify generation.
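The first two loops can be sketched together: generate-and-score, plus a weight update when a human overrides the ranking. A Python sketch assuming a hypothetical scoring matrix of (criterion, weight) pairs:

```python
def generation_loop(task, generate, score_matrix, n=5):
    """Generation loop: produce n variants, score each against a
    weighted criteria matrix, return them best-first for human review."""
    variants = [generate(task, i) for i in range(n)]
    def score(v):
        return sum(weight * criterion(v) for criterion, weight in score_matrix)
    return sorted(variants, key=score, reverse=True)

def record_preference(score_matrix, chosen, rejected, lr=0.1):
    """Preference loop: when a human picks a lower-ranked variant, nudge
    each weight toward the criteria where the chosen variant won."""
    return [(c, w + lr * (c(chosen) - c(rejected))) for c, w in score_matrix]
```

The impact loop would close the remaining gap by recalibrating the same weights against measured market outcomes rather than human picks.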
AI & Infra Team¶
| Person | Focus |
|---|---|
| Ben Noyce | Team lead, AI infrastructure, SRE |
| Jim Wheaton | Campaign Builder, SWE pipeline |
| David Wang | Data engineering, BigQuery, SRE |
| Jackson Foley | AI infrastructure |
Separation from Sandbox¶
Alan Carr moved to Octopus Rodeo (sandbox experiments: CryptoKitties, Miquela). Riptide (agent infrastructure) is a separate sandbox under Jan Bernatik & Navid TehraniFar. AI & Infra is production infrastructure, not research.
Repositories¶
| Repo | Contents |
|---|---|
| `dapperlabs/dapper-ai` | SWE pipeline |
| `dapperlabs/atlas-app` | Atlas frontend (Campaign Builder FE) |
| `dapperlabs/atlas-api` | Atlas backend (Campaign Builder BE) |
This section needs enrichment from the engineering team
Detailed deployment procedures, MCP configuration guides, agent prompt specifications, and SWE pipeline architecture should be documented by the AI & Infra team. The intelligence architecture (six agents, three learning loops) is the target design, not current state.