AI & Infrastructure

REFERENCE | DERIVED | Updated 2026-04-09 | Owner: AI & Infra (Ben Noyce)

AI & Infra is the evolution of SRE/DevOps at Dapper Labs. It owns the AI infrastructure layer that every other org builds on. If it runs in production, breaks at 3am, or needs to scale, it goes through this team. Goal: zero downtime, zero friction, every engineer unblocked.

Scope: SRE -- IT -- Security -- Data (6 people under Ben Noyce)

This page covers the production systems, AI tooling, data science infrastructure, and the intelligence architecture that ties them together.

Systems Overview

| System | What It Does | Status | Owner |
| --- | --- | --- | --- |
| Campaign Builder | WYSIWYG campaign editor on Atlas | Staging, 2 PRs pending | Jim Wheaton |
| Heimdall | Data science agent wrapping BigQuery | Plugin built (15 skills), daemon deployment pending | Data team / Product |
| KAAOS Daemon | Knowledge management, morning dossiers, triage | Production (CEO instance) | AI team |
| Morning Brief | Auto-generated standup replacement | Designed, not yet built | AI team |
| ReGgie | Live operations agent for CK: All The Zen | Production (~$20-30/week) | Alan Carr |
| SWE Pipeline | Autonomous coding from Linear issues | Early development | Jim Wheaton |

Campaign Builder

The Campaign Builder enables product and marketing teams to create campaigns on Atlas without engineering deploys. Built by Jim Wheaton.

How It Works

  1. User opens the Campaign Builder (VPN + Auth0 required)
  2. Selects components from the Atlas design system
  3. Arranges them visually — WYSIWYG editing with real-time preview
  4. Under the hood, the layout is stored as JSON
  5. json-render renders the JSON with actual React components on the frontend
  6. Changes are visible immediately (real-time population)
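
The steps above can be sketched in miniature. Everything here is hypothetical: the component names, props schema, and toy renderer stand in for the real Atlas design system and json-render, which this page does not specify.

```python
import json

# Hypothetical layout document, as the Campaign Builder might store it.
layout_json = """
{
  "campaign": "spring-promo",
  "components": [
    {"type": "Hero", "props": {"title": "Spring Promo", "cta": "Join now"}},
    {"type": "CardGrid", "props": {"columns": 3}}
  ]
}
"""

# Toy stand-in for json-render: map component types to render functions.
def render_hero(props):
    return f"<h1>{props['title']}</h1><button>{props['cta']}</button>"

def render_card_grid(props):
    return f"<div class='grid cols-{props['columns']}'></div>"

REGISTRY = {"Hero": render_hero, "CardGrid": render_card_grid}

def render(doc: dict) -> str:
    # Walk the component list and dispatch each node to its renderer.
    return "".join(REGISTRY[c["type"]](c["props"]) for c in doc["components"])

html = render(json.loads(layout_json))
```

The point is the shape, not the code: because the page is data, anything that can write JSON (a human in the editor, or an agent) can produce a deployable layout.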

Why It Matters for AI

Because the entire page layout is JSON, an AI agent can:

  • Generate campaign layouts programmatically
  • A/B test different layouts by serving different JSON to different segments
  • Iterate on campaign design based on performance data from Heimdall
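
Because each variant is just a JSON document, A/B serving can reduce to deterministic bucketing. A sketch, with hypothetical variant documents and a 50/50 split:

```python
import hashlib

# Hypothetical layout variants (same schema, different content).
LAYOUT_VARIANTS = {
    "control":   {"components": [{"type": "Hero", "props": {"title": "Classic"}}]},
    "treatment": {"components": [{"type": "Hero", "props": {"title": "New look"}}]},
}

def assign_variant(user_id: str, split: float = 0.5) -> str:
    # Stable hash so a given user always lands in the same bucket.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 1000
    return "treatment" if bucket < split * 1000 else "control"

def layout_for(user_id: str) -> dict:
    return LAYOUT_VARIANTS[assign_variant(user_id)]
```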

This is the bridge between the intelligence layer (sensing what's happening) and the execution layer (doing something about it).

Current Status

  • Working on Atlas staging
  • Demoed to Leon Li, Spencer Bogad, Jordan Wilberding
  • 2 PRs pending for v1 (FE: atlas-app#761, BE: atlas-api#632)
  • VPN-gated on staging and production

Heimdall — Data Science Agent

Heimdall is a Claude Code plugin with 15 skills that wraps BigQuery for analytical work. It adds intelligence on top of raw query execution: question decomposition, statistical reasoning, analytical guardrails, persistent memory, and proactive monitoring.

Skills

| Tier | Skill | Function |
| --- | --- | --- |
| 0 (cross-cutting) | data-guardrails | 20 pitfalls as executable checks. Fires on every data claim. |
| 1 (seconds) | quick-answer | Canonical numbers from insight graph. Routes to deeper skills when not found. |
| 2 (minutes) | analyze | 20 parameterized query templates. Auto-caveats. |
| 3 (hours) | investigate | Multi-wave analysis (5-8 queries, 2-3 waves). Matched-control for causal claims. |
| 2-3 | simulate | Impact projection with historical analogs and confidence bands. |
| 1-3 | health-scan | Product health monitoring. Quick mode: ~20 metrics x 3 products, traffic-light dashboard. Exhaustive mode: deep dives + insight graph lint. |
| 3 | opportunity | Proactive discovery — OKR-aligned + unknown unknowns. |
| Meta | benchmark | 25 known-answer test cases. Regression detection. |
| Meta | track | Measurement plan lifecycle — register predictions, checkpoints, retrospectives. |
| Meta | verify | Independent audit + adversarial debate layer. |
| Per-user | user-score | BQML per-user churn/upgrade scoring for full population. |
| Per-user | whale-watch | LLM narrative whale assessment for top ~108 XL users. |
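
The "pitfalls as executable checks" idea can be illustrated with predicate functions that annotate every claim before it ships. The check names and thresholds below are illustrative, not Heimdall's actual guardrail list:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    metric: str
    value: float
    sample_size: int
    window_days: int
    notes: list = field(default_factory=list)

# Each guardrail returns a caveat string, or None if the claim passes.
def small_sample(c):
    return f"n={c.sample_size} is small; treat as directional" if c.sample_size < 100 else None

def short_window(c):
    return f"{c.window_days}-day window may reflect noise" if c.window_days < 7 else None

GUARDRAILS = [small_sample, short_window]

def run_guardrails(claim: Claim) -> Claim:
    # Attach every triggered caveat to the claim before it is reported.
    claim.notes = [msg for g in GUARDRAILS if (msg := g(claim))]
    return claim

checked = run_guardrails(Claim("d1_retention", 0.42, sample_size=80, window_days=3))
```

The design choice worth noting: guardrails fire on every claim rather than relying on the analyst (or agent) to remember them.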

The Insight Graph

Heimdall's persistent knowledge store. Unlike conversation history that disappears, the insight graph compounds:

```
research-reports/data-science-insights/
  index.md                    # Catalog of all findings
  log.md                      # Activity log
  findings/                   # 22 seed pages + auto-investigation findings
  health-scans/               # Scan reports + benchmark log
  opportunities/              # CDO-quality opportunity reports
  tracking/                   # Measurement plan lifecycle pages
  numbers/
    canonical-numbers.md      # 100+ verified entries
    superseded.md             # Retired numbers + evolution log
    blacklist.md              # 6 known-wrong numbers
```
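
A lookup against the numbers/ store might behave like this sketch. The in-memory dicts, field names, and 90-day staleness threshold are assumptions; the real store is markdown files:

```python
from datetime import date

# Hypothetical in-memory view of numbers/ (the real store is markdown).
CANONICAL = {
    "weekly_active_wallets": {"value": 125_000, "as_of": date(2026, 3, 1)},
}
BLACKLIST = {"total_users_2024"}  # known-wrong numbers, never serve

def lookup(name: str, today: date, max_age_days: int = 90):
    if name in BLACKLIST:
        raise ValueError(f"{name} is blacklisted as known-wrong")
    entry = CANONICAL.get(name)
    if entry is None:
        return None, "miss: route to a deeper skill"
    age = (today - entry["as_of"]).days
    status = "stale: re-verify" if age > max_age_days else "fresh"
    return entry["value"], status
```

This mirrors the quick-answer skill's contract: serve a verified number fast, refuse blacklisted ones, and route misses to deeper skills.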

Daemon Architecture (Target)

Heimdall as a persistent daemon on a GCP VM, running on a 30-minute cycle:

  1. INGEST — pull latest metrics, compare against wiki baselines, detect anomalies
  2. ANALYZE — auto-investigate any detected anomaly, cross-reference wiki history
  3. SYNTHESIZE — weekly summaries, monthly baseline updates, quarterly segment refreshes
  4. LINT — nightly check for contradictions, stale baselines, data source connectivity
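
The cycle can be pictured as a dispatcher that decides which phases run on each 30-minute tick. The nightly and weekly trigger times below are assumptions; the real schedule lives in the deployment scripts:

```python
from datetime import datetime

def tasks_for_tick(now: datetime) -> list[str]:
    """Decide which daemon phases run on a given 30-minute tick (a sketch;
    trigger times are hypothetical)."""
    tasks = ["ingest"]                       # every tick: pull metrics, diff baselines
    # analyze would be triggered by anomalies found during ingest
    if now.hour == 2 and now.minute == 0:
        tasks.append("lint")                 # nightly consistency check
    if now.weekday() == 0 and now.hour == 6 and now.minute == 0:
        tasks.append("synthesize")           # weekly summary on Monday morning
    return tasks
```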

Blocker: BQ credentials on the VM. David Wang has agreed to support; the deployment is scripted and waiting on access.

KAAOS Daemon

The Knowledge Amplification and Autonomous Operations System. Runs on a GCP VM as cron jobs executing headless Claude Code sessions.

What It Does

  • Hourly monitoring — scans Slack, Linear, GitHub for organizational state changes
  • Morning dossier — comprehensive daily briefing at 4:03am PT, compiled from all sources
  • Knowledge management — maintains a persistent knowledge base (kaaos-knowledge/) that compounds across sessions
  • Triage — prioritizes incoming information and surfaces what needs attention

Architecture

```
GCP VM (kaaos-daemon)
  ├── Cron: hourly monitoring (lightweight model)
  ├── Cron: daily morning dossier (reasoning model, 4:03am PT)
  ├── MCP connections:
  │   ├── Slack (read channels, send messages)
  │   ├── Google Drive (read/write docs)
  │   ├── Linear (project tracking)
  │   └── GitHub (PR/commit activity)
  └── Persistent knowledge base:
      └── kaaos-knowledge/ (git repo)
```
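
In crontab form, the two scheduled jobs might look like this. Paths, prompt files, log locations, and model aliases are illustrative, not the actual configuration, and the 4:03am entry assumes the VM clock is set to PT:

```
# Hourly monitoring with a lightweight model (illustrative entry)
0 * * * *  cd /opt/kaaos && claude -p "$(cat prompts/monitor.md)" --model sonnet >> logs/monitor.log 2>&1

# Daily morning dossier at 4:03am with a reasoning model (illustrative entry)
3 4 * * *  cd /opt/kaaos && claude -p "$(cat prompts/dossier.md)" --model opus >> logs/dossier.log 2>&1
```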

Cost: ~$50-80/day

Morning Brief System

Designed to replace the daily standup. The system reports to the person, not the person to management.

Five Sections

| Section | Source | Human Effort |
| --- | --- | --- |
| SHIPPED | Linear (completed), GitHub (merged PRs), Slack (shipped mentions) | Confirm accuracy |
| METRICS MOVED | Heimdall, BigQuery semantic layer | Flag misleading moves |
| LEARNED | AI-proposed from observed loop closures | Validate or correct (30-60 sec) |
| SHIPPING NEXT | Linear (in-progress), calendar, campaign tools | Correct priority |
| BLOCKERS | Human-written | Only section requiring human generation |

Build Sequence

| Phase | Scope | Timeline |
| --- | --- | --- |
| 1 | SHIPPED + METRICS MOVED + SHIPPING NEXT (auto-generated) | 2 weeks |
| 2 | LEARNED section (loop-closure detection + interpretation) | 6 weeks |
| 3 | Twice-monthly synthesis + feedback loop + per-person config | 10 weeks |

Cadence

  • Daily when agents are active and producing output
  • Event-driven when agents are sparse (no empty briefs)
  • Minimum: if no brief in 5 business days, one line: "No loops closed this week"
  • Twice-monthly synthesis on 1st and 15th for leadership
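
The cadence rules can be expressed as a small decision function. This is an interpretation of the rules above, not the built system:

```python
def should_send_brief(agent_outputs_today: int, days_since_last_brief: int) -> str:
    """Apply the cadence rules: daily when agents produced output,
    silence otherwise, with a 5-business-day floor (a sketch)."""
    if agent_outputs_today > 0:
        return "daily brief"
    if days_since_last_brief >= 5:
        return "minimum brief: 'No loops closed this week'"
    return "skip (no empty briefs)"
```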

Status: Designed (April 8, 2026). Not yet built.

ReGgie — Live Operations Agent (Production)

The most mature proof-of-concept for autonomous operations. Runs on CryptoKitties: All The Zen. Built by Alan Carr.

Architecture

Three cron jobs running headless Claude Code (`claude -p`) on a GCP VM:

| Schedule | Model | Function |
| --- | --- | --- |
| Every 20 min | Sonnet | Game state monitoring — queries Supabase, detects anomalies, auto-investigates before alerting |
| Every hour | Sonnet | Community pulse — reads Slack + Telegram, classifies messages, responds in character ("Felis") |
| Twice daily | Opus | Deep analysis — canonical metrics, player profiles, daily digest |

MCP connections: Supabase, Slack, Telegram (configured in `.mcp.json`)

Knowledge pattern: Persistent insights directory on disk. Each cron session is stateless; files are the memory. Canonical numbers with staleness tracking, investigation findings with provenance, player profiles with behavioral notes.

Cost: ~$20-30/week

Key Learnings

  1. Prompt files over conversation context. Each cron tick is a fresh session. Smarts live in prompt + insights directory.
  2. Auto-investigate before alerting. One extra query turns "metric spiked" into "metric spiked because X."
  3. Community management is the killer app. Response time goes from "whenever a human checks" to "within the hour."
  4. Sonnet for frequency, Opus for depth. Don't run expensive models on every tick.
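
Learning #2 can be sketched as a monitor that runs one follow-up investigation before it alerts. The 25% deviation threshold and the investigate hook are hypothetical:

```python
def check_metric(name, value, baseline, investigate):
    """If a metric deviates, run one extra query before alerting, so the
    alert says why rather than just what (sketch of learning #2)."""
    deviation = abs(value - baseline) / baseline
    if deviation < 0.25:
        return None               # within normal range: stay silent
    cause = investigate(name)     # the one extra query before paging anyone
    return f"{name} moved {deviation:.0%} vs baseline; likely cause: {cause}"

alert = check_metric(
    "daily_sessions", value=400, baseline=1000,
    investigate=lambda metric: "login provider outage (correlated error spike)",
)
```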

The Intelligence Architecture

All of these systems are components of a broader intelligence architecture. The target is a company-wide system that senses what's happening, understands what it means, decides what to do, builds the response, and runs it in production.

Five Layers

```mermaid
graph TD
    S[Sensing Layer] -->|raw signals| C[Context Layer]
    C -->|enriched understanding| E[Evaluation Layer]
    E -->|decisions + plans| X[Execution Layer]
    X -->|deployed changes| O[Operations Layer]
    O -->|measured outcomes| S
```

| Layer | Function | Current Implementation |
| --- | --- | --- |
| Sensing | Detect what's happening — metrics, anomalies, signals | Heimdall health scans, ReGgie monitoring |
| Context | Understand what it means — market research, competitive intel, organizational state | KAAOS knowledge base, Slack/Linear integration |
| Evaluation | Decide what to do — synthesize signals, compose responses, recommend actions | Not yet unified. Fragments in CPO plugin, Delphi plugin. |
| Execution | Build and deploy — campaigns, code, content | Campaign Builder (staging), SWE Pipeline (early) |
| Operations | Manage live products — community, marketing, targeting, support | ReGgie on CK:ATZ (production) |

Loop Levels (Canonical -- 5 Levels)

Every loop in the system operates at a level that graduates based on demonstrated competence:

| Level | The Person Does... | The Loop Does... | Graduation Signal |
| --- | --- | --- | --- |
| L1 | You do the work | Assists | Can describe "good." Knows failure modes. |
| L2 | You and the loop collaborate | Drafts, you refine | Minor edits >80% of the time |
| L3 | Loop works, you review daily | Plans and executes, escalates exceptions | Intervenes on <20% of outputs |
| L4 | Loop runs, you review weekly | Full cycle autonomously | Decisions match human's >90% |
| L5 | Fully autonomous, you audit monthly | Self-improving | Sustained performance, no drift |
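
The graduation signals can be read as a rough decision procedure. A sketch only; in practice graduation is a judgment call, not a formula:

```python
def loop_level(minor_edit_rate, intervention_rate, decision_match_rate, sustained_no_drift):
    """Highest justified level from the table's graduation signals.
    (L1 and L2 are collapsed: both mean the human is still in the loop.)"""
    if minor_edit_rate <= 0.80:
        return "L2"   # drafts still need substantive rework: keep collaborating
    if intervention_rate >= 0.20:
        return "L3"   # loop executes but needs daily review
    if decision_match_rate <= 0.90:
        return "L4"   # weekly review until decisions track the human's
    return "L5" if sustained_no_drift else "L4"
```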

Six Agents (Target Architecture)

| Agent | Function | Current State |
| --- | --- | --- |
| Analytics (Heimdall) | Quantitative sensing — metrics, anomalies, opportunities | Plugin built, daemon pending |
| Research (Frigg) | Qualitative sensing — market, competitors, customer empathy | Fragments (X search MCP) |
| Intelligence | Synthesis — composes capabilities into solutions | Does not exist as unified agent |
| Build (Valkyrie) | Execution — campaigns, code, deploys | Campaign Builder on staging, SWE pipeline early |
| Live Ops (Loki) | Operations — community, marketing, support | ReGgie on CK:ATZ (production) |
| Personal Orchestrator (Thor) | Your interface to the system — triage, orchestration, morning digest | CEO instance mature (KAAOS) |

Three Learning Loops (Target)

The mechanism that makes the system get smarter, not just bigger:

  1. Generation Loop (minutes) — generate N variants of any task, score against a matrix, present the best. Not yet implemented.
  2. Preference Loop (days/weeks) — when a human picks variant 3 over variant 1, the scoring matrix learns. Override tracking exists in Heimdall but doesn't feed back into scoring.
  3. Impact Loop (weeks/months) — market outcomes calibrate the scoring matrix against reality. Track skill captures predictions vs outcomes, but feedback doesn't flow back to modify generation.
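
A minimal version of the preference loop: when a human picks variant 3 over variant 1, nudge the scoring weights toward the winner's features. The feature names, linear scorer, and learning rate are all illustrative, and as noted above this loop is not yet implemented:

```python
FEATURES = ["brevity", "novelty", "brand_fit"]

def score(weights: dict, variant: dict) -> float:
    # Linear scoring matrix: weighted sum over feature values.
    return sum(weights[f] * variant[f] for f in FEATURES)

def record_preference(weights: dict, winner: dict, loser: dict, lr: float = 0.1) -> dict:
    # Move each weight in the direction that separates winner from loser.
    return {f: weights[f] + lr * (winner[f] - loser[f]) for f in FEATURES}

weights = {"brevity": 1.0, "novelty": 1.0, "brand_fit": 1.0}
v1 = {"brevity": 0.9, "novelty": 0.2, "brand_fit": 0.5}
v3 = {"brevity": 0.4, "novelty": 0.9, "brand_fit": 0.8}

# Human picked v3 over v1: the matrix learns to favor novelty and brand fit.
weights = record_preference(weights, winner=v3, loser=v1)
```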

AI & Infra Team

| Person | Focus |
| --- | --- |
| Ben Noyce | Team lead, AI infrastructure, SRE |
| Jim Wheaton | Campaign Builder, SWE pipeline |
| David Wang | Data engineering, BigQuery, SRE |
| Jackson Foley | AI infrastructure |

Separation from Sandbox

Alan Carr moved to Octopus Rodeo (sandbox experiments: CryptoKitties, Miquela). Riptide (agent infrastructure) is a separate sandbox under Jan Bernatik & Navid TehraniFar. AI & Infra is production infrastructure, not research.

Repositories

| Repo | Contents |
| --- | --- |
| dapperlabs/dapper-ai | SWE pipeline |
| dapperlabs/atlas-app | Atlas frontend (Campaign Builder FE) |
| dapperlabs/atlas-api | Atlas backend (Campaign Builder BE) |

This section needs enrichment from the engineering team

Detailed deployment procedures, MCP configuration guides, agent prompt specifications, and SWE pipeline architecture should be documented by the AI & Infra team. The intelligence architecture (six agents, three learning loops) is the target design, not current state.