hivemind v0.6 · internal preview
https://hivemind.wandb.tools
Agent engineering
HiveMind
A shared layer of understanding for how we build with agents.
P&E All-Hands · May 2026
Presented by CVP
by the numbers
02 / 29
Four months in.
176K
coding sessions captured
across 162 engineers · 260 devices · 820 repos
15M+
lines of code written by agents
28,923
sessions linked to a PR
4.3M
tool calls observed
SESSIONS PER MONTH
Explosive growth 💥
73,604
apr 2026
Jan '26 · Feb '26 · Mar '26 · Apr '26
SESSIONS BY AGENT
Claude
140,328
Cursor
18,082
Codex
16,757
OpenCode
772
Gemini
79
first commit · jan 2, 2026  ·  912 commits
HIVEMIND
context
03 / 29
What it means to develop and maintain software has
fundamentally changed.
the new reality
why we're building this
04 / 29
Why HiveMind
01
You can't improve what you don't measure.
No data on agent usage means we're guessing.
02
Sharing skills and techniques should be easy.
Your best workflow shouldn't die in your terminal.
03
Visibility matters more as you run more agents.
More concurrent sessions, more surface area.
audience
05 / 29
Who it's for
Everyone who ships code here.
IC
Engineers who want to get better at prompting.
See how teammates solve similar problems.
STAFF
Staff+ who want to share patterns across teams.
Individual tricks → team practices.
LEAD
Leads and managers who need to
understand their team.
Where's the work? What's blocking?
positioning
06 / 29
Two products. Two audiences.
How is HiveMind
different from Weave?
WEAVE
For teams putting AI into their products.
Instrument the AI features you ship: evals, traces, prompts.
user → your AI feature
HIVEMIND
For teams using AI to build their products.
Instrument how your agents write code: sessions, tools, what ships.
engineer → AI agent → code
section
— 01 —
PART ONE
Product
overview.
architecture
·
privacy
·
features
01 · architecture
08 / 29
Architecture
Dumb daemon. Smart backend.
ON YOUR MACHINE
hivemind daemon
claude code
cursor
codex
gemini
opencode
Discovers sessions. Ships raw JSONL. Redacts secrets.
INGEST
FASTAPI
normalize & store
→ AG-UI events
agentstream adapters
raw + normalized
Raw sessions in. Normalized AG-UI events out.
QUERY
DATA + UI
ClickHouse + React
sessions · turns · tools
PRs · $$$ · trajectories
search index
dashboard
Fast analytics on everything the team has done.
// daemon is intentionally dumb: only discovery, redaction, and syncing
infra: clickhouse · fastapi · terraform
02 · trust model · privacy
09 / 29
Privacy · 1 of 3
Sessions inherit GitHub's permissions.
No new ACL system. The repo decides the audience.
SESSION · GITHUB MATCH · VISIBILITY
01
~/wandb-core
refactor inference loop
github.com/wandb/core
org repo · team has read
TEAM-VISIBLE
Anyone with repo read.
02
~/wandb-secret-spike
deploy hotfix to staging
github.com/wandb/secret-spike
private repo · 4 collaborators
REPO-LOCKED
Only repo collaborators.
03
~/scratch/notes
draft a board update
— no git remote —
unmatched
SOLO
Only you. Always.
// override at any layer
mark any session private · disable sharing per-repo · disable sharing per-user
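A minimal sketch of the rule on this slide. The `Session` fields and the `resolve_visibility` helper are hypothetical names for illustration, not HiveMind's actual API, and it assumes the difference between rows 01 and 02 is whether the wider team has read access to the repo. The repo match decides the audience, and an override at any layer can only narrow it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Visibility(Enum):
    TEAM_VISIBLE = "team-visible"  # anyone with repo read
    REPO_LOCKED = "repo-locked"    # only repo collaborators
    SOLO = "solo"                  # only the session owner


@dataclass
class Session:
    # Hypothetical fields that mirror the three example rows on the slide.
    workdir: str
    git_remote: Optional[str] = None  # None when there is no git remote
    team_has_read: bool = False       # assumption: org team can read the repo
    marked_private: bool = False      # per-session override


def resolve_visibility(session, sharing_disabled_repos=frozenset(),
                       user_sharing_disabled=False):
    """Map a session to its audience; overrides only ever narrow it."""
    # Per-session and per-user overrides win outright.
    if session.marked_private or user_sharing_disabled:
        return Visibility.SOLO
    # No matched repo, or sharing disabled for that repo: owner only.
    if session.git_remote is None or session.git_remote in sharing_disabled_repos:
        return Visibility.SOLO
    # Otherwise the repo's own permissions decide.
    if session.team_has_read:
        return Visibility.TEAM_VISIBLE
    return Visibility.REPO_LOCKED
```

With this shape, the scratch session with no remote resolves to SOLO without any special case, which is the "Only you. Always." default.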
02 · trust model · security
10 / 29
Security · 2 of 3
Three gates. Each fails closed.
Device, identity, network — independently revocable.
GATE 01 · DEVICE
Keychain-stored, auto-rotated.
Short-lived tokens. Never on disk in plaintext.
macOS Keychain · libsecret
24-hour rotation
instant dashboard revoke
GATE 02 · IDENTITY
SSO & SCIM, straight from your IdP.
Membership and offboarding flow from the IdP. No parallel directory.
Okta · Azure AD · Google
SCIM auto-deprovision
group-scoped access
GATE 03 · NETWORK
Private VPC, fully terraformed.
Data plane behind private connect. No public ingress.
VPC + private connect
infra in version control
audit log on every read
PERIMETER Pull a token, an SSO group, or VPC peering — the next request stops at the door.
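"Fails closed" can be sketched as below. The gate functions and request keys are hypothetical stand-ins, not HiveMind's real checks. The point is the control flow: a gate that returns anything other than True, or that raises, denies the request.

```python
from typing import Callable, Dict, Iterable

Gate = Callable[[Dict[str, bool]], bool]


def device_gate(request):
    # A missing key raises KeyError, which the caller treats as a denial.
    return request["device_token_valid"] is True


def identity_gate(request):
    return request["sso_group_member"] is True


def network_gate(request):
    return request["inside_vpc"] is True


GATES = (device_gate, identity_gate, network_gate)


def allow_request(request, gates: Iterable[Gate] = GATES) -> bool:
    """Allow only if every gate explicitly passes; any error means deny."""
    for gate in gates:
        try:
            if gate(request) is not True:
                return False
        except Exception:
            return False  # fail closed: an error is never treated as a pass
    return True
```

Because each gate is checked independently, revoking any one of the three inputs is enough to stop the next request.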
02 · trust model · sensitive data
11 / 29
Sensitive data · 3 of 3
Minimizing sensitive data exposure.
Daemon scrubs trajectories on-device, before any payload is shipped.
// on your laptop RAW
# trajectory.jsonl — pre-egress
cmd : curl -H "Bearer sk_live_8f3d92ab"
env : ANTHROPIC_KEY=sk-ant-9a3…
       DATABASE_URL=postgres://p@ss
       AWS_SECRET=wJalrXUtnFEMI…
diff: + STRIPE_KEY=sk_live_K3y…
// over the wire → EGRESSED
# trajectory.jsonl — post-egress
cmd : curl -H "Bearer [REDACTED]"
env : ANTHROPIC_KEY=[REDACTED]
       DATABASE_URL=[REDACTED]
       AWS_SECRET=[REDACTED]
diff: + STRIPE_KEY=[REDACTED]
SCRUBBED ON-DEVICE
API keys, env vars, auth headers, and high-entropy strings.
regex + entropy · configurable allowlist
COMING NEXT
Org-wide archive policy and enhanced AI PII redaction.
archival (soon) · AI PII redaction (soon)
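A rough sketch of what "regex + entropy" with an allowlist could look like. The patterns, the 20-character minimum, and the 4.0-bit threshold are illustrative assumptions, not the daemon's actual rules, and `scrub` is a hypothetical name.

```python
import math
import re
from collections import Counter

REDACTED = "[REDACTED]"

# Hypothetical patterns; a real rule set would be longer and configurable.
BEARER_RE = re.compile(r"(Bearer\s+)([^\s\"']+)")
ENV_RE = re.compile(r"\b([A-Z][A-Z0-9_]*(?:KEY|SECRET|TOKEN|PASSWORD|URL))=(\S+)")
TOKEN_RE = re.compile(r"[A-Za-z0-9_\-+/=]{20,}")


def shannon_entropy(text):
    """Bits per character, estimated from character frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scrub(line, allowlist=frozenset(), entropy_threshold=4.0):
    """Redact auth headers, secret-looking env vars, and high-entropy tokens."""
    # Regex pass: known shapes first.
    line = BEARER_RE.sub(lambda m: m.group(1) + REDACTED, line)
    line = ENV_RE.sub(lambda m: m.group(1) + "=" + REDACTED, line)

    # Entropy pass: long tokens that look random, unless allowlisted.
    def mask_high_entropy(match):
        token = match.group(0)
        if token in allowlist or shannon_entropy(token) < entropy_threshold:
            return token
        return REDACTED

    return TOKEN_RE.sub(mask_high_entropy, line)
```

Run against the RAW lines on this slide, the regex pass alone reproduces the EGRESSED lines. The entropy pass is the backstop for secrets that match no known pattern.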
03 · features
12 / 29
Features
Six surfaces. One shared brain.
OVERVIEW
Activity feed
Your week — shipped, open, teammates' work.
LIVE
In-flight sessions
Who's mid-session right now. Tail if shared.
INSIGHTS
Trajectory analysis
Where agents fail. What to fix.
LEADERBOARD
Cross-team stats
Who's shipping. Where. (V2 en route.)
USAGE
Cost & tokens
Spend per user, repo, model. Cost per merged PR.
SESSIONS
Full replay & fork
Every turn. Searchable. Shareable. Resumable.
section
— 02 —
PART TWO
Demo.
live from my laptop · no slides · YOLO
section
— 03 —
PART THREE
What we've
learned.
four months of dogfooding · seven lessons
what we've learned
15 / 29
Seven Learnings
01
Rich trajectories are a data flywheel.
A dataset that didn't exist before.
02
Seeing teammates' sessions is legitimately useful.
"How would Tim do this?" — one click away.
03
Live + Overview unlock parallel work streams.
Tail one session while another runs.
04
The leaderboard is loved — and tricky.
Rewards burn, not outcomes.
05
Fork is a lifesaver when Anthropic is down.
One click to hand off a stuck session.
06
Our bottleneck is QA and code review.
HiveMind can help.
07
The Agent is a User.
hivemind --help FTW.
learning · 01
16 / 29
The thing I'm sure of
The full trajectory is a data flywheel.
RAW SESSION ↓ ENRICHED ON INGEST
PROMPT
user turn
bug · feature · fix · refactor
THOUGHT
reasoning
intent
TOOL
edit · bash · grep
skill: frontend-design · subagent: code-search · ✕ exit 1
DIFF
code change
files · langs
PR
shipped (or not)
merged · reverted?
Infinite agents. One shape. Already changing how we debug, teach, and build tooling.
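"One shape" could be modeled roughly as follows. Every class and field name here is a hypothetical illustration of the PROMPT, THOUGHT, TOOL, DIFF, PR stages on the slide, not the stored schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToolCall:
    name: str                       # e.g. "edit", "bash", "grep"
    exit_code: int = 0
    skill: Optional[str] = None     # e.g. "frontend-design"
    subagent: Optional[str] = None  # e.g. "code-search"


@dataclass
class Turn:
    prompt: str
    prompt_label: str               # "bug", "feature", "fix", or "refactor"
    intent: Optional[str] = None    # enrichment derived from the reasoning
    tool_calls: List[ToolCall] = field(default_factory=list)
    files_changed: List[str] = field(default_factory=list)


@dataclass
class Trajectory:
    session_id: str
    turns: List[Turn] = field(default_factory=list)
    pr_merged: Optional[bool] = None    # None means no PR was linked
    pr_reverted: Optional[bool] = None


def failed_tool_calls(trajectory):
    """Every tool call in the session that exited non-zero."""
    return [call
            for turn in trajectory.turns
            for call in turn.tool_calls
            if call.exit_code != 0]
```

Once every agent's session lands in one shape like this, a question such as "where do tool calls fail?" becomes a single query instead of one parser per agent.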
learning · 02 · the flywheel in action
17 / 29
Introducing Soul Stealer.
Mines a teammate's sessions; produces a sub-agent that talks like them. Built by a user, not us.
~/skills · soul-stealer
$ /soul-stealer tssweeney "Tim Sweeney"
→ spawning hivemind sub-agent…
→ analyzing 47 sessions · 1,284 prompts
→ clustering behavioral modes…
→ extracting negative space (rejections, reversals)…
→ fingerprinting vocabulary…
✓ ~/Desktop/skills/talk-to-tim/SKILL.md
$ /talk-to-tim "should this be a class or a function?"
Tim: Why is this a class? Flatten it. We'll add structure when there's a second caller.
TS
Tim Sweeney
@tssweeney · popular soul
composition over inheritance · flatten until proven otherwise · blast radius first
premature abstraction · "sloppy" · indirection without payoff
"thoughts?" · "first-class param" · "surface area"
// we did not build this. we built the substrate. the org built the skill.
learning · 03 · the memory aid
18 / 29
New feature
My week, at a glance.
Your week at a glance — overview page
overview.hivemind / your-week
// not a dashboard. a memory aid.
learning · 04 · the hard one
19 / 29
The hard one
The leaderboard is loved.
It also rewards the wrong thing.
V1 · TODAY
token count
Rewards burn. Ignores whether the code shipped.
V2 · NEXT
outcomes
Merged PRs · 30-day survival · cost per merged line.
// incentives shape behavior. if we get the scoreboard wrong, the players optimize for the wrong game.
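To make the V1 versus V2 difference concrete, here is a small sketch. `EngineerStats` and both ranking functions are hypothetical, and breaking ties by merged PR count is an assumption. The same people can rank in the opposite order once the score is cost per merged line instead of tokens burned.

```python
from dataclasses import dataclass


@dataclass
class EngineerStats:
    name: str
    tokens: int        # total tokens consumed
    cost_usd: float    # total agent spend
    merged_prs: int
    merged_lines: int  # agent-written lines that reached main


def rank_v1(stats):
    """V1: most tokens first. Rewards burn."""
    return [s.name for s in sorted(stats, key=lambda s: s.tokens, reverse=True)]


def cost_per_merged_line(s):
    # Nothing merged means infinitely expensive, so it sorts last.
    return s.cost_usd / s.merged_lines if s.merged_lines else float("inf")


def rank_v2(stats):
    """V2: cheapest merged line first, ties broken by more merged PRs."""
    return [s.name for s in
            sorted(stats, key=lambda s: (cost_per_merged_line(s), -s.merged_prs))]
```

A heavy token user with little merged code tops `rank_v1` and falls to the bottom of `rank_v2`.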
learning · 05 · the panic button
20 / 29
When the model goes down
Fork is a lifesaver when Anthropic is down.
Fork session dialog
// we don't control Anthropic's uptime. we do control how fast you keep going.
learning · 06 · the bottleneck
21 / 29
The new bottleneck
QA & code review is where the time goes.
SHIPPED · THUMBNAILS in any session view
Skim a session at a glance.
HiveMind session viewer with inline thumbnails of each generated artifact
NEW · AI WALKTHROUGH session/4f9c2…
Read the agent before the diff.
00:42
SUMMARY
Refactored inference loop · 4 files.
02:18
WORTH A LOOK
Silent fallback at batch.py:142.
06:55
SELF-CORRECTED
Reverted breaking change after test failed.
11:03
SKIPPED
Legacy-path tests not run.
// review the trajectory · not the patch
~15 min → ~3 min
// QA is the real bottleneck. we're tooling it.
learning · 07 · cli first
22 / 29
Built for them, used by them
The agent is a user.
We design for humans and for the agent reading their terminal. Both want the same thing: a clean CLI.
~/projects · hivemind --help
$ hivemind --help
Usage: hivemind [OPTIONS] COMMAND [ARGS]…

  Hivemind — sync agentic coding sessions to the cloud.

Sessions:
  import      Import sessions from local agent history.
  transcript  Fetch and display a session transcript.
  search      Search sessions or list recent ones.
  fork        Fork work from a previous session.
  insights    Manage contextual intelligence suggestions.
  export      Export your data as Parquet or DuckDB.
AUTO-INSTALLED
@hivemind subagent.
Every login drops a Claude Code subagent into ~/.claude/agents. Mention it and it knows your sessions, your forks, your skills.
@hivemind find sessions where I used the frontend-design skill last week
→ calling hivemind search …
→ 4 matches · ranked by recency
✓ ready to fork or summarize
claude code · cursor (soon) · codex (soon)
// good CLI · readable by both humans and the agent reading their terminal.
section
— 04 —
PART FOUR
Where
it's going.
three bets · next quarter
roadmap · q2
24 / 29
Where we're going
Three bets. One quarter.
01
Service agents
Agents that work while you sleep.
Async agents on PR comments, alerts, on-call pages. Federated identity. Sandboxes included.
02
Insights V2
A classifier that earns its keep.
Rebuilt on Weave evals. Repo-wide patterns. Insights → PR in one command.
03
Code quality
Did the agent's code matter?
Track what merges, what survives, what gets reverted. Receipts for the spend.
// each one gets its own slide. let's go.
roadmap · bet 01 · service agents
25 / 29
01
Bet · 01 · Service Agents
Agents that work while you sleep.
Same shape. Same insights. No human in the loop.
// trigger
An event happens.
  • PR comment
  • Sentry / PagerDuty
  • Slack mention
  • Cron / schedule
// run
An agent runs — no secrets.
  • Workload identity federation (WIF): GH Actions, k8s, Modal
  • Short-lived tokens
  • Your sandbox or ours
// tracked
It shows up in the feed.
  • Same trajectory shape
  • Same insights, same leaderboard
  • Replays, redaction, privacy
// every event becomes a session. every session becomes a learning.
roadmap · bet 02 · insights v2
26 / 29
02
Bet · 02 · Insights V2
A classifier that earns its keep.
Trajectories → "here's what's broken." Today, noisy. Next, measured.
V1 · today
Noisy.
  • One-shot, no eval rigor.
  • False positives drown signal.
  • Single-session issues, not repeating ones.
  • Insights you read — not act on.
// we're eating Weave's dog food. it's the only way this gets better.
roadmap · bet 03 · code quality
27 / 29
03
Bet · 03 · Code Quality
Did the agent's code matter?
Generation is cheap. Survival is the metric.
t = 0
generated
agent writes a diff
t + min
PR opened
does it pass CI?
t + day
merged · or not
how much did review rewrite?
t + 30d
still here
replaced? reverted? rewritten?
t + 90d
survived
in prod, shipping value
payoff · engineers
A leaderboard that rewards survival, not burn.
Cost per merged line. Code still in main after 30 days.
payoff · stakeholders
A clear answer to "what is the AI capex buying?"
PRs shipped, code surviving — by repo, team, quarter.
// generated is cheap. survived is the metric.
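One possible way to compute the "still here at t + 30d" number. `MergedChange` and `survival_rate` are hypothetical names, and counting only reverts (not partial rewrites) is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class MergedChange:
    merged_on: date
    lines_added: int
    reverted_on: Optional[date] = None  # None means still in main


def survival_rate(changes, as_of, window_days=30):
    """Share of merged lines that lasted at least `window_days` in main.

    Changes merged less than `window_days` before `as_of` are skipped,
    since it is too early to judge them. Returns None if nothing is eligible.
    """
    window = timedelta(days=window_days)
    eligible = 0
    surviving = 0
    for change in changes:
        if as_of - change.merged_on < window:
            continue  # too recent to judge
        eligible += change.lines_added
        reverted_early = (change.reverted_on is not None
                          and change.reverted_on - change.merged_on < window)
        if not reverted_early:
            surviving += change.lines_added
    return surviving / eligible if eligible else None
```

Skipping changes younger than the window keeps a busy week of fresh merges from inflating the rate.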
closing
28 / 29
Great tools are part of our DNA.
HiveMind is how we stay ahead in
a landscape that's accelerating.
the key message
questions
29 / 29
Over to you
Questions.
SLACK
#hivemind
REPO
wandb/agentstream
INSTALL
brew install wandb/taps/hivemind
SITE
https://hivemind.wandb.tools
thanks for listening