Featured work
AISWARM
in progress (testing)
Distributed cognitive workforce runtime — Claude, Codex, DeepSeek and local Llama working side-by-side over one codebase, with a2abridge messaging and BrainCore memory underneath.
- Go 1.25
- tmux
- git worktree
- SQLite (modernc.org)
- MCP
- A2A 1.0
- BrainCore
- Bubble Tea
- REST API
- sandbox-exec / bwrap
The problem
A single AI coding agent — no matter how capable — bottlenecks on its own context, its own provider’s outages and its own pricing curve. The honest production pattern is a swarm: several specialised agents working over the same codebase, each picked for a job it does best, with a planner deciding who gets what and a verifier deciding when to escalate.
AISWARM is what fell out of running that pattern for six months and getting tired of duct-tape. It only makes sense as a triad: AISWARM as the runtime, a2abridge as the open A2A 1.0 messaging mesh, and BrainCore as the cognitive memory plane. Each ships on its own; together they form a distributed cognitive workforce.
Architecture
- One pure-Go binary, four providers. Claude Code, Codex, DeepSeek and a local Llama (LM Studio / llama.cpp) run in parallel, each in its own tmux session and its own git worktree, each driven by a uniform RPC contract.
- KindPlanner with 12 typed agent roles. Plans are expanded into roles (planner, scaffolder, refactorer, tester, reviewer, doc-writer, …); a role-aware `auto_choose` routes tasks 30/30/20/20 across providers (high-stakes → Claude, scaffolding → Llama, tests/docs → Codex, low/medium-stakes glue → DeepSeek). Plans can be re-expanded mid-run via the `plan_expand` MCP tool.
- Anthropic-format CLI shim. A small adapter normalises any OpenAI-compatible backend (DeepSeek, Llama via LM Studio) to Anthropic’s request/response shape, so the rest of the system only speaks one protocol.
- 3-tier verifier. Every change is gated by a composite verifier: tier 1 runs the project’s own command (e.g. `go test ./...`), tier 2 replays the change against the integration branch to catch regressions, tier 3 asks an LLM judge for a semantic sign-off. Failures are re-queued with structured feedback attached.
- Peer-helper consult (`ASK_PEER`). A stuck worker can emit a marker, and the dispatcher spawns a cross-model helper in a new tmux window: read-only, one-shot, hard-capped to 3 consultations per task, with a polling reply file. The original worker keeps its session; the helper writes its answer and exits.
- Failover chain + soft pause/resume. If a provider 5xx’s, hits a rate limit or times out, the planner shifts the task down a typed chain. Cancelling a task does not kill its dependants; the cascade pauses and resumes cleanly when the parent is fixed.
- Subtask decomposition over MCP. Big tasks are sliced into atomic units the workers can actually finish in one turn.
- Cost ledger. Tokens, dollars and latency are tracked per provider, per task; the dashboard tells you which provider is paying for itself.
Implementation highlights
- 33 internal packages, 8 binaries — Phase 6 closed. The runtime is past the “moves under load” line: planner, dispatcher, runner, verifier, merger, watchdog, quota, state, mcp, a2abridge client, braincore client, language autodetect, notify, consult, tui, config — all green, lint silent.
- a2abridge underneath. Workers and the orchestrator talk over the open A2A 1.0 protocol — not bespoke pipes — so any external A2A-speaking agent plugs in for free.
- BrainCore (or any MCP memory) as a side-channel. AISWARM deliberately does not embed its own cognitive memory engine. Memory is a side-channel over MCP: point it at your personal `total-agent-memory` server or at BrainCore; both work.
- Hardened workers. Each worker runs inside `sandbox-exec` (macOS) or `bwrap` (Linux), in its own git worktree with `.a2a/` and `.aiswarm/` excluded per-worktree so coordination files never end up in a commit.
- Detach-by-default daemon. `aiswarm run` forks into the background after planning, prints the PID and dashboard URL, and returns the shell. A REST API (`/api/projects`, `/api/projects/:id/swarms`) and a `signal(0)` liveness probe let the WebUI respawn a dead daemon from the ▶ button.
- Live dashboard, two surfaces. A Bubble Tea TUI for the terminal and a WebUI with A2A and Memory tabs in the browser; both read from the same atomic flock-protected JSON state store.
- Pure-Go, zero CGO. SQLite via `modernc.org/sqlite` in `~/.aiswarm/aiswarm.db`. Single static binary; no Docker required for the orchestrator itself.
Current status
Phase 6 is closed: 33 packages green, lint silent, smoke 2/2 merged, 8 binaries shipping. The repository is still private while the wire protocol between planner, verifier and workers stabilises, and while the cost-routing heuristics are tuned against real-world workloads. Once the orchestration layer reaches the same maturity as a2abridge and BrainCore, AISWARM will be open-sourced under the same MIT licence as its two peers.
Lessons so far
- Swarms don’t fail like a single agent. A single agent gets stuck; a swarm cascades. Half the design budget went into stopping cascades — peer-helper consult, soft pause/resume and the 3-tier verifier all exist for the same reason.
- Picking the right worker is worth more than picking the right model. A 30 % cheaper provider that’s right 95 % of the time on its lane is a better deal than the top model on everything; that’s why `auto_choose` is role-aware, not just model-aware.
- A verifier is not optional, and one tier isn’t enough. “Trust the model’s output” is the failure mode that justifies a verifier the next day. “Trust the unit tests” is the failure mode that justifies tier 2 and tier 3 the week after.
- Memory belongs outside the runtime. Embedding a memory engine inside the orchestrator was the obvious mistake. Treating memory as a side-channel over MCP let BrainCore and `total-agent-memory` evolve independently, and let AISWARM ship without becoming a research project.