Featured work

AISWARM

in progress (testing)

Distributed cognitive workforce runtime — Claude, Codex, DeepSeek and local Llama working side-by-side over one codebase, with a2abridge messaging and BrainCore memory underneath.

Go 1.25
tmux
git worktree
SQLite (modernc.org)
MCP
A2A 1.0
BrainCore
Bubble Tea
REST API
sandbox-exec / bwrap

The problem

A single AI coding agent — no matter how capable — bottlenecks on its own context, its own provider’s outages and its own pricing curve. The honest production pattern is a swarm: several specialised agents working over the same codebase, each picked for a job it does best, with a planner deciding who gets what and a verifier deciding when to escalate.

AISWARM is what fell out of running that pattern for six months and getting tired of the duct tape. It only makes sense as a triad: AISWARM as the runtime, a2abridge as the open A2A 1.0 messaging mesh, and BrainCore as the cognitive memory plane. Each ships on its own; together they form a distributed cognitive workforce.

Architecture

One pure-Go binary, four providers. Claude Code, Codex, DeepSeek and a local Llama (LM Studio / llama.cpp) run in parallel, each in its own tmux session and its own git worktree, each driven by a uniform RPC contract.
KindPlanner with 12 typed agent roles. Plans are expanded into roles (planner, scaffolder, refactorer, tester, reviewer, doc-writer, …); a role-aware auto_choose routes tasks 30/30/20/20 across providers (high-stakes → Claude, scaffolding → Llama, tests/docs → Codex, low/medium-stakes glue → DeepSeek). Plans can be re-expanded mid-run via the plan_expand MCP tool.
Anthropic-format CLI shim. A small adapter normalises any OpenAI-compatible backend (DeepSeek, Llama via LM Studio) to Anthropic’s request/response shape, so the rest of the system only speaks one protocol.
3-tier verifier. Every change is gated by a composite verifier: tier 1 runs the project’s own command (e.g. go test ./...), tier 2 replays the change against the integration branch to catch regressions, tier 3 asks an LLM judge for a semantic sign-off. Failures are re-queued with structured feedback attached.
Peer-helper consult (ASK_PEER). A stuck worker can emit a marker, and the dispatcher spawns a cross-model helper in a new tmux window. The helper is read-only and one-shot, consultations are hard-capped at 3 per task, and the answer comes back through a polled reply file. The original worker keeps its session; the helper writes its answer and exits.
Failover chain + soft pause/resume. If a provider returns a 5xx, hits a rate limit or times out, the planner shifts the task to the next provider in a typed failover chain. Cancelling a task does not kill its dependants: the cascade pauses and resumes cleanly when the parent is fixed.
Subtask decomposition over MCP. Big tasks are sliced into atomic units the workers can actually finish in one turn.
Cost ledger. Tokens, dollars and latency are tracked per provider, per task; the dashboard tells you which provider is paying for itself.
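
As a rough illustration of the lane-based routing and failover chain described above, the following Go sketch walks a task down an ordered list of providers. It is not AISWARM's code. The Provider type, the lane keys, ErrRetryable and runWithFailover are all hypothetical names, and the fallback order after each first choice is invented for the example; only the first choice per lane comes from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider names a backend. Hypothetical type for this sketch.
type Provider string

const (
	Claude   Provider = "claude"
	Codex    Provider = "codex"
	DeepSeek Provider = "deepseek"
	Llama    Provider = "llama"
)

// ErrRetryable stands in for the failures that trigger failover:
// a 5xx response, a rate limit or a timeout.
var ErrRetryable = errors.New("retryable provider failure")

// chains maps a lane to its ordered provider chain. The first entry follows
// the routing described in the text; the fallbacks are assumptions.
var chains = map[string][]Provider{
	"high-stakes": {Claude, Codex, DeepSeek},
	"scaffolding": {Llama, DeepSeek, Codex},
	"tests/docs":  {Codex, DeepSeek, Claude},
	"glue":        {DeepSeek, Llama, Codex},
}

// runWithFailover tries each provider in the lane's chain in order. It moves
// on only when the failure is retryable; any other error is returned at once
// so a genuine task failure is not hidden by a provider switch.
func runWithFailover(lane, task string, run func(Provider, string) error) (Provider, error) {
	chain, ok := chains[lane]
	if !ok {
		return "", fmt.Errorf("no provider chain for lane %q", lane)
	}
	var lastErr error
	for _, p := range chain {
		err := run(p, task)
		if err == nil {
			return p, nil
		}
		if !errors.Is(err, ErrRetryable) {
			return p, err
		}
		lastErr = err
	}
	return "", fmt.Errorf("lane %q: every provider failed: %w", lane, lastErr)
}

func main() {
	// Simulate a local Llama outage; every other provider succeeds.
	run := func(p Provider, task string) error {
		if p == Llama {
			return fmt.Errorf("%s: %w", p, ErrRetryable)
		}
		return nil
	}
	p, err := runWithFailover("scaffolding", "generate handler stubs", run)
	fmt.Println(p, err) // prints "deepseek <nil>"
}
```

Keeping the chain as plain data means the routing table can be tuned without touching the dispatch loop, which is one plausible way to keep cost-routing heuristics adjustable.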

Implementation highlights

33 internal packages, 8 binaries — Phase 6 closed. The runtime is past the “moves under load” line: planner, dispatcher, runner, verifier, merger, watchdog, quota, state, mcp, a2abridge client, braincore client, language autodetect, notify, consult, tui, config — all green, lint silent.
a2abridge underneath. Workers and the orchestrator talk over the open A2A 1.0 protocol — not bespoke pipes — so any external A2A-speaking agent plugs in for free.
BrainCore (or any MCP memory) as a side-channel. AISWARM deliberately does not embed its own cognitive memory engine. Memory is a side-channel over MCP: point it at your personal total-agent-memory server or at BrainCore — both work.
Hardened workers. Each worker runs inside sandbox-exec (macOS) or bwrap (Linux), in its own git worktree with .a2a/ and .aiswarm/ excluded per-worktree so coordination files never end up in a commit.
Detach-by-default daemon. aiswarm run forks into the background after planning, prints the PID and dashboard URL, and returns the shell. A REST API (/api/projects, /api/projects/:id/swarms) and a signal(0) liveness probe let the WebUI respawn a dead daemon from the ▶ button.
Live dashboard, two surfaces. A Bubble Tea TUI for the terminal and a WebUI with A2A and Memory tabs in the browser — both read from the same atomic flock-protected JSON state store.
Pure-Go, zero CGO. SQLite via modernc.org/sqlite in ~/.aiswarm/aiswarm.db. Single static binary; no Docker required for the orchestrator itself.
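
The signal(0) liveness probe mentioned above can be sketched in a few lines of Go. This is a minimal illustration for macOS and Linux, not the project's implementation; the function names pidAlive and selfCheck are hypothetical. Sending signal 0 asks the kernel to run its existence and permission checks without delivering anything to the target process.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// pidAlive reports whether a process with the given PID currently exists.
// Unix only: it relies on kill(2) with signal 0, which checks the target
// without delivering a signal.
func pidAlive(pid int) bool {
	if pid <= 0 {
		// kill(2) gives 0 and negative PIDs process-group meanings, so
		// they are never valid daemon PIDs here.
		return false
	}
	err := syscall.Kill(pid, 0)
	// nil: the process exists. EPERM: it exists but belongs to another user.
	// Any other error (typically ESRCH) means there is no such process.
	return err == nil || errors.Is(err, syscall.EPERM)
}

// selfCheck probes the current process, which must always report alive.
func selfCheck() bool {
	return pidAlive(os.Getpid())
}

func main() {
	fmt.Println("self alive:", selfCheck())
	fmt.Println("pid 0 alive:", pidAlive(0))
}
```

Two caveats apply to any probe of this kind: a zombie process still counts as existing, and PIDs can be reused, so a stored PID is best paired with some other identity check before a respawn decision is made.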

Current status

Phase 6 is closed: 33 packages green, lint silent, smoke 2/2 merged, 8 binaries shipping. The repository is still private while the wire protocol between planner, verifier and workers stabilises, and while the cost-routing heuristics are tuned against real-world workloads. Once the orchestration layer reaches the same maturity as a2abridge and BrainCore, AISWARM will be open-sourced under the same MIT licence as its two peers.

Lessons so far

Swarms don’t fail like a single agent. A single agent gets stuck; a swarm cascades. Half the design budget went into stopping cascades — peer-helper consult, soft pause/resume and the 3-tier verifier all exist for the same reason.
Picking the right worker is worth more than picking the right model. A provider that is 30% cheaper and right 95% of the time in its lane is a better deal than running the top model on everything, which is why auto_choose is role-aware, not just model-aware.
A verifier is not optional, and one tier isn’t enough. “Trust the model’s output” is the failure mode that justifies a verifier the next day. “Trust the unit tests” is the failure mode that justifies tier 2 and tier 3 the week after.
Memory belongs outside the runtime. Embedding a memory engine inside the orchestrator was the obvious mistake. Treating memory as a side-channel over MCP let BrainCore and total-agent-memory evolve independently — and let AISWARM ship without becoming a research project.