The Problem With Cloud AI Assistants
Every major AI assistant follows the same playbook: your messages go to a cloud server, get processed, and responses come back. You're a tenant in someone else's infrastructure, subject to their rate limits, their privacy policies, their uptime.
Clawdbot inverts this model entirely. It's a local-first gateway that runs on your machine, connects to your messaging accounts, and routes conversations to AI agents under your control. The architecture treats messaging platforms as interchangeable protocols and AI models as swappable backends—what remains constant is your infrastructure, your data, your rules.
This isn't a weekend hack. The codebase spans 40,000+ lines of TypeScript, supports 29 messaging channels through a plugin system, and implements patterns you rarely see outside distributed systems: lane-based concurrency, cascading route resolution, cross-channel identity linking, and human-in-the-loop approval gating.
Let's examine what makes it work.
1. Lane-Based Concurrency: Preventing Starvation by Design
Many async systems funnel all work through a single priority queue. High-priority tasks run first; low-priority tasks wait. The problem: a sustained burst of medium-priority work can starve everything below it for as long as the burst lasts.
Clawdbot takes a different approach. Work is partitioned into lanes—orthogonal queues that operate independently with separate concurrency limits.
From src/process/lanes.ts:
export const enum CommandLane {
  Main = "main", // Primary chat workflow
  Cron = "cron", // Scheduled jobs
  Subagent = "subagent", // Child agent spawning
  Nested = "nested", // Nested tool calls
}
Each lane maintains its own queue and active count. The implementation in src/process/command-queue.ts is elegant:
type LaneState = {
  lane: string;
  queue: QueueEntry[];
  active: number;
  maxConcurrent: number;
  draining: boolean;
};
const lanes = new Map<string, LaneState>();
export function enqueueCommandInLane<T>(
  lane: string,
  task: () => Promise<T>,
  opts?: { warnAfterMs?: number; onWait?: (waitMs: number, queuedAhead: number) => void },
): Promise<T> {
  const state = getLaneState(lane);
  return new Promise<T>((resolve, reject) => {
    state.queue.push({
      task: () => task(),
      resolve: (value) => resolve(value as T),
      reject,
      enqueuedAt: Date.now(),
      warnAfterMs: opts?.warnAfterMs ?? 2_000,
      onWait: opts?.onWait,
    });
    drainLane(lane);
  });
}
The drain function pumps tasks until the lane hits its concurrency limit:
function drainLane(lane: string) {
  const state = getLaneState(lane);
  if (state.draining) return;
  state.draining = true;
  const pump = () => {
    while (state.active < state.maxConcurrent && state.queue.length > 0) {
      const entry = state.queue.shift()!;
      state.active += 1;
      void (async () => {
        try {
          const result = await entry.task();
          state.active -= 1;
          pump(); // Recursive drain
          entry.resolve(result);
        } catch (err) {
          state.active -= 1;
          pump();
          entry.reject(err);
        }
      })();
    }
    state.draining = false;
  };
  pump();
}
The gateway configures these limits from user settings in src/gateway/server-lanes.ts:
export function applyGatewayLaneConcurrency(cfg: ReturnType<typeof loadConfig>) {
  setCommandLaneConcurrency(CommandLane.Cron, cfg.cron?.maxConcurrentRuns ?? 1);
  setCommandLaneConcurrency(CommandLane.Main, resolveAgentMaxConcurrent(cfg));
  setCommandLaneConcurrency(CommandLane.Subagent, resolveSubagentMaxConcurrent(cfg));
}
A scheduled email digest running in the Cron lane cannot block incoming WhatsApp messages in the Main lane. Subagent spawning has its own budget. Lanes don't compete; they coexist. One lane can never starve another, and that holds by construction, not by tuning.
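The isolation claim is easy to check in miniature. What follows is a simplified sketch, not Clawdbot's code: `enqueue`, `pump`, `gate`, and `demoLaneIsolation` are hypothetical names, and every lane is assumed to have a concurrency limit of 1. A cron task is deliberately held open while a main-lane task is enqueued behind it in wall-clock time.

```typescript
// Simplified lane queue: one FIFO and one concurrency cap per lane.
type Lane = { queue: Array<() => Promise<void>>; active: number; maxConcurrent: number };
const demoLanes = new Map<string, Lane>();

function getLane(name: string): Lane {
  let lane = demoLanes.get(name);
  if (!lane) {
    lane = { queue: [], active: 0, maxConcurrent: 1 }; // assumed limit for the demo
    demoLanes.set(name, lane);
  }
  return lane;
}

function pump(lane: Lane): void {
  // Start queued work until this lane (and only this lane) is at capacity.
  while (lane.active < lane.maxConcurrent && lane.queue.length > 0) {
    const run = lane.queue.shift()!;
    lane.active += 1;
    void run().then(() => {
      lane.active -= 1;
      pump(lane);
    });
  }
}

function enqueue<T>(name: string, task: () => Promise<T>): Promise<T> {
  const lane = getLane(name);
  return new Promise<T>((resolve, reject) => {
    // The wrapper never rejects; it forwards the outcome to the caller's promise.
    lane.queue.push(async () => {
      try {
        resolve(await task());
      } catch (err) {
        reject(err);
      }
    });
    pump(lane);
  });
}

// A manually opened gate lets the demo hold a task open without timers.
function gate(): { promise: Promise<void>; open: () => void } {
  let open: () => void = () => {};
  const promise = new Promise<void>((resolve) => {
    open = () => resolve();
  });
  return { promise, open };
}

async function demoLaneIsolation(): Promise<string[]> {
  const order: string[] = [];
  const slow = gate();
  const digest = enqueue("cron", async () => {
    await slow.promise; // simulates a long-running scheduled job
    order.push("cron:digest");
  });
  const second = enqueue("cron", async () => {
    order.push("cron:second"); // queued behind the digest in the same lane
  });
  const reply = enqueue("main", async () => {
    order.push("main:reply");
  });
  await reply; // completes while the cron lane is still blocked
  slow.open();
  await Promise.all([digest, second]);
  return order;
}
```

In this sketch the main-lane reply finishes first even though it was enqueued last, while the second cron task still waits its turn behind the digest.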
2. The Channel Plugin System: Protocol as Commodity
Clawdbot supports WhatsApp, Telegram, Signal, Discord, Slack, iMessage, Matrix, Microsoft Teams, LINE, Nostr, Google Chat, Twitch, Mattermost, and more. Each platform has radically different APIs, authentication flows, message formats, and capabilities.
Rather than building monolithic handlers, the codebase defines a plugin contract that normalizes all channels. From src/channels/plugins/types.plugin.ts:
export type ChannelPlugin<ResolvedAccount = any> = {
  id: ChannelId;
  meta: ChannelMeta;
  capabilities: ChannelCapabilities;
  // Authentication & setup
  config: ChannelConfigAdapter<ResolvedAccount>;
  setup?: ChannelSetupAdapter;
  auth?: ChannelAuthAdapter;
  // Security policies
  pairing?: ChannelPairingAdapter;
  security?: ChannelSecurityAdapter<ResolvedAccount>;
  // Messaging primitives
  outbound?: ChannelOutboundAdapter;
  messaging?: ChannelMessagingAdapter;
  streaming?: ChannelStreamingAdapter;
  threading?: ChannelThreadingAdapter;
  actions?: ChannelMessageActionAdapter;
  // Gateway integration
  gateway?: ChannelGatewayAdapter<ResolvedAccount>;
  // Agent tools (channel-specific capabilities)
  agentTools?: ChannelAgentToolFactory | ChannelAgentTool[];
};
Most adapters are optional; channels implement only what they support. The capabilities declaration tells the system what's available:
export type ChannelCapabilities = {
  chatTypes: Array<NormalizedChatType | "thread">;
  polls?: boolean;
  reactions?: boolean;
  edit?: boolean;
  unsend?: boolean;
  reply?: boolean;
  threads?: boolean;
  media?: boolean;
  blockStreaming?: boolean;
};
Adding a new channel requires minimal boilerplate. Here's the complete Matrix extension from extensions/matrix/index.ts:
import type { ClawdbotPluginApi } from "clawdbot/plugin-sdk";
import { emptyPluginConfigSchema } from "clawdbot/plugin-sdk";
import { matrixPlugin } from "./src/channel.js";
import { setMatrixRuntime } from "./src/runtime.js";
const plugin = {
  id: "matrix",
  name: "Matrix",
  description: "Matrix channel plugin (matrix-js-sdk)",
  configSchema: emptyPluginConfigSchema(),
  register(api: ClawdbotPluginApi) {
    setMatrixRuntime(api.runtime);
    api.registerChannel({ plugin: matrixPlugin });
  },
};
export default plugin;
The plugin SDK exports 100+ types and utilities, including channel-specific helpers for normalizing targets, resolving accounts, and handling onboarding. These are patterns extracted from production channels that extension authors can reuse.
Messaging protocols become commodities. The gateway doesn't care if a message came from WhatsApp or Matrix or Nostr. It cares about the normalized event, the resolved route, and the agent that should handle it.
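One thing a capabilities declaration makes possible is graceful degradation: the gateway can check what a channel supports before attempting an action. The sketch below is an illustration only. `degradeAction` and `OutboundAction` are hypothetical names that do not appear in the excerpts above, and the `Capabilities` type copies just three of the optional flags from `ChannelCapabilities`.

```typescript
// Subset of the capability flags shown in ChannelCapabilities.
type Capabilities = { reactions?: boolean; edit?: boolean; media?: boolean };

// Hypothetical normalized outbound actions, for illustration.
type OutboundAction =
  | { kind: "text"; text: string }
  | { kind: "reaction"; emoji: string }
  | { kind: "edit"; messageId: string; text: string };

// Fall back to a plain text message when the channel lacks a capability.
function degradeAction(caps: Capabilities, action: OutboundAction): OutboundAction {
  switch (action.kind) {
    case "reaction":
      return caps.reactions ? action : { kind: "text", text: action.emoji };
    case "edit":
      return caps.edit ? action : { kind: "text", text: action.text };
    default:
      return action;
  }
}

// A channel without reactions receives the emoji as an ordinary message.
const plainChannel: Capabilities = { edit: true };
const degraded = degradeAction(plainChannel, { kind: "reaction", emoji: "👍" });
console.log(degraded); // { kind: "text", text: "👍" }
```

The agent can then issue the same logical action everywhere and let the declared capabilities decide how it is delivered.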
3. The Routing Cascade: From Message to Agent
When a message arrives, Clawdbot must decide which agent handles it. The routing system implements a cascade of matching strategies with precise precedence.
From src/routing/resolve-route.ts:
export type ResolvedAgentRoute = {
  agentId: string;
  channel: string;
  accountId: string;
  sessionKey: string;
  mainSessionKey: string;
  matchedBy:
    | "binding.peer" // Specific sender matched
    | "binding.guild" // Discord server matched
    | "binding.team" // MS Teams team matched
    | "binding.account" // Account-level binding
    | "binding.channel" // Channel-wide wildcard
    | "default"; // Fallback agent
};
The resolution function filters bindings by channel and account, then applies matches in order:
export function resolveAgentRoute(input: ResolveAgentRouteInput): ResolvedAgentRoute {
  const bindings = listBindings(input.cfg).filter((binding) => {
    if (!matchesChannel(binding.match, channel)) return false;
    return matchesAccountId(binding.match?.accountId, accountId);
  });
  // 1. Peer match (most specific)
  if (peer) {
    const peerMatch = bindings.find((b) => matchesPeer(b.match, peer));
    if (peerMatch) return choose(peerMatch.agentId, "binding.peer");
  }
  // 2. Guild match (Discord servers)
  if (guildId) {
    const guildMatch = bindings.find((b) => matchesGuild(b.match, guildId));
    if (guildMatch) return choose(guildMatch.agentId, "binding.guild");
  }
  // 3. Team match (MS Teams)
  if (teamId) {
    const teamMatch = bindings.find((b) => matchesTeam(b.match, teamId));
    if (teamMatch) return choose(teamMatch.agentId, "binding.team");
  }
  // 4. Account-level fallback
  const accountMatch = bindings.find((b) =>
    b.match?.accountId?.trim() !== "*" &&
    !b.match?.peer && !b.match?.guildId && !b.match?.teamId
  );
  if (accountMatch) return choose(accountMatch.agentId, "binding.account");
  // 5. Channel wildcard
  const anyAccountMatch = bindings.find((b) =>
    b.match?.accountId?.trim() === "*" &&
    !b.match?.peer && !b.match?.guildId && !b.match?.teamId
  );
  if (anyAccountMatch) return choose(anyAccountMatch.agentId, "binding.channel");
  // 6. Default
  return choose(resolveDefaultAgentId(input.cfg), "default");
}
Session Keys and Identity Linking
Session continuity is managed through structured keys. From src/routing/session-key.ts:
export function buildAgentPeerSessionKey(params: {
  agentId: string;
  channel: string;
  peerKind?: "dm" | "group" | "channel" | null;
  peerId?: string | null;
  identityLinks?: Record<string, string[]>;
  dmScope?: "main" | "per-peer" | "per-channel-peer";
}): string {
  const peerKind = params.peerKind ?? "dm";
  if (peerKind === "dm") {
    const dmScope = params.dmScope ?? "main";
    let peerId = (params.peerId ?? "").trim();
    // Resolve cross-channel identity links
    const linkedPeerId = dmScope === "main" ? null : resolveLinkedPeerId({
      identityLinks: params.identityLinks,
      channel: params.channel,
      peerId,
    });
    if (linkedPeerId) peerId = linkedPeerId;
    if (dmScope === "per-channel-peer" && peerId) {
      return `agent:${normalizeAgentId(params.agentId)}:${channel}:dm:${peerId}`;
    }
    if (dmScope === "per-peer" && peerId) {
      return `agent:${normalizeAgentId(params.agentId)}:dm:${peerId}`;
    }
    return buildAgentMainSessionKey({ agentId: params.agentId });
  }
  return `agent:${normalizeAgentId(params.agentId)}:${channel}:${peerKind}:${peerId}`;
}
The identityLinks configuration allows mapping a single person across channels:
session:
  dmScope: per-peer
  identityLinks:
    alice:
      - "whatsapp:+15551234567"
      - "telegram:alice_smith"
      - "signal:+15551234567"
Now conversations with Alice share context whether she messages via WhatsApp, Telegram, or Signal.
The session key is a canonical address for conversation state. By structuring it as agent:{id}:{scope}:{peer}, the system achieves both isolation (different agents, different contexts) and continuity (same person across channels).
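To make the continuity half concrete, here is a small sketch of how linked identities could collapse into one per-peer key. It rests on assumptions: the excerpt does not show `resolveLinkedPeerId`, so `resolveLinkedPeerIdSketch` below is a hypothetical stand-in that returns the canonical name whose link list contains `channel:peerId`, and `perPeerDmKey` is an illustrative helper covering only the `per-peer` DM case.

```typescript
type IdentityLinks = Record<string, string[]>;

// Hypothetical lookup: find the canonical identity that lists "channel:peerId".
function resolveLinkedPeerIdSketch(
  links: IdentityLinks | undefined,
  channel: string,
  peerId: string,
): string | null {
  if (!links) return null;
  const needle = `${channel}:${peerId}`.toLowerCase();
  for (const [canonical, ids] of Object.entries(links)) {
    if (ids.some((id) => id.toLowerCase() === needle)) return canonical;
  }
  return null;
}

// Illustrative per-peer DM key: the channel is left out, so linked peers share a session.
function perPeerDmKey(
  agentId: string,
  channel: string,
  peerId: string,
  links?: IdentityLinks,
): string {
  const linked = resolveLinkedPeerIdSketch(links, channel, peerId);
  return `agent:${agentId}:dm:${linked ?? peerId}`;
}

const demoLinks: IdentityLinks = {
  alice: ["whatsapp:+15551234567", "telegram:alice_smith", "signal:+15551234567"],
};

console.log(perPeerDmKey("work", "whatsapp", "+15551234567", demoLinks)); // agent:work:dm:alice
console.log(perPeerDmKey("work", "telegram", "alice_smith", demoLinks)); // agent:work:dm:alice
console.log(perPeerDmKey("work", "telegram", "bob", demoLinks)); // agent:work:dm:bob
```

Both of Alice's handles resolve to the same key, while an unlinked sender keeps a key of their own.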
4. The Gateway: 84 Methods for Personal Infrastructure
The gateway server exposes 84+ RPC methods over WebSocket. From src/gateway/server-methods-list.ts:
const BASE_METHODS = [
  // Health & status
  "health", "status", "channels.status",
  // Configuration
  "config.get", "config.set", "config.apply", "config.patch",
  // Execution approval
  "exec.approval.request", "exec.approval.resolve",
  // Sessions
  "sessions.list", "sessions.preview", "sessions.reset", "sessions.compact",
  // Agents
  "agent", "agents.list", "agent.identity.get", "agent.wait",
  // Nodes (mobile devices)
  "node.pair.request", "node.pair.approve", "node.list", "node.invoke",
  // Cron
  "cron.list", "cron.add", "cron.run", "cron.runs",
  // Models & TTS
  "models.list", "tts.providers", "tts.convert",
  // Messaging
  "send", "chat.send", "chat.history", "chat.abort",
  // ...
];
And 11 event types for real-time updates:
export const GATEWAY_EVENTS = [
  "connect.challenge",
  "agent",
  "chat",
  "presence",
  "shutdown",
  "exec.approval.requested",
  "exec.approval.resolved",
  "node.pair.requested",
  "voicewake.changed",
  // ...
];
This is the API surface of personal infrastructure. Mobile apps, desktop clients, and CLI tools all speak this protocol. The gateway maintains channel connections, executes scheduled tasks, manages approval workflows, and coordinates across devices, all locally.
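The excerpts list method names but not the wire format, so the following is only a sketch of the general pattern a client needs for RPC over a single socket: tag each request with an id and match responses back to the pending call. The frame shapes (`RequestFrame`, `ResponseFrame`) and the `RpcCorrelator` class are assumptions made for illustration, not Clawdbot's actual protocol; only the method name `sessions.list` comes from the list above.

```typescript
// Assumed frame shapes; the real gateway protocol may differ.
type RequestFrame = { id: string; method: string; params?: unknown };
type ResponseFrame = { id: string; ok: boolean; payload?: unknown; error?: string };

type PendingCall = { resolve: (value: unknown) => void; reject: (err: Error) => void };

class RpcCorrelator {
  private nextId = 1;
  private pending = new Map<string, PendingCall>();
  private send: (frame: RequestFrame) => void;

  // `send` is whatever writes a frame to the transport (e.g. a WebSocket).
  constructor(send: (frame: RequestFrame) => void) {
    this.send = send;
  }

  // Issue a request and return a promise settled by the matching response.
  call(method: string, params?: unknown): Promise<unknown> {
    const id = String(this.nextId++);
    return new Promise<unknown>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.send({ id, method, params });
    });
  }

  // Feed every incoming response frame here; returns false for unknown ids.
  handleResponse(frame: ResponseFrame): boolean {
    const entry = this.pending.get(frame.id);
    if (!entry) return false;
    this.pending.delete(frame.id);
    if (frame.ok) entry.resolve(frame.payload);
    else entry.reject(new Error(frame.error ?? "request failed"));
    return true;
  }
}
```

A client would construct it with a function that serializes frames onto the socket, then call something like `rpc.call("sessions.list")` and pass each decoded response to `handleResponse`.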
5. Execution Approval: Human-in-the-Loop by Default
When an AI agent wants to run a shell command or modify files, the system gates dangerous operations through human approval. From src/gateway/exec-approval-manager.ts:
export type ExecApprovalRequestPayload = {
  command: string;
  cwd?: string | null;
  host?: string | null;
  security?: string | null;
  ask?: string | null;
  agentId?: string | null;
  sessionKey?: string | null;
};
export class ExecApprovalManager {
  private pending = new Map<string, PendingEntry>();
  async waitForDecision(
    record: ExecApprovalRecord,
    timeoutMs: number,
  ): Promise<ExecApprovalDecision | null> {
    return new Promise<ExecApprovalDecision | null>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(record.id);
        resolve(null); // Timeout = denied
      }, timeoutMs);
      this.pending.set(record.id, { record, resolve, reject, timer });
    });
  }
  resolve(
    recordId: string,
    decision: ExecApprovalDecision,
    resolvedBy?: string | null
  ): boolean {
    const pending = this.pending.get(recordId);
    if (!pending) return false;
    clearTimeout(pending.timer);
    pending.record.decision = decision;
    pending.record.resolvedBy = resolvedBy ?? null;
    this.pending.delete(recordId);
    pending.resolve(decision);
    return true;
  }
}
Approval requests propagate to connected nodes (iOS/Android apps) via the exec.approval.requested event. You get a push notification: "Agent 'work' wants to run git push origin main", with full context about which session triggered it.
AI agents operate in a trust hierarchy. The human remains the final authority for consequential actions, but the system handles the mechanics of request routing, timeout enforcement, and decision propagation.
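The gating mechanics can be sketched end to end in a few lines. This is a reduced illustration, not the real manager: `MiniApprovalManager` and `gatedExec` are hypothetical names, and the `"allow" | "deny"` decision type is an assumption, since the excerpt does not show how `ExecApprovalDecision` is defined. The key behavior it mirrors is that a timeout resolves to `null`, which the caller treats as a denial.

```typescript
type Decision = "allow" | "deny"; // assumed values, for illustration only

type PendingApproval = {
  resolve: (decision: Decision | null) => void;
  timer: ReturnType<typeof setTimeout>;
};

class MiniApprovalManager {
  private pending = new Map<string, PendingApproval>();

  // Wait for a human decision; resolve to null if none arrives in time.
  waitForDecision(id: string, timeoutMs: number): Promise<Decision | null> {
    return new Promise<Decision | null>((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        resolve(null);
      }, timeoutMs);
      this.pending.set(id, { resolve, timer });
    });
  }

  // Deliver a decision; returns false if the request is unknown or already expired.
  resolve(id: string, decision: Decision): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(id);
    entry.resolve(decision);
    return true;
  }
}

// Run the command only on an explicit "allow"; deny and timeout both skip it.
async function gatedExec(
  manager: MiniApprovalManager,
  id: string,
  command: string,
  timeoutMs: number,
  run: (command: string) => string,
): Promise<{ ran: boolean; output?: string }> {
  const decision = await manager.waitForDecision(id, timeoutMs);
  if (decision !== "allow") return { ran: false };
  return { ran: true, output: run(command) };
}
```

Because anything other than an explicit approval skips execution, an unanswered notification fails closed rather than open.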
6. Media Handling: Cross-Platform Normalization
Each messaging platform has different media limits (WhatsApp: 16MB, Telegram: 50MB), format support, and URL handling. The media store in src/media/store.ts normalizes this:
const MAX_BYTES = 5 * 1024 * 1024; // 5MB default
const DEFAULT_TTL_MS = 2 * 60 * 1000; // 2 minutes
/**
* Sanitize filename for cross-platform safety.
* Removes chars unsafe on Windows/SharePoint/all platforms.
*/
function sanitizeFilename(name: string): string {
  const unsafe = /[<>:"/\\|?*\x00-\x1f]/g;
  return name
    .trim()
    .replace(unsafe, "_")
    .replace(/\s+/g, "_")
    .replace(/_+/g, "_")
    .replace(/^_|_$/g, "")
    .slice(0, 60);
}
/**
 * Extract original filename from embedded UUID pattern.
 * {original}---{uuid}.{ext} → {original}.{ext}
 */
export function extractOriginalFilename(filePath: string): string {
  const basename = path.basename(filePath);
  if (!basename) return "file.bin";
  const ext = path.extname(basename);
  const nameWithoutExt = path.basename(basename, ext);
  const match = nameWithoutExt.match(
    /^(.+)---[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$/i
  );
  return match?.[1] ? `${match[1]}${ext}` : basename;
}
// Auto-cleanup expired media
export async function cleanOldMedia(ttlMs = DEFAULT_TTL_MS) {
  const mediaDir = await ensureMediaDir();
  const entries = await fs.readdir(mediaDir).catch(() => []);
  const now = Date.now();
  await Promise.all(entries.map(async (file) => {
    const full = path.join(mediaDir, file);
    const stat = await fs.stat(full).catch(() => null);
    if (stat && now - stat.mtimeMs > ttlMs) {
      await fs.rm(full).catch(() => {});
    }
  }));
}
Temporary files are garbage-collected automatically. The UUID-embedding pattern preserves original filenames while ensuring uniqueness. And the sanitization handles the intersection of what Windows, macOS, Linux, and various cloud services consider safe.
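A quick round trip shows how the two filename helpers fit together. The block restates them so it runs on its own under Node.js; the one deliberate difference is that the unsafe-character class here includes the backslash, which is an assumption based on the "unsafe on Windows" comment. The sample paths and the UUID are made up for the demo.

```typescript
import * as path from "node:path";

// Restated from the excerpt; the backslash in the class is an assumption.
function sanitizeFilename(name: string): string {
  const unsafe = /[<>:"\/\\|?*\x00-\x1f]/g;
  return name
    .trim()
    .replace(unsafe, "_") // unsafe characters become underscores
    .replace(/\s+/g, "_") // whitespace runs become underscores
    .replace(/_+/g, "_") // collapse repeated underscores
    .replace(/^_|_$/g, "") // strip a leading or trailing underscore
    .slice(0, 60);
}

// Recover {original}.{ext} from {original}---{uuid}.{ext}.
function extractOriginalFilename(filePath: string): string {
  const basename = path.basename(filePath);
  if (!basename) return "file.bin";
  const ext = path.extname(basename);
  const nameWithoutExt = path.basename(basename, ext);
  const match = nameWithoutExt.match(
    /^(.+)---[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$/i,
  );
  return match?.[1] ? `${match[1]}${ext}` : basename;
}

const safeName = sanitizeFilename("  my report: final?  ");
console.log(safeName); // my_report_final

// Made-up stored path using the {original}---{uuid}.{ext} convention.
const stored = `/tmp/media/${safeName}---123e4567-e89b-12d3-a456-426614174000.pdf`;
console.log(extractOriginalFilename(stored)); // my_report_final.pdf
console.log(extractOriginalFilename("/tmp/media/photo.jpg")); // photo.jpg
```

Names without the embedded UUID pass through unchanged, so the extraction is safe to apply to any file in the media directory.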
The Design Philosophy
Reading through Clawdbot's architecture, a consistent philosophy emerges:
- Local-first, not local-only. Data lives under ~/.clawdbot/, but the gateway can expose itself via Tailscale or mDNS when you want remote access.
- Protocols are plugins. WhatsApp, Matrix, Nostr: they're all just implementations of ChannelPlugin. The gateway doesn't privilege any platform.
- Isolation by construction. Lanes prevent work categories from interfering. Session keys prevent conversations from bleeding. Approval gating prevents agents from acting unilaterally.
- Human authority is preserved. The AI does the work; the human approves the consequences. This isn't a limitation; it's the point.
For developers building AI assistants, Clawdbot offers an alternative to the cloud-tenant model. Your infrastructure, your protocols, your rules—with engineering rigor that takes these constraints seriously.
Casey has a great essay on what this architecture means beyond the code. For the deeper philosophical take, see The Sovereign Agent on Texxr.
