feat(discord): Discord transport

Shadow
2025-12-15 10:11:18 -06:00
committed by Peter Steinberger
parent 557f8e5a04
commit ac659ff5a7
44 changed files with 1352 additions and 56 deletions

@@ -8,14 +8,14 @@ read_when:
Last updated: 2025-12-09
## Overview
- A single long-lived **Gateway** process owns all messaging surfaces (WhatsApp via Baileys, Telegram when enabled) and the control/event plane.
- A single long-lived **Gateway** process owns all messaging surfaces (WhatsApp via Baileys, Telegram via grammY, Discord via discord.js) and the control/event plane.
- All clients (macOS app, CLI, web UI, automations) connect to the Gateway over one transport: **WebSocket on 127.0.0.1:18789** (tunnel or VPN for remote).
- One Gateway per host; it is the only place that is allowed to open a WhatsApp session. All sends/agent runs go through it.
- By default, the Gateway exposes a Canvas host on `canvasHost.port` (default `18793`), serving `~/clawd/canvas` at `/__clawdis__/canvas/` with live-reload; disable via `canvasHost.enabled=false` or `CLAWDIS_SKIP_CANVAS_HOST=1`.
## Components and flows
- **Gateway (daemon)**
  - Maintains Baileys/Telegram connections.
  - Maintains Baileys/Telegram/Discord connections.
  - Exposes a typed WS API (req/resp + server push events).
  - Validates every inbound frame against JSON Schema; rejects anything before a mandatory `connect`.
- **Clients (mac app / CLI / web admin)**

@@ -7,14 +7,14 @@ read_when:
<!-- {% raw %} -->
# Building a personal assistant with CLAWDIS (Clawd-style)
CLAWDIS is a WhatsApp + Telegram gateway for **Pi** agents. This guide is the “personal assistant” setup: one dedicated WhatsApp number that behaves like your always-on agent.
CLAWDIS is a WhatsApp + Telegram + Discord gateway for **Pi** agents. This guide is the “personal assistant” setup: one dedicated WhatsApp number that behaves like your always-on agent.
## ⚠️ Safety first
You're putting an agent in a position to:
- run commands on your machine (depending on your Pi tool setup)
- read/write files in your workspace
- send messages back out via WhatsApp/Telegram
- send messages back out via WhatsApp/Telegram/Discord
Start conservative:
- Always set `routing.allowFrom` (never run open-to-the-world on your personal Mac).

@@ -85,6 +85,23 @@ Group messages default to **require mention** (either metadata mention or regex
}
```
### `discord` (bot transport)
Configure the Discord bot by setting the bot token and optional gating:
```json5
{
  discord: {
    token: "your-bot-token",
    allowFrom: ["discord:1234567890", "*"], // optional DM allowlist (user ids); "*" allows every DM sender
    requireMention: true, // require @bot mentions in guilds
    mediaMaxMb: 8 // clamp inbound media size (MB)
  }
}
```
Clawdis reads `DISCORD_BOT_TOKEN` or `discord.token` to start the provider. Use `user:<id>` (DM) or `channel:<id>` (guild channel) when specifying delivery targets for cron/CLI commands.
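The two delivery-target forms can be told apart with a small parser. The following is a minimal sketch, not the Clawdis implementation: `DiscordTarget` and `parseDiscordTarget` are hypothetical names, and rejecting anything other than `user:<id>` or `channel:<id>` is an assumption.

```typescript
// Hypothetical helper (not taken from the Clawdis source).
// Splits a delivery target such as "user:123" or "channel:456".
type DiscordTarget = { kind: "user" | "channel"; id: string };

function parseDiscordTarget(raw: string): DiscordTarget {
  // Assumption: ids are numeric Discord snowflakes.
  const match = /^(user|channel):(\d+)$/.exec(raw.trim());
  if (!match) {
    throw new Error(`Unrecognized Discord target: ${raw}`);
  }
  const kind = match[1] === "user" ? "user" : "channel";
  return { kind, id: match[2] ?? "" };
}
```

A `user` result would map to a DM and a `channel` result to a guild channel send.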
### `agent.workspace`
Sets the **single global workspace directory** used by the agent for file operations.
@@ -152,7 +169,7 @@ deprecation fallback.
- `every`: duration string (`ms`, `s`, `m`, `h`); default unit minutes. Omit or set
`0m` to disable.
- `model`: optional override model for heartbeat runs (`provider/model`).
- `target`: optional delivery channel (`last`, `whatsapp`, `telegram`, `none`). Default: `last`.
- `target`: optional delivery channel (`last`, `whatsapp`, `telegram`, `discord`, `none`). Default: `last`.
- `to`: optional recipient override (E.164 for WhatsApp, chat id for Telegram).
- `prompt`: optional override for the heartbeat body (default: `HEARTBEAT`).
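As an illustration of the `every` semantics above (unit suffixes, minutes as the default unit, and `0m` or omission meaning disabled), here is a minimal sketch. `parseEveryMs` is a hypothetical name, and accepting only whole numbers is an assumption; the real parser may be more permissive.

```typescript
// Hypothetical helper (not taken from the Clawdis source).
const UNIT_MS: Record<string, number> = { ms: 1, s: 1_000, m: 60_000, h: 3_600_000 };

// Returns the interval in milliseconds, or null when the heartbeat is disabled.
function parseEveryMs(every?: string): number | null {
  if (every === undefined) return null; // omitted: disabled
  const match = /^(\d+)\s*(ms|s|m|h)?$/.exec(every.trim());
  if (!match) throw new Error(`Invalid duration: ${every}`);
  const value = Number(match[1]);
  const unit = match[2] ?? "m"; // bare numbers are read as minutes
  const ms = value * (UNIT_MS[unit] ?? 60_000);
  return ms > 0 ? ms : null; // "0m" (or any zero) disables
}
```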
@@ -510,7 +527,7 @@ Template placeholders are expanded in `routing.transcribeAudio.command` (and any
| `{{GroupMembers}}` | Group members preview (best effort) |
| `{{SenderName}}` | Sender display name (best effort) |
| `{{SenderE164}}` | Sender phone number (best effort) |
| `{{Surface}}` | Surface hint (whatsapp|telegram|webchat|…) |
| `{{Surface}}` | Surface hint (whatsapp|telegram|discord|webchat|…) |
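
The placeholder expansion described by this table can be pictured as a simple string substitution. This is a sketch under assumptions: `expandTemplate` is a hypothetical name, and replacing unknown or missing placeholders with an empty string is a guess at the behavior, not something the docs state.

```typescript
// Hypothetical helper (not taken from the Clawdis source).
// Replaces {{Name}} tokens with values from the message context.
function expandTemplate(
  template: string,
  ctx: Record<string, string | undefined>
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match, name: string) => {
    // Assumption: missing values expand to an empty string.
    return ctx[name] ?? "";
  });
}
```

With a Discord turn, `expandTemplate("[{{Surface}}] {{SenderName}}", { Surface: "discord", SenderName: "Ada" })` yields `[discord] Ada`.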
## Cron (Gateway scheduler)

@@ -264,7 +264,7 @@ Add a `cron` command group (all commands should also support `--json` where sens
- `--wake now|next-heartbeat`
- payload flags (choose one):
  - `--system-event "<text>"`
  - `--message "<agent message>" [--deliver] [--channel last|whatsapp|telegram] [--to <dest>]`
  - `--message "<agent message>" [--deliver] [--channel last|whatsapp|telegram|discord] [--to <dest>]`
- `clawdis cron edit <id> ...` (patch-by-flags, non-interactive)
- `clawdis cron rm <id>`

docs/discord.md (new file, 54 lines added)

@@ -0,0 +1,54 @@
---
summary: "Discord bot support status, capabilities, and configuration"
read_when:
- Working on Discord surface features
---
# Discord (Bot API)
Updated: 2025-12-07
Status: ready for DM and guild text channels via the official Discord bot gateway.
## Goals
- Talk to Clawdis via Discord DMs or guild channels.
- Share the same `main` session used by WhatsApp/Telegram/WebChat; guild channels stay isolated as `group:<channelId>`.
- Keep routing deterministic: replies always go back to the surface they arrived on.
## How it works
1. Create a Discord application → Bot, enable the intents you need (DMs + guild messages + message content), and grab the bot token.
2. Invite the bot to your server with the permissions required to read/send messages where you want to use it.
3. Configure Clawdis with `DISCORD_BOT_TOKEN` (or `discord.token` in `~/.clawdis/clawdis.json`).
4. Run the gateway; it auto-starts the Discord provider when the token is set.
5. Direct chats: use `user:<id>` (or a `<@id>` mention) when delivering; all turns land in the shared `main` session.
6. Guild channels: use `channel:<channelId>` for delivery. Mentions are required by default; disable with `discord.requireMention = false`.
7. Optional DM allowlist: reuse `discord.allowFrom` with user ids (`1234567890` or `discord:1234567890`). Use `"*"` to allow all DMs.
Note: Discord does not provide a simple username → id lookup without extra guild context, so prefer ids or `<@id>` mentions for DM delivery targets.
## Capabilities & limits
- DMs and guild text channels (threads are treated as separate channels; voice not supported).
- Typing indicators are sent best-effort; message chunking honors Discord's 2,000-character limit.
- File uploads supported up to the configured `discord.mediaMaxMb` (default 8 MB).
- Mention-gated guild replies by default to avoid noisy bots.
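The chunking behavior mentioned above can be sketched as follows. This is an illustrative approach, not the code from this commit: `chunkMessage` is a hypothetical name, and preferring newline boundaries is an assumption about what makes a readable split.

```typescript
// Hypothetical helper (not taken from the Clawdis source).
const DISCORD_MAX_CHARS = 2000;

// Splits text into pieces no longer than `limit`, breaking at the last
// newline inside each window when one exists, otherwise hard-splitting.
// Note: lengths are UTF-16 code units, so a hard split can land inside
// a surrogate pair; a production version would guard against that.
function chunkMessage(text: string, limit: number = DISCORD_MAX_CHARS): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const head = rest.slice(0, limit);
    const newlineAt = head.lastIndexOf("\n");
    if (newlineAt > 0) {
      chunks.push(rest.slice(0, newlineAt));
      rest = rest.slice(newlineAt + 1); // drop the newline used as the break
    } else {
      chunks.push(head);
      rest = rest.slice(limit);
    }
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Each returned chunk would then be sent as its own Discord message, in order.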
## Config
```json5
{
  discord: {
    token: "abc.123",
    allowFrom: ["123456789012345678"],
    requireMention: true,
    mediaMaxMb: 8
  }
}
```
- `allowFrom`: DM allowlist (user ids). Omit or set to `["*"]` to allow any DM sender.
- `requireMention`: when `true`, messages in guild channels must mention the bot.
- `mediaMaxMb`: clamp inbound media saved to disk.
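How `allowFrom` and `requireMention` might combine into a single gate is sketched below. All names here (`DiscordGateConfig`, `InboundMessage`, `shouldHandle`) are hypothetical, and treating an empty `allowFrom` array as "allow nobody" is an assumption that the docs do not settle.

```typescript
// Hypothetical types and helper (not taken from the Clawdis source).
interface DiscordGateConfig {
  allowFrom?: string[]; // DM allowlist; omitted or containing "*" allows any sender
  requireMention?: boolean; // guild channels; assumed to default to true
}

interface InboundMessage {
  authorId: string;
  isDirect: boolean; // true for DMs, false for guild channels
  mentionsBot: boolean;
}

// Accepts both "1234567890" and "discord:1234567890" allowlist entries.
function normalizeUserId(entry: string): string {
  return entry.replace(/^discord:/, "");
}

function shouldHandle(msg: InboundMessage, cfg: DiscordGateConfig): boolean {
  if (msg.isDirect) {
    const allow = cfg.allowFrom;
    if (!allow || allow.indexOf("*") !== -1) return true;
    return allow.map(normalizeUserId).indexOf(msg.authorId) !== -1;
  }
  const requireMention = cfg.requireMention ?? true;
  return requireMention ? msg.mentionsBot : true;
}
```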
## Safety & ops
- Treat the bot token like a password; prefer the `DISCORD_BOT_TOKEN` env var on supervised hosts or lock down the config file permissions.
- Only grant the bot permissions it needs (typically Read/Send Messages).
- If the bot is stuck or rate limited, restart the gateway (`clawdis gateway --force`) after confirming no other processes own the Discord session.

@@ -9,7 +9,7 @@ Short guide to verify the WhatsApp Web / Baileys stack without guessing.
## Quick checks
- `clawdis status` — local summary: whether creds exist, auth age, session store path + recent sessions.
- `clawdis status --deep` — also probes the running Gateway (WA connect + Telegram API).
- `clawdis status --deep` — also probes the running Gateway (WhatsApp connect + Telegram + Discord APIs).
- `clawdis health --json` — asks the running Gateway for a full health snapshot (WS-only; no direct Baileys socket).
- Send `/status` in WhatsApp/WebChat to get a status reply without invoking the agent.
- Logs: tail `/tmp/clawdis/clawdis-*.log` and filter for `web-heartbeat`, `web-reconnect`, `web-auto-reply`, `web-inbound`.

@@ -13,7 +13,7 @@ read_when:
</p>
<p align="center">
<strong>WhatsApp + Telegram gateway for AI agents (Pi).</strong><br>
<strong>WhatsApp + Telegram + Discord gateway for AI agents (Pi).</strong><br>
Send a message, get an agent response — from your pocket.
</p>
@@ -23,13 +23,13 @@ read_when:
<a href="./clawd">Clawd setup</a>
</p>
CLAWDIS bridges WhatsApp (via WhatsApp Web / Baileys) and Telegram (Bot API / grammY) to coding agents like [Pi](https://github.com/badlogic/pi-mono).
CLAWDIS bridges WhatsApp (via WhatsApp Web / Baileys), Telegram (Bot API / grammY), and Discord (Bot API / discord.js) to coding agents like [Pi](https://github.com/badlogic/pi-mono).
It's built for [Clawd](https://clawd.me), a space lobster who needed a TARDIS.
## How it works
```
WhatsApp / Telegram
WhatsApp / Telegram / Discord
┌──────────────────────────┐
@@ -60,6 +60,7 @@ Most operations flow through the **Gateway** (`clawdis gateway`), a single long-
- 📱 **WhatsApp Integration** — Uses Baileys for WhatsApp Web protocol
- ✈️ **Telegram Bot** — DMs + groups via grammY
- 🎮 **Discord Bot** — DMs + guild channels via discord.js
- 🤖 **Agent bridge** — Pi (RPC mode) with tool streaming
- 💬 **Sessions** — Direct chats collapse into shared `main` (default); groups are isolated
- 👥 **Group Chat Support** — Mention-based by default; owner can toggle `/activation always|mention`
@@ -127,6 +128,7 @@ Example:
- [WebChat](./webchat.md)
- [Control UI (browser)](./control-ui.md)
- [Telegram](./telegram.md)
- [Discord](./discord.md)
- [Group messages](./group-messages.md)
- [Media: images](./images.md)
- [Media: audio](./audio.md)

@@ -46,7 +46,7 @@ Hardening:
## Forwarding behavior
- When Voice Wake is enabled, transcripts are forwarded to the active gateway/agent (the same local vs remote mode used by the rest of the mac app).
- Replies are delivered to the **last-used main surface** (WhatsApp/Telegram/WebChat). If delivery fails, the error is logged and the run is still visible via WebChat/session logs.
- Replies are delivered to the **last-used main surface** (WhatsApp/Telegram/Discord/WebChat). If delivery fails, the error is logged and the run is still visible via WebChat/session logs.
## Forwarding payload
- `VoiceWakeForwarder.prefixedTranscript(_:)` prepends the machine hint before sending. Shared between wake-word and push-to-talk paths.

@@ -18,7 +18,7 @@ The macOS menu bar app shows the WebChat UI as a native SwiftUI view and reuses
## How it's wired
- Implementation: `apps/macos/Sources/Clawdis/WebChatSwiftUI.swift` hosts `ClawdisChatUI` and speaks to the Gateway over `GatewayConnection`.
- Data plane: Gateway WebSocket methods `chat.history`, `chat.send`, `chat.abort`; events `chat`, `agent`, `presence`, `tick`, `health`.
- Session: usually primary (`main`). The onboarding flow uses a dedicated `onboarding` session to keep first-run setup separate.
- Session: usually primary (`main`); multiple transports (WhatsApp/Telegram/Discord/Desktop) share the same key. The onboarding flow uses a dedicated `onboarding` session to keep first-run setup separate.
## Security / surface area
- Remote mode forwards only the Gateway WebSocket control port over SSH.

@@ -21,7 +21,7 @@ All session state is **owned by the gateway** (the “master” Clawdis). UI cli
- Clawdis does **not** read legacy Pi/Tau session folders.
## Mapping transports → session keys
- Direct chats (WhatsApp, Telegram, desktop Web Chat) all collapse to the **primary key** so they share context.
- Direct chats (WhatsApp, Telegram, Discord, desktop Web Chat) all collapse to the **primary key** so they share context.
- Multiple phone numbers can map to that same key; they act as transports into the same conversation.
- Group chats still isolate state with `group:<jid>` keys; do not reuse the primary key for groups.

@@ -1,5 +1,5 @@
---
summary: "Routing rules per surface (WhatsApp, Telegram, web) and shared context"
summary: "Routing rules per surface (WhatsApp, Telegram, Discord, web) and shared context"
read_when:
- Changing surface routing or inbox behavior
---
@@ -9,12 +9,12 @@ Updated: 2025-12-07
Goal: make replies deterministic per channel while keeping one shared context for direct chats.
- **Surfaces** (channel labels): `whatsapp`, `webchat`, `telegram`, `voice`, etc. Add `Surface` to inbound `MsgContext` so templates/agents can log which channel a turn came from. Routing is fixed: replies go back to the origin surface; the model doesn't choose.
- **Surfaces** (channel labels): `whatsapp`, `webchat`, `telegram`, `discord`, `voice`, etc. Add `Surface` to inbound `MsgContext` so templates/agents can log which channel a turn came from. Routing is fixed: replies go back to the origin surface; the model doesn't choose.
- **Reply context:** inbound replies include `ReplyToId`, `ReplyToBody`, and `ReplyToSender`, and the quoted context is appended to `Body` as a `[Replying to ...]` block.
- **Canonical direct session:** All direct chats collapse into the single `main` session by default (no config needed). Groups stay `group:<jid>`, so they remain isolated.
- **Session store:** Keys are resolved via `resolveSessionKey(scope, ctx, mainKey)`; the agent JSONL path lives under `~/.clawdis/sessions/<SessionId>.jsonl`.
- **WebChat:** Always attaches to `main`, loads the full session transcript so desktop reflects cross-surface history, and writes new turns back to the same session.
- **Implementation hints:**
  - Set `Surface` in each ingress (WhatsApp gateway, WebChat bridge, future Telegram).
  - Set `Surface` in each ingress (WhatsApp gateway, WebChat bridge, Telegram, Discord).
  - Keep routing deterministic: originate → same surface. Use the gateway WebSocket for sends; avoid side channels.
  - Do not let the agent emit “send to X” decisions; keep that policy in the host code.
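The routing rules above (direct chats share `main`, groups stay isolated, replies return to the origin surface) can be sketched roughly as follows. This is not the real `resolveSessionKey`: `sketchSessionKey`, `replyRoute`, and the `InboundTurn` shape are hypothetical and exist only to illustrate the policy.

```typescript
// Hypothetical types and helpers (not taken from the Clawdis source).
type Surface = "whatsapp" | "telegram" | "discord" | "webchat" | "voice";

interface InboundTurn {
  surface: Surface;
  chatId: string; // JID, Telegram chat id, or Discord channel id
  isGroup: boolean;
}

// Direct chats collapse into the shared main key; groups get their own key.
function sketchSessionKey(turn: InboundTurn, mainKey: string = "main"): string {
  return turn.isGroup ? `group:${turn.chatId}` : mainKey;
}

// The reply route is derived from the inbound turn only, so the model
// never chooses where a reply is delivered.
function replyRoute(turn: InboundTurn): { surface: Surface; to: string } {
  return { surface: turn.surface, to: turn.chatId };
}
```

Under this sketch, a Discord DM and a WhatsApp DM both resolve to `main`, while a guild channel with id `42` resolves to `group:42`.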

@@ -83,7 +83,7 @@ Or use the `process` tool to background long commands.
```bash
# Check local status (creds, sessions, queued events)
clawdis status
# Probe the running gateway + providers (WA connect + Telegram API)
# Probe the running gateway + providers (WA connect + Telegram + Discord APIs)
clawdis status --deep
# View recent connection events