--- summary: "Export OpenClaw diagnostics to any OpenTelemetry collector via the diagnostics-otel plugin (OTLP/HTTP)" title: "OpenTelemetry export" read_when: - You want to send OpenClaw model usage, message flow, or session metrics to an OpenTelemetry collector - You are wiring traces, metrics, or logs into Grafana, Datadog, Honeycomb, New Relic, Tempo, or another OTLP backend - You need the exact metric names, span names, or attribute shapes to build dashboards or alerts --- OpenClaw exports diagnostics through the bundled `diagnostics-otel` plugin using **OTLP/HTTP (protobuf)**. Any collector or backend that accepts OTLP/HTTP works without code changes. For local file logs and how to read them, see [Logging](/logging). ## How it fits together - **Diagnostics events** are structured, in-process records emitted by the Gateway and bundled plugins for model runs, message flow, sessions, queues, and exec. - **`diagnostics-otel` plugin** subscribes to those events and exports them as OpenTelemetry **metrics**, **traces**, and **logs** over OTLP/HTTP. - **Provider calls** receive a W3C `traceparent` header from OpenClaw's trusted model-call span context when the provider transport accepts custom headers. Plugin-emitted trace context is not propagated. - Exporters only attach when both the diagnostics surface and the plugin are enabled, so the in-process cost stays near zero by default. ## Quick start ```json5 { plugins: { allow: ["diagnostics-otel"], entries: { "diagnostics-otel": { enabled: true }, }, }, diagnostics: { enabled: true, otel: { enabled: true, endpoint: "http://otel-collector:4318", protocol: "http/protobuf", serviceName: "openclaw-gateway", traces: true, metrics: true, logs: true, sampleRate: 0.2, flushIntervalMs: 60000, }, }, } ``` You can also enable the plugin from the CLI: ```bash openclaw plugins enable diagnostics-otel ``` `protocol` currently supports `http/protobuf` only. `grpc` is ignored. ## Signals exported | Signal | What goes in it | | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------ | | **Metrics** | Counters and histograms for token usage, cost, run duration, message flow, queue lanes, session state, exec, and memory pressure. | | **Traces** | Spans for model usage, model calls, harness lifecycle, tool execution, exec, webhook/message processing, context assembly, and tool loops. | | **Logs** | Structured `logging.file` records exported over OTLP when `diagnostics.otel.logs` is enabled. | Toggle `traces`, `metrics`, and `logs` independently. All three default to on when `diagnostics.otel.enabled` is true. ## Configuration reference ```json5 { diagnostics: { enabled: true, otel: { enabled: true, endpoint: "http://otel-collector:4318", tracesEndpoint: "http://otel-collector:4318/v1/traces", metricsEndpoint: "http://otel-collector:4318/v1/metrics", logsEndpoint: "http://otel-collector:4318/v1/logs", protocol: "http/protobuf", // grpc is ignored serviceName: "openclaw-gateway", headers: { "x-collector-token": "..." }, traces: true, metrics: true, logs: true, sampleRate: 0.2, // root-span sampler, 0.0..1.0 flushIntervalMs: 60000, // metric export interval (min 1000ms) captureContent: { enabled: false, inputMessages: false, outputMessages: false, toolInputs: false, toolOutputs: false, systemPrompt: false, }, }, }, } ``` ### Environment variables | Variable | Purpose | | ----------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `OTEL_EXPORTER_OTLP_ENDPOINT` | Override `diagnostics.otel.endpoint`. If the value already contains `/v1/traces`, `/v1/metrics`, or `/v1/logs`, it is used as-is. | | `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` / `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` / `OTEL_EXPORTER_OTLP_LOGS_ENDPOINT` | Signal-specific endpoint overrides used when the matching `diagnostics.otel.*Endpoint` config key is unset. Signal-specific config wins over signal-specific env, which wins over the shared endpoint. | | `OTEL_SERVICE_NAME` | Override `diagnostics.otel.serviceName`. | | `OTEL_EXPORTER_OTLP_PROTOCOL` | Override the wire protocol (only `http/protobuf` is honored today). | | `OTEL_SEMCONV_STABILITY_OPT_IN` | Set to `gen_ai_latest_experimental` to emit the latest experimental GenAI span attribute (`gen_ai.provider.name`) instead of the legacy `gen_ai.system`. GenAI metrics always use bounded, low-cardinality semantic attributes regardless. | | `OPENCLAW_OTEL_PRELOADED` | Set to `1` when another preload or host process already registered the global OpenTelemetry SDK. The plugin then skips its own NodeSDK lifecycle but still wires diagnostic listeners and honors `traces`/`metrics`/`logs`. | ## Privacy and content capture Raw model/tool content is **not** exported by default. Spans carry bounded identifiers (channel, provider, model, error category, hash-only request ids) and never include prompt text, response text, tool inputs, tool outputs, or session keys. Outbound model requests may include a W3C `traceparent` header. That header is generated only from OpenClaw-owned diagnostic trace context for the active model call. Existing caller-supplied `traceparent` headers are replaced, so plugins or custom provider options cannot spoof cross-service trace ancestry. Set `diagnostics.otel.captureContent.*` to `true` only when your collector and retention policy are approved for prompt, response, tool, or system-prompt text. Each subkey is opt-in independently: - `inputMessages` — user prompt content. - `outputMessages` — model response content. - `toolInputs` — tool argument payloads. - `toolOutputs` — tool result payloads. - `systemPrompt` — assembled system/developer prompt. When any subkey is enabled, model and tool spans get bounded, redacted `openclaw.content.*` attributes for that class only. ## Sampling and flushing - **Traces:** `diagnostics.otel.sampleRate` (root-span only, `0.0` drops all, `1.0` keeps all). - **Metrics:** `diagnostics.otel.flushIntervalMs` (minimum `1000`). - **Logs:** OTLP logs respect `logging.level` (file log level). They use the diagnostic log-record redaction path, not console formatting. High-volume installs should prefer OTLP collector sampling/filtering over local sampling. - **File-log correlation:** JSONL file logs include top-level `traceId`, `spanId`, `parentSpanId`, and `traceFlags` when the log call carries a valid diagnostic trace context, which lets log processors join local log lines with exported spans. - **Request correlation:** Gateway HTTP requests and WebSocket frames create an internal request trace scope. Logs and diagnostic events inside that scope inherit the request trace by default, while agent run and model-call spans are created as children so provider `traceparent` headers stay on the same trace. ## Exported metrics ### Model usage - `openclaw.tokens` (counter, attrs: `openclaw.token`, `openclaw.channel`, `openclaw.provider`, `openclaw.model`, `openclaw.agent`) - `openclaw.cost.usd` (counter, attrs: `openclaw.channel`, `openclaw.provider`, `openclaw.model`) - `openclaw.run.duration_ms` (histogram, attrs: `openclaw.channel`, `openclaw.provider`, `openclaw.model`) - `openclaw.context.tokens` (histogram, attrs: `openclaw.context`, `openclaw.channel`, `openclaw.provider`, `openclaw.model`) - `gen_ai.client.token.usage` (histogram, GenAI semantic-conventions metric, attrs: `gen_ai.token.type` = `input`/`output`, `gen_ai.provider.name`, `gen_ai.operation.name`, `gen_ai.request.model`) - `gen_ai.client.operation.duration` (histogram, seconds, GenAI semantic-conventions metric, attrs: `gen_ai.provider.name`, `gen_ai.operation.name`, `gen_ai.request.model`, optional `error.type`) - `openclaw.model_call.duration_ms` (histogram, attrs: `openclaw.provider`, `openclaw.model`, `openclaw.api`, `openclaw.transport`, plus `openclaw.errorCategory` and `openclaw.failureKind` on classified errors) - `openclaw.model_call.request_bytes` (histogram, UTF-8 byte size of the final model request payload; no raw payload content) - `openclaw.model_call.response_bytes` (histogram, UTF-8 byte size of streamed model response events; no raw response content) - `openclaw.model_call.time_to_first_byte_ms` (histogram, elapsed time before the first streamed response event) ### Message flow - `openclaw.webhook.received` (counter, attrs: `openclaw.channel`, `openclaw.webhook`) - `openclaw.webhook.error` (counter, attrs: `openclaw.channel`, `openclaw.webhook`) - `openclaw.webhook.duration_ms` (histogram, attrs: `openclaw.channel`, `openclaw.webhook`) - `openclaw.message.queued` (counter, attrs: `openclaw.channel`, `openclaw.source`) - `openclaw.message.processed` (counter, attrs: `openclaw.channel`, `openclaw.outcome`) - `openclaw.message.duration_ms` (histogram, attrs: `openclaw.channel`, `openclaw.outcome`) - `openclaw.message.delivery.started` (counter, attrs: `openclaw.channel`, `openclaw.delivery.kind`) - `openclaw.message.delivery.duration_ms` (histogram, attrs: `openclaw.channel`, `openclaw.delivery.kind`, `openclaw.outcome`, `openclaw.errorCategory`) ### Queues and sessions - `openclaw.queue.lane.enqueue` (counter, attrs: `openclaw.lane`) - `openclaw.queue.lane.dequeue` (counter, attrs: `openclaw.lane`) - `openclaw.queue.depth` (histogram, attrs: `openclaw.lane` or `openclaw.channel=heartbeat`) - `openclaw.queue.wait_ms` (histogram, attrs: `openclaw.lane`) - `openclaw.session.state` (counter, attrs: `openclaw.state`, `openclaw.reason`) - `openclaw.session.stuck` (counter, attrs: `openclaw.state`) - `openclaw.session.stuck_age_ms` (histogram, attrs: `openclaw.state`) - `openclaw.run.attempt` (counter, attrs: `openclaw.attempt`) ### Harness lifecycle - `openclaw.harness.duration_ms` (histogram, attrs: `openclaw.harness.id`, `openclaw.harness.plugin`, `openclaw.outcome`, `openclaw.harness.phase` on errors) ### Exec - `openclaw.exec.duration_ms` (histogram, attrs: `openclaw.exec.target`, `openclaw.exec.mode`, `openclaw.outcome`, `openclaw.failureKind`) ### Diagnostics internals (memory and tool loop) - `openclaw.memory.heap_used_bytes` (histogram, attrs: `openclaw.memory.kind`) - `openclaw.memory.rss_bytes` (histogram) - `openclaw.memory.pressure` (counter, attrs: `openclaw.memory.level`) - `openclaw.tool.loop.iterations` (counter, attrs: `openclaw.toolName`, `openclaw.outcome`) - `openclaw.tool.loop.duration_ms` (histogram, attrs: `openclaw.toolName`, `openclaw.outcome`) ## Exported spans - `openclaw.model.usage` - `openclaw.channel`, `openclaw.provider`, `openclaw.model` - `openclaw.tokens.*` (input/output/cache_read/cache_write/total) - `gen_ai.system` by default, or `gen_ai.provider.name` when the latest GenAI semantic conventions are opted in - `gen_ai.request.model`, `gen_ai.operation.name`, `gen_ai.usage.*` - `openclaw.run` - `openclaw.outcome`, `openclaw.channel`, `openclaw.provider`, `openclaw.model`, `openclaw.errorCategory` - `openclaw.model.call` - `gen_ai.system` by default, or `gen_ai.provider.name` when the latest GenAI semantic conventions are opted in - `gen_ai.request.model`, `gen_ai.operation.name`, `openclaw.provider`, `openclaw.model`, `openclaw.api`, `openclaw.transport` - `openclaw.errorCategory` and optional `openclaw.failureKind` on errors - `openclaw.model_call.request_bytes`, `openclaw.model_call.response_bytes`, `openclaw.model_call.time_to_first_byte_ms` - `openclaw.provider.request_id_hash` (bounded SHA-based hash of the upstream provider request id; raw ids are not exported) - `openclaw.harness.run` - `openclaw.harness.id`, `openclaw.harness.plugin`, `openclaw.outcome`, `openclaw.provider`, `openclaw.model`, `openclaw.channel` - On completion: `openclaw.harness.result_classification`, `openclaw.harness.yield_detected`, `openclaw.harness.items.started`, `openclaw.harness.items.completed`, `openclaw.harness.items.active` - On error: `openclaw.harness.phase`, `openclaw.errorCategory`, optional `openclaw.harness.cleanup_failed` - `openclaw.tool.execution` - `gen_ai.tool.name`, `openclaw.toolName`, `openclaw.errorCategory`, `openclaw.tool.params.*` - `openclaw.exec` - `openclaw.exec.target`, `openclaw.exec.mode`, `openclaw.outcome`, `openclaw.failureKind`, `openclaw.exec.command_length`, `openclaw.exec.exit_code`, `openclaw.exec.timed_out` - `openclaw.webhook.processed` - `openclaw.channel`, `openclaw.webhook`, `openclaw.chatId` - `openclaw.webhook.error` - `openclaw.channel`, `openclaw.webhook`, `openclaw.chatId`, `openclaw.error` - `openclaw.message.processed` - `openclaw.channel`, `openclaw.outcome`, `openclaw.chatId`, `openclaw.messageId`, `openclaw.reason` - `openclaw.message.delivery` - `openclaw.channel`, `openclaw.delivery.kind`, `openclaw.outcome`, `openclaw.errorCategory`, `openclaw.delivery.result_count` - `openclaw.session.stuck` - `openclaw.state`, `openclaw.ageMs`, `openclaw.queueDepth` - `openclaw.context.assembled` - `openclaw.prompt.size`, `openclaw.history.size`, `openclaw.context.tokens`, `openclaw.errorCategory` (no prompt, history, response, or session-key content) - `openclaw.tool.loop` - `openclaw.toolName`, `openclaw.outcome`, `openclaw.iterations`, `openclaw.errorCategory` (no loop messages, params, or tool output) - `openclaw.memory.pressure` - `openclaw.memory.level`, `openclaw.memory.heap_used_bytes`, `openclaw.memory.rss_bytes` When content capture is explicitly enabled, model and tool spans can also include bounded, redacted `openclaw.content.*` attributes for the specific content classes you opted into. ## Diagnostic event catalog The events below back the metrics and spans above. Plugins can also subscribe to them directly without OTLP export. **Model usage** - `model.usage` — tokens, cost, duration, context, provider/model/channel, session ids. `usage` is provider/turn accounting for cost and telemetry; `context.used` is the current prompt/context snapshot and can be lower than provider `usage.total` when cached input or tool-loop calls are involved. **Message flow** - `webhook.received` / `webhook.processed` / `webhook.error` - `message.queued` / `message.processed` - `message.delivery.started` / `message.delivery.completed` / `message.delivery.error` **Queue and session** - `queue.lane.enqueue` / `queue.lane.dequeue` - `session.state` / `session.stuck` - `run.attempt` - `diagnostic.heartbeat` (aggregate counters: webhooks/queue/session) **Harness lifecycle** - `harness.run.started` / `harness.run.completed` / `harness.run.error` — per-run lifecycle for the agent harness. Includes `harnessId`, optional `pluginId`, provider/model/channel, and run id. Completion adds `durationMs`, `outcome`, optional `resultClassification`, `yieldDetected`, and `itemLifecycle` counts. Errors add `phase` (`prepare`/`start`/`send`/`resolve`/`cleanup`), `errorCategory`, and optional `cleanupFailed`. **Exec** - `exec.process.completed` — terminal outcome, duration, target, mode, exit code, and failure kind. Command text and working directories are not included. ## Without an exporter You can keep diagnostics events available to plugins or custom sinks without running `diagnostics-otel`: ```json5 { diagnostics: { enabled: true }, } ``` For targeted debug output without raising `logging.level`, use diagnostics flags. Flags are case-insensitive and support wildcards (e.g. `telegram.*` or `*`): ```json5 { diagnostics: { flags: ["telegram.http"] }, } ``` Or as a one-off env override: ```bash OPENCLAW_DIAGNOSTICS=telegram.http,telegram.payload openclaw gateway ``` Flag output goes to the standard log file (`logging.file`) and is still redacted by `logging.redactSensitive`. Full guide: [Diagnostics flags](/diagnostics/flags). ## Disable ```json5 { diagnostics: { otel: { enabled: false } }, } ``` You can also leave `diagnostics-otel` out of `plugins.allow`, or run `openclaw plugins disable diagnostics-otel`. ## Related - [Logging](/logging) — file logs, console output, CLI tailing, and the Control UI Logs tab - [Gateway logging internals](/gateway/logging) — WS log styles, subsystem prefixes, and console capture - [Diagnostics flags](/diagnostics/flags) — targeted debug-log flags - [Diagnostics export](/gateway/diagnostics) — operator support-bundle tool (separate from OTEL export) - [Configuration reference](/gateway/configuration-reference#diagnostics) — full `diagnostics.*` field reference