test(browser): add CDP snapshot Docker smoke

This commit is contained in:
Peter Steinberger
2026-04-26 04:38:11 +01:00
parent 4edf22f63f
commit e40094a9ef
5 changed files with 180 additions and 1 deletions

View File

@@ -13,6 +13,7 @@ Docs: https://docs.openclaw.ai
- TTS/agents: allow `agents.list[].tts` to override global `messages.tts` for per-agent voices while keeping shared provider credentials and preferences in the existing TTS config surface.
- TTS/agents: make `/tts audio`, `/tts status`, and the `tts` agent tool honor the active `agents.list[].tts` voice/provider override.
- Providers/Azure Speech: add Azure Speech as a bundled TTS provider with Speech-resource auth, voice listing, SSML escaping, native Ogg/Opus voice-note output, and telephony output. (#51776) Thanks @leonchui.
- Browser automation: add a CDP-native role snapshot fallback with iframe-aware refs, cursor-clickable detection, target attach preparation, and `openclaw browser doctor --deep` live snapshot probing.
- CLI/image generation: expose generic `--background` on `openclaw infer image generate` and `openclaw infer image edit`, keep `--openai-background` as an OpenAI alias, and let fal image generation honor `--output-format png|jpeg`. Thanks @steipete.
- Browser/config: allow local managed Chrome launch discovery and post-launch CDP readiness timeouts to be raised for slower hosts such as Raspberry Pi. Fixes #66803. Thanks @beat843796.
- Discord: allow `channels.discord.voice.model` to override the LLM used for voice channel responses while keeping STT and TTS on their existing media settings. (#64368) Thanks @mrdavey.
@@ -100,6 +101,7 @@ Docs: https://docs.openclaw.ai
- Diagnostics/OTEL: treat normal early model stream cleanup as a completed model call instead of exporting a misleading `StreamAbandoned` error span. Thanks @vincentkoc.
- Gateway/pairing: stop corrupt or unreadable device/node pairing stores from being treated as empty state, preserving `paired.json` for repair instead of overwriting approved pairings. Fixes #71873. Thanks @iret77.
- ACP: keep `/acp` management commands, plus local `/status` and `/unfocus`, on the Gateway path inside ACP-bound threads so they are not consumed as ACP prompt text. Fixes #66298. Thanks @kindomLee.
- ACPX: stop probing ACP agents during normal Gateway startup; the embedded backend now registers without spawning Codex/ACP child processes unless `OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE=1` is explicitly set.
- CLI/image edit: accept `--size`, `--aspect-ratio`, and `--resolution` on `openclaw infer image edit` and report all supported edit flags from `capability inspect image.edit`. Thanks @Pinghuachiu.
- ACP: wait for the configured runtime backend to become healthy before startup identity reconciliation, avoiding transient acpx warnings during Gateway boot. Fixes #40566.
- Channels/ACP bindings: time out configured binding readiness checks instead of letting Discord preflight hang forever when an ACP target never settles. Fixes #68776.

View File

@@ -603,7 +603,7 @@ These Docker runners split into two buckets:
`OPENCLAW_LIVE_GATEWAY_MODEL_TIMEOUT_MS=90000`. Override those env vars when you
explicitly want the larger exhaustive scan.
- `test:docker:all` builds the live Docker image once via `test:docker:live-build`, then reuses it for the live Docker lanes. It also builds one shared `scripts/e2e/Dockerfile` image via `test:docker:e2e-build` and reuses it for the E2E container smoke runners that exercise the built app. The aggregate uses a weighted local scheduler: `OPENCLAW_DOCKER_ALL_PARALLELISM` controls process slots, while resource caps keep heavy live, npm-install, and multi-service lanes from all starting at once. Defaults are 10 slots, `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=6`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=8`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; tune `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` only when the Docker host has more headroom. The runner performs a Docker preflight by default, removes stale OpenClaw E2E containers, prints status every 30 seconds, stores successful lane timings in `.artifacts/docker-tests/lane-timings.json`, and uses those timings to start longer lanes first on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the weighted lane manifest without building or running Docker.
- Container smoke runners: `test:docker:openwebui`, `test:docker:onboard`, `test:docker:npm-onboard-channel-agent`, `test:docker:session-runtime-context`, `test:docker:agents-delete-shared-workspace`, `test:docker:gateway-network`, `test:docker:mcp-channels`, `test:docker:pi-bundle-mcp-tools`, `test:docker:cron-mcp-cleanup`, `test:docker:plugins`, `test:docker:plugin-update`, and `test:docker:config-reload` boot one or more real containers and verify higher-level integration paths.
- Container smoke runners: `test:docker:openwebui`, `test:docker:onboard`, `test:docker:npm-onboard-channel-agent`, `test:docker:session-runtime-context`, `test:docker:agents-delete-shared-workspace`, `test:docker:gateway-network`, `test:docker:browser-cdp-snapshot`, `test:docker:mcp-channels`, `test:docker:pi-bundle-mcp-tools`, `test:docker:cron-mcp-cleanup`, `test:docker:plugins`, `test:docker:plugin-update`, and `test:docker:config-reload` boot one or more real containers and verify higher-level integration paths.
The live-model Docker runners also bind-mount only the needed CLI auth homes (or all supported ones when the run is not narrowed), then copy them into the container home before the run so external-CLI OAuth can refresh tokens without mutating the host auth store:
@@ -621,6 +621,7 @@ The live-model Docker runners also bind-mount only the needed CLI auth homes (or
- Install Smoke CI skips the duplicate direct-npm global update with `OPENCLAW_INSTALL_SMOKE_SKIP_NPM_GLOBAL=1`; run the script locally without that env when direct `npm install -g` coverage is needed.
- Agents delete shared workspace CLI smoke: `pnpm test:docker:agents-delete-shared-workspace` (script: `scripts/e2e/agents-delete-shared-workspace-docker.sh`) builds the root Dockerfile image by default, seeds two agents with one workspace in an isolated container home, runs `agents delete --json`, and verifies valid JSON plus retained workspace behavior. Reuse the install-smoke image with `OPENCLAW_AGENTS_DELETE_SHARED_WORKSPACE_E2E_IMAGE=openclaw-dockerfile-smoke:local OPENCLAW_AGENTS_DELETE_SHARED_WORKSPACE_E2E_SKIP_BUILD=1`.
- Gateway networking (two containers, WS auth + health): `pnpm test:docker:gateway-network` (script: `scripts/e2e/gateway-network-docker.sh`)
- Browser CDP snapshot smoke: `pnpm test:docker:browser-cdp-snapshot` (script: `scripts/e2e/browser-cdp-snapshot-docker.sh`) builds the source E2E image plus a Chromium layer, starts Chromium with raw CDP, runs `browser doctor --deep`, and verifies CDP role snapshots cover link URLs, cursor-promoted clickables, iframe refs, and frame metadata.
- OpenAI Responses web_search minimal reasoning regression: `pnpm test:docker:openai-web-search-minimal` (script: `scripts/e2e/openai-web-search-minimal-docker.sh`) runs a mocked OpenAI server through Gateway, verifies `web_search` raises `reasoning.effort` from `minimal` to `low`, then forces the provider schema reject and checks the raw detail appears in Gateway logs.
- MCP channel bridge (seeded Gateway + stdio bridge + raw Claude notification-frame smoke): `pnpm test:docker:mcp-channels` (script: `scripts/e2e/mcp-channels-docker.sh`)
- Pi bundle MCP tools (real stdio MCP server + embedded Pi profile allow/deny smoke): `pnpm test:docker:pi-bundle-mcp-tools` (script: `scripts/e2e/pi-bundle-mcp-tools-docker.sh`)

View File

@@ -33,6 +33,7 @@ title: "Tests"
- `pnpm test:e2e`: Runs gateway end-to-end smoke tests (multi-instance WS/HTTP/node pairing). Defaults to `threads` + `isolate: false` with adaptive workers in `vitest.e2e.config.ts`; tune with `OPENCLAW_E2E_WORKERS=<n>` and set `OPENCLAW_E2E_VERBOSE=1` for verbose logs.
- `pnpm test:live`: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip.
- `pnpm test:docker:all`: Builds the shared live-test image and Docker E2E image once, then runs the Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` through a weighted scheduler. `OPENCLAW_DOCKER_ALL_PARALLELISM=<n>` controls process slots and defaults to 10; `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=<n>` controls the provider-sensitive tail pool and defaults to 10. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=9`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=10`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; provider caps default to one heavy lane per provider via `OPENCLAW_DOCKER_ALL_LIVE_CLAUDE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_LIVE_CODEX_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_LIVE_GEMINI_LIMIT=4`. Use `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` for larger hosts. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=<ms>`. The runner preflights Docker by default, cleans stale OpenClaw E2E containers, emits active-lane status every 30 seconds, shares provider CLI tool caches between compatible lanes, retries transient live-provider failures once by default (`OPENCLAW_DOCKER_ALL_LIVE_RETRIES=<n>`), and stores lane timings in `.artifacts/docker-tests/lane-timings.json` for longest-first ordering on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the lane manifest without running Docker, `OPENCLAW_DOCKER_ALL_STATUS_INTERVAL_MS=<ms>` to tune status output, or `OPENCLAW_DOCKER_ALL_TIMINGS=0` to disable timing reuse. Use `OPENCLAW_DOCKER_ALL_LIVE_MODE=skip` for deterministic/local lanes only or `OPENCLAW_DOCKER_ALL_LIVE_MODE=only` for live-provider lanes only; package aliases are `pnpm test:docker:local:all` and `pnpm test:docker:live:all`. Live-only mode merges main and tail live lanes into one longest-first pool so provider buckets can pack Claude, Codex, and Gemini work together. The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute fallback timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`; selected live/tail lanes use tighter per-lane caps. CLI backend Docker setup commands have their own timeout via `OPENCLAW_LIVE_CLI_BACKEND_SETUP_TIMEOUT_SECONDS` (default 180). Per-lane logs are written under `.artifacts/docker-tests/<run-id>/`.
- `pnpm test:docker:browser-cdp-snapshot`: Builds a Chromium-backed source E2E container, starts raw CDP plus an isolated Gateway, runs `browser doctor --deep`, and verifies CDP role snapshots include link URLs, cursor-promoted clickables, iframe refs, and frame metadata.
- CLI backend live Docker probes can be run as focused lanes, for example `pnpm test:docker:live-cli-backend:codex`, `pnpm test:docker:live-cli-backend:codex:resume`, or `pnpm test:docker:live-cli-backend:codex:mcp`. Claude and Gemini have matching `:resume` and `:mcp` aliases.
- `pnpm test:docker:openwebui`: Starts Dockerized OpenClaw + Open WebUI, signs in through Open WebUI, checks `/api/models`, then runs a real proxied chat through `/api/chat/completions`. Requires a usable live model key (for example OpenAI in `~/.profile`), pulls an external Open WebUI image, and is not expected to be CI-stable like the normal unit/e2e suites.
- `pnpm test:docker:mcp-channels`: Starts a seeded Gateway container and a second client container that spawns `openclaw mcp serve`, then verifies routed conversation discovery, transcript reads, attachment metadata, live event queue behavior, outbound send routing, and Claude-style channel + permission notifications over the real stdio bridge. The Claude notification assertion reads the raw stdio MCP frames directly so the smoke reflects what the bridge actually emits.

View File

@@ -1481,6 +1481,7 @@
"test:coverage:changed": "node scripts/run-vitest.mjs run --config test/vitest/vitest.unit.config.ts --coverage --changed origin/main",
"test:docker:agents-delete-shared-workspace": "bash scripts/e2e/agents-delete-shared-workspace-docker.sh",
"test:docker:all": "node scripts/test-docker-all.mjs",
"test:docker:browser-cdp-snapshot": "bash scripts/e2e/browser-cdp-snapshot-docker.sh",
"test:docker:bundled-channel-deps": "bash scripts/e2e/bundled-channel-runtime-deps-docker.sh",
"test:docker:bundled-channel-deps:fast": "OPENCLAW_BUNDLED_CHANNEL_SCENARIOS=0 OPENCLAW_BUNDLED_CHANNEL_UPDATE_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_ROOT_OWNED_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_SETUP_ENTRY_SCENARIO=1 OPENCLAW_BUNDLED_CHANNEL_DISABLED_CONFIG_SCENARIO=1 OPENCLAW_BUNDLED_CHANNEL_LOAD_FAILURE_SCENARIO=1 bash scripts/e2e/bundled-channel-runtime-deps-docker.sh",
"test:docker:cleanup": "bash scripts/test-cleanup-docker.sh",

View File

@@ -0,0 +1,174 @@
#!/usr/bin/env bash
set -euo pipefail
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
source "$ROOT_DIR/scripts/lib/docker-e2e-image.sh"
BASE_IMAGE="$(docker_e2e_resolve_image "openclaw-browser-cdp-base-e2e" OPENCLAW_BROWSER_CDP_BASE_E2E_IMAGE)"
IMAGE_NAME="$(docker_e2e_resolve_image "openclaw-browser-cdp-snapshot-e2e" OPENCLAW_BROWSER_CDP_SNAPSHOT_E2E_IMAGE)"
SKIP_BUILD="${OPENCLAW_BROWSER_CDP_SNAPSHOT_E2E_SKIP_BUILD:-0}"
PORT="18789"
CDP_PORT="19222"
FIXTURE_PORT="18080"
TOKEN="browser-cdp-e2e-token"
CONTAINER_NAME="openclaw-browser-cdp-e2e-$$"
DOCKER_COMMAND_TIMEOUT="${OPENCLAW_BROWSER_CDP_SNAPSHOT_DOCKER_COMMAND_TIMEOUT:-900s}"
docker_cmd() {
timeout "$DOCKER_COMMAND_TIMEOUT" "$@"
}
cleanup() {
docker_cmd docker rm -f "$CONTAINER_NAME" >/dev/null 2>&1 || true
}
trap cleanup EXIT
if [ "${OPENCLAW_SKIP_DOCKER_BUILD:-0}" = "1" ] || [ "$SKIP_BUILD" = "1" ]; then
echo "Reusing Docker image: $IMAGE_NAME"
docker_cmd docker image inspect "$IMAGE_NAME" >/dev/null
else
docker_e2e_build_or_reuse "$BASE_IMAGE" browser-cdp-base "$ROOT_DIR/scripts/e2e/Dockerfile" "$ROOT_DIR" "" "0"
build_dir="$(mktemp -d "${TMPDIR:-/tmp}/openclaw-browser-cdp-build.XXXXXX")"
trap 'cleanup; rm -rf "$build_dir"' EXIT
cat >"$build_dir/Dockerfile" <<EOF
FROM $BASE_IMAGE
USER root
RUN apt-get update \\
&& apt-get install -y --no-install-recommends chromium fonts-liberation procps \\
&& rm -rf /var/lib/apt/lists/*
USER appuser
EOF
echo "Building Docker image: $IMAGE_NAME"
run_logged browser-cdp-snapshot-build docker build -t "$IMAGE_NAME" -f "$build_dir/Dockerfile" "$build_dir"
fi
echo "Starting browser CDP snapshot container..."
docker_cmd docker run -d \
--name "$CONTAINER_NAME" \
-e COREPACK_ENABLE_DOWNLOAD_PROMPT=0 \
-e OPENCLAW_GATEWAY_TOKEN="$TOKEN" \
-e OPENCLAW_DISABLE_BONJOUR=1 \
-e OPENCLAW_SKIP_CHANNELS=1 \
-e OPENCLAW_SKIP_PROVIDERS=1 \
-e OPENCLAW_SKIP_GMAIL_WATCHER=1 \
-e OPENCLAW_SKIP_CRON=1 \
-e OPENCLAW_SKIP_CANVAS_HOST=1 \
"$IMAGE_NAME" \
bash -lc "set -euo pipefail
entry=dist/index.mjs
[ -f \"\$entry\" ] || entry=dist/index.js
mkdir -p \"\$HOME/.openclaw\" /tmp/openclaw-browser-cdp/chrome
find dist -maxdepth 1 -type f -name 'pw-ai-*.js' ! -name 'pw-ai-state-*' -exec mv {} /tmp/openclaw-browser-cdp/ \;
cat > \"\$HOME/.openclaw/openclaw.json\" <<'JSON'
{
\"gateway\": {
\"port\": $PORT,
\"auth\": {
\"mode\": \"token\",
\"token\": \"$TOKEN\"
},
\"controlUi\": { \"enabled\": false }
},
\"browser\": {
\"enabled\": true,
\"defaultProfile\": \"docker-cdp\",
\"ssrfPolicy\": {
\"allowedHostnames\": [\"127.0.0.1\"]
},
\"profiles\": {
\"docker-cdp\": {
\"cdpUrl\": \"http://127.0.0.1:$CDP_PORT\",
\"color\": \"#FF4500\"
}
}
}
}
JSON
chromium --headless=new --no-sandbox --disable-gpu --disable-dev-shm-usage \\
--remote-debugging-address=127.0.0.1 \\
--remote-debugging-port=$CDP_PORT \\
--user-data-dir=/tmp/openclaw-browser-cdp/chrome \\
about:blank >/tmp/browser-cdp-chromium.log 2>&1 &
node --input-type=module - <<'NODE' >/tmp/browser-cdp-fixture.log 2>&1 &
import http from 'node:http';
const html = \`<!doctype html>
<html>
<body>
<main>
<button>Save</button>
<a href=\"https://docs.openclaw.ai/browser-cdp-live\">Docs</a>
<div id=\"card\" onclick=\"window.__clicked = true\" style=\"cursor: pointer\">Clickable Card</div>
<iframe title=\"Child\" srcdoc='<button>Inside</button>'></iframe>
</main>
</body>
</html>\`;
http
.createServer((_req, res) => {
res.writeHead(200, { 'content-type': 'text/html; charset=utf-8' });
res.end(html);
})
.listen($FIXTURE_PORT, '127.0.0.1');
NODE
node \"\$entry\" gateway --port $PORT --bind loopback --allow-unconfigured >/tmp/browser-cdp-gateway.log 2>&1" >/dev/null
echo "Waiting for Chromium and Gateway..."
ready=0
for _ in $(seq 1 180); do
if [ "$(docker_cmd docker inspect -f '{{.State.Running}}' "$CONTAINER_NAME" 2>/dev/null || echo false)" != "true" ]; then
break
fi
if docker_cmd docker exec "$CONTAINER_NAME" bash -lc "
node --input-type=module -e 'const res = await fetch(\"http://127.0.0.1:$CDP_PORT/json/version\"); if (!res.ok) process.exit(1);' >/dev/null &&
node --input-type=module -e '
import net from \"node:net\";
const socket = net.createConnection({ host: \"127.0.0.1\", port: $PORT });
const timeout = setTimeout(() => { socket.destroy(); process.exit(1); }, 400);
socket.on(\"connect\", () => { clearTimeout(timeout); socket.end(); process.exit(0); });
socket.on(\"error\", () => { clearTimeout(timeout); process.exit(1); });
' >/dev/null
" >/dev/null 2>&1; then
ready=1
break
fi
sleep 0.5
done
if [ "$ready" -ne 1 ]; then
echo "Browser CDP snapshot container failed to become ready"
docker_cmd docker logs "$CONTAINER_NAME" 2>&1 | tail -n 120 || true
docker_cmd docker exec "$CONTAINER_NAME" bash -lc "tail -n 120 /tmp/browser-cdp-chromium.log /tmp/browser-cdp-gateway.log /tmp/browser-cdp-fixture.log" || true
exit 1
fi
echo "Running browser CDP snapshot smoke..."
docker_cmd docker exec "$CONTAINER_NAME" bash -lc "
set -euo pipefail
entry=dist/index.mjs
[ -f \"\$entry\" ] || entry=dist/index.js
base_args=(--url ws://127.0.0.1:$PORT --token '$TOKEN')
node \"\$entry\" browser \"\${base_args[@]}\" --browser-profile docker-cdp doctor --deep >/tmp/browser-cdp-doctor.txt
grep -q 'OK live-snapshot' /tmp/browser-cdp-doctor.txt
node \"\$entry\" browser \"\${base_args[@]}\" --browser-profile docker-cdp open http://127.0.0.1:$FIXTURE_PORT/ >/tmp/browser-cdp-open.txt
node \"\$entry\" browser \"\${base_args[@]}\" --browser-profile docker-cdp snapshot --interactive --urls --out /tmp/browser-cdp-snapshot.txt >/tmp/browser-cdp-snapshot.out
node --input-type=module - <<'NODE'
import fs from 'node:fs';
const snapshot = fs.readFileSync('/tmp/browser-cdp-snapshot.txt', 'utf8');
for (const needle of [
'button \"Save\"',
'link \"Docs\"',
'https://docs.openclaw.ai/browser-cdp-live',
'generic \"Clickable Card\"',
'cursor:pointer',
'Iframe \"Child\"',
'button \"Inside\"',
]) {
if (!snapshot.includes(needle)) {
console.error(snapshot);
throw new Error('missing snapshot needle: ' + needle);
}
}
console.log('ok');
NODE
"
echo "Browser CDP snapshot Docker E2E passed."