From e40094a9ef00593abd970c88b739141eafda87f8 Mon Sep 17 00:00:00 2001 From: Peter Steinberger Date: Sun, 26 Apr 2026 04:38:11 +0100 Subject: [PATCH] test(browser): add CDP snapshot Docker smoke --- CHANGELOG.md | 2 + docs/help/testing.md | 3 +- docs/reference/test.md | 1 + package.json | 1 + scripts/e2e/browser-cdp-snapshot-docker.sh | 174 +++++++++++++++++++++ 5 files changed, 180 insertions(+), 1 deletion(-) create mode 100755 scripts/e2e/browser-cdp-snapshot-docker.sh diff --git a/CHANGELOG.md b/CHANGELOG.md index 1cd6f3189ba..a027bbeb076 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,7 @@ Docs: https://docs.openclaw.ai - TTS/agents: allow `agents.list[].tts` to override global `messages.tts` for per-agent voices while keeping shared provider credentials and preferences in the existing TTS config surface. - TTS/agents: make `/tts audio`, `/tts status`, and the `tts` agent tool honor the active `agents.list[].tts` voice/provider override. - Providers/Azure Speech: add Azure Speech as a bundled TTS provider with Speech-resource auth, voice listing, SSML escaping, native Ogg/Opus voice-note output, and telephony output. (#51776) Thanks @leonchui. +- Browser automation: add a CDP-native role snapshot fallback with iframe-aware refs, cursor-clickable detection, target attach preparation, and `openclaw browser doctor --deep` live snapshot probing. - CLI/image generation: expose generic `--background` on `openclaw infer image generate` and `openclaw infer image edit`, keep `--openai-background` as an OpenAI alias, and let fal image generation honor `--output-format png|jpeg`. Thanks @steipete. - Browser/config: allow local managed Chrome launch discovery and post-launch CDP readiness timeouts to be raised for slower hosts such as Raspberry Pi. Fixes #66803. Thanks @beat843796. - Discord: allow `channels.discord.voice.model` to override the LLM used for voice channel responses while keeping STT and TTS on their existing media settings. (#64368) Thanks @mrdavey. @@ -100,6 +101,7 @@ Docs: https://docs.openclaw.ai - Diagnostics/OTEL: treat normal early model stream cleanup as a completed model call instead of exporting a misleading `StreamAbandoned` error span. Thanks @vincentkoc. - Gateway/pairing: stop corrupt or unreadable device/node pairing stores from being treated as empty state, preserving `paired.json` for repair instead of overwriting approved pairings. Fixes #71873. Thanks @iret77. - ACP: keep `/acp` management commands, plus local `/status` and `/unfocus`, on the Gateway path inside ACP-bound threads so they are not consumed as ACP prompt text. Fixes #66298. Thanks @kindomLee. +- ACPX: stop probing ACP agents during normal Gateway startup; the embedded backend now registers without spawning Codex/ACP child processes unless `OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE=1` is explicitly set. - CLI/image edit: accept `--size`, `--aspect-ratio`, and `--resolution` on `openclaw infer image edit` and report all supported edit flags from `capability inspect image.edit`. Thanks @Pinghuachiu. - ACP: wait for the configured runtime backend to become healthy before startup identity reconciliation, avoiding transient acpx warnings during Gateway boot. Fixes #40566. - Channels/ACP bindings: time out configured binding readiness checks instead of letting Discord preflight hang forever when an ACP target never settles. Fixes #68776. diff --git a/docs/help/testing.md b/docs/help/testing.md index f4ac3729e6c..204d4119af4 100644 --- a/docs/help/testing.md +++ b/docs/help/testing.md @@ -603,7 +603,7 @@ These Docker runners split into two buckets: `OPENCLAW_LIVE_GATEWAY_MODEL_TIMEOUT_MS=90000`. Override those env vars when you explicitly want the larger exhaustive scan. - `test:docker:all` builds the live Docker image once via `test:docker:live-build`, then reuses it for the live Docker lanes. It also builds one shared `scripts/e2e/Dockerfile` image via `test:docker:e2e-build` and reuses it for the E2E container smoke runners that exercise the built app. The aggregate uses a weighted local scheduler: `OPENCLAW_DOCKER_ALL_PARALLELISM` controls process slots, while resource caps keep heavy live, npm-install, and multi-service lanes from all starting at once. Defaults are 10 slots, `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=6`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=8`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; tune `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` only when the Docker host has more headroom. The runner performs a Docker preflight by default, removes stale OpenClaw E2E containers, prints status every 30 seconds, stores successful lane timings in `.artifacts/docker-tests/lane-timings.json`, and uses those timings to start longer lanes first on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the weighted lane manifest without building or running Docker. -- Container smoke runners: `test:docker:openwebui`, `test:docker:onboard`, `test:docker:npm-onboard-channel-agent`, `test:docker:session-runtime-context`, `test:docker:agents-delete-shared-workspace`, `test:docker:gateway-network`, `test:docker:mcp-channels`, `test:docker:pi-bundle-mcp-tools`, `test:docker:cron-mcp-cleanup`, `test:docker:plugins`, `test:docker:plugin-update`, and `test:docker:config-reload` boot one or more real containers and verify higher-level integration paths. +- Container smoke runners: `test:docker:openwebui`, `test:docker:onboard`, `test:docker:npm-onboard-channel-agent`, `test:docker:session-runtime-context`, `test:docker:agents-delete-shared-workspace`, `test:docker:gateway-network`, `test:docker:browser-cdp-snapshot`, `test:docker:mcp-channels`, `test:docker:pi-bundle-mcp-tools`, `test:docker:cron-mcp-cleanup`, `test:docker:plugins`, `test:docker:plugin-update`, and `test:docker:config-reload` boot one or more real containers and verify higher-level integration paths. The live-model Docker runners also bind-mount only the needed CLI auth homes (or all supported ones when the run is not narrowed), then copy them into the container home before the run so external-CLI OAuth can refresh tokens without mutating the host auth store: @@ -621,6 +621,7 @@ The live-model Docker runners also bind-mount only the needed CLI auth homes (or - Install Smoke CI skips the duplicate direct-npm global update with `OPENCLAW_INSTALL_SMOKE_SKIP_NPM_GLOBAL=1`; run the script locally without that env when direct `npm install -g` coverage is needed. - Agents delete shared workspace CLI smoke: `pnpm test:docker:agents-delete-shared-workspace` (script: `scripts/e2e/agents-delete-shared-workspace-docker.sh`) builds the root Dockerfile image by default, seeds two agents with one workspace in an isolated container home, runs `agents delete --json`, and verifies valid JSON plus retained workspace behavior. Reuse the install-smoke image with `OPENCLAW_AGENTS_DELETE_SHARED_WORKSPACE_E2E_IMAGE=openclaw-dockerfile-smoke:local OPENCLAW_AGENTS_DELETE_SHARED_WORKSPACE_E2E_SKIP_BUILD=1`. - Gateway networking (two containers, WS auth + health): `pnpm test:docker:gateway-network` (script: `scripts/e2e/gateway-network-docker.sh`) +- Browser CDP snapshot smoke: `pnpm test:docker:browser-cdp-snapshot` (script: `scripts/e2e/browser-cdp-snapshot-docker.sh`) builds the source E2E image plus a Chromium layer, starts Chromium with raw CDP, runs `browser doctor --deep`, and verifies CDP role snapshots cover link URLs, cursor-promoted clickables, iframe refs, and frame metadata. - OpenAI Responses web_search minimal reasoning regression: `pnpm test:docker:openai-web-search-minimal` (script: `scripts/e2e/openai-web-search-minimal-docker.sh`) runs a mocked OpenAI server through Gateway, verifies `web_search` raises `reasoning.effort` from `minimal` to `low`, then forces the provider schema reject and checks the raw detail appears in Gateway logs. - MCP channel bridge (seeded Gateway + stdio bridge + raw Claude notification-frame smoke): `pnpm test:docker:mcp-channels` (script: `scripts/e2e/mcp-channels-docker.sh`) - Pi bundle MCP tools (real stdio MCP server + embedded Pi profile allow/deny smoke): `pnpm test:docker:pi-bundle-mcp-tools` (script: `scripts/e2e/pi-bundle-mcp-tools-docker.sh`) diff --git a/docs/reference/test.md b/docs/reference/test.md index 8b44337f0d1..5216754bdb9 100644 --- a/docs/reference/test.md +++ b/docs/reference/test.md @@ -33,6 +33,7 @@ title: "Tests" - `pnpm test:e2e`: Runs gateway end-to-end smoke tests (multi-instance WS/HTTP/node pairing). Defaults to `threads` + `isolate: false` with adaptive workers in `vitest.e2e.config.ts`; tune with `OPENCLAW_E2E_WORKERS=` and set `OPENCLAW_E2E_VERBOSE=1` for verbose logs. - `pnpm test:live`: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip. - `pnpm test:docker:all`: Builds the shared live-test image and Docker E2E image once, then runs the Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` through a weighted scheduler. `OPENCLAW_DOCKER_ALL_PARALLELISM=` controls process slots and defaults to 10; `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=` controls the provider-sensitive tail pool and defaults to 10. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=9`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=10`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; provider caps default to one heavy lane per provider via `OPENCLAW_DOCKER_ALL_LIVE_CLAUDE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_LIVE_CODEX_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_LIVE_GEMINI_LIMIT=4`. Use `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` for larger hosts. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=`. The runner preflights Docker by default, cleans stale OpenClaw E2E containers, emits active-lane status every 30 seconds, shares provider CLI tool caches between compatible lanes, retries transient live-provider failures once by default (`OPENCLAW_DOCKER_ALL_LIVE_RETRIES=`), and stores lane timings in `.artifacts/docker-tests/lane-timings.json` for longest-first ordering on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the lane manifest without running Docker, `OPENCLAW_DOCKER_ALL_STATUS_INTERVAL_MS=` to tune status output, or `OPENCLAW_DOCKER_ALL_TIMINGS=0` to disable timing reuse. Use `OPENCLAW_DOCKER_ALL_LIVE_MODE=skip` for deterministic/local lanes only or `OPENCLAW_DOCKER_ALL_LIVE_MODE=only` for live-provider lanes only; package aliases are `pnpm test:docker:local:all` and `pnpm test:docker:live:all`. Live-only mode merges main and tail live lanes into one longest-first pool so provider buckets can pack Claude, Codex, and Gemini work together. The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute fallback timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`; selected live/tail lanes use tighter per-lane caps. CLI backend Docker setup commands have their own timeout via `OPENCLAW_LIVE_CLI_BACKEND_SETUP_TIMEOUT_SECONDS` (default 180). Per-lane logs are written under `.artifacts/docker-tests//`. +- `pnpm test:docker:browser-cdp-snapshot`: Builds a Chromium-backed source E2E container, starts raw CDP plus an isolated Gateway, runs `browser doctor --deep`, and verifies CDP role snapshots include link URLs, cursor-promoted clickables, iframe refs, and frame metadata. - CLI backend live Docker probes can be run as focused lanes, for example `pnpm test:docker:live-cli-backend:codex`, `pnpm test:docker:live-cli-backend:codex:resume`, or `pnpm test:docker:live-cli-backend:codex:mcp`. Claude and Gemini have matching `:resume` and `:mcp` aliases. - `pnpm test:docker:openwebui`: Starts Dockerized OpenClaw + Open WebUI, signs in through Open WebUI, checks `/api/models`, then runs a real proxied chat through `/api/chat/completions`. Requires a usable live model key (for example OpenAI in `~/.profile`), pulls an external Open WebUI image, and is not expected to be CI-stable like the normal unit/e2e suites. - `pnpm test:docker:mcp-channels`: Starts a seeded Gateway container and a second client container that spawns `openclaw mcp serve`, then verifies routed conversation discovery, transcript reads, attachment metadata, live event queue behavior, outbound send routing, and Claude-style channel + permission notifications over the real stdio bridge. The Claude notification assertion reads the raw stdio MCP frames directly so the smoke reflects what the bridge actually emits. diff --git a/package.json b/package.json index d109aa7c89a..7358d38c8d1 100644 --- a/package.json +++ b/package.json @@ -1481,6 +1481,7 @@ "test:coverage:changed": "node scripts/run-vitest.mjs run --config test/vitest/vitest.unit.config.ts --coverage --changed origin/main", "test:docker:agents-delete-shared-workspace": "bash scripts/e2e/agents-delete-shared-workspace-docker.sh", "test:docker:all": "node scripts/test-docker-all.mjs", + "test:docker:browser-cdp-snapshot": "bash scripts/e2e/browser-cdp-snapshot-docker.sh", "test:docker:bundled-channel-deps": "bash scripts/e2e/bundled-channel-runtime-deps-docker.sh", "test:docker:bundled-channel-deps:fast": "OPENCLAW_BUNDLED_CHANNEL_SCENARIOS=0 OPENCLAW_BUNDLED_CHANNEL_UPDATE_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_ROOT_OWNED_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_SETUP_ENTRY_SCENARIO=1 OPENCLAW_BUNDLED_CHANNEL_DISABLED_CONFIG_SCENARIO=1 OPENCLAW_BUNDLED_CHANNEL_LOAD_FAILURE_SCENARIO=1 bash scripts/e2e/bundled-channel-runtime-deps-docker.sh", "test:docker:cleanup": "bash scripts/test-cleanup-docker.sh", diff --git a/scripts/e2e/browser-cdp-snapshot-docker.sh b/scripts/e2e/browser-cdp-snapshot-docker.sh new file mode 100755 index 00000000000..daa32d470bf --- /dev/null +++ b/scripts/e2e/browser-cdp-snapshot-docker.sh @@ -0,0 +1,174 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" +source "$ROOT_DIR/scripts/lib/docker-e2e-image.sh" + +BASE_IMAGE="$(docker_e2e_resolve_image "openclaw-browser-cdp-base-e2e" OPENCLAW_BROWSER_CDP_BASE_E2E_IMAGE)" +IMAGE_NAME="$(docker_e2e_resolve_image "openclaw-browser-cdp-snapshot-e2e" OPENCLAW_BROWSER_CDP_SNAPSHOT_E2E_IMAGE)" +SKIP_BUILD="${OPENCLAW_BROWSER_CDP_SNAPSHOT_E2E_SKIP_BUILD:-0}" +PORT="18789" +CDP_PORT="19222" +FIXTURE_PORT="18080" +TOKEN="browser-cdp-e2e-token" +CONTAINER_NAME="openclaw-browser-cdp-e2e-$$" +DOCKER_COMMAND_TIMEOUT="${OPENCLAW_BROWSER_CDP_SNAPSHOT_DOCKER_COMMAND_TIMEOUT:-900s}" + +docker_cmd() { + timeout "$DOCKER_COMMAND_TIMEOUT" "$@" +} + +cleanup() { + docker_cmd docker rm -f "$CONTAINER_NAME" >/dev/null 2>&1 || true +} +trap cleanup EXIT + +if [ "${OPENCLAW_SKIP_DOCKER_BUILD:-0}" = "1" ] || [ "$SKIP_BUILD" = "1" ]; then + echo "Reusing Docker image: $IMAGE_NAME" + docker_cmd docker image inspect "$IMAGE_NAME" >/dev/null +else + docker_e2e_build_or_reuse "$BASE_IMAGE" browser-cdp-base "$ROOT_DIR/scripts/e2e/Dockerfile" "$ROOT_DIR" "" "0" + build_dir="$(mktemp -d "${TMPDIR:-/tmp}/openclaw-browser-cdp-build.XXXXXX")" + trap 'cleanup; rm -rf "$build_dir"' EXIT + cat >"$build_dir/Dockerfile" < \"\$HOME/.openclaw/openclaw.json\" <<'JSON' +{ + \"gateway\": { + \"port\": $PORT, + \"auth\": { + \"mode\": \"token\", + \"token\": \"$TOKEN\" + }, + \"controlUi\": { \"enabled\": false } + }, + \"browser\": { + \"enabled\": true, + \"defaultProfile\": \"docker-cdp\", + \"ssrfPolicy\": { + \"allowedHostnames\": [\"127.0.0.1\"] + }, + \"profiles\": { + \"docker-cdp\": { + \"cdpUrl\": \"http://127.0.0.1:$CDP_PORT\", + \"color\": \"#FF4500\" + } + } + } +} +JSON +chromium --headless=new --no-sandbox --disable-gpu --disable-dev-shm-usage \\ + --remote-debugging-address=127.0.0.1 \\ + --remote-debugging-port=$CDP_PORT \\ + --user-data-dir=/tmp/openclaw-browser-cdp/chrome \\ + about:blank >/tmp/browser-cdp-chromium.log 2>&1 & +node --input-type=module - <<'NODE' >/tmp/browser-cdp-fixture.log 2>&1 & +import http from 'node:http'; +const html = \` + + +
+ + Docs +
Clickable Card
+ +
+ +\`; +http + .createServer((_req, res) => { + res.writeHead(200, { 'content-type': 'text/html; charset=utf-8' }); + res.end(html); + }) + .listen($FIXTURE_PORT, '127.0.0.1'); +NODE +node \"\$entry\" gateway --port $PORT --bind loopback --allow-unconfigured >/tmp/browser-cdp-gateway.log 2>&1" >/dev/null + +echo "Waiting for Chromium and Gateway..." +ready=0 +for _ in $(seq 1 180); do + if [ "$(docker_cmd docker inspect -f '{{.State.Running}}' "$CONTAINER_NAME" 2>/dev/null || echo false)" != "true" ]; then + break + fi + if docker_cmd docker exec "$CONTAINER_NAME" bash -lc " + node --input-type=module -e 'const res = await fetch(\"http://127.0.0.1:$CDP_PORT/json/version\"); if (!res.ok) process.exit(1);' >/dev/null && + node --input-type=module -e ' + import net from \"node:net\"; + const socket = net.createConnection({ host: \"127.0.0.1\", port: $PORT }); + const timeout = setTimeout(() => { socket.destroy(); process.exit(1); }, 400); + socket.on(\"connect\", () => { clearTimeout(timeout); socket.end(); process.exit(0); }); + socket.on(\"error\", () => { clearTimeout(timeout); process.exit(1); }); + ' >/dev/null + " >/dev/null 2>&1; then + ready=1 + break + fi + sleep 0.5 +done + +if [ "$ready" -ne 1 ]; then + echo "Browser CDP snapshot container failed to become ready" + docker_cmd docker logs "$CONTAINER_NAME" 2>&1 | tail -n 120 || true + docker_cmd docker exec "$CONTAINER_NAME" bash -lc "tail -n 120 /tmp/browser-cdp-chromium.log /tmp/browser-cdp-gateway.log /tmp/browser-cdp-fixture.log" || true + exit 1 +fi + +echo "Running browser CDP snapshot smoke..." +docker_cmd docker exec "$CONTAINER_NAME" bash -lc " +set -euo pipefail +entry=dist/index.mjs +[ -f \"\$entry\" ] || entry=dist/index.js +base_args=(--url ws://127.0.0.1:$PORT --token '$TOKEN') +node \"\$entry\" browser \"\${base_args[@]}\" --browser-profile docker-cdp doctor --deep >/tmp/browser-cdp-doctor.txt +grep -q 'OK live-snapshot' /tmp/browser-cdp-doctor.txt +node \"\$entry\" browser \"\${base_args[@]}\" --browser-profile docker-cdp open http://127.0.0.1:$FIXTURE_PORT/ >/tmp/browser-cdp-open.txt +node \"\$entry\" browser \"\${base_args[@]}\" --browser-profile docker-cdp snapshot --interactive --urls --out /tmp/browser-cdp-snapshot.txt >/tmp/browser-cdp-snapshot.out +node --input-type=module - <<'NODE' +import fs from 'node:fs'; +const snapshot = fs.readFileSync('/tmp/browser-cdp-snapshot.txt', 'utf8'); +for (const needle of [ + 'button \"Save\"', + 'link \"Docs\"', + 'https://docs.openclaw.ai/browser-cdp-live', + 'generic \"Clickable Card\"', + 'cursor:pointer', + 'Iframe \"Child\"', + 'button \"Inside\"', +]) { + if (!snapshot.includes(needle)) { + console.error(snapshot); + throw new Error('missing snapshot needle: ' + needle); + } +} +console.log('ok'); +NODE +" + +echo "Browser CDP snapshot Docker E2E passed."