fix: harden async media completion delivery

2026-05-07 07:58:36 +00:00 · 2026-05-05 06:13:14 +01:00
parent 349ce0056d
commit 6c8974f3f5
8 changed files with 138 additions and 8 deletions
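
The behavior change in this commit can be read as a three-way outcome for message-tool-only routes. The sketch below is a simplified, synchronous model of that outcome, not the code in `subagent-announce-delivery.ts`. The names `resolveToolOnlyCompletion`, `CompletionOutcome`, and `sendFallback` are hypothetical. The empty-text guard is an assumption. Only the non-threaded `"direct-fallback"` path from the updated test is modeled.

```typescript
// Hypothetical model of the completion outcome on a message-tool-only route.
// These names are illustrative only and are not OpenClaw APIs.
type CompletionOutcome = {
  delivered: boolean;
  path: "direct" | "direct-fallback";
};

function resolveToolOnlyCompletion(
  agentUsedMessageTool: boolean,
  fallbackText: string,
  sendFallback: (content: string) => boolean,
): CompletionOutcome {
  // The agent delivered through the message tool, so no fallback is needed.
  if (agentUsedMessageTool) {
    return { delivered: true, path: "direct" };
  }
  // Assumption: an empty fallback text is never sent.
  if (fallbackText.trim() !== "" && sendFallback(fallbackText)) {
    return { delivered: true, path: "direct-fallback" };
  }
  // No message-tool evidence and no successful fallback: report non-delivery.
  return { delivered: false, path: "direct" };
}
```

Before this commit, the middle branch did not exist for these routes, so a private final reply always ended in the last, non-delivered case.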
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -109,6 +109,7 @@ Docs: https://docs.openclaw.ai
 - Active Memory: give timeout partial transcript recovery enough abort-settle headroom so temporary recall summaries are returned before cleanup. Thanks @vincentkoc.
 - Gateway/chat: clear the active reply-run guard before draining queued same-session follow-up turns, so sequential `chat.send` calls no longer trip `ReplyRunAlreadyActiveError` every other request. Fixes #77485. Thanks @bws14email.
 - Agents/media: avoid sending generated image, video, and music attachments twice when streamed reply text arrives before the final `MEDIA:` directive.
+- Agents/media: tell async music and video completion agents when normal final replies are private, and send the completion fallback directly to message-tool-only group/channel routes when the completion agent still writes only a private final reply, so generated media is no longer lost when the agent skips the message tool.
 - CLI/sessions: cap `openclaw sessions` output to the newest 100 rows by default and add `--limit <n|all>` plus JSON pagination metadata, so repeated machine polling of large session stores cannot fan out into unbounded per-row enrichment/output work. Fixes #77500. Thanks @Kaotic3.
 - Doctor/config: restore legacy group chat config migrations for `routing.allowFrom`, `routing.groupChat.*`, and `channels.telegram.requireMention` so upgrades keep WhatsApp, Telegram, and iMessage group mention gates and history settings instead of leaving configs invalid or silently blocked. Thanks @scoootscooob.
 - CLI/update: make package-update follow-up processes write completion results and exit explicitly, so Windows packaged upgrades do not hang after the new package finishes post-core plugin work. Thanks @vincentkoc.
--- a/docs/automation/tasks.md
+++ b/docs/automation/tasks.md
@@ -102,7 +102,7 @@ Not every agent run creates a task. Heartbeat turns and normal interactive chat
  <Accordion title="Notify defaults for cron and media">
    Main-session cron tasks use `silent` notify policy by default — they create records for tracking but do not generate notifications. Isolated cron tasks also default to `silent` but are more visible because they run in their own session.

-    Session-backed `music_generate` and `video_generate` runs also use `silent` notify policy. They still create task records, but completion is handed back to the original agent session as an internal wake so the agent can write the follow-up message and attach the finished media itself. Group/channel completions follow the normal visible-reply policy, so the agent uses the message tool when source delivery requires it.
+    Session-backed `music_generate` and `video_generate` runs also use `silent` notify policy. They still create task records, but completion is handed back to the original agent session as an internal wake so the agent can write the follow-up message and attach the finished media itself. Group/channel completions follow the normal visible-reply policy, so the agent uses the message tool when source delivery requires it. If the completion agent produces no message-tool delivery evidence in a message-tool-only route, OpenClaw sends the completion fallback directly to the original channel instead of leaving the media private.

  </Accordion>
  <Accordion title="Concurrent video_generate guardrail">
--- a/docs/tools/media-overview.md
+++ b/docs/tools/media-overview.md
@@ -93,7 +93,9 @@ id immediately, and tracks the job in the task ledger. The agent continues
 responding to other messages while the job runs. When the provider finishes,
 OpenClaw wakes the agent with the generated media paths so it can tell the
 user and, when required by source-delivery policy, relay the result through
-the message tool.
+the message tool. For message-tool-only group/channel routes, OpenClaw treats
+missing message-tool delivery evidence as a failed completion attempt and sends
+the generated media fallback directly to the original channel.

 ## Speech-to-text and Voice Call

--- a/docs/tools/music-generation.md
+++ b/docs/tools/music-generation.md
@@ -16,7 +16,10 @@ For session-backed agent runs, OpenClaw starts music generation as a
 background task, tracks it in the task ledger, then wakes the agent again
 when the track is ready so the agent can tell the user and attach the
 finished audio. In group/channel chats that use message-tool-only visible
-delivery, the agent relays the result through the message tool.
+delivery, the agent relays the result through the message tool. If the
+completion agent writes only a private final reply, OpenClaw falls back to a
+direct channel send with the generated media. The completion wake explicitly
+warns the agent that normal final replies are private in those routes.

 <Note>
 The built-in shared tool only appears when at least one music-generation
--- a/src/agents/subagent-announce-delivery.test.ts
+++ b/src/agents/subagent-announce-delivery.test.ts
@@ -1202,7 +1202,7 @@ describe("deliverSubagentAnnouncement completion delivery", () => {
    expect(sendMessage).not.toHaveBeenCalled();
  });

-  it("requires message-tool delivery for generated media completions in default group routes", async () => {
+  it("falls back to direct send for generated media completions in default group routes", async () => {
    const callGateway = createGatewayMock({
      result: {
        payloads: [
@@ -1241,8 +1241,8 @@ describe("deliverSubagentAnnouncement completion delivery", () => {

    expect(result).toEqual(
      expect.objectContaining({
-        delivered: false,
-        path: "direct",
+        delivered: true,
+        path: "direct-fallback",
      }),
    );
    expect(callGateway).toHaveBeenCalledWith(
@@ -1257,7 +1257,18 @@ describe("deliverSubagentAnnouncement completion delivery", () => {
        }),
      }),
    );
-    expect(sendMessage).not.toHaveBeenCalled();
+    expect(sendMessage).toHaveBeenCalledWith(
+      expect.objectContaining({
+        channel: "slack",
+        accountId: "acct-1",
+        to: "channel:C123",
+        threadId: undefined,
+        content: "Generated 1 track.\nMEDIA:/tmp/generated-night-drive.mp3",
+        requesterSessionKey: "agent:main:slack:channel:C123",
+        bestEffort: true,
+        idempotencyKey: "announce-channel-media-message-tool",
+      }),
+    );
  });

  it("uses a direct channel fallback when announce-agent returns no visible output", async () => {
--- a/src/agents/subagent-announce-delivery.ts
+++ b/src/agents/subagent-announce-delivery.ts
@@ -885,7 +885,9 @@ async function sendSubagentAnnounceDirectly(params: {
      });
    const shouldDeliverAgentFinal = deliveryTarget.deliver && !requiresMessageToolDelivery;
    const completionFallbackText =
-      params.expectsCompletionMessage && shouldDeliverAgentFinal && !agentMediatedCompletion
+      params.expectsCompletionMessage &&
+      deliveryTarget.deliver &&
+      (!agentMediatedCompletion || requiresMessageToolDelivery)
        ? extractThreadCompletionFallbackText(params.internalEvents)
        : "";
    const requesterActivity = resolveRequesterSessionActivity(canonicalRequesterSessionKey);
@@ -1070,6 +1072,24 @@ async function sendSubagentAnnounceDirectly(params: {
      requiresMessageToolDelivery &&
      !hasGatewayAgentMessagingToolDelivery(directAnnounceResponse)
    ) {
+      const didFallback = await sendCompletionFallback({
+        cfg,
+        channel: deliveryTarget.channel,
+        to: deliveryTarget.to,
+        accountId: deliveryTarget.accountId,
+        threadId: deliveryTarget.threadId,
+        content: completionFallbackText,
+        requesterSessionKey: canonicalRequesterSessionKey,
+        bestEffortDeliver: params.bestEffortDeliver,
+        idempotencyKey: params.directIdempotencyKey,
+        signal: params.signal,
+      });
+      if (didFallback) {
+        return {
+          delivered: true,
+          path: resolveCompletionFallbackPath(deliveryTarget.threadId),
+        };
+      }
      return {
        delivered: false,
        path: "direct",
--- a/src/agents/tools/media-generate-background-shared.ts
+++ b/src/agents/tools/media-generate-background-shared.ts
@@ -1,8 +1,10 @@
 import crypto from "node:crypto";
+import { SILENT_REPLY_TOKEN } from "../../auto-reply/tokens.js";
 import type { OpenClawConfig } from "../../config/types.openclaw.js";
 import { clearAgentRunContext, registerAgentRunContext } from "../../infra/agent-events.js";
 import { formatErrorMessage } from "../../infra/errors.js";
 import { createSubsystemLogger } from "../../logging/subsystem.js";
+import { deriveSessionChatTypeFromKey } from "../../sessions/session-chat-type-shared.js";
 import {
  completeTaskRunByRunId,
  createRunningTaskRun,
@@ -222,8 +224,18 @@ function failMediaGenerationTaskRun(params: {
 function buildMediaGenerationReplyInstruction(params: {
  status: "ok" | "error";
  completionLabel: string;
+  requiresMessageToolDelivery: boolean;
 }) {
  if (params.status === "ok") {
+    if (params.requiresMessageToolDelivery) {
+      return [
+        `The ${params.completionLabel} is ready for the original channel/group chat.`,
+        "This route requires message-tool delivery: the user will NOT see your normal assistant final reply.",
+        'Call the message tool with action="send" to the original/current chat, put a short caption in the message, and attach the generated media paths from the result.',
+        `After the message tool succeeds, reply only ${SILENT_REPLY_TOKEN}.`,
+        "Do not put MEDIA: lines only in your final answer; that final answer is private in this chat.",
+      ].join(" ");
+    }
    return `Tell the user the ${params.completionLabel} is ready. If visible source delivery requires the message tool, send it there with the generated media attached.`;
  }
  return [
@@ -233,6 +245,39 @@ function buildMediaGenerationReplyInstruction(params: {
  ].join(" ");
 }

+function inferMediaGenerationCompletionChatType(
+  handle: MediaGenerationTaskHandle,
+): "direct" | "group" | "channel" | "unknown" {
+  const sessionKeyChatType = deriveSessionChatTypeFromKey(handle.requesterSessionKey);
+  if (sessionKeyChatType !== "unknown") {
+    return sessionKeyChatType;
+  }
+  const to = handle.requesterOrigin?.to?.trim().toLowerCase();
+  if (to?.startsWith("group:")) {
+    return "group";
+  }
+  if (to?.startsWith("channel:")) {
+    return "channel";
+  }
+  if (to?.startsWith("dm:") || to?.startsWith("direct:")) {
+    return "direct";
+  }
+  return "unknown";
+}
+
+function mediaGenerationCompletionRequiresMessageToolDelivery(params: {
+  config?: OpenClawConfig;
+  handle: MediaGenerationTaskHandle;
+}): boolean {
+  const chatType = inferMediaGenerationCompletionChatType(params.handle);
+  if (chatType === "group" || chatType === "channel") {
+    const configuredMode =
+      params.config?.messages?.groupChat?.visibleReplies ?? params.config?.messages?.visibleReplies;
+    return configuredMode !== "automatic";
+  }
+  return params.config?.messages?.visibleReplies === "message_tool";
+}
+
 async function wakeMediaGenerationTaskCompletion(params: {
  config?: OpenClawConfig;
  handle: MediaGenerationTaskHandle | null;
@@ -266,6 +311,10 @@ async function wakeMediaGenerationTaskCompletion(params: {
      replyInstruction: buildMediaGenerationReplyInstruction({
        status: params.status,
        completionLabel: params.completionLabel,
+        requiresMessageToolDelivery: mediaGenerationCompletionRequiresMessageToolDelivery({
+          config: params.config,
+          handle: params.handle,
+        }),
      }),
    },
  ];
--- a/src/agents/tools/music-generate-background.test.ts
+++ b/src/agents/tools/music-generate-background.test.ts
@@ -95,6 +95,50 @@ describe("music generate background helpers", () => {
    expect(announceDeliveryMocks.deliverSubagentAnnouncement).toHaveBeenCalled();
  });

+  it("warns channel completion agents that normal final replies are private", async () => {
+    announceDeliveryMocks.deliverSubagentAnnouncement.mockResolvedValue({
+      delivered: true,
+      path: "direct",
+    });
+    const completion = createMediaCompletionFixture({
+      runId: "tool:music_generate:abc",
+      taskLabel: "night-drive synthwave",
+      result: "Generated 1 track.\nMEDIA:/tmp/generated-night-drive.mp3",
+      mediaUrls: ["/tmp/generated-night-drive.mp3"],
+    });
+
+    await wakeMusicGenerationTaskCompletion({
+      ...completion,
+      handle: {
+        ...completion.handle,
+        requesterSessionKey: "agent:main:discord:channel:C123",
+      },
+    });
+
+    expect(announceDeliveryMocks.deliverSubagentAnnouncement).toHaveBeenCalledWith(
+      expect.objectContaining({
+        internalEvents: expect.arrayContaining([
+          expect.objectContaining({
+            replyInstruction: expect.stringContaining(
+              "the user will NOT see your normal assistant final reply",
+            ),
+          }),
+        ]),
+      }),
+    );
+    expect(announceDeliveryMocks.deliverSubagentAnnouncement).toHaveBeenCalledWith(
+      expect.objectContaining({
+        internalEvents: expect.arrayContaining([
+          expect.objectContaining({
+            replyInstruction: expect.stringContaining(
+              "Do not put MEDIA: lines only in your final answer",
+            ),
+          }),
+        ]),
+      }),
+    );
+  });
+
  it("queues a completion event when direct send is enabled globally", async () => {
    taskDeliveryRuntimeMocks.sendMessage.mockResolvedValue({
      channel: "discord",