Compare commits

...

219 Commits

Author SHA1 Message Date
Alex
c18f85a050 docs: clarify tool-access boundary in prompt injection section 2026-04-15 18:31:42 +01:00
Manish Madan
5ecb174567 Merge pull request #2322 from arc53/dependabot/npm_and_yarn/extensions/react-widget/svgo-4.0.1
chore(deps-dev): bump svgo from 3.3.3 to 4.0.1 in /extensions/react-widget
2026-04-15 17:06:40 +05:30
dependabot[bot]
ed7212d016 chore(deps-dev): bump svgo in /extensions/react-widget
Bumps [svgo](https://github.com/svg/svgo) from 3.3.3 to 4.0.1.
- [Release notes](https://github.com/svg/svgo/releases)
- [Commits](https://github.com/svg/svgo/compare/v3.3.3...v4.0.1)

---
updated-dependencies:
- dependency-name: svgo
  dependency-version: 4.0.1
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-15 10:43:02 +00:00
Manish Madan
f82acdab5d Merge pull request #2384 from arc53/dependabot/npm_and_yarn/frontend/eslint-plugin-n-17.24.0
chore(deps-dev): bump eslint-plugin-n from 17.23.1 to 17.24.0 in /frontend
2026-04-15 16:03:56 +05:30
dependabot[bot]
361aebc34c chore(deps-dev): bump eslint-plugin-n in /frontend
Bumps [eslint-plugin-n](https://github.com/eslint-community/eslint-plugin-n) from 17.23.1 to 17.24.0.
- [Release notes](https://github.com/eslint-community/eslint-plugin-n/releases)
- [Changelog](https://github.com/eslint-community/eslint-plugin-n/blob/master/CHANGELOG.md)
- [Commits](https://github.com/eslint-community/eslint-plugin-n/compare/v17.23.1...v17.24.0)

---
updated-dependencies:
- dependency-name: eslint-plugin-n
  dependency-version: 17.24.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-15 10:28:46 +00:00
Manish Madan
bf194c1a0f Merge pull request #2311 from arc53/dependabot/npm_and_yarn/extensions/react-widget/babel/preset-env-7.29.2
chore(deps-dev): bump @babel/preset-env from 7.24.6 to 7.29.2 in /extensions/react-widget
2026-04-15 13:16:41 +05:30
Manish Madan
54c396750b Merge pull request #2385 from arc53/dependabot/npm_and_yarn/frontend/multi-2a6546692b
chore(deps): bump react-dom and @types/react-dom in /frontend
2026-04-15 13:12:49 +05:30
dependabot[bot]
9adebfec69 chore(deps): bump react-dom and @types/react-dom in /frontend
Bumps [react-dom](https://github.com/facebook/react/tree/HEAD/packages/react-dom) and [@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom). These dependencies needed to be updated together.

Updates `react-dom` from 19.2.0 to 19.2.5
- [Release notes](https://github.com/facebook/react/releases)
- [Changelog](https://github.com/facebook/react/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/react/commits/v19.2.5/packages/react-dom)

Updates `@types/react-dom` from 19.2.2 to 19.2.3
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom)

---
updated-dependencies:
- dependency-name: react-dom
  dependency-version: 19.2.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: "@types/react-dom"
  dependency-version: 19.2.3
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-15 07:41:54 +00:00
Manish Madan
92c321f163 Merge pull request #2386 from arc53/dependabot/npm_and_yarn/frontend/tailwindcss/postcss-4.2.2
chore(deps-dev): bump @tailwindcss/postcss from 4.1.16 to 4.2.2 in /frontend
2026-04-15 13:10:30 +05:30
dependabot[bot]
e3d36b9e52 chore(deps-dev): bump @tailwindcss/postcss in /frontend
Bumps [@tailwindcss/postcss](https://github.com/tailwindlabs/tailwindcss/tree/HEAD/packages/@tailwindcss-postcss) from 4.1.16 to 4.2.2.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/commits/v4.2.2/packages/@tailwindcss-postcss)

---
updated-dependencies:
- dependency-name: "@tailwindcss/postcss"
  dependency-version: 4.2.2
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-15 07:39:25 +00:00
Manish Madan
8950e11208 Merge pull request #2388 from arc53/dependabot/npm_and_yarn/frontend/tailwindcss-4.2.2
chore(deps-dev): bump tailwindcss from 4.2.1 to 4.2.2 in /frontend
2026-04-15 12:39:31 +05:30
dependabot[bot]
5de0132a65 chore(deps-dev): bump tailwindcss from 4.2.1 to 4.2.2 in /frontend
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss/tree/HEAD/packages/tailwindcss) from 4.2.1 to 4.2.2.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/commits/v4.2.2/packages/tailwindcss)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-version: 4.2.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-15 07:06:26 +00:00
Manish Madan
b92ca91512 Merge pull request #2387 from arc53/dependabot/npm_and_yarn/frontend/mermaid-11.14.0
chore(deps): bump mermaid from 11.13.0 to 11.14.0 in /frontend
2026-04-15 12:34:28 +05:30
dependabot[bot]
8e0b2844a2 chore(deps): bump mermaid from 11.13.0 to 11.14.0 in /frontend
Bumps [mermaid](https://github.com/mermaid-js/mermaid) from 11.13.0 to 11.14.0.
- [Release notes](https://github.com/mermaid-js/mermaid/releases)
- [Commits](https://github.com/mermaid-js/mermaid/compare/mermaid@11.13.0...mermaid@11.14.0)

---
updated-dependencies:
- dependency-name: mermaid
  dependency-version: 11.14.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-15 07:02:28 +00:00
dependabot[bot]
0969db5e30 chore(deps-dev): bump @babel/preset-env in /extensions/react-widget
Bumps [@babel/preset-env](https://github.com/babel/babel/tree/HEAD/packages/babel-preset-env) from 7.24.6 to 7.29.2.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.29.2/packages/babel-preset-env)

---
updated-dependencies:
- dependency-name: "@babel/preset-env"
  dependency-version: 7.29.2
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-15 05:30:32 +00:00
Manish Madan
af335a27e8 Merge pull request #2309 from arc53/dependabot/npm_and_yarn/extensions/react-widget/babel/core-7.29.0
chore(deps-dev): bump @babel/core from 7.24.6 to 7.29.0 in /extensions/react-widget
2026-04-15 10:59:07 +05:30
dependabot[bot]
0ae3139284 chore(deps-dev): bump @babel/core in /extensions/react-widget
Bumps [@babel/core](https://github.com/babel/babel/tree/HEAD/packages/babel-core) from 7.24.6 to 7.29.0.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.29.0/packages/babel-core)

---
updated-dependencies:
- dependency-name: "@babel/core"
  dependency-version: 7.29.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 23:05:51 +00:00
Manish Madan
7529ca3dd6 Merge pull request #2389 from arc53/dependabot/pip/application/pip-3344959f9f
chore(deps): bump cryptography from 46.0.6 to 46.0.7 in /application in the pip group across 1 directory
2026-04-15 04:32:07 +05:30
dependabot[bot]
1b813320f1 chore(deps): bump cryptography
Bumps the pip group with 1 update in the /application directory: [cryptography](https://github.com/pyca/cryptography).


Updates `cryptography` from 46.0.6 to 46.0.7
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/46.0.6...46.0.7)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 46.0.7
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 21:46:13 +00:00
Manish Madan
02012e9a0b Merge pull request #2366 from arc53/dependabot/pip/application/tzdata-2026.1
chore(deps): bump tzdata from 2025.3 to 2026.1 in /application
2026-04-15 03:14:25 +05:30
dependabot[bot]
c2f027265a chore(deps): bump tzdata from 2025.3 to 2026.1 in /application
Bumps [tzdata](https://github.com/python/tzdata) from 2025.3 to 2026.1.
- [Release notes](https://github.com/python/tzdata/releases)
- [Changelog](https://github.com/python/tzdata/blob/master/NEWS.md)
- [Commits](https://github.com/python/tzdata/compare/2025.3...2026.1)

---
updated-dependencies:
- dependency-name: tzdata
  dependency-version: '2026.1'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 21:30:10 +00:00
Manish Madan
0ae615c10e Merge pull request #2365 from arc53/dependabot/pip/application/langchain-core-1.2.26
chore(deps): bump langchain-core from 1.2.23 to 1.2.26 in /application
2026-04-15 02:58:36 +05:30
dependabot[bot]
881d0da344 chore(deps): bump langchain-core from 1.2.23 to 1.2.26 in /application
Bumps [langchain-core](https://github.com/langchain-ai/langchain) from 1.2.23 to 1.2.26.
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](https://github.com/langchain-ai/langchain/compare/langchain-core==1.2.23...langchain-core==1.2.26)

---
updated-dependencies:
- dependency-name: langchain-core
  dependency-version: 1.2.26
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 21:10:32 +00:00
Manish Madan
1376de6bae Merge pull request #2364 from arc53/dependabot/pip/application/langsmith-0.7.26
chore(deps): bump langsmith from 0.7.23 to 0.7.26 in /application
2026-04-15 02:39:17 +05:30
dependabot[bot]
362ebfcc0a chore(deps): bump langsmith from 0.7.23 to 0.7.26 in /application
Bumps [langsmith](https://github.com/langchain-ai/langsmith-sdk) from 0.7.23 to 0.7.26.
- [Release notes](https://github.com/langchain-ai/langsmith-sdk/releases)
- [Commits](https://github.com/langchain-ai/langsmith-sdk/compare/v0.7.23...v0.7.26)

---
updated-dependencies:
- dependency-name: langsmith
  dependency-version: 0.7.26
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 20:29:55 +00:00
Manish Madan
bc77eed3d8 Merge pull request #2363 from arc53/dependabot/pip/application/jsonpointer-3.1.1
chore(deps): bump jsonpointer from 3.0.0 to 3.1.1 in /application
2026-04-15 01:58:23 +05:30
dependabot[bot]
1f346588e7 chore(deps): bump jsonpointer from 3.0.0 to 3.1.1 in /application
Bumps [jsonpointer](https://github.com/stefankoegl/python-json-pointer) from 3.0.0 to 3.1.1.
- [Commits](https://github.com/stefankoegl/python-json-pointer/compare/v3.0.0...v3.1.1)

---
updated-dependencies:
- dependency-name: jsonpointer
  dependency-version: 3.1.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 20:13:38 +00:00
Alex
2fed5c882b Merge pull request #2383 from arc53/codex/add-zizmor-ci-configuration
Add GitHub Actions zizmor security workflow
2026-04-14 18:02:18 +01:00
Alex
aa938d76d7 Add GitHub Actions zizmor security workflow 2026-04-14 17:56:14 +01:00
Manish Madan
2940628aa6 Merge pull request #2319 from arc53/dependabot/npm_and_yarn/frontend/npm_and_yarn-e5a595f223
chore(deps-dev): bump flatted from 3.4.1 to 3.4.2 in /frontend in the npm_and_yarn group across 1 directory
2026-04-14 21:30:54 +05:30
dependabot[bot]
7f23928134 chore(deps-dev): bump flatted
Bumps the npm_and_yarn group with 1 update in the /frontend directory: [flatted](https://github.com/WebReflection/flatted).


Updates `flatted` from 3.4.1 to 3.4.2
- [Commits](https://github.com/WebReflection/flatted/compare/v3.4.1...v3.4.2)

---
updated-dependencies:
- dependency-name: flatted
  dependency-version: 3.4.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 15:10:18 +00:00
Alex
20e17c84c7 Merge pull request #2379 from arc53/codex/refine-and-review-incident-response-plan
Add INCIDENT_RESPONSE.md and reference it from SECURITY.md
2026-04-14 14:59:04 +01:00
copilot-swe-agent[bot]
389ddf6068 Fix secret references in INCIDENT_RESPONSE.md to match actual DocsGPT config
Agent-Logs-Url: https://github.com/arc53/DocsGPT/sessions/c6bfd68d-4dac-46ec-8404-fe5bfda0e8f3

Co-authored-by: dartpain <15183589+dartpain@users.noreply.github.com>
2026-04-14 10:51:22 +00:00
Alex
1e2443fb90 Update .github/INCIDENT_RESPONSE.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-14 11:49:13 +01:00
Manish Madan
6387bd1892 Merge pull request #2303 from arc53/dependabot/npm_and_yarn/frontend/typescript-eslint/eslint-plugin-8.57.1
chore(deps-dev): bump @typescript-eslint/eslint-plugin from 8.46.3 to 8.57.1 in /frontend
2026-04-14 14:16:35 +05:30
dependabot[bot]
7d22724d1c chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 8.46.3 to 8.57.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.57.1/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-version: 8.57.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 08:39:33 +00:00
Manish Madan
f6f12f6895 Merge pull request #2302 from arc53/dependabot/npm_and_yarn/frontend/prettier-plugin-tailwindcss-0.7.2
chore(deps-dev): bump prettier-plugin-tailwindcss from 0.7.1 to 0.7.2 in /frontend
2026-04-14 14:07:38 +05:30
dependabot[bot]
934127f323 chore(deps-dev): bump prettier-plugin-tailwindcss in /frontend
Bumps [prettier-plugin-tailwindcss](https://github.com/tailwindlabs/prettier-plugin-tailwindcss) from 0.7.1 to 0.7.2.
- [Release notes](https://github.com/tailwindlabs/prettier-plugin-tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/prettier-plugin-tailwindcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/prettier-plugin-tailwindcss/compare/v0.7.1...v0.7.2)

---
updated-dependencies:
- dependency-name: prettier-plugin-tailwindcss
  dependency-version: 0.7.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 08:25:09 +00:00
Manish Madan
1780e3cc91 Merge pull request #2301 from arc53/dependabot/npm_and_yarn/frontend/react-i18next-16.5.8
chore(deps): bump react-i18next from 16.2.4 to 16.5.8 in /frontend
2026-04-14 13:53:12 +05:30
ManishMadan2882
5e7fab2f34 (chore:fe) i18next 2026-04-14 13:50:03 +05:30
Alex
92ae76f95e Merge pull request #2381 from arc53/pg-3
feat: pre deprecation
2026-04-14 08:33:42 +01:00
Alex
18755bdd9b fix: workflow tests 2026-04-14 00:35:57 +01:00
Alex
0f20adcbf4 feat: pre deprecation 2026-04-14 00:19:50 +01:00
Alex
18e2a829c9 docs: apply revised incident response plan wording 2026-04-13 14:11:45 +01:00
dependabot[bot]
cd44501a71 chore(deps): bump react-i18next from 16.2.4 to 16.5.8 in /frontend
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 16.2.4 to 16.5.8.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v16.2.4...v16.5.8)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-version: 16.5.8
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-13 12:36:35 +00:00
Manish Madan
f8ebdf3fd4 Merge pull request #2300 from arc53/dependabot/npm_and_yarn/frontend/i18next-browser-languagedetector-8.2.1
chore(deps): bump i18next-browser-languagedetector from 8.2.0 to 8.2.1 in /frontend
2026-04-13 18:03:46 +05:30
dependabot[bot]
7c6fca18ad chore(deps): bump i18next-browser-languagedetector in /frontend
Bumps [i18next-browser-languagedetector](https://github.com/i18next/i18next-browser-languageDetector) from 8.2.0 to 8.2.1.
- [Changelog](https://github.com/i18next/i18next-browser-languageDetector/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next-browser-languageDetector/compare/v8.2.0...v8.2.1)

---
updated-dependencies:
- dependency-name: i18next-browser-languagedetector
  dependency-version: 8.2.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-13 12:28:26 +00:00
Alex
5fab798707 Merge pull request #2377 from arc53/pg-2
feat: pg-2
2026-04-12 14:09:52 +01:00
Alex
cb30a24e05 feat: fixes on pg2 2026-04-12 13:51:29 +01:00
Alex
530761d08c feat: pg-2 2026-04-12 13:35:32 +01:00
Alex
73fbc28744 Merge pull request #2376 from arc53/pg-1
Pg 1
2026-04-12 12:44:12 +01:00
Alex
b5b6538762 fix: tests 2026-04-12 12:35:23 +01:00
Alex
a9761061fc fix: mini issues 2026-04-12 12:24:58 +01:00
Alex
9388996a15 fix: ruff 2026-04-12 12:18:31 +01:00
Alex
875868b7e5 fix: comment 2026-04-12 12:16:19 +01:00
Alex
502819ae52 feat: pg migration, more tables 2026-04-12 12:15:59 +01:00
Alex
cada1a44fc Merge pull request #2373 from siiddhantt/feat/confluence-connector
feat: add Confluence connector for data ingestion
2026-04-12 11:33:49 +01:00
Alex
6192767451 fix: sanitize attachment filenames, drop dateutil dep, add connector docs 2026-04-12 11:32:24 +01:00
Alex
5c3e6eca54 Merge pull request #2375 from arc53/pg
feat: init pg migration
2026-04-12 10:44:31 +01:00
Alex
59d9d4ac50 fix: comments in settings 2026-04-12 10:42:46 +01:00
Alex
3931ccccee Merge pull request #2374 from ManishMadan2882/main
UX: Conversation scroll experience
2026-04-12 00:37:11 +01:00
Alex
55717043f6 fix: vale 2026-04-12 00:29:23 +01:00
Alex
ececcb8b17 feat: init pg migration 2026-04-12 00:07:24 +01:00
ManishMadan2882
420e9d3dd5 (feat) conversation: scroll experience 2026-04-10 19:27:07 +05:30
Siddhant Rai
749eed3d0b feat: add Confluence integration with authentication and file loading capabilities
- Enhanced settings.py to include Confluence client ID and secret
- Created ConfluenceAuth class for handling authentication with Confluence
- Implemented ConfluenceLoader class for loading data from Confluence
- Updated connector_creator.py to register Confluence as a connector
- Added confluence.svg asset for UI representation
- Modified ConnectorAuth component to support Confluence connection
- Updated FilePicker component to include Confluence as a file source
- Added localization support for Confluence in multiple languages (de, en, es, jp, ru, zh-TW, zh)
- Enhanced Upload component to handle Confluence file selection
- Updated ingestor types to include Confluence and its configuration
2026-04-10 19:10:35 +05:30
Alex
bd03a513e3 Merge pull request #2372 from arc53/fast-ebook
feat: faster ebook parsing
2026-04-09 18:38:13 +01:00
Alex
fcdb4fb5e8 feat: faster ebook parsing 2026-04-09 18:31:06 +01:00
Alex
e787c896eb upd Security.md 2026-04-08 12:49:20 +01:00
Alex
23aeaff5db Merge pull request #2362 from arc53/v1-mini-improvements
feat: history overwrite
2026-04-06 15:02:32 +01:00
Alex
689dd79597 fix: lang 2026-04-06 14:57:51 +01:00
Alex
0c15af90b1 feat: history overwrite 2026-04-06 14:42:01 +01:00
Alex
cdd6ff6557 chore: bump deps 2026-04-04 12:45:34 +01:00
Alex
72b3d94453 fix: tests 2026-04-03 18:30:46 +01:00
Alex
7e88d09e5d Merge branch 'main' of https://github.com/arc53/DocsGPT 2026-04-03 18:26:37 +01:00
Alex
74a4a237dc fix: bump deps 2026-04-03 18:26:29 +01:00
Alex
c3f01c6619 Merge pull request #2347 from ManishMadan2882/main
Minor frontend updates
2026-04-03 18:17:27 +01:00
Alex
6b408823d4 fix: mini theme color edits 2026-04-03 18:16:07 +01:00
Alex
3fc81ac5d8 fix: clean error 2026-04-03 18:08:38 +01:00
Alex
2652f8a5b0 fix: chatwoot 2026-04-03 18:04:49 +01:00
Alex
d711eefe96 patch: agent usage limits 2026-04-03 18:03:31 +01:00
Alex
79206f3919 fix: harden faiss 2026-04-03 17:57:49 +01:00
Alex
de971d9452 fix: validate mcp url 2026-04-03 17:52:48 +01:00
Alex
1b4d5ca0dd patch: mcp identity 2026-04-03 17:40:22 +01:00
Alex
81989e8258 fix: patch /v1/models 2026-04-03 17:37:09 +01:00
Alex
dc262d1698 patch: error 2026-04-03 17:30:23 +01:00
Alex
69f9c93869 patch: s3 2026-04-03 17:28:09 +01:00
Alex
74bf80b25c patch: sharing convos 2026-04-03 17:20:06 +01:00
Alex
d9a92a7208 feat: improve setup scripts 2026-04-03 17:15:21 +01:00
Alex
02e93d993d patch: available tools 2026-04-03 17:12:36 +01:00
Alex
6b6495f48c patch: key 2026-04-03 17:06:35 +01:00
Alex
249dd9ce37 patch: paths 2026-04-03 16:45:03 +01:00
Alex
9134ab0478 Merge branch 'main' of https://github.com/arc53/DocsGPT 2026-04-03 16:40:50 +01:00
Alex
10ef68c9d0 Revise vulnerability reporting process
Updated vulnerability reporting instructions to use GitHub's private reporting flow.
2026-04-03 16:36:10 +01:00
Alex
7d65cf1c2b chore: bump deps 2026-04-03 16:35:10 +01:00
Alex
13c6cc59c1 Merge pull request #2349 from arc53/messages-format
Messages format
2026-04-03 16:26:57 +01:00
Alex
6381f7dd4e fix: remove bad tests 2026-04-03 16:20:15 +01:00
Alex
e6ac4008fe feat: better tool names for llms 2026-04-03 15:35:50 +01:00
Alex
1af09f114d fix: tool mapping 2026-04-03 13:32:55 +01:00
Alex
be7da983e7 fix: remove internal tools when creating tools and better Approval gate UX 2026-04-03 10:36:48 +01:00
Alex
8b9e595d85 fix: structure improvements of messages 2026-04-01 14:58:44 +01:00
Alex
398f3acc8d fix: clean error 2026-04-01 13:01:02 +01:00
Alex
e04baa7ed8 feat: tests and approval gate 2026-04-01 12:49:32 +01:00
Alex
e5586b6f20 feat: frontend connection to api 2026-04-01 10:55:54 +01:00
Alex
addf57cab7 feat: compatible api 2026-03-31 23:10:09 +01:00
ManishMadan2882
648b3f1d20 (fix) lint/fe 2026-04-01 03:30:44 +05:30
ManishMadan2882
a75a9e23f9 (feat:fe) minor good things 2026-04-01 03:19:03 +05:30
Alex
73256389cf feat: client side tools 2026-03-31 22:20:55 +01:00
Alex
d609efca49 feat: continuation messages 2026-03-31 21:30:24 +01:00
Alex
772860b667 fix: mini fe changes 2026-03-31 11:59:38 +01:00
Alex
ea2fd8b04a chore: remove unused deps 2026-03-31 11:57:01 +01:00
Alex
2c73deac20 deps upgrades 2026-03-31 11:32:55 +01:00
Alex
47f3907e5e Merge pull request #2340 from arc53/coverage-3
chore: more tests
2026-03-31 00:50:46 +01:00
Alex
727495c553 fix: mongo in unit tests 2026-03-31 00:34:49 +01:00
Alex
a3b08a5b44 More tests 2026-03-31 00:07:19 +01:00
Alex
81532ada2a Merge pull request #2318 from siiddhantt/feat/standardize-css
feat: update styles and improve accessibility across frontend
2026-03-30 23:26:45 +01:00
ManishMadan2882
43f71374e5 (chore:fe) lint-fix 2026-03-30 23:26:11 +05:30
Alex
d5c0322e2a chore: more tests 2026-03-30 16:13:08 +01:00
Siddhant Rai
3b66a3176c fix: improve option matching logic in Dropdown component + selected style 2026-03-30 18:36:35 +05:30
Alex
dc6db847ca Merge pull request #2339 from arc53/test-handlers
chore: handlers tests
2026-03-30 13:06:53 +01:00
Alex
ed0063aada chore: handlers tests 2026-03-30 12:53:50 +01:00
Siddhant Rai
9a6a55b6da Merge branch 'main' into feat/standardize-css 2026-03-30 13:14:43 +05:30
Siddhant Rai
12a8368216 fix: merge conflicts 2026-03-30 13:12:24 +05:30
Alex
3f6d6f15ea Merge pull request #2338 from arc53/tests-utils
chore: utils tests
2026-03-29 11:54:59 +01:00
Alex
126fa01b14 chore: utils tests 2026-03-29 11:49:35 +01:00
Alex
e06debad5f chore: connector tests 2026-03-29 10:32:48 +01:00
Alex
6492852f7d fix: lint on seeder test 2026-03-29 09:48:46 +01:00
Alex
00a621f33a tests: retriever and seeder 2026-03-29 09:46:07 +01:00
Alex
e92ffc6fdc Merge pull request #2337 from arc53/tests-api
chore: api and tool tests
2026-03-28 22:55:10 +00:00
Alex
fe185e5b8d chore: api and tool tests 2026-03-28 21:51:47 +00:00
Alex
9f3d9ab860 docs: agent types 2026-03-28 18:50:50 +00:00
Alex
1c0adde380 chore: docs update 2026-03-28 17:04:06 +00:00
Alex
3c56bd0d0b docs: fix conflicts 2026-03-28 16:16:15 +00:00
Alex
86664ebda2 Merge pull request #2320 from Alex-wuhu/novita-integration
feat: complete Novita AI provider integration
2026-03-28 13:22:02 +00:00
Alex
db18b743d1 fix tests 2 2026-03-28 13:10:41 +00:00
Alex
9e85cc9065 fix: test errors 2026-03-28 13:04:21 +00:00
Alex
aaaa6f002d Merge branch 'main' of https://github.com/arc53/DocsGPT 2026-03-28 12:03:27 +00:00
Alex
47dcbcb74b fix: tests and sources on workflow agent 2026-03-28 12:03:16 +00:00
Alex
ddbfd94193 Adjust demo GIF height in README
Update the height of the demo GIF in README.
2026-03-28 11:09:05 +00:00
Alex
8dec60ab8b Update demo GIF in README
Replaced demo video GIF in README with a new example.
2026-03-28 11:08:10 +00:00
Alex
84b2e4bab4 fix: end node multi input 2026-03-28 10:00:01 +00:00
Siddhant Rai
193ca6fd63 fix: lint errors + redundant css 2026-03-28 15:02:16 +05:30
Alex
2afdd7f026 fix: enable tools in workflow agents 2026-03-27 13:43:09 +00:00
Alex
f364475f64 Merge pull request #2335 from arc53/tests-worker
tests: worker coverage
2026-03-27 12:14:56 +00:00
Alex
b254de6ed6 tests: worker coverage 2026-03-27 12:03:55 +00:00
Alex
08dedcaf95 Merge pull request #2334 from arc53/tests-vectors
tests: vectors
2026-03-26 19:11:20 +00:00
Alex
c726eb8ebd fix: ruff 2026-03-26 18:48:53 +00:00
Alex
5f0d39e5f1 tests: vectors 2026-03-26 18:42:59 +00:00
Alex
8c82fc5495 Merge pull request #2333 from arc53/chore/bump-npm-v0.6.3
chore: bump npm libraries to v0.6.3
2026-03-26 14:17:09 +00:00
github-actions[bot]
6d81a15e97 chore: bump npm libraries to v0.6.3 2026-03-26 14:16:32 +00:00
Alex
5478e4234c chore: bump npm again 2026-03-26 14:06:05 +00:00
Alex
4056278fef chore: bump npm 2026-03-26 13:48:05 +00:00
Alex
ee6530fe00 chore: bump npm 2026-03-26 13:42:51 +00:00
Alex
7c1decbcc3 fix: npm publish 2026-03-26 13:39:09 +00:00
Alex
8a3c724b31 Merge pull request #2332 from arc53/fix-fallbacks-onstream
Fix fallbacks onstream
2026-03-26 13:18:32 +00:00
Alex
15d4e9dbf5 fix: strict json in openai base 2026-03-26 13:08:22 +00:00
Siddhant Rai
174dee0fe6 fix: inconsistencies with prev color patterns 2026-03-26 18:32:57 +05:30
Alex
f7bfd38b28 fix: proper fallback handling within agent during stream 2026-03-26 12:52:30 +00:00
Alex
187e5da61e Merge pull request #2328 from ManishMadan2882/main
Chore(React widget): automate publish via actions
2026-03-26 11:15:52 +00:00
Alex
175ed58d2e Merge pull request #2327 from arc53/research-agent
Research agent
2026-03-26 11:00:23 +00:00
Alex
820ee3a843 fix test 2026-03-25 22:53:09 +00:00
Alex
462f2e9494 mini refactors 2026-03-25 22:34:25 +00:00
ManishMadan2882
c4968a641e (chore)react-widget: add linter 2026-03-26 02:06:49 +05:30
Alex
c6ece177cd fix: small issues 2026-03-25 20:04:03 +00:00
ManishMadan2882
a3e6a5622d (chore)react-widget: build, publish act 2026-03-26 01:03:09 +05:30
Alex
e8d11fdfa6 feat: list files internal tool 2026-03-25 19:21:46 +00:00
Alex
72393dc369 feat: improve research 2026-03-25 17:42:24 +00:00
Alex
556b0a1da5 feat: research init 2026-03-25 15:16:18 +00:00
Siddhant Rai
844167ba06 Merge branch 'feat/standardize-css' of https://github.com/siiddhantt/DocsGPT into feat/standardize-css 2026-03-25 19:36:35 +05:30
Siddhant Rai
6fa3acb1ca style: standardize colors across components according to figma 2026-03-25 19:36:32 +05:30
Alex
32c268a21e refactor: simplify agent architecture and remove ReActAgent 2026-03-25 12:47:17 +00:00
Alex
ed34c2b929 fix: tool name in agent builder 2026-03-25 11:29:31 +00:00
Alex
06e827573c fix: jinja 2026-03-25 00:03:42 +00:00
Manish Madan
74e76d4cda Merge pull request #2323 from arc53/dependabot/npm_and_yarn/extensions/react-widget/styled-components-6.3.12
chore(deps): bump styled-components from 6.1.11 to 6.3.12 in /extensions/react-widget
2026-03-25 01:44:55 +05:30
ManishMadan2882
db5c69ca76 (chore)react widget: build fix 2026-03-25 01:37:09 +05:30
Alex
9fd063266b Mini fixes 2026-03-24 01:40:29 +00:00
dependabot[bot]
05aa9d7cca chore(deps): bump styled-components in /extensions/react-widget
Bumps [styled-components](https://github.com/styled-components/styled-components) from 6.1.11 to 6.3.12.
- [Release notes](https://github.com/styled-components/styled-components/releases)
- [Commits](https://github.com/styled-components/styled-components/compare/v6.1.11...styled-components@6.3.12)

---
updated-dependencies:
- dependency-name: styled-components
  dependency-version: 6.3.12
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-23 20:53:52 +00:00
Manish Madan
dcececd118 Merge pull request #2298 from arc53/dependabot/npm_and_yarn/extensions/react-widget/flow-bin-0.305.1
chore(deps): bump flow-bin from 0.305.0 to 0.305.1 in /extensions/react-widget
2026-03-23 19:26:21 +05:30
Alex-wuhu
eaf39bb15b feat: add Novita AI as LLM provider
Add Novita AI (https://novita.ai) as a new LLM provider option.
Novita offers OpenAI-compatible API endpoints with competitive pricing.
2026-03-23 10:52:26 +08:00
dependabot[bot]
6515481624 chore(deps): bump flow-bin in /extensions/react-widget
Bumps [flow-bin](https://github.com/flowtype/flow-bin) from 0.305.0 to 0.305.1.
- [Release notes](https://github.com/flowtype/flow-bin/releases)
- [Commits](https://github.com/flowtype/flow-bin/commits)

---
updated-dependencies:
- dependency-name: flow-bin
  dependency-version: 0.305.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-22 20:29:46 +00:00
Manish Madan
6a7e3b6d77 Merge pull request #2165 from arc53/dependabot/pip/extensions/slack-bot/pip-75f0befae6
chore(deps): bump h11 from 0.14.0 to 0.16.0 in /extensions/slack-bot in the pip group across 1 directory
2026-03-23 01:48:58 +05:30
dependabot[bot]
02804fecce chore(deps): bump h11
Bumps the pip group with 1 update in the /extensions/slack-bot directory: [h11](https://github.com/python-hyper/h11).


Updates `h11` from 0.14.0 to 0.16.0
- [Commits](https://github.com/python-hyper/h11/compare/v0.14.0...v0.16.0)

---
updated-dependencies:
- dependency-name: h11
  dependency-version: 0.16.0
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-22 19:41:56 +00:00
Siddhant Rai
324a8cd4cf refactor: update styles and improve accessibility across frontend
- Updated text colors to use foreground and muted-foreground for better contrast.
- Replaced hardcoded colors with theme-based classes for consistency.
- Enhanced input fields with icons for improved usability.
- Adjusted button styles for a more cohesive design.
- Refactored search input components to use consistent styling and layout.
- Improved layout and spacing in various components for better user experience.
- Updated tool and source titles and subtitles for clarity.
2026-03-20 17:10:27 +05:30
Alex
ce5cd5561a loading animation 2026-03-19 15:52:27 +00:00
Manish Madan
adeefce9aa (fix)widget: since v6 shouldForwardProp is no longer provided by default (#2312) 2026-03-18 15:32:36 +00:00
Alex
5ab43fd12c fix: lang overflows, sst (#2314) 2026-03-18 14:50:29 +00:00
Alex
5894e47189 Agents.md 2026-03-18 00:34:26 +00:00
Manish Madan
ca61d81f4a Merge pull request #2154 from arc53/dependabot/npm_and_yarn/extensions/react-widget/typescript-5.9.3
chore(deps-dev): bump typescript from 5.4.5 to 5.9.3 in /extensions/react-widget
2026-03-18 01:43:16 +05:30
Alex
b12d0ca7b1 screenshot memo on contributing md 2026-03-17 19:45:37 +00:00
Alex
21996af626 stt init (#2306)
* stt init

* fix: limits

* fix: errors

* fix: error messages
2026-03-17 14:27:48 +00:00
ManishMadan2882
cc3b174e5a Merge branch 'dependabot/npm_and_yarn/extensions/react-widget/typescript-5.9.3' of https://github.com/arc53/docsgpt into dependabot/npm_and_yarn/extensions/react-widget/typescript-5.9.3 2026-03-16 17:08:06 +05:30
dependabot[bot]
faee58fb1e chore(deps-dev): bump typescript in /extensions/react-widget
Bumps [typescript](https://github.com/microsoft/TypeScript) from 5.4.5 to 5.9.3.
- [Release notes](https://github.com/microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release-publish.yml)
- [Commits](https://github.com/microsoft/TypeScript/compare/v5.4.5...v5.9.3)

---
updated-dependencies:
- dependency-name: typescript
  dependency-version: 5.9.3
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-16 11:37:27 +00:00
Manish Madan
d439e48b39 Merge pull request #2151 from arc53/dependabot/npm_and_yarn/extensions/react-widget/flow-bin-0.290.0
chore(deps): bump flow-bin from 0.229.2 to 0.290.0 in /extensions/react-widget
2026-03-16 17:05:33 +05:30
ManishMadan2882
3f0f155d64 Merge branch 'dependabot/npm_and_yarn/extensions/react-widget/flow-bin-0.290.0' of https://github.com/arc53/docsgpt into dependabot/npm_and_yarn/extensions/react-widget/flow-bin-0.290.0 2026-03-16 17:02:35 +05:30
dependabot[bot]
d82d512319 chore(deps): bump flow-bin in /extensions/react-widget
Bumps [flow-bin](https://github.com/flowtype/flow-bin) from 0.229.2 to 0.290.0.
- [Release notes](https://github.com/flowtype/flow-bin/releases)
- [Commits](https://github.com/flowtype/flow-bin/commits)

---
updated-dependencies:
- dependency-name: flow-bin
  dependency-version: 0.290.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-16 11:29:26 +00:00
Manish Madan
76aea1716f Merge pull request #2150 from arc53/dependabot/npm_and_yarn/extensions/react-widget/babel-loader-10.0.0
chore(deps-dev): bump babel-loader from 8.3.0 to 10.0.0 in /extensions/react-widget
2026-03-16 16:57:38 +05:30
ManishMadan2882
586649b73f (chore) react-widget: peer deps update 2026-03-16 16:56:02 +05:30
dependabot[bot]
0349a79cb3 chore(deps-dev): bump babel-loader in /extensions/react-widget
Bumps [babel-loader](https://github.com/babel/babel-loader) from 8.3.0 to 10.0.0.
- [Release notes](https://github.com/babel/babel-loader/releases)
- [Changelog](https://github.com/babel/babel-loader/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel-loader/compare/v8.3.0...v10.0.0)

---
updated-dependencies:
- dependency-name: babel-loader
  dependency-version: 10.0.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-15 21:30:12 +00:00
Manish Madan
78a255bdd7 Merge pull request #2148 from arc53/dependabot/npm_and_yarn/extensions/react-widget/radix-ui/react-icons-1.3.2
chore(deps): bump @radix-ui/react-icons from 1.3.0 to 1.3.2 in /extensions/react-widget
2026-03-16 02:58:35 +05:30
ManishMadan2882
5b30e71aa1 (chore)deps fe and react-widget: audit fix 2026-03-16 02:24:02 +05:30
ManishMadan2882
99d84aece9 Merge branch 'dependabot/npm_and_yarn/extensions/react-widget/radix-ui/react-icons-1.3.2' of https://github.com/arc53/docsgpt into dependabot/npm_and_yarn/extensions/react-widget/radix-ui/react-icons-1.3.2 2026-03-16 01:57:14 +05:30
dependabot[bot]
525d8eb66d chore(deps): bump @radix-ui/react-icons in /extensions/react-widget
Bumps @radix-ui/react-icons from 1.3.0 to 1.3.2.

---
updated-dependencies:
- dependency-name: "@radix-ui/react-icons"
  dependency-version: 1.3.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-15 20:25:18 +00:00
Manish Madan
4c810108e0 Merge pull request #2145 from arc53/dependabot/npm_and_yarn/frontend/lint-staged-16.2.6
chore(deps-dev): bump lint-staged from 15.5.2 to 16.2.6 in /frontend
2026-03-16 01:41:34 +05:30
dependabot[bot]
fc03cdc76a chore(deps-dev): bump lint-staged from 15.5.2 to 16.2.6 in /frontend
Bumps [lint-staged](https://github.com/lint-staged/lint-staged) from 15.5.2 to 16.2.6.
- [Release notes](https://github.com/lint-staged/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/main/CHANGELOG.md)
- [Commits](https://github.com/lint-staged/lint-staged/compare/v15.5.2...v16.2.6)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-version: 16.2.6
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-15 20:09:03 +00:00
Manish Madan
9779a563f3 Merge pull request #2144 from arc53/dependabot/npm_and_yarn/frontend/vitejs/plugin-react-5.1.0
chore(deps-dev): bump @vitejs/plugin-react from 4.7.0 to 5.1.0 in /frontend
2026-03-16 01:36:30 +05:30
ManishMadan2882
6141c3c348 (chore:deps) fe, vite up 2026-03-16 01:34:24 +05:30
dependabot[bot]
c3726ddfc9 chore(deps-dev): bump @vitejs/plugin-react in /frontend
Bumps [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/tree/HEAD/packages/plugin-react) from 4.7.0 to 5.1.0.
- [Release notes](https://github.com/vitejs/vite-plugin-react/releases)
- [Changelog](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite-plugin-react/commits/plugin-react@5.1.0/packages/plugin-react)

---
updated-dependencies:
- dependency-name: "@vitejs/plugin-react"
  dependency-version: 5.1.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-15 19:03:36 +00:00
Manish Madan
10eaa8143e Merge pull request #2143 from arc53/dependabot/npm_and_yarn/frontend/tailwindcss-4.1.17
chore(deps-dev): bump tailwindcss from 4.1.16 to 4.1.17 in /frontend
2026-03-16 00:31:39 +05:30
dependabot[bot]
0c4f4e1f0c chore(deps-dev): bump tailwindcss from 4.1.16 to 4.1.17 in /frontend
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss/tree/HEAD/packages/tailwindcss) from 4.1.16 to 4.1.17.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/commits/v4.1.17/packages/tailwindcss)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-version: 4.1.17
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-15 18:56:49 +00:00
Manish Madan
b225c3cd80 Merge pull request #2149 from arc53/dependabot/npm_and_yarn/frontend/i18next-25.6.1
chore(deps): bump i18next from 25.6.0 to 25.6.1 in /frontend
2026-03-16 00:24:57 +05:30
dependabot[bot]
b558645d6b chore(deps): bump i18next from 25.6.0 to 25.6.1 in /frontend
Bumps [i18next](https://github.com/i18next/i18next) from 25.6.0 to 25.6.1.
- [Release notes](https://github.com/i18next/i18next/releases)
- [Changelog](https://github.com/i18next/i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next/compare/v25.6.0...v25.6.1)

---
updated-dependencies:
- dependency-name: i18next
  dependency-version: 25.6.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-15 18:48:31 +00:00
Manish Madan
03b0889b15 Merge pull request #2152 from arc53/dependabot/npm_and_yarn/frontend/react-syntax-highlighter-16.1.0
chore(deps): bump react-syntax-highlighter from 15.6.6 to 16.1.0 in /frontend
2026-03-15 19:58:29 +05:30
dependabot[bot]
943fe3651c chore(deps): bump react-syntax-highlighter in /frontend
Bumps [react-syntax-highlighter](https://github.com/react-syntax-highlighter/react-syntax-highlighter) from 15.6.6 to 16.1.0.
- [Release notes](https://github.com/react-syntax-highlighter/react-syntax-highlighter/releases)
- [Changelog](https://github.com/react-syntax-highlighter/react-syntax-highlighter/blob/master/CHANGELOG.MD)
- [Commits](https://github.com/react-syntax-highlighter/react-syntax-highlighter/compare/v15.6.6...v16.1.0)

---
updated-dependencies:
- dependency-name: react-syntax-highlighter
  dependency-version: 16.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-13 22:58:44 +00:00
Siddhant Rai
65e57be4dd feat: dynamic config rendering + mcp tool enhancement (#2286)
* feat: enhance modal functionality and configuration handling

- Updated WrapperModal to improve click outside detection for closing the modal.
- Refactored ToolConfig to utilize ConfigFieldSpec for better configuration management.
- Added validation and dynamic handling of configuration fields in ToolConfig.
- Introduced reconnect functionality for MCP tools in the Tools component.
- Enhanced user experience with improved error handling and loading states.
- Updated types for better type safety and clarity in configuration requirements.

* refactor: reorganize imports and improve conditional formatting

* fix: revert API_URL to use backend service name in docker-compose

* feat: add MCP auth status endpoint and integrate into user service and tools

* feat: implement logging for Brave, Postgres, and Telegram tools; add transport sanitization and credential extraction for MCP

---------

Co-authored-by: Alex <a@tushynski.me>
2026-03-13 15:58:50 +00:00
Siddhant Rai
13ad3b5dce feat: enhance logging and error handling across various tools; update DuckDuckGo dependency (#2282)
Co-authored-by: Alex <a@tushynski.me>
2026-03-12 16:50:29 +00:00
Manish Madan
918bbf0369 Sharepoint (#2283)
* feat: add Microsoft Entra ID integration

- Updated .env-template and settings.py for Microsoft Entra ID configuration.
- Enhanced ConnectorsCallback to support SharePoint authentication.
- Introduced SharePointAuth and SharePointLoader classes.
- Added required dependencies in requirements.txt.

* feat: agent templates and seeding premade agents (#1910)

* feat: agent templates and seeding premade agents

* fix: ensure ObjectId is used for source reference in agent configuration

* fix: improve source handling in DatabaseSeeder and update tool config processing

* feat: add prompt handling in DatabaseSeeder for agent configuration

* Docs premade agents

* link to prescraped docs

* feat: add template agent retrieval and adopt agent functionality

* feat: simplify agent descriptions in premade_agents.yaml  added docs

---------

Co-authored-by: Pavel <pabin@yandex.ru>
Co-authored-by: Alex <a@tushynski.me>

* feat: add GitHub access token support and fix file content fetching logic (#2032)

* feat: add init for Share Point connector module

* chore(deps): bump mermaid from 11.6.0 to 11.12.0 in /frontend

Bumps [mermaid](https://github.com/mermaid-js/mermaid) from 11.6.0 to 11.12.0.
- [Release notes](https://github.com/mermaid-js/mermaid/releases)
- [Commits](https://github.com/mermaid-js/mermaid/compare/mermaid@11.6.0...mermaid@11.12.0)

---
updated-dependencies:
- dependency-name: mermaid
  dependency-version: 11.12.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Feat: Notification section (#2033)

* Feature/Notification-section

* fix notification ui and add local storage variable to save the state

* add notification component to app.tsx

* refactor: remove MICROSOFT_REDIRECT_URI and update SharePointAuth to use CONNECTOR_REDIRECT_BASE_URI

* feat: Add button to cancel LLM response (#1978)

* feat: Add button to cancel LLM response
- Replace text area with cancel button when loading.
- Add useEffect to change ellipsis in cancel button text.
- Add new SVG icon for cancel response.
- Button colors match Figma designs.

* fix: Cancel button UI matches new design
- Delete cancel-response svg.
- Change previous cancel button to match the new Figma design.
- Remove console log in handleCancel function.

* fix: Adjust cancel button rounding

* feat: Update UI for send button
- Add SendArrowIcon component, enables dynamic svg color changes
- Replace original icon
- Update colors and hover effects

* (fix:send-button) minor blink in transition

---------

Co-authored-by: Manish Madan <manishmadan321@gmail.com>

* feat: add SharePoint integration with session validation and UI components

* (feat:oneDrive) file loading for ingestion

* feat(oneDrive): shared user files

* (feat:oneDrive) rm shared file support, as sharedWithMe is degraded

* (feat:sharepoint) shared files for work msa

* (feat:sharepoint) retry on auth failure, decorator

* (fix) tests/ruff

* test: fix sharepoint loader expecting client id

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Abhishek Malviya <abfeb8@gmail.com>
Co-authored-by: Siddhant Rai <47355538+siiddhantt@users.noreply.github.com>
Co-authored-by: Pavel <pabin@yandex.ru>
Co-authored-by: Alex <a@tushynski.me>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mariam Saeed <69825646+Mariam-Saeed@users.noreply.github.com>
Co-authored-by: Rahul <rahulgithub96@gmail.com>
2026-03-12 14:46:26 +00:00
Alex
5006271abb fix stream stuff (#2293) 2026-03-11 11:43:27 +00:00
dependabot[bot]
2c2bdd37d5 chore(deps-dev): bump typescript in /extensions/react-widget
Bumps [typescript](https://github.com/microsoft/TypeScript) from 5.4.5 to 5.9.3.
- [Release notes](https://github.com/microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release-publish.yml)
- [Commits](https://github.com/microsoft/TypeScript/compare/v5.4.5...v5.9.3)

---
updated-dependencies:
- dependency-name: typescript
  dependency-version: 5.9.3
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-07 15:19:11 +00:00
dependabot[bot]
6a00319c2d chore(deps): bump flow-bin in /extensions/react-widget
Bumps [flow-bin](https://github.com/flowtype/flow-bin) from 0.229.2 to 0.290.0.
- [Release notes](https://github.com/flowtype/flow-bin/releases)
- [Commits](https://github.com/flowtype/flow-bin/commits)

---
updated-dependencies:
- dependency-name: flow-bin
  dependency-version: 0.290.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-07 15:19:00 +00:00
dependabot[bot]
66870279d3 chore(deps): bump @radix-ui/react-icons in /extensions/react-widget
Bumps @radix-ui/react-icons from 1.3.0 to 1.3.2.

---
updated-dependencies:
- dependency-name: "@radix-ui/react-icons"
  dependency-version: 1.3.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-07 15:18:48 +00:00
513 changed files with 108765 additions and 15046 deletions


@@ -3,6 +3,14 @@ LLM_NAME=docsgpt
VITE_API_STREAMING=true
INTERNAL_KEY=<internal key for worker-to-backend authentication>
# Provider-specific API keys (optional - use these to enable multiple providers)
# OPENAI_API_KEY=<your-openai-api-key>
# ANTHROPIC_API_KEY=<your-anthropic-api-key>
# GOOGLE_API_KEY=<your-google-api-key>
# GROQ_API_KEY=<your-groq-api-key>
# NOVITA_API_KEY=<your-novita-api-key>
# OPEN_ROUTER_API_KEY=<your-openrouter-api-key>
# Remote Embeddings (Optional - for using a remote embeddings API instead of local SentenceTransformer)
# When set, the app will use the remote API and won't load SentenceTransformer (saves RAM)
EMBEDDINGS_BASE_URL=
@@ -12,4 +20,23 @@ EMBEDDINGS_KEY=
OPENAI_API_BASE=
OPENAI_API_VERSION=
AZURE_DEPLOYMENT_NAME=
AZURE_EMBEDDINGS_DEPLOYMENT_NAME=
AZURE_EMBEDDINGS_DEPLOYMENT_NAME=
#Azure AD Application (client) ID
MICROSOFT_CLIENT_ID=your-azure-ad-client-id
#Azure AD Application client secret
MICROSOFT_CLIENT_SECRET=your-azure-ad-client-secret
#Azure AD Tenant ID (or 'common' for multi-tenant)
MICROSOFT_TENANT_ID=your-azure-ad-tenant-id
#If you are using a Microsoft Entra ID tenant,
#configure the AUTHORITY variable as
#"https://login.microsoftonline.com/TENANT_GUID"
#or "https://login.microsoftonline.com/contoso.onmicrosoft.com".
#Alternatively, use "https://login.microsoftonline.com/common" for multi-tenant app.
MICROSOFT_AUTHORITY=https://{tenantId}.ciamlogin.com/{tenantId}
# User-data Postgres DB (Phase 0 of the MongoDB→Postgres migration).
# Standard Postgres URI — `postgres://` and `postgresql://` both work.
# Leave unset while the migration is still being rolled out; the app will
# fall back to MongoDB for user data until POSTGRES_URI is configured.
# POSTGRES_URI=postgresql://docsgpt:docsgpt@localhost:5432/docsgpt
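The fallback behavior described in the comments above can be illustrated with a small sketch. This is not DocsGPT's actual wiring; the function name and return values are illustrative only.

```python
# Hypothetical sketch of the documented fallback: use Postgres for user data
# only when POSTGRES_URI is set, otherwise keep using MongoDB.
import os

def user_data_backend() -> str:
    uri = os.environ.get("POSTGRES_URI", "").strip()
    if uri.startswith(("postgres://", "postgresql://")):
        return "postgres"
    return "mongodb"
```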

.github/INCIDENT_RESPONSE.md (vendored, new file, 99 lines)

@@ -0,0 +1,99 @@
# DocsGPT Incident Response Plan (IRP)
This playbook describes how maintainers respond to confirmed or suspected security incidents.
- Vulnerability reporting: [`SECURITY.md`](../SECURITY.md)
- Non-security bugs/features: [`CONTRIBUTING.md`](../CONTRIBUTING.md)
## Severity
| Severity | Definition | Typical examples |
|---|---|---|
| **Critical** | Active exploitation, supply-chain compromise, or confirmed data breach requiring immediate user action. | Compromised release artifact/image; remote execution. |
| **High** | Serious undisclosed vulnerability with no practical workaround, or CVSS >= 7.0. | Key leakage; prompt injection enabling cross-tenant access. |
| **Medium** | Material impact but constrained by preconditions/scope, or a practical workaround exists. | Auth-required exploit; dependency CVE with limited reachability. |
| **Low** | Defense-in-depth or narrow availability impact with no confirmed data exposure. | Missing rate limiting; hardening gap without exploit evidence. |
## Response workflow
### 1) Triage (target: initial response within 48 hours)
1. Acknowledge report.
2. Validate on latest release and `main`.
3. Confirm in-scope security issue vs. hardening item (per `SECURITY.md`).
4. Assign severity and open a **draft GitHub Security Advisory (GHSA)** (no public issue).
5. Determine whether root cause is DocsGPT code or upstream dependency/provider.
### 2) Investigation
1. Identify affected components, versions, and deployment scope (self-hosted, cloud, or both).
2. For AI issues, explicitly evaluate prompt injection, document isolation, and output leakage.
3. Request a CVE through GHSA for **Medium+** issues.
### 3) Containment, fix, and disclosure
1. Implement and test fix in private security workflow (GHSA private fork/branch).
2. Merge fix to `main`, cut patched release, and verify published artifacts/images.
3. Patch managed cloud deployment (`app.docsgpt.cloud`) and other deployments as soon as validated.
4. Publish GHSA with CVE (if assigned), affected/fixed versions, CVSS, mitigations, and upgrade guidance.
5. **Critical/High:** coordinate disclosure timing with reporter (goal: <= 90 days) and publish a notice.
6. **Medium/Low:** include in next scheduled release unless risk requires immediate out-of-band patching.
### 4) Post-incident
1. Monitor support channels (GitHub/Discord) for regressions or exploitation reports.
2. Run a short retrospective (root cause, detection, response gaps, prevention work).
3. Track follow-up hardening actions with owners/dates.
4. Update this IRP and related runbooks as needed.
## Scenario playbooks
### Supply-chain compromise
1. Freeze releases and investigate blast radius.
2. Rotate credentials in order: Docker Hub -> GitHub tokens -> LLM provider keys -> DB credentials -> `JWT_SECRET_KEY` -> `ENCRYPTION_SECRET_KEY` -> `INTERNAL_KEY`.
3. Replace compromised artifacts/tags with clean releases and revoke/remove bad tags where possible.
4. Publish advisory with exact affected versions and required user actions.
### Data exposure
1. Determine scope (users, documents, keys, logs, time window).
2. Disable affected path or hotfix immediately for managed cloud.
3. Notify affected users with concrete remediation steps (for example, rotate keys).
4. Continue through standard fix/disclosure workflow.
### Critical regression with security impact
1. Identify introducing change (`git bisect` if needed).
2. Publish workaround within 24 hours (for example, pin to known-good version).
3. Ship patch release with regression test and close incident with public summary.
## AI-specific guidance
Treat confirmed AI-specific abuse as security incidents:
- Prompt injection causing sensitive data exfiltration (from tools that don't belong to the agent) -> **High**
- Cross-tenant retrieval/isolation failure -> **High**
- API key disclosure in output -> **High**
## Secret rotation quick reference
| Secret | Standard rotation action |
|---|---|
| Docker Hub credentials | Revoke/replace in Docker Hub; update CI/CD secrets |
| GitHub tokens/PATs | Revoke/replace in GitHub; update automation secrets |
| LLM provider API keys | Rotate in provider console; update runtime/deploy secrets |
| Database credentials | Rotate in DB platform; redeploy with new secrets |
| `JWT_SECRET_KEY` | Rotate and redeploy (invalidates all active user sessions/tokens) |
| `ENCRYPTION_SECRET_KEY` | Rotate and redeploy (re-encrypt stored data if possible; existing encrypted data may become inaccessible) |
| `INTERNAL_KEY` | Rotate and redeploy (invalidates worker-to-backend authentication) |
## Maintenance
Review this document:
- after every **Critical/High** incident, and
- at least annually.
Changes should be proposed via pull request to `main`.

.github/THREAT_MODEL.md (vendored, new file, 144 lines)

@@ -0,0 +1,144 @@
# DocsGPT Public Threat Model
**Classification:** Public
**Last updated:** 2026-04-15
**Applies to:** Open-source and self-hosted DocsGPT deployments
## 1) Overview
DocsGPT ingests content (files/URLs/connectors), indexes it, and answers queries via LLM-backed APIs and optional tools.
Core components:
- Backend API (`application/`)
- Workers/ingestion (`application/worker.py` and related modules)
- Datastores (MongoDB/Redis/vector stores)
- Frontend (`frontend/`)
- Optional extensions/integrations (`extensions/`)
## 2) Scope and assumptions
In scope:
- Application-level threats in this repository.
- Local and internet-exposed self-hosted deployments.
Assumptions:
- Internet-facing instances enable auth and use strong secrets.
- Datastores/internal services are not publicly exposed.
Out of scope:
- Cloud hardware/provider compromise.
- Security guarantees of external LLM vendors.
- Full security audits of third-party systems targeted by tools (external DBs/MCP servers/code-exec APIs).
## 3) Security objectives
- Protect document/conversation confidentiality.
- Preserve integrity of prompts, agents, tools, and indexed data.
- Maintain API/worker availability.
- Enforce tenant isolation in authenticated deployments.
## 4) Assets
- Documents, attachments, chunks/embeddings, summaries.
- Conversations, agents, workflows, prompt templates.
- Secrets (JWT secret, `INTERNAL_KEY`, provider/API/OAuth credentials).
- Operational capacity (worker throughput, queue depth, model quota/cost).
## 5) Trust boundaries and untrusted input
Trust boundaries:
- Internet ↔ Frontend
- Frontend ↔ Backend API
- Backend ↔ Workers/internal APIs
- Backend/workers ↔ Datastores
- Backend ↔ External LLM/connectors/remote URLs
Untrusted input includes API payloads, file uploads, remote URLs, OAuth/webhook data, retrieved content, and LLM/tool arguments.
## 6) Main attack surfaces
1. Auth/authz paths and sharing tokens.
2. File upload + parsing pipeline.
3. Remote URL fetching and connectors (SSRF risk).
4. Agent/tool execution from LLM output.
5. Template/workflow rendering.
6. Frontend rendering + token storage.
7. Internal service endpoints (`INTERNAL_KEY`).
8. High-impact integrations (SQL tool, generic API tool, remote MCP tools).
## 7) Key threats and expected mitigations
### A. Auth/authz misconfiguration
- Threat: weak/no auth or leaked tokens leads to broad data access.
- Mitigations: require auth for public deployments, short-lived tokens, rotation/revocation, least-privilege sharing.
### B. Untrusted file ingestion
- Threat: malicious files/archives trigger traversal, parser exploits, or resource exhaustion.
- Mitigations: strict path checks, archive safeguards, file limits, patched parser dependencies.
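A common shape for the archive safeguard, shown as a sketch rather than the project's actual extraction code:
```python
import os
import zipfile

def safe_extract(archive_path: str, dest_dir: str, max_entries: int = 1000) -> None:
    """Reject zip entries that would escape dest_dir ("zip slip") or exceed an entry-count limit."""
    dest_dir = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()
        if len(names) > max_entries:
            raise ValueError("archive contains too many entries")
        for name in names:
            target = os.path.realpath(os.path.join(dest_dir, name))
            if not target.startswith(dest_dir + os.sep):
                raise ValueError(f"unsafe path in archive: {name}")
        zf.extractall(dest_dir)
```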
### C. SSRF/outbound abuse
- Threat: URL loaders/tools access private/internal/metadata endpoints.
- Mitigations: validate URLs + redirects, block private/link-local ranges, apply egress controls/allowlists.
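A minimal form of the private-range check (illustrative; a real implementation must also re-check every redirect hop and account for DNS rebinding):
```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_url_allowed(url: str) -> bool:
    """Allow only http(s) URLs whose hostname resolves to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        resolved = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in resolved:
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if not ip.is_global:
            return False  # private, loopback, link-local, reserved, etc.
    return True
```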
### D. Prompt injection + tool abuse
- Threat: retrieved text manipulates model behavior and causes unsafe tool calls.
- Assumption: the model cannot be relied on to "choose correctly" under adversarial input.
- Mitigations: treat retrieved/model output as untrusted, enforce tool policies, only expose tools explicitly assigned by the user/admin to that agent, separate system instructions from retrieved content, audit tool calls.
### E. Dangerous tool capability chaining (SQL/API/MCP)
- Threat: write-capable SQL credentials allow destructive queries.
- Threat: API tool can trigger side effects (infra/payment/webhook/code-exec endpoints).
- Threat: remote MCP tools may expose privileged operations.
- Mitigations: read-only-by-default credentials, destination allowlists, explicit approval for write/exec actions, per-tool policy enforcement + logging.
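One way to express read-only-by-default for the SQL tool is a statement-level guard in front of execution (a rough sketch; the stronger control is a database account that only holds SELECT grants):
```python
READ_ONLY_PREFIXES = ("select", "show", "explain", "describe")

def assert_read_only(sql: str) -> None:
    """Crude allowlist: block anything that does not start with a read-only keyword."""
    if not sql.strip().lower().startswith(READ_ONLY_PREFIXES):
        raise PermissionError(f"blocked by tool policy (not read-only): {sql[:80]!r}")
```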
### F. Frontend/XSS + token theft
- Threat: XSS can steal local tokens and call APIs.
- Mitigations: reduce unsafe rendering paths, strong CSP, scoped short-lived credentials.
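Since the backend is Flask, one place to attach a baseline CSP is an `after_request` hook (a sketch; the exact directives depend on which assets and widgets a deployment actually serves):
```python
from flask import Flask

app = Flask(__name__)

@app.after_request
def set_security_headers(response):
    # Restrictive baseline; widen individual directives only for sources you trust.
    response.headers["Content-Security-Policy"] = (
        "default-src 'self'; script-src 'self'; object-src 'none'; "
        "frame-ancestors 'none'; base-uri 'self'"
    )
    response.headers["X-Content-Type-Options"] = "nosniff"
    return response
```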
### G. Internal endpoint exposure
- Threat: weak/unset `INTERNAL_KEY` enables internal API abuse.
- Mitigations: fail closed, require strong random keys, keep internal APIs private.
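The fail-closed expectation, sketched as a helper (the header name and length threshold are illustrative, not the repository's actual implementation):
```python
import hmac
import os

def require_internal_key(headers: dict) -> None:
    """Reject internal-API calls unless a strong INTERNAL_KEY is set and matches."""
    expected = os.environ.get("INTERNAL_KEY", "")
    if len(expected) < 32:
        # Fail closed: an unset or weak key disables the internal API entirely.
        raise PermissionError("INTERNAL_KEY missing or too short")
    if not hmac.compare_digest(headers.get("X-Internal-Key", ""), expected):
        raise PermissionError("invalid internal key")
```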
### H. DoS and cost abuse
- Threat: request floods, large ingestion jobs, expensive prompts/crawls.
- Mitigations: rate limits, quotas, timeouts, queue backpressure, usage budgets.
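Rate limiting usually belongs at the proxy, but an in-process token bucket works as a backstop (a sketch with arbitrary example thresholds):
```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-key limiter: roughly `rate` requests/second with bursts up to `capacity`."""

    def __init__(self, rate: float = 1.0, capacity: float = 10.0):
        self.rate, self.capacity = rate, capacity
        self.state = defaultdict(lambda: (capacity, time.monotonic()))

    def allow(self, key: str) -> bool:
        tokens, last = self.state[key]
        now = time.monotonic()
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        self.state[key] = (tokens - 1.0 if allowed else tokens, now)
        return allowed
```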
## 8) Example attacker stories
- Internet-exposed deployment with weak/no auth suffers unauthorized data access and abuse.
- Intranet deployment intentionally running weak/no auth remains exposed to insider misuse and lateral movement.
- Crafted archive attempts path traversal during extraction.
- Malicious URL/redirect chain targets internal services.
- Poisoned document causes data exfiltration through tool calls.
- Over-privileged SQL/API/MCP tool performs destructive side effects.
## 9) Severity calibration
- **Critical:** unauthenticated public data access; prompt-injection-driven exfiltration; SSRF to sensitive internal endpoints.
- **High:** cross-tenant leakage, persistent token compromise, over-privileged destructive tools.
- **Medium:** DoS/cost amplification and non-critical information disclosure.
- **Low:** minor hardening gaps with limited impact.
## 10) Baseline controls for public deployments
1. Enforce authentication and secure defaults.
2. Set/rotate strong secrets (`JWT`, `INTERNAL_KEY`, encryption keys).
3. Restrict CORS and front API with a hardened proxy.
4. Add rate limiting/quotas for answer/upload/crawl/token endpoints.
5. Enforce URL+redirect SSRF protections and egress restrictions.
6. Apply upload/archive/parsing hardening.
7. Require least-privilege tool credentials and auditable tool execution.
8. Monitor auth failures, tool anomalies, ingestion spikes, and cost anomalies.
9. Keep dependencies/images patched and scanned.
10. Validate multi-tenant isolation with explicit tests.
## 11) Maintenance
Review this model after major auth, ingestion, connector, tool, or workflow changes.
## References
- [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/)
- [OWASP ASVS](https://owasp.org/www-project-application-security-verification-standard/)
- [STRIDE overview](https://learn.microsoft.com/azure/security/develop/threat-modeling-tool-threats)
- [DocsGPT SECURITY.md](../SECURITY.md)

View File

@@ -1,46 +1,80 @@
Ollama
Qdrant
Milvus
Chatwoot
Nextra
VSCode
npm
LLMs
Agentic
Anthropic's
api
APIs
Groq
SGLang
LMDeploy
OAuth
Vite
LLM
JSONPath
UIs
Atlassian
automations
autoescaping
Autoescaping
backfill
backfills
bool
boolean
brave_web_search
chatbot
Chatwoot
config
configs
uncomment
qdrant
vectorstore
CSVs
dev
diarization
Docling
docsgpt
llm
docstrings
Entra
env
enqueues
EOL
ESLint
feedbacks
Figma
GPUs
Groq
hardcode
hardcoding
Idempotency
JSONPath
kubectl
Lightsail
enqueues
chatbot
VSCode's
Shareability
feedbacks
automations
llama_cpp
llm
LLM
LLMs
LMDeploy
Milvus
Mixtral
namespace
namespaces
needs_auth
Nextra
Novita
npm
OAuth
Ollama
opencode
parsable
passthrough
PDFs
pgvector
Postgres
Premade
Signup
Pydantic
pytest
Qdrant
qdrant
Repo
repo
env
URl
agentic
llama_cpp
parsable
Sanitization
SDKs
boolean
bool
hardcode
EOL
SGLang
Shareability
Signup
Supabase
UIs
uncomment
URl
vectorstore
Vite
VSCode
VSCode's
widget's

114
.github/workflows/npm-publish.yml vendored Normal file
View File

@@ -0,0 +1,114 @@
name: Publish npm libraries
on:
  workflow_dispatch:
    inputs:
      version:
        description: >
          Version bump type (patch | minor | major) or explicit semver (e.g. 1.2.3).
          Applies to both docsgpt and docsgpt-react.
        required: true
        default: patch
permissions:
  contents: write
  pull-requests: write
jobs:
  publish:
    runs-on: ubuntu-latest
    environment: npm-release
    defaults:
      run:
        working-directory: extensions/react-widget
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          registry-url: https://registry.npmjs.org
      - name: Install dependencies
        run: npm ci
      # ── docsgpt (HTML embedding bundle) ──────────────────────────────────
      # Uses the `build` script (parcel build src/browser.tsx) and keeps
      # the `targets` field so Parcel produces browser-optimised bundles.
      - name: Set package name → docsgpt
        run: jq --arg n "docsgpt" '.name=$n' package.json > _tmp.json && mv _tmp.json package.json
      - name: Bump version (docsgpt)
        id: version_docsgpt
        run: |
          VERSION="${{ github.event.inputs.version }}"
          NEW_VER=$(npm version "${VERSION:-patch}" --no-git-tag-version)
          echo "version=${NEW_VER#v}" >> "$GITHUB_OUTPUT"
      - name: Build docsgpt
        run: npm run build
      - name: Publish docsgpt
        run: npm publish --verbose
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
      # ── docsgpt-react (React library bundle) ─────────────────────────────
      # Uses `build:react` script (parcel build src/index.ts) and strips
      # the `targets` field so Parcel treats the output as a plain library
      # without browser-specific target resolution, producing a smaller bundle.
      - name: Reset package.json from source control
        run: git checkout -- package.json
      - name: Set package name → docsgpt-react
        run: jq --arg n "docsgpt-react" '.name=$n' package.json > _tmp.json && mv _tmp.json package.json
      - name: Remove targets field (react library build)
        run: jq 'del(.targets)' package.json > _tmp.json && mv _tmp.json package.json
      - name: Bump version (docsgpt-react) to match docsgpt
        run: npm version "${{ steps.version_docsgpt.outputs.version }}" --no-git-tag-version
      - name: Clean dist before react build
        run: rm -rf dist
      - name: Build docsgpt-react
        run: npm run build:react
      - name: Publish docsgpt-react
        run: npm publish --verbose
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
      # ── Commit the bumped version back to the repository ─────────────────
      - name: Reset package.json and write final version
        run: |
          git checkout -- package.json
          jq --arg v "${{ steps.version_docsgpt.outputs.version }}" '.version=$v' \
            package.json > _tmp.json && mv _tmp.json package.json
          npm install --package-lock-only
      - name: Commit version bump and create PR
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          BRANCH="chore/bump-npm-v${{ steps.version_docsgpt.outputs.version }}"
          git checkout -b "$BRANCH"
          git add package.json package-lock.json
          git commit -m "chore: bump npm libraries to v${{ steps.version_docsgpt.outputs.version }}"
          git push origin "$BRANCH"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      - name: Create PR
        run: |
          gh pr create \
            --title "chore: bump npm libraries to v${{ steps.version_docsgpt.outputs.version }}" \
            --body "Automated version bump after npm publish." \
            --base main
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

View File

@@ -0,0 +1,34 @@
name: React Widget Build
on:
  push:
    paths:
      - 'extensions/react-widget/**'
  pull_request:
    paths:
      - 'extensions/react-widget/**'
permissions:
  contents: read
jobs:
  build:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: extensions/react-widget
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
          cache-dependency-path: extensions/react-widget/package-lock.json
      - name: Install dependencies
        run: npm ci
      - name: Build
        run: npm run build

25
.github/workflows/zizmor.yml vendored Normal file
View File

@@ -0,0 +1,25 @@
name: GitHub Actions Security Analysis
on:
  push:
    branches: ["master"]
  pull_request:
    branches: ["**"]
permissions: {}
jobs:
  zizmor:
    runs-on: ubuntu-latest
    permissions:
      security-events: write # Required for upload-sarif (used by zizmor-action) to upload SARIF files.
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false
      - name: Run zizmor 🌈
        uses: zizmorcore/zizmor-action@71321a20a9ded102f6e9ce5718a2fcec2c4f70d8 # v0.5.2

4
.gitignore vendored
View File

@@ -2,6 +2,7 @@
__pycache__/
*.py[cod]
*$py.class
results.txt
experiments/
experiments
@@ -107,6 +108,8 @@ celerybeat.pid
# Environments
.env
.venv
# Machine-specific Claude Code guidance (see CLAUDE.md preamble)
CLAUDE.md
env/
venv/
ENV/
@@ -180,5 +183,6 @@ application/vectors/
node_modules/
.vscode/settings.json
.vscode/sftp.json
/models/
model/

View File

@@ -1,5 +1,7 @@
MinAlertLevel = warning
StylesPath = .github/styles
Vocab = DocsGPT
[*.{md,mdx}]
BasedOnStyles = DocsGPT

134
AGENTS.md Normal file
View File

@@ -0,0 +1,134 @@
# AGENTS.md
- Read `CONTRIBUTING.md` before making non-trivial changes.
- For day-to-day development and feature work, follow the development-environment workflow rather than defaulting to `setup.sh` / `setup.ps1`.
- Avoid using the setup scripts during normal feature work unless the user explicitly asks for them; users usually configure `.env` themselves.
- Try to follow red/green TDD.
### Check existing dev prerequisites first
For feature work, do **not** assume the environment needs to be recreated.
- Check whether the user already has a Python virtual environment such as `venv/` or `.venv/`.
- Check whether MongoDB is already running.
- Check whether Redis is already running.
- Reuse what is already working. Do not stop or recreate MongoDB, Redis, or the Python environment unless the task is environment setup or troubleshooting.
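A quick sketch for checking those prerequisites from the existing virtualenv (assumes the default local ports and that `pymongo`/`redis` are available once `application/requirements.txt` has been installed):
```python
from pymongo import MongoClient
import redis

def check_prereqs(mongo_uri="mongodb://localhost:27017", redis_url="redis://localhost:6379"):
    """Ping MongoDB and Redis; if both respond, reuse them instead of recreating anything."""
    MongoClient(mongo_uri, serverSelectionTimeoutMS=2000).admin.command("ping")
    redis.Redis.from_url(redis_url, socket_connect_timeout=2).ping()
    print("MongoDB and Redis are reachable")

if __name__ == "__main__":
    check_prereqs()
```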
## Normal local development commands
Use these commands once the dev prerequisites above are satisfied.
### Backend
```bash
source .venv/bin/activate # macOS/Linux
uv pip install -r application/requirements.txt # or: pip install -r application/requirements.txt
```
Run the Flask API (if needed):
```bash
flask --app application/app.py run --host=0.0.0.0 --port=7091
```
Run the Celery worker in a separate terminal (if needed):
```bash
celery -A application.app.celery worker -l INFO
```
On macOS, prefer the solo pool for Celery:
```bash
python -m celery -A application.app.celery worker -l INFO --pool=solo
```
### Frontend
Install dependencies only when needed, then run the dev server:
```bash
cd frontend
npm install --include=dev
npm run dev
```
### Docs site
```bash
cd docs
npm install
```
### Python / backend changes validation
```bash
ruff check .
python -m pytest
```
### Frontend changes
```bash
cd frontend && npm run lint
cd frontend && npm run build
```
### Documentation changes
```bash
cd docs && npm run build
```
If Vale is installed locally and you edited prose, also run:
```bash
vale .
```
## Repository map
- `application/`: Flask backend, API routes, agent logic, retrieval, parsing, security, storage, Celery worker, and WSGI entrypoints.
- `tests/`: backend unit/integration tests and test-only Python dependencies.
- `frontend/`: Vite + React + TypeScript application.
- `frontend/src/`: main UI code, including `components`, `conversation`, `hooks`, `locale`, `settings`, `upload`, and Redux store wiring in `store.ts`.
- `docs/`: separate documentation site built with Next.js/Nextra.
- `extensions/`: integrations and widgets such as Chatwoot, Chrome, Discord, React widget, Slack bot, and web widget.
- `deployment/`: Docker Compose variants and Kubernetes manifests.
## Coding rules
### Backend
- Follow PEP 8 and keep Python line length at or under 120 characters.
- Use type hints for function arguments and return values.
- Add Google-style docstrings to new or substantially changed functions and classes.
- Add or update tests under `tests/` for backend behavior changes.
- Keep changes narrow in `api`, `auth`, `security`, `parser`, `retriever`, and `storage` areas.
### Backend Abstractions
- LLM providers implement a common interface in `application/llm/` (add new providers by extending the base class).
- Vector stores are abstracted in `application/vectorstore/`.
- Parsers live in `application/parser/` and handle different document formats in the ingestion stage.
- Agents and tools are in `application/agents/` and `application/agents/tools/`.
- Celery setup/config lives in `application/celery_init.py` and `application/celeryconfig.py`.
- Settings and env vars are managed via Pydantic in `application/core/settings.py`.
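As an illustration of the provider abstraction, a new LLM integration would roughly follow the shape below; the base-class name, import path, and hook methods here are assumptions and should be checked against `application/llm/` before use:
```python
# Hypothetical sketch only: verify the real base class and hooks in application/llm/.
from application.llm.base import BaseLLM  # assumed import path


class MyProviderLLM(BaseLLM):
    def __init__(self, api_key=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.api_key = api_key

    def _raw_gen(self, model, messages, stream=False, **kwargs):
        # Call the provider's completion API and return the generated text.
        raise NotImplementedError

    def _raw_gen_stream(self, model, messages, **kwargs):
        # Yield text chunks as the provider streams them.
        raise NotImplementedError
```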
### Frontend
- Follow the existing ESLint + Prettier setup.
- Prefer small, reusable functional components and hooks.
- If shared state must be added, use Redux rather than introducing a new global state library.
- Avoid broad UI refactors unless the task explicitly asks for them.
- Do not re-create components if we already have some in the app.
## PR readiness
Before opening a PR:
- run the relevant validation commands above
- confirm backend changes still work end-to-end after ingesting sample data when applicable
- clearly summarize user-visible behavior changes
- mention any config, dependency, or deployment implications
- ask the user to attach a screenshot or a short video demonstrating the change

View File

@@ -22,6 +22,11 @@ Thank you for choosing to contribute to DocsGPT! We are all very grateful!
- We have a frontend built on React (Vite) and a backend in Python.
> **Required for every PR:** Please attach screenshots or a short screen
> recording that shows the working version of your changes. This makes the
> requirement visible to reviewers and helps them quickly verify what you are
> submitting.
Before creating issues, please check out how the latest version of our app looks and works by launching it via [Quickstart](https://github.com/arc53/DocsGPT#quickstart); the version on our live demo is slightly modified with login. Your issues should relate to the version you can launch via [Quickstart](https://github.com/arc53/DocsGPT#quickstart).
@@ -125,7 +130,7 @@ Here's a step-by-step guide on how to contribute to DocsGPT:
```
9. **Submit a Pull Request (PR):**
- Create a Pull Request from your branch to the main repository. Make sure to include a detailed description of your changes and reference any related issues.
- Create a Pull Request from your branch to the main repository. Make sure to include a detailed description of your changes, reference any related issues, and attach screenshots or a screen recording showing the working version.
10. **Collaborate:**
- Be responsive to comments and feedback on your PR.

View File

@@ -7,7 +7,7 @@
</p>
<p align="left">
<strong><a href="https://www.docsgpt.cloud/">DocsGPT</a></strong> is an open-source AI platform for building intelligent agents and assistants. Features Agent Builder, deep research tools, document analysis (PDF, Office, web content), Multi-model support (choose your provider or run locally), and rich API connectivity for agents with actionable tools and integrations. Deploy anywhere with complete privacy control.
<strong><a href="https://www.docsgpt.cloud/">DocsGPT</a></strong> is an open-source AI platform for building intelligent agents and assistants. Features Agent Builder, deep research tools, document analysis (PDF, Office, web content, and audio), Multi-model support (choose your provider or run locally), and rich API connectivity for agents with actionable tools and integrations. Deploy anywhere with complete privacy control.
</p>
<div align="center">
@@ -29,13 +29,14 @@
<div align="center">
<br>
<img src="https://d3dg1063dc54p9.cloudfront.net/videos/demov7.gif" alt="video-example-of-docs-gpt" width="800" height="450">
<img src="https://d3dg1063dc54p9.cloudfront.net/videos/demo-26.gif" alt="video-example-of-docs-gpt" width="800" height="480">
</div>
<h3 align="left">
<strong>Key Features:</strong>
</h3>
<ul align="left">
<li><strong>🗂️ Wide Format Support:</strong> Reads PDF, DOCX, CSV, XLSX, EPUB, MD, RST, HTML, MDX, JSON, PPTX, and images.</li>
<li><strong>🗂️ Wide Format Support:</strong> Reads PDF, DOCX, CSV, XLSX, EPUB, MD, RST, HTML, MDX, JSON, PPTX, images, and audio files such as MP3, WAV, M4A, OGG, and WebM.</li>
<li><strong>🎙️ Speech Workflows:</strong> Record voice input into chat, transcribe audio on the backend, and ingest meeting recordings or voice notes as searchable knowledge.</li>
<li><strong>🌐 Web & Data Integration:</strong> Ingests from URLs, sitemaps, Reddit, GitHub and web crawlers.</li>
<li><strong>✅ Reliable Answers:</strong> Get accurate, hallucination-free responses with source citations viewable in a clean UI.</li>
<li><strong>🔑 Streamlined API Keys:</strong> Generate keys linked to your settings, documents, and models, simplifying chatbot and integration setup.</li>
@@ -158,4 +159,3 @@ The source code license is [MIT](https://opensource.org/license/mit/), as descri
</a>
</p>

View File

@@ -2,13 +2,21 @@
## Supported Versions
Supported Versions:
Currently, we support security patches by committing changes and bumping the version published on Github.
Security patches target the latest release and the `main` branch. We recommend always running the most recent version.
## Reporting a Vulnerability
Found a vulnerability? Please email us:
Preferred method: use GitHub's private vulnerability reporting flow:
https://github.com/arc53/DocsGPT/security
security@arc53.com
Then click **Report a vulnerability**.
Alternatively, email us at: security@arc53.com
We aim to acknowledge reports within 48 hours.
## Incident Handling
For the public incident response process, see [`INCIDENT_RESPONSE.md`](./.github/INCIDENT_RESPONSE.md). If you believe an active exploit is occurring, include **URGENT** in your report subject line.

View File

@@ -1,7 +1,8 @@
import logging
from application.agents.agentic_agent import AgenticAgent
from application.agents.classic_agent import ClassicAgent
from application.agents.react_agent import ReActAgent
from application.agents.research_agent import ResearchAgent
from application.agents.workflow_agent import WorkflowAgent
logger = logging.getLogger(__name__)
@@ -10,7 +11,9 @@ logger = logging.getLogger(__name__)
class AgentCreator:
agents = {
"classic": ClassicAgent,
"react": ReActAgent,
"react": ClassicAgent, # backwards compat: react falls back to classic
"agentic": AgenticAgent,
"research": ResearchAgent,
"workflow": WorkflowAgent,
}

View File

@@ -0,0 +1,63 @@
import logging
from typing import Dict, Generator, Optional

from application.agents.base import BaseAgent
from application.agents.tools.internal_search import (
    INTERNAL_TOOL_ID,
    add_internal_search_tool,
)
from application.logging import LogContext

logger = logging.getLogger(__name__)


class AgenticAgent(BaseAgent):
    """Agent where the LLM controls retrieval via tools.

    Unlike ClassicAgent which pre-fetches docs into the prompt,
    AgenticAgent gives the LLM an internal_search tool so it can
    decide when, what, and whether to search.
    """

    def __init__(
        self,
        retriever_config: Optional[Dict] = None,
        *args,
        **kwargs,
    ):
        super().__init__(*args, **kwargs)
        self.retriever_config = retriever_config or {}

    def _gen_inner(
        self, query: str, log_context: LogContext
    ) -> Generator[Dict, None, None]:
        tools_dict = self.tool_executor.get_tools()
        add_internal_search_tool(tools_dict, self.retriever_config)
        self._prepare_tools(tools_dict)
        # 4. Build messages (prompt has NO pre-fetched docs)
        messages = self._build_messages(self.prompt, query)
        # 5. Call LLM — the handler manages the tool loop
        llm_response = self._llm_gen(messages, log_context)
        yield from self._handle_response(
            llm_response, tools_dict, messages, log_context
        )
        # 6. Collect sources from internal search tool results
        self._collect_internal_sources()
        yield {"sources": self.retrieved_docs}
        yield {"tool_calls": self._get_truncated_tool_calls()}
        log_context.stacks.append(
            {"component": "agent", "data": {"tool_calls": self.tool_calls.copy()}}
        )

    def _collect_internal_sources(self):
        """Collect retrieved docs from the cached InternalSearchTool instance."""
        cache_key = f"internal_search:{INTERNAL_TOOL_ID}:{self.user or ''}"
        tool = self.tool_executor._loaded_tools.get(cache_key)
        if tool and hasattr(tool, "retrieved_docs") and tool.retrieved_docs:
            self.retrieved_docs = tool.retrieved_docs

View File

@@ -1,18 +1,16 @@
import json
import logging
import uuid
from abc import ABC, abstractmethod
from typing import Dict, Generator, List, Optional
from typing import Any, Dict, Generator, List, Optional
from bson.objectid import ObjectId
from application.agents.tools.tool_action_parser import ToolActionParser
from application.agents.tools.tool_manager import ToolManager
from application.agents.tool_executor import ToolExecutor
from application.core.json_schema_utils import (
JsonSchemaValidationError,
normalize_json_schema_payload,
)
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.llm.handlers.base import ToolCall
from application.llm.handlers.handler_creator import LLMHandlerCreator
from application.llm.llm_creator import LLMCreator
from application.logging import build_stack_data, log_activity, LogContext
@@ -40,6 +38,10 @@ class BaseAgent(ABC):
limited_request_mode: Optional[bool] = False,
request_limit: Optional[int] = settings.DEFAULT_AGENT_LIMITS["request_limit"],
compressed_summary: Optional[str] = None,
llm=None,
llm_handler=None,
tool_executor: Optional[ToolExecutor] = None,
backup_models: Optional[List[str]] = None,
):
self.endpoint = endpoint
self.llm_name = llm_name
@@ -50,22 +52,42 @@ class BaseAgent(ABC):
self.prompt = prompt
self.decoded_token = decoded_token or {}
self.user: str = self.decoded_token.get("sub")
self.tool_config: Dict = {}
self.tools: List[Dict] = []
self.tool_calls: List[Dict] = []
self.chat_history: List[Dict] = chat_history if chat_history is not None else []
self.llm = LLMCreator.create_llm(
llm_name,
api_key=api_key,
user_api_key=user_api_key,
decoded_token=decoded_token,
model_id=model_id,
agent_id=agent_id,
)
# Dependency injection for LLM — fall back to creating if not provided
if llm is not None:
self.llm = llm
else:
self.llm = LLMCreator.create_llm(
llm_name,
api_key=api_key,
user_api_key=user_api_key,
decoded_token=decoded_token,
model_id=model_id,
agent_id=agent_id,
backup_models=backup_models,
)
self.retrieved_docs = retrieved_docs or []
self.llm_handler = LLMHandlerCreator.create_handler(
llm_name if llm_name else "default"
)
if llm_handler is not None:
self.llm_handler = llm_handler
else:
self.llm_handler = LLMHandlerCreator.create_handler(
llm_name if llm_name else "default"
)
# Tool executor — injected or created
if tool_executor is not None:
self.tool_executor = tool_executor
else:
self.tool_executor = ToolExecutor(
user_api_key=user_api_key,
user=self.user,
decoded_token=decoded_token,
)
self.attachments = attachments or []
self.json_schema = None
if json_schema is not None:
@@ -93,327 +115,219 @@ class BaseAgent(ABC):
) -> Generator[Dict, None, None]:
pass
def _get_tools(self, api_key: str = None) -> Dict[str, Dict]:
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
agents_collection = db["agents"]
tools_collection = db["user_tools"]
def gen_continuation(
self,
messages: List[Dict],
tools_dict: Dict,
pending_tool_calls: List[Dict],
tool_actions: List[Dict],
) -> Generator[Dict, None, None]:
"""Resume generation after tool actions are resolved.
agent_data = agents_collection.find_one({"key": api_key or self.user_api_key})
tool_ids = agent_data.get("tools", []) if agent_data else []
tools = (
tools_collection.find(
{"_id": {"$in": [ObjectId(tool_id) for tool_id in tool_ids]}}
)
if tool_ids
else []
)
tools = list(tools)
tools_by_id = {str(tool["_id"]): tool for tool in tools} if tools else {}
return tools_by_id
def _get_user_tools(self, user="local"):
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
user_tools_collection = db["user_tools"]
user_tools = user_tools_collection.find({"user": user, "status": True})
user_tools = list(user_tools)
return {str(i): tool for i, tool in enumerate(user_tools)}
def _build_tool_parameters(self, action):
params = {"type": "object", "properties": {}, "required": []}
for param_type in ["query_params", "headers", "body", "parameters"]:
if param_type in action and action[param_type].get("properties"):
for k, v in action[param_type]["properties"].items():
if v.get("filled_by_llm", True):
params["properties"][k] = {
key: value
for key, value in v.items()
if key not in ("filled_by_llm", "value", "required")
}
if v.get("required", False):
params["required"].append(k)
return params
def _prepare_tools(self, tools_dict):
self.tools = [
{
"type": "function",
"function": {
"name": f"{action['name']}_{tool_id}",
"description": action["description"],
"parameters": self._build_tool_parameters(action),
},
}
for tool_id, tool in tools_dict.items()
if (
(tool["name"] == "api_tool" and "actions" in tool.get("config", {}))
or (tool["name"] != "api_tool" and "actions" in tool)
)
for action in (
tool["config"]["actions"].values()
if tool["name"] == "api_tool"
else tool["actions"]
)
if action.get("active", True)
]
def _execute_tool_action(self, tools_dict, call):
parser = ToolActionParser(self.llm.__class__.__name__)
tool_id, action_name, call_args = parser.parse_args(call)
call_id = getattr(call, "id", None) or str(uuid.uuid4())
# Check if parsing failed
if tool_id is None or action_name is None:
error_message = f"Error: Failed to parse LLM tool call. Tool name: {getattr(call, 'name', 'unknown')}"
logger.error(error_message)
tool_call_data = {
"tool_name": "unknown",
"call_id": call_id,
"action_name": getattr(call, "name", "unknown"),
"arguments": call_args or {},
"result": f"Failed to parse tool call. Invalid tool name format: {getattr(call, 'name', 'unknown')}",
}
yield {"type": "tool_call", "data": {**tool_call_data, "status": "error"}}
self.tool_calls.append(tool_call_data)
return "Failed to parse tool call.", call_id
# Check if tool_id exists in available tools
if tool_id not in tools_dict:
error_message = f"Error: Tool ID '{tool_id}' extracted from LLM call not found in available tools_dict. Available IDs: {list(tools_dict.keys())}"
logger.error(error_message)
# Return error result
tool_call_data = {
"tool_name": "unknown",
"call_id": call_id,
"action_name": f"{action_name}_{tool_id}",
"arguments": call_args,
"result": f"Tool with ID {tool_id} not found. Available tools: {list(tools_dict.keys())}",
}
yield {"type": "tool_call", "data": {**tool_call_data, "status": "error"}}
self.tool_calls.append(tool_call_data)
return f"Tool with ID {tool_id} not found.", call_id
tool_call_data = {
"tool_name": tools_dict[tool_id]["name"],
"call_id": call_id,
"action_name": f"{action_name}_{tool_id}",
"arguments": call_args,
}
yield {"type": "tool_call", "data": {**tool_call_data, "status": "pending"}}
tool_data = tools_dict[tool_id]
action_data = (
tool_data["config"]["actions"][action_name]
if tool_data["name"] == "api_tool"
else next(
action
for action in tool_data["actions"]
if action["name"] == action_name
)
)
query_params, headers, body, parameters = {}, {}, {}, {}
param_types = {
"query_params": query_params,
"headers": headers,
"body": body,
"parameters": parameters,
}
for param_type, target_dict in param_types.items():
if param_type in action_data and action_data[param_type].get("properties"):
for param, details in action_data[param_type]["properties"].items():
if (
param not in call_args
and "value" in details
and details["value"]
):
target_dict[param] = details["value"]
for param, value in call_args.items():
for param_type, target_dict in param_types.items():
if param_type in action_data and param in action_data[param_type].get(
"properties", {}
):
target_dict[param] = value
tm = ToolManager(config={})
# Prepare tool_config and add tool_id for memory tools
if tool_data["name"] == "api_tool":
action_config = tool_data["config"]["actions"][action_name]
tool_config = {
"url": action_config["url"],
"method": action_config["method"],
"headers": headers,
"query_params": query_params,
}
if "body_content_type" in action_config:
tool_config["body_content_type"] = action_config.get(
"body_content_type", "application/json"
)
tool_config["body_encoding_rules"] = action_config.get(
"body_encoding_rules", {}
)
else:
tool_config = tool_data["config"].copy() if tool_data["config"] else {}
# Add tool_id from MongoDB _id for tools that need instance isolation (like memory tool)
# Use MongoDB _id if available, otherwise fall back to enumerated tool_id
tool_config["tool_id"] = str(tool_data.get("_id", tool_id))
if hasattr(self, "conversation_id") and self.conversation_id:
tool_config["conversation_id"] = self.conversation_id
tool = tm.load_tool(
tool_data["name"],
tool_config=tool_config,
user_id=self.user,
)
resolved_arguments = (
{"query_params": query_params, "headers": headers, "body": body}
if tool_data["name"] == "api_tool"
else parameters
)
if tool_data["name"] == "api_tool":
logger.debug(
f"Executing api: {action_name} with query_params: {query_params}, headers: {headers}, body: {body}"
)
result = tool.execute_action(action_name, **body)
else:
logger.debug(f"Executing tool: {action_name} with args: {call_args}")
result = tool.execute_action(action_name, **parameters)
get_artifact_id = (
getattr(tool, "get_artifact_id", None)
if tool_data["name"] != "api_tool"
else None
)
artifact_id = None
if callable(get_artifact_id):
try:
artifact_id = get_artifact_id(action_name, **parameters)
except Exception:
logger.exception(
"Failed to extract artifact_id from tool %s for action %s",
tool_data["name"],
action_name,
)
artifact_id = str(artifact_id).strip() if artifact_id is not None else ""
if artifact_id:
tool_call_data["artifact_id"] = artifact_id
result_full = str(result)
tool_call_data["resolved_arguments"] = resolved_arguments
tool_call_data["result_full"] = result_full
tool_call_data["result"] = (
f"{result_full[:50]}..." if len(result_full) > 50 else result_full
)
stream_tool_call_data = {
key: value
for key, value in tool_call_data.items()
if key not in {"result_full", "resolved_arguments"}
}
yield {"type": "tool_call", "data": {**stream_tool_call_data, "status": "completed"}}
self.tool_calls.append(tool_call_data)
return result, call_id
def _get_truncated_tool_calls(self):
return [
{
"tool_name": tool_call.get("tool_name"),
"call_id": tool_call.get("call_id"),
"action_name": tool_call.get("action_name"),
"arguments": tool_call.get("arguments"),
"artifact_id": tool_call.get("artifact_id"),
"result": (
f"{str(tool_call['result'])[:50]}..."
if len(str(tool_call["result"])) > 50
else tool_call["result"]
),
"status": "completed",
}
for tool_call in self.tool_calls
]
def _calculate_current_context_tokens(self, messages: List[Dict]) -> int:
"""
Calculate total tokens in current context (messages).
Processes the client-provided *tool_actions* (approvals, denials,
or client-side results), appends the resulting messages, then
hands back to the LLM to continue the conversation.
Args:
messages: List of message dicts
Returns:
Total token count
messages: The saved messages array from the pause point.
tools_dict: The saved tools dictionary.
pending_tool_calls: The pending tool call descriptors from the pause.
tool_actions: Client-provided actions resolving the pending calls.
"""
self._prepare_tools(tools_dict)
actions_by_id = {a["call_id"]: a for a in tool_actions}
# Build a single assistant message containing all tool calls so
# the message history matches the format LLM providers expect
# (one assistant message with N tool_calls, followed by N tool results).
tc_objects: List[Dict[str, Any]] = []
for pending in pending_tool_calls:
call_id = pending["call_id"]
args = pending["arguments"]
args_str = (
json.dumps(args) if isinstance(args, dict) else (args or "{}")
)
tc_obj: Dict[str, Any] = {
"id": call_id,
"type": "function",
"function": {
"name": pending["name"],
"arguments": args_str,
},
}
if pending.get("thought_signature"):
tc_obj["thought_signature"] = pending["thought_signature"]
tc_objects.append(tc_obj)
messages.append({
"role": "assistant",
"content": None,
"tool_calls": tc_objects,
})
# Now process each pending call and append tool result messages
for pending in pending_tool_calls:
call_id = pending["call_id"]
args = pending["arguments"]
action = actions_by_id.get(call_id)
if not action:
action = {
"call_id": call_id,
"decision": "denied",
"comment": "No response provided",
}
if action.get("decision") == "approved":
# Execute the tool server-side
tc = ToolCall(
id=call_id,
name=pending["name"],
arguments=(
json.dumps(args) if isinstance(args, dict) else args
),
)
tool_gen = self._execute_tool_action(tools_dict, tc)
tool_response = None
while True:
try:
event = next(tool_gen)
yield event
except StopIteration as e:
tool_response, _ = e.value
break
messages.append(
self.llm_handler.create_tool_message(tc, tool_response)
)
elif action.get("decision") == "denied":
comment = action.get("comment", "")
denial = (
f"Tool execution denied by user. Reason: {comment}"
if comment
else "Tool execution denied by user."
)
tc = ToolCall(
id=call_id, name=pending["name"], arguments=args
)
messages.append(
self.llm_handler.create_tool_message(tc, denial)
)
yield {
"type": "tool_call",
"data": {
"tool_name": pending.get("tool_name", "unknown"),
"call_id": call_id,
"action_name": pending.get("llm_name", pending["name"]),
"arguments": args,
"status": "denied",
},
}
elif "result" in action:
result = action["result"]
result_str = (
json.dumps(result)
if not isinstance(result, str)
else result
)
tc = ToolCall(
id=call_id, name=pending["name"], arguments=args
)
messages.append(
self.llm_handler.create_tool_message(tc, result_str)
)
yield {
"type": "tool_call",
"data": {
"tool_name": pending.get("tool_name", "unknown"),
"call_id": call_id,
"action_name": pending.get("llm_name", pending["name"]),
"arguments": args,
"result": (
result_str[:50] + "..."
if len(result_str) > 50
else result_str
),
"status": "completed",
},
}
# Resume the LLM loop with the updated messages
llm_response = self._llm_gen(messages)
yield from self._handle_response(
llm_response, tools_dict, messages, None
)
yield {"sources": self.retrieved_docs}
yield {"tool_calls": self._get_truncated_tool_calls()}
# ---- Tool delegation (thin wrappers around ToolExecutor) ----
@property
def tool_calls(self) -> List[Dict]:
return self.tool_executor.tool_calls
@tool_calls.setter
def tool_calls(self, value: List[Dict]):
self.tool_executor.tool_calls = value
def _get_tools(self, api_key: str = None) -> Dict[str, Dict]:
return self.tool_executor._get_tools_by_api_key(api_key or self.user_api_key)
def _get_user_tools(self, user="local"):
return self.tool_executor._get_user_tools(user)
def _build_tool_parameters(self, action):
return self.tool_executor._build_tool_parameters(action)
def _prepare_tools(self, tools_dict):
self.tools = self.tool_executor.prepare_tools_for_llm(tools_dict)
def _execute_tool_action(self, tools_dict, call):
return self.tool_executor.execute(
tools_dict, call, self.llm.__class__.__name__
)
def _get_truncated_tool_calls(self):
return self.tool_executor.get_truncated_tool_calls()
# ---- Context / token management ----
def _calculate_current_context_tokens(self, messages: List[Dict]) -> int:
from application.api.answer.services.compression.token_counter import (
TokenCounter,
)
return TokenCounter.count_message_tokens(messages)
def _check_context_limit(self, messages: List[Dict]) -> bool:
"""
Check if we're approaching context limit (80%).
Args:
messages: Current message list
Returns:
True if at or above 80% of context limit
"""
from application.core.model_utils import get_token_limit
from application.core.settings import settings
try:
# Calculate current tokens
current_tokens = self._calculate_current_context_tokens(messages)
self.current_token_count = current_tokens
# Get context limit for model
context_limit = get_token_limit(self.model_id)
# Calculate threshold (80%)
threshold = int(context_limit * settings.COMPRESSION_THRESHOLD_PERCENTAGE)
# Check if we've reached the limit
if current_tokens >= threshold:
logger.warning(
f"Context limit approaching: {current_tokens}/{context_limit} tokens "
f"({(current_tokens/context_limit)*100:.1f}%)"
)
return True
return False
except Exception as e:
logger.error(f"Error checking context limit: {str(e)}", exc_info=True)
return False
def _validate_context_size(self, messages: List[Dict]) -> None:
"""
Pre-flight validation before calling LLM. Logs warnings but never raises errors.
Args:
messages: Messages to be sent to LLM
"""
from application.core.model_utils import get_token_limit
current_tokens = self._calculate_current_context_tokens(messages)
self.current_token_count = current_tokens
context_limit = get_token_limit(self.model_id)
percentage = (current_tokens / context_limit) * 100
# Log based on usage level
if current_tokens >= context_limit:
logger.warning(
f"Context at limit: {current_tokens:,}/{context_limit:,} tokens "
@@ -428,44 +342,32 @@ class BaseAgent(ABC):
)
def _truncate_text_middle(self, text: str, max_tokens: int) -> str:
"""
Truncate text by removing content from the middle, preserving start and end.
Args:
text: Text to truncate
max_tokens: Maximum tokens allowed
Returns:
Truncated text with middle removed if needed
"""
from application.utils import num_tokens_from_string
current_tokens = num_tokens_from_string(text)
if current_tokens <= max_tokens:
return text
# Estimate chars per token (roughly 4 chars per token for English)
chars_per_token = len(text) / current_tokens if current_tokens > 0 else 4
target_chars = int(max_tokens * chars_per_token * 0.95) # 5% safety margin
target_chars = int(max_tokens * chars_per_token * 0.95)
if target_chars <= 0:
return ""
# Split: keep 40% from start, 40% from end, remove middle
start_chars = int(target_chars * 0.4)
end_chars = int(target_chars * 0.4)
truncation_marker = "\n\n[... content truncated to fit context limit ...]\n\n"
truncated = text[:start_chars] + truncation_marker + text[-end_chars:]
logger.info(
f"Truncated text from {current_tokens:,} to ~{max_tokens:,} tokens "
f"(removed middle section)"
)
return truncated
# ---- Message building ----
def _build_messages(
self,
system_prompt: str,
@@ -475,7 +377,6 @@ class BaseAgent(ABC):
from application.core.model_utils import get_token_limit
from application.utils import num_tokens_from_string
# Append compression summary to system prompt if present
if self.compressed_summary:
compression_context = (
"\n\n---\n\n"
@@ -489,23 +390,18 @@ class BaseAgent(ABC):
context_limit = get_token_limit(self.model_id)
system_tokens = num_tokens_from_string(system_prompt)
# Reserve 10% for response/tools
safety_buffer = int(context_limit * 0.1)
available_after_system = context_limit - system_tokens - safety_buffer
# Max tokens for query: 80% of available space (leave room for history)
max_query_tokens = int(available_after_system * 0.8)
query_tokens = num_tokens_from_string(query)
# Truncate query from middle if it exceeds 80% of available context
if query_tokens > max_query_tokens:
query = self._truncate_text_middle(query, max_query_tokens)
query_tokens = num_tokens_from_string(query)
# Calculate remaining budget for chat history
available_for_history = max(available_after_system - query_tokens, 0)
# Truncate chat history to fit within available budget
working_history = self._truncate_history_to_fit(
self.chat_history,
available_for_history,
@@ -520,28 +416,35 @@ class BaseAgent(ABC):
if "tool_calls" in i:
for tool_call in i["tool_calls"]:
call_id = tool_call.get("call_id") or str(uuid.uuid4())
function_call_dict = {
"function_call": {
"name": tool_call.get("action_name"),
"args": tool_call.get("arguments"),
"call_id": call_id,
}
}
function_response_dict = {
"function_response": {
"name": tool_call.get("action_name"),
"response": {"result": tool_call.get("result")},
"call_id": call_id,
}
}
messages.append(
{"role": "assistant", "content": [function_call_dict]}
args = tool_call.get("arguments")
args_str = (
json.dumps(args)
if isinstance(args, dict)
else (args or "{}")
)
messages.append(
{"role": "tool", "content": [function_response_dict]}
messages.append({
"role": "assistant",
"content": None,
"tool_calls": [{
"id": call_id,
"type": "function",
"function": {
"name": tool_call.get("action_name", ""),
"arguments": args_str,
},
}],
})
result = tool_call.get("result")
result_str = (
json.dumps(result)
if not isinstance(result, str)
else (result or "")
)
messages.append({
"role": "tool",
"tool_call_id": call_id,
"content": result_str,
})
messages.append({"role": "user", "content": query})
return messages
@@ -550,16 +453,6 @@ class BaseAgent(ABC):
history: List[Dict],
max_tokens: int,
) -> List[Dict]:
"""
Truncate chat history to fit within token budget, keeping most recent messages.
Args:
history: Full chat history
max_tokens: Maximum tokens allowed for history
Returns:
Truncated history (most recent messages that fit)
"""
from application.utils import num_tokens_from_string
if not history or max_tokens <= 0:
@@ -568,7 +461,6 @@ class BaseAgent(ABC):
truncated = []
current_tokens = 0
# Iterate from newest to oldest
for message in reversed(history):
message_tokens = 0
@@ -588,7 +480,7 @@ class BaseAgent(ABC):
if current_tokens + message_tokens <= max_tokens:
current_tokens += message_tokens
truncated.insert(0, message) # Maintain chronological order
truncated.insert(0, message)
else:
break
@@ -600,13 +492,13 @@ class BaseAgent(ABC):
return truncated
# ---- LLM generation ----
def _llm_gen(self, messages: List[Dict], log_context: Optional[LogContext] = None):
# Pre-flight context validation - fail fast if over limit
self._validate_context_size(messages)
gen_kwargs = {"model": self.model_id, "messages": messages}
if self.attachments:
# Usage accounting only; stripped before provider invocation.
gen_kwargs["_usage_attachments"] = self.attachments
if (

View File

@@ -15,11 +15,7 @@ class ClassicAgent(BaseAgent):
) -> Generator[Dict, None, None]:
"""Core generator function for ClassicAgent execution flow"""
tools_dict = (
self._get_user_tools(self.user)
if not self.user_api_key
else self._get_tools(self.user_api_key)
)
tools_dict = self.tool_executor.get_tools()
self._prepare_tools(tools_dict)
messages = self._build_messages(self.prompt, query)

View File

@@ -1,238 +0,0 @@
import logging
import os
from typing import Any, Dict, Generator, List
from application.agents.base import BaseAgent
from application.logging import build_stack_data, LogContext
logger = logging.getLogger(__name__)
MAX_ITERATIONS_REASONING = 10
current_dir = os.path.dirname(
os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
)
with open(
os.path.join(current_dir, "application/prompts", "react_planning_prompt.txt"), "r"
) as f:
PLANNING_PROMPT_TEMPLATE = f.read()
with open(
os.path.join(current_dir, "application/prompts", "react_final_prompt.txt"), "r"
) as f:
FINAL_PROMPT_TEMPLATE = f.read()
class ReActAgent(BaseAgent):
"""
Research and Action (ReAct) Agent - Advanced reasoning agent with iterative planning.
Implements a think-act-observe loop for complex problem-solving:
1. Creates a strategic plan based on the query
2. Executes tools and gathers observations
3. Iteratively refines approach until satisfied
4. Synthesizes final answer from all observations
"""
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.plan: str = ""
self.observations: List[str] = []
def _gen_inner(
self, query: str, log_context: LogContext
) -> Generator[Dict, None, None]:
"""Execute ReAct reasoning loop with planning, action, and observation cycles"""
self._reset_state()
tools_dict = (
self._get_tools(self.user_api_key)
if self.user_api_key
else self._get_user_tools(self.user)
)
self._prepare_tools(tools_dict)
for iteration in range(1, MAX_ITERATIONS_REASONING + 1):
yield {"thought": f"Reasoning... (iteration {iteration})\n\n"}
yield from self._planning_phase(query, log_context)
if not self.plan:
logger.warning(
f"ReActAgent: No plan generated in iteration {iteration}"
)
break
self.observations.append(f"Plan (iteration {iteration}): {self.plan}")
satisfied = yield from self._execution_phase(query, tools_dict, log_context)
if satisfied:
logger.info("ReActAgent: Goal satisfied, stopping reasoning loop")
break
yield from self._synthesis_phase(query, log_context)
def _reset_state(self):
"""Reset agent state for new query"""
self.plan = ""
self.observations = []
def _planning_phase(
self, query: str, log_context: LogContext
) -> Generator[Dict, None, None]:
"""Generate strategic plan for query"""
logger.info("ReActAgent: Creating plan...")
plan_prompt = self._build_planning_prompt(query)
messages = [{"role": "user", "content": plan_prompt}]
plan_stream = self.llm.gen_stream(
model=self.model_id,
messages=messages,
tools=self.tools if self.tools else None,
)
if log_context:
log_context.stacks.append(
{"component": "planning_llm", "data": build_stack_data(self.llm)}
)
plan_parts = []
for chunk in plan_stream:
content = self._extract_content(chunk)
if content:
plan_parts.append(content)
yield {"thought": content}
self.plan = "".join(plan_parts)
def _execution_phase(
self, query: str, tools_dict: Dict, log_context: LogContext
) -> Generator[bool, None, None]:
"""Execute plan with tool calls and observations"""
execution_prompt = self._build_execution_prompt(query)
messages = self._build_messages(execution_prompt, query)
llm_response = self._llm_gen(messages, log_context)
initial_content = self._extract_content(llm_response)
if initial_content:
self.observations.append(f"Initial response: {initial_content}")
processed_response = self._llm_handler(
llm_response, tools_dict, messages, log_context
)
for tool_call in self.tool_calls:
observation = (
f"Executed: {tool_call.get('tool_name', 'Unknown')} "
f"with args {tool_call.get('arguments', {})}. "
f"Result: {str(tool_call.get('result', ''))[:200]}"
)
self.observations.append(observation)
final_content = self._extract_content(processed_response)
if final_content:
self.observations.append(f"Response after tools: {final_content}")
if log_context:
log_context.stacks.append(
{
"component": "agent_tool_calls",
"data": {"tool_calls": self.tool_calls.copy()},
}
)
yield {"sources": self.retrieved_docs}
yield {"tool_calls": self._get_truncated_tool_calls()}
return "SATISFIED" in (final_content or "")
def _synthesis_phase(
self, query: str, log_context: LogContext
) -> Generator[Dict, None, None]:
"""Synthesize final answer from all observations"""
logger.info("ReActAgent: Generating final answer...")
final_prompt = self._build_final_answer_prompt(query)
messages = [{"role": "user", "content": final_prompt}]
final_stream = self.llm.gen_stream(
model=self.model_id, messages=messages, tools=None
)
if log_context:
log_context.stacks.append(
{"component": "final_answer_llm", "data": build_stack_data(self.llm)}
)
for chunk in final_stream:
content = self._extract_content(chunk)
if content:
yield {"answer": content}
def _build_planning_prompt(self, query: str) -> str:
"""Build planning phase prompt"""
prompt = PLANNING_PROMPT_TEMPLATE.replace("{query}", query)
prompt = prompt.replace("{prompt}", self.prompt or "")
prompt = prompt.replace("{summaries}", "")
prompt = prompt.replace("{observations}", "\n".join(self.observations))
return prompt
def _build_execution_prompt(self, query: str) -> str:
"""Build execution phase prompt with plan and observations"""
observations_str = "\n".join(self.observations)
if len(observations_str) > 20000:
observations_str = observations_str[:20000] + "\n...[truncated]"
return (
f"{self.prompt or ''}\n\n"
f"Follow this plan:\n{self.plan}\n\n"
f"Observations:\n{observations_str}\n\n"
f"If sufficient data exists to answer '{query}', respond with 'SATISFIED'. "
f"Otherwise, continue executing the plan."
)
def _build_final_answer_prompt(self, query: str) -> str:
"""Build final synthesis prompt"""
observations_str = "\n".join(self.observations)
if len(observations_str) > 10000:
observations_str = observations_str[:10000] + "\n...[truncated]"
logger.warning("ReActAgent: Observations truncated for final answer")
return FINAL_PROMPT_TEMPLATE.format(query=query, observations=observations_str)
def _extract_content(self, response: Any) -> str:
"""Extract text content from various LLM response formats"""
if not response:
return ""
collected = []
if isinstance(response, str):
return response
if hasattr(response, "message") and hasattr(response.message, "content"):
if response.message.content:
return response.message.content
if hasattr(response, "choices") and response.choices:
if hasattr(response.choices[0], "message"):
content = response.choices[0].message.content
if content:
return content
if hasattr(response, "content") and isinstance(response.content, list):
if response.content and hasattr(response.content[0], "text"):
return response.content[0].text
try:
for chunk in response:
content_piece = ""
if hasattr(chunk, "choices") and chunk.choices:
if hasattr(chunk.choices[0], "delta"):
delta_content = chunk.choices[0].delta.content
if delta_content:
content_piece = delta_content
elif hasattr(chunk, "type") and chunk.type == "content_block_delta":
if hasattr(chunk, "delta") and hasattr(chunk.delta, "text"):
content_piece = chunk.delta.text
elif isinstance(chunk, str):
content_piece = chunk
if content_piece:
collected.append(content_piece)
except (TypeError, AttributeError):
logger.debug(
f"Response not iterable or unexpected format: {type(response)}"
)
except Exception as e:
logger.error(f"Error extracting content: {e}")
return "".join(collected)

View File

@@ -0,0 +1,698 @@
import json
import logging
import os
import time
from typing import Dict, Generator, List, Optional
from application.agents.base import BaseAgent
from application.agents.tool_executor import ToolExecutor
from application.agents.tools.internal_search import (
INTERNAL_TOOL_ID,
add_internal_search_tool,
)
from application.agents.tools.think import THINK_TOOL_ENTRY, THINK_TOOL_ID
from application.logging import LogContext
logger = logging.getLogger(__name__)
# Defaults (can be overridden via constructor)
DEFAULT_MAX_STEPS = 6
DEFAULT_MAX_SUB_ITERATIONS = 5
DEFAULT_TIMEOUT_SECONDS = 300 # 5 minutes
DEFAULT_TOKEN_BUDGET = 100_000
DEFAULT_PARALLEL_WORKERS = 3
# Adaptive depth caps per complexity level
COMPLEXITY_CAPS = {
"simple": 2,
"moderate": 4,
"complex": 6,
}
_PROMPTS_DIR = os.path.join(
os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
"prompts",
"research",
)
def _load_prompt(name: str) -> str:
with open(os.path.join(_PROMPTS_DIR, name), "r") as f:
return f.read()
CLARIFICATION_PROMPT = _load_prompt("clarification.txt")
PLANNING_PROMPT = _load_prompt("planning.txt")
STEP_PROMPT = _load_prompt("step.txt")
SYNTHESIS_PROMPT = _load_prompt("synthesis.txt")
# ---------------------------------------------------------------------------
# CitationManager
# ---------------------------------------------------------------------------
class CitationManager:
"""Tracks and deduplicates citations across research steps."""
def __init__(self):
self.citations: Dict[int, Dict] = {}
self._counter = 0
def add(self, doc: Dict) -> int:
"""Register a source, return its citation number. Deduplicates by source."""
source = doc.get("source", "")
title = doc.get("title", "")
for num, existing in self.citations.items():
if existing.get("source") == source and existing.get("title") == title:
return num
self._counter += 1
self.citations[self._counter] = doc
return self._counter
def add_docs(self, docs: List[Dict]) -> str:
"""Register multiple docs, return formatted citation mapping text."""
mapping_lines = []
for doc in docs:
num = self.add(doc)
title = doc.get("title", "Untitled")
mapping_lines.append(f"[{num}] {title}")
return "\n".join(mapping_lines)
def format_references(self) -> str:
"""Generate [N] -> source mapping for report footer."""
if not self.citations:
return "No sources found."
lines = []
for num, doc in sorted(self.citations.items()):
title = doc.get("title", "Untitled")
source = doc.get("source", "Unknown")
filename = doc.get("filename", "")
display = filename or title
lines.append(f"[{num}] {display}{source}")
return "\n".join(lines)
def get_all_docs(self) -> List[Dict]:
return list(self.citations.values())
# ---------------------------------------------------------------------------
# ResearchAgent
# ---------------------------------------------------------------------------
class ResearchAgent(BaseAgent):
"""Multi-step research agent with parallel execution and budget controls.
Orchestrates: Plan -> Research (per step, optionally parallel) -> Synthesize.
"""
def __init__(
self,
retriever_config: Optional[Dict] = None,
max_steps: int = DEFAULT_MAX_STEPS,
max_sub_iterations: int = DEFAULT_MAX_SUB_ITERATIONS,
timeout_seconds: int = DEFAULT_TIMEOUT_SECONDS,
token_budget: int = DEFAULT_TOKEN_BUDGET,
parallel_workers: int = DEFAULT_PARALLEL_WORKERS,
*args,
**kwargs,
):
super().__init__(*args, **kwargs)
self.retriever_config = retriever_config or {}
self.max_steps = max_steps
self.max_sub_iterations = max_sub_iterations
self.timeout_seconds = timeout_seconds
self.token_budget = token_budget
self.parallel_workers = parallel_workers
self.citations = CitationManager()
self._start_time: float = 0
self._tokens_used: int = 0
self._last_token_snapshot: int = 0
# ------------------------------------------------------------------
# Budget & timeout helpers
# ------------------------------------------------------------------
def _is_timed_out(self) -> bool:
return (time.monotonic() - self._start_time) >= self.timeout_seconds
def _elapsed(self) -> float:
return round(time.monotonic() - self._start_time, 1)
def _track_tokens(self, count: int):
self._tokens_used += count
def _budget_remaining(self) -> int:
return max(self.token_budget - self._tokens_used, 0)
def _is_over_budget(self) -> bool:
return self._tokens_used >= self.token_budget
def _snapshot_llm_tokens(self) -> int:
"""Read current token usage from LLM and return delta since last snapshot."""
current = self.llm.token_usage.get("prompt_tokens", 0) + self.llm.token_usage.get("generated_tokens", 0)
delta = current - self._last_token_snapshot
self._last_token_snapshot = current
return delta
# ------------------------------------------------------------------
# Main orchestration
# ------------------------------------------------------------------
def _gen_inner(
self, query: str, log_context: LogContext
) -> Generator[Dict, None, None]:
self._start_time = time.monotonic()
tools_dict = self._setup_tools()
# Phase 0: Clarification (skip if user is responding to a prior clarification)
if not self._is_follow_up():
clarification = self._clarification_phase(query)
if clarification:
yield {"metadata": {"is_clarification": True}}
yield {"answer": clarification}
yield {"sources": []}
yield {"tool_calls": []}
log_context.stacks.append(
{"component": "agent", "data": {"clarification": True}}
)
return
# Phase 1: Planning (with adaptive depth)
yield {"type": "research_progress", "data": {"status": "planning"}}
plan, complexity = self._planning_phase(query)
if not plan:
logger.warning("ResearchAgent: Planning produced no steps, falling back")
plan = [{"query": query, "rationale": "Direct investigation"}]
complexity = "simple"
yield {
"type": "research_plan",
"data": {"steps": plan, "complexity": complexity},
}
# Phase 2: Research each step (yields progress events in real-time)
intermediate_reports = []
for i, step in enumerate(plan):
step_num = i + 1
step_query = step.get("query", query)
if self._is_timed_out():
logger.warning(
f"ResearchAgent: Timeout at step {step_num}/{len(plan)} "
f"({self._elapsed()}s)"
)
break
if self._is_over_budget():
logger.warning(
f"ResearchAgent: Token budget exhausted at step {step_num}/{len(plan)}"
)
break
yield {
"type": "research_progress",
"data": {
"step": step_num,
"total": len(plan),
"query": step_query,
"status": "researching",
},
}
report = self._research_step(step_query, tools_dict)
intermediate_reports.append({"step": step, "content": report})
yield {
"type": "research_progress",
"data": {
"step": step_num,
"total": len(plan),
"query": step_query,
"status": "complete",
},
}
# Phase 3: Synthesis (streaming)
if self._is_timed_out():
logger.warning(
f"ResearchAgent: Timeout ({self._elapsed()}s) before synthesis, "
f"synthesizing with {len(intermediate_reports)} reports"
)
yield {
"type": "research_progress",
"data": {
"status": "synthesizing",
"elapsed_seconds": self._elapsed(),
"tokens_used": self._tokens_used,
},
}
yield from self._synthesis_phase(
query, plan, intermediate_reports, tools_dict, log_context
)
# Sources and tool calls
self.retrieved_docs = self.citations.get_all_docs()
yield {"sources": self.retrieved_docs}
yield {"tool_calls": self._get_truncated_tool_calls()}
logger.info(
f"ResearchAgent completed: {len(intermediate_reports)}/{len(plan)} steps, "
f"{self._elapsed()}s, ~{self._tokens_used} tokens"
)
log_context.stacks.append(
{"component": "agent", "data": {"tool_calls": self.tool_calls.copy()}}
)
# ------------------------------------------------------------------
# Tool setup
# ------------------------------------------------------------------
def _setup_tools(self) -> Dict:
"""Build tools_dict with user tools + internal search + think."""
tools_dict = self.tool_executor.get_tools()
add_internal_search_tool(tools_dict, self.retriever_config)
think_entry = dict(THINK_TOOL_ENTRY)
think_entry["config"] = {}
tools_dict[THINK_TOOL_ID] = think_entry
self._prepare_tools(tools_dict)
return tools_dict
# ------------------------------------------------------------------
# Phase 0: Clarification
# ------------------------------------------------------------------
def _is_follow_up(self) -> bool:
"""Check if the user is responding to a prior clarification.
Uses the metadata flag stored in the conversation DB — no string matching.
Only skip clarification when the last query was explicitly flagged
as a clarification by this agent.
"""
if not self.chat_history:
return False
last = self.chat_history[-1]
meta = last.get("metadata", {})
return bool(meta.get("is_clarification"))
def _clarification_phase(self, question: str) -> Optional[str]:
"""Ask the LLM whether the question needs clarification.
Returns formatted clarification text if needed, or None to proceed.
Uses response_format to force valid JSON output.
"""
messages = [
{"role": "system", "content": CLARIFICATION_PROMPT},
{"role": "user", "content": question},
]
try:
response = self.llm.gen(
model=self.model_id,
messages=messages,
tools=None,
response_format={"type": "json_object"},
)
text = self._extract_text(response)
self._track_tokens(self._snapshot_llm_tokens())
logger.info(f"ResearchAgent clarification response: {text[:300]}")
data = self._parse_clarification_json(text)
if not data or not data.get("needs_clarification"):
return None
questions = data.get("questions", [])
if not questions:
return None
# Format as a friendly response
lines = [
"Before I begin researching, I'd like to clarify a few things:\n"
]
for i, q in enumerate(questions[:3], 1):
lines.append(f"{i}. {q}")
lines.append(
"\nPlease provide these details and I'll start the research."
)
return "\n".join(lines)
except Exception as e:
logger.error(f"Clarification phase failed: {e}", exc_info=True)
return None # proceed with research on failure
def _parse_clarification_json(self, text: str) -> Optional[Dict]:
"""Parse clarification JSON from LLM response."""
try:
return json.loads(text)
except json.JSONDecodeError:
pass
# Try extracting from code fences
for marker in ["```json", "```"]:
if marker in text:
start = text.index(marker) + len(marker)
end = text.index("```", start) if "```" in text[start:] else len(text)
try:
return json.loads(text[start:end].strip())
except (json.JSONDecodeError, ValueError):
pass
# Try finding JSON object
for i, ch in enumerate(text):
if ch == "{":
for j in range(len(text) - 1, i, -1):
if text[j] == "}":
try:
return json.loads(text[i : j + 1])
except json.JSONDecodeError:
continue
break
return None
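# Example clarification payload this parser accepts (shape inferred from the
# checks in _clarification_phase; the question strings are illustrative only):
# {"needs_clarification": true,
#  "questions": ["Which product version?", "Which deployment environment?"]}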
# ------------------------------------------------------------------
# Phase 1: Planning (with adaptive depth)
# ------------------------------------------------------------------
def _planning_phase(self, question: str) -> tuple[List[Dict], str]:
"""Decompose the question into research steps via LLM.
Returns (steps, complexity) where complexity is simple/moderate/complex.
"""
messages = [
{"role": "system", "content": PLANNING_PROMPT},
{"role": "user", "content": question},
]
try:
response = self.llm.gen(
model=self.model_id,
messages=messages,
tools=None,
response_format={"type": "json_object"},
)
text = self._extract_text(response)
self._track_tokens(self._snapshot_llm_tokens())
logger.info(f"ResearchAgent planning LLM response: {text[:500]}")
plan_data = self._parse_plan_json(text)
if isinstance(plan_data, dict):
complexity = plan_data.get("complexity", "moderate")
steps = plan_data.get("steps", [])
else:
complexity = "moderate"
steps = plan_data
# Adaptive depth: cap steps based on assessed complexity
cap = COMPLEXITY_CAPS.get(complexity, self.max_steps)
cap = min(cap, self.max_steps)
steps = steps[:cap]
logger.info(
f"ResearchAgent plan: complexity={complexity}, "
f"steps={len(steps)} (cap={cap})"
)
return steps, complexity
except Exception as e:
logger.error(f"Planning phase failed: {e}", exc_info=True)
return (
[{"query": question, "rationale": "Direct investigation (planning failed)"}],
"simple",
)
def _parse_plan_json(self, text: str):
"""Extract JSON plan from LLM response. Returns dict or list."""
# Try direct parse
try:
data = json.loads(text)
if isinstance(data, dict) and "steps" in data:
return data
if isinstance(data, list):
return data
except json.JSONDecodeError:
pass
# Try extracting from markdown code fences
for marker in ["```json", "```"]:
if marker in text:
start = text.index(marker) + len(marker)
end = text.index("```", start) if "```" in text[start:] else len(text)
try:
data = json.loads(text[start:end].strip())
if isinstance(data, dict) and "steps" in data:
return data
if isinstance(data, list):
return data
except (json.JSONDecodeError, ValueError):
pass
# Try finding JSON object in text
for i, ch in enumerate(text):
if ch == "{":
for j in range(len(text) - 1, i, -1):
if text[j] == "}":
try:
data = json.loads(text[i : j + 1])
if isinstance(data, dict) and "steps" in data:
return data
except json.JSONDecodeError:
continue
break
logger.warning(f"Could not parse plan JSON from: {text[:200]}")
return []
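# Example plan payload this parser accepts (shape inferred from _planning_phase;
# the concrete queries and rationales are illustrative only):
# {"complexity": "moderate",
#  "steps": [{"query": "What is X?", "rationale": "Establish definitions"},
#            {"query": "How does X compare to Y?", "rationale": "Comparison"}]}
# A bare JSON list of step objects (without the wrapper) is also accepted.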
# ------------------------------------------------------------------
# Phase 2: Research step (core loop)
# ------------------------------------------------------------------
def _research_step(self, step_query: str, tools_dict: Dict) -> str:
"""Run a focused research loop for one sub-question (sequential path)."""
report = self._research_step_with_executor(
step_query, tools_dict, self.tool_executor
)
self._collect_step_sources()
return report
def _research_step_with_executor(
self, step_query: str, tools_dict: Dict, executor: ToolExecutor
) -> str:
"""Core research loop. Works with any ToolExecutor instance."""
system_prompt = STEP_PROMPT.replace("{step_query}", step_query)
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": step_query},
]
last_search_empty = False
for iteration in range(self.max_sub_iterations):
# Check timeout and budget
if self._is_timed_out():
logger.info(
f"Research step '{step_query[:50]}' timed out at iteration {iteration}"
)
break
if self._is_over_budget():
logger.info(
f"Research step '{step_query[:50]}' hit token budget at iteration {iteration}"
)
break
try:
response = self.llm.gen(
model=self.model_id,
messages=messages,
tools=self.tools if self.tools else None,
)
self._track_tokens(self._snapshot_llm_tokens())
except Exception as e:
logger.error(
f"Research step LLM call failed (iteration {iteration}): {e}",
exc_info=True,
)
break
parsed = self.llm_handler.parse_response(response)
if not parsed.requires_tool_call:
return parsed.content or "No findings for this step."
# Execute tool calls
messages, last_search_empty = self._execute_step_tools_with_refinement(
parsed.tool_calls, tools_dict, messages, executor, last_search_empty
)
# Max iterations / timeout / budget — ask for summary
messages.append(
{
"role": "user",
"content": "Please summarize your findings so far based on the information gathered.",
}
)
try:
response = self.llm.gen(
model=self.model_id, messages=messages, tools=None
)
self._track_tokens(self._snapshot_llm_tokens())
text = self._extract_text(response)
return text or "Research step completed."
except Exception:
return "Research step completed."
def _execute_step_tools_with_refinement(
self,
tool_calls,
tools_dict: Dict,
messages: List[Dict],
executor: ToolExecutor,
last_search_empty: bool,
) -> tuple[List[Dict], bool]:
"""Execute tool calls with query refinement on empty results.
Returns (updated_messages, was_last_search_empty).
"""
search_returned_empty = False
for call in tool_calls:
gen = executor.execute(
tools_dict, call, self.llm.__class__.__name__
)
result = None
call_id = None
while True:
try:
event = next(gen)
# Log tool_call status events instead of discarding them
if isinstance(event, dict) and event.get("type") == "tool_call":
logger.debug(
"Tool %s status: %s",
event.get("data", {}).get("action_name", ""),
event.get("data", {}).get("status", ""),
)
except StopIteration as e:
result, call_id = e.value
break
# Detect empty search results for refinement
is_search = "search" in (call.name or "").lower()
result_str = str(result) if result else ""
if is_search and "No documents found" in result_str:
search_returned_empty = True
if last_search_empty:
# Two consecutive empty searches — inject refinement hint
result_str += (
"\n\nHint: Previous search also returned no results. "
"Try a very different query with different keywords, "
"or broaden your search terms."
)
result = result_str
import json as _json
args_str = (
_json.dumps(call.arguments)
if isinstance(call.arguments, dict)
else call.arguments
)
messages.append({
"role": "assistant",
"content": None,
"tool_calls": [{
"id": call_id,
"type": "function",
"function": {"name": call.name, "arguments": args_str},
}],
})
tool_message = self.llm_handler.create_tool_message(call, result)
messages.append(tool_message)
return messages, search_returned_empty
def _collect_step_sources(self):
"""Collect sources from InternalSearchTool and register with CitationManager."""
cache_key = f"internal_search:{INTERNAL_TOOL_ID}:{self.user or ''}"
tool = self.tool_executor._loaded_tools.get(cache_key)
if tool and hasattr(tool, "retrieved_docs"):
for doc in tool.retrieved_docs:
self.citations.add(doc)
# ------------------------------------------------------------------
# Phase 3: Synthesis
# ------------------------------------------------------------------
def _synthesis_phase(
self,
question: str,
plan: List[Dict],
intermediate_reports: List[Dict],
tools_dict: Dict,
log_context: LogContext,
) -> Generator[Dict, None, None]:
"""Compile all findings into a final cited report (streaming)."""
plan_lines = []
for i, step in enumerate(plan, 1):
plan_lines.append(
f"{i}. {step.get('query', 'Unknown')}{step.get('rationale', '')}"
)
plan_summary = "\n".join(plan_lines)
findings_parts = []
for i, report in enumerate(intermediate_reports, 1):
step_query = report["step"].get("query", "Unknown")
content = report["content"]
findings_parts.append(
f"--- Step {i}: {step_query} ---\n{content}"
)
findings = "\n\n".join(findings_parts)
references = self.citations.format_references()
synthesis_prompt = SYNTHESIS_PROMPT.replace("{question}", question)
synthesis_prompt = synthesis_prompt.replace("{plan_summary}", plan_summary)
synthesis_prompt = synthesis_prompt.replace("{findings}", findings)
synthesis_prompt = synthesis_prompt.replace("{references}", references)
messages = [
{"role": "system", "content": synthesis_prompt},
{"role": "user", "content": f"Please write the research report for: {question}"},
]
llm_response = self.llm.gen_stream(
model=self.model_id, messages=messages, tools=None
)
if log_context:
from application.logging import build_stack_data
log_context.stacks.append(
{"component": "synthesis_llm", "data": build_stack_data(self.llm)}
)
yield from self._handle_response(
llm_response, tools_dict, messages, log_context
)
# ------------------------------------------------------------------
# Helpers
# ------------------------------------------------------------------
def _extract_text(self, response) -> str:
"""Extract text content from a non-streaming LLM response."""
if isinstance(response, str):
return response
if hasattr(response, "message") and hasattr(response.message, "content"):
return response.message.content or ""
if hasattr(response, "choices") and response.choices:
choice = response.choices[0]
if hasattr(choice, "message") and hasattr(choice.message, "content"):
return choice.message.content or ""
if hasattr(response, "content") and isinstance(response.content, list):
if response.content and hasattr(response.content[0], "text"):
return response.content[0].text or ""
return str(response) if response else ""
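# Minimal consumption sketch for the orchestration above (assumes an already
# constructed ResearchAgent `agent` and LogContext `ctx`; event keys mirror
# those yielded by _gen_inner):
#
# for event in agent._gen_inner("Compare the retriever configurations", ctx):
#     if event.get("type") == "research_plan":
#         print(event["data"]["steps"])           # planned sub-questions
#     elif event.get("type") == "research_progress":
#         print(event["data"].get("status"))      # planning / researching / synthesizing
#     elif "answer" in event:
#         print(event["answer"], end="")          # clarification or streamed synthesis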

View File

@@ -0,0 +1,477 @@
import logging
import uuid
from collections import Counter
from typing import Dict, List, Optional, Tuple
from bson.objectid import ObjectId
from application.agents.tools.tool_action_parser import ToolActionParser
from application.agents.tools.tool_manager import ToolManager
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.security.encryption import decrypt_credentials
logger = logging.getLogger(__name__)
class ToolExecutor:
"""Handles tool discovery, preparation, and execution.
Extracted from BaseAgent to separate concerns and enable tool caching.
"""
def __init__(
self,
user_api_key: Optional[str] = None,
user: Optional[str] = None,
decoded_token: Optional[Dict] = None,
):
self.user_api_key = user_api_key
self.user = user
self.decoded_token = decoded_token
self.tool_calls: List[Dict] = []
self._loaded_tools: Dict[str, object] = {}
self.conversation_id: Optional[str] = None
self.client_tools: Optional[List[Dict]] = None
self._name_to_tool: Dict[str, Tuple[str, str]] = {}
self._tool_to_name: Dict[Tuple[str, str], str] = {}
def get_tools(self) -> Dict[str, Dict]:
"""Load tool configs from DB based on user context.
If *client_tools* have been set on this executor, they are
automatically merged into the returned dict.
"""
if self.user_api_key:
tools = self._get_tools_by_api_key(self.user_api_key)
else:
tools = self._get_user_tools(self.user or "local")
if self.client_tools:
self.merge_client_tools(tools, self.client_tools)
return tools
def _get_tools_by_api_key(self, api_key: str) -> Dict[str, Dict]:
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
agents_collection = db["agents"]
tools_collection = db["user_tools"]
agent_data = agents_collection.find_one({"key": api_key})
tool_ids = agent_data.get("tools", []) if agent_data else []
tools = (
tools_collection.find(
{"_id": {"$in": [ObjectId(tool_id) for tool_id in tool_ids]}}
)
if tool_ids
else []
)
tools = list(tools)
return {str(tool["_id"]): tool for tool in tools} if tools else {}
def _get_user_tools(self, user: str = "local") -> Dict[str, Dict]:
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
user_tools_collection = db["user_tools"]
user_tools = user_tools_collection.find({"user": user, "status": True})
user_tools = list(user_tools)
return {str(i): tool for i, tool in enumerate(user_tools)}
def merge_client_tools(
self, tools_dict: Dict, client_tools: List[Dict]
) -> Dict:
"""Merge client-provided tool definitions into tools_dict.
Client tools use the standard function-calling format::
[{"type": "function", "function": {"name": "get_weather",
"description": "...", "parameters": {...}}}]
They are stored in *tools_dict* with ``client_side: True`` so that
:meth:`check_pause` returns a pause signal instead of trying to
execute them server-side.
Args:
tools_dict: The mutable server tools dict (will be modified in place).
client_tools: List of tool definitions in function-calling format.
Returns:
The updated *tools_dict* (same reference, for convenience).
"""
for i, ct in enumerate(client_tools):
func = ct.get("function", ct) # tolerate bare {"name":..} too
name = func.get("name", f"clienttool{i}")
tool_id = f"ct{i}"
tools_dict[tool_id] = {
"name": name,
"client_side": True,
"actions": [
{
"name": name,
"description": func.get("description", ""),
"active": True,
"parameters": func.get("parameters", {}),
}
],
}
return tools_dict
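# Illustrative call (the weather tool is a made-up example in the standard
# function-calling format described in the docstring above):
#
# executor.merge_client_tools(tools_dict, [
#     {"type": "function", "function": {
#         "name": "get_weather",
#         "description": "Get current weather for a city",
#         "parameters": {"type": "object",
#                        "properties": {"city": {"type": "string"}},
#                        "required": ["city"]},
#     }},
# ])
# # tools_dict now contains an entry like
# # {"ct0": {"name": "get_weather", "client_side": True, "actions": [...]}}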
def prepare_tools_for_llm(self, tools_dict: Dict) -> List[Dict]:
"""Convert tool configs to LLM function schemas.
Action names are kept clean for the LLM:
- Unique action names appear as-is (e.g. ``get_weather``).
- Duplicate action names get numbered suffixes (e.g. ``search_1``,
``search_2``).
A reverse mapping is stored in ``_name_to_tool`` so that tool calls
can be routed back to the correct ``(tool_id, action_name)`` without
brittle string splitting.
"""
# Pass 1: collect entries and count action name occurrences
entries: List[Tuple[str, str, Dict, bool]] = [] # (tool_id, action_name, action, is_client)
name_counts: Counter = Counter()
for tool_id, tool in tools_dict.items():
is_api = tool["name"] == "api_tool"
is_client = tool.get("client_side", False)
if is_api and "actions" not in tool.get("config", {}):
continue
if not is_api and "actions" not in tool:
continue
actions = (
tool["config"]["actions"].values()
if is_api
else tool["actions"]
)
for action in actions:
if not action.get("active", True):
continue
entries.append((tool_id, action["name"], action, is_client))
name_counts[action["name"]] += 1
# Pass 2: assign LLM-visible names and build mappings
self._name_to_tool = {}
self._tool_to_name = {}
collision_counters: Dict[str, int] = {}
all_llm_names: set = set()
result = []
for tool_id, action_name, action, is_client in entries:
if name_counts[action_name] == 1:
llm_name = action_name
else:
counter = collision_counters.get(action_name, 1)
candidate = f"{action_name}_{counter}"
# Skip if candidate collides with a unique action name
while candidate in all_llm_names or (
candidate in name_counts and name_counts[candidate] == 1
):
counter += 1
candidate = f"{action_name}_{counter}"
collision_counters[action_name] = counter + 1
llm_name = candidate
all_llm_names.add(llm_name)
self._name_to_tool[llm_name] = (tool_id, action_name)
self._tool_to_name[(tool_id, action_name)] = llm_name
if is_client:
params = action.get("parameters", {})
else:
params = self._build_tool_parameters(action)
result.append({
"type": "function",
"function": {
"name": llm_name,
"description": action.get("description", ""),
"parameters": params,
},
})
return result
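# Naming example (assumed inputs): if two tools each expose an action named
# "search" and a third exposes "get_weather", the LLM sees
# ["search_1", "search_2", "get_weather"], while _name_to_tool maps e.g.
# "search_1" -> (tool_id, "search") so calls route back without string splitting.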
def _build_tool_parameters(self, action: Dict) -> Dict:
params = {"type": "object", "properties": {}, "required": []}
for param_type in ["query_params", "headers", "body", "parameters"]:
if param_type in action and action[param_type].get("properties"):
for k, v in action[param_type]["properties"].items():
if v.get("filled_by_llm", True):
params["properties"][k] = {
key: value
for key, value in v.items()
if key not in ("filled_by_llm", "value", "required")
}
if v.get("required", False):
params["required"].append(k)
return params
def check_pause(
self, tools_dict: Dict, call, llm_class_name: str
) -> Optional[Dict]:
"""Check if a tool call requires pausing for approval or client execution.
Returns a dict describing the pending action if pause is needed, None otherwise.
"""
parser = ToolActionParser(llm_class_name, name_mapping=self._name_to_tool)
tool_id, action_name, call_args = parser.parse_args(call)
call_id = getattr(call, "id", None) or str(uuid.uuid4())
llm_name = getattr(call, "name", "")
if tool_id is None or action_name is None or tool_id not in tools_dict:
return None # Will be handled as error by execute()
tool_data = tools_dict[tool_id]
# Client-side tools
if tool_data.get("client_side"):
return {
"call_id": call_id,
"name": llm_name,
"tool_name": tool_data.get("name", "unknown"),
"tool_id": tool_id,
"action_name": action_name,
"llm_name": llm_name,
"arguments": call_args if isinstance(call_args, dict) else {},
"pause_type": "requires_client_execution",
"thought_signature": getattr(call, "thought_signature", None),
}
# Approval required
if tool_data["name"] == "api_tool":
action_data = tool_data.get("config", {}).get("actions", {}).get(
action_name, {}
)
else:
action_data = next(
(a for a in tool_data.get("actions", []) if a["name"] == action_name),
{},
)
if action_data.get("require_approval"):
return {
"call_id": call_id,
"name": llm_name,
"tool_name": tool_data.get("name", "unknown"),
"tool_id": tool_id,
"action_name": action_name,
"llm_name": llm_name,
"arguments": call_args if isinstance(call_args, dict) else {},
"pause_type": "awaiting_approval",
"thought_signature": getattr(call, "thought_signature", None),
}
return None
def execute(self, tools_dict: Dict, call, llm_class_name: str):
"""Execute a tool call. Yields status events, returns (result, call_id)."""
parser = ToolActionParser(llm_class_name, name_mapping=self._name_to_tool)
tool_id, action_name, call_args = parser.parse_args(call)
llm_name = getattr(call, "name", "unknown")
call_id = getattr(call, "id", None) or str(uuid.uuid4())
if tool_id is None or action_name is None:
error_message = f"Error: Failed to parse LLM tool call. Tool name: {llm_name}"
logger.error(error_message)
tool_call_data = {
"tool_name": "unknown",
"call_id": call_id,
"action_name": llm_name,
"arguments": call_args or {},
"result": f"Failed to parse tool call. Invalid tool name format: {llm_name}",
}
yield {"type": "tool_call", "data": {**tool_call_data, "status": "error"}}
self.tool_calls.append(tool_call_data)
return "Failed to parse tool call.", call_id
if tool_id not in tools_dict:
error_message = f"Error: Tool ID '{tool_id}' extracted from LLM call not found in available tools_dict. Available IDs: {list(tools_dict.keys())}"
logger.error(error_message)
tool_call_data = {
"tool_name": "unknown",
"call_id": call_id,
"action_name": llm_name,
"arguments": call_args,
"result": f"Tool with ID {tool_id} not found. Available tools: {list(tools_dict.keys())}",
}
yield {"type": "tool_call", "data": {**tool_call_data, "status": "error"}}
self.tool_calls.append(tool_call_data)
return f"Tool with ID {tool_id} not found.", call_id
tool_call_data = {
"tool_name": tools_dict[tool_id]["name"],
"call_id": call_id,
"action_name": llm_name,
"arguments": call_args,
}
yield {"type": "tool_call", "data": {**tool_call_data, "status": "pending"}}
tool_data = tools_dict[tool_id]
action_data = (
tool_data["config"]["actions"][action_name]
if tool_data["name"] == "api_tool"
else next(
action
for action in tool_data["actions"]
if action["name"] == action_name
)
)
query_params, headers, body, parameters = {}, {}, {}, {}
param_types = {
"query_params": query_params,
"headers": headers,
"body": body,
"parameters": parameters,
}
for param_type, target_dict in param_types.items():
if param_type in action_data and action_data[param_type].get("properties"):
for param, details in action_data[param_type]["properties"].items():
if (
param not in call_args
and "value" in details
and details["value"]
):
target_dict[param] = details["value"]
for param, value in call_args.items():
for param_type, target_dict in param_types.items():
if param_type in action_data and param in action_data[param_type].get(
"properties", {}
):
target_dict[param] = value
# Load tool (with caching)
tool = self._get_or_load_tool(
tool_data, tool_id, action_name,
headers=headers, query_params=query_params,
)
resolved_arguments = (
{"query_params": query_params, "headers": headers, "body": body}
if tool_data["name"] == "api_tool"
else parameters
)
if tool_data["name"] == "api_tool":
logger.debug(
f"Executing api: {action_name} with query_params: {query_params}, headers: {headers}, body: {body}"
)
result = tool.execute_action(action_name, **body)
else:
logger.debug(f"Executing tool: {action_name} with args: {call_args}")
result = tool.execute_action(action_name, **parameters)
get_artifact_id = (
getattr(tool, "get_artifact_id", None)
if tool_data["name"] != "api_tool"
else None
)
artifact_id = None
if callable(get_artifact_id):
try:
artifact_id = get_artifact_id(action_name, **parameters)
except Exception:
logger.exception(
"Failed to extract artifact_id from tool %s for action %s",
tool_data["name"],
action_name,
)
artifact_id = str(artifact_id).strip() if artifact_id is not None else ""
if artifact_id:
tool_call_data["artifact_id"] = artifact_id
result_full = str(result)
tool_call_data["resolved_arguments"] = resolved_arguments
tool_call_data["result_full"] = result_full
tool_call_data["result"] = (
f"{result_full[:50]}..." if len(result_full) > 50 else result_full
)
stream_tool_call_data = {
key: value
for key, value in tool_call_data.items()
if key not in {"result_full", "resolved_arguments"}
}
yield {"type": "tool_call", "data": {**stream_tool_call_data, "status": "completed"}}
self.tool_calls.append(tool_call_data)
return result, call_id
def _get_or_load_tool(
self, tool_data: Dict, tool_id: str, action_name: str,
headers: Optional[Dict] = None, query_params: Optional[Dict] = None,
):
"""Load a tool, using cache when possible."""
cache_key = f"{tool_data['name']}:{tool_id}:{self.user or ''}"
if cache_key in self._loaded_tools:
return self._loaded_tools[cache_key]
tm = ToolManager(config={})
if tool_data["name"] == "api_tool":
action_config = tool_data["config"]["actions"][action_name]
tool_config = {
"url": action_config["url"],
"method": action_config["method"],
"headers": headers or {},
"query_params": query_params or {},
}
if "body_content_type" in action_config:
tool_config["body_content_type"] = action_config.get(
"body_content_type", "application/json"
)
tool_config["body_encoding_rules"] = action_config.get(
"body_encoding_rules", {}
)
else:
tool_config = tool_data["config"].copy() if tool_data["config"] else {}
if tool_config.get("encrypted_credentials") and self.user:
decrypted = decrypt_credentials(
tool_config["encrypted_credentials"], self.user
)
tool_config.update(decrypted)
tool_config["auth_credentials"] = decrypted
tool_config.pop("encrypted_credentials", None)
tool_config["tool_id"] = str(tool_data.get("_id", tool_id))
if self.conversation_id:
tool_config["conversation_id"] = self.conversation_id
if tool_data["name"] == "mcp_tool":
tool_config["query_mode"] = True
tool = tm.load_tool(
tool_data["name"],
tool_config=tool_config,
user_id=self.user,
)
# Don't cache api_tool since config varies by action
if tool_data["name"] != "api_tool":
self._loaded_tools[cache_key] = tool
return tool
def get_truncated_tool_calls(self) -> List[Dict]:
return [
{
"tool_name": tool_call.get("tool_name"),
"call_id": tool_call.get("call_id"),
"action_name": tool_call.get("action_name"),
"arguments": tool_call.get("arguments"),
"artifact_id": tool_call.get("artifact_id"),
"result": (
f"{str(tool_call['result'])[:50]}..."
if len(str(tool_call["result"])) > 50
else tool_call["result"]
),
"status": "completed",
}
for tool_call in self.tool_calls
]
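# Consumption sketch for execute() (mirrors the generator protocol used by the
# research loop above; tools_dict, call, and the LLM class-name string are
# assumed to come from the caller):
#
# gen = executor.execute(tools_dict, call, llm.__class__.__name__)
# while True:
#     try:
#         event = next(gen)               # {"type": "tool_call", "data": {...}}
#     except StopIteration as stop:
#         result, call_id = stop.value    # final (result, call_id) pair
#         break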

View File

@@ -2,6 +2,8 @@ from abc import ABC, abstractmethod
class Tool(ABC):
internal: bool = False
@abstractmethod
def execute_action(self, action_name: str, **kwargs):
pass

View File

@@ -1,6 +1,11 @@
import logging
import requests
from application.agents.tools.base import Tool
logger = logging.getLogger(__name__)
class BraveSearchTool(Tool):
"""
@@ -41,7 +46,7 @@ class BraveSearchTool(Tool):
"""
Performs a web search using the Brave Search API.
"""
print(f"Performing Brave web search for: {query}")
logger.debug("Performing Brave web search for: %s", query)
url = f"{self.base_url}/web/search"
@@ -68,7 +73,7 @@ class BraveSearchTool(Tool):
"X-Subscription-Token": self.token,
}
response = requests.get(url, params=params, headers=headers)
response = requests.get(url, params=params, headers=headers, timeout=100)
if response.status_code == 200:
return {
@@ -94,7 +99,7 @@ class BraveSearchTool(Tool):
"""
Performs an image search using the Brave Search API.
"""
print(f"Performing Brave image search for: {query}")
logger.debug("Performing Brave image search for: %s", query)
url = f"{self.base_url}/images/search"
@@ -113,7 +118,7 @@ class BraveSearchTool(Tool):
"X-Subscription-Token": self.token,
}
response = requests.get(url, params=params, headers=headers)
response = requests.get(url, params=params, headers=headers, timeout=100)
if response.status_code == 200:
return {
@@ -177,6 +182,10 @@ class BraveSearchTool(Tool):
return {
"token": {
"type": "string",
"label": "API Key",
"description": "Brave Search API key for authentication",
"required": True,
"secret": True,
"order": 1,
},
}

View File

@@ -28,7 +28,7 @@ class CryptoPriceTool(Tool):
returns price in USD.
"""
url = f"https://min-api.cryptocompare.com/data/price?fsym={symbol.upper()}&tsyms={currency.upper()}"
response = requests.get(url)
response = requests.get(url, timeout=100)
if response.status_code == 200:
data = response.json()
if currency.upper() in data:

View File

@@ -1,5 +1,14 @@
import logging
import time
from typing import Any, Dict, Optional
from application.agents.tools.base import Tool
from duckduckgo_search import DDGS
logger = logging.getLogger(__name__)
MAX_RETRIES = 3
RETRY_DELAY = 2.0
DEFAULT_TIMEOUT = 15
class DuckDuckGoSearchTool(Tool):
@@ -10,71 +19,123 @@ class DuckDuckGoSearchTool(Tool):
def __init__(self, config):
self.config = config
self.timeout = config.get("timeout", DEFAULT_TIMEOUT)
def _get_ddgs_client(self):
from ddgs import DDGS
return DDGS(timeout=self.timeout)
def _execute_with_retry(self, operation, operation_name: str) -> Dict[str, Any]:
last_error = None
for attempt in range(1, MAX_RETRIES + 1):
try:
results = operation()
return {
"status_code": 200,
"results": list(results) if results else [],
"message": f"{operation_name} completed successfully.",
}
except Exception as e:
last_error = e
error_str = str(e).lower()
if "ratelimit" in error_str or "429" in error_str:
if attempt < MAX_RETRIES:
delay = RETRY_DELAY * attempt
logger.warning(
f"{operation_name} rate limited, retrying in {delay}s (attempt {attempt}/{MAX_RETRIES})"
)
time.sleep(delay)
continue
logger.error(f"{operation_name} failed: {e}")
break
return {
"status_code": 500,
"results": [],
"message": f"{operation_name} failed: {str(last_error)}",
}
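# Retry behaviour sketch (constants from the top of this file): a rate-limited
# call ("ratelimit" or "429" in the error) is retried up to MAX_RETRIES times
# with delays of RETRY_DELAY * attempt (2.0s, then 4.0s); any other exception
# stops immediately and the caller receives status_code 500 with the message.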
def execute_action(self, action_name, **kwargs):
actions = {
"ddg_web_search": self._web_search,
"ddg_image_search": self._image_search,
"ddg_news_search": self._news_search,
}
if action_name in actions:
return actions[action_name](**kwargs)
else:
if action_name not in actions:
raise ValueError(f"Unknown action: {action_name}")
return actions[action_name](**kwargs)
def _web_search(
self,
query,
max_results=5,
):
print(f"Performing DuckDuckGo web search for: {query}")
query: str,
max_results: int = 5,
region: str = "wt-wt",
safesearch: str = "moderate",
timelimit: Optional[str] = None,
) -> Dict[str, Any]:
logger.info(f"DuckDuckGo web search: {query}")
try:
results = DDGS().text(
def operation():
client = self._get_ddgs_client()
return client.text(
query,
max_results=max_results,
region=region,
safesearch=safesearch,
timelimit=timelimit,
max_results=min(max_results, 20),
)
return {
"status_code": 200,
"results": results,
"message": "Web search completed successfully.",
}
except Exception as e:
return {
"status_code": 500,
"message": f"Web search failed: {str(e)}",
}
return self._execute_with_retry(operation, "Web search")
def _image_search(
self,
query,
max_results=5,
):
print(f"Performing DuckDuckGo image search for: {query}")
query: str,
max_results: int = 5,
region: str = "wt-wt",
safesearch: str = "moderate",
timelimit: Optional[str] = None,
) -> Dict[str, Any]:
logger.info(f"DuckDuckGo image search: {query}")
try:
results = DDGS().images(
keywords=query,
max_results=max_results,
def operation():
client = self._get_ddgs_client()
return client.images(
query,
region=region,
safesearch=safesearch,
timelimit=timelimit,
max_results=min(max_results, 50),
)
return {
"status_code": 200,
"results": results,
"message": "Image search completed successfully.",
}
except Exception as e:
return {
"status_code": 500,
"message": f"Image search failed: {str(e)}",
}
return self._execute_with_retry(operation, "Image search")
def _news_search(
self,
query: str,
max_results: int = 5,
region: str = "wt-wt",
safesearch: str = "moderate",
timelimit: Optional[str] = None,
) -> Dict[str, Any]:
logger.info(f"DuckDuckGo news search: {query}")
def operation():
client = self._get_ddgs_client()
return client.news(
query,
region=region,
safesearch=safesearch,
timelimit=timelimit,
max_results=min(max_results, 20),
)
return self._execute_with_retry(operation, "News search")
def get_actions_metadata(self):
return [
{
"name": "ddg_web_search",
"description": "Perform a web search using DuckDuckGo.",
"description": "Search the web using DuckDuckGo. Returns titles, URLs, and snippets.",
"parameters": {
"type": "object",
"properties": {
@@ -84,7 +145,15 @@ class DuckDuckGoSearchTool(Tool):
},
"max_results": {
"type": "integer",
"description": "Number of results to return (default: 5)",
"description": "Number of results (default: 5, max: 20)",
},
"region": {
"type": "string",
"description": "Region code (default: wt-wt for worldwide, us-en for US)",
},
"timelimit": {
"type": "string",
"description": "Time filter: d (day), w (week), m (month), y (year)",
},
},
"required": ["query"],
@@ -92,17 +161,43 @@ class DuckDuckGoSearchTool(Tool):
},
{
"name": "ddg_image_search",
"description": "Perform an image search using DuckDuckGo.",
"description": "Search for images using DuckDuckGo. Returns image URLs and metadata.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search query",
"description": "Image search query",
},
"max_results": {
"type": "integer",
"description": "Number of results to return (default: 5, max: 50)",
"description": "Number of results (default: 5, max: 50)",
},
"region": {
"type": "string",
"description": "Region code (default: wt-wt for worldwide)",
},
},
"required": ["query"],
},
},
{
"name": "ddg_news_search",
"description": "Search for news articles using DuckDuckGo. Returns recent news.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "News search query",
},
"max_results": {
"type": "integer",
"description": "Number of results (default: 5, max: 20)",
},
"timelimit": {
"type": "string",
"description": "Time filter: d (day), w (week), m (month)",
},
},
"required": ["query"],

View File

@@ -0,0 +1,438 @@
import json
import logging
from typing import Dict, List, Optional
from application.agents.tools.base import Tool
from application.core.settings import settings
from application.retriever.retriever_creator import RetrieverCreator
logger = logging.getLogger(__name__)
class InternalSearchTool(Tool):
"""Wraps the ClassicRAG retriever as an LLM-callable tool.
Instead of pre-fetching docs into the prompt, the LLM decides
when and what to search. Supports multiple searches per session.
Optional capabilities (enabled when sources have directory_structure):
- path_filter on search: restrict results to a specific file/folder
- list_files action: browse the file/folder structure
"""
internal = True
def __init__(self, config: Dict):
self.config = config
self.retrieved_docs: List[Dict] = []
self._retriever = None
self._directory_structure: Optional[Dict] = None
self._dir_structure_loaded = False
def _get_retriever(self):
if self._retriever is None:
self._retriever = RetrieverCreator.create_retriever(
self.config.get("retriever_name", "classic"),
source=self.config.get("source", {}),
chat_history=[],
prompt="",
chunks=int(self.config.get("chunks", 2)),
doc_token_limit=int(self.config.get("doc_token_limit", 50000)),
model_id=self.config.get("model_id", "docsgpt-local"),
user_api_key=self.config.get("user_api_key"),
agent_id=self.config.get("agent_id"),
llm_name=self.config.get("llm_name", settings.LLM_PROVIDER),
api_key=self.config.get("api_key", settings.API_KEY),
decoded_token=self.config.get("decoded_token"),
)
return self._retriever
def _get_directory_structure(self) -> Optional[Dict]:
"""Load directory structure from MongoDB for the configured sources."""
if self._dir_structure_loaded:
return self._directory_structure
self._dir_structure_loaded = True
source = self.config.get("source", {})
active_docs = source.get("active_docs", [])
if not active_docs:
return None
try:
from bson.objectid import ObjectId
from application.core.mongo_db import MongoDB
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
sources_collection = db["sources"]
if isinstance(active_docs, str):
active_docs = [active_docs]
merged_structure = {}
for doc_id in active_docs:
try:
source_doc = sources_collection.find_one(
{"_id": ObjectId(doc_id)}
)
if not source_doc:
continue
dir_str = source_doc.get("directory_structure")
if dir_str:
if isinstance(dir_str, str):
dir_str = json.loads(dir_str)
source_name = source_doc.get("name", doc_id)
if len(active_docs) > 1:
merged_structure[source_name] = dir_str
else:
merged_structure = dir_str
except Exception as e:
logger.debug(f"Could not load dir structure for {doc_id}: {e}")
self._directory_structure = merged_structure if merged_structure else None
except Exception as e:
logger.debug(f"Failed to load directory structures: {e}")
return self._directory_structure
def execute_action(self, action_name: str, **kwargs):
if action_name == "search":
return self._execute_search(**kwargs)
elif action_name == "list_files":
return self._execute_list_files(**kwargs)
return f"Unknown action: {action_name}"
def _execute_search(self, **kwargs) -> str:
query = kwargs.get("query", "")
path_filter = kwargs.get("path_filter", "")
if not query:
return "Error: 'query' parameter is required."
try:
retriever = self._get_retriever()
docs = retriever.search(query)
except Exception as e:
logger.error(f"Internal search failed: {e}", exc_info=True)
return "Search failed: an internal error occurred."
if not docs:
return "No documents found matching your query."
# Apply path filter if specified
if path_filter:
path_lower = path_filter.lower()
docs = [
d
for d in docs
if path_lower in d.get("source", "").lower()
or path_lower in d.get("filename", "").lower()
or path_lower in d.get("title", "").lower()
]
if not docs:
return f"No documents found matching query '{query}' in path '{path_filter}'."
# Accumulate for source tracking
for doc in docs:
if doc not in self.retrieved_docs:
self.retrieved_docs.append(doc)
# Format results for the LLM
formatted = []
for i, doc in enumerate(docs, 1):
title = doc.get("title", "Untitled")
text = doc.get("text", "")
source = doc.get("source", "Unknown")
filename = doc.get("filename", "")
header = filename or title
formatted.append(f"[{i}] {header} (source: {source})\n{text}")
return "\n\n---\n\n".join(formatted)
def _execute_list_files(self, **kwargs) -> str:
path = kwargs.get("path", "")
dir_structure = self._get_directory_structure()
if not dir_structure:
return "No file structure available for the current sources."
# Navigate to the requested path
current = dir_structure
if path:
for part in path.strip("/").split("/"):
if not part:
continue
if isinstance(current, dict) and part in current:
current = current[part]
else:
return f"Path '{path}' not found in the file structure."
# Format the structure for the LLM
return self._format_structure(current, path or "/")
def _format_structure(self, node: Dict, current_path: str) -> str:
if not isinstance(node, dict):
return f"'{current_path}' is a file, not a directory."
lines = [f"File structure at '{current_path}':\n"]
folders = []
files = []
for name, value in sorted(node.items()):
if isinstance(value, dict):
# Check if it's a file metadata dict or a folder
if "type" in value or "size_bytes" in value or "token_count" in value:
# It's a file with metadata
size = value.get("token_count", "")
ftype = value.get("type", "")
info_parts = []
if ftype:
info_parts.append(ftype)
if size:
info_parts.append(f"{size} tokens")
info = f" ({', '.join(info_parts)})" if info_parts else ""
files.append(f" {name}{info}")
else:
# It's a folder
count = self._count_files(value)
folders.append(f" {name}/ ({count} items)")
else:
files.append(f" {name}")
if folders:
lines.append("Folders:")
lines.extend(folders)
if files:
lines.append("Files:")
lines.extend(files)
if not folders and not files:
lines.append(" (empty)")
return "\n".join(lines)
def _count_files(self, node: Dict) -> int:
count = 0
for value in node.values():
if isinstance(value, dict):
if "type" in value or "size_bytes" in value or "token_count" in value:
count += 1
else:
count += self._count_files(value)
else:
count += 1
return count
def get_actions_metadata(self):
actions = [
{
"name": "search",
"description": (
"Search the user's uploaded documents and knowledge base. "
"Use this to find relevant information before answering questions. "
"You can call this multiple times with different queries."
),
"parameters": {
"properties": {
"query": {
"type": "string",
"description": "The search query. Be specific and focused.",
"filled_by_llm": True,
"required": True,
},
}
},
}
]
# Add path_filter and list_files only if directory structure exists
has_structure = self.config.get("has_directory_structure", False)
if has_structure:
actions[0]["parameters"]["properties"]["path_filter"] = {
"type": "string",
"description": (
"Optional: filter results to a specific file or folder path. "
"Use list_files first to see available paths."
),
"filled_by_llm": True,
"required": False,
}
actions.append(
{
"name": "list_files",
"description": (
"Browse the file and folder structure of the knowledge base. "
"Use this to see what files are available before searching. "
"Optionally provide a path to browse a specific folder."
),
"parameters": {
"properties": {
"path": {
"type": "string",
"description": "Optional: folder path to browse. Leave empty for root.",
"filled_by_llm": True,
"required": False,
}
}
},
}
)
return actions
def get_config_requirements(self):
return {}
# Constants for building synthetic tools_dict entries
INTERNAL_TOOL_ID = "internal"
def build_internal_tool_entry(has_directory_structure: bool = False) -> Dict:
"""Build the tools_dict entry for InternalSearchTool.
Dynamically includes list_files and path_filter based on
whether the sources have directory structure.
"""
search_params = {
"properties": {
"query": {
"type": "string",
"description": "The search query. Be specific and focused.",
"filled_by_llm": True,
"required": True,
}
}
}
actions = [
{
"name": "search",
"description": (
"Search the user's uploaded documents and knowledge base. "
"Use this to find relevant information before answering questions. "
"You can call this multiple times with different queries."
),
"active": True,
"parameters": search_params,
}
]
if has_directory_structure:
search_params["properties"]["path_filter"] = {
"type": "string",
"description": (
"Optional: filter results to a specific file or folder path. "
"Use list_files first to see available paths."
),
"filled_by_llm": True,
"required": False,
}
actions.append(
{
"name": "list_files",
"description": (
"Browse the file and folder structure of the knowledge base. "
"Use this to see what files are available before searching. "
"Optionally provide a path to browse a specific folder."
),
"active": True,
"parameters": {
"properties": {
"path": {
"type": "string",
"description": "Optional: folder path to browse. Leave empty for root.",
"filled_by_llm": True,
"required": False,
}
}
},
}
)
return {"name": "internal_search", "actions": actions}
# Keep backward compat
INTERNAL_TOOL_ENTRY = build_internal_tool_entry(has_directory_structure=False)
def sources_have_directory_structure(source: Dict) -> bool:
"""Check if any of the active sources have directory_structure in MongoDB."""
active_docs = source.get("active_docs", [])
if not active_docs:
return False
try:
from bson.objectid import ObjectId
from application.core.mongo_db import MongoDB
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
sources_collection = db["sources"]
if isinstance(active_docs, str):
active_docs = [active_docs]
for doc_id in active_docs:
try:
source_doc = sources_collection.find_one(
{"_id": ObjectId(doc_id)},
{"directory_structure": 1},
)
if source_doc and source_doc.get("directory_structure"):
return True
except Exception:
continue
except Exception as e:
logger.debug(f"Could not check directory structure: {e}")
return False
def add_internal_search_tool(tools_dict: Dict, retriever_config: Dict) -> None:
"""Add the internal search tool to tools_dict if sources are configured.
Shared by AgenticAgent and ResearchAgent to avoid duplicate setup logic.
Mutates tools_dict in place.
"""
source = retriever_config.get("source", {})
has_sources = bool(source.get("active_docs"))
if not retriever_config or not has_sources:
return
has_dir = sources_have_directory_structure(source)
internal_entry = build_internal_tool_entry(has_directory_structure=has_dir)
internal_entry["config"] = build_internal_tool_config(
**retriever_config,
has_directory_structure=has_dir,
)
tools_dict[INTERNAL_TOOL_ID] = internal_entry
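# Illustrative retriever_config (keys mirror build_internal_tool_config below;
# the source id is a placeholder):
#
# add_internal_search_tool(tools_dict, {
#     "source": {"active_docs": ["<source_object_id>"]},
#     "retriever_name": "classic",
#     "chunks": 2,
# })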
def build_internal_tool_config(
source: Dict,
retriever_name: str = "classic",
chunks: int = 2,
doc_token_limit: int = 50000,
model_id: str = "docsgpt-local",
user_api_key: Optional[str] = None,
agent_id: Optional[str] = None,
llm_name: Optional[str] = None,
api_key: Optional[str] = None,
decoded_token: Optional[Dict] = None,
has_directory_structure: bool = False,
) -> Dict:
"""Build the config dict for InternalSearchTool."""
return {
"source": source,
"retriever_name": retriever_name,
"chunks": chunks,
"doc_token_limit": doc_token_limit,
"model_id": model_id,
"user_api_key": user_api_key,
"agent_id": agent_id,
"llm_name": llm_name or settings.LLM_PROVIDER,
"api_key": api_key or settings.API_KEY,
"decoded_token": decoded_token,
"has_directory_structure": has_directory_structure,
}

View File

@@ -1,20 +1,12 @@
import asyncio
import base64
import concurrent.futures
import json
import logging
import time
from typing import Any, Dict, List, Optional
from urllib.parse import parse_qs, urlparse
from application.agents.tools.base import Tool
from application.api.user.tasks import mcp_oauth_status_task, mcp_oauth_task
from application.cache import get_redis_instance
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.security.encryption import decrypt_credentials
from fastmcp import Client
from fastmcp.client.auth import BearerAuth
from fastmcp.client.transports import (
@@ -24,10 +16,19 @@ from fastmcp.client.transports import (
)
from mcp.client.auth import OAuthClientProvider, TokenStorage
from mcp.shared.auth import OAuthClientInformationFull, OAuthClientMetadata, OAuthToken
from pydantic import AnyHttpUrl, ValidationError
from redis import Redis
from application.agents.tools.base import Tool
from application.api.user.tasks import mcp_oauth_status_task, mcp_oauth_task
from application.cache import get_redis_instance
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.core.url_validation import SSRFError, validate_url
from application.security.encryption import decrypt_credentials
logger = logging.getLogger(__name__)
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
@@ -56,11 +57,13 @@ class MCPTool(Tool):
- args: Arguments for STDIO transport
- oauth_scopes: OAuth scopes for oauth auth type
- oauth_client_name: OAuth client name for oauth auth type
- query_mode: If True, use non-interactive OAuth (fail-fast on 401)
user_id: User ID for decrypting credentials (required if encrypted_credentials exist)
"""
self.config = config
self.user_id = user_id
self.server_url = config.get("server_url", "")
raw_url = config.get("server_url", "")
self.server_url = self._validate_server_url(raw_url) if raw_url else ""
self.transport_type = config.get("transport_type", "auto")
self.auth_type = config.get("auth_type", "none")
self.timeout = config.get("timeout", 30)
@@ -76,23 +79,53 @@ class MCPTool(Tool):
self.oauth_scopes = config.get("oauth_scopes", [])
self.oauth_task_id = config.get("oauth_task_id", None)
self.oauth_client_name = config.get("oauth_client_name", "DocsGPT-MCP")
self.redirect_uri = f"{settings.API_URL}/api/mcp_server/callback"
self.redirect_uri = self._resolve_redirect_uri(config.get("redirect_uri"))
self.available_tools = []
self._cache_key = self._generate_cache_key()
self._client = None
# Only validate and setup if server_url is provided and not OAuth
self.query_mode = config.get("query_mode", False)
if self.server_url and self.auth_type != "oauth":
self._setup_client()
@staticmethod
def _validate_server_url(server_url: str) -> str:
"""Validate server_url to prevent SSRF to internal networks.
Raises:
ValueError: If the URL points to a private/internal address.
"""
try:
return validate_url(server_url)
except SSRFError as exc:
raise ValueError(f"Invalid MCP server URL: {exc}") from exc
def _resolve_redirect_uri(self, configured_redirect_uri: Optional[str]) -> str:
if configured_redirect_uri:
return configured_redirect_uri.rstrip("/")
explicit = getattr(settings, "MCP_OAUTH_REDIRECT_URI", None)
if explicit:
return explicit.rstrip("/")
connector_base = getattr(settings, "CONNECTOR_REDIRECT_BASE_URI", None)
if connector_base:
parsed = urlparse(connector_base)
if parsed.scheme and parsed.netloc:
return f"{parsed.scheme}://{parsed.netloc}/api/mcp_server/callback"
return f"{settings.API_URL.rstrip('/')}/api/mcp_server/callback"
def _generate_cache_key(self) -> str:
"""Generate a unique cache key for this MCP server configuration."""
auth_key = ""
if self.auth_type == "oauth":
scopes_str = ",".join(self.oauth_scopes) if self.oauth_scopes else "none"
auth_key = f"oauth:{self.oauth_client_name}:{scopes_str}"
oauth_identity = self.user_id or self.oauth_task_id or "anonymous"
auth_key = (
f"oauth:{oauth_identity}:{self.oauth_client_name}:{scopes_str}:{self.redirect_uri}"
)
elif self.auth_type in ["bearer"]:
token = self.auth_credentials.get(
"bearer_token", ""
@@ -109,11 +142,10 @@ class MCPTool(Tool):
return f"{self.server_url}#{self.transport_type}#{auth_key}"
def _setup_client(self):
"""Setup FastMCP client with proper transport and authentication."""
global _mcp_clients_cache
if self._cache_key in _mcp_clients_cache:
cached_data = _mcp_clients_cache[self._cache_key]
if time.time() - cached_data["created_at"] < 1800:
if time.time() - cached_data["created_at"] < 300:
self._client = cached_data["client"]
return
else:
@@ -123,15 +155,25 @@ class MCPTool(Tool):
if self.auth_type == "oauth":
redis_client = get_redis_instance()
auth = DocsGPTOAuth(
mcp_url=self.server_url,
scopes=self.oauth_scopes,
redis_client=redis_client,
redirect_uri=self.redirect_uri,
task_id=self.oauth_task_id,
db=db,
user_id=self.user_id,
)
if self.query_mode:
auth = NonInteractiveOAuth(
mcp_url=self.server_url,
scopes=self.oauth_scopes,
redis_client=redis_client,
redirect_uri=self.redirect_uri,
db=db,
user_id=self.user_id,
)
else:
auth = DocsGPTOAuth(
mcp_url=self.server_url,
scopes=self.oauth_scopes,
redis_client=redis_client,
redirect_uri=self.redirect_uri,
task_id=self.oauth_task_id,
db=db,
user_id=self.user_id,
)
elif self.auth_type == "bearer":
token = self.auth_credentials.get(
"bearer_token", ""
@@ -233,38 +275,53 @@ class MCPTool(Tool):
else:
raise Exception(f"Unknown operation: {operation}")
_ERROR_MAP = [
(concurrent.futures.TimeoutError, lambda op, t, _: f"Timed out after {t}s"),
(ConnectionRefusedError, lambda *_: "Connection refused"),
]
_ERROR_PATTERNS = {
("403", "Forbidden"): "Access denied (403 Forbidden)",
("401", "Unauthorized"): "Authentication failed (401 Unauthorized)",
("ECONNREFUSED",): "Connection refused",
("SSL", "certificate"): "SSL/TLS error",
}
def _run_async_operation(self, operation: str, *args, **kwargs):
"""Run async operation in sync context."""
try:
try:
loop = asyncio.get_running_loop()
import concurrent.futures
def run_in_thread():
new_loop = asyncio.new_event_loop()
asyncio.set_event_loop(new_loop)
try:
return new_loop.run_until_complete(
self._execute_with_client(operation, *args, **kwargs)
)
finally:
new_loop.close()
asyncio.get_running_loop()
with concurrent.futures.ThreadPoolExecutor() as executor:
future = executor.submit(run_in_thread)
future = executor.submit(
self._run_in_new_loop, operation, *args, **kwargs
)
return future.result(timeout=self.timeout)
except RuntimeError:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
return loop.run_until_complete(
self._execute_with_client(operation, *args, **kwargs)
)
finally:
loop.close()
return self._run_in_new_loop(operation, *args, **kwargs)
except Exception as e:
print(f"Error occurred while running async operation: {e}")
raise
raise self._map_error(operation, e) from e
def _run_in_new_loop(self, operation, *args, **kwargs):
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
return loop.run_until_complete(
self._execute_with_client(operation, *args, **kwargs)
)
finally:
loop.close()
def _map_error(self, operation: str, exc: Exception) -> Exception:
for exc_type, msg_fn in self._ERROR_MAP:
if isinstance(exc, exc_type):
return Exception(msg_fn(operation, self.timeout, exc))
error_msg = str(exc)
for patterns, friendly in self._ERROR_PATTERNS.items():
if any(p.lower() in error_msg.lower() for p in patterns):
return Exception(friendly)
logger.error("MCP %s failed: %s", operation, exc)
return exc
def discover_tools(self) -> List[Dict]:
"""
@@ -285,16 +342,6 @@ class MCPTool(Tool):
raise Exception(f"Failed to discover tools from MCP server: {str(e)}")
def execute_action(self, action_name: str, **kwargs) -> Any:
"""
Execute an action on the remote MCP server using FastMCP.
Args:
action_name: Name of the action to execute
**kwargs: Parameters for the action
Returns:
Result from the MCP server
"""
if not self.server_url:
raise Exception("No MCP server configured")
if not self._client:
@@ -310,7 +357,37 @@ class MCPTool(Tool):
)
return self._format_result(result)
except Exception as e:
raise Exception(f"Failed to execute action '{action_name}': {str(e)}")
error_msg = str(e)
lower_msg = error_msg.lower()
is_auth_error = (
"401" in error_msg
or "unauthorized" in lower_msg
or "session expired" in lower_msg
or "re-authorize" in lower_msg
)
if is_auth_error:
if self.auth_type == "oauth":
raise Exception(
f"Action '{action_name}' failed: OAuth session expired. "
"Please re-authorize this MCP server in tool settings."
) from e
global _mcp_clients_cache
_mcp_clients_cache.pop(self._cache_key, None)
self._client = None
self._setup_client()
try:
result = self._run_async_operation(
"call_tool", action_name, **cleaned_kwargs
)
return self._format_result(result)
except Exception as retry_e:
raise Exception(
f"Action '{action_name}' failed after re-auth attempt: {retry_e}. "
"Your credentials may have expired — please re-authorize in tool settings."
) from retry_e
raise Exception(
f"Failed to execute action '{action_name}': {error_msg}"
) from e
def _format_result(self, result) -> Dict:
"""Format FastMCP result to match expected format."""
@@ -333,23 +410,35 @@ class MCPTool(Tool):
return result
def test_connection(self) -> Dict:
"""
Test the connection to the MCP server and validate functionality.
Returns:
Dictionary with connection test results including tool count
"""
if not self.server_url:
return {
"success": False,
"message": "No MCP server URL configured",
"message": "No server URL configured",
"tools_count": 0,
}
try:
parsed = urlparse(self.server_url)
if parsed.scheme not in ("http", "https"):
return {
"success": False,
"message": f"Invalid URL scheme '{parsed.scheme}' — use http:// or https://",
"tools_count": 0,
}
except Exception:
return {
"success": False,
"message": "Invalid URL format",
"tools_count": 0,
"transport_type": self.transport_type,
"auth_type": self.auth_type,
"error_type": "ConfigurationError",
}
if not self._client:
self._setup_client()
try:
self._setup_client()
except Exception as e:
return {
"success": False,
"message": f"Client init failed: {str(e)}",
"tools_count": 0,
}
try:
if self.auth_type == "oauth":
return self._test_oauth_connection()
@@ -360,56 +449,94 @@ class MCPTool(Tool):
"success": False,
"message": f"Connection failed: {str(e)}",
"tools_count": 0,
"transport_type": self.transport_type,
"auth_type": self.auth_type,
"error_type": type(e).__name__,
}
def _test_regular_connection(self) -> Dict:
"""Test connection for non-OAuth auth types."""
ping_ok = False
ping_error = None
try:
    self._run_async_operation("ping")
    ping_ok = True
except Exception as e:
    ping_error = str(e)
try:
    tools = self.discover_tools()
except Exception as e:
    return {
        "success": False,
        "message": f"Connection failed: {ping_error or str(e)}",
        "tools_count": 0,
    }
if not tools and not ping_ok:
return {
"success": False,
"message": f"Connection failed: {ping_error or 'No tools found'}",
"tools_count": 0,
}
return {
"success": True,
"message": f"Connected — found {len(tools)} tool{'s' if len(tools) != 1 else ''}.",
"tools_count": len(tools),
"tools": [
{
"name": tool.get("name", "unknown"),
"description": tool.get("description", ""),
}
for tool in tools
],
}
def _test_oauth_connection(self) -> Dict:
storage = DBTokenStorage(
server_url=self.server_url, user_id=self.user_id, db_client=db
)
loop = asyncio.new_event_loop()
try:
tokens = loop.run_until_complete(storage.get_tokens())
finally:
loop.close()
if tokens and tokens.access_token:
self.query_mode = True
_mcp_clients_cache.pop(self._cache_key, None)
self._client = None
self._setup_client()
try:
tools = self.discover_tools()
return {
"success": True,
"message": f"Connected — found {len(tools)} tool{'s' if len(tools) != 1 else ''}.",
"tools_count": len(tools),
"tools": [
{
"name": t.get("name", "unknown"),
"description": t.get("description", ""),
}
for t in tools
],
}
except Exception as e:
logger.warning("OAuth token validation failed: %s", e)
_mcp_clients_cache.pop(self._cache_key, None)
self._client = None
return self._start_oauth_task()
def _start_oauth_task(self) -> Dict:
task_config = self.config.copy()
task_config.pop("query_mode", None)
result = mcp_oauth_task.delay(task_config, self.user_id)
return {
"success": False,
"requires_oauth": True,
"task_id": result.id,
"message": "OAuth authorization required.",
"tools_count": 0,
}
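When no valid tokens exist, the frontend is expected to poll the status that mcp_oauth_task writes to Redis under mcp_oauth_status:<task_id> (see DocsGPTOAuth below). A hypothetical polling loop for the task_id returned above; apart from the status names visible in this diff, the terminal states are assumptions.
import json
import time

import redis

def wait_for_oauth_status(task_id: str, timeout: int = 300) -> dict:
    r = redis.Redis()
    deadline = time.time() + timeout
    while time.time() < deadline:
        raw = r.get(f"mcp_oauth_status:{task_id}")
        if raw:
            status = json.loads(raw)
            # "requires_redirect", "awaiting_callback" and "callback_received"
            # appear in this diff; treat anything else as terminal for this sketch.
            if status.get("status") not in ("awaiting_callback", "callback_received"):
                return status
        time.sleep(2)
    raise TimeoutError("OAuth flow did not progress in time")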
def get_actions_metadata(self) -> List[Dict]:
"""
Get metadata for all available actions.
@@ -455,110 +582,88 @@ class MCPTool(Tool):
return actions
def get_config_requirements(self) -> Dict:
"""Get configuration requirements for the MCP tool."""
transport_enum = ["auto", "sse", "http"]
transport_help = {
"auto": "Automatically detect best transport",
"sse": "Server-Sent Events (for real-time streaming)",
"http": "HTTP streaming (recommended for production)",
}
return {
"server_url": {
"type": "string",
"description": "URL of the remote MCP server (e.g., https://api.example.com/mcp or https://docs.mcp.cloudflare.com/sse)",
"label": "Server URL",
"description": "URL of the remote MCP server",
"required": True,
},
"transport_type": {
"type": "string",
"description": "Transport type for connection",
"enum": transport_enum,
"default": "auto",
"required": False,
"help": {
**transport_help,
},
"secret": False,
"order": 1,
},
"auth_type": {
"type": "string",
"description": "Authentication type",
"label": "Authentication Type",
"description": "Authentication method for the MCP server",
"enum": ["none", "bearer", "oauth", "api_key", "basic"],
"default": "none",
"required": True,
"help": {
"none": "No authentication",
"bearer": "Bearer token authentication",
"oauth": "OAuth 2.1 authentication (with frontend integration)",
"api_key": "API key authentication",
"basic": "Basic authentication",
},
"secret": False,
"order": 2,
},
"auth_credentials": {
"type": "object",
"description": "Authentication credentials (varies by auth_type)",
"api_key": {
"type": "string",
"label": "API Key",
"description": "API key for authentication",
"required": False,
"properties": {
"bearer_token": {
"type": "string",
"description": "Bearer token for bearer auth",
},
"access_token": {
"type": "string",
"description": "Access token for OAuth (if pre-obtained)",
},
"api_key": {
"type": "string",
"description": "API key for api_key auth",
},
"api_key_header": {
"type": "string",
"description": "Header name for API key (default: X-API-Key)",
},
"username": {
"type": "string",
"description": "Username for basic auth",
},
"password": {
"type": "string",
"description": "Password for basic auth",
},
},
"secret": True,
"order": 3,
"depends_on": {"auth_type": "api_key"},
},
"api_key_header": {
"type": "string",
"label": "API Key Header",
"description": "Header name for API key (default: X-API-Key)",
"default": "X-API-Key",
"required": False,
"secret": False,
"order": 4,
"depends_on": {"auth_type": "api_key"},
},
"bearer_token": {
"type": "string",
"label": "Bearer Token",
"description": "Bearer token for authentication",
"required": False,
"secret": True,
"order": 3,
"depends_on": {"auth_type": "bearer"},
},
"username": {
"type": "string",
"label": "Username",
"description": "Username for basic authentication",
"required": False,
"secret": False,
"order": 3,
"depends_on": {"auth_type": "basic"},
},
"password": {
"type": "string",
"label": "Password",
"description": "Password for basic authentication",
"required": False,
"secret": True,
"order": 4,
"depends_on": {"auth_type": "basic"},
},
"oauth_scopes": {
"type": "array",
"description": "OAuth scopes to request (for oauth auth_type)",
"items": {"type": "string"},
"required": False,
"default": [],
},
"oauth_client_name": {
"type": "string",
"description": "Client name for OAuth registration (for oauth auth_type)",
"default": "DocsGPT-MCP",
"required": False,
},
"headers": {
"type": "object",
"description": "Custom headers to send with requests",
"label": "OAuth Scopes",
"description": "Comma-separated OAuth scopes to request",
"required": False,
"secret": False,
"order": 3,
"depends_on": {"auth_type": "oauth"},
},
"timeout": {
"type": "integer",
"description": "Request timeout in seconds",
"type": "number",
"label": "Timeout (seconds)",
"description": "Request timeout in seconds (1-300)",
"default": 30,
"minimum": 1,
"maximum": 300,
"required": False,
},
"command": {
"type": "string",
"description": "Command to run for STDIO transport (e.g., 'python')",
"required": False,
},
"args": {
"type": "array",
"description": "Arguments for STDIO command",
"items": {"type": "string"},
"required": False,
"secret": False,
"order": 10,
},
}
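The label/secret/order/depends_on fields drive form rendering in the frontend; a config that satisfies this schema for bearer auth would look roughly like the following (values are placeholders):
example_mcp_config = {
    "server_url": "https://api.example.com/mcp",
    "transport_type": "auto",
    "auth_type": "bearer",
    "bearer_token": "<token>",
    "timeout": 30,
}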
@@ -580,23 +685,8 @@ class DocsGPTOAuth(OAuthClientProvider):
user_id=None,
db=None,
additional_client_metadata: dict[str, Any] | None = None,
skip_redirect_validation: bool = False,
):
"""
Initialize custom OAuth client provider for DocsGPT.
Args:
mcp_url: Full URL to the MCP endpoint
redirect_uri: Custom redirect URI for DocsGPT frontend
redis_client: Redis client for storing auth state
redis_prefix: Prefix for Redis keys
task_id: Task ID for tracking auth status
scopes: OAuth scopes to request
client_name: Name for this client during registration
user_id: User ID for token storage
db: Database instance for token storage
additional_client_metadata: Extra fields for OAuthClientMetadata
"""
self.redirect_uri = redirect_uri
self.redis_client = redis_client
self.redis_prefix = redis_prefix
@@ -619,7 +709,10 @@ class DocsGPTOAuth(OAuthClientProvider):
)
storage = DBTokenStorage(
server_url=self.server_base_url,
user_id=self.user_id,
db_client=self.db,
expected_redirect_uri=None if skip_redirect_validation else redirect_uri,
)
super().__init__(
@@ -651,22 +744,20 @@ class DocsGPTOAuth(OAuthClientProvider):
async def redirect_handler(self, authorization_url: str) -> None:
"""Store auth URL and state in Redis for frontend to use."""
auth_url, state = self._process_auth_url(authorization_url)
logger.info("Processed auth_url: %s, state: %s", auth_url, state)
self.auth_url = auth_url
self.extracted_state = state
if self.redis_client and self.extracted_state:
key = f"{self.redis_prefix}auth_url:{self.extracted_state}"
self.redis_client.setex(key, 600, auth_url)
logging.info("[DocsGPTOAuth] Stored auth_url in Redis: %s", key)
logger.info("Stored auth_url in Redis: %s", key)
if self.task_id:
status_key = f"mcp_oauth_status:{self.task_id}"
status_data = {
"status": "requires_redirect",
"message": "OAuth authorization required",
"message": "Authorization required",
"authorization_url": self.auth_url,
"state": self.extracted_state,
"requires_oauth": True,
@@ -686,7 +777,7 @@ class DocsGPTOAuth(OAuthClientProvider):
status_key = f"mcp_oauth_status:{self.task_id}"
status_data = {
"status": "awaiting_callback",
"message": "Waiting for OAuth callback...",
"message": "Waiting for authorization...",
"authorization_url": self.auth_url,
"state": self.extracted_state,
"requires_oauth": True,
@@ -711,7 +802,7 @@ class DocsGPTOAuth(OAuthClientProvider):
if self.task_id:
status_data = {
"status": "callback_received",
"message": "OAuth callback received, completing authentication...",
"message": "Completing authentication...",
"task_id": self.task_id,
}
self.redis_client.setex(status_key, 600, json.dumps(status_data))
@@ -731,14 +822,44 @@ class DocsGPTOAuth(OAuthClientProvider):
await asyncio.sleep(poll_interval)
self.redis_client.delete(f"{self.redis_prefix}auth_url:{self.extracted_state}")
self.redis_client.delete(f"{self.redis_prefix}state:{self.extracted_state}")
raise Exception("OAuth callback timeout: no code received within 5 minutes")
raise Exception("OAuth timeout: no code received within 5 minutes")
class NonInteractiveOAuth(DocsGPTOAuth):
"""OAuth provider that fails fast on 401 instead of starting interactive auth.
Used during query execution to prevent the streaming response from blocking
while waiting for user authorization that will never come.
"""
def __init__(self, **kwargs):
kwargs.setdefault("task_id", None)
kwargs["skip_redirect_validation"] = True
super().__init__(**kwargs)
async def redirect_handler(self, authorization_url: str) -> None:
raise Exception(
"OAuth session expired — please re-authorize this MCP server in tool settings."
)
async def callback_handler(self) -> tuple[str, str | None]:
raise Exception(
"OAuth session expired — please re-authorize this MCP server in tool settings."
)
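A sketch of how a caller might pick between the two providers depending on whether it is running inside a streaming query; the wiring here is an assumption, the real selection happens in the elided _setup_client logic:
def choose_oauth_provider(query_mode: bool, **provider_kwargs):
    # Interactive flows may block on a browser redirect; query-time flows must not.
    provider_cls = NonInteractiveOAuth if query_mode else DocsGPTOAuth
    return provider_cls(**provider_kwargs)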
class DBTokenStorage(TokenStorage):
def __init__(
self,
server_url: str,
user_id: str,
db_client,
expected_redirect_uri: Optional[str] = None,
):
self.server_url = server_url
self.user_id = user_id
self.db_client = db_client
self.expected_redirect_uri = expected_redirect_uri
self.collection = db_client["connector_sessions"]
@staticmethod
@@ -757,10 +878,9 @@ class DBTokenStorage(TokenStorage):
if not doc or "tokens" not in doc:
return None
try:
return OAuthToken.model_validate(doc["tokens"])
except ValidationError as e:
logging.error(f"Could not load tokens: {e}")
logger.error("Could not load tokens: %s", e)
return None
async def set_tokens(self, tokens: OAuthToken) -> None:
@@ -770,28 +890,38 @@ class DBTokenStorage(TokenStorage):
{"$set": {"tokens": tokens.model_dump()}},
True,
)
logging.info(f"Saved tokens for {self.get_base_url(self.server_url)}")
logger.info("Saved tokens for %s", self.get_base_url(self.server_url))
async def get_client_info(self) -> OAuthClientInformationFull | None:
doc = await asyncio.to_thread(self.collection.find_one, self.get_db_key())
if not doc or "client_info" not in doc:
logger.debug(
"No client_info in DB for %s", self.get_base_url(self.server_url)
)
return None
try:
client_info = OAuthClientInformationFull.model_validate(doc["client_info"])
tokens = await self.get_tokens()
if tokens is None:
logging.debug(
"No tokens found, clearing client info to force fresh registration."
)
await asyncio.to_thread(
self.collection.update_one,
self.get_db_key(),
{"$unset": {"client_info": ""}},
)
return None
if self.expected_redirect_uri:
stored_uris = [
str(uri).rstrip("/") for uri in client_info.redirect_uris
]
expected_uri = self.expected_redirect_uri.rstrip("/")
if expected_uri not in stored_uris:
logger.warning(
"Redirect URI mismatch for %s: expected=%s stored=%s — clearing.",
self.get_base_url(self.server_url),
expected_uri,
stored_uris,
)
await asyncio.to_thread(
self.collection.update_one,
self.get_db_key(),
{"$unset": {"client_info": "", "tokens": ""}},
)
return None
return client_info
except ValidationError as e:
logging.error(f"Could not load client info: {e}")
logger.error("Could not load client info: %s", e)
return None
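The redirect-URI check above compares URIs after trailing-slash normalization, so a registration stored with and without the trailing slash counts as the same client; a tiny standalone illustration:
def same_redirect_uri(stored: str, expected: str) -> bool:
    return stored.rstrip("/") == expected.rstrip("/")

assert same_redirect_uri("https://app.example.com/callback/", "https://app.example.com/callback")
assert not same_redirect_uri("https://app.example.com/old", "https://app.example.com/callback")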
def _serialize_client_info(self, info: dict) -> dict:
@@ -807,17 +937,17 @@ class DBTokenStorage(TokenStorage):
{"$set": {"client_info": serialized_info}},
True,
)
logging.info(f"Saved client info for {self.get_base_url(self.server_url)}")
logger.info("Saved client info for %s", self.get_base_url(self.server_url))
async def clear(self) -> None:
await asyncio.to_thread(self.collection.delete_one, self.get_db_key())
logging.info(f"Cleared OAuth cache for {self.get_base_url(self.server_url)}")
logger.info("Cleared OAuth cache for %s", self.get_base_url(self.server_url))
@classmethod
async def clear_all(cls, db_client) -> None:
collection = db_client["connector_sessions"]
await asyncio.to_thread(collection.delete_many, {})
logging.info("Cleared all OAuth client cache data.")
logger.info("Cleared all OAuth client cache data.")
class MCPOAuthManager:
@@ -856,7 +986,7 @@ class MCPOAuthManager:
return True
except Exception as e:
logging.error(f"Error handling OAuth callback: {e}")
logger.error("Error handling OAuth callback: %s", e)
return False
def get_oauth_status(self, task_id: str) -> Dict[str, Any]:

View File

@@ -71,7 +71,7 @@ class NtfyTool(Tool):
if self.token:
headers["Authorization"] = f"Basic {self.token}"
data = message.encode("utf-8")
response = requests.post(url, headers=headers, data=data, timeout=100)
return {"status_code": response.status_code, "message": "Message sent"}
def get_actions_metadata(self):
@@ -116,12 +116,13 @@ class NtfyTool(Tool):
]
def get_config_requirements(self):
"""
Specify the configuration requirements.
Returns:
dict: Dictionary describing required config parameters.
"""
return {
"token": {"type": "string", "description": "Access token for authentication"},
"token": {
"type": "string",
"label": "Access Token",
"description": "Ntfy access token for authentication",
"required": True,
"secret": True,
"order": 1,
},
}

View File

@@ -1,6 +1,12 @@
import logging

import psycopg
from application.agents.tools.base import Tool
logger = logging.getLogger(__name__)
class PostgresTool(Tool):
"""
PostgreSQL Database Tool
@@ -17,25 +23,25 @@ class PostgresTool(Tool):
"postgres_execute_sql": self._execute_sql,
"postgres_get_schema": self._get_schema,
}
if action_name not in actions:
    raise ValueError(f"Unknown action: {action_name}")
return actions[action_name](**kwargs)
def _execute_sql(self, sql_query):
"""
Executes an SQL query against the PostgreSQL database using a connection string.
"""
conn = None
try:
conn = psycopg.connect(self.connection_string)
cur = conn.cursor()
cur.execute(sql_query)
conn.commit()
if sql_query.strip().lower().startswith("select"):
column_names = (
    [desc[0] for desc in cur.description] if cur.description else []
)
results = []
rows = cur.fetchall()
for row in rows:
@@ -43,7 +49,9 @@ class PostgresTool(Tool):
response_data = {"data": results, "column_names": column_names}
else:
row_count = cur.rowcount
response_data = {"message": f"Query executed successfully, {row_count} rows affected."}
response_data = {
"message": f"Query executed successfully, {row_count} rows affected."
}
cur.close()
return {
@@ -52,28 +60,29 @@ class PostgresTool(Tool):
"response_data": response_data,
}
except psycopg.Error as e:
error_message = f"Database error: {e}"
print(f"Database error: {e}")
logger.error("PostgreSQL execute_sql error: %s", e)
return {
"status_code": 500,
"message": "Failed to execute SQL query.",
"error": error_message,
}
finally:
if conn:
conn.close()
def _get_schema(self, db_name):
"""
Retrieves the schema of the PostgreSQL database using a connection string.
"""
conn = None
try:
conn = psycopg.connect(self.connection_string)
cur = conn.cursor()
cur.execute("""
cur.execute(
"""
SELECT
table_name,
column_name,
@@ -87,19 +96,22 @@ class PostgresTool(Tool):
ORDER BY
table_name,
ordinal_position;
""")
"""
)
schema_data = {}
for row in cur.fetchall():
table_name, column_name, data_type, column_default, is_nullable = row
if table_name not in schema_data:
schema_data[table_name] = []
schema_data[table_name].append(
    {
        "column_name": column_name,
        "data_type": data_type,
        "column_default": column_default,
        "is_nullable": is_nullable,
    }
)
cur.close()
return {
@@ -108,16 +120,16 @@ class PostgresTool(Tool):
"schema": schema_data,
}
except psycopg.Error as e:
error_message = f"Database error: {e}"
print(f"Database error: {e}")
logger.error("PostgreSQL get_schema error: %s", e)
return {
"status_code": 500,
"message": "Failed to retrieve database schema.",
"error": error_message,
}
finally:
if conn:
conn.close()
def get_actions_metadata(self):
@@ -158,6 +170,10 @@ class PostgresTool(Tool):
return {
"token": {
"type": "string",
"description": "PostgreSQL database connection string (e.g., 'postgresql://user:password@host:port/dbname')",
"label": "Connection String",
"description": "PostgreSQL database connection string",
"required": True,
"secret": True,
"order": 1,
},
}
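Hypothetical usage of PostgresTool, assuming the constructor reads the connection string from the "token" config key described above (the DSN is a placeholder):
tool = PostgresTool({"token": "postgresql://user:password@localhost:5432/docsgpt"})
schema = tool.execute_action("postgres_get_schema", db_name="docsgpt")
rows = tool.execute_action(
    "postgres_execute_sql",
    sql_query="SELECT table_name FROM information_schema.tables LIMIT 5;",
)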

View File

@@ -1,6 +1,11 @@
import logging
import requests
from application.agents.tools.base import Tool
logger = logging.getLogger(__name__)
class TelegramTool(Tool):
"""
@@ -18,24 +23,22 @@ class TelegramTool(Tool):
"telegram_send_message": self._send_message,
"telegram_send_image": self._send_image,
}
if action_name not in actions:
    raise ValueError(f"Unknown action: {action_name}")
return actions[action_name](**kwargs)
def _send_message(self, text, chat_id):
print(f"Sending message: {text}")
logger.debug("Sending Telegram message to chat_id=%s", chat_id)
url = f"https://api.telegram.org/bot{self.token}/sendMessage"
payload = {"chat_id": chat_id, "text": text}
response = requests.post(url, data=payload, timeout=100)
return {"status_code": response.status_code, "message": "Message sent"}
def _send_image(self, image_url, chat_id):
print(f"Sending image: {image_url}")
logger.debug("Sending Telegram image to chat_id=%s", chat_id)
url = f"https://api.telegram.org/bot{self.token}/sendPhoto"
payload = {"chat_id": chat_id, "photo": image_url}
response = requests.post(url, data=payload, timeout=100)
return {"status_code": response.status_code, "message": "Image sent"}
def get_actions_metadata(self):
@@ -82,5 +85,12 @@ class TelegramTool(Tool):
def get_config_requirements(self):
return {
"token": {"type": "string", "description": "Bot token for authentication"},
"token": {
"type": "string",
"label": "Bot Token",
"description": "Telegram bot token for authentication",
"required": True,
"secret": True,
"order": 1,
},
}

View File

@@ -0,0 +1,70 @@
from application.agents.tools.base import Tool
THINK_TOOL_ID = "think"
THINK_TOOL_ENTRY = {
"name": "think",
"actions": [
{
"name": "reason",
"description": (
"Use this tool to think through your reasoning step by step "
"before deciding on your next action. Always reason before "
"searching or answering."
),
"active": True,
"parameters": {
"properties": {
"reasoning": {
"type": "string",
"description": "Your step-by-step reasoning and analysis",
"filled_by_llm": True,
"required": True,
}
}
},
}
],
}
class ThinkTool(Tool):
"""Pseudo-tool that captures chain-of-thought reasoning.
Returns a short acknowledgment so the LLM can continue.
The reasoning content is captured in tool_call data for transparency.
"""
internal = True
def __init__(self, config=None):
pass
def execute_action(self, action_name: str, **kwargs):
return "Continue."
def get_actions_metadata(self):
return [
{
"name": "reason",
"description": (
"Use this tool to think through your reasoning step by step "
"before deciding on your next action. Always reason before "
"searching or answering."
),
"parameters": {
"properties": {
"reasoning": {
"type": "string",
"description": "Your step-by-step reasoning and analysis",
"filled_by_llm": True,
"required": True,
}
}
},
}
]
def get_config_requirements(self):
return {}
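Because internal = True, ToolManager skips this tool during discovery; agents are expected to inject it explicitly (e.g. via THINK_TOOL_ENTRY) so the model's reasoning lands in tool_call data. A quick exercise of the pseudo-tool:
tool = ThinkTool()
ack = tool.execute_action("reason", reasoning="Check the docs index before answering.")
assert ack == "Continue."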

View File

@@ -5,8 +5,9 @@ logger = logging.getLogger(__name__)
class ToolActionParser:
def __init__(self, llm_type, name_mapping=None):
self.llm_type = llm_type
self.name_mapping = name_mapping
self.parsers = {
"OpenAILLM": self._parse_openai_llm,
"GoogleLLM": self._parse_google_llm,
@@ -16,22 +17,33 @@ class ToolActionParser:
parser = self.parsers.get(self.llm_type, self._parse_openai_llm)
return parser(call)
def _resolve_via_mapping(self, call_name):
"""Look up (tool_id, action_name) from the name mapping if available."""
if self.name_mapping and call_name in self.name_mapping:
return self.name_mapping[call_name]
return None
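The mapping is keyed by the exact tool-call name exposed to the LLM and resolves to a (tool_id, action_name) pair; an illustrative mapping and lookup (names and ids are placeholders, the real mapping is built by the agent and passed in):
name_mapping = {
    "telegram_send_message_12": ("12", "telegram_send_message"),
    "postgres_execute_sql_7": ("7", "postgres_execute_sql"),
}
parser = ToolActionParser("OpenAILLM", name_mapping=name_mapping)
assert parser._resolve_via_mapping("postgres_execute_sql_7") == ("7", "postgres_execute_sql")
assert parser._resolve_via_mapping("unknown_call") is None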
def _parse_openai_llm(self, call):
try:
call_args = json.loads(call.arguments)
resolved = self._resolve_via_mapping(call.name)
if resolved:
return resolved[0], resolved[1], call_args
# Fallback: legacy split on "_" for backward compatibility
tool_parts = call.name.split("_")
# If the tool name doesn't contain an underscore, it's likely a hallucinated tool
if len(tool_parts) < 2:
logger.warning(
f"Invalid tool name format: {call.name}. Expected format: action_name_tool_id"
f"Invalid tool name format: {call.name}. "
"Could not resolve via mapping or legacy parsing."
)
return None, None, None
tool_id = tool_parts[-1]
action_name = "_".join(tool_parts[:-1])
# Validate that tool_id looks like a numerical ID
if not tool_id.isdigit():
logger.warning(
f"Tool ID '{tool_id}' is not numerical. This might be a hallucinated tool call."
@@ -45,19 +57,24 @@ class ToolActionParser:
def _parse_google_llm(self, call):
try:
call_args = call.arguments
resolved = self._resolve_via_mapping(call.name)
if resolved:
return resolved[0], resolved[1], call_args
# Fallback: legacy split on "_" for backward compatibility
tool_parts = call.name.split("_")
# If the tool name doesn't contain an underscore, it's likely a hallucinated tool
if len(tool_parts) < 2:
logger.warning(
f"Invalid tool name format: {call.name}. Expected format: action_name_tool_id"
f"Invalid tool name format: {call.name}. "
"Could not resolve via mapping or legacy parsing."
)
return None, None, None
tool_id = tool_parts[-1]
action_name = "_".join(tool_parts[:-1])
# Validate that tool_id looks like a numerical ID
if not tool_id.isdigit():
logger.warning(
f"Tool ID '{tool_id}' is not numerical. This might be a hallucinated tool call."

View File

@@ -19,7 +19,7 @@ class ToolManager:
continue
module = importlib.import_module(f"application.agents.tools.{name}")
for member_name, obj in inspect.getmembers(module, inspect.isclass):
if issubclass(obj, Tool) and obj is not Tool and not obj.internal:
tool_config = self.config.get(name, {})
self.tools[name] = obj(tool_config)
@@ -36,7 +36,7 @@ class ToolManager:
def execute_action(self, tool_name, action_name, user_id=None, **kwargs):
if tool_name not in self.tools:
raise ValueError(f"Tool '{tool_name}' not loaded")
if tool_name in {"mcp_tool", "memory", "todo_list"} and user_id:
if tool_name in {"mcp_tool", "memory", "todo_list", "notes"} and user_id:
tool_config = self.config.get(tool_name, {})
tool = self.load_tool(tool_name, tool_config, user_id)
return tool.execute_action(action_name, **kwargs)

View File

@@ -15,6 +15,9 @@ from application.agents.workflows.workflow_engine import WorkflowEngine
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.logging import log_activity, LogContext
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.workflow_runs import WorkflowRunsRepository
from application.storage.db.repositories.workflows import WorkflowsRepository
logger = logging.getLogger(__name__)
@@ -181,6 +184,9 @@ class WorkflowAgent(BaseAgent):
def _save_workflow_run(self, query: str) -> None:
if not self._engine:
return
owner_id = self.workflow_owner
if not owner_id and isinstance(self.decoded_token, dict):
owner_id = self.decoded_token.get("sub")
try:
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
@@ -188,6 +194,7 @@ class WorkflowAgent(BaseAgent):
run = WorkflowRun(
workflow_id=self.workflow_id or "unknown",
user=owner_id,
status=self._determine_run_status(),
inputs={"query": query},
outputs=self._serialize_state(self._engine.state),
@@ -196,7 +203,34 @@ class WorkflowAgent(BaseAgent):
completed_at=datetime.now(timezone.utc),
)
result = workflow_runs_coll.insert_one(run.to_mongo_doc())
legacy_mongo_id = (
str(result.inserted_id)
if getattr(result, "inserted_id", None) is not None
else None
)
def _pg_write(repo: WorkflowRunsRepository) -> None:
if not self.workflow_id or not owner_id or not legacy_mongo_id:
return
workflow = WorkflowsRepository(repo._conn).get_by_legacy_id(
self.workflow_id, owner_id,
)
if workflow is None:
return
repo.create(
workflow["id"],
owner_id,
run.status.value,
inputs=run.inputs,
result=run.outputs,
steps=[step.model_dump(mode="json") for step in run.steps],
started_at=run.created_at,
ended_at=run.completed_at,
legacy_mongo_id=legacy_mongo_id,
)
dual_write(WorkflowRunsRepository, _pg_write)
except Exception as e:
logger.error(f"Failed to save workflow run: {e}")

View File

@@ -2,9 +2,10 @@
from typing import Any, Dict, List, Optional, Type
from application.agents.agentic_agent import AgenticAgent
from application.agents.base import BaseAgent
from application.agents.classic_agent import ClassicAgent
from application.agents.react_agent import ReActAgent
from application.agents.research_agent import ResearchAgent
from application.agents.workflows.schemas import AgentType
@@ -36,7 +37,8 @@ class ToolFilterMixin:
return filtered_tools
class _WorkflowNodeMixin:
    """Common __init__ for all workflow node agents."""
def __init__(
self,
@@ -57,32 +59,25 @@ class WorkflowNodeClassicAgent(ToolFilterMixin, ClassicAgent):
self._allowed_tool_ids = tool_ids or []
class WorkflowNodeClassicAgent(ToolFilterMixin, _WorkflowNodeMixin, ClassicAgent):
    pass
class WorkflowNodeAgenticAgent(ToolFilterMixin, _WorkflowNodeMixin, AgenticAgent):
pass
class WorkflowNodeResearchAgent(ToolFilterMixin, _WorkflowNodeMixin, ResearchAgent):
pass
class WorkflowNodeAgentFactory:
_agents: Dict[AgentType, Type[BaseAgent]] = {
AgentType.CLASSIC: WorkflowNodeClassicAgent,
AgentType.REACT: WorkflowNodeClassicAgent,  # backwards compat
AgentType.AGENTIC: WorkflowNodeAgenticAgent,
AgentType.RESEARCH: WorkflowNodeResearchAgent,
}
@classmethod

View File

@@ -18,6 +18,8 @@ class NodeType(str, Enum):
class AgentType(str, Enum):
CLASSIC = "classic"
REACT = "react"
AGENTIC = "agentic"
RESEARCH = "research"
class ExecutionStatus(str, Enum):
@@ -209,6 +211,7 @@ class WorkflowRun(BaseModel):
model_config = ConfigDict(extra="allow")
id: Optional[str] = Field(None, alias="_id")
workflow_id: str
user: Optional[str] = None
status: ExecutionStatus = ExecutionStatus.PENDING
inputs: Dict[str, str] = Field(default_factory=dict)
outputs: Dict[str, Any] = Field(default_factory=dict)
@@ -224,7 +227,7 @@ class WorkflowRun(BaseModel):
return v
def to_mongo_doc(self) -> Dict[str, Any]:
doc = {
"workflow_id": self.workflow_id,
"status": self.status.value,
"inputs": self.inputs,
@@ -233,3 +236,7 @@ class WorkflowRun(BaseModel):
"created_at": self.created_at,
"completed_at": self.completed_at,
}
if self.user:
doc["user"] = self.user
doc["user_id"] = self.user
return doc
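The owner is written under both user (the legacy field) and user_id so newer readers and the Postgres copy can key on the same value; illustratively, assuming the remaining WorkflowRun fields keep their defaults:
run = WorkflowRun(workflow_id="wf-1", user="alice", inputs={"query": "hello"})
doc = run.to_mongo_doc()
assert doc["user"] == "alice" and doc["user_id"] == "alice"
assert "user" not in WorkflowRun(workflow_id="wf-2").to_mongo_doc()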

View File

@@ -7,6 +7,7 @@ from application.agents.workflows.cel_evaluator import CelEvaluationError, evalu
from application.agents.workflows.node_agent import WorkflowNodeAgentFactory
from application.agents.workflows.schemas import (
AgentNodeConfig,
AgentType,
ConditionNodeConfig,
ExecutionStatus,
NodeExecutionLog,
@@ -18,6 +19,7 @@ from application.core.json_schema_utils import (
JsonSchemaValidationError,
normalize_json_schema_payload,
)
from application.error import sanitize_api_error
from application.templates.namespaces import NamespaceManager
from application.templates.template_engine import TemplateEngine, TemplateRenderError
@@ -99,6 +101,7 @@ class WorkflowEngine:
log_entry["state_snapshot"] = dict(self.state)
self.execution_log.append(log_entry)
user_friendly_error = sanitize_api_error(e)
yield {
"type": "workflow_step",
"node_id": node.id,
@@ -106,9 +109,9 @@ class WorkflowEngine:
"node_title": node.title,
"status": "failed",
"state_snapshot": dict(self.state),
"error": str(e),
"error": user_friendly_error,
}
yield {"type": "error", "error": str(e)}
yield {"type": "error", "error": user_friendly_error}
break
log_entry["state_snapshot"] = dict(self.state)
self.execution_log.append(log_entry)
@@ -221,18 +224,32 @@ class WorkflowEngine:
f'Model "{node_model_id}" does not support structured output for node "{node.title}"'
)
factory_kwargs = {
"agent_type": node_config.agent_type,
"endpoint": self.agent.endpoint,
"llm_name": node_llm_name,
"model_id": node_model_id,
"api_key": node_api_key,
"tool_ids": node_config.tools,
"prompt": node_config.system_prompt,
"chat_history": self.agent.chat_history,
"decoded_token": self.agent.decoded_token,
"json_schema": node_json_schema,
}
# Agentic/research agents need retriever_config for on-demand search
if node_config.agent_type in (AgentType.AGENTIC, AgentType.RESEARCH):
factory_kwargs["retriever_config"] = {
"source": {"active_docs": node_config.sources} if node_config.sources else {},
"retriever_name": node_config.retriever or "classic",
"chunks": int(node_config.chunks) if node_config.chunks else 2,
"model_id": node_model_id,
"llm_name": node_llm_name,
"api_key": node_api_key,
"decoded_token": self.agent.decoded_token,
}
node_agent = WorkflowNodeAgentFactory.create(**factory_kwargs)
full_response_parts: List[str] = []
structured_response_parts: List[str] = []

52
application/alembic.ini Normal file
View File

@@ -0,0 +1,52 @@
# Alembic configuration for the DocsGPT user-data Postgres database.
#
# The SQLAlchemy URL is deliberately NOT set here — env.py reads it from
# ``application.core.settings.settings.POSTGRES_URI`` so the same config
# source serves the running app and migrations. To run from the project
# root::
#
# alembic -c application/alembic.ini upgrade head
[alembic]
script_location = %(here)s/alembic
prepend_sys_path = ..
version_path_separator = os
# sqlalchemy.url is intentionally left blank — env.py supplies it.
sqlalchemy.url =
[post_write_hooks]
[loggers]
keys = root,sqlalchemy,alembic
[handlers]
keys = console
[formatters]
keys = generic
[logger_root]
level = WARNING
handlers = console
qualname =
[logger_sqlalchemy]
level = WARNING
handlers =
qualname = sqlalchemy.engine
[logger_alembic]
level = INFO
handlers =
qualname = alembic
[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic
[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S

View File

@@ -0,0 +1,82 @@
"""Alembic environment for the DocsGPT user-data Postgres database.
The URL is pulled from ``application.core.settings`` rather than
``alembic.ini`` so that a single ``POSTGRES_URI`` env var drives both the
running app and ``alembic`` CLI invocations.
"""
import sys
from logging.config import fileConfig
from pathlib import Path
# Make the project root importable regardless of cwd. env.py lives at
# <repo>/application/alembic/env.py, so parents[2] is the repo root.
_PROJECT_ROOT = Path(__file__).resolve().parents[2]
if str(_PROJECT_ROOT) not in sys.path:
sys.path.insert(0, str(_PROJECT_ROOT))
from alembic import context # noqa: E402
from sqlalchemy import engine_from_config, pool # noqa: E402
from application.core.settings import settings # noqa: E402
from application.storage.db.models import metadata as target_metadata # noqa: E402
config = context.config
# Populate the runtime URL from settings.
if settings.POSTGRES_URI:
config.set_main_option("sqlalchemy.url", settings.POSTGRES_URI)
if config.config_file_name is not None:
fileConfig(config.config_file_name)
def run_migrations_offline() -> None:
"""Run migrations in 'offline' mode (emits SQL without a live DB)."""
url = config.get_main_option("sqlalchemy.url")
if not url:
raise RuntimeError(
"POSTGRES_URI is not configured. Set it in your .env to a "
"psycopg3 URI such as "
"'postgresql+psycopg://user:pass@host:5432/docsgpt'."
)
context.configure(
url=url,
target_metadata=target_metadata,
literal_binds=True,
dialect_opts={"paramstyle": "named"},
compare_type=True,
)
with context.begin_transaction():
context.run_migrations()
def run_migrations_online() -> None:
"""Run migrations in 'online' mode against a live connection."""
if not config.get_main_option("sqlalchemy.url"):
raise RuntimeError(
"POSTGRES_URI is not configured. Set it in your .env to a "
"psycopg3 URI such as "
"'postgresql+psycopg://user:pass@host:5432/docsgpt'."
)
connectable = engine_from_config(
config.get_section(config.config_ini_section, {}),
prefix="sqlalchemy.",
poolclass=pool.NullPool,
future=True,
)
with connectable.connect() as connection:
context.configure(
connection=connection,
target_metadata=target_metadata,
compare_type=True,
)
with context.begin_transaction():
context.run_migrations()
if context.is_offline_mode():
run_migrations_offline()
else:
run_migrations_online()
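Because env.py resolves the URL from settings, migrations can also be driven programmatically with the same config file; a small sketch assuming POSTGRES_URI is already set in the environment:
from alembic import command
from alembic.config import Config

# Equivalent to: alembic -c application/alembic.ini upgrade head
cfg = Config("application/alembic.ini")
command.upgrade(cfg, "head")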

View File

@@ -0,0 +1,26 @@
"""${message}
Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
${imports if imports else ""}
# revision identifiers, used by Alembic.
revision: str = ${repr(up_revision)}
down_revision: Union[str, None] = ${repr(down_revision)}
branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}
def upgrade() -> None:
${upgrades if upgrades else "pass"}
def downgrade() -> None:
${downgrades if downgrades else "pass"}

View File

@@ -0,0 +1,825 @@
"""0001 initial schema — consolidated Phase-1..3 baseline.
Revision ID: 0001_initial
Revises:
Create Date: 2026-04-13
"""
from typing import Sequence, Union
from alembic import op
revision: str = "0001_initial"
down_revision: Union[str, None] = None
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ------------------------------------------------------------------
# Extensions
# ------------------------------------------------------------------
op.execute('CREATE EXTENSION IF NOT EXISTS "pgcrypto";')
op.execute('CREATE EXTENSION IF NOT EXISTS "citext";')
# ------------------------------------------------------------------
# Trigger functions
# ------------------------------------------------------------------
op.execute(
"""
CREATE FUNCTION set_updated_at() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
NEW.updated_at = now();
RETURN NEW;
END;
$$;
"""
)
op.execute(
"""
CREATE FUNCTION ensure_user_exists() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
IF NEW.user_id IS NOT NULL THEN
INSERT INTO users (user_id) VALUES (NEW.user_id)
ON CONFLICT (user_id) DO NOTHING;
END IF;
RETURN NEW;
END;
$$;
"""
)
op.execute(
"""
CREATE FUNCTION cleanup_message_attachment_refs() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
UPDATE conversation_messages
SET attachments = array_remove(attachments, OLD.id)
WHERE OLD.id = ANY(attachments);
RETURN OLD;
END;
$$;
"""
)
op.execute(
"""
CREATE FUNCTION cleanup_agent_extra_source_refs() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
UPDATE agents
SET extra_source_ids = array_remove(extra_source_ids, OLD.id)
WHERE OLD.id = ANY(extra_source_ids);
RETURN OLD;
END;
$$;
"""
)
op.execute(
"""
CREATE FUNCTION cleanup_user_agent_prefs() RETURNS trigger
LANGUAGE plpgsql AS $$
DECLARE
agent_id_text text := OLD.id::text;
BEGIN
UPDATE users
SET agent_preferences = jsonb_set(
jsonb_set(
agent_preferences,
'{pinned}',
COALESCE((
SELECT jsonb_agg(e)
FROM jsonb_array_elements(
COALESCE(agent_preferences->'pinned', '[]'::jsonb)
) e
WHERE (e #>> '{}') <> agent_id_text
), '[]'::jsonb)
),
'{shared_with_me}',
COALESCE((
SELECT jsonb_agg(e)
FROM jsonb_array_elements(
COALESCE(agent_preferences->'shared_with_me', '[]'::jsonb)
) e
WHERE (e #>> '{}') <> agent_id_text
), '[]'::jsonb)
)
WHERE agent_preferences->'pinned' @> to_jsonb(agent_id_text)
OR agent_preferences->'shared_with_me' @> to_jsonb(agent_id_text);
RETURN OLD;
END;
$$;
"""
)
op.execute(
"""
CREATE FUNCTION conversation_messages_fill_user_id() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
IF NEW.user_id IS NULL THEN
SELECT user_id INTO NEW.user_id
FROM conversations
WHERE id = NEW.conversation_id;
END IF;
RETURN NEW;
END;
$$;
"""
)
# ------------------------------------------------------------------
# Tables
# ------------------------------------------------------------------
op.execute(
"""
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL UNIQUE,
agent_preferences JSONB NOT NULL
DEFAULT '{"pinned": [], "shared_with_me": []}'::jsonb,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE prompts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
name TEXT NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
legacy_mongo_id TEXT
);
"""
)
op.execute(
"""
CREATE TABLE user_tools (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
name TEXT NOT NULL,
custom_name TEXT,
display_name TEXT,
config JSONB NOT NULL DEFAULT '{}'::jsonb,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE token_usage (
id BIGSERIAL PRIMARY KEY,
user_id TEXT,
api_key TEXT,
agent_id UUID,
prompt_tokens INTEGER NOT NULL DEFAULT 0,
generated_tokens INTEGER NOT NULL DEFAULT 0,
timestamp TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"ALTER TABLE token_usage ADD CONSTRAINT token_usage_attribution_chk "
"CHECK (user_id IS NOT NULL OR api_key IS NOT NULL) NOT VALID;"
)
op.execute(
"""
CREATE TABLE user_logs (
id BIGSERIAL PRIMARY KEY,
user_id TEXT,
endpoint TEXT,
timestamp TIMESTAMPTZ NOT NULL DEFAULT now(),
data JSONB
);
"""
)
op.execute(
"""
CREATE TABLE stack_logs (
id BIGSERIAL PRIMARY KEY,
activity_id TEXT NOT NULL,
endpoint TEXT,
level TEXT,
user_id TEXT,
api_key TEXT,
query TEXT,
stacks JSONB NOT NULL DEFAULT '[]'::jsonb,
timestamp TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE agent_folders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
name TEXT NOT NULL,
description TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE sources (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
name TEXT NOT NULL,
type TEXT,
metadata JSONB NOT NULL DEFAULT '{}'::jsonb,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE agents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
name TEXT NOT NULL,
description TEXT,
agent_type TEXT,
status TEXT NOT NULL,
key CITEXT UNIQUE,
source_id UUID REFERENCES sources(id) ON DELETE SET NULL,
extra_source_ids UUID[] NOT NULL DEFAULT '{}',
chunks INTEGER,
retriever TEXT,
prompt_id UUID REFERENCES prompts(id) ON DELETE SET NULL,
tools JSONB NOT NULL DEFAULT '[]'::jsonb,
json_schema JSONB,
models JSONB,
default_model_id TEXT,
folder_id UUID REFERENCES agent_folders(id) ON DELETE SET NULL,
limited_token_mode BOOLEAN NOT NULL DEFAULT false,
token_limit INTEGER,
limited_request_mode BOOLEAN NOT NULL DEFAULT false,
request_limit INTEGER,
shared BOOLEAN NOT NULL DEFAULT false,
incoming_webhook_token CITEXT UNIQUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
last_used_at TIMESTAMPTZ,
legacy_mongo_id TEXT
);
"""
)
op.execute(
"ALTER TABLE token_usage ADD CONSTRAINT token_usage_agent_fk "
"FOREIGN KEY (agent_id) REFERENCES agents(id) ON DELETE SET NULL;"
)
op.execute(
"""
CREATE TABLE attachments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
filename TEXT NOT NULL,
upload_path TEXT NOT NULL,
mime_type TEXT,
size BIGINT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
legacy_mongo_id TEXT
);
"""
)
op.execute(
"""
CREATE TABLE memories (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
tool_id UUID REFERENCES user_tools(id) ON DELETE CASCADE,
path TEXT NOT NULL,
content TEXT NOT NULL,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE todos (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
tool_id UUID REFERENCES user_tools(id) ON DELETE CASCADE,
title TEXT NOT NULL,
completed BOOLEAN NOT NULL DEFAULT false,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE notes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
tool_id UUID REFERENCES user_tools(id) ON DELETE CASCADE,
title TEXT NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE connector_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
provider TEXT NOT NULL,
session_data JSONB NOT NULL,
expires_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE conversations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
agent_id UUID REFERENCES agents(id) ON DELETE SET NULL,
name TEXT,
api_key TEXT,
is_shared_usage BOOLEAN NOT NULL DEFAULT false,
shared_token TEXT,
date TIMESTAMPTZ NOT NULL DEFAULT now(),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
shared_with TEXT[] NOT NULL DEFAULT '{}'::text[],
compression_metadata JSONB,
legacy_mongo_id TEXT,
CONSTRAINT conversations_api_key_nonempty_chk
CHECK (api_key IS NULL OR api_key <> '')
);
"""
)
op.execute(
"""
CREATE TABLE conversation_messages (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
conversation_id UUID NOT NULL REFERENCES conversations(id) ON DELETE CASCADE,
position INTEGER NOT NULL,
prompt TEXT,
response TEXT,
thought TEXT,
sources JSONB NOT NULL DEFAULT '[]'::jsonb,
tool_calls JSONB NOT NULL DEFAULT '[]'::jsonb,
attachments UUID[] NOT NULL DEFAULT '{}'::uuid[],
model_id TEXT,
message_metadata JSONB NOT NULL DEFAULT '{}'::jsonb,
feedback JSONB,
timestamp TIMESTAMPTZ NOT NULL DEFAULT now(),
user_id TEXT NOT NULL,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""
)
op.execute(
"""
CREATE TABLE shared_conversations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
conversation_id UUID NOT NULL REFERENCES conversations(id) ON DELETE CASCADE,
user_id TEXT NOT NULL,
is_promptable BOOLEAN NOT NULL DEFAULT false,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
uuid UUID NOT NULL,
first_n_queries INTEGER NOT NULL DEFAULT 0,
api_key TEXT,
prompt_id UUID REFERENCES prompts(id) ON DELETE SET NULL,
chunks INTEGER,
CONSTRAINT shared_conversations_api_key_nonempty_chk
CHECK (api_key IS NULL OR api_key <> '')
);
"""
)
op.execute(
"""
CREATE TABLE pending_tool_state (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
conversation_id UUID NOT NULL REFERENCES conversations(id) ON DELETE CASCADE,
user_id TEXT NOT NULL,
messages JSONB NOT NULL,
pending_tool_calls JSONB NOT NULL,
tools_dict JSONB NOT NULL,
tool_schemas JSONB NOT NULL,
agent_config JSONB NOT NULL,
client_tools JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
expires_at TIMESTAMPTZ NOT NULL
);
"""
)
op.execute(
"""
CREATE TABLE workflows (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
name TEXT NOT NULL,
description TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
current_graph_version INTEGER NOT NULL DEFAULT 1,
legacy_mongo_id TEXT
);
"""
)
op.execute(
"""
CREATE TABLE workflow_nodes (
id UUID DEFAULT gen_random_uuid() NOT NULL,
workflow_id UUID NOT NULL REFERENCES workflows(id) ON DELETE CASCADE,
graph_version INTEGER NOT NULL,
node_type TEXT NOT NULL,
config JSONB NOT NULL DEFAULT '{}'::jsonb,
node_id TEXT NOT NULL,
title TEXT,
description TEXT,
position JSONB NOT NULL DEFAULT '{"x": 0, "y": 0}'::jsonb,
legacy_mongo_id TEXT,
PRIMARY KEY (id),
CONSTRAINT workflow_nodes_id_wf_ver_key
UNIQUE (id, workflow_id, graph_version)
);
"""
)
op.execute(
"""
CREATE TABLE workflow_edges (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workflow_id UUID NOT NULL REFERENCES workflows(id) ON DELETE CASCADE,
graph_version INTEGER NOT NULL,
from_node_id UUID NOT NULL,
to_node_id UUID NOT NULL,
config JSONB NOT NULL DEFAULT '{}'::jsonb,
edge_id TEXT NOT NULL,
source_handle TEXT,
target_handle TEXT,
CONSTRAINT workflow_edges_from_node_fk
FOREIGN KEY (from_node_id, workflow_id, graph_version)
REFERENCES workflow_nodes(id, workflow_id, graph_version) ON DELETE CASCADE,
CONSTRAINT workflow_edges_to_node_fk
FOREIGN KEY (to_node_id, workflow_id, graph_version)
REFERENCES workflow_nodes(id, workflow_id, graph_version) ON DELETE CASCADE
);
"""
)
op.execute(
"""
CREATE TABLE workflow_runs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workflow_id UUID NOT NULL REFERENCES workflows(id) ON DELETE CASCADE,
user_id TEXT NOT NULL,
status TEXT NOT NULL,
started_at TIMESTAMPTZ NOT NULL DEFAULT now(),
ended_at TIMESTAMPTZ,
result JSONB,
inputs JSONB,
steps JSONB NOT NULL DEFAULT '[]'::jsonb,
legacy_mongo_id TEXT,
CONSTRAINT workflow_runs_status_chk
CHECK (status IN ('pending', 'running', 'completed', 'failed'))
);
"""
)
# ------------------------------------------------------------------
# Indexes
# ------------------------------------------------------------------
op.execute("CREATE INDEX agent_folders_user_idx ON agent_folders (user_id);")
op.execute("CREATE INDEX agents_user_idx ON agents (user_id);")
op.execute("CREATE INDEX agents_shared_idx ON agents (shared) WHERE shared = true;")
op.execute("CREATE INDEX agents_status_idx ON agents (status);")
op.execute("CREATE INDEX agents_source_id_idx ON agents (source_id);")
op.execute("CREATE INDEX agents_prompt_id_idx ON agents (prompt_id);")
op.execute("CREATE INDEX agents_folder_id_idx ON agents (folder_id);")
op.execute(
"CREATE UNIQUE INDEX agents_legacy_mongo_id_uidx "
"ON agents (legacy_mongo_id) WHERE legacy_mongo_id IS NOT NULL;"
)
op.execute("CREATE INDEX attachments_user_idx ON attachments (user_id);")
op.execute(
"CREATE UNIQUE INDEX attachments_legacy_mongo_id_uidx "
"ON attachments (legacy_mongo_id) WHERE legacy_mongo_id IS NOT NULL;"
)
op.execute(
"CREATE UNIQUE INDEX connector_sessions_user_provider_uidx "
"ON connector_sessions (user_id, provider);"
)
op.execute(
"CREATE INDEX connector_sessions_expiry_idx "
"ON connector_sessions (expires_at) WHERE expires_at IS NOT NULL;"
)
op.execute(
"CREATE UNIQUE INDEX conversation_messages_conv_pos_uidx "
"ON conversation_messages (conversation_id, position);"
)
op.execute(
"CREATE INDEX conversation_messages_user_ts_idx "
"ON conversation_messages (user_id, timestamp DESC);"
)
op.execute("CREATE INDEX conversations_user_date_idx ON conversations (user_id, date DESC);")
op.execute("CREATE INDEX conversations_agent_idx ON conversations (agent_id);")
op.execute(
"CREATE UNIQUE INDEX conversations_shared_token_uidx "
"ON conversations (shared_token) WHERE shared_token IS NOT NULL;"
)
op.execute(
"CREATE INDEX conversations_api_key_date_idx "
"ON conversations (api_key, date DESC) WHERE api_key IS NOT NULL;"
)
op.execute(
"CREATE UNIQUE INDEX conversations_legacy_mongo_id_uidx "
"ON conversations (legacy_mongo_id) WHERE legacy_mongo_id IS NOT NULL;"
)
op.execute(
"CREATE UNIQUE INDEX memories_user_tool_path_uidx "
"ON memories (user_id, tool_id, path);"
)
op.execute(
"CREATE UNIQUE INDEX memories_user_path_null_tool_uidx "
"ON memories (user_id, path) WHERE tool_id IS NULL;"
)
op.execute(
"CREATE INDEX memories_path_prefix_idx "
"ON memories (user_id, tool_id, path text_pattern_ops);"
)
op.execute("CREATE INDEX memories_tool_id_idx ON memories (tool_id);")
op.execute("CREATE UNIQUE INDEX notes_user_tool_uidx ON notes (user_id, tool_id);")
op.execute("CREATE INDEX notes_tool_id_idx ON notes (tool_id);")
op.execute(
"CREATE UNIQUE INDEX pending_tool_state_conv_user_uidx "
"ON pending_tool_state (conversation_id, user_id);"
)
op.execute(
"CREATE INDEX pending_tool_state_expires_idx ON pending_tool_state (expires_at);"
)
op.execute("CREATE INDEX prompts_user_id_idx ON prompts (user_id);")
op.execute(
"CREATE UNIQUE INDEX prompts_legacy_mongo_id_uidx "
"ON prompts (legacy_mongo_id) WHERE legacy_mongo_id IS NOT NULL;"
)
op.execute("CREATE INDEX shared_conversations_user_idx ON shared_conversations (user_id);")
op.execute("CREATE INDEX shared_conversations_conv_idx ON shared_conversations (conversation_id);")
op.execute(
"CREATE INDEX shared_conversations_prompt_id_idx ON shared_conversations (prompt_id);"
)
op.execute(
"CREATE UNIQUE INDEX shared_conversations_uuid_uidx ON shared_conversations (uuid);"
)
op.execute(
"CREATE UNIQUE INDEX shared_conversations_dedup_uidx "
"ON shared_conversations (conversation_id, user_id, is_promptable, first_n_queries, COALESCE(api_key, ''));"
)
op.execute("CREATE INDEX sources_user_idx ON sources (user_id);")
op.execute('CREATE INDEX stack_logs_timestamp_idx ON stack_logs ("timestamp" DESC);')
op.execute('CREATE INDEX stack_logs_user_ts_idx ON stack_logs (user_id, "timestamp" DESC);')
op.execute('CREATE INDEX stack_logs_level_ts_idx ON stack_logs (level, "timestamp" DESC);')
op.execute("CREATE INDEX stack_logs_activity_idx ON stack_logs (activity_id);")
op.execute("CREATE INDEX todos_user_tool_idx ON todos (user_id, tool_id);")
op.execute("CREATE INDEX todos_tool_id_idx ON todos (tool_id);")
op.execute('CREATE INDEX token_usage_user_ts_idx ON token_usage (user_id, "timestamp" DESC);')
op.execute('CREATE INDEX token_usage_key_ts_idx ON token_usage (api_key, "timestamp" DESC);')
op.execute('CREATE INDEX token_usage_agent_ts_idx ON token_usage (agent_id, "timestamp" DESC);')
op.execute('CREATE INDEX user_logs_user_ts_idx ON user_logs (user_id, "timestamp" DESC);')
op.execute("CREATE INDEX user_tools_user_id_idx ON user_tools (user_id);")
op.execute("CREATE INDEX workflow_edges_from_node_idx ON workflow_edges (from_node_id);")
op.execute("CREATE INDEX workflow_edges_to_node_idx ON workflow_edges (to_node_id);")
op.execute(
"CREATE UNIQUE INDEX workflow_edges_wf_ver_eid_uidx "
"ON workflow_edges (workflow_id, graph_version, edge_id);"
)
op.execute(
"CREATE UNIQUE INDEX workflow_nodes_wf_ver_nid_uidx "
"ON workflow_nodes (workflow_id, graph_version, node_id);"
)
op.execute(
"CREATE UNIQUE INDEX workflow_nodes_legacy_mongo_id_uidx "
"ON workflow_nodes (legacy_mongo_id) WHERE legacy_mongo_id IS NOT NULL;"
)
op.execute("CREATE INDEX workflow_runs_workflow_idx ON workflow_runs (workflow_id);")
op.execute("CREATE INDEX workflow_runs_user_idx ON workflow_runs (user_id);")
op.execute(
"CREATE INDEX workflow_runs_status_started_idx "
"ON workflow_runs (status, started_at DESC);"
)
op.execute(
"CREATE UNIQUE INDEX workflow_runs_legacy_mongo_id_uidx "
"ON workflow_runs (legacy_mongo_id) WHERE legacy_mongo_id IS NOT NULL;"
)
op.execute("CREATE INDEX workflows_user_idx ON workflows (user_id);")
op.execute(
"CREATE UNIQUE INDEX workflows_legacy_mongo_id_uidx "
"ON workflows (legacy_mongo_id) WHERE legacy_mongo_id IS NOT NULL;"
)
# ------------------------------------------------------------------
# user_id foreign keys (deferrable so backfills can stage rows)
# ------------------------------------------------------------------
user_fk_tables = (
"agent_folders",
"agents",
"attachments",
"connector_sessions",
"conversation_messages",
"conversations",
"memories",
"notes",
"pending_tool_state",
"prompts",
"shared_conversations",
"sources",
"stack_logs",
"todos",
"token_usage",
"user_logs",
"user_tools",
"workflow_runs",
"workflows",
)
for table in user_fk_tables:
op.execute(
f"ALTER TABLE {table} "
f"ADD CONSTRAINT {table}_user_id_fk "
f"FOREIGN KEY (user_id) REFERENCES users(user_id) "
f"ON DELETE RESTRICT DEFERRABLE INITIALLY IMMEDIATE;"
)
# ------------------------------------------------------------------
# Triggers
# ------------------------------------------------------------------
updated_at_tables = (
"agent_folders",
"agents",
"conversation_messages",
"conversations",
"memories",
"notes",
"prompts",
"sources",
"todos",
"user_tools",
"users",
"workflows",
)
for table in updated_at_tables:
op.execute(
f"CREATE TRIGGER {table}_set_updated_at "
f"BEFORE UPDATE ON {table} "
f"FOR EACH ROW WHEN (OLD.* IS DISTINCT FROM NEW.*) "
f"EXECUTE FUNCTION set_updated_at();"
)
ensure_user_tables = (
"agent_folders",
"agents",
"attachments",
"connector_sessions",
"conversation_messages",
"conversations",
"memories",
"notes",
"pending_tool_state",
"prompts",
"shared_conversations",
"sources",
"stack_logs",
"todos",
"token_usage",
"user_logs",
"user_tools",
"workflow_runs",
"workflows",
)
for table in ensure_user_tables:
op.execute(
f"CREATE TRIGGER {table}_ensure_user "
f"BEFORE INSERT OR UPDATE OF user_id ON {table} "
f"FOR EACH ROW EXECUTE FUNCTION ensure_user_exists();"
)
op.execute(
"CREATE TRIGGER conversation_messages_fill_user "
"BEFORE INSERT ON conversation_messages "
"FOR EACH ROW EXECUTE FUNCTION conversation_messages_fill_user_id();"
)
op.execute(
"CREATE TRIGGER attachments_cleanup_message_refs "
"AFTER DELETE ON attachments "
"FOR EACH ROW EXECUTE FUNCTION cleanup_message_attachment_refs();"
)
op.execute(
"CREATE TRIGGER agents_cleanup_user_prefs "
"AFTER DELETE ON agents "
"FOR EACH ROW EXECUTE FUNCTION cleanup_user_agent_prefs();"
)
op.execute(
"CREATE TRIGGER sources_cleanup_agent_extra_refs "
"AFTER DELETE ON sources "
"FOR EACH ROW EXECUTE FUNCTION cleanup_agent_extra_source_refs();"
)
# ------------------------------------------------------------------
# Seed sentinel __system__ user (system/template sources attribute here)
# ------------------------------------------------------------------
op.execute(
"INSERT INTO users (user_id) VALUES ('__system__') "
"ON CONFLICT (user_id) DO NOTHING;"
)
def downgrade() -> None:
# Nuclear downgrade: drop everything this migration created. The
# ordering drops FK-bearing children before parents; CASCADE would
# also work but explicit ordering is easier to reason about in code
# review.
tables_in_drop_order = (
"workflow_edges",
"workflow_runs",
"workflow_nodes",
"workflows",
"pending_tool_state",
"shared_conversations",
"conversation_messages",
"conversations",
"connector_sessions",
"notes",
"todos",
"memories",
"attachments",
"agents",
"sources",
"agent_folders",
"stack_logs",
"user_logs",
"token_usage",
"user_tools",
"prompts",
"users",
)
for table in tables_in_drop_order:
op.execute(f"DROP TABLE IF EXISTS {table} CASCADE;")
for fn in (
"conversation_messages_fill_user_id",
"cleanup_user_agent_prefs",
"cleanup_agent_extra_source_refs",
"cleanup_message_attachment_refs",
"ensure_user_exists",
"set_updated_at",
):
op.execute(f"DROP FUNCTION IF EXISTS {fn}();")

View File

@@ -74,68 +74,76 @@ class AnswerResource(Resource, BaseAnswerResource):
decoded_token = getattr(request, "decoded_token", None)
processor = StreamProcessor(data, decoded_token)
try:
processor.initialize()
if not processor.decoded_token:
return make_response({"error": "Unauthorized"}, 401)
# ---- Continuation mode ----
if data.get("tool_actions"):
(
agent,
messages,
tools_dict,
pending_tool_calls,
tool_actions,
) = processor.resume_from_tool_actions(
data["tool_actions"], data["conversation_id"]
)
if not processor.decoded_token:
return make_response({"error": "Unauthorized"}, 401)
if error := self.check_usage(processor.agent_config):
return error
stream = self.complete_stream(
question="",
agent=agent,
conversation_id=processor.conversation_id,
user_api_key=processor.agent_config.get("user_api_key"),
decoded_token=processor.decoded_token,
agent_id=processor.agent_id,
model_id=processor.model_id,
_continuation={
"messages": messages,
"tools_dict": tools_dict,
"pending_tool_calls": pending_tool_calls,
"tool_actions": tool_actions,
},
)
else:
# ---- Normal mode ----
agent = processor.build_agent(data.get("question", ""))
if not processor.decoded_token:
return make_response({"error": "Unauthorized"}, 401)
docs_together, docs_list = processor.pre_fetch_docs(
data.get("question", "")
)
tools_data = processor.pre_fetch_tools()
if error := self.check_usage(processor.agent_config):
return error
agent = processor.create_agent(
docs_together=docs_together,
docs=docs_list,
tools_data=tools_data,
)
stream = self.complete_stream(
question=data["question"],
agent=agent,
conversation_id=processor.conversation_id,
user_api_key=processor.agent_config.get("user_api_key"),
decoded_token=processor.decoded_token,
isNoneDoc=data.get("isNoneDoc"),
index=None,
should_save_conversation=data.get("save_conversation", True),
agent_id=processor.agent_id,
is_shared_usage=processor.is_shared_usage,
shared_token=processor.shared_token,
model_id=processor.model_id,
)
stream_result = self.process_response_stream(stream)
if stream_result["error"]:
return make_response({"error": stream_result["error"]}, 400)
result = {
"conversation_id": stream_result["conversation_id"],
"answer": stream_result["answer"],
"sources": stream_result["sources"],
"tool_calls": stream_result["tool_calls"],
"thought": stream_result["thought"],
}
extra_info = stream_result.get("extra")
if extra_info:
result.update(extra_info)
except Exception as e:
logger.error(
f"/api/answer - error: {str(e)} - traceback: {traceback.format_exc()}",

View File

@@ -6,6 +6,7 @@ from typing import Any, Dict, Generator, List, Optional
from flask import jsonify, make_response, Response
from flask_restx import Namespace
from application.api.answer.services.continuation_service import ContinuationService
from application.api.answer.services.conversation_service import ConversationService
from application.core.model_utils import (
get_api_key_for_provider,
@@ -15,6 +16,7 @@ from application.core.model_utils import (
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.error import sanitize_api_error
from application.llm.llm_creator import LLMCreator
from application.utils import check_required_fields
@@ -38,7 +40,16 @@ class BaseAnswerResource:
def validate_request(
self, data: Dict[str, Any], require_conversation_id: bool = False
) -> Optional[Response]:
"""Common request validation.
Continuation requests (``tool_actions`` present) require
``conversation_id`` but not ``question``.
"""
if data.get("tool_actions"):
# Continuation mode — question is not required
if missing := check_required_fields(data, ["conversation_id"]):
return missing
return None
required_fields = ["question"]
if require_conversation_id:
required_fields.append("conversation_id")
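For clarity, the two request shapes this validation accepts look roughly like the following; the field values, and the exact shape of a tool_actions entry, are illustrative assumptions:
normal_request = {
    "question": "What changed in the last release?",
    # conversation_id is only required when require_conversation_id=True
}

continuation_request = {
    "conversation_id": "68a1f0c2e4b0a1d2c3e4f5a6",
    # no "question" needed; the presence of tool_actions marks a continuation
    "tool_actions": [
        {"call_id": "call_0", "action": "approve"},
    ],
}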
@@ -176,6 +187,7 @@ class BaseAnswerResource:
is_shared_usage: bool = False,
shared_token: Optional[str] = None,
model_id: Optional[str] = None,
_continuation: Optional[Dict] = None,
) -> Generator[str, None, None]:
"""
Generator function that streams the complete conversation response.
@@ -205,9 +217,23 @@ class BaseAnswerResource:
is_structured = False
schema_info = None
structured_chunks = []
query_metadata = {}
paused = False
if _continuation:
gen_iter = agent.gen_continuation(
messages=_continuation["messages"],
tools_dict=_continuation["tools_dict"],
pending_tool_calls=_continuation["pending_tool_calls"],
tool_actions=_continuation["tool_actions"],
)
else:
gen_iter = agent.gen(query=question)
for line in gen_iter:
if "metadata" in line:
query_metadata.update(line["metadata"])
elif "answer" in line:
response_full += str(line["answer"])
if line.get("structured"):
is_structured = True
@@ -240,8 +266,21 @@ class BaseAnswerResource:
data = json.dumps({"type": "thought", "thought": line["thought"]})
yield f"data: {data}\n\n"
elif "type" in line:
if line.get("type") == "tool_calls_pending":
# Save continuation state and end the stream
paused = True
data = json.dumps(line)
yield f"data: {data}\n\n"
elif line.get("type") == "error":
sanitized_error = {
"type": "error",
"error": sanitize_api_error(line.get("error", "An error occurred"))
}
data = json.dumps(sanitized_error)
yield f"data: {data}\n\n"
else:
data = json.dumps(line)
yield f"data: {data}\n\n"
if is_structured and structured_chunks:
structured_data = {
"type": "structured_answer",
@@ -251,6 +290,93 @@ class BaseAnswerResource:
}
data = json.dumps(structured_data)
yield f"data: {data}\n\n"
# ---- Paused: save continuation state and end stream early ----
if paused:
continuation = getattr(agent, "_pending_continuation", None)
if continuation:
# Ensure we have a conversation_id — create a partial
# conversation if this is the first turn.
if not conversation_id and should_save_conversation:
try:
provider = (
get_provider_from_model_id(model_id)
if model_id
else settings.LLM_PROVIDER
)
sys_api_key = get_api_key_for_provider(
provider or settings.LLM_PROVIDER
)
llm = LLMCreator.create_llm(
provider or settings.LLM_PROVIDER,
api_key=sys_api_key,
user_api_key=user_api_key,
decoded_token=decoded_token,
model_id=model_id,
agent_id=agent_id,
)
conversation_id = (
self.conversation_service.save_conversation(
None,
question,
response_full,
thought,
source_log_docs,
tool_calls,
llm,
model_id or self.default_model_id,
decoded_token,
api_key=user_api_key,
agent_id=agent_id,
is_shared_usage=is_shared_usage,
shared_token=shared_token,
)
)
except Exception as e:
logger.error(
f"Failed to create conversation for continuation: {e}",
exc_info=True,
)
if conversation_id:
try:
cont_service = ContinuationService()
cont_service.save_state(
conversation_id=str(conversation_id),
user=decoded_token.get("sub", "local"),
messages=continuation["messages"],
pending_tool_calls=continuation["pending_tool_calls"],
tools_dict=continuation["tools_dict"],
tool_schemas=getattr(agent, "tools", []),
agent_config={
"model_id": model_id or self.default_model_id,
"llm_name": getattr(agent, "llm_name", settings.LLM_PROVIDER),
"api_key": getattr(agent, "api_key", None),
"user_api_key": user_api_key,
"agent_id": agent_id,
"agent_type": agent.__class__.__name__,
"prompt": getattr(agent, "prompt", ""),
"json_schema": getattr(agent, "json_schema", None),
"retriever_config": getattr(agent, "retriever_config", None),
},
client_tools=getattr(
agent.tool_executor, "client_tools", None
),
)
except Exception as e:
logger.error(
f"Failed to save continuation state: {str(e)}",
exc_info=True,
)
id_data = {"type": "id", "id": str(conversation_id)}
data = json.dumps(id_data)
yield f"data: {data}\n\n"
data = json.dumps({"type": "end"})
yield f"data: {data}\n\n"
return
if isNoneDoc:
for doc in source_log_docs:
doc["source"] = "None"
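End to end, a client drives this pause/resume loop roughly as follows. This is an illustrative sketch: the stream URL, auth headers, and the exact tool_actions entry format are assumptions, while the SSE event names ("id", "tool_calls_pending") come from the code above.
import json
import requests

def stream_until_pause(stream_url, payload, headers):
    """Consume the SSE stream; return (conversation_id, pending event) if paused."""
    conversation_id = payload.get("conversation_id")
    pending = None
    with requests.post(stream_url, json=payload, headers=headers, stream=True) as resp:
        for raw in resp.iter_lines():
            if not raw or not raw.startswith(b"data: "):
                continue
            event = json.loads(raw[len(b"data: "):])
            if event.get("type") == "id":
                conversation_id = event["id"]
            elif event.get("type") == "tool_calls_pending":
                pending = event
    return conversation_id, pending

def resume(stream_url, conversation_id, pending, headers):
    calls = pending.get("data", {}).get("pending_tool_calls", [])
    payload = {
        "conversation_id": conversation_id,
        # tool_actions entry shape is assumed for illustration
        "tool_actions": [{"call_id": c.get("call_id"), "action": "approve"} for c in calls],
    }
    return requests.post(stream_url, json=payload, headers=headers, stream=True)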
@@ -287,6 +413,7 @@ class BaseAnswerResource:
is_shared_usage=is_shared_usage,
shared_token=shared_token,
attachment_ids=attachment_ids,
metadata=query_metadata if query_metadata else None,
)
# Persist compression metadata/summary if it exists and wasn't saved mid-execution
compression_meta = getattr(agent, "compression_metadata", None)
@@ -342,6 +469,18 @@ class BaseAnswerResource:
log_data[key] = value[:10000]
self.user_logs_collection.insert_one(log_data)
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.user_logs import UserLogsRepository
dual_write(
UserLogsRepository,
lambda repo, d=log_data: repo.insert(
user_id=d.get("user"),
endpoint="stream_answer",
data=d,
),
)
data = json.dumps({"type": "end"})
yield f"data: {data}\n\n"
except GeneratorExit:
@@ -376,6 +515,7 @@ class BaseAnswerResource:
is_shared_usage=is_shared_usage,
shared_token=shared_token,
attachment_ids=attachment_ids,
metadata=query_metadata if query_metadata else None,
)
compression_meta = getattr(agent, "compression_metadata", None)
compression_saved = getattr(agent, "compression_saved", False)
@@ -412,8 +552,13 @@ class BaseAnswerResource:
yield f"data: {data}\n\n"
return
def process_response_stream(self, stream) -> Dict[str, Any]:
"""Process the stream response for non-streaming endpoint.
Returns:
Dict with keys: conversation_id, answer, sources, tool_calls,
thought, error, and optional extra.
"""
conversation_id = ""
response_full = ""
source_log_docs = []
@@ -422,6 +567,7 @@ class BaseAnswerResource:
stream_ended = False
is_structured = False
schema_info = None
pending_tool_calls = None
for line in stream:
try:
@@ -440,11 +586,22 @@ class BaseAnswerResource:
source_log_docs = event["source"]
elif event["type"] == "tool_calls":
tool_calls = event["tool_calls"]
elif event["type"] == "tool_calls_pending":
pending_tool_calls = event.get("data", {}).get(
"pending_tool_calls", []
)
elif event["type"] == "thought":
thought = event["thought"]
elif event["type"] == "error":
logger.error(f"Error from stream: {event['error']}")
return {
"conversation_id": None,
"answer": None,
"sources": None,
"tool_calls": None,
"thought": None,
"error": event["error"],
}
elif event["type"] == "end":
stream_ended = True
except (json.JSONDecodeError, KeyError) as e:
@@ -452,18 +609,30 @@ class BaseAnswerResource:
continue
if not stream_ended:
logger.error("Stream ended unexpectedly without an 'end' event.")
return {
"conversation_id": None,
"answer": None,
"sources": None,
"tool_calls": None,
"thought": None,
"error": "Stream ended unexpectedly",
}
result: Dict[str, Any] = {
"conversation_id": conversation_id,
"answer": response_full,
"sources": source_log_docs,
"tool_calls": tool_calls,
"thought": thought,
"error": None,
}
if pending_tool_calls is not None:
result["extra"] = {"pending_tool_calls": pending_tool_calls}
if is_structured:
result["extra"] = {"structured": True, "schema": schema_info}
return result
def error_stream_generate(self, err_response):

View File

@@ -79,8 +79,48 @@ class StreamResource(Resource, BaseAnswerResource):
return error
decoded_token = getattr(request, "decoded_token", None)
processor = StreamProcessor(data, decoded_token)
try:
processor.initialize()
# ---- Continuation mode ----
if data.get("tool_actions"):
(
agent,
messages,
tools_dict,
pending_tool_calls,
tool_actions,
) = processor.resume_from_tool_actions(
data["tool_actions"], data["conversation_id"]
)
if not processor.decoded_token:
return Response(
self.error_stream_generate("Unauthorized"),
status=401,
mimetype="text/event-stream",
)
if error := self.check_usage(processor.agent_config):
return error
return Response(
self.complete_stream(
question="",
agent=agent,
conversation_id=processor.conversation_id,
user_api_key=processor.agent_config.get("user_api_key"),
decoded_token=processor.decoded_token,
agent_id=processor.agent_id,
model_id=processor.model_id,
_continuation={
"messages": messages,
"tools_dict": tools_dict,
"pending_tool_calls": pending_tool_calls,
"tool_actions": tool_actions,
},
),
mimetype="text/event-stream",
)
# ---- Normal mode ----
agent = processor.build_agent(data["question"])
if not processor.decoded_token:
return Response(
self.error_stream_generate("Unauthorized"),
@@ -88,13 +128,6 @@ class StreamResource(Resource, BaseAnswerResource):
mimetype="text/event-stream",
)
docs_together, docs_list = processor.pre_fetch_docs(data["question"])
tools_data = processor.pre_fetch_tools()
agent = processor.create_agent(
docs_together=docs_together, docs=docs_list, tools_data=tools_data
)
if error := self.check_usage(processor.agent_config):
return error
return Response(

View File

@@ -1,5 +1,6 @@
"""Message reconstruction utilities for compression."""
import json
import logging
import uuid
from typing import Dict, List, Optional
@@ -49,28 +50,35 @@ class MessageBuilder:
if include_tool_calls and "tool_calls" in query:
for tool_call in query["tool_calls"]:
call_id = tool_call.get("call_id") or str(uuid.uuid4())
args = tool_call.get("arguments")
args_str = (
json.dumps(args)
if isinstance(args, dict)
else (args or "{}")
)
messages.append({
"role": "assistant",
"content": None,
"tool_calls": [{
"id": call_id,
"type": "function",
"function": {
"name": tool_call.get("action_name", ""),
"arguments": args_str,
},
}],
})
result = tool_call.get("result")
result_str = (
json.dumps(result)
if not isinstance(result, str)
else (result or "")
)
messages.append({
"role": "tool",
"tool_call_id": call_id,
"content": result_str,
})
# If no recent queries (everything was compressed), add a continuation user message
if len(recent_queries) == 0 and compressed_summary:
@@ -180,28 +188,35 @@ class MessageBuilder:
if include_tool_calls and "tool_calls" in query:
for tool_call in query["tool_calls"]:
call_id = tool_call.get("call_id") or str(uuid.uuid4())
args = tool_call.get("arguments")
args_str = (
json.dumps(args)
if isinstance(args, dict)
else (args or "{}")
)
rebuilt_messages.append({
"role": "assistant",
"content": None,
"tool_calls": [{
"id": call_id,
"type": "function",
"function": {
"name": tool_call.get("action_name", ""),
"arguments": args_str,
},
}],
})
result = tool_call.get("result")
result_str = (
json.dumps(result)
if not isinstance(result, str)
else (result or "")
)
rebuilt_messages.append({
"role": "tool",
"tool_call_id": call_id,
"content": result_str,
})
# If no recent queries (everything was compressed), add a continuation user message
if len(recent_queries) == 0 and compressed_summary:
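Concretely, one stored tool call now becomes an OpenAI-style assistant/tool message pair. A sketch with made-up values:
tool_call = {
    "call_id": "call_0",
    "action_name": "web_search",
    "arguments": {"q": "docsgpt"},
    "result": {"hits": 3},
}
# becomes, in order:
assistant_msg = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {"name": "web_search", "arguments": '{"q": "docsgpt"}'},
    }],
}
tool_msg = {"role": "tool", "tool_call_id": "call_0", "content": '{"hits": 3}'}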

View File

@@ -0,0 +1,175 @@
"""Service for saving and restoring tool-call continuation state.
When a stream pauses (tool needs approval or client-side execution),
the full execution state is persisted to MongoDB so the client can
resume later by sending tool_actions.
"""
import datetime
import logging
from typing import Any, Dict, List, Optional
from bson import ObjectId
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.conversations import ConversationsRepository
from application.storage.db.repositories.pending_tool_state import (
PendingToolStateRepository,
)
logger = logging.getLogger(__name__)
# TTL for pending states — auto-cleaned after this period
PENDING_STATE_TTL_SECONDS = 30 * 60 # 30 minutes
def _make_serializable(obj: Any) -> Any:
"""Recursively convert MongoDB ObjectIds and other non-JSON types."""
if isinstance(obj, ObjectId):
return str(obj)
if isinstance(obj, dict):
return {str(k): _make_serializable(v) for k, v in obj.items()}
if isinstance(obj, list):
return [_make_serializable(v) for v in obj]
if isinstance(obj, bytes):
return obj.decode("utf-8", errors="replace")
return obj
class ContinuationService:
"""Manages pending tool-call state in MongoDB."""
def __init__(self):
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
self.collection = db["pending_tool_state"]
self._ensure_indexes()
def _ensure_indexes(self):
try:
self.collection.create_index(
"expires_at", expireAfterSeconds=0
)
self.collection.create_index(
[("conversation_id", 1), ("user", 1)], unique=True
)
except Exception:
# Indexes may already exist or mongomock doesn't support TTL
pass
def save_state(
self,
conversation_id: str,
user: str,
messages: List[Dict],
pending_tool_calls: List[Dict],
tools_dict: Dict,
tool_schemas: List[Dict],
agent_config: Dict,
client_tools: Optional[List[Dict]] = None,
) -> str:
"""Save execution state for later continuation.
Args:
conversation_id: The conversation this state belongs to.
user: Owner user ID.
messages: Full messages array at the pause point.
pending_tool_calls: Tool calls awaiting client action.
tools_dict: Serializable tools configuration dict.
tool_schemas: LLM-formatted tool schemas (agent.tools).
agent_config: Config needed to recreate the agent on resume.
client_tools: Client-provided tool schemas for client-side execution.
Returns:
The string ID of the saved state document.
"""
now = datetime.datetime.now(datetime.timezone.utc)
expires_at = now + datetime.timedelta(seconds=PENDING_STATE_TTL_SECONDS)
doc = {
"conversation_id": conversation_id,
"user": user,
"messages": _make_serializable(messages),
"pending_tool_calls": _make_serializable(pending_tool_calls),
"tools_dict": _make_serializable(tools_dict),
"tool_schemas": _make_serializable(tool_schemas),
"agent_config": _make_serializable(agent_config),
"client_tools": _make_serializable(client_tools) if client_tools else None,
"created_at": now,
"expires_at": expires_at,
}
# Upsert — only one pending state per conversation per user
result = self.collection.replace_one(
{"conversation_id": conversation_id, "user": user},
doc,
upsert=True,
)
state_id = str(result.upserted_id) if result.upserted_id else conversation_id
logger.info(
f"Saved continuation state for conversation {conversation_id} "
f"with {len(pending_tool_calls)} pending tool call(s)"
)
# Dual-write to Postgres — upsert against the same Mongo conversation
# by resolving its UUID via conversations.legacy_mongo_id.
def _pg_save(_: PendingToolStateRepository) -> None:
conn = _._conn # reuse the existing transaction
conv = ConversationsRepository(conn).get_by_legacy_id(conversation_id)
if conv is None:
return
_.save_state(
conv["id"],
user,
messages=_make_serializable(messages),
pending_tool_calls=_make_serializable(pending_tool_calls),
tools_dict=_make_serializable(tools_dict),
tool_schemas=_make_serializable(tool_schemas),
agent_config=_make_serializable(agent_config),
client_tools=_make_serializable(client_tools) if client_tools else None,
)
dual_write(PendingToolStateRepository, _pg_save)
return state_id
def load_state(
self, conversation_id: str, user: str
) -> Optional[Dict[str, Any]]:
"""Load pending continuation state.
Returns:
The state dict, or None if no pending state exists.
"""
doc = self.collection.find_one(
{"conversation_id": conversation_id, "user": user}
)
if not doc:
return None
doc["_id"] = str(doc["_id"])
return doc
def delete_state(self, conversation_id: str, user: str) -> bool:
"""Delete pending state after successful resumption.
Returns:
True if a document was deleted.
"""
result = self.collection.delete_one(
{"conversation_id": conversation_id, "user": user}
)
if result.deleted_count:
logger.info(
f"Deleted continuation state for conversation {conversation_id}"
)
# Dual-write to Postgres — delete the same row.
def _pg_delete(repo: PendingToolStateRepository) -> None:
conv = ConversationsRepository(repo._conn).get_by_legacy_id(conversation_id)
if conv is None:
return
repo.delete_state(conv["id"], user)
dual_write(PendingToolStateRepository, _pg_delete)
return result.deleted_count > 0
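A rough usage round trip for this service; it assumes a reachable MongoDB, and the IDs and payload values are made up:
svc = ContinuationService()
svc.save_state(
    conversation_id="68a1f0c2e4b0a1d2c3e4f5a6",
    user="local",
    messages=[{"role": "user", "content": "list my repos"}],
    pending_tool_calls=[{"call_id": "call_0", "name": "list_repos", "arguments": "{}"}],
    tools_dict={},
    tool_schemas=[],
    agent_config={"model_id": "gpt-4o", "agent_type": "ClassicAgent"},
)
state = svc.load_state("68a1f0c2e4b0a1d2c3e4f5a6", "local")
# After PENDING_STATE_TTL_SECONDS the TTL index (expireAfterSeconds=0 on the
# expires_at field) lets Mongo remove the document, so load_state returns None.
svc.delete_state("68a1f0c2e4b0a1d2c3e4f5a6", "local")  # resume paths delete explicitly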

View File

@@ -5,6 +5,8 @@ from typing import Any, Dict, List, Optional
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.conversations import ConversationsRepository
from bson import ObjectId
@@ -60,6 +62,7 @@ class ConversationService:
is_shared_usage: bool = False,
shared_token: Optional[str] = None,
attachment_ids: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> str:
"""Save or update a conversation in the database"""
if decoded_token is None:
@@ -93,6 +96,11 @@ class ConversationService:
f"queries.{index}.timestamp": current_time,
f"queries.{index}.attachments": attachment_ids,
f"queries.{index}.model_id": model_id,
**(
{f"queries.{index}.metadata": metadata}
if metadata
else {}
),
}
},
)
@@ -107,6 +115,26 @@ class ConversationService:
},
{"$push": {"queries": {"$each": [], "$slice": index + 1}}},
)
# Dual-write to Postgres: update the message at :index and
# truncate anything after it, mirroring Mongo's $set+$slice.
def _pg_update_at_index(repo: ConversationsRepository) -> None:
conv = repo.get_by_legacy_id(conversation_id)
if conv is None:
return
repo.update_message_at(conv["id"], index, {
"prompt": question,
"response": response,
"thought": thought,
"sources": sources,
"tool_calls": tool_calls,
"attachments": attachment_ids,
"model_id": model_id,
"timestamp": current_time,
**({"metadata": metadata} if metadata else {}),
})
repo.truncate_after(conv["id"], index)
dual_write(ConversationsRepository, _pg_update_at_index)
return conversation_id
elif conversation_id:
# Append new message to existing conversation
@@ -124,6 +152,7 @@ class ConversationService:
"timestamp": current_time,
"attachments": attachment_ids,
"model_id": model_id,
**({"metadata": metadata} if metadata else {}),
}
}
},
@@ -131,6 +160,25 @@ class ConversationService:
if result.matched_count == 0:
raise ValueError("Conversation not found or unauthorized")
# Dual-write to Postgres: append the same message.
def _pg_append(repo: ConversationsRepository) -> None:
conv = repo.get_by_legacy_id(conversation_id)
if conv is None:
return
repo.append_message(conv["id"], {
"prompt": question,
"response": response,
"thought": thought,
"sources": sources,
"tool_calls": tool_calls,
"attachments": attachment_ids,
"model_id": model_id,
"timestamp": current_time,
"metadata": metadata or {},
})
dual_write(ConversationsRepository, _pg_append)
return conversation_id
else:
# Create new conversation
@@ -156,22 +204,24 @@ class ConversationService:
if not completion or not completion.strip():
completion = question[:50] if question else "New Conversation"
query_doc = {
"prompt": question,
"response": response,
"thought": thought,
"sources": sources,
"tool_calls": tool_calls,
"timestamp": current_time,
"attachments": attachment_ids,
"model_id": model_id,
}
if metadata:
query_doc["metadata"] = metadata
conversation_data = {
"user": user_id,
"date": current_time,
"name": completion,
"queries": [query_doc],
}
if api_key:
@@ -184,7 +234,34 @@ class ConversationService:
if agent:
conversation_data["api_key"] = agent["key"]
result = self.conversations_collection.insert_one(conversation_data)
inserted_id = str(result.inserted_id)
# Dual-write to Postgres: create the conversation row with
# legacy_mongo_id and append the first message.
def _pg_create(repo: ConversationsRepository) -> None:
conv = repo.create(
user_id,
completion,
agent_id=conversation_data.get("agent_id"),
api_key=conversation_data.get("api_key"),
is_shared_usage=conversation_data.get("is_shared_usage", False),
shared_token=conversation_data.get("shared_token"),
legacy_mongo_id=inserted_id,
)
repo.append_message(conv["id"], {
"prompt": question,
"response": response,
"thought": thought,
"sources": sources,
"tool_calls": tool_calls,
"attachments": attachment_ids,
"model_id": model_id,
"timestamp": current_time,
"metadata": metadata or {},
})
dual_write(ConversationsRepository, _pg_create)
return inserted_id
def update_compression_metadata(
self, conversation_id: str, compression_metadata: Dict[str, Any]
@@ -221,6 +298,24 @@ class ConversationService:
logger.info(
f"Updated compression metadata for conversation {conversation_id}"
)
# Dual-write to Postgres: mirror $set + $push $slice.
def _pg_compression(repo: ConversationsRepository) -> None:
conv = repo.get_by_legacy_id(conversation_id)
if conv is None:
return
repo.set_compression_flags(
conv["id"],
is_compressed=True,
last_compression_at=compression_metadata.get("timestamp"),
)
repo.append_compression_point(
conv["id"],
compression_metadata,
max_points=settings.COMPRESSION_MAX_HISTORY_POINTS,
)
dual_write(ConversationsRepository, _pg_compression)
except Exception as e:
logger.error(
f"Error updating compression metadata: {str(e)}", exc_info=True
@@ -257,6 +352,23 @@ class ConversationService:
}
},
)
def _pg_append_summary(repo: ConversationsRepository) -> None:
conv = repo.get_by_legacy_id(conversation_id)
if conv is None:
return
repo.append_message(conv["id"], {
"prompt": "[Context Compression Summary]",
"response": summary,
"thought": "",
"sources": [],
"tool_calls": [],
"attachments": [],
"model_id": compression_metadata.get("model_used"),
"timestamp": timestamp,
})
dual_write(ConversationsRepository, _pg_append_summary)
logger.info(f"Appended compression summary to conversation {conversation_id}")
except Exception as e:
logger.error(

View File

@@ -38,13 +38,23 @@ def get_prompt(prompt_id: str, prompts_collection=None) -> str:
current_dir = Path(__file__).resolve().parents[3]
prompts_dir = current_dir / "prompts"
# Maps for classic agent types
CLASSIC_PRESETS = {
"default": "chat_combine_default.txt",
"creative": "chat_combine_creative.txt",
"strict": "chat_combine_strict.txt",
"reduce": "chat_reduce_prompt.txt",
}
# Agentic counterparts — same styles, but with search tool instructions
AGENTIC_PRESETS = {
"default": "agentic/default.txt",
"creative": "agentic/creative.txt",
"strict": "agentic/strict.txt",
}
preset_mapping = {**CLASSIC_PRESETS, **{f"agentic_{k}": v for k, v in AGENTIC_PRESETS.items()}}
if prompt_id in preset_mapping:
file_path = os.path.join(prompts_dir, preset_mapping[prompt_id])
try:
@@ -91,6 +101,7 @@ class StreamProcessor:
self.is_shared_usage = False
self.shared_token = None
self.agent_id = self.data.get("agent_id")
self.agent_key = None
self.model_id: Optional[str] = None
self.conversation_service = ConversationService()
self.compression_orchestrator = CompressionOrchestrator(
@@ -101,6 +112,7 @@ class StreamProcessor:
self._required_tool_actions: Optional[Dict[str, Set[Optional[str]]]] = None
self.compressed_summary: Optional[str] = None
self.compressed_summary_tokens: int = 0
self._agent_data: Optional[Dict[str, Any]] = None
def initialize(self):
"""Initialize all required components for processing"""
@@ -111,6 +123,29 @@ class StreamProcessor:
self._load_conversation_history()
self._process_attachments()
def build_agent(self, question: str):
"""One call to go from request data to a ready-to-run agent.
Combines initialize(), pre_fetch_docs(), pre_fetch_tools(), and
create_agent() into a single convenience method.
"""
self.initialize()
agent_type = self.agent_config.get("agent_type", "classic")
# Agentic/research agents skip pre-fetch — the LLM searches on-demand via tools
if agent_type in ("agentic", "research"):
tools_data = self.pre_fetch_tools()
return self.create_agent(tools_data=tools_data)
docs_together, docs_list = self.pre_fetch_docs(question)
tools_data = self.pre_fetch_tools()
return self.create_agent(
docs_together=docs_together,
docs=docs_list,
tools_data=tools_data,
)
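In the route handlers this collapses the earlier four-step sequence into one call; a sketch of the equivalence:
processor = StreamProcessor(data, decoded_token)

# previous flow
processor.initialize()
docs_together, docs_list = processor.pre_fetch_docs(data["question"])
tools_data = processor.pre_fetch_tools()
agent = processor.create_agent(
    docs_together=docs_together, docs=docs_list, tools_data=tools_data
)

# new flow (also skips doc pre-fetch for agentic/research agents)
agent = processor.build_agent(data["question"])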
def _load_conversation_history(self):
"""Load conversation history either from DB or request"""
if self.conversation_id and self.initial_user_id:
@@ -124,9 +159,17 @@ class StreamProcessor:
if settings.ENABLE_CONVERSATION_COMPRESSION:
self._handle_compression(conversation)
else:
# Original behavior - load all history (include metadata if present)
self.history = [
{
"prompt": query["prompt"],
"response": query["response"],
**(
{"metadata": query["metadata"]}
if "metadata" in query
else {}
),
}
for query in conversation.get("queries", [])
]
else:
@@ -135,14 +178,8 @@ class StreamProcessor:
)
def _handle_compression(self, conversation: Dict[str, Any]):
"""Handle conversation compression logic using orchestrator."""
try:
# Use orchestrator to handle all compression logic
result = self.compression_orchestrator.compress_if_needed(
conversation_id=self.conversation_id,
user_id=self.initial_user_id,
@@ -153,12 +190,15 @@ class StreamProcessor:
if not result.success:
logger.error(f"Compression failed: {result.error}, using full history")
self.history = [
{
"prompt": query["prompt"],
"response": query["response"],
**({"metadata": query["metadata"]} if "metadata" in query else {}),
}
for query in conversation.get("queries", [])
]
return
# Set compressed summary if compression was performed
if result.compression_performed and result.compressed_summary:
self.compressed_summary = result.compressed_summary
self.compressed_summary_tokens = TokenCounter.count_message_tokens(
@@ -169,17 +209,27 @@ class StreamProcessor:
f"+ {len(result.recent_queries)} recent messages"
)
# Build history from recent queries
self.history = result.as_history()
# Preserve metadata from recent queries (as_history only has prompt/response)
recent = result.recent_queries if result.recent_queries else conversation.get("queries", [])
for i, entry in enumerate(self.history):
# Match by index from the end of recent queries
offset = len(recent) - len(self.history)
qi = offset + i
if 0 <= qi < len(recent) and "metadata" in recent[qi]:
entry["metadata"] = recent[qi]["metadata"]
except Exception as e:
logger.error(
f"Error handling compression, falling back to standard history: {str(e)}",
exc_info=True,
)
# Fallback to original behavior
self.history = [
{
"prompt": query["prompt"],
"response": query["response"],
**({"metadata": query["metadata"]} if "metadata" in query else {}),
}
for query in conversation.get("queries", [])
]
@@ -191,9 +241,6 @@ class StreamProcessor:
)
def _get_attachments_content(self, attachment_ids, user_id):
"""
Retrieve content from attachment documents based on their IDs.
"""
if not attachment_ids:
return []
attachments = []
@@ -202,7 +249,6 @@ class StreamProcessor:
attachment_doc = self.attachments_collection.find_one(
{"_id": ObjectId(attachment_id), "user": user_id}
)
if attachment_doc:
attachments.append(attachment_doc)
except Exception as e:
@@ -232,7 +278,6 @@ class StreamProcessor:
)
self.model_id = requested_model
else:
# Check if agent has a default model configured
agent_default_model = self.agent_config.get("default_model_id", "")
if agent_default_model and validate_model_id(agent_default_model):
self.model_id = agent_default_model
@@ -285,7 +330,6 @@ class StreamProcessor:
data["source"] = "default"
else:
data["source"] = None
# Handle multiple sources
sources = data.get("sources", [])
if sources and isinstance(sources, list):
@@ -311,28 +355,34 @@ class StreamProcessor:
else:
data["sources"] = []
# Preserve model configuration from agent
data["default_model_id"] = data.get("default_model_id", "")
return data
def _configure_source(self):
"""Configure the source based on agent data"""
api_key = self.data.get("api_key") or self.agent_key
"""Configure the source based on agent data.
if api_key:
agent_data = self._get_data_from_api_key(api_key)
The literal string ``"default"`` is a placeholder meaning "no
ingested source" and is normalized to an empty source so that no
retrieval is attempted.
"""
if self._agent_data:
agent_data = self._agent_data
if agent_data.get("sources") and len(agent_data["sources"]) > 0:
source_ids = [
source["id"]
for source in agent_data["sources"]
if source.get("id") and source["id"] != "default"
]
if source_ids:
self.source = {"active_docs": source_ids}
else:
self.source = {}
self.all_sources = [
s for s in agent_data["sources"] if s.get("id") != "default"
]
elif agent_data.get("source") and agent_data["source"] != "default":
self.source = {"active_docs": agent_data["source"]}
self.all_sources = [
{
@@ -345,84 +395,100 @@ class StreamProcessor:
self.all_sources = []
return
if "active_docs" in self.data:
active_docs = self.data["active_docs"]
if active_docs and active_docs != "default":
self.source = {"active_docs": active_docs}
else:
self.source = {}
return
self.source = {}
self.all_sources = []
def _has_active_docs(self) -> bool:
"""Return True if a real document source is configured for retrieval."""
active_docs = self.source.get("active_docs") if self.source else None
if not active_docs:
return False
if active_docs == "default":
return False
return True
def _resolve_agent_id(self) -> Optional[str]:
"""Resolve agent_id from request, then fall back to conversation context."""
request_agent_id = self.data.get("agent_id")
if request_agent_id:
return str(request_agent_id)
if not self.conversation_id or not self.initial_user_id:
return None
try:
conversation = self.conversation_service.get_conversation(
self.conversation_id, self.initial_user_id
)
except Exception:
return None
if not conversation:
return None
conversation_agent_id = conversation.get("agent_id")
if conversation_agent_id:
return str(conversation_agent_id)
return None
def _configure_agent(self):
"""Configure the agent based on request data.
Unified flow: resolve the effective API key, then extract config once.
"""
agent_id = self._resolve_agent_id()
self.agent_key, self.is_shared_usage, self.shared_token = self._get_agent_key(
agent_id, self.initial_user_id
)
self.agent_id = str(agent_id) if agent_id else None
# Determine the effective API key (explicit > agent-derived)
effective_key = self.data.get("api_key") or self.agent_key
if effective_key:
self._agent_data = self._get_data_from_api_key(effective_key)
if self._agent_data.get("_id"):
self.agent_id = str(self._agent_data.get("_id"))
self.agent_config.update(
{
"prompt_id": self._agent_data.get("prompt_id", "default"),
"agent_type": self._agent_data.get("agent_type", settings.AGENT_NAME),
"user_api_key": effective_key,
"json_schema": self._agent_data.get("json_schema"),
"default_model_id": self._agent_data.get("default_model_id", ""),
"models": self._agent_data.get("models", []),
"allow_system_prompt_override": self._agent_data.get(
"allow_system_prompt_override", False
),
}
)
self.initial_user_id = data_key.get("user")
self.decoded_token = {"sub": data_key.get("user")}
if data_key.get("source"):
self.source = {"active_docs": data_key["source"]}
if data_key.get("workflow"):
self.agent_config["workflow"] = data_key["workflow"]
self.agent_config["workflow_owner"] = data_key.get("user")
if data_key.get("retriever"):
self.retriever_config["retriever_name"] = data_key["retriever"]
if data_key.get("chunks") is not None:
try:
self.retriever_config["chunks"] = int(data_key["chunks"])
except (ValueError, TypeError):
logger.warning(
f"Invalid chunks value: {data_key['chunks']}, using default value 2"
)
self.retriever_config["chunks"] = 2
elif self.agent_key:
data_key = self._get_data_from_api_key(self.agent_key)
if data_key.get("_id"):
self.agent_id = str(data_key.get("_id"))
self.agent_config.update(
{
"prompt_id": data_key.get("prompt_id", "default"),
"agent_type": data_key.get("agent_type", settings.AGENT_NAME),
"user_api_key": self.agent_key,
"json_schema": data_key.get("json_schema"),
"default_model_id": data_key.get("default_model_id", ""),
}
)
self.decoded_token = (
self.decoded_token
if self.is_shared_usage
else {"sub": data_key.get("user")}
)
if data_key.get("source"):
self.source = {"active_docs": data_key["source"]}
if data_key.get("workflow"):
self.agent_config["workflow"] = data_key["workflow"]
self.agent_config["workflow_owner"] = data_key.get("user")
if data_key.get("retriever"):
self.retriever_config["retriever_name"] = data_key["retriever"]
if data_key.get("chunks") is not None:
try:
self.retriever_config["chunks"] = int(data_key["chunks"])
except (ValueError, TypeError):
logger.warning(
f"Invalid chunks value: {data_key['chunks']}, using default value 2"
)
self.retriever_config["chunks"] = 2
# Set identity context
if self.data.get("api_key"):
# External API key: use the key owner's identity
self.initial_user_id = self._agent_data.get("user")
self.decoded_token = {"sub": self._agent_data.get("user")}
elif self.is_shared_usage:
# Shared agent: keep the caller's identity
pass
else:
# Owner using their own agent
self.decoded_token = {"sub": self._agent_data.get("user")}
if self._agent_data.get("workflow"):
self.agent_config["workflow"] = self._agent_data["workflow"]
self.agent_config["workflow_owner"] = self._agent_data.get("user")
else:
# No API key — default/workflow configuration
agent_type = settings.AGENT_NAME
if self.data.get("workflow") and isinstance(
self.data.get("workflow"), dict
@@ -443,14 +509,45 @@ class StreamProcessor:
)
def _configure_retriever(self):
"""Assemble retriever config with precedence: request > agent > default."""
doc_token_limit = calculate_doc_token_budget(model_id=self.model_id)
# Start with defaults
retriever_name = "classic"
chunks = 2
# Layer agent-level config (if present)
if self._agent_data:
if self._agent_data.get("retriever"):
retriever_name = self._agent_data["retriever"]
if self._agent_data.get("chunks") is not None:
try:
chunks = int(self._agent_data["chunks"])
except (ValueError, TypeError):
logger.warning(
f"Invalid agent chunks value: {self._agent_data['chunks']}, "
"using default value 2"
)
# Explicit request values win over agent config
if "retriever" in self.data:
retriever_name = self.data["retriever"]
if "chunks" in self.data:
try:
chunks = int(self.data["chunks"])
except (ValueError, TypeError):
logger.warning(
f"Invalid request chunks value: {self.data['chunks']}, "
"using default value 2"
)
self.retriever_config = {
"retriever_name": retriever_name,
"chunks": chunks,
"doc_token_limit": doc_token_limit,
}
# isNoneDoc without an API key forces no retrieval
api_key = self.data.get("api_key") or self.agent_key
if not api_key and "isNoneDoc" in self.data and self.data["isNoneDoc"]:
self.retriever_config["chunks"] = 0
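A standalone sketch of the precedence rule for chunks; this is not the class method itself, just the resolution order it implements, with made-up values:
def resolve_chunks(request_data, agent_data, default=2):
    chunks = default
    if agent_data and agent_data.get("chunks") is not None:
        try:
            chunks = int(agent_data["chunks"])
        except (ValueError, TypeError):
            pass  # keep the previous value, as the warning path above does
    if "chunks" in request_data:
        try:
            chunks = int(request_data["chunks"])
        except (ValueError, TypeError):
            pass
    return chunks

assert resolve_chunks({"chunks": 8}, {"chunks": 4}) == 8  # request wins
assert resolve_chunks({}, {"chunks": 4}) == 4             # agent config wins
assert resolve_chunks({}, None) == 2                      # default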
@@ -471,9 +568,12 @@ class StreamProcessor:
def pre_fetch_docs(self, question: str) -> tuple[Optional[str], Optional[list]]:
"""Pre-fetch documents for template rendering before agent creation"""
if self.data.get("isNoneDoc", False) and not self.agent_id:
logger.info("Pre-fetch skipped: isNoneDoc=True")
return None, None
if not self._has_active_docs():
logger.info("Pre-fetch skipped: no active docs configured")
return None, None
try:
retriever = self.create_retriever()
logger.info(
@@ -505,12 +605,7 @@ class StreamProcessor:
return None, None
def pre_fetch_tools(self) -> Optional[Dict[str, Any]]:
"""Pre-fetch tool data for template rendering before agent creation"""
if not settings.ENABLE_TOOL_PREFETCH:
logger.info(
"Tool pre-fetching disabled globally via ENABLE_TOOL_PREFETCH setting"
@@ -722,6 +817,121 @@ class StreamProcessor:
logger.warning(f"Failed to fetch memory tool data: {str(e)}")
return None
def resume_from_tool_actions(
self,
tool_actions: list,
conversation_id: str,
):
"""Resume a paused agent from saved continuation state.
Loads the pending state from MongoDB, recreates the agent with
the saved configuration, and returns an agent ready to call
``gen_continuation()``.
Args:
tool_actions: Client-provided actions (approvals / results).
conversation_id: The conversation being resumed.
Returns:
Tuple of (agent, messages, tools_dict, pending_tool_calls, tool_actions).
"""
from application.api.answer.services.continuation_service import (
ContinuationService,
)
from application.agents.agent_creator import AgentCreator
from application.agents.tool_executor import ToolExecutor
from application.llm.handlers.handler_creator import LLMHandlerCreator
from application.llm.llm_creator import LLMCreator
cont_service = ContinuationService()
state = cont_service.load_state(conversation_id, self.initial_user_id)
if not state:
raise ValueError("No pending tool state found for this conversation")
messages = state["messages"]
pending_tool_calls = state["pending_tool_calls"]
tools_dict = state["tools_dict"]
tool_schemas = state.get("tool_schemas", [])
agent_config = state["agent_config"]
model_id = agent_config.get("model_id")
llm_name = agent_config.get("llm_name", settings.LLM_PROVIDER)
api_key = agent_config.get("api_key")
user_api_key = agent_config.get("user_api_key")
agent_id = agent_config.get("agent_id")
prompt = agent_config.get("prompt", "")
json_schema = agent_config.get("json_schema")
retriever_config = agent_config.get("retriever_config")
# Recreate dependencies
system_api_key = api_key or get_api_key_for_provider(llm_name)
llm = LLMCreator.create_llm(
llm_name,
api_key=system_api_key,
user_api_key=user_api_key,
decoded_token=self.decoded_token,
model_id=model_id,
agent_id=agent_id,
)
llm_handler = LLMHandlerCreator.create_handler(llm_name or "default")
tool_executor = ToolExecutor(
user_api_key=user_api_key,
user=self.initial_user_id,
decoded_token=self.decoded_token,
)
tool_executor.conversation_id = conversation_id
# Restore client tools so they stay available for subsequent LLM calls
saved_client_tools = state.get("client_tools")
if saved_client_tools:
tool_executor.client_tools = saved_client_tools
# Re-merge into tools_dict (they may have been stripped during serialization)
tool_executor.merge_client_tools(tools_dict, saved_client_tools)
agent_type = agent_config.get("agent_type", "ClassicAgent")
# Map class names back to agent creator keys
type_map = {
"ClassicAgent": "classic",
"AgenticAgent": "agentic",
"ResearchAgent": "research",
"WorkflowAgent": "workflow",
}
agent_key = type_map.get(agent_type, "classic")
agent_kwargs = {
"endpoint": "stream",
"llm_name": llm_name,
"model_id": model_id,
"api_key": system_api_key,
"agent_id": agent_id,
"user_api_key": user_api_key,
"prompt": prompt,
"chat_history": [],
"decoded_token": self.decoded_token,
"json_schema": json_schema,
"llm": llm,
"llm_handler": llm_handler,
"tool_executor": tool_executor,
}
if agent_key in ("agentic", "research") and retriever_config:
agent_kwargs["retriever_config"] = retriever_config
agent = AgentCreator.create_agent(agent_key, **agent_kwargs)
agent.conversation_id = conversation_id
agent.initial_user_id = self.initial_user_id
agent.tools = tool_schemas
# Store config for the route layer
self.model_id = model_id
self.agent_id = agent_id
self.agent_config["user_api_key"] = user_api_key
self.conversation_id = conversation_id
# Delete state so it can't be replayed
cont_service.delete_state(conversation_id, self.initial_user_id)
return agent, messages, tools_dict, pending_tool_calls, tool_actions
def create_agent(
self,
docs_together: Optional[str] = None,
@@ -729,22 +939,40 @@ class StreamProcessor:
tools_data: Optional[Dict[str, Any]] = None,
):
"""Create and return the configured agent with rendered prompt"""
agent_type = self.agent_config["agent_type"]
# For agentic agents, swap standard presets for their agentic
# counterparts (which include search tool instructions instead of
# {summaries}). Custom / user-provided prompts pass through as-is.
raw_prompt = self._get_prompt_content()
if raw_prompt is None:
raw_prompt = get_prompt(
self.agent_config["prompt_id"], self.prompts_collection
)
prompt_id = self.agent_config.get("prompt_id", "default")
agentic_presets = {"default", "creative", "strict"}
if agent_type in ("agentic", "research") and prompt_id in agentic_presets:
raw_prompt = get_prompt(
f"agentic_{prompt_id}", self.prompts_collection
)
else:
raw_prompt = get_prompt(prompt_id, self.prompts_collection)
self._prompt_content = raw_prompt
# Allow API callers to override the system prompt when the agent
# has opted in via allow_system_prompt_override.
if (
self.agent_config.get("allow_system_prompt_override", False)
and self.data.get("system_prompt_override")
):
rendered_prompt = self.data["system_prompt_override"]
else:
rendered_prompt = self.prompt_renderer.render_prompt(
prompt_content=raw_prompt,
user_id=self.initial_user_id,
request_id=self.data.get("request_id"),
passthrough_data=self.data.get("passthrough"),
docs=docs,
docs_together=docs_together,
tools_data=tools_data,
)
provider = (
get_provider_from_model_id(self.model_id)
@@ -753,7 +981,39 @@ class StreamProcessor:
)
system_api_key = get_api_key_for_provider(provider or settings.LLM_PROVIDER)
agent_type = self.agent_config["agent_type"]
# Create LLM and handler (dependency injection)
from application.llm.llm_creator import LLMCreator
from application.llm.handlers.handler_creator import LLMHandlerCreator
from application.agents.tool_executor import ToolExecutor
# Compute backup models: agent's configured models minus the active one
agent_models = self.agent_config.get("models", [])
backup_models = [m for m in agent_models if m != self.model_id]
llm = LLMCreator.create_llm(
provider or settings.LLM_PROVIDER,
api_key=system_api_key,
user_api_key=self.agent_config["user_api_key"],
decoded_token=self.decoded_token,
model_id=self.model_id,
agent_id=self.agent_id,
backup_models=backup_models,
)
llm_handler = LLMHandlerCreator.create_handler(
provider if provider else "default"
)
user = self.decoded_token.get("sub") if self.decoded_token else None
tool_executor = ToolExecutor(
user_api_key=self.agent_config["user_api_key"],
user=user,
decoded_token=self.decoded_token,
)
tool_executor.conversation_id = self.conversation_id
# Pass client-side tools so they get merged in get_tools()
client_tools = self.data.get("client_tools")
if client_tools:
tool_executor.client_tools = client_tools
# Base agent kwargs
agent_kwargs = {
@@ -770,10 +1030,31 @@ class StreamProcessor:
"attachments": self.attachments,
"json_schema": self.agent_config.get("json_schema"),
"compressed_summary": self.compressed_summary,
"llm": llm,
"llm_handler": llm_handler,
"tool_executor": tool_executor,
}
# Type-specific kwargs
if agent_type in ("agentic", "research"):
agent_kwargs["retriever_config"] = {
"source": self.source,
"retriever_name": self.retriever_config.get(
"retriever_name", "classic"
),
"chunks": self.retriever_config.get("chunks", 2),
"doc_token_limit": self.retriever_config.get(
"doc_token_limit", 50000
),
"model_id": self.model_id,
"user_api_key": self.agent_config["user_api_key"],
"agent_id": self.agent_id,
"llm_name": provider or settings.LLM_PROVIDER,
"api_key": system_api_key,
"decoded_token": self.decoded_token,
}
elif agent_type == "workflow":
workflow_config = self.agent_config.get("workflow")
if isinstance(workflow_config, str):
agent_kwargs["workflow_id"] = workflow_config

View File

@@ -146,20 +146,19 @@ class ConnectorsCallback(Resource):
session_token = str(uuid.uuid4())
try:
if provider == "google_drive":
credentials = auth.create_credentials_from_token_info(token_info)
service = auth.build_drive_service(credentials)
user_info = service.about().get(fields="user").execute()
user_email = user_info.get('user', {}).get('emailAddress', 'Connected User')
else:
user_email = token_info.get('user_info', {}).get('email', 'Connected User')
except Exception as e:
current_app.logger.warning(f"Could not get user info: {e}")
user_email = 'Connected User'
sanitized_token_info = auth.sanitize_token_info(token_info)
sessions_collection.find_one_and_update(
{"_id": ObjectId(state_object_id), "provider": provider},
@@ -201,12 +200,12 @@ class ConnectorsCallback(Resource):
@connectors_ns.route("/api/connectors/files")
class ConnectorFiles(Resource):
@api.expect(api.model("ConnectorFilesModel", {
"provider": fields.String(required=True),
"session_token": fields.String(required=True),
"folder_id": fields.String(required=False),
"limit": fields.Integer(required=False),
"page_token": fields.String(required=False),
"search_query": fields.String(required=False),
}))
@api.doc(description="List files from a connector provider (supports pagination and search)")
def post(self):
@@ -214,11 +213,8 @@ class ConnectorFiles(Resource):
data = request.get_json()
provider = data.get('provider')
session_token = data.get('session_token')
folder_id = data.get('folder_id')
limit = data.get('limit', 10)
page_token = data.get('page_token')
search_query = data.get('search_query')
if not provider or not session_token:
return make_response(jsonify({"success": False, "error": "provider and session_token are required"}), 400)
@@ -231,15 +227,12 @@ class ConnectorFiles(Resource):
return make_response(jsonify({"success": False, "error": "Invalid or unauthorized session"}), 401)
loader = ConnectorCreator.create_connector(provider, session_token)
generic_keys = {'provider', 'session_token'}
input_config = {
k: v for k, v in data.items() if k not in generic_keys
}
input_config['list_only'] = True
documents = loader.load_data(input_config)
@@ -306,12 +299,7 @@ class ConnectorValidateSession(Resource):
if is_expired and token_info.get('refresh_token'):
try:
refreshed_token_info = auth.refresh_access_token(token_info.get('refresh_token'))
sanitized_token_info = auth.sanitize_token_info(refreshed_token_info)
sessions_collection.update_one(
{"session_token": session_token},
{"$set": {"token_info": sanitized_token_info}}
@@ -328,12 +316,18 @@ class ConnectorValidateSession(Resource):
"error": "Session token has expired. Please reconnect."
}), 401)
_base_fields = {"access_token", "refresh_token", "token_uri", "expiry"}
provider_extras = {k: v for k, v in token_info.items() if k not in _base_fields}
response_data = {
"success": True,
"expired": False,
"user_email": session.get('user_email', 'Connected User'),
"access_token": token_info.get('access_token'),
**provider_extras,
}
return make_response(jsonify(response_data), 200)
except Exception as e:
current_app.logger.error(f"Error validating connector session: {e}", exc_info=True)
return make_response(jsonify({"success": False, "error": "Failed to validate session"}), 500)

View File

@@ -26,12 +26,20 @@ internal = Blueprint("internal", __name__)
@internal.before_request
def verify_internal_key():
"""Verify INTERNAL_KEY for all internal endpoint requests.
Deny by default: if INTERNAL_KEY is not configured, reject all requests.
"""
if not settings.INTERNAL_KEY:
logger.warning(
f"Internal API request rejected from {request.remote_addr}: "
"INTERNAL_KEY is not configured"
)
return jsonify({"error": "Unauthorized", "message": "Internal API is not configured"}), 401
internal_key = request.headers.get("X-Internal-Key")
if not internal_key or internal_key != settings.INTERNAL_KEY:
logger.warning(f"Unauthorized internal API access attempt from {request.remote_addr}")
return jsonify({"error": "Unauthorized", "message": "Invalid or missing internal key"}), 401
@internal.route("/api/download", methods=["get"])
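Caller-side, this change means internal endpoints are unreachable unless the key is both configured on the server and supplied by the client. An illustrative request, where the URL, port, key value, and query parameter are assumptions:
import requests

resp = requests.get(
    "http://localhost:7091/api/download",
    headers={"X-Internal-Key": "change-me"},
    params={"path": "inputs/example.zip"},
)
# 401 if INTERNAL_KEY is unset on the server or the header does not match;
# the before_request hook rejects the call before the view function runs.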

View File

@@ -5,7 +5,7 @@ Provides virtual folder organization for agents (Google Drive-like structure).
import datetime
from bson.objectid import ObjectId
from flask import current_app, jsonify, make_response, request
from flask_restx import Namespace, Resource, fields
from application.api import api
@@ -13,12 +13,19 @@ from application.api.user.base import (
agent_folders_collection,
agents_collection,
)
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.agent_folders import AgentFoldersRepository
agents_folders_ns = Namespace(
"agents_folders", description="Agent folder management", path="/api/agents/folders"
)
def _folder_error_response(message: str, err: Exception):
current_app.logger.error(f"{message}: {err}", exc_info=True)
return make_response(jsonify({"success": False, "message": message}), 400)
@agents_folders_ns.route("/")
class AgentFolders(Resource):
@api.doc(description="Get all folders for the user")
@@ -40,8 +47,8 @@ class AgentFolders(Resource):
for f in folders
]
return make_response(jsonify({"folders": result}), 200)
except Exception as e:
return make_response(jsonify({"success": False, "message": str(e)}), 400)
except Exception as err:
return _folder_error_response("Failed to fetch folders", err)
@api.doc(description="Create a new folder")
@api.expect(
@@ -78,12 +85,16 @@ class AgentFolders(Resource):
"updated_at": now,
}
result = agent_folders_collection.insert_one(folder)
dual_write(
AgentFoldersRepository,
lambda repo, u=user, n=data["name"]: repo.create(u, n),
)
return make_response(
jsonify({"id": str(result.inserted_id), "name": data["name"], "parent_id": parent_id}),
201,
)
except Exception as e:
return make_response(jsonify({"success": False, "message": str(e)}), 400)
except Exception as err:
return _folder_error_response("Failed to create folder", err)
@agents_folders_ns.route("/<string:folder_id>")
@@ -117,8 +128,8 @@ class AgentFolder(Resource):
}),
200,
)
except Exception as e:
return make_response(jsonify({"success": False, "message": str(e)}), 400)
except Exception as err:
return _folder_error_response("Failed to fetch folder", err)
@api.doc(description="Update a folder")
def put(self, folder_id):
@@ -145,8 +156,8 @@ class AgentFolder(Resource):
if result.matched_count == 0:
return make_response(jsonify({"success": False, "message": "Folder not found"}), 404)
return make_response(jsonify({"success": True}), 200)
except Exception as e:
return make_response(jsonify({"success": False, "message": str(e)}), 400)
except Exception as err:
return _folder_error_response("Failed to update folder", err)
@api.doc(description="Delete a folder")
def delete(self, folder_id):
@@ -162,11 +173,15 @@ class AgentFolder(Resource):
{"user": user, "parent_id": folder_id}, {"$unset": {"parent_id": ""}}
)
result = agent_folders_collection.delete_one({"_id": ObjectId(folder_id), "user": user})
dual_write(
AgentFoldersRepository,
lambda repo, fid=folder_id, u=user: repo.delete(fid, u),
)
if result.deleted_count == 0:
return make_response(jsonify({"success": False, "message": "Folder not found"}), 404)
return make_response(jsonify({"success": True}), 200)
except Exception as e:
return make_response(jsonify({"success": False, "message": str(e)}), 400)
except Exception as err:
return _folder_error_response("Failed to delete folder", err)
@agents_folders_ns.route("/move_agent")
@@ -211,8 +226,8 @@ class MoveAgentToFolder(Resource):
)
return make_response(jsonify({"success": True}), 200)
except Exception as e:
return make_response(jsonify({"success": False, "message": str(e)}), 400)
except Exception as err:
return _folder_error_response("Failed to move agent", err)
@agents_folders_ns.route("/bulk_move")
@@ -257,5 +272,5 @@ class BulkMoveAgents(Resource):
{"$unset": {"folder_id": ""}},
)
return make_response(jsonify({"success": True}), 200)
except Exception as e:
return make_response(jsonify({"success": False, "message": str(e)}), 400)
except Exception as err:
return _folder_error_response("Failed to move agents", err)

View File

@@ -23,6 +23,9 @@ from application.api.user.base import (
workflow_nodes_collection,
workflows_collection,
)
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.agents import AgentsRepository
from application.storage.db.repositories.users import UsersRepository
from application.core.json_schema_utils import (
JsonSchemaValidationError,
normalize_json_schema_payload,
@@ -73,6 +76,7 @@ AGENT_TYPE_SCHEMAS = {
"token_limit",
"limited_request_mode",
"request_limit",
"allow_system_prompt_override",
"createdAt",
"updatedAt",
"lastUsedAt",
@@ -96,6 +100,7 @@ AGENT_TYPE_SCHEMAS = {
"token_limit",
"limited_request_mode",
"request_limit",
"allow_system_prompt_override",
"createdAt",
"updatedAt",
"lastUsedAt",
@@ -104,9 +109,40 @@ AGENT_TYPE_SCHEMAS = {
}
AGENT_TYPE_SCHEMAS["react"] = AGENT_TYPE_SCHEMAS["classic"]
AGENT_TYPE_SCHEMAS["agentic"] = AGENT_TYPE_SCHEMAS["classic"]
AGENT_TYPE_SCHEMAS["research"] = AGENT_TYPE_SCHEMAS["classic"]
AGENT_TYPE_SCHEMAS["openai"] = AGENT_TYPE_SCHEMAS["classic"]
def _build_pg_agent_fields(fields: dict) -> dict:
"""Translate Mongo-shaped agent fields into the Postgres mirror subset."""
allowed = {
"name",
"description",
"agent_type",
"status",
"key",
"chunks",
"retriever",
"tools",
"json_schema",
"models",
"default_model_id",
"limited_token_mode",
"token_limit",
"limited_request_mode",
"request_limit",
"incoming_webhook_token",
"lastUsedAt",
}
translated: dict = {}
for key, value in fields.items():
if key not in allowed:
continue
translated["last_used_at" if key == "lastUsedAt" else key] = value
return translated
def normalize_workflow_reference(workflow_value):
"""Normalize workflow references from form/json payloads."""
if workflow_value is None:
@@ -218,6 +254,12 @@ def build_agent_document(
base_doc["request_limit"] = int(
data.get("request_limit", settings.DEFAULT_AGENT_LIMITS["request_limit"])
)
if "allow_system_prompt_override" in allowed_fields:
base_doc["allow_system_prompt_override"] = (
data.get("allow_system_prompt_override") == "True"
if isinstance(data.get("allow_system_prompt_override"), str)
else bool(data.get("allow_system_prompt_override", False))
)
return {k: v for k, v in base_doc.items() if k in allowed_fields}
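# allow_system_prompt_override arrives either as the multipart form string
# "True"/"False" or as a JSON boolean; the branch above (and the matching branch
# in UpdateAgent below) coerces both shapes. The same logic as a hypothetical
# standalone helper:
def _coerce_flag(raw) -> bool:
    if isinstance(raw, str):
        return raw == "True"
    return bool(raw)

# _coerce_flag("True") -> True, _coerce_flag("False") -> False, _coerce_flag(None) -> False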
@@ -290,6 +332,9 @@ class GetAgent(Resource):
"default_model_id": agent.get("default_model_id", ""),
"folder_id": agent.get("folder_id"),
"workflow": agent.get("workflow"),
"allow_system_prompt_override": agent.get(
"allow_system_prompt_override", False
),
}
return make_response(jsonify(data), 200)
except Exception as e:
@@ -371,6 +416,9 @@ class GetAgents(Resource):
"default_model_id": agent.get("default_model_id", ""),
"folder_id": agent.get("folder_id"),
"workflow": agent.get("workflow"),
"allow_system_prompt_override": agent.get(
"allow_system_prompt_override", False
),
}
for agent in agents
if "source" in agent
@@ -448,6 +496,10 @@ class CreateAgent(Resource):
"folder_id": fields.String(
required=False, description="Folder ID to organize the agent"
),
"allow_system_prompt_override": fields.Boolean(
required=False,
description="Allow API callers to override the system prompt via the v1 endpoint",
),
},
)
@@ -489,9 +541,9 @@ class CreateAgent(Resource):
data["json_schema"] = normalize_json_schema_payload(
data.get("json_schema")
)
-except JsonSchemaValidationError as exc:
+except JsonSchemaValidationError:
return make_response(
-jsonify({"success": False, "message": f"JSON schema {exc}"}),
+jsonify({"success": False, "message": "Invalid JSON schema"}),
400,
)
if data.get("status") not in ["draft", "published"]:
@@ -601,6 +653,18 @@ class CreateAgent(Resource):
new_agent["retriever"] = "classic"
resp = agents_collection.insert_one(new_agent)
new_id = str(resp.inserted_id)
dual_write(
AgentsRepository,
lambda repo, u=user, a=new_agent, mid=new_id: repo.create(
u, a.get("name", ""), a.get("status", "draft"),
key=a.get("key"), description=a.get("description"),
retriever=a.get("retriever"), chunks=a.get("chunks"),
tools=a.get("tools"), models=a.get("models"),
shared=a.get("shared", False),
incoming_webhook_token=a.get("incoming_webhook_token"),
legacy_mongo_id=mid,
),
)
except Exception as err:
current_app.logger.error(f"Error creating agent: {err}", exc_info=True)
return make_response(jsonify({"success": False}), 400)
@@ -672,6 +736,10 @@ class UpdateAgent(Resource):
"folder_id": fields.String(
required=False, description="Folder ID to organize the agent"
),
"allow_system_prompt_override": fields.Boolean(
required=False,
description="Allow API callers to override the system prompt via the v1 endpoint",
),
},
)
@@ -740,13 +808,7 @@ class UpdateAgent(Resource):
request, existing_agent.get("image", ""), user, storage
)
if error:
current_app.logger.error(
f"Image upload error for agent {agent_id}: {error}"
)
return make_response(
jsonify({"success": False, "message": f"Image upload failed: {error}"}),
400,
)
return error
update_fields = {}
allowed_fields = [
"name",
@@ -769,6 +831,7 @@ class UpdateAgent(Resource):
"default_model_id",
"folder_id",
"workflow",
"allow_system_prompt_override",
]
for field in allowed_fields:
@@ -876,9 +939,9 @@ class UpdateAgent(Resource):
update_fields[field] = normalize_json_schema_payload(
json_schema
)
-except JsonSchemaValidationError as exc:
+except JsonSchemaValidationError:
return make_response(
-jsonify({"success": False, "message": f"JSON schema {exc}"}),
+jsonify({"success": False, "message": "Invalid JSON schema"}),
400,
)
else:
@@ -987,6 +1050,13 @@ class UpdateAgent(Resource):
if workflow_error:
return workflow_error
update_fields[field] = workflow_id
elif field == "allow_system_prompt_override":
raw_value = data.get("allow_system_prompt_override", False)
update_fields[field] = (
raw_value == "True"
if isinstance(raw_value, str)
else bool(raw_value)
)
else:
value = data[field]
if field in ["name", "description", "prompt_id", "agent_type"]:
@@ -1130,6 +1200,14 @@ class UpdateAgent(Resource):
jsonify({"success": False, "message": "Database error during update"}),
500,
)
pg_update_fields = _build_pg_agent_fields(update_fields)
if pg_update_fields:
dual_write(
AgentsRepository,
lambda repo, aid=agent_id, u=user, fields=pg_update_fields: repo.update_by_legacy_id(
aid, u, fields,
),
)
response_data = {
"success": True,
"id": agent_id,
@@ -1157,6 +1235,10 @@ class DeleteAgent(Resource):
deleted_agent = agents_collection.find_one_and_delete(
{"_id": ObjectId(agent_id), "user": user}
)
dual_write(
AgentsRepository,
lambda repo, aid=agent_id, u=user: repo.delete_by_legacy_id(aid, u),
)
if not deleted_agent:
return make_response(
jsonify({"success": False, "message": "Agent not found"}), 404
@@ -1224,6 +1306,9 @@ class PinnedAgents(Resource):
{"user_id": user_id},
{"$pullAll": {"agent_preferences.pinned": stale_ids}},
)
dual_write(UsersRepository,
lambda repo, uid=user_id, ids=stale_ids: repo.remove_pinned_bulk(uid, ids)
)
list_pinned_agents = [
{
"id": str(agent["_id"]),
@@ -1355,12 +1440,18 @@ class PinAgent(Resource):
{"user_id": user_id},
{"$pull": {"agent_preferences.pinned": agent_id}},
)
dual_write(UsersRepository,
lambda repo, uid=user_id, aid=agent_id: repo.remove_pinned(uid, aid)
)
action = "unpinned"
else:
users_collection.update_one(
{"user_id": user_id},
{"$addToSet": {"agent_preferences.pinned": agent_id}},
)
dual_write(UsersRepository,
lambda repo, uid=user_id, aid=agent_id: repo.add_pinned(uid, aid)
)
action = "pinned"
except Exception as err:
current_app.logger.error(f"Error pinning/unpinning agent: {err}")
@@ -1406,6 +1497,9 @@ class RemoveSharedAgent(Resource):
}
},
)
dual_write(UsersRepository,
lambda repo, uid=user_id, aid=agent_id: repo.remove_agent_from_all(uid, aid)
)
return make_response(jsonify({"success": True, "action": "removed"}), 200)
except Exception as err:

View File

@@ -18,6 +18,8 @@ from application.api.user.base import (
user_tools_collection,
users_collection,
)
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.users import UsersRepository
from application.utils import generate_image_url
agents_sharing_ns = Namespace(
@@ -105,6 +107,9 @@ class SharedAgent(Resource):
{"user_id": user_id},
{"$addToSet": {"agent_preferences.shared_with_me": agent_id}},
)
dual_write(UsersRepository,
lambda repo, uid=user_id, aid=agent_id: repo.add_shared(uid, aid)
)
return make_response(jsonify(data), 200)
except Exception as err:
current_app.logger.error(f"Error retrieving shared agent: {err}")
@@ -139,6 +144,9 @@ class SharedAgents(Resource):
{"user_id": user_id},
{"$pullAll": {"agent_preferences.shared_with_me": stale_ids}},
)
dual_write(UsersRepository,
lambda repo, uid=user_id, ids=stale_ids: repo.remove_shared_bulk(uid, ids)
)
pinned_ids = set(user_doc.get("agent_preferences", {}).get("pinned", []))
list_shared_agents = [

View File

@@ -1,15 +1,36 @@
"""File attachments and media routes."""
import os
import tempfile
from pathlib import Path
from bson.objectid import ObjectId
from flask import current_app, jsonify, make_response, request
from flask_restx import fields, Namespace, Resource
from application.api import api
from application.api.user.base import agents_collection, storage
from application.api.user.tasks import store_attachment
from application.cache import get_redis_instance
from application.core.settings import settings
from application.stt.constants import (
SUPPORTED_AUDIO_EXTENSIONS,
SUPPORTED_AUDIO_MIME_TYPES,
)
from application.stt.upload_limits import (
AudioFileTooLargeError,
build_stt_file_size_limit_message,
enforce_audio_file_size_limit,
is_audio_filename,
)
from application.stt.live_session import (
apply_live_stt_hypothesis,
create_live_stt_session,
delete_live_stt_session,
finalize_live_stt_session,
get_live_stt_transcript_text,
load_live_stt_session,
save_live_stt_session,
)
from application.stt.stt_creator import STTCreator
from application.tts.tts_creator import TTSCreator
from application.utils import safe_filename
@@ -19,6 +40,74 @@ attachments_ns = Namespace(
)
def _resolve_authenticated_user():
decoded_token = getattr(request, "decoded_token", None)
api_key = request.form.get("api_key") or request.args.get("api_key")
if decoded_token:
return safe_filename(decoded_token.get("sub"))
if api_key:
from application.api.user.base import agents_collection
agent = agents_collection.find_one({"key": api_key})
if not agent:
return make_response(
jsonify({"success": False, "message": "Invalid API key"}), 401
)
return safe_filename(agent.get("user"))
return None
def _get_uploaded_file_size(file) -> int:
try:
current_position = file.stream.tell()
file.stream.seek(0, os.SEEK_END)
size_bytes = file.stream.tell()
file.stream.seek(current_position)
return size_bytes
except Exception:
return 0
def _is_supported_audio_mimetype(mimetype: str) -> bool:
if not mimetype:
return True
normalized = mimetype.split(";")[0].strip().lower()
return normalized.startswith("audio/") or normalized in SUPPORTED_AUDIO_MIME_TYPES
def _enforce_uploaded_audio_size_limit(file, filename: str) -> None:
if not is_audio_filename(filename):
return
size_bytes = _get_uploaded_file_size(file)
if size_bytes:
enforce_audio_file_size_limit(size_bytes)
def _get_store_attachment_user_error(exc: Exception) -> str:
if isinstance(exc, AudioFileTooLargeError):
return build_stt_file_size_limit_message()
return "Failed to process file"
def _require_live_stt_redis():
redis_client = get_redis_instance()
if redis_client:
return redis_client
return make_response(
jsonify({"success": False, "message": "Live transcription is unavailable"}),
503,
)
def _parse_bool_form_value(value: str | None) -> bool:
if value is None:
return False
return value.strip().lower() in {"1", "true", "yes", "on"}
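# _parse_bool_form_value accepts the usual truthy spellings and treats everything
# else, including None, as False, e.g.:
#     _parse_bool_form_value("true")  -> True
#     _parse_bool_form_value("1")     -> True
#     _parse_bool_form_value(" ON ")  -> True   (stripped and case-folded)
#     _parse_bool_form_value("no")    -> False
#     _parse_bool_form_value(None)    -> False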
@attachments_ns.route("/store_attachment")
class StoreAttachment(Resource):
@api.expect(
@@ -36,8 +125,9 @@ class StoreAttachment(Resource):
description="Stores one or multiple attachments without vectorization or training. Supports user or API key authentication."
)
def post(self):
decoded_token = getattr(request, "decoded_token", None)
api_key = request.form.get("api_key") or request.args.get("api_key")
auth_user = _resolve_authenticated_user()
if hasattr(auth_user, "status_code"):
return auth_user
files = request.files.getlist("file")
if not files:
@@ -51,22 +141,16 @@ class StoreAttachment(Resource):
400,
)
user = None
if decoded_token:
user = safe_filename(decoded_token.get("sub"))
elif api_key:
agent = agents_collection.find_one({"key": api_key})
if not agent:
return make_response(
jsonify({"success": False, "message": "Invalid API key"}), 401
)
user = safe_filename(agent.get("user"))
else:
user = auth_user
if not user:
return make_response(
jsonify({"success": False, "message": "Authentication required"}), 401
)
try:
from application.api.user.tasks import store_attachment
from application.api.user.base import storage
tasks = []
errors = []
original_file_count = len(files)
@@ -75,6 +159,7 @@ class StoreAttachment(Resource):
try:
attachment_id = ObjectId()
original_filename = safe_filename(os.path.basename(file.filename))
_enforce_uploaded_audio_size_limit(file, original_filename)
relative_path = f"{settings.UPLOAD_FOLDER}/{user}/attachments/{str(attachment_id)}/{original_filename}"
metadata = storage.save_file(file, relative_path)
@@ -90,15 +175,31 @@ class StoreAttachment(Resource):
"task_id": task.id,
"filename": original_filename,
"attachment_id": str(attachment_id),
"upload_index": idx,
})
except Exception as file_err:
current_app.logger.error(f"Error processing file {idx} ({file.filename}): {file_err}", exc_info=True)
errors.append({
"upload_index": idx,
"filename": file.filename,
"error": str(file_err)
"error": _get_store_attachment_user_error(file_err),
})
if not tasks:
if errors and all(
error.get("error") == build_stt_file_size_limit_message()
for error in errors
):
return make_response(
jsonify(
{
"success": False,
"message": build_stt_file_size_limit_message(),
"errors": errors,
}
),
413,
)
return make_response(
jsonify({"status": "error", "message": "No valid files to upload"}),
400,
@@ -135,11 +236,389 @@ class StoreAttachment(Resource):
return make_response(jsonify({"success": False, "error": "Failed to store attachment"}), 400)
@attachments_ns.route("/stt")
class SpeechToText(Resource):
@api.expect(
api.model(
"SpeechToTextModel",
{
"file": fields.Raw(required=True, description="Audio file"),
"language": fields.String(
required=False, description="Optional transcription language hint"
),
},
)
)
@api.doc(description="Transcribe an uploaded audio file")
def post(self):
auth_user = _resolve_authenticated_user()
if hasattr(auth_user, "status_code"):
return auth_user
if not auth_user:
return make_response(
jsonify({"success": False, "message": "Authentication required"}),
401,
)
file = request.files.get("file")
if not file or file.filename == "":
return make_response(
jsonify({"success": False, "message": "Missing file"}),
400,
)
filename = safe_filename(os.path.basename(file.filename))
suffix = Path(filename).suffix.lower()
if suffix not in SUPPORTED_AUDIO_EXTENSIONS:
return make_response(
jsonify({"success": False, "message": "Unsupported audio format"}),
400,
)
if not _is_supported_audio_mimetype(file.mimetype or ""):
return make_response(
jsonify({"success": False, "message": "Unsupported audio MIME type"}),
400,
)
try:
_enforce_uploaded_audio_size_limit(file, filename)
except AudioFileTooLargeError:
return make_response(
jsonify(
{
"success": False,
"message": build_stt_file_size_limit_message(),
}
),
413,
)
temp_path = None
try:
with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as temp_file:
file.save(temp_file.name)
temp_path = Path(temp_file.name)
stt_instance = STTCreator.create_stt(settings.STT_PROVIDER)
transcript = stt_instance.transcribe(
temp_path,
language=request.form.get("language") or settings.STT_LANGUAGE,
timestamps=settings.STT_ENABLE_TIMESTAMPS,
diarize=settings.STT_ENABLE_DIARIZATION,
)
return make_response(jsonify({"success": True, **transcript}), 200)
except Exception as err:
current_app.logger.error(f"Error transcribing audio: {err}", exc_info=True)
return make_response(
jsonify({"success": False, "message": "Failed to transcribe audio"}),
400,
)
finally:
if temp_path and temp_path.exists():
temp_path.unlink()
@attachments_ns.route("/stt/live/start")
class LiveSpeechToTextStart(Resource):
@api.doc(description="Start a live speech-to-text session")
def post(self):
auth_user = _resolve_authenticated_user()
if hasattr(auth_user, "status_code"):
return auth_user
if not auth_user:
return make_response(
jsonify({"success": False, "message": "Authentication required"}),
401,
)
redis_client = _require_live_stt_redis()
if hasattr(redis_client, "status_code"):
return redis_client
payload = request.get_json(silent=True) or {}
session_state = create_live_stt_session(
user=auth_user,
language=payload.get("language") or settings.STT_LANGUAGE,
)
save_live_stt_session(redis_client, session_state)
return make_response(
jsonify(
{
"success": True,
"session_id": session_state["session_id"],
"language": session_state.get("language"),
"committed_text": "",
"mutable_text": "",
"previous_hypothesis": "",
"latest_hypothesis": "",
"finalized_text": "",
"pending_text": "",
"transcript_text": "",
}
),
200,
)
@attachments_ns.route("/stt/live/chunk")
class LiveSpeechToTextChunk(Resource):
@api.expect(
api.model(
"LiveSpeechToTextChunkModel",
{
"session_id": fields.String(
required=True, description="Live transcription session ID"
),
"chunk_index": fields.Integer(
required=True, description="Sequential chunk index"
),
"is_silence": fields.Boolean(
required=False,
description="Whether the latest capture window was mostly silence",
),
"file": fields.Raw(required=True, description="Audio chunk"),
},
)
)
@api.doc(description="Transcribe a chunk for a live speech-to-text session")
def post(self):
auth_user = _resolve_authenticated_user()
if hasattr(auth_user, "status_code"):
return auth_user
if not auth_user:
return make_response(
jsonify({"success": False, "message": "Authentication required"}),
401,
)
redis_client = _require_live_stt_redis()
if hasattr(redis_client, "status_code"):
return redis_client
session_id = request.form.get("session_id", "").strip()
if not session_id:
return make_response(
jsonify({"success": False, "message": "Missing session_id"}),
400,
)
session_state = load_live_stt_session(redis_client, session_id)
if not session_state:
return make_response(
jsonify(
{
"success": False,
"message": "Live transcription session not found",
}
),
404,
)
if safe_filename(str(session_state.get("user", ""))) != auth_user:
return make_response(
jsonify({"success": False, "message": "Forbidden"}),
403,
)
chunk_index_raw = request.form.get("chunk_index", "").strip()
if chunk_index_raw == "":
return make_response(
jsonify({"success": False, "message": "Missing chunk_index"}),
400,
)
try:
chunk_index = int(chunk_index_raw)
except ValueError:
return make_response(
jsonify({"success": False, "message": "Invalid chunk_index"}),
400,
)
is_silence = _parse_bool_form_value(request.form.get("is_silence"))
file = request.files.get("file")
if not file or file.filename == "":
return make_response(
jsonify({"success": False, "message": "Missing file"}),
400,
)
filename = safe_filename(os.path.basename(file.filename))
suffix = Path(filename).suffix.lower()
if suffix not in SUPPORTED_AUDIO_EXTENSIONS:
return make_response(
jsonify({"success": False, "message": "Unsupported audio format"}),
400,
)
if not _is_supported_audio_mimetype(file.mimetype or ""):
return make_response(
jsonify({"success": False, "message": "Unsupported audio MIME type"}),
400,
)
try:
_enforce_uploaded_audio_size_limit(file, filename)
except AudioFileTooLargeError:
return make_response(
jsonify(
{
"success": False,
"message": build_stt_file_size_limit_message(),
}
),
413,
)
temp_path = None
try:
with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as temp_file:
file.save(temp_file.name)
temp_path = Path(temp_file.name)
session_language = session_state.get("language") or settings.STT_LANGUAGE
stt_instance = STTCreator.create_stt(settings.STT_PROVIDER)
transcript = stt_instance.transcribe(
temp_path,
language=session_language,
timestamps=False,
diarize=False,
)
if not session_state.get("language") and transcript.get("language"):
session_state["language"] = transcript["language"]
try:
apply_live_stt_hypothesis(
session_state,
str(transcript.get("text", "")),
chunk_index,
is_silence=is_silence,
)
except ValueError:
current_app.logger.warning(
"Invalid live transcription chunk",
exc_info=True,
)
return make_response(
jsonify(
{
"success": False,
"message": "Invalid live transcription chunk",
}
),
409,
)
save_live_stt_session(redis_client, session_state)
return make_response(
jsonify(
{
"success": True,
"session_id": session_id,
"chunk_index": chunk_index,
"chunk_text": transcript.get("text", ""),
"is_silence": is_silence,
"language": session_state.get("language"),
"committed_text": session_state.get("committed_text", ""),
"mutable_text": session_state.get("mutable_text", ""),
"previous_hypothesis": session_state.get(
"previous_hypothesis", ""
),
"latest_hypothesis": session_state.get(
"latest_hypothesis", ""
),
"finalized_text": session_state.get("committed_text", ""),
"pending_text": session_state.get("mutable_text", ""),
"transcript_text": get_live_stt_transcript_text(session_state),
}
),
200,
)
except Exception as err:
current_app.logger.error(
f"Error transcribing live audio chunk: {err}", exc_info=True
)
return make_response(
jsonify({"success": False, "message": "Failed to transcribe audio"}),
400,
)
finally:
if temp_path and temp_path.exists():
temp_path.unlink()
@attachments_ns.route("/stt/live/finish")
class LiveSpeechToTextFinish(Resource):
@api.doc(description="Finish a live speech-to-text session")
def post(self):
auth_user = _resolve_authenticated_user()
if hasattr(auth_user, "status_code"):
return auth_user
if not auth_user:
return make_response(
jsonify({"success": False, "message": "Authentication required"}),
401,
)
redis_client = _require_live_stt_redis()
if hasattr(redis_client, "status_code"):
return redis_client
payload = request.get_json(silent=True) or {}
session_id = str(payload.get("session_id", "")).strip()
if not session_id:
return make_response(
jsonify({"success": False, "message": "Missing session_id"}),
400,
)
session_state = load_live_stt_session(redis_client, session_id)
if not session_state:
return make_response(
jsonify(
{
"success": False,
"message": "Live transcription session not found",
}
),
404,
)
if safe_filename(str(session_state.get("user", ""))) != auth_user:
return make_response(
jsonify({"success": False, "message": "Forbidden"}),
403,
)
final_text = finalize_live_stt_session(session_state)
delete_live_stt_session(redis_client, session_id)
return make_response(
jsonify(
{
"success": True,
"session_id": session_id,
"language": session_state.get("language"),
"text": final_text,
}
),
200,
)
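# A client drives the three live routes above as start -> chunk (one call per
# capture window, with a strictly increasing chunk_index) -> finish. A minimal
# caller sketch; the base URL, bearer auth and chunk source are illustrative and
# not defined by this module:
import requests

_EXAMPLE_BASE = "http://localhost:7091/api"

def live_transcribe_sketch(wav_chunks, token):
    headers = {"Authorization": f"Bearer {token}"}
    session = requests.post(f"{_EXAMPLE_BASE}/stt/live/start", json={}, headers=headers).json()
    session_id = session["session_id"]
    for idx, chunk in enumerate(wav_chunks):
        requests.post(
            f"{_EXAMPLE_BASE}/stt/live/chunk",
            headers=headers,
            data={"session_id": session_id, "chunk_index": idx, "is_silence": "false"},
            files={"file": (f"chunk_{idx}.wav", chunk, "audio/wav")},
        )
    done = requests.post(
        f"{_EXAMPLE_BASE}/stt/live/finish", json={"session_id": session_id}, headers=headers
    ).json()
    return done["text"]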
@attachments_ns.route("/images/<path:image_path>")
class ServeImage(Resource):
@api.doc(description="Serve an image from storage")
def get(self, image_path):
if ".." in image_path or image_path.startswith("/") or "\x00" in image_path:
return make_response(
jsonify({"success": False, "message": "Invalid image path"}), 400
)
try:
from application.api.user.base import storage
file_obj = storage.get_file(image_path)
extension = image_path.split(".")[-1].lower()
content_type = f"image/{extension}"
@@ -154,6 +633,10 @@ class ServeImage(Resource):
return make_response(
jsonify({"success": False, "message": "Image not found"}), 404
)
except ValueError:
return make_response(
jsonify({"success": False, "message": "Invalid image path"}), 400
)
except Exception as e:
current_app.logger.error(f"Error serving image: {e}")
return make_response(

View File

@@ -15,6 +15,8 @@ from werkzeug.utils import secure_filename
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.users import UsersRepository
from application.storage.storage_creator import StorageCreator
from application.vectorstore.vector_creator import VectorCreator
@@ -132,6 +134,9 @@ def ensure_user_doc(user_id):
if updates:
users_collection.update_one({"user_id": user_id}, {"$set": updates})
user_doc = users_collection.find_one({"user_id": user_id})
dual_write(UsersRepository, lambda repo: repo.upsert(user_id))
return user_doc
@@ -145,14 +150,22 @@ def resolve_tool_details(tool_ids):
Returns:
List of tool details with id, name, and display_name
"""
valid_ids = []
for tid in tool_ids:
try:
valid_ids.append(ObjectId(tid))
except Exception:
continue
tools = user_tools_collection.find(
-{"_id": {"$in": [ObjectId(tid) for tid in tool_ids]}}
-)
+{"_id": {"$in": valid_ids}}
+) if valid_ids else []
return [
{
"id": str(tool["_id"]),
"name": tool.get("name", ""),
"display_name": tool.get("displayName", tool.get("name", "")),
"display_name": tool.get("customName")
or tool.get("displayName")
or tool.get("name", ""),
}
for tool in tools
]

View File

@@ -8,6 +8,8 @@ from flask_restx import fields, Namespace, Resource
from application.api import api
from application.api.user.base import attachments_collection, conversations_collection
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.conversations import ConversationsRepository
from application.utils import check_required_fields
conversations_ns = Namespace(
@@ -30,15 +32,23 @@ class DeleteConversation(Resource):
return make_response(
jsonify({"success": False, "message": "ID is required"}), 400
)
user_id = decoded_token["sub"]
try:
conversations_collection.delete_one(
{"_id": ObjectId(conversation_id), "user": decoded_token["sub"]}
{"_id": ObjectId(conversation_id), "user": user_id}
)
except Exception as err:
current_app.logger.error(
f"Error deleting conversation: {err}", exc_info=True
)
return make_response(jsonify({"success": False}), 400)
def _pg_delete(repo: ConversationsRepository) -> None:
conv = repo.get_by_legacy_id(conversation_id)
if conv is not None:
repo.delete(conv["id"], user_id)
dual_write(ConversationsRepository, _pg_delete)
return make_response(jsonify({"success": True}), 200)
@@ -59,6 +69,11 @@ class DeleteAllConversations(Resource):
f"Error deleting all conversations: {err}", exc_info=True
)
return make_response(jsonify({"success": False}), 400)
dual_write(
ConversationsRepository,
lambda r, uid=user_id: r.delete_all_for_user(uid),
)
return make_response(jsonify({"success": True}), 200)
@@ -190,9 +205,10 @@ class UpdateConversationName(Resource):
missing_fields = check_required_fields(data, required_fields)
if missing_fields:
return missing_fields
user_id = decoded_token.get("sub")
try:
conversations_collection.update_one(
{"_id": ObjectId(data["id"]), "user": decoded_token.get("sub")},
{"_id": ObjectId(data["id"]), "user": user_id},
{"$set": {"name": data["name"]}},
)
except Exception as err:
@@ -200,6 +216,13 @@ class UpdateConversationName(Resource):
f"Error updating conversation name: {err}", exc_info=True
)
return make_response(jsonify({"success": False}), 400)
def _pg_rename(repo: ConversationsRepository) -> None:
conv = repo.get_by_legacy_id(data["id"])
if conv is not None:
repo.rename(conv["id"], user_id, data["name"])
dual_write(ConversationsRepository, _pg_rename)
return make_response(jsonify({"success": True}), 200)
@@ -277,4 +300,21 @@ class SubmitFeedback(Resource):
except Exception as err:
current_app.logger.error(f"Error submitting feedback: {err}", exc_info=True)
return make_response(jsonify({"success": False}), 400)
# Dual-write to Postgres: mirror the per-message feedback set/unset.
feedback_value = data["feedback"]
question_index = int(data["question_index"])
feedback_payload = (
None if feedback_value is None
else {"text": feedback_value, "timestamp": datetime.datetime.now(
datetime.timezone.utc
).isoformat()}
)
def _pg_feedback(repo: ConversationsRepository) -> None:
conv = repo.get_by_legacy_id(data["conversation_id"])
if conv is not None:
repo.set_feedback(conv["id"], question_index, feedback_payload)
dual_write(ConversationsRepository, _pg_feedback)
return make_response(jsonify({"success": True}), 200)

View File

@@ -8,6 +8,8 @@ from flask_restx import fields, Namespace, Resource
from application.api import api
from application.api.user.base import current_dir, prompts_collection
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.prompts import PromptsRepository
from application.utils import check_required_fields
prompts_ns = Namespace(
@@ -49,6 +51,12 @@ class CreatePrompt(Resource):
}
)
new_id = str(resp.inserted_id)
dual_write(
PromptsRepository,
lambda repo, u=user, n=data["name"], c=data["content"], mid=new_id: repo.create(
u, n, c, legacy_mongo_id=mid,
),
)
except Exception as err:
current_app.logger.error(f"Error creating prompt: {err}", exc_info=True)
return make_response(jsonify({"success": False}), 400)
@@ -149,6 +157,10 @@ class DeletePrompt(Resource):
return missing_fields
try:
prompts_collection.delete_one({"_id": ObjectId(data["id"]), "user": user})
dual_write(
PromptsRepository,
lambda repo, pid=data["id"], u=user: repo.delete_by_legacy_id(pid, u),
)
except Exception as err:
current_app.logger.error(f"Error deleting prompt: {err}", exc_info=True)
return make_response(jsonify({"success": False}), 400)
@@ -185,6 +197,12 @@ class UpdatePrompt(Resource):
{"_id": ObjectId(data["id"]), "user": user},
{"$set": {"name": data["name"], "content": data["content"]}},
)
dual_write(
PromptsRepository,
lambda repo, pid=data["id"], u=user, n=data["name"], c=data["content"]: repo.update_by_legacy_id(
pid, u, n, c,
),
)
except Exception as err:
current_app.logger.error(f"Error updating prompt: {err}", exc_info=True)
return make_response(jsonify({"success": False}), 400)

View File

@@ -15,8 +15,71 @@ from application.api.user.base import (
conversations_collection,
shared_conversations_collections,
)
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.conversations import ConversationsRepository
from application.storage.db.repositories.shared_conversations import (
SharedConversationsRepository,
)
from application.utils import check_required_fields
def _dual_write_share(
mongo_conv_id: str,
share_uuid: str,
user: str,
*,
is_promptable: bool,
first_n_queries: int,
api_key: str | None,
prompt_id: str | None = None,
chunks: int | None = None,
) -> None:
"""Mirror a Mongo share-record insert into Postgres.
Preserves the Mongo-generated UUID so public ``/shared/{uuid}`` URLs
resolve from both stores during cutover.
"""
def _write(repo: SharedConversationsRepository) -> None:
conv = ConversationsRepository(repo._conn).get_by_legacy_id(
mongo_conv_id, user_id=user,
)
if conv is None:
return
# prompt_id / chunks are only meaningful for promptable shares;
# prompt_id is often the string "default" or an ObjectId that
# hasn't been migrated — pass as-is and let the repo drop
# non-UUID values. Scope the prompt lookup by user_id so an
# authenticated caller can't link another user's prompt into
# their share record.
resolved_prompt_id = None
if prompt_id and len(str(prompt_id)) == 24:
from sqlalchemy import text as _text
row = repo._conn.execute(
_text(
"SELECT id FROM prompts "
"WHERE legacy_mongo_id = :legacy_id AND user_id = :user_id"
),
{"legacy_id": str(prompt_id), "user_id": user},
).fetchone()
if row:
resolved_prompt_id = str(row[0])
# get_or_create is race-free on the PG side thanks to the
# composite partial unique index on the dedup tuple
# (migration 0008). It converges concurrent share requests to
# a single row.
repo.get_or_create(
conv["id"],
user,
is_promptable=is_promptable,
first_n_queries=first_n_queries,
api_key=api_key,
prompt_id=resolved_prompt_id,
chunks=chunks,
share_uuid=share_uuid,
)
dual_write(SharedConversationsRepository, _write)
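# The get_or_create call above relies on the partial unique index from migration
# 0008 to make concurrent share requests converge on one row. A sketch of that
# idiom on the Postgres side (table and column names are assumed from the call
# site, not taken from the real repository or migration):
from sqlalchemy import text as _sql_text

def _sketch_get_or_create_share(conn, params: dict) -> str:
    conn.execute(
        _sql_text(
            "INSERT INTO shared_conversations "
            "(uuid, conversation_id, user_id, is_promptable, first_n_queries, api_key, prompt_id, chunks) "
            "VALUES (:share_uuid, :conversation_id, :user_id, :is_promptable, "
            ":first_n_queries, :api_key, :prompt_id, :chunks) "
            "ON CONFLICT DO NOTHING"
        ),
        params,
    )
    row = conn.execute(
        _sql_text(
            "SELECT uuid FROM shared_conversations "
            "WHERE conversation_id = :conversation_id AND user_id = :user_id "
            "AND is_promptable = :is_promptable"
        ),
        params,
    ).fetchone()
    return str(row[0])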
sharing_ns = Namespace(
"sharing", description="Conversation sharing operations", path="/api"
)
@@ -57,7 +120,7 @@ class ShareConversation(Resource):
try:
conversation = conversations_collection.find_one(
{"_id": ObjectId(conversation_id)}
{"_id": ObjectId(conversation_id), "user": user}
)
if conversation is None:
return make_response(
@@ -124,6 +187,16 @@ class ShareConversation(Resource):
"api_key": api_uuid,
}
)
_dual_write_share(
conversation_id,
str(explicit_binary.as_uuid()),
user,
is_promptable=is_promptable,
first_n_queries=current_n_queries,
api_key=api_uuid,
prompt_id=prompt_id,
chunks=int(chunks) if chunks else None,
)
return make_response(
jsonify(
{
@@ -155,6 +228,16 @@ class ShareConversation(Resource):
"api_key": api_uuid,
}
)
_dual_write_share(
conversation_id,
str(explicit_binary.as_uuid()),
user,
is_promptable=is_promptable,
first_n_queries=current_n_queries,
api_key=api_uuid,
prompt_id=prompt_id,
chunks=int(chunks) if chunks else None,
)
return make_response(
jsonify(
{
@@ -192,6 +275,14 @@ class ShareConversation(Resource):
"user": user,
}
)
_dual_write_share(
conversation_id,
str(explicit_binary.as_uuid()),
user,
is_promptable=is_promptable,
first_n_queries=current_n_queries,
api_key=None,
)
return make_response(
jsonify(
{"success": True, "identifier": str(explicit_binary.as_uuid())}

View File

@@ -14,7 +14,14 @@ from application.api.user.base import sources_collection
from application.api.user.tasks import ingest, ingest_connector_task, ingest_remote
from application.core.settings import settings
from application.parser.connectors.connector_creator import ConnectorCreator
from application.parser.file.constants import SUPPORTED_SOURCE_EXTENSIONS
from application.storage.storage_creator import StorageCreator
from application.stt.upload_limits import (
AudioFileTooLargeError,
build_stt_file_size_limit_message,
enforce_audio_file_size_limit,
is_audio_filename,
)
from application.utils import check_required_fields, safe_filename
@@ -23,6 +30,12 @@ sources_upload_ns = Namespace(
)
def _enforce_audio_path_size_limit(file_path: str, filename: str) -> None:
if not is_audio_filename(filename):
return
enforce_audio_file_size_limit(os.path.getsize(file_path))
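# The helpers imported from application.stt.upload_limits are not shown in this
# diff; their assumed shape (the actual byte limit lives in that module):
#
#     def is_audio_filename(filename: str) -> bool:
#         return Path(filename).suffix.lower() in SUPPORTED_AUDIO_EXTENSIONS
#
#     def enforce_audio_file_size_limit(size_bytes: int) -> None:
#         if size_bytes > MAX_AUDIO_UPLOAD_BYTES:  # e.g. an N-megabyte cap
#             raise AudioFileTooLargeError(build_stt_file_size_limit_message())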
@sources_upload_ns.route("/upload")
class UploadFile(Resource):
@api.expect(
@@ -78,6 +91,7 @@ class UploadFile(Resource):
with tempfile.TemporaryDirectory() as temp_dir:
temp_file_path = os.path.join(temp_dir, safe_file)
file.save(temp_file_path)
_enforce_audio_path_size_limit(temp_file_path, safe_file)
# Only extract actual .zip files, not Office formats (.docx, .xlsx, .pptx)
# which are technically zip archives but should be processed as-is
@@ -102,6 +116,10 @@ class UploadFile(Resource):
os.path.join(root, extracted_file), temp_dir
)
storage_path = f"{base_path}/{rel_path}"
_enforce_audio_path_size_limit(
os.path.join(root, extracted_file),
extracted_file,
)
with open(
os.path.join(root, extracted_file), "rb"
@@ -124,29 +142,23 @@ class UploadFile(Resource):
storage.save_file(f, file_path)
task = ingest.delay(
settings.UPLOAD_FOLDER,
[
".rst",
".md",
".pdf",
".txt",
".docx",
".csv",
".epub",
".html",
".mdx",
".json",
".xlsx",
".pptx",
".png",
".jpg",
".jpeg",
],
list(SUPPORTED_SOURCE_EXTENSIONS),
job_name,
user,
file_path=base_path,
filename=dir_name,
file_name_map=file_name_map,
)
except AudioFileTooLargeError:
return make_response(
jsonify(
{
"success": False,
"message": build_stt_file_size_limit_message(),
}
),
413,
)
except Exception as err:
current_app.logger.error(f"Error uploading file: {err}", exc_info=True)
return make_response(jsonify({"success": False}), 400)
@@ -451,6 +463,16 @@ class ManageSourceFiles(Resource):
removed_files = []
map_updated = False
for file_path in file_paths:
if ".." in str(file_path) or str(file_path).startswith("/"):
return make_response(
jsonify(
{
"success": False,
"message": "Invalid file path",
}
),
400,
)
full_path = f"{source_file_path}/{file_path}"
# Remove from storage

View File

@@ -134,6 +134,12 @@ def setup_periodic_tasks(sender, **kwargs):
timedelta(days=30),
schedule_syncs.s("monthly"),
)
# Replaces Mongo's TTL index on pending_tool_state.expires_at.
sender.add_periodic_task(
timedelta(seconds=60),
cleanup_pending_tool_state.s(),
name="cleanup-pending-tool-state",
)
@celery.task(bind=True)
@@ -146,3 +152,27 @@ def mcp_oauth_task(self, config, user):
def mcp_oauth_status_task(self, task_id):
resp = mcp_oauth_status(self, task_id)
return resp
@celery.task(bind=True)
def cleanup_pending_tool_state(self):
"""Delete pending_tool_state rows past their TTL.
Replaces Mongo's ``expireAfterSeconds=0`` TTL index — Postgres has
no native TTL, so this task runs every 60 seconds to keep
``pending_tool_state`` bounded. No-ops if ``POSTGRES_URI`` isn't
configured (keeps the task runnable in Mongo-only environments).
"""
from application.core.settings import settings
if not settings.POSTGRES_URI:
return {"deleted": 0, "skipped": "POSTGRES_URI not set"}
from application.storage.db.engine import get_engine
from application.storage.db.repositories.pending_tool_state import (
PendingToolStateRepository,
)
engine = get_engine()
with engine.begin() as conn:
deleted = PendingToolStateRepository(conn).cleanup_expired()
return {"deleted": deleted}

View File

@@ -1,21 +1,83 @@
"""Tool management MCP server integration."""
import json
from urllib.parse import unquote, urlencode
from urllib.parse import urlencode, urlparse
from bson.objectid import ObjectId
from flask import current_app, jsonify, make_response, redirect, request
from flask_restx import fields, Namespace, Resource
from flask_restx import Namespace, Resource, fields
from application.agents.tools.mcp_tool import MCPOAuthManager, MCPTool
from application.api import api
from application.api.user.base import user_tools_collection
from application.api.user.tools.routes import transform_actions
from application.cache import get_redis_instance
from application.security.encryption import encrypt_credentials
from application.core.mongo_db import MongoDB
from application.core.settings import settings
from application.core.url_validation import SSRFError, validate_url
from application.security.encryption import decrypt_credentials, encrypt_credentials
from application.utils import check_required_fields
tools_mcp_ns = Namespace("tools", description="Tool management operations", path="/api")
_mongo = MongoDB.get_client()
_db = _mongo[settings.MONGO_DB_NAME]
_connector_sessions = _db["connector_sessions"]
_ALLOWED_TRANSPORTS = {"auto", "sse", "http"}
def _sanitize_mcp_transport(config):
"""Normalise and validate the transport_type field.
Strips ``command`` / ``args`` keys that are only valid for local STDIO
transports and returns the cleaned transport type string.
"""
transport_type = (config.get("transport_type") or "auto").lower()
if transport_type not in _ALLOWED_TRANSPORTS:
raise ValueError(f"Unsupported transport_type: {transport_type}")
config.pop("command", None)
config.pop("args", None)
config["transport_type"] = transport_type
return transport_type
def _extract_auth_credentials(config):
"""Build an ``auth_credentials`` dict from the raw MCP config."""
auth_credentials = {}
auth_type = config.get("auth_type", "none")
if auth_type == "api_key":
if config.get("api_key"):
auth_credentials["api_key"] = config["api_key"]
if config.get("api_key_header"):
auth_credentials["api_key_header"] = config["api_key_header"]
elif auth_type == "bearer":
if config.get("bearer_token"):
auth_credentials["bearer_token"] = config["bearer_token"]
elif auth_type == "basic":
if config.get("username"):
auth_credentials["username"] = config["username"]
if config.get("password"):
auth_credentials["password"] = config["password"]
return auth_credentials
def _validate_mcp_server_url(config: dict) -> None:
"""Validate the server_url in an MCP config to prevent SSRF.
Raises:
ValueError: If the URL is missing or points to a blocked address.
"""
server_url = (config.get("server_url") or "").strip()
if not server_url:
raise ValueError("server_url is required")
try:
validate_url(server_url)
except SSRFError as exc:
raise ValueError(f"Invalid server URL: {exc}") from exc
@tools_mcp_ns.route("/mcp_server/test")
class TestMCPServerConfig(Resource):
@@ -43,49 +105,63 @@ class TestMCPServerConfig(Resource):
return missing_fields
try:
config = data["config"]
transport_type = (config.get("transport_type") or "auto").lower()
allowed_transports = {"auto", "sse", "http"}
if transport_type not in allowed_transports:
try:
_sanitize_mcp_transport(config)
except ValueError:
return make_response(
jsonify({"success": False, "error": "Unsupported transport_type"}),
400,
)
config.pop("command", None)
config.pop("args", None)
config["transport_type"] = transport_type
auth_credentials = {}
auth_type = config.get("auth_type", "none")
_validate_mcp_server_url(config)
if auth_type == "api_key" and "api_key" in config:
auth_credentials["api_key"] = config["api_key"]
if "api_key_header" in config:
auth_credentials["api_key_header"] = config["api_key_header"]
elif auth_type == "bearer" and "bearer_token" in config:
auth_credentials["bearer_token"] = config["bearer_token"]
elif auth_type == "basic":
if "username" in config:
auth_credentials["username"] = config["username"]
if "password" in config:
auth_credentials["password"] = config["password"]
auth_credentials = _extract_auth_credentials(config)
test_config = config.copy()
test_config["auth_credentials"] = auth_credentials
mcp_tool = MCPTool(config=test_config, user_id=user)
result = mcp_tool.test_connection()
# Sanitize the response to avoid exposing internal error details
if not result.get("success") and "message" in result:
current_app.logger.error(f"MCP connection test failed: {result.get('message')}")
result["message"] = "Connection test failed"
if result.get("requires_oauth"):
safe_result = {
k: v
for k, v in result.items()
if k in ("success", "requires_oauth", "auth_url")
}
return make_response(jsonify(safe_result), 200)
return make_response(jsonify(result), 200)
if not result.get("success"):
current_app.logger.error(
f"MCP connection test failed: {result.get('message')}"
)
return make_response(
jsonify(
{
"success": False,
"message": "Connection test failed",
"tools_count": 0,
}
),
200,
)
safe_result = {
"success": True,
"message": result.get("message", "Connection successful"),
"tools_count": result.get("tools_count", 0),
"tools": result.get("tools", []),
}
return make_response(jsonify(safe_result), 200)
except ValueError as e:
current_app.logger.warning(f"Invalid MCP server test request: {e}")
return make_response(
jsonify({"success": False, "error": "Invalid MCP server configuration"}),
400,
)
except Exception as e:
current_app.logger.error(f"Error testing MCP server: {e}", exc_info=True)
return make_response(
jsonify(
{"success": False, "error": "Connection test failed"}
),
jsonify({"success": False, "error": "Connection test failed"}),
500,
)
@@ -125,32 +201,18 @@ class MCPServerSave(Resource):
return missing_fields
try:
config = data["config"]
transport_type = (config.get("transport_type") or "auto").lower()
allowed_transports = {"auto", "sse", "http"}
if transport_type not in allowed_transports:
try:
_sanitize_mcp_transport(config)
except ValueError:
return make_response(
jsonify({"success": False, "error": "Unsupported transport_type"}),
400,
)
config.pop("command", None)
config.pop("args", None)
config["transport_type"] = transport_type
auth_credentials = {}
_validate_mcp_server_url(config)
auth_credentials = _extract_auth_credentials(config)
auth_type = config.get("auth_type", "none")
if auth_type == "api_key":
if "api_key" in config and config["api_key"]:
auth_credentials["api_key"] = config["api_key"]
if "api_key_header" in config:
auth_credentials["api_key_header"] = config["api_key_header"]
elif auth_type == "bearer":
if "bearer_token" in config and config["bearer_token"]:
auth_credentials["bearer_token"] = config["bearer_token"]
elif auth_type == "basic":
if "username" in config and config["username"]:
auth_credentials["username"] = config["username"]
if "password" in config and config["password"]:
auth_credentials["password"] = config["password"]
mcp_config = config.copy()
mcp_config["auth_credentials"] = auth_credentials
@@ -188,30 +250,39 @@ class MCPServerSave(Resource):
"No valid credentials provided for the selected authentication type"
)
storage_config = config.copy()
tool_id = data.get("id")
existing_encrypted = None
if tool_id:
existing_doc = user_tools_collection.find_one(
{"_id": ObjectId(tool_id), "user": user, "name": "mcp_tool"}
)
if existing_doc:
existing_encrypted = existing_doc.get("config", {}).get(
"encrypted_credentials"
)
if auth_credentials:
-encrypted_credentials_string = encrypt_credentials(
+if existing_encrypted:
+existing_secrets = decrypt_credentials(existing_encrypted, user)
+existing_secrets.update(auth_credentials)
+auth_credentials = existing_secrets
+storage_config["encrypted_credentials"] = encrypt_credentials(
auth_credentials, user
)
-storage_config["encrypted_credentials"] = encrypted_credentials_string
+elif existing_encrypted:
+storage_config["encrypted_credentials"] = existing_encrypted
for field in [
"api_key",
"bearer_token",
"username",
"password",
"api_key_header",
"redirect_uri",
]:
storage_config.pop(field, None)
transformed_actions = []
for action in actions_metadata:
action["active"] = True
if "parameters" in action:
if "properties" in action["parameters"]:
for param_name, param_details in action["parameters"][
"properties"
].items():
param_details["filled_by_llm"] = True
param_details["value"] = ""
transformed_actions.append(action)
transformed_actions = transform_actions(actions_metadata)
tool_data = {
"name": "mcp_tool",
"displayName": data["displayName"],
@@ -223,7 +294,6 @@ class MCPServerSave(Resource):
"user": user,
}
tool_id = data.get("id")
if tool_id:
result = user_tools_collection.update_one(
{"_id": ObjectId(tool_id), "user": user, "name": "mcp_tool"},
@@ -255,12 +325,16 @@ class MCPServerSave(Resource):
"tools_count": len(transformed_actions),
}
return make_response(jsonify(response_data), 200)
except ValueError as e:
current_app.logger.warning(f"Invalid MCP server save request: {e}")
return make_response(
jsonify({"success": False, "error": "Invalid MCP server configuration"}),
400,
)
except Exception as e:
current_app.logger.error(f"Error saving MCP server: {e}", exc_info=True)
return make_response(
jsonify(
{"success": False, "error": "Failed to save MCP server"}
),
jsonify({"success": False, "error": "Failed to save MCP server"}),
500,
)
@@ -291,7 +365,7 @@ class MCPOAuthCallback(Resource):
params = {
"status": "error",
"message": f"OAuth error: {error}. Please try again and make sure to grant all requested permissions, including offline access.",
"provider": "mcp_tool"
"provider": "mcp_tool",
}
return redirect(f"/api/connectors/callback-status?{urlencode(params)}")
if not code or not state:
@@ -304,7 +378,6 @@ class MCPOAuthCallback(Resource):
return redirect(
"/api/connectors/callback-status?status=error&message=Internal+server+error:+Redis+not+available.&provider=mcp_tool"
)
code = unquote(code)
manager = MCPOAuthManager(redis_client)
success = manager.handle_oauth_callback(state, code, error)
if success:
@@ -327,10 +400,6 @@ class MCPOAuthCallback(Resource):
@tools_mcp_ns.route("/mcp_server/oauth_status/<string:task_id>")
class MCPOAuthStatus(Resource):
def get(self, task_id):
"""
Get current status of OAuth flow.
Frontend should poll this endpoint periodically.
"""
try:
redis_client = get_redis_instance()
status_key = f"mcp_oauth_status:{task_id}"
@@ -338,6 +407,14 @@ class MCPOAuthStatus(Resource):
if status_data:
status = json.loads(status_data)
if "tools" in status and isinstance(status["tools"], list):
status["tools"] = [
{
"name": t.get("name", "unknown"),
"description": t.get("description", ""),
}
for t in status["tools"]
]
return make_response(
jsonify({"success": True, "task_id": task_id, **status})
)
@@ -345,17 +422,93 @@ class MCPOAuthStatus(Resource):
return make_response(
jsonify(
{
"success": False,
"error": "Task not found or expired",
"success": True,
"task_id": task_id,
"status": "pending",
"message": "Waiting for OAuth to start...",
}
),
404,
200,
)
except Exception as e:
current_app.logger.error(
f"Error getting OAuth status for task {task_id}: {str(e)}", exc_info=True
f"Error getting OAuth status for task {task_id}: {str(e)}",
exc_info=True,
)
return make_response(
jsonify({"success": False, "error": "Failed to get OAuth status", "task_id": task_id}), 500
jsonify(
{
"success": False,
"error": "Failed to get OAuth status",
"task_id": task_id,
}
),
500,
)
@tools_mcp_ns.route("/mcp_server/auth_status")
class MCPAuthStatus(Resource):
@api.doc(
description="Batch check auth status for all MCP tools. "
"Lightweight DB-only check — no network calls to MCP servers."
)
def get(self):
decoded_token = request.decoded_token
if not decoded_token:
return make_response(jsonify({"success": False}), 401)
user = decoded_token.get("sub")
try:
mcp_tools = list(
user_tools_collection.find(
{"user": user, "name": "mcp_tool"},
{"_id": 1, "config": 1},
)
)
if not mcp_tools:
return make_response(jsonify({"success": True, "statuses": {}}), 200)
oauth_server_urls = {}
statuses = {}
for tool in mcp_tools:
tool_id = str(tool["_id"])
config = tool.get("config", {})
auth_type = config.get("auth_type", "none")
if auth_type == "oauth":
server_url = config.get("server_url", "")
if server_url:
parsed = urlparse(server_url)
base_url = f"{parsed.scheme}://{parsed.netloc}"
oauth_server_urls[tool_id] = base_url
else:
statuses[tool_id] = "needs_auth"
else:
statuses[tool_id] = "configured"
if oauth_server_urls:
unique_urls = list(set(oauth_server_urls.values()))
sessions = list(
_connector_sessions.find(
{"user_id": user, "server_url": {"$in": unique_urls}},
{"server_url": 1, "tokens": 1},
)
)
url_has_tokens = {
doc["server_url"]: bool(doc.get("tokens", {}).get("access_token"))
for doc in sessions
}
for tool_id, base_url in oauth_server_urls.items():
if url_has_tokens.get(base_url):
statuses[tool_id] = "connected"
else:
statuses[tool_id] = "needs_auth"
return make_response(jsonify({"success": True, "statuses": statuses}), 200)
except Exception as e:
current_app.logger.error(
"Error checking MCP auth status: %s", e, exc_info=True
)
return make_response(
jsonify({"success": False, "error": "Failed to check auth status"}),
500,
)

View File

@@ -8,6 +8,9 @@ from application.agents.tools.spec_parser import parse_spec
from application.agents.tools.tool_manager import ToolManager
from application.api import api
from application.api.user.base import user_tools_collection
from application.core.url_validation import SSRFError, validate_url
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.user_tools import UserToolsRepository
from application.security.encryption import decrypt_credentials, encrypt_credentials
from application.utils import check_required_fields, validate_function_name
@@ -15,6 +18,114 @@ tool_config = {}
tool_manager = ToolManager(config=tool_config)
def _encrypt_secret_fields(config, config_requirements, user_id):
secret_keys = [
key for key, spec in config_requirements.items()
if spec.get("secret") and key in config and config[key]
]
if not secret_keys:
return config
storage_config = config.copy()
secret_values = {k: config[k] for k in secret_keys}
storage_config["encrypted_credentials"] = encrypt_credentials(secret_values, user_id)
for key in secret_keys:
storage_config.pop(key, None)
return storage_config
def _validate_config(config, config_requirements, has_existing_secrets=False):
errors = {}
for key, spec in config_requirements.items():
depends_on = spec.get("depends_on")
if depends_on:
if not all(config.get(dk) == dv for dk, dv in depends_on.items()):
continue
if spec.get("required") and not config.get(key):
if has_existing_secrets and spec.get("secret"):
continue
errors[key] = f"{spec.get('label', key)} is required"
value = config.get(key)
if value is not None and value != "":
if spec.get("type") == "number":
try:
num = float(value)
if key == "timeout" and (num < 1 or num > 300):
errors[key] = "Timeout must be between 1 and 300"
except (ValueError, TypeError):
errors[key] = f"{spec.get('label', key)} must be a number"
if spec.get("enum") and value not in spec["enum"]:
errors[key] = f"Invalid value for {spec.get('label', key)}"
return errors
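# Worked example for the validator above (the requirements spec is illustrative):
_example_requirements = {
    "api_key": {"label": "API key", "required": True, "secret": True},
    "timeout": {"label": "Timeout", "type": "number"},
}
# _validate_config({"timeout": "900"}, _example_requirements)
#     -> {"api_key": "API key is required", "timeout": "Timeout must be between 1 and 300"}
# _validate_config({"timeout": "900"}, _example_requirements, has_existing_secrets=True)
#     -> {"timeout": "Timeout must be between 1 and 300"}   (omitted secret tolerated on update)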
def _merge_secrets_on_update(new_config, existing_config, config_requirements, user_id):
"""Merge incoming config with existing encrypted secrets and re-encrypt.
For updates, the client may omit unchanged secret values. This helper
decrypts any previously stored secrets, overlays whatever the client *did*
send, strips plain-text secrets from the stored config, and re-encrypts
the merged result.
Returns the final ``config`` dict ready for persistence.
"""
secret_keys = [
key for key, spec in config_requirements.items()
if spec.get("secret")
]
if not secret_keys:
return new_config
existing_secrets = {}
if "encrypted_credentials" in existing_config:
existing_secrets = decrypt_credentials(
existing_config["encrypted_credentials"], user_id
)
merged_secrets = existing_secrets.copy()
for key in secret_keys:
if key in new_config and new_config[key]:
merged_secrets[key] = new_config[key]
# Start from existing non-secret values, then overlay incoming non-secrets
storage_config = {
k: v for k, v in existing_config.items()
if k not in secret_keys and k != "encrypted_credentials"
}
storage_config.update(
{k: v for k, v in new_config.items() if k not in secret_keys}
)
if merged_secrets:
storage_config["encrypted_credentials"] = encrypt_credentials(
merged_secrets, user_id
)
else:
storage_config.pop("encrypted_credentials", None)
storage_config.pop("has_encrypted_credentials", None)
return storage_config
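# Illustrative sketch (hypothetical values): if the stored config already holds
# encrypted credentials for {"api_key": "old-key"} and the client updates only
# {"auth_type": "api_key", "timeout": 30} (omitting the secret), then
#   _merge_secrets_on_update(
#       {"auth_type": "api_key", "timeout": 30},
#       {"auth_type": "api_key", "encrypted_credentials": "<ciphertext>"},
#       {"api_key": {"secret": True}, "auth_type": {}, "timeout": {"type": "number"}},
#       "user-123",
#   )
# keeps the old api_key, re-encrypts it under "encrypted_credentials", and
# never stores the plain-text value alongside the non-secret fields.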
def transform_actions(actions_metadata):
"""Set default flags on action metadata for storage.
Marks each action as active, sets ``filled_by_llm`` and ``value`` on every
parameter property. Used by both the generic create_tool and MCP save routes.
"""
transformed = []
for action in actions_metadata:
action["active"] = True
if "parameters" in action:
props = action["parameters"].get("properties", {})
for param_details in props.values():
param_details["filled_by_llm"] = True
param_details["value"] = ""
transformed.append(action)
return transformed
tools_ns = Namespace("tools", description="Tool management operations", path="/api")
@@ -22,6 +133,8 @@ tools_ns = Namespace("tools", description="Tool management operations", path="/a
class AvailableTools(Resource):
@api.doc(description="Get available tools for a user")
def get(self):
if not request.decoded_token:
return make_response(jsonify({"success": False}), 401)
try:
tools_metadata = []
for tool_name, tool_instance in tool_manager.tools.items():
@@ -29,12 +142,15 @@ class AvailableTools(Resource):
lines = doc.split("\n", 1)
name = lines[0].strip()
description = lines[1].strip() if len(lines) > 1 else ""
config_req = tool_instance.get_config_requirements()
actions = tool_instance.get_actions_metadata()
tools_metadata.append(
{
"name": tool_name,
"displayName": name,
"description": description,
"configRequirements": tool_instance.get_config_requirements(),
"configRequirements": config_req,
"actions": actions,
}
)
except Exception as err:
@@ -60,6 +176,21 @@ class GetTools(Resource):
tool_copy = {**tool}
tool_copy["id"] = str(tool["_id"])
tool_copy.pop("_id", None)
config_req = tool_copy.get("configRequirements", {})
if not config_req:
tool_instance = tool_manager.tools.get(tool_copy.get("name"))
if tool_instance:
config_req = tool_instance.get_config_requirements()
tool_copy["configRequirements"] = config_req
has_secrets = any(
spec.get("secret") for spec in config_req.values()
) if config_req else False
if has_secrets and "encrypted_credentials" in tool_copy.get("config", {}):
tool_copy["config"]["has_encrypted_credentials"] = True
tool_copy["config"].pop("encrypted_credentials", None)
user_tools.append(tool_copy)
except Exception as err:
current_app.logger.error(f"Error getting user tools: {err}", exc_info=True)
@@ -110,29 +241,48 @@ class CreateTool(Resource):
if missing_fields:
return missing_fields
try:
if data["name"] == "mcp_tool":
server_url = (data.get("config", {}).get("server_url") or "").strip()
if server_url:
try:
validate_url(server_url)
except SSRFError:
return make_response(
jsonify({"success": False, "message": "Invalid server URL"}),
400,
)
tool_instance = tool_manager.tools.get(data["name"])
if not tool_instance:
return make_response(
jsonify({"success": False, "message": "Tool not found"}), 404
)
actions_metadata = tool_instance.get_actions_metadata()
transformed_actions = []
for action in actions_metadata:
action["active"] = True
if "parameters" in action:
if "properties" in action["parameters"]:
for param_name, param_details in action["parameters"][
"properties"
].items():
param_details["filled_by_llm"] = True
param_details["value"] = ""
transformed_actions.append(action)
transformed_actions = transform_actions(actions_metadata)
except Exception as err:
current_app.logger.error(
f"Error getting tool actions: {err}", exc_info=True
)
return make_response(jsonify({"success": False}), 400)
try:
config_requirements = tool_instance.get_config_requirements()
if config_requirements:
validation_errors = _validate_config(
data["config"], config_requirements
)
if validation_errors:
return make_response(
jsonify(
{
"success": False,
"message": "Validation failed",
"errors": validation_errors,
}
),
400,
)
storage_config = _encrypt_secret_fields(
data["config"], config_requirements, user
)
new_tool = {
"user": user,
"name": data["name"],
@@ -140,11 +290,19 @@ class CreateTool(Resource):
"description": data["description"],
"customName": data.get("customName", ""),
"actions": transformed_actions,
"config": data["config"],
"config": storage_config,
"configRequirements": config_requirements,
"status": data["status"],
}
resp = user_tools_collection.insert_one(new_tool)
new_id = str(resp.inserted_id)
dual_write(
UserToolsRepository,
lambda repo, u=user, t=new_tool: repo.create(
u, t["name"], config=t.get("config"),
custom_name=t.get("customName"), display_name=t.get("displayName"),
),
)
except Exception as err:
current_app.logger.error(f"Error creating tool: {err}", exc_info=True)
return make_response(jsonify({"success": False}), 400)
@@ -210,57 +368,37 @@ class UpdateTool(Resource):
tool_doc = user_tools_collection.find_one(
{"_id": ObjectId(data["id"]), "user": user}
)
if tool_doc and tool_doc.get("name") == "mcp_tool":
config = data["config"]
existing_config = tool_doc.get("config", {})
storage_config = existing_config.copy()
if not tool_doc:
return make_response(
jsonify({"success": False, "message": "Tool not found"}),
404,
)
tool_name = tool_doc.get("name", data.get("name"))
tool_instance = tool_manager.tools.get(tool_name)
config_requirements = (
tool_instance.get_config_requirements() if tool_instance else {}
)
existing_config = tool_doc.get("config", {})
has_existing_secrets = "encrypted_credentials" in existing_config
storage_config.update(config)
existing_credentials = {}
if "encrypted_credentials" in existing_config:
existing_credentials = decrypt_credentials(
existing_config["encrypted_credentials"], user
if config_requirements:
validation_errors = _validate_config(
data["config"], config_requirements,
has_existing_secrets=has_existing_secrets,
)
if validation_errors:
return make_response(
jsonify({
"success": False,
"message": "Validation failed",
"errors": validation_errors,
}),
400,
)
auth_credentials = existing_credentials.copy()
auth_type = storage_config.get("auth_type", "none")
if auth_type == "api_key":
if "api_key" in config and config["api_key"]:
auth_credentials["api_key"] = config["api_key"]
if "api_key_header" in config:
auth_credentials["api_key_header"] = config[
"api_key_header"
]
elif auth_type == "bearer":
if "bearer_token" in config and config["bearer_token"]:
auth_credentials["bearer_token"] = config["bearer_token"]
elif "encrypted_token" in config and config["encrypted_token"]:
auth_credentials["bearer_token"] = config["encrypted_token"]
elif auth_type == "basic":
if "username" in config and config["username"]:
auth_credentials["username"] = config["username"]
if "password" in config and config["password"]:
auth_credentials["password"] = config["password"]
if auth_type != "none" and auth_credentials:
encrypted_credentials_string = encrypt_credentials(
auth_credentials, user
)
storage_config["encrypted_credentials"] = (
encrypted_credentials_string
)
elif auth_type == "none":
storage_config.pop("encrypted_credentials", None)
for field in [
"api_key",
"bearer_token",
"encrypted_token",
"username",
"password",
"api_key_header",
]:
storage_config.pop(field, None)
update_data["config"] = storage_config
else:
update_data["config"] = data["config"]
update_data["config"] = _merge_secrets_on_update(
data["config"], existing_config, config_requirements, user
)
if "status" in data:
update_data["status"] = data["status"]
user_tools_collection.update_one(
@@ -298,9 +436,52 @@ class UpdateToolConfig(Resource):
if missing_fields:
return missing_fields
try:
tool_doc = user_tools_collection.find_one(
{"_id": ObjectId(data["id"]), "user": user}
)
if not tool_doc:
return make_response(jsonify({"success": False}), 404)
tool_name = tool_doc.get("name")
if tool_name == "mcp_tool":
server_url = (data["config"].get("server_url") or "").strip()
if server_url:
try:
validate_url(server_url)
except SSRFError:
return make_response(
jsonify({"success": False, "message": "Invalid server URL"}),
400,
)
tool_instance = tool_manager.tools.get(tool_name)
config_requirements = (
tool_instance.get_config_requirements() if tool_instance else {}
)
existing_config = tool_doc.get("config", {})
has_existing_secrets = "encrypted_credentials" in existing_config
if config_requirements:
validation_errors = _validate_config(
data["config"], config_requirements,
has_existing_secrets=has_existing_secrets,
)
if validation_errors:
return make_response(
jsonify({
"success": False,
"message": "Validation failed",
"errors": validation_errors,
}),
400,
)
final_config = _merge_secrets_on_update(
data["config"], existing_config, config_requirements, user
)
user_tools_collection.update_one(
{"_id": ObjectId(data["id"]), "user": user},
{"$set": {"config": data["config"]}},
{"$set": {"config": final_config}},
)
except Exception as err:
current_app.logger.error(
@@ -409,12 +590,18 @@ class DeleteTool(Resource):
result = user_tools_collection.delete_one(
{"_id": ObjectId(data["id"]), "user": user}
)
dual_write(
UserToolsRepository,
lambda repo, tid=data["id"], u=user: repo.delete(tid, u),
)
if result.deleted_count == 0:
return {"success": False, "message": "Tool not found"}, 404
return make_response(
jsonify({"success": False, "message": "Tool not found"}), 404
)
except Exception as err:
current_app.logger.error(f"Error deleting tool: {err}", exc_info=True)
return {"success": False}, 400
return {"success": True}, 200
return make_response(jsonify({"success": False}), 400)
return make_response(jsonify({"success": True}), 200)
@tools_ns.route("/parse_spec")
@@ -511,7 +698,6 @@ class GetArtifact(Resource):
todo_doc = db["todos"].find_one({"_id": obj_id, "user_id": user_id})
if todo_doc:
tool_id = todo_doc.get("tool_id")
# Return all todos for the tool
query = {"user_id": user_id, "tool_id": tool_id}
all_todos = list(db["todos"].find(query))
items = []

View File

@@ -5,7 +5,14 @@ from typing import Any, Callable, Dict, List, Optional, Tuple
from bson.errors import InvalidId
from bson.objectid import ObjectId
from flask import jsonify, make_response, request, Response
from flask import (
Response,
current_app,
has_app_context,
jsonify,
make_response,
request,
)
from pymongo.collection import Collection
@@ -319,8 +326,10 @@ def safe_db_operation(
try:
result = operation()
return result, None
except Exception as e:
return None, error_response(f"{error_message}: {str(e)}")
except Exception as err:
if has_app_context():
current_app.logger.error(f"{error_message}: {err}", exc_info=True)
return None, error_response(error_message)
def validate_enum(

View File

@@ -11,6 +11,10 @@ from application.api.user.base import (
workflow_nodes_collection,
workflows_collection,
)
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.workflow_edges import WorkflowEdgesRepository
from application.storage.db.repositories.workflow_nodes import WorkflowNodesRepository
from application.storage.db.repositories.workflows import WorkflowsRepository
from application.core.json_schema_utils import (
JsonSchemaValidationError,
normalize_json_schema_payload,
@@ -30,6 +34,179 @@ from application.api.user.utils import (
workflows_ns = Namespace("workflows", path="/api")
def _workflow_error_response(message: str, err: Exception):
current_app.logger.error(f"{message}: {err}", exc_info=True)
return error_response(message)
# ---------------------------------------------------------------------------
# Postgres dual-write helpers
#
# Workflows are unusual relative to other Phase 3 tables: a single user
# action (create / update) writes to three collections in concert
# (workflows + workflow_nodes + workflow_edges) and the edges reference
# nodes by user-provided string ids. The Postgres mirror needs to:
#
# 1. Run all three writes inside one PG transaction (so the just-created
# nodes are visible when we resolve their UUIDs for the edge insert).
# 2. Translate edge source_id/target_id strings → workflow_nodes.id UUIDs
# after the bulk_create returns them.
#
# Each helper opens exactly one ``dual_write`` call (one PG txn) and uses
# the connection from whichever repo it was instantiated with to spin up
# any sibling repos it needs.
# ---------------------------------------------------------------------------
def _dual_write_workflow_create(
mongo_workflow_id: str,
user_id: str,
name: str,
description: str,
nodes_data: List[Dict],
edges_data: List[Dict],
graph_version: int = 1,
) -> None:
"""Mirror a Mongo workflow create into Postgres."""
def _do(repo: WorkflowsRepository) -> None:
conn = repo._conn
wf = repo.create(
user_id,
name,
description=description,
legacy_mongo_id=mongo_workflow_id,
)
_write_graph(conn, wf["id"], graph_version, nodes_data, edges_data)
dual_write(WorkflowsRepository, _do)
def _dual_write_workflow_update(
mongo_workflow_id: str,
user_id: str,
name: str,
description: str,
nodes_data: List[Dict],
edges_data: List[Dict],
next_graph_version: int,
) -> None:
"""Mirror a Mongo workflow update into Postgres.
Mirrors the Mongo route: insert the new graph_version's nodes/edges,
bump the workflow's name/description/current_graph_version, then drop
every other graph_version's nodes/edges.
"""
def _do(repo: WorkflowsRepository) -> None:
conn = repo._conn
wf = _resolve_pg_workflow(conn, mongo_workflow_id)
if wf is None:
return
_write_graph(conn, wf["id"], next_graph_version, nodes_data, edges_data)
repo.update(wf["id"], user_id, {
"name": name,
"description": description,
"current_graph_version": next_graph_version,
})
WorkflowNodesRepository(conn).delete_other_versions(
wf["id"], next_graph_version,
)
WorkflowEdgesRepository(conn).delete_other_versions(
wf["id"], next_graph_version,
)
dual_write(WorkflowsRepository, _do)
def _dual_write_workflow_delete(mongo_workflow_id: str, user_id: str) -> None:
"""Mirror a Mongo workflow delete into Postgres.
The CASCADE on workflows.id → workflow_nodes/workflow_edges takes
care of the children automatically.
"""
def _do(repo: WorkflowsRepository) -> None:
wf = _resolve_pg_workflow(repo._conn, mongo_workflow_id)
if wf is not None:
repo.delete(wf["id"], user_id)
dual_write(WorkflowsRepository, _do)
def _resolve_pg_workflow(conn, mongo_workflow_id: str) -> Optional[Dict]:
"""Look up a Postgres workflow by its Mongo ObjectId string."""
from sqlalchemy import text as _text
row = conn.execute(
_text("SELECT id FROM workflows WHERE legacy_mongo_id = :legacy_id"),
{"legacy_id": mongo_workflow_id},
).fetchone()
return {"id": str(row[0])} if row else None
def _write_graph(
conn,
pg_workflow_id: str,
graph_version: int,
nodes_data: List[Dict],
edges_data: List[Dict],
) -> None:
"""Bulk-create nodes + edges for one graph version inside one txn.
Edges arrive with source/target as user-provided node-id strings
(the same shape the Mongo route stores). We bulk-insert nodes first,
capture their ``node_id → UUID`` map from the returned rows, then
translate edge source/target strings to those UUIDs before the edge
bulk insert. Edges referencing missing nodes are dropped (logged).
"""
nodes_repo = WorkflowNodesRepository(conn)
edges_repo = WorkflowEdgesRepository(conn)
if nodes_data:
created_nodes = nodes_repo.bulk_create(
pg_workflow_id, graph_version,
[
{
"node_id": n["id"],
"node_type": n["type"],
"title": n.get("title", ""),
"description": n.get("description", ""),
"position": n.get("position", {"x": 0, "y": 0}),
"config": n.get("data", {}),
"legacy_mongo_id": n.get("legacy_mongo_id"),
}
for n in nodes_data
],
)
node_uuid_by_str = {n["node_id"]: n["id"] for n in created_nodes}
else:
node_uuid_by_str = {}
if edges_data:
translated_edges: List[Dict] = []
for e in edges_data:
src = e.get("source")
tgt = e.get("target")
from_uuid = node_uuid_by_str.get(src)
to_uuid = node_uuid_by_str.get(tgt)
if not from_uuid or not to_uuid:
current_app.logger.warning(
"PG dual-write: dropping edge %s; node refs unresolved "
"(source=%s, target=%s)",
e.get("id"), src, tgt,
)
continue
translated_edges.append({
"edge_id": e["id"],
"from_node_id": from_uuid,
"to_node_id": to_uuid,
"source_handle": e.get("sourceHandle"),
"target_handle": e.get("targetHandle"),
})
if translated_edges:
edges_repo.bulk_create(pg_workflow_id, graph_version, translated_edges)
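# Illustrative data shapes (hypothetical ids): nodes_data/edges_data arrive in
# the same form the Mongo route stores, e.g.
#   nodes_data = [{"id": "start-1", "type": "start"}, {"id": "llm-1", "type": "llm"}]
#   edges_data = [{"id": "e1", "source": "start-1", "target": "llm-1"}]
# bulk_create returns rows carrying both the string node_id and the new PG
# UUID, so node_uuid_by_str maps "start-1"/"llm-1" to UUIDs and the edge above
# is inserted with those UUIDs; an edge referencing an id outside that map
# would be dropped and logged instead.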
def serialize_workflow(w: Dict) -> Dict:
"""Serialize workflow document to API response format."""
return {
@@ -312,24 +489,28 @@ def _can_reach_end(
def create_workflow_nodes(
workflow_id: str, nodes_data: List[Dict], graph_version: int
) -> None:
"""Insert workflow nodes into database."""
) -> List[Dict]:
"""Insert workflow nodes into Mongo and return rows with Mongo ids."""
if nodes_data:
workflow_nodes_collection.insert_many(
[
{
"id": n["id"],
"workflow_id": workflow_id,
"graph_version": graph_version,
"type": n["type"],
"title": n.get("title", ""),
"description": n.get("description", ""),
"position": n.get("position", {"x": 0, "y": 0}),
"config": n.get("data", {}),
}
for n in nodes_data
]
)
mongo_nodes = [
{
"id": n["id"],
"workflow_id": workflow_id,
"graph_version": graph_version,
"type": n["type"],
"title": n.get("title", ""),
"description": n.get("description", ""),
"position": n.get("position", {"x": 0, "y": 0}),
"config": n.get("data", {}),
}
for n in nodes_data
]
result = workflow_nodes_collection.insert_many(mongo_nodes)
return [
{**node, "legacy_mongo_id": str(inserted_id)}
for node, inserted_id in zip(nodes_data, result.inserted_ids)
]
return []
def create_workflow_edges(
@@ -394,13 +575,22 @@ class WorkflowList(Resource):
workflow_id = str(result.inserted_id)
try:
create_workflow_nodes(workflow_id, nodes_data, 1)
created_nodes = create_workflow_nodes(workflow_id, nodes_data, 1)
create_workflow_edges(workflow_id, edges_data, 1)
except Exception as e:
except Exception as err:
workflow_nodes_collection.delete_many({"workflow_id": workflow_id})
workflow_edges_collection.delete_many({"workflow_id": workflow_id})
workflows_collection.delete_one({"_id": result.inserted_id})
return error_response(f"Failed to create workflow structure: {str(e)}")
return _workflow_error_response("Failed to create workflow structure", err)
_dual_write_workflow_create(
workflow_id,
user_id,
name,
data.get("description", ""),
created_nodes,
edges_data,
)
return success_response({"id": workflow_id}, 201)
@@ -468,16 +658,18 @@ class WorkflowDetail(Resource):
current_graph_version = get_workflow_graph_version(workflow)
next_graph_version = current_graph_version + 1
try:
create_workflow_nodes(workflow_id, nodes_data, next_graph_version)
created_nodes = create_workflow_nodes(
workflow_id, nodes_data, next_graph_version,
)
create_workflow_edges(workflow_id, edges_data, next_graph_version)
except Exception as e:
except Exception as err:
workflow_nodes_collection.delete_many(
{"workflow_id": workflow_id, "graph_version": next_graph_version}
)
workflow_edges_collection.delete_many(
{"workflow_id": workflow_id, "graph_version": next_graph_version}
)
return error_response(f"Failed to update workflow structure: {str(e)}")
return _workflow_error_response("Failed to update workflow structure", err)
now = datetime.now(timezone.utc)
_, error = safe_db_operation(
@@ -515,6 +707,16 @@ class WorkflowDetail(Resource):
f"Failed to clean old workflow graph versions for {workflow_id}: {cleanup_err}"
)
_dual_write_workflow_update(
workflow_id,
user_id,
name,
data.get("description", ""),
created_nodes,
edges_data,
next_graph_version,
)
return success_response()
@require_auth
@@ -535,7 +737,9 @@ class WorkflowDetail(Resource):
workflow_nodes_collection.delete_many({"workflow_id": workflow_id})
workflow_edges_collection.delete_many({"workflow_id": workflow_id})
workflows_collection.delete_one({"_id": workflow["_id"], "user": user_id})
except Exception as e:
return error_response(f"Failed to delete workflow: {str(e)}")
except Exception as err:
return _workflow_error_response("Failed to delete workflow", err)
_dual_write_workflow_delete(workflow_id, user_id)
return success_response()

View File

@@ -0,0 +1,3 @@
from application.api.v1.routes import v1_bp
__all__ = ["v1_bp"]

View File

@@ -0,0 +1,333 @@
"""Standard chat completions API routes.
Exposes ``/v1/chat/completions`` and ``/v1/models`` endpoints that
follow the widely-adopted chat completions protocol so external tools
(opencode, continue, etc.) can connect to DocsGPT agents.
"""
import json
import logging
import time
import traceback
from typing import Any, Dict, Generator, Optional
from flask import Blueprint, jsonify, make_response, request, Response
from application.api.answer.routes.base import BaseAnswerResource
from application.api.answer.services.stream_processor import StreamProcessor
from application.api.v1.translator import (
translate_request,
translate_response,
translate_stream_event,
)
from application.core.mongo_db import MongoDB
from application.core.settings import settings
logger = logging.getLogger(__name__)
v1_bp = Blueprint("v1", __name__, url_prefix="/v1")
def _extract_bearer_token() -> Optional[str]:
"""Extract API key from Authorization: Bearer header."""
auth = request.headers.get("Authorization", "")
if auth.startswith("Bearer "):
return auth[7:].strip()
return None
def _lookup_agent(api_key: str) -> Optional[Dict]:
"""Look up the agent document for this API key."""
try:
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
return db["agents"].find_one({"key": api_key})
except Exception:
logger.warning("Failed to look up agent for API key", exc_info=True)
return None
def _get_model_name(agent: Optional[Dict], api_key: str) -> str:
"""Return agent name for display as model name."""
if agent:
return agent.get("name", api_key)
return api_key
class _V1AnswerHelper(BaseAnswerResource):
"""Thin wrapper to access complete_stream / process_response_stream."""
pass
@v1_bp.route("/chat/completions", methods=["POST"])
def chat_completions():
"""Handle POST /v1/chat/completions."""
api_key = _extract_bearer_token()
if not api_key:
return make_response(
jsonify({"error": {"message": "Missing Authorization header", "type": "auth_error"}}),
401,
)
data = request.get_json()
if not data or not data.get("messages"):
return make_response(
jsonify({"error": {"message": "messages field is required", "type": "invalid_request"}}),
400,
)
is_stream = data.get("stream", False)
agent_doc = _lookup_agent(api_key)
model_name = _get_model_name(agent_doc, api_key)
try:
internal_data = translate_request(data, api_key)
except Exception as e:
logger.error(f"/v1/chat/completions translate error: {e}", exc_info=True)
return make_response(
jsonify({"error": {"message": "Failed to process request", "type": "invalid_request"}}),
400,
)
# Link decoded_token to the agent's owner so continuation state,
# logs, and tool execution use the correct user identity.
agent_user = agent_doc.get("user") if agent_doc else None
decoded_token = {"sub": agent_user or "api_key_user"}
try:
processor = StreamProcessor(internal_data, decoded_token)
if internal_data.get("tool_actions"):
# Continuation mode
conversation_id = internal_data.get("conversation_id")
if not conversation_id:
return make_response(
jsonify({"error": {"message": "conversation_id required for tool continuation", "type": "invalid_request"}}),
400,
)
(
agent,
messages,
tools_dict,
pending_tool_calls,
tool_actions,
) = processor.resume_from_tool_actions(
internal_data["tool_actions"], conversation_id
)
continuation = {
"messages": messages,
"tools_dict": tools_dict,
"pending_tool_calls": pending_tool_calls,
"tool_actions": tool_actions,
}
question = ""
else:
# Normal mode
question = internal_data.get("question", "")
agent = processor.build_agent(question)
continuation = None
if not processor.decoded_token:
return make_response(
jsonify({"error": {"message": "Unauthorized", "type": "auth_error"}}),
401,
)
helper = _V1AnswerHelper()
usage_error = helper.check_usage(processor.agent_config)
if usage_error:
return usage_error
should_save_conversation = bool(internal_data.get("save_conversation", False))
if is_stream:
return Response(
_stream_response(
helper,
question,
agent,
processor,
model_name,
continuation,
should_save_conversation,
),
mimetype="text/event-stream",
headers={
"Cache-Control": "no-cache",
"X-Accel-Buffering": "no",
},
)
else:
return _non_stream_response(
helper,
question,
agent,
processor,
model_name,
continuation,
should_save_conversation,
)
except ValueError as e:
logger.error(
f"/v1/chat/completions error: {e} - {traceback.format_exc()}",
extra={"error": str(e)},
)
return make_response(
jsonify({"error": {"message": "Failed to process request", "type": "invalid_request"}}),
400,
)
except Exception as e:
logger.error(
f"/v1/chat/completions error: {e} - {traceback.format_exc()}",
extra={"error": str(e)},
)
return make_response(
jsonify({"error": {"message": "Internal server error", "type": "server_error"}}),
500,
)
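# Illustrative request body (hypothetical content) an external client might
# POST to /v1/chat/completions with "Authorization: Bearer <agent API key>".
# Note the handler derives the model from the agent behind the API key, so a
# "model" field in the body is not consulted here:
#   {
#       "stream": true,
#       "messages": [{"role": "user", "content": "What does this repo do?"}],
#       "docsgpt": {"save_conversation": false}
#   }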
def _stream_response(
helper: _V1AnswerHelper,
question: str,
agent: Any,
processor: StreamProcessor,
model_name: str,
continuation: Optional[Dict],
should_save_conversation: bool,
) -> Generator[str, None, None]:
"""Generate translated SSE chunks for streaming response."""
completion_id = f"chatcmpl-{int(time.time())}"
internal_stream = helper.complete_stream(
question=question,
agent=agent,
conversation_id=processor.conversation_id,
user_api_key=processor.agent_config.get("user_api_key"),
decoded_token=processor.decoded_token,
agent_id=processor.agent_id,
model_id=processor.model_id,
should_save_conversation=should_save_conversation,
_continuation=continuation,
)
for line in internal_stream:
if not line.strip():
continue
# Parse the internal SSE event
event_str = line.replace("data: ", "").strip()
try:
event_data = json.loads(event_str)
except (json.JSONDecodeError, TypeError):
continue
# Update completion_id when we get the conversation id
if event_data.get("type") == "id":
conv_id = event_data.get("id", "")
if conv_id:
completion_id = f"chatcmpl-{conv_id}"
# Translate to standard format
translated = translate_stream_event(event_data, completion_id, model_name)
for chunk in translated:
yield chunk
def _non_stream_response(
helper: _V1AnswerHelper,
question: str,
agent: Any,
processor: StreamProcessor,
model_name: str,
continuation: Optional[Dict],
should_save_conversation: bool,
) -> Response:
"""Collect full response and return as single JSON."""
stream = helper.complete_stream(
question=question,
agent=agent,
conversation_id=processor.conversation_id,
user_api_key=processor.agent_config.get("user_api_key"),
decoded_token=processor.decoded_token,
agent_id=processor.agent_id,
model_id=processor.model_id,
should_save_conversation=should_save_conversation,
_continuation=continuation,
)
result = helper.process_response_stream(stream)
if result["error"]:
return make_response(
jsonify({"error": {"message": result["error"], "type": "server_error"}}),
500,
)
extra = result.get("extra")
pending = extra.get("pending_tool_calls") if isinstance(extra, dict) else None
response = translate_response(
conversation_id=result["conversation_id"],
answer=result["answer"] or "",
sources=result["sources"],
tool_calls=result["tool_calls"],
thought=result["thought"] or "",
model_name=model_name,
pending_tool_calls=pending,
)
return make_response(jsonify(response), 200)
@v1_bp.route("/models", methods=["GET"])
def list_models():
"""Handle GET /v1/models — return agents as models."""
api_key = _extract_bearer_token()
if not api_key:
return make_response(
jsonify({"error": {"message": "Missing Authorization header", "type": "auth_error"}}),
401,
)
try:
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
agents_collection = db["agents"]
# Find the agent for this api_key
agent = agents_collection.find_one({"key": api_key})
if not agent:
return make_response(
jsonify({"error": {"message": "Invalid API key", "type": "auth_error"}}),
401,
)
user = agent.get("user")
# Return all agents belonging to this user
user_agents = list(agents_collection.find({"user": user}))
models = []
for ag in user_agents:
created = ag.get("createdAt")
created_ts = int(created.timestamp()) if created else int(time.time())
model_id = str(ag.get("_id") or ag.get("id") or "")
models.append({
"id": model_id,
"object": "model",
"created": created_ts,
"owned_by": "docsgpt",
"name": ag.get("name", ""),
"description": ag.get("description", ""),
})
return make_response(
jsonify({"object": "list", "data": models}),
200,
)
except Exception as e:
logger.error(f"/v1/models error: {e}", exc_info=True)
return make_response(
jsonify({"error": {"message": "Internal server error", "type": "server_error"}}),
500,
)

View File

@@ -0,0 +1,433 @@
"""Translate between standard chat completions format and DocsGPT internals.
This module handles:
- Request translation (chat completions -> DocsGPT internal format)
- Response translation (DocsGPT response -> chat completions format)
- Streaming event translation (DocsGPT SSE -> standard SSE chunks)
"""
import json
import time
from typing import Any, Dict, List, Optional
def _get_client_tool_name(tc: Dict) -> str:
"""Return the original tool name for client-facing responses.
For client-side tools the ``tool_name`` field carries the name the
client originally registered. Fall back to ``action_name`` (which
is now the clean LLM-visible name) or ``name``.
"""
return tc.get("tool_name", tc.get("action_name", tc.get("name", "")))
# ---------------------------------------------------------------------------
# Request translation
# ---------------------------------------------------------------------------
def is_continuation(messages: List[Dict]) -> bool:
"""Check if messages represent a tool-call continuation.
A continuation is detected when the last message(s) have ``role: "tool"``
immediately after an assistant message with ``tool_calls``.
"""
if not messages:
return False
# Walk backwards past any trailing tool messages; it's a continuation when
# the message immediately before them is an assistant message with tool_calls.
i = len(messages) - 1
while i >= 0 and messages[i].get("role") == "tool":
i -= 1
if i < 0:
return False
return (
messages[i].get("role") == "assistant"
and bool(messages[i].get("tool_calls"))
)
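# Illustrative example (hypothetical payload): a messages array that ends with
# tool results directly after an assistant tool_calls turn is a continuation.
#   is_continuation([
#       {"role": "user", "content": "What's the weather?"},
#       {"role": "assistant", "content": None,
#        "tool_calls": [{"id": "call_1", "type": "function",
#                        "function": {"name": "get_weather", "arguments": "{}"}}]},
#       {"role": "tool", "tool_call_id": "call_1", "content": "22C, sunny"},
#   ])  # -> True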
def extract_tool_results(messages: List[Dict]) -> List[Dict]:
"""Extract tool results from trailing tool messages for continuation.
Returns a list of ``tool_actions`` dicts with ``call_id`` and ``result``.
"""
results = []
for msg in reversed(messages):
if msg.get("role") != "tool":
break
call_id = msg.get("tool_call_id", "")
content = msg.get("content", "")
if isinstance(content, str):
try:
content = json.loads(content)
except (json.JSONDecodeError, TypeError):
pass
results.append({"call_id": call_id, "result": content})
results.reverse()
return results
def extract_conversation_id(messages: List[Dict]) -> Optional[str]:
"""Try to extract conversation_id from the assistant message before tool results.
The conversation_id may be stored in a custom field on the assistant message
from a previous response cycle.
"""
for msg in reversed(messages):
if msg.get("role") == "assistant":
# Check docsgpt extension
return msg.get("docsgpt", {}).get("conversation_id")
return None
def extract_system_prompt(messages: List[Dict]) -> Optional[str]:
"""Extract the first system message content from the messages array.
Returns None if no system message is present.
"""
for msg in messages:
if msg.get("role") == "system":
return msg.get("content", "")
return None
def convert_history(messages: List[Dict]) -> List[Dict]:
"""Convert chat completions messages array to DocsGPT history format.
DocsGPT history is a list of ``{prompt, response}`` dicts.
Excludes the last user message (that becomes the ``question``).
"""
history = []
i = 0
while i < len(messages):
msg = messages[i]
if msg.get("role") == "system":
i += 1
continue
if msg.get("role") == "user":
# Look ahead for assistant response
if i + 1 < len(messages) and messages[i + 1].get("role") == "assistant":
content = messages[i + 1].get("content") or ""
history.append({
"prompt": msg.get("content", ""),
"response": content,
})
i += 2
continue
# Last user message without response — skip (it's the question)
i += 1
continue
i += 1
return history
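# Illustrative example: paired user/assistant turns become {prompt, response}
# entries; the trailing user message is left out (it becomes the question).
#   convert_history([
#       {"role": "system", "content": "Be brief."},
#       {"role": "user", "content": "Hi"},
#       {"role": "assistant", "content": "Hello!"},
#       {"role": "user", "content": "What is DocsGPT?"},
#   ])  # -> [{"prompt": "Hi", "response": "Hello!"}]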
def translate_request(
data: Dict[str, Any], api_key: str
) -> Dict[str, Any]:
"""Translate a chat completions request to DocsGPT internal format.
Args:
data: The incoming request body.
api_key: Agent API key from the Authorization header.
Returns:
Dict suitable for passing to ``StreamProcessor``.
"""
messages = data.get("messages", [])
# Check for continuation (tool results after assistant tool_calls)
if is_continuation(messages):
tool_actions = extract_tool_results(messages)
conversation_id = extract_conversation_id(messages)
if not conversation_id:
conversation_id = data.get("conversation_id")
result = {
"conversation_id": conversation_id,
"tool_actions": tool_actions,
"api_key": api_key,
}
# Carry tools forward for next iteration
if data.get("tools"):
result["client_tools"] = data["tools"]
return result
# Normal request — extract question from last user message
question = ""
for msg in reversed(messages):
if msg.get("role") == "user":
question = msg.get("content", "")
break
history = convert_history(messages)
system_prompt_override = extract_system_prompt(messages)
docsgpt = data.get("docsgpt", {})
result = {
"question": question,
"api_key": api_key,
"history": json.dumps(history),
# Conversations are NOT persisted by default on the v1 endpoint.
# Callers opt in via ``docsgpt.save_conversation: true``.
"save_conversation": bool(docsgpt.get("save_conversation", False)),
}
if system_prompt_override is not None:
result["system_prompt_override"] = system_prompt_override
# Client tools
if data.get("tools"):
result["client_tools"] = data["tools"]
# DocsGPT extensions
if docsgpt.get("attachments"):
result["attachments"] = docsgpt["attachments"]
return result
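# Illustrative example (hypothetical key): a plain chat completions body with
# the optional DocsGPT extension block translates to the internal shape
# consumed by StreamProcessor.
#   translate_request(
#       {"messages": [{"role": "user", "content": "Summarize my docs"}],
#        "docsgpt": {"save_conversation": True}},
#       "agent-key",
#   )
#   # -> {"question": "Summarize my docs", "api_key": "agent-key",
#   #     "history": "[]", "save_conversation": True}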
# ---------------------------------------------------------------------------
# Response translation (non-streaming)
# ---------------------------------------------------------------------------
def translate_response(
conversation_id: str,
answer: str,
sources: Optional[List[Dict]],
tool_calls: Optional[List[Dict]],
thought: str,
model_name: str,
pending_tool_calls: Optional[List[Dict]] = None,
) -> Dict[str, Any]:
"""Translate DocsGPT response to chat completions format.
Args:
conversation_id: The DocsGPT conversation ID.
answer: The assistant's text response.
sources: RAG retrieval sources.
tool_calls: Completed tool call results.
thought: Reasoning/thinking tokens.
model_name: Model/agent identifier.
pending_tool_calls: Pending client-side tool calls (if paused).
Returns:
Dict in the standard chat completions response format.
"""
created = int(time.time())
completion_id = f"chatcmpl-{conversation_id}" if conversation_id else f"chatcmpl-{created}"
# Build message
message: Dict[str, Any] = {"role": "assistant"}
if pending_tool_calls:
# Tool calls pending — return them for client execution
message["content"] = None
message["tool_calls"] = [
{
"id": tc.get("call_id", ""),
"type": "function",
"function": {
"name": _get_client_tool_name(tc),
"arguments": (
json.dumps(tc["arguments"])
if isinstance(tc.get("arguments"), dict)
else tc.get("arguments", "{}")
),
},
}
for tc in pending_tool_calls
]
finish_reason = "tool_calls"
else:
message["content"] = answer
if thought:
message["reasoning_content"] = thought
finish_reason = "stop"
result: Dict[str, Any] = {
"id": completion_id,
"object": "chat.completion",
"created": created,
"model": model_name,
"choices": [
{
"index": 0,
"message": message,
"finish_reason": finish_reason,
}
],
"usage": {
"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
},
}
# DocsGPT extensions
docsgpt: Dict[str, Any] = {}
if conversation_id:
docsgpt["conversation_id"] = conversation_id
if sources:
docsgpt["sources"] = sources
if tool_calls:
docsgpt["tool_calls"] = tool_calls
if docsgpt:
result["docsgpt"] = docsgpt
return result
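# Illustrative output shape (hypothetical conversation id): with no pending
# tool calls the reply is a plain chat.completion, finish_reason is "stop",
# and DocsGPT extras sit under "docsgpt".
#   translate_response(conversation_id="abc123", answer="Hi!", sources=None,
#                      tool_calls=None, thought="", model_name="My Agent")
#   # -> {"id": "chatcmpl-abc123", "object": "chat.completion", ...,
#   #     "choices": [{"index": 0,
#   #                  "message": {"role": "assistant", "content": "Hi!"},
#   #                  "finish_reason": "stop"}],
#   #     "docsgpt": {"conversation_id": "abc123"}}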
# ---------------------------------------------------------------------------
# Streaming event translation
# ---------------------------------------------------------------------------
def _make_chunk(
completion_id: str,
model_name: str,
delta: Dict[str, Any],
finish_reason: Optional[str] = None,
) -> str:
"""Build a single SSE chunk in the standard streaming format."""
chunk = {
"id": completion_id,
"object": "chat.completion.chunk",
"created": int(time.time()),
"model": model_name,
"choices": [
{
"index": 0,
"delta": delta,
"finish_reason": finish_reason,
}
],
}
return f"data: {json.dumps(chunk)}\n\n"
def _make_docsgpt_chunk(data: Dict[str, Any]) -> str:
"""Build a DocsGPT extension SSE chunk."""
return f"data: {json.dumps({'docsgpt': data})}\n\n"
def translate_stream_event(
event_data: Dict[str, Any],
completion_id: str,
model_name: str,
) -> List[str]:
"""Translate a DocsGPT SSE event dict to standard streaming chunks.
May return 0, 1, or 2 chunks per input event. For example, a completed
tool call produces only a docsgpt extension chunk and nothing on the
standard side (server-side tool calls aren't surfaced in the standard
format).
Args:
event_data: Parsed DocsGPT event dict.
completion_id: The completion ID for this response.
model_name: Model/agent identifier.
Returns:
List of SSE-formatted strings to send to the client.
"""
event_type = event_data.get("type")
chunks: List[str] = []
if event_type == "answer":
chunks.append(
_make_chunk(completion_id, model_name, {"content": event_data.get("answer", "")})
)
elif event_type == "thought":
chunks.append(
_make_chunk(
completion_id, model_name,
{"reasoning_content": event_data.get("thought", "")},
)
)
elif event_type == "source":
chunks.append(
_make_docsgpt_chunk({
"type": "source",
"sources": event_data.get("source", []),
})
)
elif event_type == "tool_call":
tc_data = event_data.get("data", {})
status = tc_data.get("status")
if status == "requires_client_execution":
# Standard: stream as tool_calls delta
args = tc_data.get("arguments", {})
args_str = json.dumps(args) if isinstance(args, dict) else str(args)
chunks.append(
_make_chunk(completion_id, model_name, {
"tool_calls": [{
"index": 0,
"id": tc_data.get("call_id", ""),
"type": "function",
"function": {
"name": _get_client_tool_name(tc_data),
"arguments": args_str,
},
}],
})
)
elif status == "awaiting_approval":
# Extension: approval needed
chunks.append(_make_docsgpt_chunk({"type": "tool_call", "data": tc_data}))
elif status in ("completed", "pending", "error", "denied", "skipped"):
# Extension: tool call progress
chunks.append(_make_docsgpt_chunk({"type": "tool_call", "data": tc_data}))
elif event_type == "tool_calls_pending":
# Standard: finish_reason = tool_calls
chunks.append(
_make_chunk(completion_id, model_name, {}, finish_reason="tool_calls")
)
# Also emit as docsgpt extension
chunks.append(
_make_docsgpt_chunk({
"type": "tool_calls_pending",
"pending_tool_calls": event_data.get("data", {}).get("pending_tool_calls", []),
})
)
elif event_type == "end":
chunks.append(
_make_chunk(completion_id, model_name, {}, finish_reason="stop")
)
chunks.append("data: [DONE]\n\n")
elif event_type == "id":
chunks.append(
_make_docsgpt_chunk({
"type": "id",
"conversation_id": event_data.get("id", ""),
})
)
elif event_type == "error":
# Emit as standard error (non-standard but widely supported)
error_data = {
"error": {
"message": event_data.get("error", "An error occurred"),
"type": "server_error",
}
}
chunks.append(f"data: {json.dumps(error_data)}\n\n")
elif event_type == "structured_answer":
chunks.append(
_make_chunk(
completion_id, model_name,
{"content": event_data.get("answer", "")},
)
)
# Skip: tool_calls (redundant), research_plan, research_progress
return chunks
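# Illustrative example (hypothetical ids): an internal {"type": "answer"}
# event becomes a single standard chat.completion.chunk SSE line.
#   translate_stream_event({"type": "answer", "answer": "Hel"},
#                          "chatcmpl-abc123", "My Agent")
#   # -> ['data: {"id": "chatcmpl-abc123", "object": "chat.completion.chunk",
#   #      ..., "choices": [{"index": 0, "delta": {"content": "Hel"},
#   #      "finish_reason": null}]}\n\n']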

View File

@@ -17,8 +17,13 @@ from application.api.answer import answer # noqa: E402
from application.api.internal.routes import internal # noqa: E402
from application.api.user.routes import user # noqa: E402
from application.api.connector.routes import connector # noqa: E402
from application.api.v1 import v1_bp # noqa: E402
from application.celery_init import celery # noqa: E402
from application.core.settings import settings # noqa: E402
from application.stt.upload_limits import ( # noqa: E402
build_stt_file_size_limit_message,
should_reject_stt_request,
)
if platform.system() == "Windows":
@@ -32,6 +37,7 @@ app.register_blueprint(user)
app.register_blueprint(answer)
app.register_blueprint(internal)
app.register_blueprint(connector)
app.register_blueprint(v1_bp)
app.config.update(
UPLOAD_FOLDER="inputs",
CELERY_BROKER_URL=settings.CELERY_BROKER_URL,
@@ -68,6 +74,11 @@ def home():
return "Welcome to DocsGPT Backend!"
@app.route("/api/health")
def health():
return jsonify({"status": "ok"})
@app.route("/api/config")
def get_config():
response = {
@@ -88,6 +99,23 @@ def generate_token():
return jsonify({"error": "Token generation not allowed in current auth mode"}), 400
@app.before_request
def enforce_stt_request_size_limits():
if request.method == "OPTIONS":
return None
if should_reject_stt_request(request.path, request.content_length):
return (
jsonify(
{
"success": False,
"message": build_stt_file_size_limit_message(),
}
),
413,
)
return None
@app.before_request
def authenticate_request():
if request.method == "OPTIONS":

View File

@@ -1,6 +1,6 @@
from celery import Celery
from application.core.settings import settings
from celery.signals import setup_logging
from celery.signals import setup_logging, worker_process_init
def make_celery(app_name=__name__):
@@ -20,5 +20,24 @@ def config_loggers(*args, **kwargs):
setup_logging()
@worker_process_init.connect
def _dispose_db_engine_on_fork(*args, **kwargs):
"""Dispose the SQLAlchemy engine pool in each forked Celery worker.
SQLAlchemy connection pools are not fork-safe: file descriptors shared
between the parent and a forked worker will corrupt the pool. Disposing
on ``worker_process_init`` gives every worker its own fresh pool on
first use.
Imported lazily so Celery workers that don't touch Postgres (or where
``POSTGRES_URI`` is unset) don't fail at startup.
"""
try:
from application.storage.db.engine import dispose_engine
except Exception:
return
dispose_engine()
celery = make_celery()
celery.config_from_object("application.celeryconfig")

View File

@@ -0,0 +1,89 @@
"""Normalize user-supplied Postgres URIs for different drivers.
DocsGPT has two Postgres connection strings pointing at potentially
different databases:
* ``POSTGRES_URI`` feeds SQLAlchemy, which needs the
``postgresql+psycopg://`` dialect prefix to pick the psycopg v3 driver.
* ``PGVECTOR_CONNECTION_STRING`` feeds ``psycopg.connect()`` directly
(via libpq) in ``application/vectorstore/pgvector.py``. libpq only
understands ``postgres://`` and ``postgresql://`` — the SQLAlchemy
dialect prefix is an invalid URI from its point of view.
The two fields therefore need opposite normalization so operators don't
have to know which driver a given field feeds. Each normalizer also
silently upgrades the legacy ``postgresql+psycopg2://`` prefix since
psycopg2 is no longer in the project.
This module is deliberately separate from ``application/core/settings.py``
so the Settings class stays focused on field declarations, and the
URI-rewriting logic can be unit-tested without triggering the ``.env``
file loading that importing Settings causes.
"""
from __future__ import annotations
def _rewrite_uri_prefixes(v, rewrites):
"""Shared URI prefix rewriter used by both normalizers below.
Strips whitespace, returns ``None`` for empty / ``"none"`` values,
applies the first matching rewrite, and passes unrecognised input
through so downstream consumers (SQLAlchemy, libpq) can produce
their own error messages rather than us silently eating a
misconfiguration.
"""
if v is None:
return None
if not isinstance(v, str):
return v
v = v.strip()
if not v or v.lower() == "none":
return None
for prefix, target in rewrites:
if v.startswith(prefix):
return target + v[len(prefix):]
return v
# POSTGRES_URI feeds SQLAlchemy, which needs a ``postgresql+psycopg://``
# dialect prefix to select the psycopg v3 driver. Normalize the
# operator-friendly forms TOWARD that dialect.
_POSTGRES_URI_REWRITES = (
("postgresql+psycopg2://", "postgresql+psycopg://"),
("postgresql://", "postgresql+psycopg://"),
("postgres://", "postgresql+psycopg://"),
)
# PGVECTOR_CONNECTION_STRING feeds ``psycopg.connect()`` directly in
# application/vectorstore/pgvector.py — NOT SQLAlchemy. libpq only
# understands ``postgres://`` and ``postgresql://``; the SQLAlchemy
# dialect prefix is an invalid URI from libpq's point of view. Strip it
# if the operator accidentally copied their POSTGRES_URI value here.
_PGVECTOR_CONNECTION_STRING_REWRITES = (
("postgresql+psycopg2://", "postgresql://"),
("postgresql+psycopg://", "postgresql://"),
)
def normalize_postgres_uri(v):
"""Normalize a user-supplied POSTGRES_URI to the SQLAlchemy psycopg3 form.
Accepts the forms operators naturally write (``postgres://``,
``postgresql://``) and rewrites them to ``postgresql+psycopg://``.
Unknown schemes pass through unchanged so SQLAlchemy can produce its
own dialect-not-found error.
"""
return _rewrite_uri_prefixes(v, _POSTGRES_URI_REWRITES)
def normalize_pgvector_connection_string(v):
"""Normalize a user-supplied PGVECTOR_CONNECTION_STRING for libpq.
Strips the SQLAlchemy dialect prefix if the operator accidentally
copied their POSTGRES_URI value here — libpq can't parse it.
User-friendly forms (``postgres://``, ``postgresql://``) pass
through unchanged since libpq accepts them natively.
"""
return _rewrite_uri_prefixes(v, _PGVECTOR_CONNECTION_STRING_REWRITES)
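# Illustrative examples of the two opposite normalizations (hypothetical DSNs):
#   normalize_postgres_uri("postgres://u:p@db:5432/docsgpt")
#   # -> "postgresql+psycopg://u:p@db:5432/docsgpt"
#   normalize_pgvector_connection_string("postgresql+psycopg://u:p@db:5432/docsgpt")
#   # -> "postgresql://u:p@db:5432/docsgpt"
#   normalize_postgres_uri("  none  ")  # -> None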

View File

@@ -27,6 +27,8 @@ ANTHROPIC_ATTACHMENTS = IMAGE_ATTACHMENTS
OPENROUTER_ATTACHMENTS = IMAGE_ATTACHMENTS
NOVITA_ATTACHMENTS = IMAGE_ATTACHMENTS
OPENAI_MODELS = [
AvailableModel(
@@ -193,6 +195,46 @@ OPENROUTER_MODELS = [
),
]
NOVITA_MODELS = [
AvailableModel(
id="moonshotai/kimi-k2.5",
provider=ModelProvider.NOVITA,
display_name="Kimi K2.5",
description="MoE model with function calling, structured output, reasoning, and vision",
capabilities=ModelCapabilities(
supports_tools=True,
supports_structured_output=True,
supported_attachment_types=NOVITA_ATTACHMENTS,
context_window=262144,
),
),
AvailableModel(
id="zai-org/glm-5",
provider=ModelProvider.NOVITA,
display_name="GLM-5",
description="MoE model with function calling, structured output, and reasoning",
capabilities=ModelCapabilities(
supports_tools=True,
supports_structured_output=True,
supported_attachment_types=[],
context_window=202800,
),
),
AvailableModel(
id="minimax/minimax-m2.5",
provider=ModelProvider.NOVITA,
display_name="MiniMax M2.5",
description="MoE model with function calling, structured output, and reasoning",
capabilities=ModelCapabilities(
supports_tools=True,
supports_structured_output=True,
supported_attachment_types=[],
context_window=204800,
),
),
]
AZURE_OPENAI_MODELS = [
AvailableModel(
id="azure-gpt-4",

View File

@@ -114,6 +114,10 @@ class ModelRegistry:
settings.LLM_PROVIDER == "openrouter" and settings.API_KEY
):
self._add_openrouter_models(settings)
if settings.NOVITA_API_KEY or (
settings.LLM_PROVIDER == "novita" and settings.API_KEY
):
self._add_novita_models(settings)
if settings.HUGGINGFACE_API_KEY or (
settings.LLM_PROVIDER == "huggingface" and settings.API_KEY
):
@@ -245,6 +249,21 @@ class ModelRegistry:
for model in OPENROUTER_MODELS:
self.models[model.id] = model
def _add_novita_models(self, settings):
from application.core.model_configs import NOVITA_MODELS
if settings.NOVITA_API_KEY:
for model in NOVITA_MODELS:
self.models[model.id] = model
return
if settings.LLM_PROVIDER == "novita" and settings.LLM_NAME:
for model in NOVITA_MODELS:
if model.id == settings.LLM_NAME:
self.models[model.id] = model
return
for model in NOVITA_MODELS:
self.models[model.id] = model
def _add_docsgpt_models(self, settings):
model_id = "docsgpt-local"
model = AvailableModel(

View File

@@ -10,6 +10,7 @@ def get_api_key_for_provider(provider: str) -> Optional[str]:
provider_key_map = {
"openai": settings.OPENAI_API_KEY,
"openrouter": settings.OPEN_ROUTER_API_KEY,
"novita": settings.NOVITA_API_KEY,
"anthropic": settings.ANTHROPIC_API_KEY,
"google": settings.GOOGLE_API_KEY,
"groq": settings.GROQ_API_KEY,

View File

@@ -5,8 +5,12 @@ from typing import Optional
from pydantic import field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict
current_dir = os.path.dirname(
os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
current_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from application.core.db_uri import ( # noqa: E402
normalize_pgvector_connection_string,
normalize_postgres_uri,
)
@@ -15,19 +19,20 @@ class Settings(BaseSettings):
AUTH_TYPE: Optional[str] = None # simple_jwt, session_jwt, or None
LLM_PROVIDER: str = "docsgpt"
LLM_NAME: Optional[str] = (
None # if LLM_PROVIDER is openai, LLM_NAME can be gpt-4 or gpt-3.5-turbo
)
LLM_NAME: Optional[str] = None # if LLM_PROVIDER is openai, LLM_NAME can be gpt-4 or gpt-3.5-turbo
EMBEDDINGS_NAME: str = "huggingface_sentence-transformers/all-mpnet-base-v2"
EMBEDDINGS_BASE_URL: Optional[str] = None # Remote embeddings API URL (OpenAI-compatible)
EMBEDDINGS_KEY: Optional[str] = (
None # api key for embeddings (if using openai, just copy API_KEY)
)
EMBEDDINGS_KEY: Optional[str] = None # api key for embeddings (if using openai, just copy API_KEY)
CELERY_BROKER_URL: str = "redis://localhost:6379/0"
CELERY_RESULT_BACKEND: str = "redis://localhost:6379/1"
MONGO_URI: str = "mongodb://localhost:27017/docsgpt"
MONGO_DB_NAME: str = "docsgpt"
# User-data Postgres DB.
POSTGRES_URI: Optional[str] = None
# MongoDB→Postgres migration: dual-write to Postgres (Mongo stays source of truth)
USE_POSTGRES: bool = False
LLM_PATH: str = os.path.join(current_dir, "models/docsgpt-7b-f16.gguf")
DEFAULT_MAX_HISTORY: int = 150
DEFAULT_LLM_TOKEN_LIMIT: int = 128000 # Fallback when model not found in registry
@@ -45,9 +50,7 @@ class Settings(BaseSettings):
PARSE_IMAGE_REMOTE: bool = False
DOCLING_OCR_ENABLED: bool = False # Enable OCR for docling parsers (PDF, images)
DOCLING_OCR_ATTACHMENTS_ENABLED: bool = False # Enable OCR for docling when parsing attachments
VECTOR_STORE: str = (
"faiss" # "faiss" or "elasticsearch" or "qdrant" or "milvus" or "lancedb" or "pgvector"
)
VECTOR_STORE: str = "faiss" # "faiss" or "elasticsearch" or "qdrant" or "milvus" or "lancedb" or "pgvector"
RETRIEVERS_ENABLED: list = ["classic_rag"]
AGENT_NAME: str = "classic"
FALLBACK_LLM_PROVIDER: Optional[str] = None # provider for fallback llm
@@ -55,16 +58,22 @@ class Settings(BaseSettings):
FALLBACK_LLM_API_KEY: Optional[str] = None # api key for fallback llm
# Google Drive integration
GOOGLE_CLIENT_ID: Optional[str] = (
None # Replace with your actual Google OAuth client ID
)
GOOGLE_CLIENT_SECRET: Optional[str] = (
None # Replace with your actual Google OAuth client secret
)
GOOGLE_CLIENT_ID: Optional[str] = None # Replace with your actual Google OAuth client ID
GOOGLE_CLIENT_SECRET: Optional[str] = None # Replace with your actual Google OAuth client secret
CONNECTOR_REDIRECT_BASE_URI: Optional[str] = (
"http://127.0.0.1:7091/api/connectors/callback" ##add redirect url as it is to your provider's console(gcp)
)
# Microsoft Entra ID (Azure AD) integration
MICROSOFT_CLIENT_ID: Optional[str] = None # Azure AD Application (client) ID
MICROSOFT_CLIENT_SECRET: Optional[str] = None # Azure AD Application client secret
MICROSOFT_TENANT_ID: Optional[str] = "common" # Azure AD Tenant ID (or 'common' for multi-tenant)
MICROSOFT_AUTHORITY: Optional[str] = None # e.g., "https://login.microsoftonline.com/{tenant_id}"
# Confluence Cloud integration
CONFLUENCE_CLIENT_ID: Optional[str] = None
CONFLUENCE_CLIENT_SECRET: Optional[str] = None
# GitHub source
GITHUB_ACCESS_TOKEN: Optional[str] = None # PAT token with read repo access
@@ -72,6 +81,7 @@ class Settings(BaseSettings):
CACHE_REDIS_URL: str = "redis://localhost:6379/2"
API_URL: str = "http://localhost:7091" # backend url for celery worker
MCP_OAUTH_REDIRECT_URI: Optional[str] = None # public callback URL for MCP OAuth
INTERNAL_KEY: Optional[str] = None # internal api key for worker-to-backend auth
API_KEY: Optional[str] = None # LLM api key (used by LLM_PROVIDER)
@@ -83,16 +93,13 @@ class Settings(BaseSettings):
GROQ_API_KEY: Optional[str] = None
HUGGINGFACE_API_KEY: Optional[str] = None
OPEN_ROUTER_API_KEY: Optional[str] = None
NOVITA_API_KEY: Optional[str] = None
OPENAI_API_BASE: Optional[str] = None # azure openai api base url
OPENAI_API_VERSION: Optional[str] = None # azure openai api version
AZURE_DEPLOYMENT_NAME: Optional[str] = None # azure deployment name for answering
AZURE_EMBEDDINGS_DEPLOYMENT_NAME: Optional[str] = (
None # azure deployment name for embeddings
)
OPENAI_BASE_URL: Optional[str] = (
None # openai base url for open ai compatable models
)
AZURE_EMBEDDINGS_DEPLOYMENT_NAME: Optional[str] = None # azure deployment name for embeddings
OPENAI_BASE_URL: Optional[str] = None # openai base url for OpenAI-compatible models
# elasticsearch
ELASTIC_CLOUD_ID: Optional[str] = None # cloud id for elasticsearch
@@ -125,7 +132,10 @@ class Settings(BaseSettings):
QDRANT_PATH: Optional[str] = None
QDRANT_DISTANCE_FUNC: str = "Cosine"
# PGVector vectorstore config
# PGVector vectorstore config. Write the URI in whichever form you
# prefer — ``postgres://``, ``postgresql://``, or even the SQLAlchemy
# dialect form (``postgresql+psycopg://``) are all accepted and
# normalized internally for ``psycopg.connect()``.
PGVECTOR_CONNECTION_STRING: Optional[str] = None
# Milvus vectorstore config
MILVUS_COLLECTION_NAME: Optional[str] = "docsgpt"
@@ -134,9 +144,7 @@ class Settings(BaseSettings):
# LanceDB vectorstore config
LANCEDB_PATH: str = "./data/lancedb" # Path where LanceDB stores its local data
LANCEDB_TABLE_NAME: Optional[str] = (
"docsgpts" # Name of the table to use for storing vectors
)
LANCEDB_TABLE_NAME: Optional[str] = "docsgpts" # Name of the table to use for storing vectors
FLASK_DEBUG_MODE: bool = False
STORAGE_TYPE: str = "local" # local or s3
@@ -149,6 +157,12 @@ class Settings(BaseSettings):
TTS_PROVIDER: str = "google_tts" # google_tts or elevenlabs
ELEVENLABS_API_KEY: Optional[str] = None
STT_PROVIDER: str = "openai" # openai or faster_whisper
OPENAI_STT_MODEL: str = "gpt-4o-mini-transcribe"
STT_LANGUAGE: Optional[str] = None
STT_MAX_FILE_SIZE_MB: int = 50
STT_ENABLE_TIMESTAMPS: bool = False
STT_ENABLE_DIARIZATION: bool = False
# Tool pre-fetch settings
ENABLE_TOOL_PREFETCH: bool = True
@@ -160,6 +174,16 @@ class Settings(BaseSettings):
COMPRESSION_PROMPT_VERSION: str = "v1.0" # Track prompt iterations
COMPRESSION_MAX_HISTORY_POINTS: int = 3 # Keep only last N compression points to prevent DB bloat
@field_validator("POSTGRES_URI", mode="before")
@classmethod
def _normalize_postgres_uri_validator(cls, v):
return normalize_postgres_uri(v)
@field_validator("PGVECTOR_CONNECTION_STRING", mode="before")
@classmethod
def _normalize_pgvector_connection_string_validator(cls, v):
return normalize_pgvector_connection_string(v)
@field_validator(
"API_KEY",
"OPENAI_API_KEY",
@@ -167,6 +191,7 @@ class Settings(BaseSettings):
"GOOGLE_API_KEY",
"GROQ_API_KEY",
"HUGGINGFACE_API_KEY",
"NOVITA_API_KEY",
"EMBEDDINGS_KEY",
"FALLBACK_LLM_API_KEY",
"QDRANT_API_KEY",

View File

@@ -13,3 +13,25 @@ def response_error(code_status, message=None):
def bad_request(status_code=400, message=''):
return response_error(code_status=status_code, message=message)
def sanitize_api_error(error) -> str:
"""
Convert technical API errors to user-friendly messages.
Works with both Exception objects and error message strings.
"""
error_str = str(error).lower()
if "503" in error_str or "unavailable" in error_str or "high demand" in error_str:
return "The AI service is temporarily unavailable due to high demand. Please try again in a moment."
if "429" in error_str or "rate limit" in error_str or "quota" in error_str:
return "Rate limit exceeded. Please wait a moment before trying again."
if "401" in error_str or "unauthorized" in error_str or "invalid api key" in error_str:
return "Authentication error. Please check your API configuration."
if "timeout" in error_str or "timed out" in error_str:
return "The request timed out. Please try again."
if "connection" in error_str or "network" in error_str:
return "Network error. Please check your connection and try again."
original = str(error)
if len(original) > 200 or "{" in original or "traceback" in error_str:
return "An error occurred while processing your request. Please try again later."
return original
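A quick usage sketch (the failing call and error text here are hypothetical) showing how the helper turns a raw provider error into a user-facing message:
try:
    # Hypothetical upstream failure raised by an LLM provider client.
    raise RuntimeError("Error code: 429 - You exceeded your current quota")
except Exception as exc:
    friendly = sanitize_api_error(exc)
    # -> "Rate limit exceeded. Please wait a moment before trying again."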

View File

@@ -16,22 +16,61 @@ class BaseLLM(ABC):
agent_id=None,
model_id=None,
base_url=None,
backup_models=None,
):
self.decoded_token = decoded_token
self.agent_id = str(agent_id) if agent_id else None
self.model_id = model_id
self.base_url = base_url
self.token_usage = {"prompt_tokens": 0, "generated_tokens": 0}
self._backup_models = backup_models or []
self._fallback_llm = None
self._fallback_sequence_index = 0
@property
def fallback_llm(self):
"""Lazy-loaded fallback LLM from FALLBACK_* settings."""
if self._fallback_llm is None and settings.FALLBACK_LLM_PROVIDER:
try:
from application.llm.llm_creator import LLMCreator
"""Lazy-loaded fallback LLM: tries per-agent backup models first,
then the global FALLBACK_* settings."""
if self._fallback_llm is not None:
return self._fallback_llm
from application.llm.llm_creator import LLMCreator
from application.core.model_utils import (
get_provider_from_model_id,
get_api_key_for_provider,
)
# Try per-agent backup models first
for backup_model_id in self._backup_models:
try:
provider = get_provider_from_model_id(backup_model_id)
if not provider:
logger.warning(
f"Could not resolve provider for backup model: {backup_model_id}"
)
continue
api_key = get_api_key_for_provider(provider)
self._fallback_llm = LLMCreator.create_llm(
provider,
api_key=api_key,
user_api_key=getattr(self, "user_api_key", None),
decoded_token=self.decoded_token,
model_id=backup_model_id,
agent_id=self.agent_id,
)
logger.info(
f"Fallback LLM initialized from agent backup model: "
f"{provider}/{backup_model_id}"
)
return self._fallback_llm
except Exception as e:
logger.warning(
f"Failed to initialize backup model {backup_model_id}: {str(e)}"
)
continue
# Fall back to global FALLBACK_* settings
if settings.FALLBACK_LLM_PROVIDER:
try:
self._fallback_llm = LLMCreator.create_llm(
settings.FALLBACK_LLM_PROVIDER,
api_key=settings.FALLBACK_LLM_API_KEY or settings.API_KEY,
@@ -41,12 +80,14 @@ class BaseLLM(ABC):
agent_id=self.agent_id,
)
logger.info(
f"Fallback LLM initialized: {settings.FALLBACK_LLM_PROVIDER}/{settings.FALLBACK_LLM_NAME}"
f"Fallback LLM initialized from global settings: "
f"{settings.FALLBACK_LLM_PROVIDER}/{settings.FALLBACK_LLM_NAME}"
)
except Exception as e:
logger.error(
f"Failed to initialize fallback LLM: {str(e)}", exc_info=True
)
return self._fallback_llm
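Putting it together, a caller can thread per-agent backups through LLMCreator as in this sketch (provider, keys and model identifiers are placeholders; decoded_token and agent_id are assumed to exist in the calling scope). On failure, the property above tries each backup in order and only then the global FALLBACK_* settings:
llm = LLMCreator.create_llm(
    "openai",                             # placeholder provider
    api_key=settings.API_KEY,
    user_api_key=None,
    decoded_token=decoded_token,
    model_id="gpt-4o",                    # placeholder primary model
    agent_id=agent_id,
    backup_models=["gemini-2.0-flash"],   # placeholder per-agent backup
)
# llm.fallback_llm resolves the first backup whose provider and API key can
# be determined; if none can, the FALLBACK_LLM_* settings are used instead.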
@staticmethod
@@ -74,20 +115,60 @@ class BaseLLM(ABC):
method = decorator(method)
return method(self, *args, **kwargs)
is_stream = "stream" in method_name
if is_stream:
return self._stream_with_fallback(
decorated_method, method_name, *args, **kwargs
)
try:
return decorated_method()
except Exception as e:
if not self.fallback_llm:
logger.error(f"Primary LLM failed and no fallback configured: {str(e)}")
raise
fallback = self.fallback_llm
logger.warning(
f"Primary LLM failed. Falling back to {settings.FALLBACK_LLM_PROVIDER}/{settings.FALLBACK_LLM_NAME}. Error: {str(e)}"
f"Primary LLM failed. Falling back to "
f"{fallback.model_id}. Error: {str(e)}"
)
fallback_method = getattr(
self.fallback_llm, method_name.replace("_raw_", "")
fallback, method_name.replace("_raw_", "")
)
return fallback_method(*args, **kwargs)
fallback_kwargs = {**kwargs, "model": fallback.model_id}
return fallback_method(*args, **fallback_kwargs)
def _stream_with_fallback(
self, decorated_method, method_name, *args, **kwargs
):
"""
Wrapper generator that catches mid-stream errors and falls back.
Unlike non-streaming calls where exceptions are raised immediately,
streaming generators raise exceptions during iteration. This wrapper
ensures that if the primary LLM fails at any point during streaming
(creation or mid-stream), we fall back to the backup model.
"""
try:
yield from decorated_method()
except Exception as e:
if not self.fallback_llm:
logger.error(
f"Primary LLM failed and no fallback configured: {str(e)}"
)
raise
fallback = self.fallback_llm
logger.warning(
f"Primary LLM failed mid-stream. Falling back to "
f"{fallback.model_id}. Error: {str(e)}"
)
fallback_method = getattr(
fallback, method_name.replace("_raw_", "")
)
fallback_kwargs = {**kwargs, "model": fallback.model_id}
yield from fallback_method(*args, **fallback_kwargs)
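The reason streaming needs its own wrapper is a generator detail: a generator body does not run until iteration begins, so a provider failure only surfaces while the chunks are being consumed. A tiny standalone illustration (unrelated to any provider SDK):
def flaky_stream():
    yield "first chunk"
    raise RuntimeError("provider dropped the connection")  # raised mid-iteration

def with_fallback():
    try:
        yield from flaky_stream()
    except RuntimeError:
        # Only a wrapper that is itself a generator can catch this and keep
        # yielding from a backup source.
        yield from ["backup chunk"]

assert list(with_fallback()) == ["first chunk", "backup chunk"]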
def gen(self, model, messages, stream=False, tools=None, *args, **kwargs):
decorators = [gen_token_usage, gen_cache]

View File

@@ -158,11 +158,17 @@ class GoogleLLM(BaseLLM):
if isinstance(content, list):
parts = []
for item in content:
if isinstance(item, dict) and "text" in item and item["text"] is not None:
if (
isinstance(item, dict)
and "text" in item
and item["text"] is not None
):
parts.append(item["text"])
return "\n".join(parts)
return ""
import json as _json
for message in messages:
role = message.get("role")
content = message.get("content")
@@ -176,9 +182,66 @@ class GoogleLLM(BaseLLM):
if role == "assistant":
role = "model"
elif role == "tool":
role = "model"
parts = []
# Standard format: assistant message with tool_calls array
msg_tool_calls = message.get("tool_calls")
if msg_tool_calls and role == "model":
for tc in msg_tool_calls:
func = tc.get("function", {})
args = func.get("arguments", "{}")
if isinstance(args, str):
try:
args = _json.loads(args)
except (_json.JSONDecodeError, TypeError):
args = {}
cleaned_args = self._remove_null_values(args)
thought_sig = tc.get("thought_signature")
if thought_sig:
parts.append(
types.Part(
functionCall=types.FunctionCall(
name=func.get("name", ""),
args=cleaned_args,
),
thoughtSignature=thought_sig,
)
)
else:
parts.append(
types.Part.from_function_call(
name=func.get("name", ""),
args=cleaned_args,
)
)
if parts:
cleaned_messages.append(types.Content(role=role, parts=parts))
continue
# Standard format: tool message with tool_call_id
tool_call_id = message.get("tool_call_id")
if role == "tool" and tool_call_id is not None:
result_content = content
if isinstance(result_content, str):
try:
result_content = _json.loads(result_content)
except (_json.JSONDecodeError, TypeError):
pass
# Google expects a function_response name, which cannot be recovered from the
# tool_call_id alone, so we use a placeholder; the Google API does not require an exact match
parts.append(
types.Part.from_function_response(
name="tool_result",
response={"result": result_content},
)
)
cleaned_messages.append(types.Content(role="model", parts=parts))
continue
if role == "tool":
role = "model"
if role and content is not None:
if isinstance(content, str):
parts = [types.Part.from_text(text=content)]
@@ -187,15 +250,11 @@ class GoogleLLM(BaseLLM):
if "text" in item:
parts.append(types.Part.from_text(text=item["text"]))
elif "function_call" in item:
# Remove null values from args to avoid API errors
# Legacy format support
cleaned_args = self._remove_null_values(
item["function_call"]["args"]
)
# Create function call part with thought_signature if present
# For Gemini 3 models, we need to include thought_signature
if "thought_signature" in item:
# Use Part constructor with functionCall and thoughtSignature
parts.append(
types.Part(
functionCall=types.FunctionCall(
@@ -206,7 +265,6 @@ class GoogleLLM(BaseLLM):
)
)
else:
# Use helper method when no thought_signature
parts.append(
types.Part.from_function_call(
name=item["function_call"]["name"],
@@ -236,7 +294,9 @@ class GoogleLLM(BaseLLM):
raise ValueError(f"Unexpected content type: {type(content)}")
if parts:
cleaned_messages.append(types.Content(role=role, parts=parts))
system_instruction = "\n\n".join(system_instructions) if system_instructions else None
system_instruction = (
"\n\n".join(system_instructions) if system_instructions else None
)
return cleaned_messages, system_instruction
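For reference, the "standard format" this cleaner translates into Google types.Content is the OpenAI-style pair of internal messages; the values below are illustrative only:
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_123",
            "type": "function",
            "function": {"name": "search_docs", "arguments": '{"query": "pricing"}'},
            # optional, preserved for Gemini 3 models:
            # "thought_signature": "...",
        }
    ],
}
tool_turn = {
    "role": "tool",
    "tool_call_id": "call_123",
    "content": "top matching passages ...",
}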
def _clean_schema(self, schema_obj):
@@ -336,7 +396,10 @@ class GoogleLLM(BaseLLM):
return f"function_call:{name}"
function_response = getattr(part, "function_response", None)
if function_response:
name = getattr(function_response, "name", "") or "function_response"
name = (
getattr(function_response, "name", "")
or "function_response"
)
return f"function_response:{name}"
if isinstance(message, dict):
content = message.get("content")
@@ -507,6 +570,9 @@ class GoogleLLM(BaseLLM):
yield {"type": "thought", "thought": chunk_text}
else:
yield chunk_text
except Exception as e:
logging.error(f"GoogleLLM: Stream error: {e}", exc_info=True)
raise
finally:
if hasattr(response, "close"):
response.close()

View File

@@ -1,3 +1,4 @@
import json
import logging
import uuid
from abc import ABC, abstractmethod
@@ -315,10 +316,34 @@ class LLMHandler(ABC):
current_prompt = self._extract_text_from_content(content)
elif role in {"assistant", "model"}:
# If this assistant turn contains tool calls, collect them; otherwise commit a response.
# Standard format: tool_calls array on assistant message
msg_tool_calls = message.get("tool_calls")
if msg_tool_calls:
for tc in msg_tool_calls:
call_id = tc.get("id") or str(uuid.uuid4())
func = tc.get("function", {})
args = func.get("arguments")
if isinstance(args, str):
try:
args = json.loads(args)
except (json.JSONDecodeError, TypeError):
pass
current_tool_calls[call_id] = {
"tool_name": "unknown_tool",
"action_name": func.get("name"),
"arguments": args,
"result": None,
"status": "called",
"call_id": call_id,
}
continue
# Legacy format: function_call/function_response in content list
if isinstance(content, list):
has_fc = False
for item in content:
if "function_call" in item:
has_fc = True
fc = item["function_call"]
call_id = fc.get("call_id") or str(uuid.uuid4())
current_tool_calls[call_id] = {
@@ -329,37 +354,30 @@ class LLMHandler(ABC):
"status": "called",
"call_id": call_id,
}
elif "function_response" in item:
fr = item["function_response"]
call_id = fr.get("call_id") or str(uuid.uuid4())
current_tool_calls[call_id] = {
"tool_name": "unknown_tool",
"action_name": fr.get("name"),
"arguments": None,
"result": fr.get("response", {}).get("result"),
"status": "completed",
"call_id": call_id,
}
# No direct assistant text here; continue to next message
continue
if has_fc:
continue
response_text = self._extract_text_from_content(content)
_commit_query(response_text)
elif role == "tool":
# Attach tool outputs to the latest pending tool call if possible
# Standard format: tool_call_id on tool message
call_id = message.get("tool_call_id")
tool_text = self._extract_text_from_content(content)
# Attempt to parse function_response style
call_id = None
if isinstance(content, list):
for item in content:
if "function_response" in item and item["function_response"].get("call_id"):
call_id = item["function_response"]["call_id"]
break
if call_id and call_id in current_tool_calls:
current_tool_calls[call_id]["result"] = tool_text
current_tool_calls[call_id]["status"] = "completed"
elif queries:
# Legacy: function_response in content list
elif isinstance(content, list):
for item in content:
if "function_response" in item:
legacy_id = item["function_response"].get("call_id")
if legacy_id and legacy_id in current_tool_calls:
current_tool_calls[legacy_id]["result"] = tool_text
current_tool_calls[legacy_id]["status"] = "completed"
break
elif call_id is None and queries:
queries[-1].setdefault("tool_calls", []).append(
{
"tool_name": "unknown_tool",
@@ -648,6 +666,13 @@ class LLMHandler(ABC):
"""
Execute tool calls and update conversation history.
When a tool requires approval or client-side execution, it is
collected as a pending action instead of being executed. The
generator returns ``(updated_messages, pending_actions)`` where
*pending_actions* is ``None`` when every tool was executed
normally, or a list of dicts describing actions the client must
resolve before the LLM loop can continue.
Args:
agent: The agent instance
tool_calls: List of tool calls to execute
@@ -655,9 +680,11 @@ class LLMHandler(ABC):
messages: Current conversation history
Returns:
Updated messages list
Tuple of (updated_messages, pending_actions).
pending_actions is None if all tools executed, otherwise a list.
"""
updated_messages = messages.copy()
pending_actions: List[Dict] = []
for i, call in enumerate(tool_calls):
# Check context limit before executing tool call
@@ -763,6 +790,29 @@ class LLMHandler(ABC):
# Set flag on agent
agent.context_limit_reached = True
break
# ---- Pause check: approval / client-side execution ----
llm_class = agent.llm.__class__.__name__
pause_info = agent.tool_executor.check_pause(
tools_dict, call, llm_class
)
if pause_info:
# Yield pause event so the client knows this tool is waiting
yield {
"type": "tool_call",
"data": {
"tool_name": pause_info["tool_name"],
"call_id": pause_info["call_id"],
"action_name": pause_info.get("llm_name", pause_info["name"]),
"arguments": pause_info["arguments"],
"status": pause_info["pause_type"],
},
}
pending_actions.append(pause_info)
# Do NOT add messages for pending tools here.
# They will be added on resume to keep call/result pairs together.
continue
try:
self.tool_calls.append(call)
tool_executor_gen = agent._execute_tool_action(tools_dict, call)
@@ -772,25 +822,30 @@ class LLMHandler(ABC):
except StopIteration as e:
tool_response, call_id = e.value
break
function_call_content = {
"function_call": {
"name": call.name,
"args": call.arguments,
"call_id": call_id,
}
}
# Include thought_signature for Google Gemini 3 models
# It should be at the same level as function_call, not inside it
if call.thought_signature:
function_call_content["thought_signature"] = call.thought_signature
updated_messages.append(
{
"role": "assistant",
"content": [function_call_content],
}
)
# Standard internal format: assistant message with tool_calls array
args_str = (
json.dumps(call.arguments)
if isinstance(call.arguments, dict)
else call.arguments
)
tool_call_obj = {
"id": call_id,
"type": "function",
"function": {
"name": call.name,
"arguments": args_str,
},
}
# Preserve thought_signature for Google Gemini 3 models
if call.thought_signature:
tool_call_obj["thought_signature"] = call.thought_signature
updated_messages.append({
"role": "assistant",
"content": None,
"tool_calls": [tool_call_obj],
})
updated_messages.append(self.create_tool_message(call, tool_response))
except Exception as e:
@@ -802,16 +857,15 @@ class LLMHandler(ABC):
error_message = self.create_tool_message(error_call, error_response)
updated_messages.append(error_message)
call_parts = call.name.split("_")
if len(call_parts) >= 2:
tool_id = call_parts[-1] # Last part is tool ID (e.g., "1")
action_name = "_".join(call_parts[:-1])
tool_name = tools_dict.get(tool_id, {}).get("name", "unknown_tool")
full_action_name = f"{action_name}_{tool_id}"
mapping = agent.tool_executor._name_to_tool
if call.name in mapping:
resolved_tool_id, _ = mapping[call.name]
tool_name = tools_dict.get(resolved_tool_id, {}).get(
"name", "unknown_tool"
)
else:
tool_name = "unknown_tool"
action_name = call.name
full_action_name = call.name
full_action_name = call.name
yield {
"type": "tool_call",
"data": {
@@ -823,7 +877,7 @@ class LLMHandler(ABC):
"status": "error",
},
}
return updated_messages
return updated_messages, pending_actions if pending_actions else None
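The handlers below drive this as a generator and pick the return value out of StopIteration.value, roughly as in this sketch (tool_handler_gen stands for the generator obtained from this method):
while True:
    try:
        event = next(tool_handler_gen)  # forwarded tool_call status events
    except StopIteration as stop:
        messages, pending_actions = stop.value
        break
if pending_actions:
    # Pause the LLM loop and emit a "tool_calls_pending" event so the client
    # can approve or execute the pending tools before the loop resumes.
    ...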
def handle_non_streaming(
self, agent, response: Any, tools_dict: Dict, messages: List[Dict]
@@ -851,8 +905,22 @@ class LLMHandler(ABC):
try:
yield next(tool_handler_gen)
except StopIteration as e:
messages = e.value
messages, pending_actions = e.value
break
# If tools need approval or client execution, pause the loop
if pending_actions:
agent._pending_continuation = {
"messages": messages,
"pending_tool_calls": pending_actions,
"tools_dict": tools_dict,
}
yield {
"type": "tool_calls_pending",
"data": {"pending_tool_calls": pending_actions},
}
return ""
response = agent.llm.gen(
model=agent.model_id, messages=messages, tools=agent.tools
)
@@ -913,10 +981,23 @@ class LLMHandler(ABC):
try:
yield next(tool_handler_gen)
except StopIteration as e:
messages = e.value
messages, pending_actions = e.value
break
tool_calls = {}
# If tools need approval or client execution, pause the loop
if pending_actions:
agent._pending_continuation = {
"messages": messages,
"pending_tool_calls": pending_actions,
"tools_dict": tools_dict,
}
yield {
"type": "tool_calls_pending",
"data": {"pending_tool_calls": pending_actions},
}
return
# Check if context limit was reached during tool execution
if hasattr(agent, 'context_limit_reached') and agent.context_limit_reached:
# Add system message warning about context limit

View File

@@ -67,18 +67,18 @@ class GoogleLLMHandler(LLMHandler):
)
def create_tool_message(self, tool_call: ToolCall, result: Any) -> Dict:
"""Create Google-style tool message."""
"""Create a tool result message in the standard internal format."""
import json as _json
content = (
_json.dumps(result)
if not isinstance(result, str)
else result
)
return {
"role": "model",
"content": [
{
"function_response": {
"name": tool_call.name,
"response": {"result": result},
}
}
],
"role": "tool",
"tool_call_id": tool_call.id,
"content": content,
}
def _iterate_stream(self, response: Any) -> Generator:

View File

@@ -7,6 +7,7 @@ class LLMHandlerCreator:
handlers = {
"openai": OpenAILLMHandler,
"google": GoogleLLMHandler,
"novita": OpenAILLMHandler, # Novita uses OpenAI-compatible API
"default": OpenAILLMHandler,
}

View File

@@ -37,18 +37,18 @@ class OpenAILLMHandler(LLMHandler):
)
def create_tool_message(self, tool_call: ToolCall, result: Any) -> Dict:
"""Create OpenAI-style tool message."""
"""Create a tool result message in the standard internal format."""
import json as _json
content = (
_json.dumps(result)
if not isinstance(result, str)
else result
)
return {
"role": "tool",
"content": [
{
"function_response": {
"name": tool_call.name,
"response": {"result": result},
"call_id": tool_call.id,
}
}
],
"tool_call_id": tool_call.id,
"content": content,
}
def _iterate_stream(self, response: Any) -> Generator:

View File

@@ -38,6 +38,7 @@ class LLMCreator:
decoded_token,
model_id=None,
agent_id=None,
backup_models=None,
*args,
**kwargs,
):
@@ -59,6 +60,7 @@ class LLMCreator:
model_id=model_id,
agent_id=agent_id,
base_url=base_url,
backup_models=backup_models,
*args,
**kwargs,
)

View File

@@ -1,13 +1,13 @@
from application.core.settings import settings
from application.llm.openai import OpenAILLM
NOVITA_BASE_URL = "https://api.novita.ai/v3/openai"
NOVITA_BASE_URL = "https://api.novita.ai/openai"
class NovitaLLM(OpenAILLM):
def __init__(self, api_key=None, user_api_key=None, base_url=None, *args, **kwargs):
super().__init__(
api_key=api_key or settings.API_KEY,
api_key=api_key or settings.NOVITA_API_KEY or settings.API_KEY,
user_api_key=user_api_key,
base_url=base_url or NOVITA_BASE_URL,
*args,

View File

@@ -91,19 +91,59 @@ class OpenAILLM(BaseLLM):
if role == "model":
role = "assistant"
# Standard format: assistant message with tool_calls (passthrough)
tool_calls = message.get("tool_calls")
if tool_calls and role == "assistant":
cleaned_tcs = []
for tc in tool_calls:
func = tc.get("function", {})
args = func.get("arguments", "{}")
if isinstance(args, dict):
args = json.dumps(self._remove_null_values(args))
elif isinstance(args, str):
try:
parsed = json.loads(args)
args = json.dumps(self._remove_null_values(parsed))
except (json.JSONDecodeError, TypeError):
pass
cleaned_tcs.append({
"id": tc.get("id", ""),
"type": "function",
"function": {"name": func.get("name", ""), "arguments": args},
})
cleaned_messages.append({
"role": "assistant",
"content": None,
"tool_calls": cleaned_tcs,
})
continue
# Standard format: tool message with tool_call_id (passthrough)
tool_call_id = message.get("tool_call_id")
if role == "tool" and tool_call_id is not None:
cleaned_messages.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": content if isinstance(content, str) else json.dumps(content),
})
continue
if role and content is not None:
if isinstance(content, str):
cleaned_messages.append({"role": role, "content": content})
elif isinstance(content, list):
# Collect all content parts into a single message
content_parts = []
for item in content:
# Legacy format support: function_call / function_response
if "function_call" in item:
# Function calls need their own message
cleaned_args = self._remove_null_values(
item["function_call"]["args"]
)
args = item["function_call"]["args"]
if isinstance(args, str):
try:
args = json.loads(args)
except (json.JSONDecodeError, TypeError):
pass
cleaned_args = self._remove_null_values(args)
tool_call = {
"id": item["function_call"]["call_id"],
"type": "function",
@@ -112,28 +152,20 @@ class OpenAILLM(BaseLLM):
"arguments": json.dumps(cleaned_args),
},
}
cleaned_messages.append(
{
"role": "assistant",
"content": None,
"tool_calls": [tool_call],
}
)
cleaned_messages.append({
"role": "assistant",
"content": None,
"tool_calls": [tool_call],
})
elif "function_response" in item:
# Function responses need their own message
cleaned_messages.append(
{
"role": "tool",
"tool_call_id": item["function_response"][
"call_id"
],
"content": json.dumps(
item["function_response"]["response"]["result"]
),
}
)
cleaned_messages.append({
"role": "tool",
"tool_call_id": item["function_response"]["call_id"],
"content": json.dumps(
item["function_response"]["response"]["result"]
),
})
elif isinstance(item, dict):
# Collect content parts (text, images, files) into a single message
if "type" in item and item["type"] == "text" and "text" in item:
content_parts.append(item)
elif "type" in item and item["type"] == "file" and "file" in item:
@@ -141,10 +173,7 @@ class OpenAILLM(BaseLLM):
elif "type" in item and item["type"] == "image_url" and "image_url" in item:
content_parts.append(item)
elif "text" in item and "type" not in item:
# Legacy format: {"text": "..."} without type
content_parts.append({"type": "text", "text": item["text"]})
# Add the collected content parts as a single message
if content_parts:
cleaned_messages.append({"role": role, "content": content_parts})
else:

View File

@@ -157,5 +157,21 @@ def _log_to_mongodb(
user_logs_collection.insert_one(log_entry)
logging.debug(f"Logged activity to MongoDB: {activity_id}")
from application.storage.db.dual_write import dual_write
from application.storage.db.repositories.stack_logs import StackLogsRepository
dual_write(
StackLogsRepository,
lambda repo, e=log_entry: repo.insert(
activity_id=e["id"],
endpoint=e.get("endpoint"),
level=e.get("level"),
user_id=e.get("user"),
api_key=e.get("api_key"),
query=e.get("query"),
stacks=e.get("stacks"),
),
)
except Exception as e:
logging.error(f"Failed to log to MongoDB: {e}", exc_info=True)

View File

@@ -62,15 +62,26 @@ class BaseConnectorAuth(ABC):
def is_token_expired(self, token_info: Dict[str, Any]) -> bool:
"""
Check if a token is expired.
Args:
token_info: Token information dictionary
Returns:
True if token is expired, False otherwise
"""
pass
def sanitize_token_info(self, token_info: Dict[str, Any], **extra_fields) -> Dict[str, Any]:
"""Extract the fields safe to persist in the session store.
"""
return {
"access_token": token_info.get("access_token"),
"refresh_token": token_info.get("refresh_token"),
"token_uri": token_info.get("token_uri"),
"expiry": token_info.get("expiry"),
**extra_fields,
}
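A small usage sketch, where auth is an instance of any concrete connector auth class and the field values are placeholders; only the whitelisted fields plus the connector-specific extras survive into the session store:
raw = {
    "access_token": "placeholder-access-token",
    "refresh_token": "placeholder-refresh-token",
    "token_uri": "https://example.com/oauth/token",
    "expiry": "2026-01-01T00:00:00+00:00",
    "id_token": "never-persisted",
    "scopes": ["read"],
}
safe = auth.sanitize_token_info(raw, cloud_id="abc-123")
# safe keeps only access_token, refresh_token, token_uri, expiry and cloud_id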
class BaseConnectorLoader(ABC):
"""

View File

@@ -0,0 +1,4 @@
from .auth import ConfluenceAuth
from .loader import ConfluenceLoader
__all__ = ["ConfluenceAuth", "ConfluenceLoader"]

View File

@@ -0,0 +1,216 @@
import datetime
import logging
from typing import Any, Dict, Optional
from urllib.parse import urlencode
import requests
from application.core.settings import settings
from application.parser.connectors.base import BaseConnectorAuth
logger = logging.getLogger(__name__)
class ConfluenceAuth(BaseConnectorAuth):
SCOPES = [
"read:page:confluence",
"read:space:confluence",
"read:attachment:confluence",
"read:me",
"offline_access",
]
AUTH_URL = "https://auth.atlassian.com/authorize"
TOKEN_URL = "https://auth.atlassian.com/oauth/token"
RESOURCES_URL = "https://api.atlassian.com/oauth/token/accessible-resources"
ME_URL = "https://api.atlassian.com/me"
def __init__(self):
self.client_id = settings.CONFLUENCE_CLIENT_ID
self.client_secret = settings.CONFLUENCE_CLIENT_SECRET
self.redirect_uri = settings.CONNECTOR_REDIRECT_BASE_URI
if not self.client_id or not self.client_secret:
raise ValueError(
"Confluence OAuth credentials not configured. "
"Please set CONFLUENCE_CLIENT_ID and CONFLUENCE_CLIENT_SECRET in settings."
)
def get_authorization_url(self, state: Optional[str] = None) -> str:
params = {
"audience": "api.atlassian.com",
"client_id": self.client_id,
"scope": " ".join(self.SCOPES),
"redirect_uri": self.redirect_uri,
"state": state,
"response_type": "code",
"prompt": "consent",
}
return f"{self.AUTH_URL}?{urlencode(params)}"
def exchange_code_for_tokens(self, authorization_code: str) -> Dict[str, Any]:
if not authorization_code:
raise ValueError("Authorization code is required")
response = requests.post(
self.TOKEN_URL,
json={
"grant_type": "authorization_code",
"client_id": self.client_id,
"client_secret": self.client_secret,
"code": authorization_code,
"redirect_uri": self.redirect_uri,
},
headers={"Content-Type": "application/json"},
timeout=30,
)
response.raise_for_status()
token_data = response.json()
access_token = token_data.get("access_token")
if not access_token:
raise ValueError("OAuth flow did not return an access token")
refresh_token = token_data.get("refresh_token")
if not refresh_token:
raise ValueError("OAuth flow did not return a refresh token")
expires_in = token_data.get("expires_in", 3600)
expiry = (
datetime.datetime.now(datetime.timezone.utc)
+ datetime.timedelta(seconds=expires_in)
).isoformat()
cloud_id = self._fetch_cloud_id(access_token)
user_info = self._fetch_user_info(access_token)
return {
"access_token": access_token,
"refresh_token": refresh_token,
"token_uri": self.TOKEN_URL,
"scopes": self.SCOPES,
"expiry": expiry,
"cloud_id": cloud_id,
"user_info": {
"name": user_info.get("display_name", ""),
"email": user_info.get("email", ""),
},
}
def refresh_access_token(self, refresh_token: str) -> Dict[str, Any]:
if not refresh_token:
raise ValueError("Refresh token is required")
response = requests.post(
self.TOKEN_URL,
json={
"grant_type": "refresh_token",
"client_id": self.client_id,
"client_secret": self.client_secret,
"refresh_token": refresh_token,
},
headers={"Content-Type": "application/json"},
timeout=30,
)
response.raise_for_status()
token_data = response.json()
access_token = token_data.get("access_token")
new_refresh_token = token_data.get("refresh_token", refresh_token)
expires_in = token_data.get("expires_in", 3600)
expiry = (
datetime.datetime.now(datetime.timezone.utc)
+ datetime.timedelta(seconds=expires_in)
).isoformat()
cloud_id = self._fetch_cloud_id(access_token)
return {
"access_token": access_token,
"refresh_token": new_refresh_token,
"token_uri": self.TOKEN_URL,
"scopes": self.SCOPES,
"expiry": expiry,
"cloud_id": cloud_id,
}
def is_token_expired(self, token_info: Dict[str, Any]) -> bool:
if not token_info:
return True
expiry = token_info.get("expiry")
if not expiry:
return bool(token_info.get("access_token"))
try:
expiry_dt = datetime.datetime.fromisoformat(expiry)
now = datetime.datetime.now(datetime.timezone.utc)
return now >= expiry_dt - datetime.timedelta(seconds=60)
except Exception:
return True
def get_token_info_from_session(self, session_token: str) -> Dict[str, Any]:
from application.core.mongo_db import MongoDB
from application.core.settings import settings as app_settings
mongo = MongoDB.get_client()
db = mongo[app_settings.MONGO_DB_NAME]
session = db["connector_sessions"].find_one({"session_token": session_token})
if not session:
raise ValueError(f"Invalid session token: {session_token}")
token_info = session.get("token_info")
if not token_info:
raise ValueError("Session missing token information")
required = ["access_token", "refresh_token", "cloud_id"]
missing = [f for f in required if not token_info.get(f)]
if missing:
raise ValueError(f"Missing required token fields: {missing}")
return token_info
def sanitize_token_info(
self, token_info: Dict[str, Any], **extra_fields
) -> Dict[str, Any]:
return super().sanitize_token_info(
token_info,
cloud_id=token_info.get("cloud_id"),
**extra_fields,
)
def _fetch_cloud_id(self, access_token: str) -> str:
response = requests.get(
self.RESOURCES_URL,
headers={
"Authorization": f"Bearer {access_token}",
"Accept": "application/json",
},
timeout=30,
)
response.raise_for_status()
resources = response.json()
if not resources:
raise ValueError("No accessible Confluence sites found for this account")
return resources[0]["id"]
def _fetch_user_info(self, access_token: str) -> Dict[str, Any]:
try:
response = requests.get(
self.ME_URL,
headers={
"Authorization": f"Bearer {access_token}",
"Accept": "application/json",
},
timeout=30,
)
response.raise_for_status()
return response.json()
except Exception as e:
logger.warning("Could not fetch user info: %s", e)
return {}
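End to end, the OAuth exchange with this class looks roughly like the sketch below; the authorization code arrives on the configured redirect URI and the literal values are placeholders:
auth = ConfluenceAuth()

# 1. Send the user to Atlassian's consent screen.
consent_url = auth.get_authorization_url(state="opaque-csrf-state")

# 2. Back on the redirect URI, trade the code for tokens plus the cloud_id.
tokens = auth.exchange_code_for_tokens("code-from-callback")

# 3. Later, refresh before use if the access token has expired.
if auth.is_token_expired(tokens):
    tokens = auth.refresh_access_token(tokens["refresh_token"])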

View File

@@ -0,0 +1,416 @@
import functools
import logging
import os
from typing import Any, Dict, List, Optional
import requests
from application.parser.connectors.base import BaseConnectorLoader
from application.parser.connectors.confluence.auth import ConfluenceAuth
from application.parser.schema.base import Document
logger = logging.getLogger(__name__)
API_V2 = "https://api.atlassian.com/ex/confluence/{cloud_id}/wiki/api/v2"
DOWNLOAD_BASE = "https://api.atlassian.com/ex/confluence/{cloud_id}/wiki"
SUPPORTED_ATTACHMENT_TYPES = {
"application/pdf": ".pdf",
"application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx",
"application/vnd.openxmlformats-officedocument.presentationml.presentation": ".pptx",
"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": ".xlsx",
"application/msword": ".doc",
"application/vnd.ms-powerpoint": ".ppt",
"application/vnd.ms-excel": ".xls",
"text/plain": ".txt",
"text/csv": ".csv",
"text/html": ".html",
"text/markdown": ".md",
"application/json": ".json",
"application/epub+zip": ".epub",
"image/jpeg": ".jpg",
"image/png": ".png",
}
def _retry_on_auth_failure(func):
@functools.wraps(func)
def wrapper(self, *args, **kwargs):
try:
return func(self, *args, **kwargs)
except requests.exceptions.HTTPError as e:
if e.response is not None and e.response.status_code in (401, 403):
logger.info(
"Auth failure in %s, refreshing token and retrying", func.__name__
)
try:
new_token_info = self.auth.refresh_access_token(self.refresh_token)
self.access_token = new_token_info["access_token"]
self.refresh_token = new_token_info.get(
"refresh_token", self.refresh_token
)
self._persist_refreshed_tokens(new_token_info)
except Exception as refresh_err:
raise ValueError(
f"Authentication failed and could not be refreshed: {refresh_err}"
) from e
return func(self, *args, **kwargs)
raise
return wrapper
class ConfluenceLoader(BaseConnectorLoader):
def __init__(self, session_token: str):
self.auth = ConfluenceAuth()
self.session_token = session_token
token_info = self.auth.get_token_info_from_session(session_token)
self.access_token = token_info["access_token"]
self.refresh_token = token_info["refresh_token"]
self.cloud_id = token_info["cloud_id"]
self.base_url = API_V2.format(cloud_id=self.cloud_id)
self.download_base = DOWNLOAD_BASE.format(cloud_id=self.cloud_id)
self.next_page_token = None
def _headers(self) -> Dict[str, str]:
return {
"Authorization": f"Bearer {self.access_token}",
"Accept": "application/json",
}
def _persist_refreshed_tokens(self, token_info: Dict[str, Any]) -> None:
try:
from application.core.mongo_db import MongoDB
from application.core.settings import settings as app_settings
sanitized = self.auth.sanitize_token_info(token_info)
mongo = MongoDB.get_client()
db = mongo[app_settings.MONGO_DB_NAME]
db["connector_sessions"].update_one(
{"session_token": self.session_token},
{"$set": {"token_info": sanitized}},
)
except Exception as e:
logger.warning("Failed to persist refreshed tokens: %s", e)
@_retry_on_auth_failure
def load_data(self, inputs: Dict[str, Any]) -> List[Document]:
folder_id = inputs.get("folder_id")
file_ids = inputs.get("file_ids", [])
limit = inputs.get("limit", 100)
list_only = inputs.get("list_only", False)
page_token = inputs.get("page_token")
search_query = inputs.get("search_query")
self.next_page_token = None
if file_ids:
return self._load_pages_by_ids(file_ids, list_only, search_query)
if folder_id:
return self._list_pages_in_space(
folder_id, limit, list_only, page_token, search_query
)
return self._list_spaces(limit, page_token, search_query)
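The inputs dict decides which branch runs: explicit page ids win, then a space (folder) listing, otherwise the list of spaces itself. A hedged usage sketch, assuming a stored connector session token and that Document exposes the doc_id it was constructed with:
loader = ConfluenceLoader(session_token="stored-session-token")

spaces = loader.load_data({"limit": 50})                  # list spaces
pages = loader.load_data({"folder_id": spaces[0].doc_id,  # list page metadata only
                          "list_only": True})
docs = loader.load_data({"file_ids": [pages[0].doc_id]})  # load full page bodies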
@_retry_on_auth_failure
def download_to_directory(self, local_dir: str, source_config: dict = None) -> dict:
config = source_config or getattr(self, "config", {})
file_ids = config.get("file_ids", [])
folder_ids = config.get("folder_ids", [])
files_downloaded = 0
os.makedirs(local_dir, exist_ok=True)
if isinstance(file_ids, str):
file_ids = [file_ids]
if isinstance(folder_ids, str):
folder_ids = [folder_ids]
for page_id in file_ids:
if self._download_page(page_id, local_dir):
files_downloaded += 1
files_downloaded += self._download_page_attachments(page_id, local_dir)
for space_id in folder_ids:
files_downloaded += self._download_space(space_id, local_dir)
return {
"files_downloaded": files_downloaded,
"directory_path": local_dir,
"empty_result": files_downloaded == 0,
"source_type": "confluence",
"config_used": config,
}
def _list_spaces(
self, limit: int, cursor: Optional[str], search_query: Optional[str]
) -> List[Document]:
documents: List[Document] = []
params: Dict[str, Any] = {"limit": min(limit, 250)}
if cursor:
params["cursor"] = cursor
response = requests.get(
f"{self.base_url}/spaces",
headers=self._headers(),
params=params,
timeout=30,
)
response.raise_for_status()
data = response.json()
for space in data.get("results", []):
name = space.get("name", "")
if search_query and search_query.lower() not in name.lower():
continue
documents.append(
Document(
text="",
doc_id=space["id"],
extra_info={
"file_name": name,
"mime_type": "folder",
"size": None,
"created_time": space.get("createdAt"),
"modified_time": None,
"source": "confluence",
"is_folder": True,
"space_key": space.get("key"),
},
)
)
next_link = data.get("_links", {}).get("next")
self.next_page_token = self._extract_cursor(next_link)
return documents
def _list_pages_in_space(
self,
space_id: str,
limit: int,
list_only: bool,
cursor: Optional[str],
search_query: Optional[str],
) -> List[Document]:
documents: List[Document] = []
params: Dict[str, Any] = {"limit": min(limit, 250)}
if cursor:
params["cursor"] = cursor
response = requests.get(
f"{self.base_url}/spaces/{space_id}/pages",
headers=self._headers(),
params=params,
timeout=30,
)
response.raise_for_status()
data = response.json()
for page in data.get("results", []):
title = page.get("title", "")
if search_query and search_query.lower() not in title.lower():
continue
doc = self._page_to_document(
page, load_content=not list_only, space_id=space_id
)
if doc:
documents.append(doc)
next_link = data.get("_links", {}).get("next")
self.next_page_token = self._extract_cursor(next_link)
return documents
def _load_pages_by_ids(
self, page_ids: List[str], list_only: bool, search_query: Optional[str]
) -> List[Document]:
documents: List[Document] = []
for page_id in page_ids:
try:
params: Dict[str, str] = {}
if not list_only:
params["body-format"] = "storage"
response = requests.get(
f"{self.base_url}/pages/{page_id}",
headers=self._headers(),
params=params,
timeout=30,
)
response.raise_for_status()
page = response.json()
title = page.get("title", "")
if search_query and search_query.lower() not in title.lower():
continue
doc = self._page_to_document(page, load_content=not list_only)
if doc:
documents.append(doc)
except Exception as e:
logger.error("Error loading page %s: %s", page_id, e)
return documents
def _page_to_document(
self,
page: Dict[str, Any],
load_content: bool = False,
space_id: Optional[str] = None,
) -> Optional[Document]:
page_id = page.get("id")
title = page.get("title", "Unknown")
version = page.get("version", {})
modified_time = version.get("createdAt") if isinstance(version, dict) else None
created_time = page.get("createdAt")
resolved_space_id = space_id or page.get("spaceId")
text = ""
if load_content:
body = page.get("body", {})
storage = body.get("storage", {}) if isinstance(body, dict) else {}
text = storage.get("value", "") if isinstance(storage, dict) else ""
return Document(
text=text,
doc_id=str(page_id),
extra_info={
"file_name": title,
"mime_type": "text/html",
"size": len(text) if text else None,
"created_time": created_time,
"modified_time": modified_time,
"source": "confluence",
"is_folder": False,
"page_id": str(page_id),
"space_id": resolved_space_id,
"cloud_id": self.cloud_id,
},
)
def _download_page(self, page_id: str, local_dir: str) -> bool:
try:
response = requests.get(
f"{self.base_url}/pages/{page_id}",
headers=self._headers(),
params={"body-format": "storage"},
timeout=30,
)
response.raise_for_status()
page = response.json()
title = page.get("title", page_id)
safe_name = "".join(c if c.isalnum() or c in " -_" else "_" for c in title)
body = page.get("body", {}).get("storage", {}).get("value", "")
file_path = os.path.join(local_dir, f"{safe_name}.html")
with open(file_path, "w", encoding="utf-8") as f:
f.write(body)
return True
except Exception as e:
logger.error("Error downloading page %s: %s", page_id, e)
return False
def _download_page_attachments(self, page_id: str, local_dir: str) -> int:
downloaded = 0
try:
cursor = None
while True:
params: Dict[str, Any] = {"limit": 100}
if cursor:
params["cursor"] = cursor
response = requests.get(
f"{self.base_url}/pages/{page_id}/attachments",
headers=self._headers(),
params=params,
timeout=30,
)
response.raise_for_status()
data = response.json()
for att in data.get("results", []):
media_type = att.get("mediaType", "")
if media_type not in SUPPORTED_ATTACHMENT_TYPES:
continue
download_link = att.get("_links", {}).get("download")
if not download_link:
continue
raw_name = att.get("title", att.get("id", "attachment"))
file_name = "".join(
c if c.isalnum() or c in " -_." else "_"
for c in os.path.basename(raw_name)
) or "attachment"
file_path = os.path.join(local_dir, file_name)
url = f"{self.download_base}{download_link}"
file_resp = requests.get(
url, headers=self._headers(), timeout=60, stream=True
)
file_resp.raise_for_status()
with open(file_path, "wb") as f:
for chunk in file_resp.iter_content(chunk_size=8192):
f.write(chunk)
downloaded += 1
next_link = data.get("_links", {}).get("next")
cursor = self._extract_cursor(next_link)
if not cursor:
break
except Exception as e:
logger.error("Error downloading attachments for page %s: %s", page_id, e)
return downloaded
def _download_space(self, space_id: str, local_dir: str) -> int:
downloaded = 0
cursor = None
while True:
params: Dict[str, Any] = {"limit": 250}
if cursor:
params["cursor"] = cursor
try:
response = requests.get(
f"{self.base_url}/spaces/{space_id}/pages",
headers=self._headers(),
params=params,
timeout=30,
)
response.raise_for_status()
data = response.json()
except Exception as e:
logger.error("Error listing pages in space %s: %s", space_id, e)
break
for page in data.get("results", []):
page_id = page.get("id")
if self._download_page(str(page_id), local_dir):
downloaded += 1
downloaded += self._download_page_attachments(str(page_id), local_dir)
next_link = data.get("_links", {}).get("next")
cursor = self._extract_cursor(next_link)
if not cursor:
break
return downloaded
@staticmethod
def _extract_cursor(next_link: Optional[str]) -> Optional[str]:
if not next_link:
return None
from urllib.parse import parse_qs, urlparse
parsed = urlparse(next_link)
cursors = parse_qs(parsed.query).get("cursor")
return cursors[0] if cursors else None
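For example, given Confluence's relative next link the helper returns just the opaque cursor value (illustrative input):
next_link = "/wiki/api/v2/spaces?limit=250&cursor=eyJpZCI6IjEyMyJ9"
ConfluenceLoader._extract_cursor(next_link)  # -> "eyJpZCI6IjEyMyJ9"
ConfluenceLoader._extract_cursor(None)       # -> None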

View File

@@ -1,5 +1,9 @@
from application.parser.connectors.google_drive.loader import GoogleDriveLoader
from application.parser.connectors.confluence.auth import ConfluenceAuth
from application.parser.connectors.confluence.loader import ConfluenceLoader
from application.parser.connectors.google_drive.auth import GoogleDriveAuth
from application.parser.connectors.google_drive.loader import GoogleDriveLoader
from application.parser.connectors.share_point.auth import SharePointAuth
from application.parser.connectors.share_point.loader import SharePointLoader
class ConnectorCreator:
@@ -11,11 +15,15 @@ class ConnectorCreator:
"""
connectors = {
"confluence": ConfluenceLoader,
"google_drive": GoogleDriveLoader,
"share_point": SharePointLoader,
}
auth_providers = {
"confluence": ConfluenceAuth,
"google_drive": GoogleDriveAuth,
"share_point": SharePointAuth,
}
@classmethod

View File

@@ -232,10 +232,6 @@ class GoogleDriveAuth(BaseConnectorAuth):
if missing_fields:
raise ValueError(f"Missing required token fields: {missing_fields}")
if 'client_id' not in token_info:
token_info['client_id'] = settings.GOOGLE_CLIENT_ID
if 'client_secret' not in token_info:
token_info['client_secret'] = settings.GOOGLE_CLIENT_SECRET
if 'token_uri' not in token_info:
token_info['token_uri'] = 'https://oauth2.googleapis.com/token'

View File

@@ -327,15 +327,10 @@ class GoogleDriveLoader(BaseConnectorLoader):
content_bytes = file_io.getvalue()
try:
content = content_bytes.decode('utf-8')
return content_bytes.decode('utf-8')
except UnicodeDecodeError:
try:
content = content_bytes.decode('latin-1')
except UnicodeDecodeError:
logging.error(f"Could not decode file {file_id} as text")
return None
return content
logging.error(f"Could not decode file {file_id} as text")
return None
except HttpError as e:
logging.error(f"HTTP error downloading file {file_id}: {e.resp.status} - {e.content}")

View File

@@ -0,0 +1,10 @@
"""
SharePoint connector package for DocsGPT.
This module provides authentication and document loading capabilities for SharePoint.
"""
from .auth import SharePointAuth
from .loader import SharePointLoader
__all__ = ['SharePointAuth', 'SharePointLoader']

View File

@@ -0,0 +1,152 @@
import datetime
import logging
from typing import Optional, Dict, Any
from msal import ConfidentialClientApplication
from application.core.settings import settings
from application.parser.connectors.base import BaseConnectorAuth
logger = logging.getLogger(__name__)
class SharePointAuth(BaseConnectorAuth):
"""
Handles Microsoft OAuth 2.0 authentication for SharePoint/OneDrive.
Note: Files.Read scope allows access to files the user has granted access to,
similar to Google Drive's drive.file scope.
"""
SCOPES = [
"Files.Read",
"Sites.Read.All",
"User.Read",
]
def __init__(self):
self.client_id = settings.MICROSOFT_CLIENT_ID
self.client_secret = settings.MICROSOFT_CLIENT_SECRET
if not self.client_id:
raise ValueError(
"Microsoft OAuth credentials not configured. Please set MICROSOFT_CLIENT_ID in settings."
)
if not self.client_secret:
raise ValueError(
"Microsoft OAuth credentials not configured. Please set MICROSOFT_CLIENT_SECRET in settings."
)
self.redirect_uri = settings.CONNECTOR_REDIRECT_BASE_URI
self.tenant_id = settings.MICROSOFT_TENANT_ID
self.authority = getattr(settings, "MICROSOFT_AUTHORITY", f"https://login.microsoftonline.com/{self.tenant_id}")
self.auth_app = ConfidentialClientApplication(
client_id=self.client_id,
client_credential=self.client_secret,
authority=self.authority
)
def get_authorization_url(self, state: Optional[str] = None) -> str:
return self.auth_app.get_authorization_request_url(
scopes=self.SCOPES, state=state, redirect_uri=self.redirect_uri
)
def exchange_code_for_tokens(self, authorization_code: str) -> Dict[str, Any]:
result = self.auth_app.acquire_token_by_authorization_code(
code=authorization_code,
scopes=self.SCOPES,
redirect_uri=self.redirect_uri
)
if "error" in result:
logger.error("Token exchange failed: %s", result.get("error_description"))
raise ValueError(f"Error acquiring token: {result.get('error_description')}")
return self.map_token_response(result)
def refresh_access_token(self, refresh_token: str) -> Dict[str, Any]:
result = self.auth_app.acquire_token_by_refresh_token(refresh_token=refresh_token, scopes=self.SCOPES)
if "error" in result:
logger.error("Token refresh failed: %s", result.get("error_description"))
raise ValueError(f"Error refreshing token: {result.get('error_description')}")
return self.map_token_response(result)
def get_token_info_from_session(self, session_token: str) -> Dict[str, Any]:
try:
from application.core.mongo_db import MongoDB
from application.core.settings import settings
mongo = MongoDB.get_client()
db = mongo[settings.MONGO_DB_NAME]
sessions_collection = db["connector_sessions"]
session = sessions_collection.find_one({"session_token": session_token})
if not session:
raise ValueError(f"Invalid session token: {session_token}")
if "token_info" not in session:
raise ValueError("Session missing token information")
token_info = session["token_info"]
if not token_info:
raise ValueError("Invalid token information")
required_fields = ["access_token", "refresh_token"]
missing_fields = [field for field in required_fields if field not in token_info or not token_info.get(field)]
if missing_fields:
raise ValueError(f"Missing required token fields: {missing_fields}")
if 'token_uri' not in token_info:
token_info['token_uri'] = f"https://login.microsoftonline.com/{settings.MICROSOFT_TENANT_ID}/oauth2/v2.0/token"
return token_info
except Exception as e:
logger.error("Failed to retrieve token from session: %s", e)
raise ValueError(f"Failed to retrieve SharePoint token information: {str(e)}")
def is_token_expired(self, token_info: Dict[str, Any]) -> bool:
if not token_info:
return True
expiry_timestamp = token_info.get("expiry")
if expiry_timestamp is None:
return True
current_timestamp = int(datetime.datetime.now().timestamp())
return (expiry_timestamp - current_timestamp) < 60
def sanitize_token_info(self, token_info: Dict[str, Any], **extra_fields) -> Dict[str, Any]:
return super().sanitize_token_info(
token_info,
allows_shared_content=token_info.get("allows_shared_content", False),
**extra_fields,
)
PERSONAL_ACCOUNT_TENANT_ID = "9188040d-6c67-4c5b-b112-36a304b66dad"
def _allows_shared_content(self, id_token_claims: Dict[str, Any]) -> bool:
"""Return True when the account is a work/school tenant that can access SharePoint shared content."""
tid = id_token_claims.get("tid", "")
return bool(tid) and tid != self.PERSONAL_ACCOUNT_TENANT_ID
def map_token_response(self, result) -> Dict[str, Any]:
claims = result.get("id_token_claims", {})
return {
"access_token": result.get("access_token"),
"refresh_token": result.get("refresh_token"),
"token_uri": claims.get("iss"),
"scopes": result.get("scope"),
"expiry": claims.get("exp"),
"allows_shared_content": self._allows_shared_content(claims),
"user_info": {
"name": claims.get("name"),
"email": claims.get("preferred_username"),
},
}
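In practice the tenant claim on the id token decides whether shared-content listing is offered, since all personal Microsoft accounts share one well-known tenant id. A sketch with illustrative claims, assuming Microsoft credentials are configured in settings:
auth = SharePointAuth()

work_claims = {"tid": "72f988bf-0000-0000-0000-000000000000"}  # placeholder tenant
personal_claims = {"tid": SharePointAuth.PERSONAL_ACCOUNT_TENANT_ID}

auth._allows_shared_content(work_claims)      # True  -> shared listing allowed
auth._allows_shared_content(personal_claims)  # False -> personal account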

View File

@@ -0,0 +1,650 @@
"""
SharePoint/OneDrive loader for DocsGPT.
Loads documents from SharePoint/OneDrive using Microsoft Graph API.
"""
import functools
import logging
import os
from typing import List, Dict, Any, Optional, Tuple
from urllib.parse import quote
import requests
from application.parser.connectors.base import BaseConnectorLoader
from application.parser.connectors.share_point.auth import SharePointAuth
from application.parser.schema.base import Document
def _retry_on_auth_failure(func):
"""Retry once after refreshing the access token on 401/403 responses."""
@functools.wraps(func)
def wrapper(self, *args, **kwargs):
try:
return func(self, *args, **kwargs)
except requests.exceptions.HTTPError as e:
if e.response is not None and e.response.status_code in (401, 403):
logging.info(f"Auth failure in {func.__name__}, refreshing token and retrying")
try:
new_token_info = self.auth.refresh_access_token(self.refresh_token)
self.access_token = new_token_info.get('access_token')
except Exception as refresh_error:
raise ValueError(
f"Authentication failed and could not be refreshed: {refresh_error}"
) from e
return func(self, *args, **kwargs)
raise
return wrapper
class SharePointLoader(BaseConnectorLoader):
SUPPORTED_MIME_TYPES = {
'application/pdf': '.pdf',
'application/vnd.openxmlformats-officedocument.wordprocessingml.document': '.docx',
'application/vnd.openxmlformats-officedocument.presentationml.presentation': '.pptx',
'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': '.xlsx',
'application/msword': '.doc',
'application/vnd.ms-powerpoint': '.ppt',
'application/vnd.ms-excel': '.xls',
'text/plain': '.txt',
'text/csv': '.csv',
'text/html': '.html',
'text/markdown': '.md',
'text/x-rst': '.rst',
'application/json': '.json',
'application/epub+zip': '.epub',
'application/rtf': '.rtf',
'image/jpeg': '.jpg',
'image/png': '.png',
}
EXTENSION_TO_MIME = {v: k for k, v in SUPPORTED_MIME_TYPES.items()}
GRAPH_API_BASE = "https://graph.microsoft.com/v1.0"
def __init__(self, session_token: str):
self.auth = SharePointAuth()
self.session_token = session_token
token_info = self.auth.get_token_info_from_session(session_token)
self.access_token = token_info.get('access_token')
self.refresh_token = token_info.get('refresh_token')
self.allows_shared_content = token_info.get('allows_shared_content', False)
if not self.access_token:
raise ValueError("No access token found in session")
self.next_page_token = None
def _get_headers(self) -> Dict[str, str]:
return {
'Authorization': f'Bearer {self.access_token}',
'Accept': 'application/json'
}
def _ensure_valid_token(self):
if not self.access_token:
raise ValueError("No access token available")
token_info = {'access_token': self.access_token, 'expiry': None}
if self.auth.is_token_expired(token_info):
logging.info("Token expired, attempting refresh")
try:
new_token_info = self.auth.refresh_access_token(self.refresh_token)
self.access_token = new_token_info.get('access_token')
except Exception:
raise ValueError("Failed to refresh access token")
def _get_item_url(self, item_ref: str) -> str:
if ':' in item_ref:
drive_id, item_id = item_ref.split(':', 1)
return f"{self.GRAPH_API_BASE}/drives/{drive_id}/items/{item_id}"
return f"{self.GRAPH_API_BASE}/me/drive/items/{item_ref}"
def _process_file(self, file_metadata: Dict[str, Any], load_content: bool = True) -> Optional[Document]:
try:
drive_item_id = file_metadata.get('id')
file_name = file_metadata.get('name', 'Unknown')
file_data = file_metadata.get('file', {})
mime_type = file_data.get('mimeType', 'application/octet-stream')
if mime_type not in self.SUPPORTED_MIME_TYPES:
logging.info(f"Skipping unsupported file type: {mime_type} for file {file_name}")
return None
doc_metadata = {
'file_name': file_name,
'mime_type': mime_type,
'size': file_metadata.get('size'),
'created_time': file_metadata.get('createdDateTime'),
'modified_time': file_metadata.get('lastModifiedDateTime'),
'source': 'share_point'
}
if not load_content:
return Document(
text="",
doc_id=drive_item_id,
extra_info=doc_metadata
)
content = self._download_file_content(drive_item_id)
if content is None:
logging.warning(f"Could not load content for file {file_name} ({drive_item_id})")
return None
return Document(
text=content,
doc_id=drive_item_id,
extra_info=doc_metadata
)
except Exception as e:
logging.error(f"Error processing file: {e}")
return None
def load_data(self, inputs: Dict[str, Any]) -> List[Document]:
try:
documents: List[Document] = []
folder_id = inputs.get('folder_id')
file_ids = inputs.get('file_ids', [])
limit = inputs.get('limit', 100)
list_only = inputs.get('list_only', False)
load_content = not list_only
page_token = inputs.get('page_token')
search_query = inputs.get('search_query')
self.next_page_token = None
shared = inputs.get('shared', False)
if file_ids:
for file_id in file_ids:
try:
doc = self._load_file_by_id(file_id, load_content=load_content)
if doc:
if not search_query or (
search_query.lower() in doc.extra_info.get('file_name', '').lower()
):
documents.append(doc)
except Exception as e:
logging.error(f"Error loading file {file_id}: {e}")
continue
elif shared:
if not self.allows_shared_content:
logging.warning("Shared content is only available for work/school Microsoft accounts")
return []
documents = self._list_shared_items(
limit=limit,
load_content=load_content,
page_token=page_token,
search_query=search_query
)
else:
parent_id = folder_id if folder_id else 'root'
documents = self._list_items_in_parent(
parent_id,
limit=limit,
load_content=load_content,
page_token=page_token,
search_query=search_query
)
logging.info(f"Loaded {len(documents)} documents from SharePoint/OneDrive")
return documents
except Exception as e:
logging.error(f"Error loading data from SharePoint/OneDrive: {e}", exc_info=True)
raise
@_retry_on_auth_failure
def _load_file_by_id(self, file_id: str, load_content: bool = True) -> Optional[Document]:
self._ensure_valid_token()
try:
url = self._get_item_url(file_id)
params = {'$select': 'id,name,file,createdDateTime,lastModifiedDateTime,size'}
response = requests.get(url, headers=self._get_headers(), params=params, timeout=100)
response.raise_for_status()
file_metadata = response.json()
return self._process_file(file_metadata, load_content=load_content)
except requests.exceptions.HTTPError:
raise
except Exception as e:
logging.error(f"Error loading file {file_id}: {e}")
return None
@_retry_on_auth_failure
def _list_items_in_parent(self, parent_id: str, limit: int = 100, load_content: bool = False, page_token: Optional[str] = None, search_query: Optional[str] = None) -> List[Document]:
self._ensure_valid_token()
documents: List[Document] = []
try:
url = f"{self._get_item_url(parent_id)}/children"
params = {'$top': min(100, limit) if limit else 100, '$select': 'id,name,file,folder,createdDateTime,lastModifiedDateTime,size'}
if page_token:
params['$skipToken'] = page_token
if search_query:
encoded_query = quote(search_query, safe='')
if ':' in parent_id:
drive_id = parent_id.split(':', 1)[0]
search_url = f"{self.GRAPH_API_BASE}/drives/{drive_id}/root/search(q='{encoded_query}')"
else:
search_url = f"{self.GRAPH_API_BASE}/me/drive/search(q='{encoded_query}')"
response = requests.get(search_url, headers=self._get_headers(), params=params, timeout=100)
else:
response = requests.get(url, headers=self._get_headers(), params=params, timeout=100)
response.raise_for_status()
results = response.json()
items = results.get('value', [])
for item in items:
if 'folder' in item:
doc_metadata = {
'file_name': item.get('name', 'Unknown'),
'mime_type': 'folder',
'size': item.get('size'),
'created_time': item.get('createdDateTime'),
'modified_time': item.get('lastModifiedDateTime'),
'source': 'share_point',
'is_folder': True
}
documents.append(Document(text="", doc_id=item.get('id'), extra_info=doc_metadata))
else:
doc = self._process_file(item, load_content=load_content)
if doc:
documents.append(doc)
if limit and len(documents) >= limit:
break
next_link = results.get('@odata.nextLink')
if next_link:
from urllib.parse import urlparse, parse_qs
parsed = urlparse(next_link)
query_params = parse_qs(parsed.query)
skiptoken_list = query_params.get('$skiptoken')
if skiptoken_list:
self.next_page_token = skiptoken_list[0]
else:
self.next_page_token = None
else:
self.next_page_token = None
return documents
except Exception as e:
logging.error(f"Error listing items under parent {parent_id}: {e}")
return documents
def _resolve_mime_type(self, resource: Dict[str, Any]) -> Tuple[str, bool]:
"""Resolve mime type from resource, falling back to file extension."""
file_data = resource.get('file', {})
mime_type = file_data.get('mimeType') if file_data else None
if mime_type and mime_type in self.SUPPORTED_MIME_TYPES:
return mime_type, True
name = resource.get('name', '')
ext = os.path.splitext(name)[1].lower()
if ext in self.EXTENSION_TO_MIME:
return self.EXTENSION_TO_MIME[ext], True
return mime_type or 'application/octet-stream', False
def _get_user_drive_web_url(self) -> Optional[str]:
"""Fetch the current user's OneDrive web URL for KQL path exclusion."""
try:
response = requests.get(
f"{self.GRAPH_API_BASE}/me/drive",
headers=self._get_headers(),
params={'$select': 'webUrl'},
timeout=100,
)
response.raise_for_status()
return response.json().get('webUrl')
except Exception as e:
logging.warning(f"Could not fetch user drive web URL: {e}")
return None
def _build_shared_kql_query(self, search_query: Optional[str], user_drive_url: Optional[str]) -> str:
"""Build KQL query string that excludes the user's own drive items."""
base_query = search_query if search_query else "*"
if user_drive_url:
return f'{base_query} AND -path:"{user_drive_url}"'
return base_query
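With no explicit search text the resulting KQL asks Graph search for everything except items under the caller's own OneDrive; loader stands for a SharePointLoader instance and the URL is a placeholder:
loader._build_shared_kql_query(None, "https://contoso-my.sharepoint.com/personal/user_contoso_com")
# -> '* AND -path:"https://contoso-my.sharepoint.com/personal/user_contoso_com"'
loader._build_shared_kql_query("roadmap", None)
# -> 'roadmap'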
def _list_shared_items(self, limit: int = 100, load_content: bool = False, page_token: Optional[str] = None, search_query: Optional[str] = None) -> List[Document]:
"""Fetch shared drive items using Microsoft Graph Search API with local offset paging.
We always fetch up to a fixed maximum number of hits from Graph (single request),
then page through that array locally using `page_token` as a simple integer offset.
This avoids relying on buggy or inconsistent remote `from`/`size` semantics.
"""
self._ensure_valid_token()
documents: List[Document] = []
try:
user_drive_url = self._get_user_drive_web_url()
query_text = self._build_shared_kql_query(search_query, user_drive_url)
url = f"{self.GRAPH_API_BASE}/search/query"
page_size = 500 # maximum number of hits we care about for selection
body = {
"requests": [
{
"entityTypes": ["driveItem"],
"query": {"queryString": query_text},
"from": 0,
"size": page_size,
}
]
}
headers = self._get_headers()
headers["Content-Type"] = "application/json"
response = requests.post(url, headers=headers, json=body, timeout=100)
response.raise_for_status()
results = response.json()
search_response = results.get("value", [])
if not search_response:
logging.warning("Search API returned empty value array")
self.next_page_token = None
return documents
hits_containers = search_response[0].get("hitsContainers", [])
if not hits_containers:
logging.warning("Search API returned no hitsContainers")
self.next_page_token = None
return documents
container = hits_containers[0]
total = container.get("total", 0)
raw_hits = container.get("hits", [])
# Deduplicate by effective item ID (driveId:itemId) to avoid the same
# resource appearing multiple times across the result set.
deduped_hits = []
seen_ids = set()
for hit in raw_hits:
resource = hit.get("resource", {})
item_id = resource.get("id")
drive_id = resource.get("parentReference", {}).get("driveId")
effective_id = f"{drive_id}:{item_id}" if drive_id and item_id else item_id
if not effective_id or effective_id in seen_ids:
continue
seen_ids.add(effective_id)
deduped_hits.append(hit)
hits = deduped_hits
logging.info(
f"Search API returned {total} total results, {len(raw_hits)} raw hits, {len(hits)} unique hits in this batch"
)
try:
offset = int(page_token) if page_token is not None else 0
except (TypeError, ValueError):
logging.warning(
f"Invalid page_token '{page_token}' for shared items search, defaulting to 0"
)
offset = 0
if offset < 0:
offset = 0
if offset >= len(hits):
self.next_page_token = None
return documents
end_index = offset + limit if limit else len(hits)
end_index = min(end_index, len(hits))
for hit in hits[offset:end_index]:
resource = hit.get("resource", {})
item_name = resource.get("name", "Unknown")
item_id = resource.get("id")
drive_id = resource.get("parentReference", {}).get("driveId")
effective_id = f"{drive_id}:{item_id}" if drive_id and item_id else item_id
is_folder = "folder" in resource
if is_folder:
doc_metadata = {
"file_name": item_name,
"mime_type": "folder",
"size": resource.get("size"),
"created_time": resource.get("createdDateTime"),
"modified_time": resource.get("lastModifiedDateTime"),
"source": "share_point",
"is_folder": True,
}
documents.append(
Document(text="", doc_id=effective_id, extra_info=doc_metadata)
)
else:
mime_type, supported = self._resolve_mime_type(resource)
if not supported:
logging.info(
f"Skipping unsupported shared file: {item_name} (mime: {mime_type})"
)
continue
doc_metadata = {
"file_name": item_name,
"mime_type": mime_type,
"size": resource.get("size"),
"created_time": resource.get("createdDateTime"),
"modified_time": resource.get("lastModifiedDateTime"),
"source": "share_point",
}
content = ""
if load_content:
content = self._download_file_content(effective_id) or ""
documents.append(
Document(text=content, doc_id=effective_id, extra_info=doc_metadata)
)
if limit and end_index < len(hits):
self.next_page_token = str(end_index)
else:
self.next_page_token = None
return documents
except Exception as e:
logging.error(f"Error listing shared items via search API: {e}", exc_info=True)
return documents
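# Caller-side sketch of the local offset paging above (hypothetical driver loop;
# `loader` stands for a configured instance of this class):
#   token = None
#   while True:
#       batch = loader._list_shared_items(limit=50, page_token=token)
#       handle(batch)                       # consume this page of Documents
#       token = loader.next_page_token      # str(end_index), or None when exhausted
#       if token is None:
#           break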
@_retry_on_auth_failure
def _download_file_content(self, file_id: str) -> Optional[str]:
self._ensure_valid_token()
try:
url = f"{self._get_item_url(file_id)}/content"
response = requests.get(url, headers=self._get_headers(), timeout=100)
response.raise_for_status()
try:
return response.content.decode('utf-8')
except UnicodeDecodeError:
logging.error(f"Could not decode file {file_id} as text")
return None
except requests.exceptions.HTTPError:
raise
except Exception as e:
logging.error(f"Error downloading file {file_id}: {e}")
return None
def _download_single_file(self, file_id: str, local_dir: str) -> bool:
try:
url = self._get_item_url(file_id)
params = {'$select': 'id,name,file'}
response = requests.get(url, headers=self._get_headers(), params=params, timeout=100)
response.raise_for_status()
metadata = response.json()
file_name = metadata.get('name', 'unknown')
file_data = metadata.get('file', {})
mime_type = file_data.get('mimeType', 'application/octet-stream')
if mime_type not in self.SUPPORTED_MIME_TYPES:
logging.info(f"Skipping unsupported file type: {mime_type}")
return False
os.makedirs(local_dir, exist_ok=True)
full_path = os.path.join(local_dir, file_name)
download_url = f"{self._get_item_url(file_id)}/content"
download_response = requests.get(download_url, headers=self._get_headers(), timeout=100)
download_response.raise_for_status()
with open(full_path, 'wb') as f:
f.write(download_response.content)
return True
except Exception as e:
logging.error(f"Error in _download_single_file: {e}")
return False
def _download_folder_recursive(self, folder_id: str, local_dir: str, recursive: bool = True) -> int:
files_downloaded = 0
try:
os.makedirs(local_dir, exist_ok=True)
url = f"{self._get_item_url(folder_id)}/children"
params = {'$top': 1000}
while url:
response = requests.get(url, headers=self._get_headers(), params=params, timeout=100)
response.raise_for_status()
results = response.json()
items = results.get('value', [])
logging.info(f"Found {len(items)} items in folder {folder_id}")
for item in items:
item_name = item.get('name', 'unknown')
item_id = item.get('id')
if 'folder' in item:
if recursive:
subfolder_path = os.path.join(local_dir, item_name)
os.makedirs(subfolder_path, exist_ok=True)
subfolder_files = self._download_folder_recursive(
item_id,
subfolder_path,
recursive
)
files_downloaded += subfolder_files
logging.info(f"Downloaded {subfolder_files} files from subfolder {item_name}")
else:
success = self._download_single_file(item_id, local_dir)
if success:
files_downloaded += 1
logging.info(f"Downloaded file: {item_name}")
else:
logging.warning(f"Failed to download file: {item_name}")
url = results.get('@odata.nextLink')
return files_downloaded
except Exception as e:
logging.error(f"Error in _download_folder_recursive for folder {folder_id}: {e}", exc_info=True)
return files_downloaded
def _download_folder_contents(self, folder_id: str, local_dir: str, recursive: bool = True) -> int:
try:
self._ensure_valid_token()
return self._download_folder_recursive(folder_id, local_dir, recursive)
except Exception as e:
logging.error(f"Error downloading folder {folder_id}: {e}", exc_info=True)
return 0
def _download_file_to_directory(self, file_id: str, local_dir: str) -> bool:
try:
self._ensure_valid_token()
return self._download_single_file(file_id, local_dir)
except Exception as e:
logging.error(f"Error downloading file {file_id}: {e}", exc_info=True)
return False
def download_to_directory(self, local_dir: str, source_config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
if source_config is None:
source_config = {}
config = source_config if source_config else getattr(self, 'config', {})
files_downloaded = 0
try:
folder_ids = config.get('folder_ids', [])
file_ids = config.get('file_ids', [])
recursive = config.get('recursive', True)
if file_ids:
if isinstance(file_ids, str):
file_ids = [file_ids]
for file_id in file_ids:
if self._download_file_to_directory(file_id, local_dir):
files_downloaded += 1
if folder_ids:
if isinstance(folder_ids, str):
folder_ids = [folder_ids]
for folder_id in folder_ids:
try:
url = self._get_item_url(folder_id)
params = {'$select': 'id,name'}
response = requests.get(url, headers=self._get_headers(), params=params, timeout=100)
response.raise_for_status()
folder_metadata = response.json()
folder_name = folder_metadata.get('name', '')
folder_path = os.path.join(local_dir, folder_name)
os.makedirs(folder_path, exist_ok=True)
folder_files = self._download_folder_recursive(
folder_id,
folder_path,
recursive
)
files_downloaded += folder_files
logging.info(f"Downloaded {folder_files} files from folder {folder_name}")
except Exception as e:
logging.error(f"Error downloading folder {folder_id}: {e}", exc_info=True)
if not file_ids and not folder_ids:
raise ValueError("No folder_ids or file_ids provided for download")
return {
"files_downloaded": files_downloaded,
"directory_path": local_dir,
"empty_result": files_downloaded == 0,
"source_type": "share_point",
"config_used": config
}
except Exception as e:
return {
"files_downloaded": files_downloaded,
"directory_path": local_dir,
"empty_result": True,
"source_type": "share_point",
"config_used": config,
"error": str(e)
}
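A minimal usage sketch for download_to_directory, based only on the config keys read above; the target directory and the folder/file IDs are hypothetical placeholders, and `loader` stands for an authenticated instance of this connector:

loader = ...  # authenticated SharePoint/OneDrive loader instance
result = loader.download_to_directory(
    "/tmp/sharepoint_sync",
    source_config={
        "folder_ids": ["<folder-item-id>"],
        "file_ids": ["<file-item-id>"],
        "recursive": True,
    },
)
print(result["files_downloaded"], result["empty_result"])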

View File

@@ -0,0 +1,48 @@
from pathlib import Path
from typing import Dict, Union
from application.core.settings import settings
from application.parser.file.base_parser import BaseParser
from application.stt.stt_creator import STTCreator
from application.stt.upload_limits import enforce_audio_file_size_limit
class AudioParser(BaseParser):
def __init__(self, parser_config=None):
super().__init__(parser_config=parser_config)
self._transcript_metadata: Dict[str, Dict] = {}
def _init_parser(self) -> Dict:
return {}
def parse_file(self, file: Path, errors: str = "ignore") -> Union[str, list[str]]:
_ = errors
try:
enforce_audio_file_size_limit(file.stat().st_size)
except OSError:
pass
stt = STTCreator.create_stt(settings.STT_PROVIDER)
result = stt.transcribe(
file,
language=settings.STT_LANGUAGE,
timestamps=settings.STT_ENABLE_TIMESTAMPS,
diarize=settings.STT_ENABLE_DIARIZATION,
)
transcript_metadata = {
"transcript_duration_s": result.get("duration_s"),
"transcript_language": result.get("language"),
"transcript_provider": result.get("provider"),
}
if result.get("segments"):
transcript_metadata["transcript_segments"] = result["segments"]
self._transcript_metadata[str(file)] = {
key: value
for key, value in transcript_metadata.items()
if value not in (None, [], {})
}
return result.get("text", "")
def get_file_metadata(self, file: Path) -> Dict:
return self._transcript_metadata.get(str(file), {})
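A short usage sketch for the new AudioParser; the audio path is hypothetical, and the STT provider, language, timestamp, and diarization options come from application.core.settings:

from pathlib import Path
from application.parser.file.audio_parser import AudioParser

parser = AudioParser()
audio_file = Path("meeting_recording.mp3")    # hypothetical input
text = parser.parse_file(audio_file)          # transcript text from the configured STT provider
meta = parser.get_file_metadata(audio_file)   # transcript_duration_s, transcript_language, transcript_provider, segments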

View File

@@ -36,3 +36,8 @@ class BaseParser:
@abstractmethod
def parse_file(self, file: Path, errors: str = "ignore") -> Union[str, List[str]]:
"""Parse file."""
def get_file_metadata(self, file: Path) -> Dict:
"""Return parser-specific metadata for the most recently parsed file."""
_ = file
return {}

View File

@@ -14,11 +14,17 @@ from application.parser.file.tabular_parser import PandasCSVParser, ExcelParser
from application.parser.file.json_parser import JSONParser
from application.parser.file.pptx_parser import PPTXParser
from application.parser.file.image_parser import ImageParser
from application.parser.file.audio_parser import AudioParser
from application.parser.schema.base import Document
from application.stt.constants import SUPPORTED_AUDIO_EXTENSIONS
from application.utils import num_tokens_from_string
from application.core.settings import settings
def _build_audio_parser_mapping() -> Dict[str, BaseParser]:
return {extension: AudioParser() for extension in SUPPORTED_AUDIO_EXTENSIONS}
def get_default_file_extractor(
ocr_enabled: Optional[bool] = None,
) -> Dict[str, BaseParser]:
@@ -70,6 +76,7 @@ def get_default_file_extractor(
".webp": DoclingImageParser(ocr_enabled=ocr_enabled) if ocr_enabled else ImageParser(),
# Media/subtitles
".vtt": DoclingVTTParser(),
**_build_audio_parser_mapping(),
# Specialized XML formats
".xml": DoclingXMLParser(),
# Formats docling doesn't support - use standard parsers
@@ -96,6 +103,7 @@ def get_default_file_extractor(
".png": ImageParser(),
".jpg": ImageParser(),
".jpeg": ImageParser(),
**_build_audio_parser_mapping(),
}
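To make the merge above concrete, a rough sketch of what _build_audio_parser_mapping contributes; the real extensions come from SUPPORTED_AUDIO_EXTENSIONS, and the values below are assumed for illustration only:

audio_parsers = _build_audio_parser_mapping()
# e.g. {".mp3": AudioParser(), ".wav": AudioParser(), ".m4a": AudioParser()}  (assumed extensions)
extractor = {".vtt": DoclingVTTParser(), **audio_parsers}  # each audio suffix now routes to AudioParser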
@@ -221,11 +229,13 @@ class SimpleDirectoryReader(BaseReader):
for input_file in self.input_files:
suffix_lower = input_file.suffix.lower()
parser_metadata = {}
if suffix_lower in self.file_extractor:
parser = self.file_extractor[suffix_lower]
if not parser.parser_config_set:
parser.init_parser()
data = parser.parse_file(input_file, errors=self.errors)
parser_metadata = parser.get_file_metadata(input_file)
else:
# do standard read
with open(input_file, "r", errors=self.errors) as f:
@@ -244,6 +254,8 @@ class SimpleDirectoryReader(BaseReader):
'title': input_file.name,
'token_count': file_tokens,
}
if parser_metadata:
base_metadata.update(parser_metadata)
if hasattr(self, 'input_dir'):
try:

View File

@@ -0,0 +1,27 @@
"""Shared file-extension constants for parsing and ingestion flows."""
from application.stt.constants import SUPPORTED_AUDIO_EXTENSIONS
SUPPORTED_SOURCE_DOCUMENT_EXTENSIONS = (
".rst",
".md",
".pdf",
".txt",
".docx",
".csv",
".epub",
".html",
".mdx",
".json",
".xlsx",
".pptx",
)
SUPPORTED_SOURCE_IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")
SUPPORTED_SOURCE_EXTENSIONS = (
*SUPPORTED_SOURCE_DOCUMENT_EXTENSIONS,
*SUPPORTED_SOURCE_IMAGE_EXTENSIONS,
*SUPPORTED_AUDIO_EXTENSIONS,
)
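A minimal sketch, assuming a helper placed in this same module, of how these tuples could gate which uploads get parsed; is_supported_source is hypothetical and not part of this diff:

from pathlib import Path

def is_supported_source(filename: str) -> bool:
    # True for documents, images, and audio files covered by the tuples above
    return Path(filename).suffix.lower() in SUPPORTED_SOURCE_EXTENSIONS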

View File

@@ -24,7 +24,7 @@ class PDFParser(BaseParser):
         # alternatively you can use local vision capable LLM
         with open(file, "rb") as file_loaded:
             files = {'file': file_loaded}
-            response = requests.post(doc2md_service, files=files)
+            response = requests.post(doc2md_service, files=files, timeout=100)
             data = response.json()["markdown"]
             return data

View File

@@ -19,25 +19,10 @@ class EpubParser(BaseParser):
     def parse_file(self, file: Path, errors: str = "ignore") -> str:
         """Parse file."""
         try:
-            import ebooklib
-            from ebooklib import epub
+            from fast_ebook import epub
         except ImportError:
-            raise ValueError("`EbookLib` is required to read Epub files.")
-        try:
-            import html2text
-        except ImportError:
-            raise ValueError("`html2text` is required to parse Epub files.")
+            raise ValueError("`fast-ebook` is required to read Epub files.")
-        text_list = []
-        book = epub.read_epub(file, options={"ignore_ncx": True})
-        # Iterate through all chapters.
-        for item in book.get_items():
-            # Chapters are typically located in epub documents items.
-            if item.get_type() == ebooklib.ITEM_DOCUMENT:
-                text_list.append(
-                    html2text.html2text(item.get_content().decode("utf-8"))
-                )
-        text = "\n".join(text_list)
+        book = epub.read_epub(file)
+        text = book.to_markdown()
         return text

Some files were not shown because too many files have changed in this diff.