Compare commits

1 Commits

Author SHA1 Message Date
dependabot[bot]
5d0eed6084 chore(deps): bump mpmath from 1.3.0 to 1.4.1 in /application
Bumps [mpmath](https://github.com/mpmath/mpmath) from 1.3.0 to 1.4.1.
- [Release notes](https://github.com/mpmath/mpmath/releases)
- [Changelog](https://github.com/mpmath/mpmath/blob/1.4.1/CHANGES)
- [Commits](https://github.com/mpmath/mpmath/compare/1.3.0...1.4.1)

---
updated-dependencies:
- dependency-name: mpmath
  dependency-version: 1.4.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-14 19:43:29 +00:00
6 changed files with 1399 additions and 1104 deletions

@@ -1,144 +0,0 @@
# DocsGPT Public Threat Model
**Classification:** Public
**Last updated:** 2026-04-15
**Applies to:** Open-source and self-hosted DocsGPT deployments
## 1) Overview
DocsGPT ingests content (files/URLs/connectors), indexes it, and answers queries via LLM-backed APIs and optional tools.
Core components:
- Backend API (`application/`)
- Workers/ingestion (`application/worker.py` and related modules)
- Datastores (MongoDB/Redis/vector stores)
- Frontend (`frontend/`)
- Optional extensions/integrations (`extensions/`)
## 2) Scope and assumptions
In scope:
- Application-level threats in this repository.
- Local and internet-exposed self-hosted deployments.
Assumptions:
- Internet-facing instances enable auth and use strong secrets.
- Datastores/internal services are not publicly exposed.
Out of scope:
- Cloud hardware/provider compromise.
- Security guarantees of external LLM vendors.
- Full security audits of third-party systems targeted by tools (external DBs/MCP servers/code-exec APIs).
## 3) Security objectives
- Protect document/conversation confidentiality.
- Preserve integrity of prompts, agents, tools, and indexed data.
- Maintain API/worker availability.
- Enforce tenant isolation in authenticated deployments.
## 4) Assets
- Documents, attachments, chunks/embeddings, summaries.
- Conversations, agents, workflows, prompt templates.
- Secrets (JWT secret, `INTERNAL_KEY`, provider/API/OAuth credentials).
- Operational capacity (worker throughput, queue depth, model quota/cost).
## 5) Trust boundaries and untrusted input
Trust boundaries:
- Internet ↔ Frontend
- Frontend ↔ Backend API
- Backend ↔ Workers/internal APIs
- Backend/workers ↔ Datastores
- Backend ↔ External LLM/connectors/remote URLs
Untrusted input includes API payloads, file uploads, remote URLs, OAuth/webhook data, retrieved content, and LLM/tool arguments.
## 6) Main attack surfaces
1. Auth/authz paths and sharing tokens.
2. File upload + parsing pipeline.
3. Remote URL fetching and connectors (SSRF risk).
4. Agent/tool execution from LLM output.
5. Template/workflow rendering.
6. Frontend rendering + token storage.
7. Internal service endpoints (`INTERNAL_KEY`).
8. High-impact integrations (SQL tool, generic API tool, remote MCP tools).
## 7) Key threats and expected mitigations
### A. Auth/authz misconfiguration
- Threat: weak/no auth or leaked tokens leads to broad data access.
- Mitigations: require auth for public deployments, short-lived tokens, rotation/revocation, least-privilege sharing.
### B. Untrusted file ingestion
- Threat: malicious files/archives trigger traversal, parser exploits, or resource exhaustion.
- Mitigations: strict path checks, archive safeguards, file limits, patched parser dependencies.
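
A minimal sketch of the "strict path checks" mitigation, assuming each archive member name is resolved against the extraction directory before anything is written. The function name `resolve_inside` and the `/srv/uploads` path are hypothetical and are not taken from the DocsGPT codebase.

```python
import os


def resolve_inside(base_dir: str, member_name: str) -> str:
    """Return the absolute target path for an archive member.

    Raises ValueError if the member would land outside base_dir,
    for example via '..' segments, an absolute path, or a symlink.
    """
    base = os.path.realpath(base_dir)
    # os.path.join discards base when member_name is absolute, so the
    # containment check below also rejects absolute member names.
    target = os.path.realpath(os.path.join(base, member_name))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"unsafe archive member path: {member_name!r}")
    return target


if __name__ == "__main__":
    print(resolve_inside("/srv/uploads", "docs/report.pdf"))
    try:
        resolve_inside("/srv/uploads", "../../etc/passwd")
    except ValueError as exc:
        print("rejected:", exc)
```

Limits on entry count and total uncompressed size would be separate checks applied before extraction.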
### C. SSRF/outbound abuse
- Threat: URL loaders/tools access private/internal/metadata endpoints.
- Mitigations: validate URLs + redirects, block private/link-local ranges, apply egress controls/allowlists.
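
The URL validation described above could look roughly like the following sketch. It assumes that every address a hostname resolves to must be publicly routable, and that the same check is re-run for each redirect hop. The names `is_public_ip`, `resolve_host`, and `is_url_allowed` are illustrative only.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_public_ip(address: str) -> bool:
    """Return True only for globally routable, non-multicast addresses."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:10.0.0.1) so the IPv4 rules apply.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return ip.is_global and not ip.is_multicast


def resolve_host(host: str) -> list[str]:
    """Resolve a hostname to every address the system resolver returns."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_url_allowed(url: str, resolver=resolve_host) -> bool:
    """Reject non-HTTP(S) URLs and hosts resolving to any non-public address."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
    except OSError:
        return False  # fail closed on resolution errors
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

A pre-flight check like this does not by itself stop DNS rebinding, since the HTTP client may resolve the name again; pinning the connection to the validated address or enforcing egress rules at the network layer covers that gap.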
### D. Prompt injection + tool abuse
- Threat: retrieved text manipulates model behavior and causes unsafe tool calls.
- Mitigations: never rely on the model to "choose correctly" under adversarial input.
- Mitigations: treat retrieved/model output as untrusted, enforce tool policies, only expose tools explicitly assigned by the user/admin to that agent, separate system instructions from retrieved content, audit tool calls.
### E. Dangerous tool capability chaining (SQL/API/MCP)
- Threat: write-capable SQL credentials allow destructive queries.
- Threat: API tool can trigger side effects (infra/payment/webhook/code-exec endpoints).
- Threat: remote MCP tools may expose privileged operations.
- Mitigations: read-only-by-default credentials, destination allowlists, explicit approval for write/exec actions, per-tool policy enforcement + logging.
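
One way to picture per-tool policy enforcement with explicit approval for write actions is the sketch below. The `ToolPolicy` type, the `authorize` function, and the action names are hypothetical; they only illustrate the deny-by-default shape of the check and are not DocsGPT's actual tool interface.

```python
import logging
from dataclasses import dataclass

audit_log = logging.getLogger("tool_audit")


@dataclass(frozen=True)
class ToolPolicy:
    """Actions a tool may perform, split by whether they have side effects."""
    read_actions: frozenset = frozenset()
    write_actions: frozenset = frozenset()


def authorize(policy_by_tool: dict, assigned_tools: set, tool: str,
              action: str, approved: bool = False) -> bool:
    """Deny by default; writes additionally need explicit approval."""
    policy = policy_by_tool.get(tool)
    if tool not in assigned_tools or policy is None:
        allowed = False  # tool not assigned to this agent, or no policy known
    elif action in policy.read_actions:
        allowed = True
    elif action in policy.write_actions:
        allowed = approved  # side effects only with explicit approval
    else:
        allowed = False  # unknown actions are rejected
    # Every decision is logged so tool calls stay auditable.
    audit_log.info("tool=%s action=%s approved=%s allowed=%s",
                   tool, action, approved, allowed)
    return allowed
```

The decision is made outside the model: the LLM can request a call, but the policy table and the approval flag determine whether it runs.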
### F. Frontend/XSS + token theft
- Threat: XSS can steal local tokens and call APIs.
- Mitigations: reduce unsafe rendering paths, strong CSP, scoped short-lived credentials.
### G. Internal endpoint exposure
- Threat: weak/unset `INTERNAL_KEY` enables internal API abuse.
- Mitigations: fail closed, require strong random keys, keep internal APIs private.
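
"Fail closed" can be illustrated with a small check that rejects every request when the key is unset or too short, and otherwise compares in constant time. This is a sketch under assumptions: the 32-character minimum and the function name are invented for the example, and how the key reaches the process (environment variable, settings object) is left out.

```python
import hmac
from typing import Optional

MIN_KEY_LENGTH = 32  # assumed threshold, not a documented DocsGPT setting


def is_internal_request_authorized(configured_key: Optional[str],
                                   presented_key: Optional[str]) -> bool:
    """Authorize an internal call only when a strong key is set and matches."""
    # Fail closed: an unset or weak configured key authorizes nothing.
    if not configured_key or len(configured_key) < MIN_KEY_LENGTH:
        return False
    if not presented_key:
        return False
    # Constant-time comparison avoids leaking match length through timing.
    return hmac.compare_digest(configured_key.encode(), presented_key.encode())
```

With this shape, forgetting to set the key disables the internal endpoints instead of opening them.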
### H. DoS and cost abuse
- Threat: request floods, large ingestion jobs, expensive prompts/crawls.
- Mitigations: rate limits, quotas, timeouts, queue backpressure, usage budgets.
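
As an illustration of the rate-limit mitigation, here is a minimal in-process token bucket. It is a sketch only: the class name and parameters are assumptions, and a deployment with several API or worker processes would need the counters in a shared store instead of process memory.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; otherwise reject the request."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Passing a larger `cost` for expensive operations (large uploads, crawls, long prompts) lets one bucket cover both request floods and cost amplification.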
## 8) Example attacker stories
- Internet-exposed deployment runs with weak/no auth and suffers unauthorized data access and abuse.
- Intranet deployment intentionally using weak/no auth is vulnerable to insider misuse and lateral-movement abuse.
- Crafted archive attempts path traversal during extraction.
- Malicious URL/redirect chain targets internal services.
- Poisoned document causes data exfiltration through tool calls.
- Over-privileged SQL/API/MCP tool performs destructive side effects.
## 9) Severity calibration
- **Critical:** unauthenticated public data access; prompt-injection-driven exfiltration; SSRF to sensitive internal endpoints.
- **High:** cross-tenant leakage, persistent token compromise, over-privileged destructive tools.
- **Medium:** DoS/cost amplification and non-critical information disclosure.
- **Low:** minor hardening gaps with limited impact.
## 10) Baseline controls for public deployments
1. Enforce authentication and secure defaults.
2. Set/rotate strong secrets (JWT secret, `INTERNAL_KEY`, encryption keys).
3. Restrict CORS and front API with a hardened proxy.
4. Add rate limiting/quotas for answer/upload/crawl/token endpoints.
5. Enforce URL+redirect SSRF protections and egress restrictions.
6. Apply upload/archive/parsing hardening.
7. Require least-privilege tool credentials and auditable tool execution.
8. Monitor auth failures, tool anomalies, ingestion spikes, and cost anomalies.
9. Keep dependencies/images patched and scanned.
10. Validate multi-tenant isolation with explicit tests.
## 11) Maintenance
Review this model after major auth, ingestion, connector, tool, or workflow changes.
## References
- [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/)
- [OWASP ASVS](https://owasp.org/www-project-application-security-verification-standard/)
- [STRIDE overview](https://learn.microsoft.com/azure/security/develop/threat-modeling-tool-threats)
- [DocsGPT SECURITY.md](../SECURITY.md)

@@ -4,7 +4,7 @@ boto3==1.42.83
beautifulsoup4==4.14.3
cel-python==0.5.0
celery==5.6.3
-cryptography==46.0.7
+cryptography==46.0.6
dataclasses-json==0.6.7
defusedxml==0.7.1
docling>=2.16.0
@@ -29,19 +29,19 @@ jiter==0.13.0
jmespath==1.1.0
joblib==1.5.3
jsonpatch==1.33
-jsonpointer==3.1.1
+jsonpointer==3.0.0
kombu==5.6.2
langchain==1.2.3
langchain-community==0.4.1
-langchain-core==1.2.29
+langchain-core==1.2.23
langchain-openai==1.1.12
langchain-text-splitters==1.1.1
-langsmith==0.7.31
+langsmith==0.7.23
lazy-object-proxy==1.12.0
lxml==6.0.2
markupsafe==3.0.3
marshmallow>=3.18.0,<5.0.0
-mpmath==1.3.0
+mpmath==1.4.1
multidict==6.7.1
msal==1.35.1
mypy-extensions==1.1.0
@@ -84,7 +84,7 @@ tqdm==4.67.3
transformers==5.4.0
typing-extensions==4.15.0
typing-inspect==0.9.0
-tzdata==2026.1
+tzdata==2025.3
urllib3==2.6.3
vine==5.1.0
wcwidth==0.6.0

File diff suppressed because it is too large.

@@ -81,7 +81,7 @@
"parcel": "^2.16.4",
"prettier": "^3.8.1",
"process": "^0.11.10",
-"svgo": "^4.0.1",
+"svgo": "^3.3.3",
"typescript": "^5.3.3"
},
"publishConfig": {

File diff suppressed because it is too large.

@@ -33,12 +33,12 @@
"i18next-browser-languagedetector": "^8.2.1",
"lodash": "^4.17.21",
"lucide-react": "^0.562.0",
-"mermaid": "^11.14.0",
+"mermaid": "^11.12.1",
"prop-types": "^15.8.1",
"radix-ui": "^1.4.3",
"react": "^19.1.0",
"react-chartjs-2": "^5.3.0",
-"react-dom": "^19.2.5",
+"react-dom": "^19.1.1",
"react-dropzone": "^14.3.8",
"react-google-drive-picker": "^1.2.2",
"react-i18next": "^17.0.2",
@@ -53,10 +53,10 @@
"tailwind-merge": "^3.4.0"
},
"devDependencies": {
-"@tailwindcss/postcss": "^4.2.2",
+"@tailwindcss/postcss": "^4.1.10",
"@types/lodash": "^4.17.20",
"@types/react": "^19.1.8",
-"@types/react-dom": "^19.2.3",
+"@types/react-dom": "^19.1.7",
"@types/react-syntax-highlighter": "^15.5.13",
"@typescript-eslint/eslint-plugin": "^8.58.2",
"@typescript-eslint/parser": "^8.46.3",
@@ -64,7 +64,7 @@
"eslint": "^9.39.1",
"eslint-config-prettier": "^10.1.5",
"eslint-plugin-import": "^2.31.0",
-"eslint-plugin-n": "^17.24.0",
+"eslint-plugin-n": "^17.23.1",
"eslint-plugin-prettier": "^5.5.4",
"eslint-plugin-promise": "^6.6.0",
"eslint-plugin-react": "^7.37.5",
@@ -74,7 +74,7 @@
"postcss": "^8.4.49",
"prettier": "^3.5.3",
"prettier-plugin-tailwindcss": "^0.7.2",
-"tailwindcss": "^4.2.2",
+"tailwindcss": "^4.2.1",
"tw-animate-css": "^1.4.0",
"typescript": "^5.8.3",
"vite": "^8.0.0",