mirror of
https://github.com/arc53/DocsGPT.git
synced 2026-05-07 06:30:03 +00:00
Compare commits
28 Commits
codex/add-
...
codex/draf
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
c18f85a050 | ||
|
|
5ecb174567 | ||
|
|
ed7212d016 | ||
|
|
f82acdab5d | ||
|
|
361aebc34c | ||
|
|
bf194c1a0f | ||
|
|
54c396750b | ||
|
|
9adebfec69 | ||
|
|
92c321f163 | ||
|
|
e3d36b9e52 | ||
|
|
8950e11208 | ||
|
|
5de0132a65 | ||
|
|
b92ca91512 | ||
|
|
8e0b2844a2 | ||
|
|
0969db5e30 | ||
|
|
af335a27e8 | ||
|
|
0ae3139284 | ||
|
|
7529ca3dd6 | ||
|
|
1b813320f1 | ||
|
|
02012e9a0b | ||
|
|
c2f027265a | ||
|
|
0ae615c10e | ||
|
|
881d0da344 | ||
|
|
1376de6bae | ||
|
|
362ebfcc0a | ||
|
|
bc77eed3d8 | ||
|
|
1f346588e7 | ||
|
|
2fed5c882b |
144
.github/THREAT_MODEL.md
vendored
Normal file
144
.github/THREAT_MODEL.md
vendored
Normal file
@@ -0,0 +1,144 @@
|
||||
# DocsGPT Public Threat Model
|
||||
|
||||
**Classification:** Public
|
||||
**Last updated:** 2026-04-15
|
||||
**Applies to:** Open-source and self-hosted DocsGPT deployments
|
||||
|
||||
## 1) Overview
|
||||
|
||||
DocsGPT ingests content (files/URLs/connectors), indexes it, and answers queries via LLM-backed APIs and optional tools.
|
||||
|
||||
Core components:
|
||||
- Backend API (`application/`)
|
||||
- Workers/ingestion (`application/worker.py` and related modules)
|
||||
- Datastores (MongoDB/Redis/vector stores)
|
||||
- Frontend (`frontend/`)
|
||||
- Optional extensions/integrations (`extensions/`)
|
||||
|
||||
## 2) Scope and assumptions
|
||||
|
||||
In scope:
|
||||
- Application-level threats in this repository.
|
||||
- Local and internet-exposed self-hosted deployments.
|
||||
|
||||
Assumptions:
|
||||
- Internet-facing instances enable auth and use strong secrets.
|
||||
- Datastores/internal services are not publicly exposed.
|
||||
|
||||
Out of scope:
|
||||
- Cloud hardware/provider compromise.
|
||||
- Security guarantees of external LLM vendors.
|
||||
- Full security audits of third-party systems targeted by tools (external DBs/MCP servers/code-exec APIs).
|
||||
|
||||
## 3) Security objectives
|
||||
|
||||
- Protect document/conversation confidentiality.
|
||||
- Preserve integrity of prompts, agents, tools, and indexed data.
|
||||
- Maintain API/worker availability.
|
||||
- Enforce tenant isolation in authenticated deployments.
|
||||
|
||||
## 4) Assets
|
||||
|
||||
- Documents, attachments, chunks/embeddings, summaries.
|
||||
- Conversations, agents, workflows, prompt templates.
|
||||
- Secrets (JWT secret, `INTERNAL_KEY`, provider/API/OAuth credentials).
|
||||
- Operational capacity (worker throughput, queue depth, model quota/cost).
|
||||
|
||||
## 5) Trust boundaries and untrusted input
|
||||
|
||||
Trust boundaries:
|
||||
- Internet ↔ Frontend
|
||||
- Frontend ↔ Backend API
|
||||
- Backend ↔ Workers/internal APIs
|
||||
- Backend/workers ↔ Datastores
|
||||
- Backend ↔ External LLM/connectors/remote URLs
|
||||
|
||||
Untrusted input includes API payloads, file uploads, remote URLs, OAuth/webhook data, retrieved content, and LLM/tool arguments.
|
||||
|
||||
## 6) Main attack surfaces
|
||||
|
||||
1. Auth/authz paths and sharing tokens.
|
||||
2. File upload + parsing pipeline.
|
||||
3. Remote URL fetching and connectors (SSRF risk).
|
||||
4. Agent/tool execution from LLM output.
|
||||
5. Template/workflow rendering.
|
||||
6. Frontend rendering + token storage.
|
||||
7. Internal service endpoints (`INTERNAL_KEY`).
|
||||
8. High-impact integrations (SQL tool, generic API tool, remote MCP tools).
|
||||
|
||||
## 7) Key threats and expected mitigations
|
||||
|
||||
### A. Auth/authz misconfiguration
|
||||
- Threat: weak/no auth or leaked tokens leads to broad data access.
|
||||
- Mitigations: require auth for public deployments, short-lived tokens, rotation/revocation, least-privilege sharing.
|
||||
|
||||
### B. Untrusted file ingestion
|
||||
- Threat: malicious files/archives trigger traversal, parser exploits, or resource exhaustion.
|
||||
- Mitigations: strict path checks, archive safeguards, file limits, patched parser dependencies.
|
||||
|
||||
### C. SSRF/outbound abuse
|
||||
- Threat: URL loaders/tools access private/internal/metadata endpoints.
|
||||
- Mitigations: validate URLs + redirects, block private/link-local ranges, apply egress controls/allowlists.
|
||||
|
||||
### D. Prompt injection + tool abuse
|
||||
- Threat: retrieved text manipulates model behavior and causes unsafe tool calls.
|
||||
- Threat: never rely on the model to "choose correctly" under adversarial input.
|
||||
- Mitigations: treat retrieved/model output as untrusted, enforce tool policies, only expose tools explicitly assigned by the user/admin to that agent, separate system instructions from retrieved content, audit tool calls.
|
||||
|
||||
### E. Dangerous tool capability chaining (SQL/API/MCP)
|
||||
- Threat: write-capable SQL credentials allow destructive queries.
|
||||
- Threat: API tool can trigger side effects (infra/payment/webhook/code-exec endpoints).
|
||||
- Threat: remote MCP tools may expose privileged operations.
|
||||
- Mitigations: read-only-by-default credentials, destination allowlists, explicit approval for write/exec actions, per-tool policy enforcement + logging.
|
||||
|
||||
### F. Frontend/XSS + token theft
|
||||
- Threat: XSS can steal local tokens and call APIs.
|
||||
- Mitigations: reduce unsafe rendering paths, strong CSP, scoped short-lived credentials.
|
||||
|
||||
### G. Internal endpoint exposure
|
||||
- Threat: weak/unset `INTERNAL_KEY` enables internal API abuse.
|
||||
- Mitigations: fail closed, require strong random keys, keep internal APIs private.
|
||||
|
||||
### H. DoS and cost abuse
|
||||
- Threat: request floods, large ingestion jobs, expensive prompts/crawls.
|
||||
- Mitigations: rate limits, quotas, timeouts, queue backpressure, usage budgets.
|
||||
|
||||
## 8) Example attacker stories
|
||||
|
||||
- Internet-exposed deployment runs with weak/no auth and receives unauthorized data access/abuse.
|
||||
- Intranet deployment intentionally using weak/no auth is vulnerable to insider misuse and lateral-movement abuse.
|
||||
- Crafted archive attempts path traversal during extraction.
|
||||
- Malicious URL/redirect chain targets internal services.
|
||||
- Poisoned document causes data exfiltration through tool calls.
|
||||
- Over-privileged SQL/API/MCP tool performs destructive side effects.
|
||||
|
||||
## 9) Severity calibration
|
||||
|
||||
- **Critical:** unauthenticated public data access; prompt-injection-driven exfiltration; SSRF to sensitive internal endpoints.
|
||||
- **High:** cross-tenant leakage, persistent token compromise, over-privileged destructive tools.
|
||||
- **Medium:** DoS/cost amplification and non-critical information disclosure.
|
||||
- **Low:** minor hardening gaps with limited impact.
|
||||
|
||||
## 10) Baseline controls for public deployments
|
||||
|
||||
1. Enforce authentication and secure defaults.
|
||||
2. Set/rotate strong secrets (`JWT`, `INTERNAL_KEY`, encryption keys).
|
||||
3. Restrict CORS and front API with a hardened proxy.
|
||||
4. Add rate limiting/quotas for answer/upload/crawl/token endpoints.
|
||||
5. Enforce URL+redirect SSRF protections and egress restrictions.
|
||||
6. Apply upload/archive/parsing hardening.
|
||||
7. Require least-privilege tool credentials and auditable tool execution.
|
||||
8. Monitor auth failures, tool anomalies, ingestion spikes, and cost anomalies.
|
||||
9. Keep dependencies/images patched and scanned.
|
||||
10. Validate multi-tenant isolation with explicit tests.
|
||||
|
||||
## 11) Maintenance
|
||||
|
||||
Review this model after major auth, ingestion, connector, tool, or workflow changes.
|
||||
|
||||
## References
|
||||
|
||||
- [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/)
|
||||
- [OWASP ASVS](https://owasp.org/www-project-application-security-verification-standard/)
|
||||
- [STRIDE overview](https://learn.microsoft.com/azure/security/develop/threat-modeling-tool-threats)
|
||||
- [DocsGPT SECURITY.md](../SECURITY.md)
|
||||
@@ -4,7 +4,7 @@ boto3==1.42.83
|
||||
beautifulsoup4==4.14.3
|
||||
cel-python==0.5.0
|
||||
celery==5.6.3
|
||||
cryptography==46.0.6
|
||||
cryptography==46.0.7
|
||||
dataclasses-json==0.6.7
|
||||
defusedxml==0.7.1
|
||||
docling>=2.16.0
|
||||
@@ -29,14 +29,14 @@ jiter==0.13.0
|
||||
jmespath==1.1.0
|
||||
joblib==1.5.3
|
||||
jsonpatch==1.33
|
||||
jsonpointer==3.0.0
|
||||
jsonpointer==3.1.1
|
||||
kombu==5.6.2
|
||||
langchain==1.2.3
|
||||
langchain-community==0.4.1
|
||||
langchain-core==1.2.23
|
||||
langchain-core==1.2.29
|
||||
langchain-openai==1.1.12
|
||||
langchain-text-splitters==1.1.1
|
||||
langsmith==0.7.23
|
||||
langsmith==0.7.31
|
||||
lazy-object-proxy==1.12.0
|
||||
lxml==6.0.2
|
||||
markupsafe==3.0.3
|
||||
@@ -84,7 +84,7 @@ tqdm==4.67.3
|
||||
transformers==5.4.0
|
||||
typing-extensions==4.15.0
|
||||
typing-inspect==0.9.0
|
||||
tzdata==2025.3
|
||||
tzdata==2026.1
|
||||
urllib3==2.6.3
|
||||
vine==5.1.0
|
||||
wcwidth==0.6.0
|
||||
|
||||
1643
extensions/react-widget/package-lock.json
generated
1643
extensions/react-widget/package-lock.json
generated
File diff suppressed because it is too large
Load Diff
@@ -81,7 +81,7 @@
|
||||
"parcel": "^2.16.4",
|
||||
"prettier": "^3.8.1",
|
||||
"process": "^0.11.10",
|
||||
"svgo": "^3.3.3",
|
||||
"svgo": "^4.0.1",
|
||||
"typescript": "^5.3.3"
|
||||
},
|
||||
"publishConfig": {
|
||||
|
||||
690
frontend/package-lock.json
generated
690
frontend/package-lock.json
generated
File diff suppressed because it is too large
Load Diff
@@ -33,12 +33,12 @@
|
||||
"i18next-browser-languagedetector": "^8.2.1",
|
||||
"lodash": "^4.17.21",
|
||||
"lucide-react": "^0.562.0",
|
||||
"mermaid": "^11.12.1",
|
||||
"mermaid": "^11.14.0",
|
||||
"prop-types": "^15.8.1",
|
||||
"radix-ui": "^1.4.3",
|
||||
"react": "^19.1.0",
|
||||
"react-chartjs-2": "^5.3.0",
|
||||
"react-dom": "^19.1.1",
|
||||
"react-dom": "^19.2.5",
|
||||
"react-dropzone": "^14.3.8",
|
||||
"react-google-drive-picker": "^1.2.2",
|
||||
"react-i18next": "^17.0.2",
|
||||
@@ -53,10 +53,10 @@
|
||||
"tailwind-merge": "^3.4.0"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@tailwindcss/postcss": "^4.1.10",
|
||||
"@tailwindcss/postcss": "^4.2.2",
|
||||
"@types/lodash": "^4.17.20",
|
||||
"@types/react": "^19.1.8",
|
||||
"@types/react-dom": "^19.1.7",
|
||||
"@types/react-dom": "^19.2.3",
|
||||
"@types/react-syntax-highlighter": "^15.5.13",
|
||||
"@typescript-eslint/eslint-plugin": "^8.58.2",
|
||||
"@typescript-eslint/parser": "^8.46.3",
|
||||
@@ -64,7 +64,7 @@
|
||||
"eslint": "^9.39.1",
|
||||
"eslint-config-prettier": "^10.1.5",
|
||||
"eslint-plugin-import": "^2.31.0",
|
||||
"eslint-plugin-n": "^17.23.1",
|
||||
"eslint-plugin-n": "^17.24.0",
|
||||
"eslint-plugin-prettier": "^5.5.4",
|
||||
"eslint-plugin-promise": "^6.6.0",
|
||||
"eslint-plugin-react": "^7.37.5",
|
||||
@@ -74,7 +74,7 @@
|
||||
"postcss": "^8.4.49",
|
||||
"prettier": "^3.5.3",
|
||||
"prettier-plugin-tailwindcss": "^0.7.2",
|
||||
"tailwindcss": "^4.2.1",
|
||||
"tailwindcss": "^4.2.2",
|
||||
"tw-animate-css": "^1.4.0",
|
||||
"typescript": "^5.8.3",
|
||||
"vite": "^8.0.0",
|
||||
|
||||
Reference in New Issue
Block a user