--- title: PostgreSQL for User Data description: PostgreSQL is the user-data store for DocsGPT. Covers auto-bootstrap, production hardening, and the one-shot migration from legacy MongoDB deployments. --- import { Callout } from 'nextra/components' # PostgreSQL for User Data DocsGPT stores conversations, agents, prompts, sources, attachments, workflows, logs, and token usage in **PostgreSQL**. MongoDB is no longer required. Vector stores are independent — `VECTOR_STORE` can still be `pgvector`, `faiss`, `qdrant`, `milvus`, `elasticsearch`, or `mongodb`. ## Quickstart Three common paths. Each assumes Postgres 13+ and the default env vars `AUTO_MIGRATE=true` / `AUTO_CREATE_DB=true` (both ship enabled). ### Docker Compose The bundled compose file ships a `postgres` service. App boot handles the rest — no sidecar, no init job. ```bash cd deployment && docker compose up ``` ### Managed Postgres (Neon, RDS, Supabase, Cloud SQL) Point `POSTGRES_URI` at the provider-given URI. The app applies the schema on first boot. ```bash export POSTGRES_URI="postgresql://user:pass@host/docsgpt?sslmode=require" flask --app application/app.py run --host=0.0.0.0 --port=7091 ``` ### Bare-metal Postgres Run Postgres locally and point `POSTGRES_URI` at the default superuser. First boot creates both the database and the schema. ```bash export POSTGRES_URI="postgresql://postgres@localhost/docsgpt" flask --app application/app.py run --host=0.0.0.0 --port=7091 ``` Prefer a dedicated non-superuser role? Create it once as superuser — the app never creates roles. ```sql CREATE ROLE docsgpt LOGIN PASSWORD 'docsgpt' CREATEDB; -- Then: POSTGRES_URI=postgresql://docsgpt:docsgpt@localhost/docsgpt ``` ## How auto-bootstrap works Two env vars control startup behavior. Both default to `true` in the app and are idempotent. | Setting | Effect | Requires | | --- | --- | --- | | `AUTO_CREATE_DB` | If the target database is missing, connects to the server's `postgres` maintenance DB and issues `CREATE DATABASE`. | `CREATEDB` privilege (or superuser) | | `AUTO_MIGRATE` | Runs `alembic upgrade head` against the target database. | Table-owner or superuser on the target DB | Concurrent workers serialize through `alembic_version`, so rolling restarts are safe. If the role lacks the required privilege, startup fails fast with a clear error rather than silently skipping. Convenient in dev. In production, disable both and run migrations as an explicit step — see [Production hardening](#production-hardening). ## Production hardening Set both flags to `false` in prod and run migrations as a gated, auditable step before rolling out the app. ```env AUTO_MIGRATE=false AUTO_CREATE_DB=false ``` Run migrations from your CI/CD pipeline, a Kubernetes `Job`, or an init-container ahead of the app rollout: ```bash python scripts/db/init_postgres.py # equivalently: alembic -c application/alembic.ini upgrade head ``` The reasoning: the app's runtime role shouldn't carry DDL privileges, migrations should gate each rollout, and an explicit step is auditable — implicit first-boot bootstrap is fine for dev but muddies prod deploys. Migrations are not reversible by the app. Always back up production Postgres before running `alembic upgrade head` on a new release. ## Migrating from MongoDB One-shot, offline, app stopped. The app itself will create the Postgres schema when it boots — you only need to run the data copy. ```bash pip install -r application/requirements.txt pip install 'pymongo>=4.6' export POSTGRES_URI="postgresql://docsgpt:docsgpt@localhost:5432/docsgpt" export MONGO_URI="mongodb://user:pass@host:27017/docsgpt" python scripts/db/backfill.py --dry-run # preview python scripts/db/backfill.py # real run # or: python scripts/db/backfill.py --tables users,agents ``` Then unset `MONGO_URI` and start the backend — nothing consults Mongo in the default path anymore. The backfill is idempotent (per-table `ON CONFLICT` upserts, event-log tables deduped via `mongo_id`), so re-running is safe and re-syncs any drifted rows. Keep Mongo online until you've verified Postgres is complete; decommission afterwards unless you still use it as a vector store. No dual-write window and no runtime flag — on the current version, Postgres is the only user-data store the backend reads or writes. ## Troubleshooting - **`relation "..." does not exist`** — schema not applied. Either let the app bootstrap it (`AUTO_MIGRATE=true`) or run `python scripts/db/init_postgres.py`. - **`permission denied to create database`** — the role lacks `CREATEDB`. As superuser: `ALTER ROLE CREATEDB;`. Or create the database manually and set `AUTO_CREATE_DB=false`. - **`role "docsgpt" does not exist`** — roles are never auto-created. As superuser: `CREATE ROLE docsgpt LOGIN PASSWORD '...';`. - **SSL errors on a managed provider** — append `?sslmode=require` to `POSTGRES_URI`. - **`ModuleNotFoundError: pymongo`** — `pip install 'pymongo>=4.6'` (only needed for the one-shot Mongo backfill).