diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 1b0567e4..53b9f3ab 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -27,6 +27,7 @@ Before creating issues, please check out how the latest version of our app looks ### 👨‍💻 If you're interested in contributing code, here are some important things to know: +For instructions on setting up a development environment, please refer to our [Development Deployment Guide](https://docs.docsgpt.cloud/Deploying/Development-Environment). Tech Stack Overview: @@ -36,15 +37,14 @@ Tech Stack Overview: ### 🌐 If you are looking to contribute to frontend (⚛️React, Vite): -- The current frontend is being migrated from [`/application`](https://github.com/arc53/DocsGPT/tree/main/application) to [`/frontend`](https://github.com/arc53/DocsGPT/tree/main/frontend) with a new design, so please contribute to the new one. -- Check out this [milestone](https://github.com/arc53/DocsGPT/milestone/1) and its issues. + - The updated Figma design can be found [here](https://www.figma.com/file/OXLtrl1EAy885to6S69554/DocsGPT?node-id=0%3A1&t=hjWVuxRg9yi5YkJ9-1). Please try to follow the guidelines. ### 🖥 If you are looking to contribute to Backend (🐍 Python): -- Review our issues and contribute to [`/application`](https://github.com/arc53/DocsGPT/tree/main/application) or [`/scripts`](https://github.com/arc53/DocsGPT/tree/main/scripts) (please disregard old [`ingest_rst.py`](https://github.com/arc53/DocsGPT/blob/main/scripts/old/ingest_rst.py) [`ingest_rst_sphinx.py`](https://github.com/arc53/DocsGPT/blob/main/scripts/old/ingest_rst_sphinx.py) files; these will be deprecated soon). +- Review our issues and contribute to [`/application`](https://github.com/arc53/DocsGPT/tree/main/application) - All new code should be covered with unit tests ([pytest](https://github.com/pytest-dev/pytest)). Please find tests under [`/tests`](https://github.com/arc53/DocsGPT/tree/main/tests) folder. 
- Before submitting your Pull Request, ensure it can be queried after ingesting some test data. diff --git a/README.md b/README.md index f2699fe6..ca512c2f 100644 --- a/README.md +++ b/README.md @@ -3,13 +3,11 @@

- Open-Source Documentation Assistant + Open-Source RAG Assistant

- DocsGPT is a cutting-edge open-source solution that streamlines the process of finding information in the project documentation. With its integration of the powerful GPT models, developers can easily ask questions about a project and receive accurate answers. - -Say goodbye to time-consuming manual searches, and let DocsGPT help you quickly find the information you need. Try it out and see how it revolutionizes your project documentation experience. Contribute to its development and be a part of the future of AI-powered assistance. + DocsGPT is an open-source genAI tool that helps users get reliable answers from any knowledge source while avoiding hallucinations. It enables quick and dependable information retrieval, with built-in tooling and agentic capabilities.

@@ -20,175 +18,122 @@ Say goodbye to time-consuming manual searches, and let ![link to discord](https://img.shields.io/discord/1070046503302877216) ![X (formerly Twitter) URL](https://img.shields.io/twitter/follow/docsgptai) - +
+ + [☁️ Cloud Version](https://app.docsgpt.cloud/) • [💬 Discord](https://discord.gg/n5BX8dh8rU) • [📖 Guides](https://docs.docsgpt.cloud/) +
+ [👫 Contribute](https://github.com/arc53/DocsGPT/blob/main/CONTRIBUTING.md) • [🏠 Self-host](https://docs.docsgpt.cloud/Guides/How-to-use-different-LLM) • [⚡️ Quickstart](https://github.com/arc53/DocsGPT#quickstart) +
+
+video-example-of-docs-gpt +
+

+ Key Features: +

+ + +## Roadmap + +- [x] Full GoogleAI compatibility (Jan 2025) +- [x] Add tools (Jan 2025) +- [ ] Anthropic Tool compatibility +- [ ] Add triggerable actions / tools (webhook) +- [ ] Add OAuth 2.0 authentication for tools and sources +- [ ] Manually updating chunks in the app UI +- [ ] Devcontainer for easy development +- [ ] Chatbots menu re-design to handle tools, scheduling, and more + +You can find our full roadmap [here](https://github.com/orgs/arc53/projects/2). Please don't hesitate to contribute or create issues; it helps us improve DocsGPT! ### Production Support / Help for Companies: We're eager to provide personalized assistance when deploying your DocsGPT to a live environment. -[Book a Meeting :wave:](https://cal.com/arc53/docsgpt-demo-b2b)⁠ +[Get a Demo :wave:](https://www.docsgpt.cloud/contact)⁠ -[Send Email :email:](mailto:support@docsgpt.cloud?subject=DocsGPT%20support%2Fsolutions) -[Send Email :email:](mailto:contact@arc53.com?subject=DocsGPT%20support%2Fsolutions) +[Send Email :email:](mailto:support@docsgpt.cloud?subject=DocsGPT%20support%2Fsolutions) -video-example-of-docs-gpt - -## Roadmap - -You can find our roadmap [here](https://github.com/orgs/arc53/projects/2). Please don't hesitate to contribute or create issues, it helps us improve DocsGPT! - -## Our Open-Source Models Optimized for DocsGPT: - -| Name | Base Model | Requirements (or similar) | -| --------------------------------------------------------------------- | ----------- | ------------------------- | -| [Docsgpt-7b-mistral](https://huggingface.co/Arc53/docsgpt-7b-mistral) | Mistral-7b | 1xA10G gpu | -| [Docsgpt-14b](https://huggingface.co/Arc53/docsgpt-14b) | llama-2-14b | 2xA10 gpu's | -| [Docsgpt-40b-falcon](https://huggingface.co/Arc53/docsgpt-40b-falcon) | falcon-40b | 8xA10G gpu's | - -If you don't have enough resources to run it, you can use bitsnbytes to quantize. 
- -## End to End AI Framework for Information Retrieval - -![Architecture chart](https://github.com/user-attachments/assets/fc6a7841-ddfc-45e6-b5a0-d05fe648cbe2) - -## Useful Links - -- :mag: :fire: [Cloud Version](https://app.docsgpt.cloud/) - -- :speech_balloon: :tada: [Join our Discord](https://discord.gg/n5BX8dh8rU) - -- :books: :sunglasses: [Guides](https://docs.docsgpt.cloud/) - -- :couple: [Interested in contributing?](https://github.com/arc53/DocsGPT/blob/main/CONTRIBUTING.md) - -- :file_folder: :rocket: [How to use any other documentation](https://docs.docsgpt.cloud/Guides/How-to-train-on-other-documentation) - -- :house: :closed_lock_with_key: [How to host it locally (so all data will stay on-premises)](https://docs.docsgpt.cloud/Guides/How-to-use-different-LLM) - -## Project Structure - -- Application - Flask app (main application). - -- Extensions - Chrome extension. - -- Scripts - Script that creates similarity search index for other libraries. - -- Frontend - Frontend uses Vite and React. - ## QuickStart > [!Note] > Make sure you have [Docker](https://docs.docker.com/engine/install/) installed + +1. Clone the repository and run the following commands: + ```bash + git clone https://github.com/arc53/DocsGPT.git + cd DocsGPT + ``` + On Mac OS or Linux, write: -`./setup.sh` + +2. Run the following command: + ```bash + ./setup.sh + ``` It will install all the dependencies and allow you to download the local model, use OpenAI or use our LLM API. Otherwise, refer to this Guide for Windows: -1. Download and open this repository with `git clone https://github.com/arc53/DocsGPT.git` -2. Create a `.env` file in your root directory and set the env variables and `VITE_API_STREAMING` to true or false, depending on whether you want streaming answers or not. +On Windows: + +2. Create a `.env` file in your root directory and set the env variables. 
It should look like this inside: ``` LLM_NAME=[docsgpt or openai or others] - VITE_API_STREAMING=true API_KEY=[if LLM_NAME is openai] ``` - See optional environment variables in the [/.env-template](https://github.com/arc53/DocsGPT/blob/main/.env-template) and [/application/.env_sample](https://github.com/arc53/DocsGPT/blob/main/application/.env_sample) files. + See optional environment variables in the [/application/.env_sample](https://github.com/arc53/DocsGPT/blob/main/application/.env_sample) file. -3. Run [./run-with-docker-compose.sh](https://github.com/arc53/DocsGPT/blob/main/run-with-docker-compose.sh). +3. Run the following command: + + ```bash + docker-compose up + ``` 4. Navigate to http://localhost:5173/. To stop, just run `Ctrl + C`. -## Development Environments - -### Spin up Mongo and Redis - -For development, only two containers are used from [docker-compose.yaml](https://github.com/arc53/DocsGPT/blob/main/docker-compose.yaml) (by deleting all services except for Redis and Mongo). -See file [docker-compose-dev.yaml](./docker-compose-dev.yaml). - -Run - -``` -docker compose -f docker-compose-dev.yaml build -docker compose -f docker-compose-dev.yaml up -d -``` - -### Run the Backend - > [!Note] -> Make sure you have Python 3.12 installed. - -1. Export required environment variables or prepare a `.env` file in the project folder: - - Copy [.env-template](https://github.com/arc53/DocsGPT/blob/main/application/.env-template) and create `.env`. - -(check out [`application/core/settings.py`](application/core/settings.py) if you want to see more config options.) - -2. (optional) Create a Python virtual environment: - You can follow the [Python official documentation](https://docs.python.org/3/tutorial/venv.html) for virtual environments. - -a) On Mac OS and Linux - -```commandline -python -m venv venv -. venv/bin/activate -``` - -b) On Windows - -```commandline -python -m venv venv - venv/Scripts/activate -``` - -3. 
Download embedding model and save it in the `model/` folder: -You can use the script below, or download it manually from [here](https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip), unzip it and save it in the `model/` folder. - -```commandline -wget https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip -unzip mpnet-base-v2.zip -d model -rm mpnet-base-v2.zip -``` - -4. Install dependencies for the backend: - -```commandline -pip install -r application/requirements.txt -``` - -5. Run the app using `flask --app application/app.py run --host=0.0.0.0 --port=7091`. -6. Start worker with `celery -A application.app.celery worker -l INFO`. - -### Start Frontend - -> [!Note] -> Make sure you have Node version 16 or higher. - -1. Navigate to the [/frontend](https://github.com/arc53/DocsGPT/tree/main/frontend) folder. -2. Install the required packages `husky` and `vite` (ignore if already installed). - -```commandline -npm install husky -g -npm install vite -g -``` - -3. Install dependencies by running `npm install --include=dev`. -4. Run the app using `npm run dev`. +> For development environment setup instructions, please refer to the [Development Environment Guide](https://docs.docsgpt.cloud/Deploying/Development-Environment). ## Contributing Please refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file for information about how to get involved. We welcome issues, questions, and pull requests. +## Architecture + +![Architecture chart](https://github.com/user-attachments/assets/fc6a7841-ddfc-45e6-b5a0-d05fe648cbe2) + +## Project Structure + +- Application - Flask app (main application). + +- Extensions - Extensions, like react widget or discord bot. + +- Frontend - Frontend uses Vite and React. + +- Scripts - Miscellaneous scripts. 
+ ## Code Of Conduct We as members, contributors, and leaders, pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. Please refer to the [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) file for more information about contributing. + ## Many Thanks To Our Contributors⚡ diff --git a/application/requirements.txt b/application/requirements.txt index c193f38d..8029f9fb 100644 --- a/application/requirements.txt +++ b/application/requirements.txt @@ -60,6 +60,7 @@ prance==23.6.21.0 primp==0.10.0 prompt-toolkit==3.0.48 protobuf==5.29.3 +psycopg2-binary==2.9.10 py==1.11.0 pydantic==2.10.4 pydantic-core==2.27.2 diff --git a/application/tools/implementations/postgres.py b/application/tools/implementations/postgres.py new file mode 100644 index 00000000..a83db9aa --- /dev/null +++ b/application/tools/implementations/postgres.py @@ -0,0 +1,163 @@ +import psycopg2 +from application.tools.base import Tool + +class PostgresTool(Tool): + """ + PostgreSQL Database Tool + A tool for connecting to a PostgreSQL database using a connection string, + executing SQL queries, and retrieving schema information. + """ + + def __init__(self, config): + self.config = config + self.connection_string = config.get("token", "") + + def execute_action(self, action_name, **kwargs): + actions = { + "postgres_execute_sql": self._execute_sql, + "postgres_get_schema": self._get_schema, + } + + if action_name in actions: + return actions[action_name](**kwargs) + else: + raise ValueError(f"Unknown action: {action_name}") + + def _execute_sql(self, sql_query): + """ + Executes an SQL query against the PostgreSQL database using a connection string. 
+ """ + conn = None # Initialize conn to None for error handling + try: + conn = psycopg2.connect(self.connection_string) + cur = conn.cursor() + cur.execute(sql_query) + conn.commit() + + if sql_query.strip().lower().startswith("select"): + column_names = [desc[0] for desc in cur.description] if cur.description else [] + results = [] + rows = cur.fetchall() + for row in rows: + results.append(dict(zip(column_names, row))) + response_data = {"data": results, "column_names": column_names} + else: + row_count = cur.rowcount + response_data = {"message": f"Query executed successfully, {row_count} rows affected."} + + cur.close() + return { + "status_code": 200, + "message": "SQL query executed successfully.", + "response_data": response_data, + } + + except psycopg2.Error as e: + error_message = f"Database error: {e}" + print(f"Database error: {e}") + return { + "status_code": 500, + "message": "Failed to execute SQL query.", + "error": error_message, + } + finally: + if conn: # Ensure connection is closed even if errors occur + conn.close() + + def _get_schema(self, db_name): + """ + Retrieves the schema of the PostgreSQL database using a connection string. 
+ """ + conn = None # Initialize conn to None for error handling + try: + conn = psycopg2.connect(self.connection_string) + cur = conn.cursor() + + cur.execute(""" + SELECT + table_name, + column_name, + data_type, + column_default, + is_nullable + FROM + information_schema.columns + WHERE + table_schema = 'public' + ORDER BY + table_name, + ordinal_position; + """) + + schema_data = {} + for row in cur.fetchall(): + table_name, column_name, data_type, column_default, is_nullable = row + if table_name not in schema_data: + schema_data[table_name] = [] + schema_data[table_name].append({ + "column_name": column_name, + "data_type": data_type, + "column_default": column_default, + "is_nullable": is_nullable + }) + + cur.close() + return { + "status_code": 200, + "message": "Database schema retrieved successfully.", + "schema": schema_data, + } + + except psycopg2.Error as e: + error_message = f"Database error: {e}" + print(f"Database error: {e}") + return { + "status_code": 500, + "message": "Failed to retrieve database schema.", + "error": error_message, + } + finally: + if conn: # Ensure connection is closed even if errors occur + conn.close() + + def get_actions_metadata(self): + return [ + { + "name": "postgres_execute_sql", + "description": "Execute an SQL query against the PostgreSQL database and return the results. Use this tool to interact with the database, e.g., retrieve specific data or perform updates. Only SELECT queries will return data, other queries will return execution status.", + "parameters": { + "type": "object", + "properties": { + "sql_query": { + "type": "string", + "description": "The SQL query to execute.", + }, + }, + "required": ["sql_query"], + "additionalProperties": False, + }, + }, + { + "name": "postgres_get_schema", + "description": "Retrieve the schema of the PostgreSQL database, including tables and their columns. Use this to understand the database structure before executing queries. 
", + "parameters": { + "type": "object", + "properties": { + "db_name": { + "type": "string", + "description": "The name of the database to retrieve the schema for.", + }, + }, + "required": ["db_name"], + "additionalProperties": False, + }, + }, + ] + + def get_config_requirements(self): + return { + "token": { + "type": "string", + "description": "PostgreSQL database connection string (e.g., 'postgresql://user:password@host:port/dbname')", + }, + } \ No newline at end of file diff --git a/docs/pages/Deploying/Development-Environment.md b/docs/pages/Deploying/Development-Environment.md new file mode 100644 index 00000000..bf489fb8 --- /dev/null +++ b/docs/pages/Deploying/Development-Environment.md @@ -0,0 +1,78 @@ +## Development Environments + +### Spin up Mongo and Redis + +For development, only two containers are used from [docker-compose.yaml](https://github.com/arc53/DocsGPT/blob/main/docker-compose.yaml) (by deleting all services except for Redis and Mongo). +See file [docker-compose-dev.yaml](https://github.com/arc53/DocsGPT/blob/main/docker-compose-dev.yaml). + +Run + +``` +docker compose -f docker-compose-dev.yaml build +docker compose -f docker-compose-dev.yaml up -d +``` + +### Run the Backend + +> [!Note] +> Make sure you have Python 3.12 installed. + +1. Export required environment variables or prepare a `.env` file in the project folder: + - Copy [.env-template](https://github.com/arc53/DocsGPT/blob/main/application/.env-template) and create `.env`. + +(check out [`application/core/settings.py`](application/core/settings.py) if you want to see more config options.) + +2. (optional) Create a Python virtual environment: + You can follow the [Python official documentation](https://docs.python.org/3/tutorial/venv.html) for virtual environments. + +a) On Mac OS and Linux + +```commandline +python -m venv venv +. 
venv/bin/activate +``` + +b) On Windows + +```commandline +python -m venv venv + venv/Scripts/activate +``` + +3. Download the embedding model and save it in the `model/` folder: +You can use the script below, or download it manually from [here](https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip), unzip it and save it in the `model/` folder. + +```commandline +wget https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip +unzip mpnet-base-v2.zip -d model +rm mpnet-base-v2.zip +``` + +4. Install dependencies for the backend: + +```commandline +pip install -r application/requirements.txt +``` + +5. Run the app using `flask --app application/app.py run --host=0.0.0.0 --port=7091`. +6. Start worker with `celery -A application.app.celery worker -l INFO`. + +> [!Note] +> You can also launch the backend in debug mode in VS Code by pressing SHIFT + CMD + D (or SHIFT + Windows + D on Windows) and selecting the Flask or Celery configuration. + + +### Start Frontend + +> [!Note] +> Make sure you have Node version 16 or higher. + +1. Navigate to the [/frontend](https://github.com/arc53/DocsGPT/tree/main/frontend) folder. +2. Install the required packages `husky` and `vite` (ignore if already installed). + +```commandline +npm install husky -g +npm install vite -g +``` + +3. Install dependencies by running `npm install --include=dev`. +4. Run the app using `npm run dev`. \ No newline at end of file diff --git a/docs/pages/Deploying/Quickstart.md index a2bdc706..671d242c 100644 --- a/docs/pages/Deploying/Quickstart.md +++ b/docs/pages/Deploying/Quickstart.md @@ -15,11 +15,21 @@ If you prefer to follow manual steps, refer to this guide: 1. Open and download this repository with ```bash git clone https://github.com/arc53/DocsGPT.git + cd DocsGPT ``` -2. Create a `.env` file in your root directory and set your `API_KEY` with your [OpenAI API key](https://platform.openai.com/account/api-keys). 
(optional in case you want to use OpenAI) +2. Create a `.env` file in your root directory and set the env variables. + It should look like this inside: + + ``` + LLM_NAME=[docsgpt or openai or others] + API_KEY=[if LLM_NAME is openai] + ``` + + See optional environment variables in the [/application/.env_sample](https://github.com/arc53/DocsGPT/blob/main/application/.env_sample) file. + 3. Run the following commands: ```bash - docker-compose build && docker-compose up + docker compose up ``` 4. Navigate to http://localhost:5173/. @@ -27,43 +37,28 @@ To stop, simply press **Ctrl + C**. **For WINDOWS:** -To run the setup on Windows, you have two options: using the Windows Subsystem for Linux (WSL) or using Git Bash or Command Prompt. - -**Option 1: Using Windows Subsystem for Linux (WSL):** - -1. Install WSL if you haven't already. You can follow the official Microsoft documentation for installation: (https://learn.microsoft.com/en-us/windows/wsl/install). -2. After setting up WSL, open the WSL terminal. -3. Clone the repository and create the `.env` file: +1. Open and download this repository with ```bash git clone https://github.com/arc53/DocsGPT.git cd DocsGPT - echo "API_KEY=Yourkey" > .env - echo "VITE_API_STREAMING=true" >> .env ``` -4. Run the following command to start the setup with Docker Compose: - ```bash - ./run-with-docker-compose.sh - ``` -6. Open your web browser and navigate to http://localhost:5173/. -7. To stop the setup, just press **Ctrl + C** in the WSL terminal -**Option 2: Using Git Bash or Command Prompt (CMD):** +2. Create a `.env` file in your root directory and set the env variables. + It should look like this inside: -1. Install Git for Windows if you haven't already. Download it from the official website: (https://gitforwindows.org/). -2. Open Git Bash or Command Prompt. -3. 
Clone the repository and create the `.env` file: - ```bash - git clone https://github.com/arc53/DocsGPT.git - cd DocsGPT - echo "API_KEY=Yourkey" > .env - echo "VITE_API_STREAMING=true" >> .env ``` -4. Run the following command to start the setup with Docker Compose: - ```bash - ./run-with-docker-compose.sh + LLM_NAME=[docsgpt or openai or others] + API_KEY=[if LLM_NAME is openai] ``` -5. Open your web browser and navigate to http://localhost:5173/. -6. To stop the setup, just press **Ctrl + C** in the Git Bash or Command Prompt terminal. -These steps should help you set up and run the project on Windows using either WSL or Git Bash/Command Prompt. + See optional environment variables in the [/application/.env_sample](https://github.com/arc53/DocsGPT/blob/main/application/.env_sample) file. + +3. Run the following command: + + ```bash + docker compose up + ``` +4. Navigate to http://localhost:5173/. +5. To stop the setup, just press **Ctrl + C** in the terminal. + **Important:** Ensure that Docker is installed and properly configured on your Windows system for these steps to work. 
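The `.env` file described in the quickstart above is plain `KEY=VALUE` lines. As a rough illustration of that format (a sketch only — this is not the loader DocsGPT actually uses; `parse_env` and `sample` are invented for this example):

```python
# Illustrative parser for the KEY=VALUE .env format shown in the quickstart.
# DocsGPT's real settings loader may behave differently (e.g. quoting rules).
def parse_env(text: str) -> dict[str, str]:
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """LLM_NAME=docsgpt
API_KEY=sk-example
"""
print(parse_env(sample))  # {'LLM_NAME': 'docsgpt', 'API_KEY': 'sk-example'}
```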
diff --git a/docs/pages/Deploying/_meta.json b/docs/pages/Deploying/_meta.json index 64cd77db..d01e1f67 100644 --- a/docs/pages/Deploying/_meta.json +++ b/docs/pages/Deploying/_meta.json @@ -7,6 +7,10 @@ "title": "⚡️Quickstart", "href": "/Deploying/Quickstart" }, + "Development-Environment": { + "title": "🛠️Development Environment", + "href": "/Deploying/Development-Environment" + }, "Railway-Deploying": { "title": "🚂Deploying on Railway", "href": "/Deploying/Railway-Deploying" diff --git a/docs/pages/index.mdx b/docs/pages/index.mdx index eedc2b09..423a0a01 100644 --- a/docs/pages/index.mdx +++ b/docs/pages/index.mdx @@ -25,7 +25,10 @@ DocsGPT 🦖 is an innovative open-source tool designed to simplify the retrieva -homedemo + Try it yourself: [https://www.docsgpt.cloud/](https://www.docsgpt.cloud/) diff --git a/frontend/public/toolIcons/tool_postgres.svg b/frontend/public/toolIcons/tool_postgres.svg new file mode 100644 index 00000000..c7acdb18 --- /dev/null +++ b/frontend/public/toolIcons/tool_postgres.svg @@ -0,0 +1,29 @@ + + + + + + + Data + + + sql-database-generic + + + SQL Database (Generic) + + + image/svg+xml + + + Amido Limited + + + Richard Slater + + + + + + + \ No newline at end of file
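The new `PostgresTool` added in this diff routes an action name to a private handler via a dict in `execute_action`. That dispatch pattern can be sketched standalone, with the database call stubbed out so neither psycopg2 nor a running Postgres is needed (`FakePostgresTool` and its stubbed response are inventions for this sketch; the real tool issues the query over a psycopg2 connection built from the `token` config value):

```python
# Standalone sketch of PostgresTool's action-dispatch pattern.
# The database interaction is stubbed; only the routing logic is real.
class FakePostgresTool:
    def __init__(self, config):
        self.config = config
        # The real tool reads the connection string from the "token" key.
        self.connection_string = config.get("token", "")

    def execute_action(self, action_name, **kwargs):
        # Map action names to handler methods, as in the real tool.
        actions = {
            "postgres_execute_sql": self._execute_sql,
        }
        if action_name in actions:
            return actions[action_name](**kwargs)
        raise ValueError(f"Unknown action: {action_name}")

    def _execute_sql(self, sql_query):
        # Stub: a real implementation would execute sql_query via psycopg2
        # and return rows for SELECT or a row count otherwise.
        return {"status_code": 200, "message": "SQL query executed successfully."}

tool = FakePostgresTool({"token": "postgresql://user:pass@localhost:5432/db"})
result = tool.execute_action("postgres_execute_sql", sql_query="SELECT 1;")
print(result["status_code"])  # 200
```

Keeping the action table inside `execute_action` means adding an action is a one-line registration plus a handler method, and unknown action names fail loudly with `ValueError` instead of silently doing nothing.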