Mirror of https://github.com/arc53/DocsGPT.git (synced 2026-04-29)

docs(models): update provider examples and add native llama.cpp info
The primary method for configuring your LLM provider in DocsGPT is through the `.env` file.

To connect to a cloud LLM provider, you will typically need to configure the following basic settings in your `.env` file:

* **`LLM_PROVIDER`**: This setting is essential and identifies the specific cloud provider you wish to use (e.g., `openai`, `google`, `anthropic`).
* **`LLM_NAME`**: Specifies the exact model you want to utilize from your chosen provider (e.g., `gpt-5.1`, `gemini-flash-latest`, `claude-3-5-sonnet-20241022`). Refer to your provider's documentation for a list of available models.
* **`API_KEY`**: Almost all cloud LLM providers require an API key for authentication. Obtain your API key from your chosen provider's platform and securely store it in your `.env` file.

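Taken together, a minimal `.env` for a cloud provider might look like the following sketch (the key is a placeholder, not a real credential):

```
LLM_PROVIDER=openai
LLM_NAME=gpt-5.1
API_KEY=your-api-key-here
```
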
## Explicitly Supported Cloud Providers

DocsGPT offers direct, streamlined support for the following cloud LLM providers:

| Provider | `LLM_PROVIDER` | Example `LLM_NAME` |
| :--------------------------- | :------------- | :-------------------------- |
| DocsGPT Public API | `docsgpt` | `None` |
| OpenAI | `openai` | `gpt-5.1` |
| Google (Vertex AI, Gemini) | `google` | `gemini-flash-latest` |
| Anthropic (Claude) | `anthropic` | `claude-3-5-sonnet-20241022` |
| Groq | `groq` | `llama-3.3-70b-versatile` |
| HuggingFace Inference API | `huggingface` | `meta-llama/Llama-3.1-8B-Instruct` |
| Azure OpenAI | `azure_openai` | `azure-gpt-4` |
| Prem AI | `premai` | (See Prem AI docs) |
| AWS SageMaker | `sagemaker` | (See SageMaker docs) |
| Novita AI | `novita` | (See Novita docs) |

## Connecting to OpenAI-Compatible Cloud APIs

To connect to a local inference engine, you will generally need to configure the following settings in your `.env` file:

* **`OPENAI_BASE_URL`**: This is essential. Set this to the base URL of your local inference engine's API endpoint. This tells DocsGPT where to find your local LLM server.
* **`API_KEY`**: Generally, for local inference engines, you can set `API_KEY=None`, as authentication is usually not required in local setups.

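As a concrete sketch, a `.env` pointing DocsGPT at a local Ollama server might look like this (the model name is a placeholder for whichever model you have pulled):

```
LLM_PROVIDER=openai
LLM_NAME=llama3
OPENAI_BASE_URL=http://localhost:11434/v1
API_KEY=None
```
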
## Native llama.cpp Support

DocsGPT includes native support for llama.cpp without requiring an OpenAI-compatible server. To use this:

```
LLM_PROVIDER=llama.cpp
LLM_NAME=your-model-name
```

This provider integrates directly with the llama.cpp Python bindings.

## Supported Local Inference Engines (OpenAI API Compatible)

DocsGPT is also readily configurable to work with the following local inference engines, all communicating via the OpenAI API format. Here are example `OPENAI_BASE_URL` values for each, based on default setups:

| Inference Engine | `LLM_PROVIDER` | `OPENAI_BASE_URL` |
| :----------------------------- | :------------- | :-------------------------- |
| LLaMa.cpp (server mode) | `openai` | `http://localhost:8000/v1` |
| Ollama | `openai` | `http://localhost:11434/v1` |
| Text Generation Inference (TGI)| `openai` | `http://localhost:8080/v1` |
| SGLang | `openai` | `http://localhost:30000/v1` |

The `OPENAI_BASE_URL` examples above use `http://localhost`. If you are running DocsGPT within Docker and your local inference engine is running on your host machine (outside of Docker), you will likely need to replace `localhost` with `http://host.docker.internal` so that Docker can correctly access your host's services. For example, use `http://host.docker.internal:11434/v1` for Ollama.

## How the Model Registry Works

DocsGPT uses a **Model Registry** to automatically detect and register available models based on your environment configuration. Understanding this system helps you configure models correctly.

### Automatic Model Detection

When DocsGPT starts, the Model Registry scans your environment variables and automatically registers models from providers that have valid API keys configured:

| Environment Variable | Provider Models Registered |
| :--------------------- | :------------------------- |
| `OPENAI_API_KEY` | OpenAI models (gpt-5.1, gpt-5-mini, etc.) |
| `ANTHROPIC_API_KEY` | Anthropic models (Claude family) |
| `GOOGLE_API_KEY` | Google models (Gemini family) |
| `GROQ_API_KEY` | Groq models (Llama, Mixtral) |
| `HUGGINGFACE_API_KEY` | HuggingFace models |

You can also use the generic `API_KEY` variable with `LLM_PROVIDER` to configure a single provider.

### Custom OpenAI-Compatible Models

When you set `OPENAI_BASE_URL` along with `LLM_PROVIDER=openai` and `LLM_NAME`, the registry automatically creates a custom model entry pointing to your local inference server. This is how local engines like Ollama, vLLM, and others get registered.

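For example, assuming a vLLM server on its default port, a `.env` fragment along these lines would cause the registry to create a custom model entry for it (the model name here is a placeholder):

```
LLM_PROVIDER=openai
LLM_NAME=mistral-7b-instruct
OPENAI_BASE_URL=http://localhost:8000/v1
API_KEY=None
```
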
### Default Model Selection

The registry determines the default model in this priority order:

1. If `LLM_NAME` is set and matches a registered model, that model becomes the default.
2. Otherwise, the first model from the configured `LLM_PROVIDER` is selected.
3. If neither is set, the first available model in the registry is used.

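The priority order above can be sketched as a small function. Note that the names below are illustrative, not DocsGPT's actual registry code; check `application/llm` for the real implementation:

```python
def pick_default(registry, llm_name=None, llm_provider=None):
    """Pick the default model following the priority order above.

    registry: list of (provider, model_name) tuples, in registration order.
    """
    # 1. An explicitly configured LLM_NAME wins if it matches a registered model.
    if llm_name:
        for _provider, name in registry:
            if name == llm_name:
                return name
    # 2. Otherwise, fall back to the first model of the configured LLM_PROVIDER.
    if llm_provider:
        for provider, name in registry:
            if provider == llm_provider:
                return name
    # 3. Finally, take the first registered model of any provider.
    return registry[0][1] if registry else None


models = [("openai", "gpt-5.1"), ("anthropic", "claude-3-5-sonnet-20241022")]
print(pick_default(models, llm_name="claude-3-5-sonnet-20241022"))
print(pick_default(models, llm_provider="anthropic"))
print(pick_default(models))  # falls through to the first registered model
```
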
### Multiple Providers

You can configure multiple API keys simultaneously (e.g., both `OPENAI_API_KEY` and `ANTHROPIC_API_KEY`). The registry will load models from all configured providers, giving users the ability to switch between them in the UI.

## Adding Support for Other Local Engines

While DocsGPT currently focuses on OpenAI API compatible local engines, you can extend its capabilities to support other local inference solutions. To do this, navigate to the `application/llm` directory in the DocsGPT repository. Examine the existing Python files for examples of LLM integrations. You can create a new module for your desired local engine, and then register it in the `llm_creator.py` file within the same directory. This allows for custom integration with a wide range of local LLM servers beyond those listed above.

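A rough sketch of such a module is shown below. The class name, method name, and endpoint shape are all hypothetical, not DocsGPT's actual interface; mirror the structure of the existing modules in `application/llm` instead of copying this verbatim:

```python
# my_engine.py -- hypothetical adapter for a custom local inference engine.
import json
import urllib.request


class MyLocalEngineLLM:
    """Minimal adapter that forwards chat requests to a local HTTP server.

    The /generate endpoint and response shape are assumptions; adjust them
    to whatever API your engine actually exposes.
    """

    def __init__(self, base_url="http://localhost:9000", api_key=None, **kwargs):
        self.base_url = base_url
        self.api_key = api_key  # usually None for local engines

    def gen(self, model, messages, **kwargs):
        # Forward the conversation to the engine and return the generated text.
        payload = json.dumps({"model": model, "messages": messages}).encode()
        req = urllib.request.Request(
            f"{self.base_url}/generate",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=60) as resp:
            return json.loads(resp.read())["text"]
```

The new class would then be registered in `llm_creator.py` alongside the existing providers, so that setting `LLM_PROVIDER` to the new engine's name selects it.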