docs(models): update provider examples and add native llama.cpp info
The primary method for configuring your LLM provider in DocsGPT is through the `.env` file.
To connect to a cloud LLM provider, you will typically need to configure the following basic settings in your `.env` file:
* **`LLM_PROVIDER`**: This setting is essential and identifies the specific cloud provider you wish to use (e.g., `openai`, `google`, `anthropic`).
* **`LLM_NAME`**: Specifies the exact model you want to utilize from your chosen provider (e.g., `gpt-5.1`, `gemini-flash-latest`, `claude-3-5-sonnet-20241022`). Refer to your provider's documentation for a list of available models.
* **`API_KEY`**: Almost all cloud LLM providers require an API key for authentication. Obtain your API key from your chosen provider's platform and securely store it in your `.env` file.
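For example, a minimal `.env` configuration for OpenAI might look like this (the key below is a placeholder for your own):

```
LLM_PROVIDER=openai
LLM_NAME=gpt-5.1
API_KEY=your-openai-api-key
```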
## Explicitly Supported Cloud Providers
DocsGPT offers direct, streamlined support for the following cloud LLM providers:
| Provider | `LLM_PROVIDER` | Example `LLM_NAME` |
| :--------------------------- | :------------- | :-------------------------- |
| DocsGPT Public API | `docsgpt` | `None` |
| OpenAI                       | `openai`       | `gpt-5.1`                  |
| Google (Vertex AI, Gemini)   | `google`       | `gemini-flash-latest`      |
| Anthropic (Claude)           | `anthropic`    | `claude-3-5-sonnet-20241022`|
| Groq                         | `groq`         | `llama-3.3-70b-versatile`  |
| HuggingFace Inference API | `huggingface` | `meta-llama/Llama-3.1-8B-Instruct` |
| Azure OpenAI                 | `azure_openai` | `azure-gpt-4`              |
| Prem AI | `premai` | (See Prem AI docs) |
| AWS SageMaker | `sagemaker` | (See SageMaker docs) |
| Novita AI | `novita` | (See Novita docs) |
## Connecting to OpenAI-Compatible Cloud APIs
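Many cloud services expose an OpenAI-compatible API. As a rough sketch (the URL and model name below are placeholders, not a real provider), you can point DocsGPT at such an endpoint by combining `OPENAI_BASE_URL` with the `openai` provider:

```
LLM_PROVIDER=openai
LLM_NAME=your-model-name
OPENAI_BASE_URL=https://api.your-provider.example/v1
API_KEY=your-provider-api-key
```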
To connect to a local inference engine, you will generally need to configure the following basic settings in your `.env` file:
* **`OPENAI_BASE_URL`**: This is essential. Set this to the base URL of your local inference engine's API endpoint. This tells DocsGPT where to find your local LLM server.
* **`API_KEY`**: Generally, for local inference engines, you can set `API_KEY=None` as authentication is usually not required in local setups.
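For example, a typical `.env` for Ollama might look like this (the model name is a placeholder for whichever model you have pulled):

```
LLM_PROVIDER=openai
LLM_NAME=llama3
OPENAI_BASE_URL=http://localhost:11434/v1
API_KEY=None
```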
## Native llama.cpp Support
DocsGPT includes native support for llama.cpp without requiring an OpenAI-compatible server. To use it, set the following in your `.env` file:
```
LLM_PROVIDER=llama.cpp
LLM_NAME=your-model-name
```
This provider integrates directly with the llama.cpp Python bindings.
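For reference, here is a minimal sketch of what those bindings look like on their own, using the `llama-cpp-python` package (the model path is a placeholder; DocsGPT's own wrapper handles this internally):

```
from llama_cpp import Llama

# Load a local GGUF model file (path is a placeholder)
llm = Llama(model_path="./models/your-model.gguf")

# Request a chat completion and print the reply text
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is DocsGPT?"}]
)
print(response["choices"][0]["message"]["content"])
```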
## Supported Local Inference Engines (OpenAI API Compatible)
DocsGPT is also readily configurable to work with the following local inference engines, all communicating via the OpenAI API format. Here are example `OPENAI_BASE_URL` values for each, based on default setups:
| Inference Engine | `LLM_PROVIDER` | `OPENAI_BASE_URL` |
| :---------------------------- | :------------- | :------------------------- |
| LLaMa.cpp (server mode)        | `openai`       | `http://localhost:8000/v1` |
| Ollama | `openai` | `http://localhost:11434/v1` |
| Text Generation Inference (TGI)| `openai` | `http://localhost:8080/v1` |
| SGLang | `openai` | `http://localhost:30000/v1` |
The `OPENAI_BASE_URL` examples above use `http://localhost`. If you are running DocsGPT within Docker and your local inference engine is running on your host machine (outside of Docker), you will likely need to replace `localhost` with `host.docker.internal` so that Docker can correctly access your host's services. For example, use `http://host.docker.internal:11434/v1` for Ollama.
## How the Model Registry Works
DocsGPT uses a **Model Registry** to automatically detect and register available models based on your environment configuration. Understanding this system helps you configure models correctly.
### Automatic Model Detection
When DocsGPT starts, the Model Registry scans your environment variables and automatically registers models from providers that have valid API keys configured:
| Environment Variable | Provider Models Registered |
| :--------------------- | :------------------------- |
| `OPENAI_API_KEY` | OpenAI models (gpt-5.1, gpt-5-mini, etc.) |
| `ANTHROPIC_API_KEY` | Anthropic models (Claude family) |
| `GOOGLE_API_KEY` | Google models (Gemini family) |
| `GROQ_API_KEY` | Groq models (Llama, Mixtral) |
| `HUGGINGFACE_API_KEY` | HuggingFace models |
You can also use the generic `API_KEY` variable with `LLM_PROVIDER` to configure a single provider.
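For example (keys are placeholders):

```
# Registers the full OpenAI model catalog:
OPENAI_API_KEY=your-openai-api-key

# Or register a single provider via the generic form:
LLM_PROVIDER=anthropic
API_KEY=your-anthropic-api-key
```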
### Custom OpenAI-Compatible Models
When you set `OPENAI_BASE_URL` along with `LLM_PROVIDER=openai` and `LLM_NAME`, the registry automatically creates a custom model entry pointing to your local inference server. This is how local engines like Ollama, vLLM, and others get registered.
### Default Model Selection
The registry determines the default model in this priority order:
1. If `LLM_NAME` is set and matches a registered model, that model becomes the default
2. Otherwise, the first model from the configured `LLM_PROVIDER` is selected
3. If neither is set, the first available model in the registry is used
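A minimal sketch of that priority order in Python (illustrative only; the actual registry implementation in DocsGPT may differ):

```
# Illustrative sketch only; the actual registry implementation may differ.
def pick_default_model(registry, llm_name=None, llm_provider=None):
    # registry: dict mapping model names to entries with a `provider` field
    # 1. An LLM_NAME that matches a registered model wins
    if llm_name and llm_name in registry:
        return registry[llm_name]
    # 2. Otherwise, take the first model from the configured provider
    if llm_provider:
        for model in registry.values():
            if model.provider == llm_provider:
                return model
    # 3. Fall back to the first available model in the registry
    return next(iter(registry.values()), None)
```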
### Multiple Providers
You can configure multiple API keys simultaneously (e.g., both `OPENAI_API_KEY` and `ANTHROPIC_API_KEY`). The registry will load models from all configured providers, giving users the ability to switch between them in the UI.
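For example, to expose both OpenAI and Anthropic models in the UI (keys are placeholders):

```
OPENAI_API_KEY=your-openai-api-key
ANTHROPIC_API_KEY=your-anthropic-api-key
```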
## Adding Support for Other Local Engines
While DocsGPT currently focuses on OpenAI API compatible local engines, you can extend its capabilities to support other local inference solutions. To do this, navigate to the `application/llm` directory in the DocsGPT repository. Examine the existing Python files for examples of LLM integrations. You can create a new module for your desired local engine, and then register it in the `llm_creator.py` file within the same directory. This allows for custom integration with a wide range of local LLM servers beyond those listed above.
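As a hypothetical outline (class, method, and endpoint names below are placeholders; the actual base-class interface used by the modules in `application/llm` may differ):

```
import requests

# Hypothetical sketch of a custom engine integration; adapt the interface
# to match the base class used by the existing modules in application/llm.
class MyEngineLLM:
    def __init__(self, api_key=None, *args, **kwargs):
        self.api_key = api_key
        self.base_url = "http://localhost:5000"  # placeholder endpoint

    def gen(self, model, messages, **kwargs):
        # Forward DocsGPT's chat messages to the engine's HTTP API
        resp = requests.post(
            f"{self.base_url}/generate",
            json={"model": model, "messages": messages},
            timeout=60,
        )
        resp.raise_for_status()
        return resp.json()["text"]  # assumed response shape
```

You would then register the new class in `llm_creator.py`, typically by mapping a provider name (e.g., `myengine`) to it so that `LLM_PROVIDER=myengine` resolves to your module.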