- Restyle Token Usage diagnostics to match Memory panel (block bars, aligned labels)

- Add /token command to help and align all help entries via two-column layout
- Increase help modal height so Close button remains visible (no scroll)
- Refresh .env.example: remove unused LLM_PROVIDER, switch to OLLAMA_BASE_URL, add embeddings/debug/context budgeting, pricing vars, and daily token limit
2026-03-07 22:33:38 +00:00 · 2026-01-11 14:46:25 -07:00
parent 4619e43d62
commit 567e25ed8c
2 changed files with 326 additions and 25 deletions
--- a/.env.example
+++ b/.env.example
@@ -15,17 +15,54 @@ TAVILY_API_KEY=
 # Other providers: azure/, bedrock/, groq/, ollama/, together_ai/ (see litellm docs)
 PENTESTAGENT_MODEL=gpt-5

-# Ollama local/remote API base
+# Provider selection:
+# Note: The app determines provider from `PENTESTAGENT_MODEL` prefix
+# (e.g., `ollama/...`, `gpt-5`, `claude-...`, `gemini/...`). No separate
+# `LLM_PROVIDER` variable is used.
+
+# Ollama base URL (set this when using an `ollama/...` model)
 # Example: http://127.0.0.1:11434 or http://192.168.0.165:11434
-# Set this when using Ollama as the provider so LiteLLM/clients point to the correct host
-# OLLAMA_API_BASE=http://127.0.0.1:11434
+OLLAMA_BASE_URL=http://127.0.0.1:11434
+
+# Example local model string (uncomment to use instead of gpt-5)
+# PENTESTAGENT_MODEL="ollama/qwen2.5:7b-instruct"

 # Embeddings (for RAG knowledge base)
 # Options: openai, local (default: openai if OPENAI_API_KEY set, else local)
-# PENTESTAGENT_EMBEDDINGS=local
+PENTESTAGENT_EMBEDDINGS=local

 # Settings
-PENTESTAGENT_DEBUG=false
+PENTESTAGENT_DEBUG=true
+
+# Optional: manually declare model/context and daily token budgeting
+# Useful when provider metadata isn't available or you want to enforce local limits.
+# Set the model's maximum context window (in tokens). Example values:
+#  - Gemini large: 131072
+#  - Gemini flash: 65536
+#  - Ollama local model: 8192
+# PENTESTAGENT_MODEL_MAX_CONTEXT=131072
+
+# Optional daily token budget tracking (integers, tokens):
+# - Set the total token allowance you want to track per day
+# - Set the current used amount (optional; defaults to 0)
+# PENTESTAGENT_DAILY_TOKEN_BUDGET=500000
+# PENTESTAGENT_DAILY_TOKEN_USED=0
+
+# ---------------------------------------------------------------------------
+# Example pricing & daily token limit used by `/token` diagnostics
+# Adjust the values below to match your provider's pricing for cost calculations.
+
+# Per 1M tokens pricing (USD):
+# Example (input at $2.00 / 1M, output at $12.00 / 1M)
+INPUT_COST_PER_MILLION=2.0
+OUTPUT_COST_PER_MILLION=12.0
+
+# Optional unified override (applies to both input and output)
+# COST_PER_MILLION=14.0
+
+# Example daily budget (tokens)
+DAILY_TOKEN_LIMIT=1000000
+# ---------------------------------------------------------------------------

 # Agent max iterations (regular agent + crew workers, default: 30)
 # PENTESTAGENT_AGENT_MAX_ITERATIONS=30
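
The pricing variables added to `.env.example` above imply a simple per-million-token cost calculation. The following is a minimal sketch of how such a calculation could look, not the project's actual `/token` implementation. The function names `estimate_cost_usd` and `daily_limit_fraction` are hypothetical, and the rule that `COST_PER_MILLION` takes precedence over the split input/output rates is an assumption based only on the "unified override" comment.

```python
import os
from typing import Mapping, Optional


def _float_env(env: Mapping[str, str], name: str) -> Optional[float]:
    """Return the variable as a float, or None if unset or not numeric."""
    raw = env.get(name, "").strip()
    if not raw:
        return None
    try:
        return float(raw)
    except ValueError:
        return None


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    env: Mapping[str, str] = os.environ,
) -> Optional[float]:
    """Estimate cost in USD from per-1M-token rates (hypothetical helper).

    Assumption: COST_PER_MILLION, when set, overrides both split rates.
    Returns None when no usable pricing is configured.
    """
    unified = _float_env(env, "COST_PER_MILLION")
    if unified is not None:
        in_rate = out_rate = unified
    else:
        in_rate = _float_env(env, "INPUT_COST_PER_MILLION")
        out_rate = _float_env(env, "OUTPUT_COST_PER_MILLION")
    if in_rate is None or out_rate is None:
        return None
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def daily_limit_fraction(
    tokens_used: int,
    env: Mapping[str, str] = os.environ,
) -> Optional[float]:
    """Fraction of DAILY_TOKEN_LIMIT consumed, or None if no valid limit is set."""
    raw = env.get("DAILY_TOKEN_LIMIT", "").strip()
    if not raw.isdigit() or int(raw) == 0:
        return None
    return tokens_used / int(raw)
```

With the example values from the diff (`INPUT_COST_PER_MILLION=2.0`, `OUTPUT_COST_PER_MILLION=12.0`), 500,000 input tokens and 100,000 output tokens come to about $2.20, and 250,000 tokens used against `DAILY_TOKEN_LIMIT=1000000` is a quarter of the daily budget.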