Compare commits

...

847 Commits

Author SHA1 Message Date
Alex
0475e55518 Merge pull request #1173 from sahil9001/system-theme-unless-set
feat: added theme check unless already set
2024-10-01 10:56:04 +01:00
Alex
872b390cfa Merge pull request #1154 from siiddhantt/feat/openapi-spec
feature: OpenAPI specification compliant endpoints
2024-10-01 10:51:51 +01:00
Alex
dc4c539607 Update HACKTOBERFEST.md 2024-10-01 10:43:14 +01:00
Alex
69f8c76ce2 Update HACKTOBERFEST.md 2024-10-01 10:36:50 +01:00
Alex
cbdcca7fd8 Create lexeu-competition.md 2024-10-01 10:35:52 +01:00
Sahil Silare
1e7fa988da feat: added theme check unless already set 2024-10-01 10:02:51 +05:30
Alex
ba35e1422b Update README.md 2024-09-30 22:41:06 +01:00
Alex
78e99d1171 Merge pull request #1156 from anasKhafaga/feat/docker/enable-debugging
feat(Docker): Configure debugging with vscode and chrome
2024-09-30 16:43:37 +03:00
Alex
6a12f96fdc Update HACKTOBERFEST.md 2024-09-30 13:42:50 +01:00
Alex
b79a20151c fix: uploads 2024-09-30 13:28:42 +01:00
Siddhant Rai
c9976020dd fix: lint error 2024-09-30 01:15:36 +05:30
Siddhant Rai
e8988e82d0 refactor: answer routes to comply with OpenAPI spec using flask-restx 2024-09-30 00:41:34 +05:30
Anas Khafaga
ff95570da6 feat(Docker): Configure debugging with vscode and chrome 2024-09-29 14:39:16 +03:00
Siddhant Rai
b084e3074d refactor: user routes to comply with OpenAPI spec using flask-restx 2024-09-27 20:08:46 +05:30
Alex
bc4f9c3442 Merge pull request #1153 from utin-francis-peter/update-run-with-docker-compose
fix: updated docker execution script from `docker-compose` to `docker…
2024-09-26 14:30:56 +01:00
utin-francis-peter
ecf88e7ea1 fix: updated docker execution script from docker-compose to docker compose so it works on latest versions of docker used across contributors machine 2024-09-26 12:09:44 +01:00
Alex
d6cb66cd2f Update README.md 2024-09-26 11:04:01 +01:00
Alex
bc2241f67a Merge pull request #1152 from siiddhantt/feature/sync-remote
feat: sync remote sources
2024-09-25 14:17:57 +01:00
Siddhant Rai
3d292aa485 feat: sync remote sources through celery periodic tasks 2024-09-25 15:20:11 +05:30
Manish Madan
21f46a8aea Merge pull request #1148 from arc53/dependabot/npm_and_yarn/frontend/lint-staged-15.2.10
chore(deps-dev): bump lint-staged from 15.2.8 to 15.2.10 in /frontend
2024-09-24 21:06:37 +05:30
dependabot[bot]
ba6d61cc35 chore(deps-dev): bump lint-staged from 15.2.8 to 15.2.10 in /frontend
Bumps [lint-staged](https://github.com/lint-staged/lint-staged) from 15.2.8 to 15.2.10.
- [Release notes](https://github.com/lint-staged/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/lint-staged/lint-staged/compare/v15.2.8...v15.2.10)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-24 15:30:57 +00:00
Manish Madan
27b43c11bd Merge pull request #1151 from arc53/dependabot/npm_and_yarn/frontend/rollup-4.22.4
chore(deps): bump rollup from 4.20.0 to 4.22.4 in /frontend
2024-09-24 18:20:33 +05:30
Alex
a5292c4473 Update README.md 2024-09-24 12:50:27 +01:00
Alex
f3923488a5 Merge pull request #1143 from arc53/dependabot/pip/application/tzdata-2024.2
chore(deps): bump tzdata from 2024.1 to 2024.2 in /application
2024-09-23 23:33:53 +01:00
Alex
d964ebfff8 Merge pull request #1149 from arc53/dependabot/npm_and_yarn/frontend/react-i18next-15.0.2
chore(deps): bump react-i18next from 15.0.1 to 15.0.2 in /frontend
2024-09-23 23:33:43 +01:00
Alex
e420eeece4 Merge pull request #1144 from arc53/dependabot/pip/application/elasticsearch-8.15.1
chore(deps): bump elasticsearch from 8.14.0 to 8.15.1 in /application
2024-09-23 23:30:15 +01:00
dependabot[bot]
f34713545e chore(deps): bump react-i18next from 15.0.1 to 15.0.2 in /frontend
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 15.0.1 to 15.0.2.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v15.0.1...v15.0.2)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-23 22:30:09 +00:00
Alex
2b513c7d87 Merge pull request #1146 from arc53/dependabot/npm_and_yarn/frontend/i18next-23.15.1
chore(deps): bump i18next from 23.14.0 to 23.15.1 in /frontend
2024-09-23 23:29:09 +01:00
Alex
7e28893613 Merge pull request #1145 from arc53/dependabot/pip/application/pandas-2.2.3
chore(deps): bump pandas from 2.2.2 to 2.2.3 in /application
2024-09-23 23:26:41 +01:00
dependabot[bot]
822674ca78 chore(deps): bump i18next from 23.14.0 to 23.15.1 in /frontend
Bumps [i18next](https://github.com/i18next/i18next) from 23.14.0 to 23.15.1.
- [Release notes](https://github.com/i18next/i18next/releases)
- [Changelog](https://github.com/i18next/i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next/compare/v23.14.0...v23.15.1)

---
updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-23 22:21:26 +00:00
dependabot[bot]
35ebf97fdf chore(deps): bump rollup from 4.20.0 to 4.22.4 in /frontend
Bumps [rollup](https://github.com/rollup/rollup) from 4.20.0 to 4.22.4.
- [Release notes](https://github.com/rollup/rollup/releases)
- [Changelog](https://github.com/rollup/rollup/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rollup/rollup/compare/v4.20.0...v4.22.4)

---
updated-dependencies:
- dependency-name: rollup
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-23 22:20:49 +00:00
Alex
b456c0ca9f Merge pull request #1147 from arc53/dependabot/npm_and_yarn/frontend/eslint-plugin-import-2.30.0
chore(deps-dev): bump eslint-plugin-import from 2.27.5 to 2.30.0 in /frontend
2024-09-23 23:19:49 +01:00
dependabot[bot]
bae49891be chore(deps-dev): bump eslint-plugin-import in /frontend
Bumps [eslint-plugin-import](https://github.com/import-js/eslint-plugin-import) from 2.27.5 to 2.30.0.
- [Release notes](https://github.com/import-js/eslint-plugin-import/releases)
- [Changelog](https://github.com/import-js/eslint-plugin-import/blob/main/CHANGELOG.md)
- [Commits](https://github.com/import-js/eslint-plugin-import/compare/v2.27.5...v2.30.0)

---
updated-dependencies:
- dependency-name: eslint-plugin-import
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-23 20:32:45 +00:00
dependabot[bot]
dfb4a67d87 chore(deps): bump pandas from 2.2.2 to 2.2.3 in /application
Bumps [pandas](https://github.com/pandas-dev/pandas) from 2.2.2 to 2.2.3.
- [Release notes](https://github.com/pandas-dev/pandas/releases)
- [Commits](https://github.com/pandas-dev/pandas/compare/v2.2.2...v2.2.3)

---
updated-dependencies:
- dependency-name: pandas
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-23 20:14:39 +00:00
dependabot[bot]
8e727a3253 chore(deps): bump elasticsearch from 8.14.0 to 8.15.1 in /application
Bumps [elasticsearch](https://github.com/elastic/elasticsearch-py) from 8.14.0 to 8.15.1.
- [Release notes](https://github.com/elastic/elasticsearch-py/releases)
- [Commits](https://github.com/elastic/elasticsearch-py/compare/v8.14.0...v8.15.1)

---
updated-dependencies:
- dependency-name: elasticsearch
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-23 20:14:33 +00:00
dependabot[bot]
d6f03d7a07 chore(deps): bump tzdata from 2024.1 to 2024.2 in /application
Bumps [tzdata](https://github.com/python/tzdata) from 2024.1 to 2024.2.
- [Release notes](https://github.com/python/tzdata/releases)
- [Changelog](https://github.com/python/tzdata/blob/master/NEWS.md)
- [Commits](https://github.com/python/tzdata/compare/2024.1...2024.2)

---
updated-dependencies:
- dependency-name: tzdata
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-23 20:14:30 +00:00
Alex
5ada5bf1a0 Create HACKTOBERFEST.md 2024-09-23 17:30:58 +01:00
Alex
ae9c935c5c Merge pull request #1140 from ManishMadan2882/main
Stream sources
2024-09-23 15:18:41 +01:00
Alex
95618001aa Merge pull request #1135 from anasKhafaga/fix/mock-json-server
fix(deps): downgrade json-server to v0.17.4
2024-09-23 11:48:16 +01:00
ManishMadan2882
40c361968e feat(shared): sync sources flow; clean up 2024-09-23 01:33:41 +05:30
Manish Madan
757abda654 Merge branch 'arc53:main' into main 2024-09-21 17:53:08 +05:30
ManishMadan2882
862807e863 feat(conv): sync source stream in fe 2024-09-21 17:51:36 +05:30
ManishMadan2882
d188db887c feat(stream): stream sources before ans 2024-09-21 17:50:56 +05:30
ManishMadan2882
59b6c56262 feat(stream): yield sources in stream 2024-09-21 01:50:02 +05:30
Manish Madan
f92658de82 Merge pull request #1095 from arc53/dependabot/npm_and_yarn/extensions/react-widget/micromatch-4.0.8
chore(deps): bump micromatch from 4.0.7 to 4.0.8 in /extensions/react-widget
2024-09-20 18:20:12 +05:30
Alex
f2b3402c17 Merge pull request #1124 from arc53/dependabot/pip/application/openapi3-parser-1.1.18
chore(deps): bump openapi3-parser from 1.1.16 to 1.1.18 in /application
2024-09-20 12:59:05 +01:00
Alex
24badf65a4 Merge pull request #1126 from arc53/dependabot/pip/application/html2text-2024.2.26
chore(deps): bump html2text from 2020.1.16 to 2024.2.26 in /application
2024-09-20 12:58:38 +01:00
dependabot[bot]
86442f212f chore(deps): bump html2text from 2020.1.16 to 2024.2.26 in /application
Bumps [html2text](https://github.com/Alir3z4/html2text) from 2020.1.16 to 2024.2.26.
- [Release notes](https://github.com/Alir3z4/html2text/releases)
- [Changelog](https://github.com/Alir3z4/html2text/blob/master/ChangeLog.rst)
- [Commits](https://github.com/Alir3z4/html2text/compare/2020.1.16...2024.2.26)

---
updated-dependencies:
- dependency-name: html2text
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-20 11:13:59 +00:00
dependabot[bot]
9b6d6ecc32 chore(deps): bump openapi3-parser from 1.1.16 to 1.1.18 in /application
Bumps [openapi3-parser](https://github.com/manchenkoff/openapi3-parser) from 1.1.16 to 1.1.18.
- [Release notes](https://github.com/manchenkoff/openapi3-parser/releases)
- [Commits](https://github.com/manchenkoff/openapi3-parser/compare/v1.1.16...v1.1.18)

---
updated-dependencies:
- dependency-name: openapi3-parser
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-20 11:13:57 +00:00
Alex
1f51277004 Merge pull request #1139 from arc53/dependabot/pip/application/langchain-core-0.3.2
chore(deps): bump langchain-core from 0.2.38 to 0.3.2 in /application
2024-09-20 12:12:55 +01:00
Alex
68cc646a3e fix(test): better test + cov 2024-09-20 12:07:17 +01:00
dependabot[bot]
420ca1b3b4 chore(deps): bump micromatch in /extensions/react-widget
Bumps [micromatch](https://github.com/micromatch/micromatch) from 4.0.7 to 4.0.8.
- [Release notes](https://github.com/micromatch/micromatch/releases)
- [Changelog](https://github.com/micromatch/micromatch/blob/4.0.8/CHANGELOG.md)
- [Commits](https://github.com/micromatch/micromatch/compare/4.0.7...4.0.8)

---
updated-dependencies:
- dependency-name: micromatch
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-20 11:02:34 +00:00
Alex
a83e68815a fix(test): only 3.11 python 2024-09-20 11:53:50 +01:00
Alex
d87aebb718 fix(deps): now it works better 2024-09-20 11:51:57 +01:00
Alex
a9a4d14e8a Merge pull request #1132 from ManishMadan2882/main
Fix UI for mobile devices
2024-09-20 11:39:57 +01:00
dependabot[bot]
9ed2fc7359 chore(deps): bump langchain-core from 0.2.38 to 0.3.2 in /application
Bumps [langchain-core](https://github.com/langchain-ai/langchain) from 0.2.38 to 0.3.2.
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](https://github.com/langchain-ai/langchain/compare/langchain-core==0.2.38...langchain-core==0.3.2)

---
updated-dependencies:
- dependency-name: langchain-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-20 10:37:01 +00:00
Alex
caad382a95 Merge pull request #1128 from arc53/dependabot/pip/application/anthropic-0.34.2
chore(deps): bump anthropic from 0.34.0 to 0.34.2 in /application
2024-09-20 11:35:26 +01:00
Alex
ea39fc9c48 Merge pull request #1133 from arc53/dependabot/pip/scripts/langchain-0.2.10
chore(deps): bump langchain from 0.1.4 to 0.2.10 in /scripts
2024-09-20 11:33:18 +01:00
ManishMadan2882
bf7cce52db fix(conv): /search after setting id 2024-09-20 02:59:15 +05:30
ManishMadan2882
63a15a3359 fix(retry btn): overlflow in mobile 2024-09-20 02:25:00 +05:30
ManishMadan2882
db34392210 fix(conv): scroll btn; css clean up 2024-09-20 01:06:58 +05:30
Alex
cc4e0ba6c1 Merge pull request #1138 from anasKhafaga/fix/local-docker-bind-mount
fix(Docker): Integrate a bind mount in the local docker compose file
2024-09-19 19:24:27 +01:00
Anas Khafaga
38989f9c68 fix(Docker): Integrate a bind mount in the local docker compose file 2024-09-19 17:48:26 +03:00
Anas Khafaga
c78c92e539 fix(deps): downgrade json-server to v0.17.4 2024-09-19 16:34:35 +03:00
Alex
31e694f50d Merge pull request #1069 from arc53/dependabot/npm_and_yarn/frontend/reduxjs/toolkit-2.2.7
chore(deps): bump @reduxjs/toolkit from 1.9.2 to 2.2.7 in /frontend
2024-09-18 23:36:49 +01:00
ManishMadan2882
5368199517 fix(store.ts): type err 2024-09-19 01:10:16 +05:30
ManishMadan2882
6bbb9176fc Merge branch 'dependabot/npm_and_yarn/frontend/reduxjs/toolkit-2.2.7' of https://github.com/arc53/docsgpt into dependabot/npm_and_yarn/frontend/reduxjs/toolkit-2.2.7 2024-09-18 14:48:25 +05:30
dependabot[bot]
4209eee2f8 chore(deps): bump @reduxjs/toolkit from 1.9.2 to 2.2.7 in /frontend
Bumps [@reduxjs/toolkit](https://github.com/reduxjs/redux-toolkit) from 1.9.2 to 2.2.7.
- [Release notes](https://github.com/reduxjs/redux-toolkit/releases)
- [Commits](https://github.com/reduxjs/redux-toolkit/compare/v1.9.2...v2.2.7)

---
updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-18 09:01:06 +00:00
Manish Madan
f65ebb6b71 Merge pull request #1096 from arc53/dependabot/npm_and_yarn/frontend/types/react-syntax-highlighter-15.5.13
chore(deps-dev): bump @types/react-syntax-highlighter from 15.5.6 to 15.5.13 in /frontend
2024-09-18 14:29:11 +05:30
dependabot[bot]
ef8107e56a chore(deps-dev): bump @types/react-syntax-highlighter in /frontend
Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.6 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-18 08:52:35 +00:00
Manish Madan
2293a30f19 Merge pull request #1130 from arc53/dependabot/npm_and_yarn/frontend/eslint-8.57.1
chore(deps-dev): bump eslint from 8.33.0 to 8.57.1 in /frontend
2024-09-18 14:20:14 +05:30
dependabot[bot]
d7678fd355 chore(deps-dev): bump eslint from 8.33.0 to 8.57.1 in /frontend
Bumps [eslint](https://github.com/eslint/eslint) from 8.33.0 to 8.57.1.
- [Release notes](https://github.com/eslint/eslint/releases)
- [Changelog](https://github.com/eslint/eslint/blob/main/CHANGELOG.md)
- [Commits](https://github.com/eslint/eslint/compare/v8.33.0...v8.57.1)

---
updated-dependencies:
- dependency-name: eslint
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-18 08:40:22 +00:00
Manish Madan
27d8a5cf99 Merge pull request #1098 from arc53/dependabot/npm_and_yarn/frontend/eslint-config-prettier-9.1.0
chore(deps-dev): bump eslint-config-prettier from 8.6.0 to 9.1.0 in /frontend
2024-09-18 14:09:04 +05:30
dependabot[bot]
03f6c58ac6 chore(deps-dev): bump eslint-config-prettier in /frontend
Bumps [eslint-config-prettier](https://github.com/prettier/eslint-config-prettier) from 8.6.0 to 9.1.0.
- [Changelog](https://github.com/prettier/eslint-config-prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/eslint-config-prettier/compare/v8.6.0...v9.1.0)

---
updated-dependencies:
- dependency-name: eslint-config-prettier
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-18 08:30:58 +00:00
Manish Madan
4fb52dc6fc Merge pull request #1131 from arc53/dependabot/npm_and_yarn/frontend/vite-5.4.6
chore(deps-dev): bump vite from 5.3.5 to 5.4.6 in /frontend
2024-09-18 13:59:20 +05:30
Alex
0232ba3f25 Merge pull request #1134 from arc53/dependabot/npm_and_yarn/docs/next-14.2.12
chore(deps): bump next from 14.1.1 to 14.2.12 in /docs
2024-09-18 09:18:01 +01:00
dependabot[bot]
5987e4c8e1 chore(deps): bump next from 14.1.1 to 14.2.12 in /docs
Bumps [next](https://github.com/vercel/next.js) from 14.1.1 to 14.2.12.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v14.1.1...v14.2.12)

---
updated-dependencies:
- dependency-name: next
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-18 01:11:06 +00:00
ManishMadan2882
18ce7c8f2f fix(mobile): minor changes 2024-09-18 03:21:46 +05:30
dependabot[bot]
177c9da8b5 chore(deps): bump langchain from 0.1.4 to 0.2.10 in /scripts
Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.1.4 to 0.2.10.
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](https://github.com/langchain-ai/langchain/compare/langchain-ai21==0.1.4...langchain==0.2.10)

---
updated-dependencies:
- dependency-name: langchain
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-17 21:45:01 +00:00
ManishMadan2882
b5f1a8e90f fix(mobile): vertical overflows 2024-09-18 01:44:34 +05:30
dependabot[bot]
494c3dd1bd chore(deps-dev): bump vite from 5.3.5 to 5.4.6 in /frontend
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.3.5 to 5.4.6.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v5.4.6/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.4.6/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-17 20:06:05 +00:00
dependabot[bot]
ad8f78d51e chore(deps): bump anthropic from 0.34.0 to 0.34.2 in /application
Bumps [anthropic](https://github.com/anthropics/anthropic-sdk-python) from 0.34.0 to 0.34.2.
- [Release notes](https://github.com/anthropics/anthropic-sdk-python/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-python/compare/v0.34.0...v0.34.2)

---
updated-dependencies:
- dependency-name: anthropic
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-16 20:24:48 +00:00
Alex
5112801c37 Merge pull request #1123 from ManishMadan2882/main 2024-09-13 16:02:41 +01:00
ManishMadan2882
226adfdba2 fix: mobile overflows 2024-09-13 19:13:53 +05:30
ManishMadan2882
22c0375dca fix(ui): overflow in queries 2024-09-13 19:01:38 +05:30
Alex
e77f6c9f6f Merge pull request #1097 from arc53/dependabot/npm_and_yarn/frontend/prettier-plugin-tailwindcss-0.6.6
chore(deps-dev): bump prettier-plugin-tailwindcss from 0.2.2 to 0.6.6 in /frontend
2024-09-12 09:27:18 +01:00
dependabot[bot]
5bd7c0ab8b chore(deps-dev): bump prettier-plugin-tailwindcss in /frontend
Bumps [prettier-plugin-tailwindcss](https://github.com/tailwindlabs/prettier-plugin-tailwindcss) from 0.2.2 to 0.6.6.
- [Release notes](https://github.com/tailwindlabs/prettier-plugin-tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/prettier-plugin-tailwindcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/prettier-plugin-tailwindcss/compare/v0.2.2...v0.6.6)

---
updated-dependencies:
- dependency-name: prettier-plugin-tailwindcss
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-11 22:53:49 +00:00
Alex
97f7f6f7d2 Merge pull request #1121 from arc53/dependabot/npm_and_yarn/frontend/tailwindcss-3.4.11
chore(deps-dev): bump tailwindcss from 3.2.4 to 3.4.11 in /frontend
2024-09-11 23:52:34 +01:00
dependabot[bot]
d65fc70f07 chore(deps-dev): bump tailwindcss from 3.2.4 to 3.4.11 in /frontend
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss) from 3.2.4 to 3.4.11.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/v3.4.11/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/compare/v3.2.4...v3.4.11)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-11 22:32:46 +00:00
Alex
dcae85eae8 Merge pull request #1114 from Shubham-EV/patch-1
Removed repetitive text in docs
2024-09-11 23:30:13 +01:00
Alex
686a48298b Merge pull request #1120 from ManishMadan2882/main
Load sources on shared conversations
2024-09-11 20:02:54 +01:00
Alex
8ca590559d fix(lint): types 2024-09-11 19:01:42 +01:00
ManishMadan2882
70251222cc Merge: branch main from upstream 2024-09-11 23:12:15 +05:30
Alex
e55c68d27e Merge pull request #1116 from siiddhantt/feat/analytics-and-logs
Feature: Analytics and Logs Dashboard
2024-09-11 15:10:40 +01:00
Siddhant Rai
da4f2ef6b3 fix: linting issue 2024-09-11 18:02:47 +05:30
Siddhant Rai
dbf2cabd38 fix: linting issue 2024-09-11 18:01:23 +05:30
Siddhant Rai
72e68a163c Merge branch 'main' into feat/analytics-and-logs 2024-09-11 17:58:04 +05:30
ManishMadan2882
919a87bf47 fix: load sources on shared conversations 2024-09-11 17:55:31 +05:30
Siddhant Rai
bea0bbfcdb feat: user logs section in settings 2024-09-11 17:45:54 +05:30
Alex
c12d7a8d82 Merge pull request #1102 from arc53/dependabot/pip/application/flask-3.0.3
chore(deps): bump flask from 3.0.1 to 3.0.3 in /application
2024-09-10 14:16:42 +01:00
Alex
711ad1750a Merge pull request #1100 from arc53/dependabot/pip/application/werkzeug-3.0.4
chore(deps): bump werkzeug from 3.0.3 to 3.0.4 in /application
2024-09-10 14:16:31 +01:00
dependabot[bot]
825567d449 chore(deps): bump flask from 3.0.1 to 3.0.3 in /application
Bumps [flask](https://github.com/pallets/flask) from 3.0.1 to 3.0.3.
- [Release notes](https://github.com/pallets/flask/releases)
- [Changelog](https://github.com/pallets/flask/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/flask/compare/3.0.1...3.0.3)

---
updated-dependencies:
- dependency-name: flask
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-10 13:12:34 +00:00
dependabot[bot]
800439d29e chore(deps): bump werkzeug from 3.0.3 to 3.0.4 in /application
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.3 to 3.0.4.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/3.0.3...3.0.4)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-10 13:11:59 +00:00
Alex
e3517dde13 Merge pull request #1104 from arc53/dependabot/pip/application/transformers-4.44.2
chore(deps): bump transformers from 4.44.0 to 4.44.2 in /application
2024-09-10 14:10:48 +01:00
dependabot[bot]
f2da8473a4 chore(deps): bump transformers from 4.44.0 to 4.44.2 in /application
Bumps [transformers](https://github.com/huggingface/transformers) from 4.44.0 to 4.44.2.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.44.0...v4.44.2)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-10 13:02:52 +00:00
Alex
9cc9a6e9b4 Merge pull request #1103 from arc53/dependabot/pip/application/tqdm-4.66.5
chore(deps): bump tqdm from 4.66.3 to 4.66.5 in /application
2024-09-10 14:01:55 +01:00
dependabot[bot]
873406732f chore(deps): bump tqdm from 4.66.3 to 4.66.5 in /application
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.3 to 4.66.5.
- [Release notes](https://github.com/tqdm/tqdm/releases)
- [Commits](https://github.com/tqdm/tqdm/compare/v4.66.3...v4.66.5)

---
updated-dependencies:
- dependency-name: tqdm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-10 12:13:16 +00:00
Alex
14ab950a6c fix: pinning sentence-transformers==3.0.1 2024-09-10 13:12:00 +01:00
Alex
6cd8e71f4f Merge pull request #1119 from arc53/dependabot/npm_and_yarn/mock-backend/multi-4314d4edfb
chore(deps): bump path-to-regexp and json-server in /mock-backend
2024-09-10 12:42:16 +01:00
Alex
4aeaec9dc7 Merge pull request #1117 from ManishMadan2882/main
Adding option to collect feedback in React widget
2024-09-10 12:18:08 +01:00
ManishMadan2882
e318228a08 purge unwanted linting 2024-09-10 16:26:13 +05:30
dependabot[bot]
d22efbf745 chore(deps): bump path-to-regexp and json-server in /mock-backend
Removes [path-to-regexp](https://github.com/pillarjs/path-to-regexp). It's no longer used after updating ancestor dependency [json-server](https://github.com/typicode/json-server). These dependencies need to be updated together.


Removes `path-to-regexp`

Updates `json-server` from 0.17.4 to 1.0.0-beta.2
- [Release notes](https://github.com/typicode/json-server/releases)
- [Commits](https://github.com/typicode/json-server/compare/v0.17.4...v1.0.0-beta.2)

---
updated-dependencies:
- dependency-name: path-to-regexp
  dependency-type: indirect
- dependency-name: json-server
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-10 09:47:05 +00:00
Alex
90309d5552 feat: user logging api operations level 2024-09-10 01:30:47 +01:00
Alex
72842ecd7a Merge pull request #1118 from arc53/1059-migrating-database-to-new-model 2024-09-10 00:17:59 +01:00
Alex
a1b32ffca9 fix: type 2024-09-10 00:06:43 +01:00
Alex
44d225e6ca Merge branch 'main' into 1059-migrating-database-to-new-model 2024-09-09 23:55:25 +01:00
ManishMadan2882
37ab5f9d7a Merge branch 'main' of https://github.com/ManishMadan2882/docsgpt 2024-09-10 03:28:09 +05:30
ManishMadan2882
61fdcec511 minor change 2024-09-10 03:27:49 +05:30
Manish Madan
45cc4fd97a Merge branch 'arc53:main' into main 2024-09-10 03:01:37 +05:30
ManishMadan2882
3228b88312 widget: minor changes 2024-09-10 02:44:10 +05:30
Alex
a1d3592d08 fix: typo 2024-09-09 21:06:30 +01:00
Alex
c686d950d0 Merge pull request #1062 from ManishMadan2882/1059-migrating-database-to-new-model
Migrating database to new model
2024-09-09 21:01:04 +01:00
Alex
ca779bb0af lint: ruff fix 2024-09-09 20:24:15 +01:00
Alex
90f64e2527 fix: references on shar / api keys 2024-09-09 20:22:03 +01:00
Alex
444d50f751 fix: faiss source name 2024-09-09 16:43:20 +01:00
Alex
2f9c72c1cf feat: migrate store to source_id 2024-09-09 15:46:18 +01:00
Alex
1bb81614a5 fix: metadata things 2024-09-09 13:37:11 +01:00
Alex
888e13e198 feat: mongo vector migrate script 2024-09-09 13:01:58 +01:00
Alex
8166642ff9 fix: write id instead of old path on remote db's 2024-09-09 12:00:59 +01:00
ManishMadan2882
51c42790b7 widget: add option to collect feedback 2024-09-09 16:20:07 +05:30
Alex
f105fd1b2c lint: final fix 2024-09-08 23:19:10 +01:00
Alex
fe78e9a336 lint: more lintingg 2024-09-08 22:55:45 +01:00
Alex
2fce25b0c8 fix: Doc type 2024-09-08 22:52:09 +01:00
Alex
6c0da2ea94 lint: ruff fix 2024-09-08 17:02:48 +01:00
Alex
a353e69648 feat: new vectors structure 2024-09-08 16:59:51 +01:00
Siddhant Rai
ac930d5504 feat: analytics dashboard with respective endpoints 2024-09-07 14:44:35 +05:30
Shubham
d4cf8037b7 Removed repetitive text in docs 2024-09-06 06:06:26 +05:30
Alex
fb1fd851b0 Merge pull request #1045 from Jacksonxhx/Jackson
Integrated Milvus Vector DB into main
2024-09-05 23:57:41 +01:00
Alex
2ff8c0b128 Merge branch 'main' into Jackson 2024-09-05 23:43:17 +01:00
Alex
d232229abf feat Milvus integration 2024-09-05 23:41:51 +01:00
Alex
490e58fb52 Merge pull request #1113 from ManishMadan2882/main
Refining the scrolling functionality while streaming
2024-09-05 00:03:45 +01:00
ManishMadan2882
a8582be54d smooth transition to new input 2024-09-05 03:17:14 +05:30
ManishMadan2882
30bb8449e9 fix: wrap code without a match 2024-09-05 03:07:32 +05:30
ManishMadan2882
adb7132e02 conversation: refined scrolling 2024-09-05 02:34:15 +05:30
Manish Madan
4a1e488bd7 Merge branch 'arc53:main' into main 2024-09-04 00:21:01 +05:30
ManishMadan2882
d200db0eeb clean pkg json after publish 2024-09-04 00:20:42 +05:30
Siddhant Rai
28e06fa684 fix: minor ui inconsistencies 2024-09-03 16:11:24 +05:30
Alex
c4cb9b07cb Merge pull request #1112 from arc53/feat/openai-proxy
feat: added easy way to proxy
2024-09-02 20:14:29 +01:00
Alex
817fc5d4b3 fix: little nextra edit 2024-09-02 20:11:31 +01:00
Alex
2de1e5f71a chore: open ai compatable data 2024-09-02 20:09:16 +01:00
Alex
5246d85f11 fix: ruff 2024-09-02 20:00:24 +01:00
Alex
9526ed0258 feat: added easy way to proxy 2024-09-02 19:46:25 +01:00
Alex
736add031c Merge pull request #1111 from ManishMadan2882/main
Widget: Replaced deprecated dependency
2024-09-02 16:29:01 +01:00
ManishMadan2882
d4042ebaa2 minor change 2024-09-02 17:44:46 +05:30
ManishMadan2882
54e31be3b2 adding markdown-it 2024-09-02 17:12:03 +05:30
Alex
b630be8c8a fix: remove logtail 2024-09-01 11:10:54 +01:00
Alex
0aca41f9a6 fix: faiss dependency 2024-09-01 11:09:24 +01:00
Alex
a3fed0f84b Merge pull request #1101 from arc53/dependabot/pip/application/pandas-2.2.2
chore(deps): bump pandas from 2.2.0 to 2.2.2 in /application
2024-08-31 19:21:02 +01:00
Alex
1414ad6d50 Merge pull request #1109 from aameraryan/csv-upload
CSV upload
2024-08-31 19:18:22 +01:00
Alex
ed6b4dabf8 Merge pull request #1110 from arc53/logging
feat: logging
2024-08-31 19:17:46 +01:00
Alex
d9309ebc6e feat: better token counter 2024-08-31 17:07:40 +01:00
Alex
c49b7613e0 fix: langchain warning 2024-08-31 12:53:37 +01:00
Alex
4f88b6dc71 feat: logging 2024-08-31 12:30:03 +01:00
Aamer Aryan
5c9e6404cc fix: bump eslint-plugin-prettier from 4.2.1 to 5.2.1
Committer: Aamer Aryan <aameraryan@gmail.com>
2024-08-31 04:58:47 +05:30
Aamer Aryan
80df494787 fix: add .csv support to file upload input 2024-08-31 04:54:01 +05:30
Alex
c0886c2785 Merge pull request #1107 from ManishMadan2882/main
React-widget: Adding "large" option for widget size
2024-08-28 23:30:28 +01:00
ManishMadan2882
a83a56eecd update script 2024-08-29 03:14:57 +05:30
ManishMadan2882
130eb56d09 add script: automate publish 2024-08-29 03:10:27 +05:30
ManishMadan2882
b60b473e02 fix(warn): update prop name 2024-08-29 03:08:33 +05:30
Alex
e0504eb957 Merge pull request #1105 from arc53/dependabot/npm_and_yarn/extensions/chrome/braces-3.0.3
chore(deps-dev): bump braces from 3.0.2 to 3.0.3 in /extensions/chrome
2024-08-28 18:14:41 +01:00
dependabot[bot]
3886e41e94 chore(deps-dev): bump braces from 3.0.2 to 3.0.3 in /extensions/chrome
Bumps [braces](https://github.com/micromatch/braces) from 3.0.2 to 3.0.3.
- [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3)

---
updated-dependencies:
- dependency-name: braces
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-28 17:04:55 +00:00
ManishMadan2882
edc54c7120 update script 2024-08-28 16:03:43 +05:30
ManishMadan2882
cef1167ef1 update script 2024-08-28 16:02:10 +05:30
ManishMadan2882
f456500f3a size: add large 2024-08-28 04:30:47 +05:30
dependabot[bot]
59328ea44d chore(deps): bump pandas from 2.2.0 to 2.2.2 in /application
Bumps [pandas](https://github.com/pandas-dev/pandas) from 2.2.0 to 2.2.2.
- [Release notes](https://github.com/pandas-dev/pandas/releases)
- [Commits](https://github.com/pandas-dev/pandas/compare/v2.2.0...v2.2.2)

---
updated-dependencies:
- dependency-name: pandas
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-26 20:48:56 +00:00
Alex
0dc840dc8e Merge pull request #1094 from ManishMadan2882/main
Update docs: minor fixes, version update
2024-08-26 11:15:44 +01:00
ManishMadan2882
6700028bd1 fix: docs, update version; update readme 2024-08-26 02:29:04 +05:30
Alex
213b1d1d0d Merge pull request #1093 from shelar1423/main
Widget Documentation update
2024-08-25 11:00:18 +01:00
Alex
feab64b09a fix: numbered list 2024-08-25 10:58:28 +01:00
digvijay shelar
f9f096cca8 Widget Documentation update 2024-08-25 13:09:57 +05:30
Alex
535d174c2b Merge pull request #1092 from ManishMadan2882/main
React widget: Sync with the Stream endpoint, error handling
2024-08-23 14:41:23 +01:00
ManishMadan2882
11d2401970 version update 2024-08-23 18:25:37 +05:30
ManishMadan2882
232b36d4ae fix: sync with the error handling 2024-08-23 17:58:57 +05:30
Alex
b38b159f4e Merge pull request #1089 from ManishMadan2882/main
Upgrading options in React widget
2024-08-23 12:10:12 +01:00
Alex
606bf1ff58 Merge pull request #1090 from siiddhantt/fix/minor-bugs
fix: minor ui bug
2024-08-23 12:05:08 +01:00
Siddhant Rai
46cec638dd fix: chatbot table out of bounds 2024-08-23 11:57:24 +05:30
ManishMadan2882
8637397c86 options: theme, button icon and bg 2024-08-23 02:55:00 +05:30
Alex
7502e1881f Merge pull request #1087 from shriyaMadan/main
Fix streaming issue on new conversation ID
2024-08-22 11:22:42 +01:00
Alex
734d5e50c5 Merge pull request #1088 from ManishMadan2882/main
React Widget: Fix for Next.js
2024-08-21 23:46:01 +01:00
Alex
052ff6727b Merge pull request #1082 from arc53/dependabot/pip/application/anthropic-0.34.0
chore(deps): bump anthropic from 0.12.0 to 0.34.0 in /application
2024-08-21 21:42:29 +01:00
Alex
2962dbd6b8 Merge pull request #1081 from arc53/dependabot/pip/application/qdrant-client-1.11.0
chore(deps): bump qdrant-client from 1.9.0 to 1.11.0 in /application
2024-08-21 21:42:21 +01:00
ManishMadan2882
392afd6f33 fix(Next): window is not defined 2024-08-22 00:26:09 +05:30
“Shriya
fc3f4dff10 Merge remote-tracking branch 'refs/remotes/origin/main' 2024-08-21 06:00:19 +05:30
“Shriya
24d6889b24 abort stream when conversationId updates 2024-08-21 05:50:04 +05:30
Alex
27e3e22703 Merge pull request #1076 from arc53/dependabot/pip/scripts/nltk-3.9
chore(deps): bump nltk from 3.8.1 to 3.9 in /scripts
2024-08-20 23:29:55 +01:00
dependabot[bot]
7b7f609c47 chore(deps): bump qdrant-client from 1.9.0 to 1.11.0 in /application
Bumps [qdrant-client](https://github.com/qdrant/qdrant-client) from 1.9.0 to 1.11.0.
- [Release notes](https://github.com/qdrant/qdrant-client/releases)
- [Commits](https://github.com/qdrant/qdrant-client/compare/v1.9.0...v1.11.0)

---
updated-dependencies:
- dependency-name: qdrant-client
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-20 22:25:35 +00:00
Alex
5389c8858a Merge pull request #1083 from arc53/dependabot/pip/application/pymongo-4.8.0
chore(deps): bump pymongo from 4.6.3 to 4.8.0 in /application
2024-08-20 23:24:27 +01:00
Alex
c2d2fbba96 Merge pull request #1086 from siiddhantt/fix/remark-gfm-incompatible
fix: updated both remark-gfm and react-markdown to compatible versions
2024-08-20 18:02:18 +01:00
Siddhant Rai
cfc039dae1 fix: refactor code for react-markdown 9.0.1 2024-08-20 22:06:55 +05:30
Siddhant Rai
edf24dc992 fix: updated both remark-gfm and react-markdown to compatible versions 2024-08-20 14:18:15 +05:30
Alex
97710296ac Merge pull request #1085 from siiddhantt/fix/remark-gfm-incompatible
fix: revert to remark-gfm 3.0.1
2024-08-20 09:32:34 +01:00
Siddhant Rai
acbfc0bb81 fix: remark-gfm 4.0.0 incompatible with react-markdown 8.0.7 2024-08-20 13:56:20 +05:30
Alex
e1fe2fb093 Merge pull request #1084 from ManishMadan2882/main
Extensions: Add size option to react-widget
2024-08-20 00:34:09 +01:00
ManishMadan2882
dd5c1ec9ed minor design change 2024-08-20 04:23:16 +05:30
dependabot[bot]
c7f7614646 chore(deps): bump @reduxjs/toolkit from 1.9.2 to 2.2.7 in /frontend
Bumps [@reduxjs/toolkit](https://github.com/reduxjs/redux-toolkit) from 1.9.2 to 2.2.7.
- [Release notes](https://github.com/reduxjs/redux-toolkit/releases)
- [Commits](https://github.com/reduxjs/redux-toolkit/compare/v1.9.2...v2.2.7)

---
updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 21:43:00 +00:00
Alex
d604398642 Merge pull request #1080 from arc53/dependabot/npm_and_yarn/frontend/postcss-8.4.41
chore(deps-dev): bump postcss from 8.4.40 to 8.4.41 in /frontend
2024-08-19 22:38:01 +01:00
Alex
d40b1d8937 Merge pull request #1079 from arc53/dependabot/npm_and_yarn/frontend/multi-ec30208de6
chore(deps): bump react-dom and @types/react-dom in /frontend
2024-08-19 22:37:52 +01:00
Alex
49b4b476dc Merge pull request #1078 from arc53/dependabot/npm_and_yarn/frontend/vitejs/plugin-react-4.3.1
chore(deps-dev): bump @vitejs/plugin-react from 4.2.1 to 4.3.1 in /frontend
2024-08-19 22:37:36 +01:00
Alex
c0ec689be9 Merge pull request #1077 from arc53/dependabot/npm_and_yarn/frontend/i18next-23.14.0
chore(deps): bump i18next from 23.11.5 to 23.14.0 in /frontend
2024-08-19 22:37:00 +01:00
Alex
8e26decb5b Merge pull request #1075 from siiddhantt/feat/new-conversation-ui
feat: new ui for sources in conversation
2024-08-19 22:36:35 +01:00
ManishMadan2882
921efcbf4b add size prop - small/medium 2024-08-20 02:25:39 +05:30
dependabot[bot]
705ab58bfb chore(deps): bump pymongo from 4.6.3 to 4.8.0 in /application
Bumps [pymongo](https://github.com/mongodb/mongo-python-driver) from 4.6.3 to 4.8.0.
- [Release notes](https://github.com/mongodb/mongo-python-driver/releases)
- [Changelog](https://github.com/mongodb/mongo-python-driver/blob/master/doc/changelog.rst)
- [Commits](https://github.com/mongodb/mongo-python-driver/compare/4.6.3...4.8.0)

---
updated-dependencies:
- dependency-name: pymongo
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 20:54:54 +00:00
dependabot[bot]
b42e32a955 chore(deps): bump anthropic from 0.12.0 to 0.34.0 in /application
Bumps [anthropic](https://github.com/anthropics/anthropic-sdk-python) from 0.12.0 to 0.34.0.
- [Release notes](https://github.com/anthropics/anthropic-sdk-python/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-python/compare/v0.12.0...v0.34.0)

---
updated-dependencies:
- dependency-name: anthropic
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 20:54:50 +00:00
dependabot[bot]
dec60a0fdd chore(deps-dev): bump postcss from 8.4.40 to 8.4.41 in /frontend
Bumps [postcss](https://github.com/postcss/postcss) from 8.4.40 to 8.4.41.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.40...8.4.41)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 20:40:15 +00:00
dependabot[bot]
fa84d5c502 chore(deps): bump react-dom and @types/react-dom in /frontend
Bumps [react-dom](https://github.com/facebook/react/tree/HEAD/packages/react-dom) and [@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom). These dependencies needed to be updated together.

Updates `react-dom` from 18.2.0 to 18.3.1
- [Release notes](https://github.com/facebook/react/releases)
- [Changelog](https://github.com/facebook/react/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/react/commits/v18.3.1/packages/react-dom)

Updates `@types/react-dom` from 18.0.10 to 18.3.0
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom)

---
updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 20:40:08 +00:00
dependabot[bot]
34c763caf5 chore(deps-dev): bump @vitejs/plugin-react in /frontend
Bumps [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/tree/HEAD/packages/plugin-react) from 4.2.1 to 4.3.1.
- [Release notes](https://github.com/vitejs/vite-plugin-react/releases)
- [Changelog](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite-plugin-react/commits/v4.3.1/packages/plugin-react)

---
updated-dependencies:
- dependency-name: "@vitejs/plugin-react"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 20:39:57 +00:00
dependabot[bot]
e5709dfabc chore(deps): bump i18next from 23.11.5 to 23.14.0 in /frontend
Bumps [i18next](https://github.com/i18next/i18next) from 23.11.5 to 23.14.0.
- [Release notes](https://github.com/i18next/i18next/releases)
- [Changelog](https://github.com/i18next/i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next/compare/v23.11.5...v23.14.0)

---
updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 20:39:49 +00:00
Siddhant Rai
5f4f4a8ab9 fix: skeleton shown for 0 chunks and 'None' doc 2024-08-19 18:14:49 +05:30
Siddhant Rai
ca9e71087b fix: revert changes 2024-08-19 17:40:46 +05:30
Siddhant Rai
6da483b3ef fix: minor ui fix + added req changes 2024-08-19 17:33:15 +05:30
Alex
e300145263 Merge pull request #1070 from arc53/dependabot/npm_and_yarn/frontend/typescript-eslint/parser-5.62.0
chore(deps-dev): bump @typescript-eslint/parser from 5.51.0 to 5.62.0 in /frontend
2024-08-19 10:28:55 +01:00
dependabot[bot]
eb3f0035fe chore(deps): bump nltk from 3.8.1 to 3.9 in /scripts
Bumps [nltk](https://github.com/nltk/nltk) from 3.8.1 to 3.9.
- [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog)
- [Commits](https://github.com/nltk/nltk/compare/3.8.1...3.9)

---
updated-dependencies:
- dependency-name: nltk
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-19 09:28:28 +00:00
Alex
6be3a2a142 Merge pull request #1072 from arc53/dependabot/npm_and_yarn/frontend/eslint-plugin-n-15.7.0
chore(deps-dev): bump eslint-plugin-n from 15.6.1 to 15.7.0 in /frontend
2024-08-19 10:28:26 +01:00
Alex
e23893a419 Merge pull request #1073 from arc53/dependabot/npm_and_yarn/frontend/react-i18next-15.0.1
chore(deps): bump react-i18next from 14.1.2 to 15.0.1 in /frontend
2024-08-19 10:28:08 +01:00
Alex
7b4c1dcde0 Merge pull request #1071 from arc53/dependabot/npm_and_yarn/frontend/remark-gfm-4.0.0
chore(deps): bump remark-gfm from 3.0.1 to 4.0.0 in /frontend
2024-08-19 10:27:34 +01:00
Siddhant Rai
4b7cb2a22a fix: minor fix 2024-08-17 15:26:22 +05:30
Siddhant Rai
344a8a3887 fix: added required changes 2024-08-17 14:42:48 +05:30
Siddhant Rai
0afda5dc27 feat: new sources section + sidebar component 2024-08-15 18:23:13 +05:30
ManishMadan2882
0891ef6d0a frontend: adapting to migration 2024-08-14 17:15:20 +05:30
dependabot[bot]
cdb64ecb19 chore(deps): bump react-i18next from 14.1.2 to 15.0.1 in /frontend
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 14.1.2 to 15.0.1.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v14.1.2...v15.0.1)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-12 20:56:04 +00:00
dependabot[bot]
b05ac4f2a4 chore(deps-dev): bump eslint-plugin-n from 15.6.1 to 15.7.0 in /frontend
Bumps [eslint-plugin-n](https://github.com/eslint-community/eslint-plugin-n) from 15.6.1 to 15.7.0.
- [Release notes](https://github.com/eslint-community/eslint-plugin-n/releases)
- [Changelog](https://github.com/eslint-community/eslint-plugin-n/blob/master/CHANGELOG.md)
- [Commits](https://github.com/eslint-community/eslint-plugin-n/compare/15.6.1...15.7.0)

---
updated-dependencies:
- dependency-name: eslint-plugin-n
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-12 20:55:59 +00:00
dependabot[bot]
411189a076 chore(deps): bump remark-gfm from 3.0.1 to 4.0.0 in /frontend
Bumps [remark-gfm](https://github.com/remarkjs/remark-gfm) from 3.0.1 to 4.0.0.
- [Release notes](https://github.com/remarkjs/remark-gfm/releases)
- [Commits](https://github.com/remarkjs/remark-gfm/compare/3.0.1...4.0.0)

---
updated-dependencies:
- dependency-name: remark-gfm
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-12 20:55:53 +00:00
dependabot[bot]
63cf4d46c9 chore(deps-dev): bump @typescript-eslint/parser in /frontend
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.51.0 to 5.62.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.62.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-12 20:55:46 +00:00
Alex
3c683f2192 Merge pull request #1067 from arc53/dependabot/pip/application/elasticsearch-8.14.0
chore(deps): bump elasticsearch from 8.12.0 to 8.14.0 in /application
2024-08-12 21:49:51 +01:00
Alex
d46e59bcd4 Merge pull request #1065 from arc53/dependabot/pip/application/pydantic-settings-2.4.0
chore(deps): bump pydantic-settings from 2.1.0 to 2.4.0 in /application
2024-08-12 21:45:07 +01:00
Alex
c45e76ec31 Merge pull request #1068 from arc53/dependabot/pip/application/transformers-4.44.0
chore(deps): bump transformers from 4.43.4 to 4.44.0 in /application
2024-08-12 21:42:00 +01:00
Alex
44828707ea Merge pull request #1066 from arc53/dependabot/pip/application/gunicorn-23.0.0
chore(deps): bump gunicorn from 22.0.0 to 23.0.0 in /application
2024-08-12 21:39:05 +01:00
dependabot[bot]
74c047d249 chore(deps): bump transformers from 4.43.4 to 4.44.0 in /application
Bumps [transformers](https://github.com/huggingface/transformers) from 4.43.4 to 4.44.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.43.4...v4.44.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-12 20:22:59 +00:00
dependabot[bot]
50abfb98fe chore(deps): bump elasticsearch from 8.12.0 to 8.14.0 in /application
Bumps [elasticsearch](https://github.com/elastic/elasticsearch-py) from 8.12.0 to 8.14.0.
- [Release notes](https://github.com/elastic/elasticsearch-py/releases)
- [Commits](https://github.com/elastic/elasticsearch-py/compare/v8.12.0...v8.14.0)

---
updated-dependencies:
- dependency-name: elasticsearch
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-12 20:22:27 +00:00
dependabot[bot]
6104657970 chore(deps): bump gunicorn from 22.0.0 to 23.0.0 in /application
Bumps [gunicorn](https://github.com/benoitc/gunicorn) from 22.0.0 to 23.0.0.
- [Release notes](https://github.com/benoitc/gunicorn/releases)
- [Commits](https://github.com/benoitc/gunicorn/compare/22.0.0...23.0.0)

---
updated-dependencies:
- dependency-name: gunicorn
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-12 20:22:24 +00:00
dependabot[bot]
02116d4c05 chore(deps): bump pydantic-settings from 2.1.0 to 2.4.0 in /application
Bumps [pydantic-settings](https://github.com/pydantic/pydantic-settings) from 2.1.0 to 2.4.0.
- [Release notes](https://github.com/pydantic/pydantic-settings/releases)
- [Commits](https://github.com/pydantic/pydantic-settings/compare/v2.1.0...v2.4.0)

---
updated-dependencies:
- dependency-name: pydantic-settings
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-12 20:22:21 +00:00
Alex
23f993bb54 Merge pull request #1064 from arc53/dartpain-patch-1
Upgrade unstructured
2024-08-12 16:44:11 +01:00
Alex
16aedd61da fix: ruff lint 2024-08-12 16:37:03 +01:00
Alex
5a2f3ad616 feat: remove dep 2024-08-12 16:35:23 +01:00
Alex
54971104f8 Upgrade unstructured 2024-08-12 16:03:02 +01:00
ManishMadan2882
deeffbf77d fix(retriever):classic should not override 2024-08-12 15:52:36 +05:30
ManishMadan2882
7e8dd6bba8 fix: get api keys endpoint 2024-08-12 01:06:21 +05:30
ManishMadan2882
dc4078d744 migration(fixes): retriver/sharing endpoints 2024-08-11 21:26:30 +05:30
ManishMadan2882
1eb168be55 vector indexes to be named after mongo _id 2024-08-11 19:33:31 +05:30
ManishMadan2882
3c6fd365fb store only local docs as location 2024-08-09 18:27:54 +05:30
Siddhant Rai
53c9184057 Merge pull request #1061 from arc53/dependabot/pip/application/transformers-4.43.4
chore(deps): bump transformers from 4.36.2 to 4.43.4 in /application
2024-08-07 21:14:04 +05:30
Siddhant Rai
16c9872571 Merge pull request #1052 from arc53/dependabot/pip/application/dataclasses-json-0.6.7
chore(deps): bump dataclasses-json from 0.6.3 to 0.6.7 in /application
2024-08-07 20:28:11 +05:30
Siddhant Rai
17160bc467 Merge pull request #1051 from arc53/dependabot/pip/application/duckduckgo-search-6.2.6
chore(deps): bump duckduckgo-search from 5.3.0 to 6.2.6 in /application
2024-08-07 20:23:28 +05:30
Siddhant Rai
3b39e58cf3 Merge pull request #1058 from arc53/dependabot/npm_and_yarn/frontend/lint-staged-15.2.8
chore(deps-dev): bump lint-staged from 13.1.1 to 15.2.8 in /frontend
2024-08-07 15:17:28 +05:30
dependabot[bot]
7db055116c chore(deps-dev): bump lint-staged from 13.1.1 to 15.2.8 in /frontend
Bumps [lint-staged](https://github.com/lint-staged/lint-staged) from 13.1.1 to 15.2.8.
- [Release notes](https://github.com/lint-staged/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/lint-staged/lint-staged/compare/v13.1.1...v15.2.8)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-07 09:19:07 +00:00
Siddhant Rai
0262ff1aac Merge pull request #1054 from arc53/dependabot/npm_and_yarn/frontend/prettier-3.3.3
chore(deps-dev): bump prettier from 2.8.4 to 3.3.3 in /frontend
2024-08-07 14:47:52 +05:30
dependabot[bot]
14def09ce3 chore(deps-dev): bump prettier from 2.8.4 to 3.3.3 in /frontend
Bumps [prettier](https://github.com/prettier/prettier) from 2.8.4 to 3.3.3.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/prettier/compare/2.8.4...3.3.3)

---
updated-dependencies:
- dependency-name: prettier
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-07 09:09:10 +00:00
Siddhant Rai
96e59da6bc Merge pull request #1057 from arc53/dependabot/npm_and_yarn/frontend/eslint-plugin-react-7.35.0
chore(deps-dev): bump eslint-plugin-react from 7.32.2 to 7.35.0 in /frontend
2024-08-07 14:37:56 +05:30
Siddhant Rai
8f75b0a0c0 Merge pull request #1056 from arc53/dependabot/npm_and_yarn/frontend/vite-5.3.5
chore(deps-dev): bump vite from 5.0.13 to 5.3.5 in /frontend
2024-08-07 14:35:26 +05:30
dependabot[bot]
70a40cfc45 chore(deps-dev): bump eslint-plugin-react in /frontend
Bumps [eslint-plugin-react](https://github.com/jsx-eslint/eslint-plugin-react) from 7.32.2 to 7.35.0.
- [Release notes](https://github.com/jsx-eslint/eslint-plugin-react/releases)
- [Changelog](https://github.com/jsx-eslint/eslint-plugin-react/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jsx-eslint/eslint-plugin-react/compare/v7.32.2...v7.35.0)

---
updated-dependencies:
- dependency-name: eslint-plugin-react
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-07 09:01:23 +00:00
Siddhant Rai
ef6b8b9ebc Merge pull request #1053 from arc53/dependabot/npm_and_yarn/frontend/eslint-plugin-promise-6.6.0
chore(deps-dev): bump eslint-plugin-promise from 6.1.1 to 6.6.0 in /frontend
2024-08-07 14:30:08 +05:30
ManishMadan2882
f9dbaa9407 migrate: link source to vector collection 2024-08-07 03:41:31 +05:30
dependabot[bot]
ed338668d1 chore(deps): bump transformers from 4.36.2 to 4.43.4 in /application
Bumps [transformers](https://github.com/huggingface/transformers) from 4.36.2 to 4.43.4.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.36.2...v4.43.4)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-05 20:32:46 +00:00
Alex
1e5d94a958 Merge pull request #1060 from siiddhantt/fix/conversation-textarea
fix: no safe-area for iOS browser, input not fixed
2024-08-05 18:12:38 +01:00
Siddhant Rai
51553c565f feat: replaced select source modal with hook to set default doc 2024-08-05 22:28:58 +05:30
Alex
ae36805aa1 Merge pull request #1050 from arc53/dependabot/pip/application/boto3-1.34.153
chore(deps): bump boto3 from 1.34.6 to 1.34.153 in /application
2024-08-05 16:54:18 +01:00
Siddhant Rai
22b7445ac4 fix: no safe-area for iOS browser, input not fixed 2024-08-05 19:05:34 +05:30
dependabot[bot]
9c278d7d0b chore(deps-dev): bump vite from 5.0.13 to 5.3.5 in /frontend
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.0.13 to 5.3.5.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.3.5/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-05 10:13:09 +00:00
dependabot[bot]
3a1592692e chore(deps-dev): bump eslint-plugin-promise in /frontend
Bumps [eslint-plugin-promise](https://github.com/eslint-community/eslint-plugin-promise) from 6.1.1 to 6.6.0.
- [Release notes](https://github.com/eslint-community/eslint-plugin-promise/releases)
- [Changelog](https://github.com/eslint-community/eslint-plugin-promise/blob/main/CHANGELOG.md)
- [Commits](https://github.com/eslint-community/eslint-plugin-promise/compare/v6.1.1...v6.6.0)

---
updated-dependencies:
- dependency-name: eslint-plugin-promise
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-05 10:12:55 +00:00
dependabot[bot]
bdec956708 chore(deps): bump dataclasses-json from 0.6.3 to 0.6.7 in /application
Bumps [dataclasses-json](https://github.com/lidatong/dataclasses-json) from 0.6.3 to 0.6.7.
- [Release notes](https://github.com/lidatong/dataclasses-json/releases)
- [Commits](https://github.com/lidatong/dataclasses-json/compare/v0.6.3...v0.6.7)

---
updated-dependencies:
- dependency-name: dataclasses-json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-05 10:12:52 +00:00
dependabot[bot]
d0d8a8a3af chore(deps): bump duckduckgo-search from 5.3.0 to 6.2.6 in /application
Bumps [duckduckgo-search](https://github.com/deedy5/duckduckgo_search) from 5.3.0 to 6.2.6.
- [Release notes](https://github.com/deedy5/duckduckgo_search/releases)
- [Commits](https://github.com/deedy5/duckduckgo_search/compare/v5.3.0...v6.2.6)

---
updated-dependencies:
- dependency-name: duckduckgo-search
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-05 10:12:50 +00:00
dependabot[bot]
f787962be8 chore(deps): bump boto3 from 1.34.6 to 1.34.153 in /application
Bumps [boto3](https://github.com/boto/boto3) from 1.34.6 to 1.34.153.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.6...1.34.153)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-08-05 10:12:48 +00:00
Alex
57b9b369b7 Create dependabot.yml 2024-08-05 11:12:00 +01:00
Alex
ccda5bdb7e Merge pull request #1049 from siiddhantt/fix/conversation-textarea
fix: changed query box from div to textarea
2024-08-05 10:06:57 +01:00
Siddhant Rai
bd5fa83fe0 fix: handleInput after submission of query 2024-08-05 14:25:21 +05:30
Siddhant Rai
59caf381f7 fix: changed query box from div to textarea for handling paste, autoscroll 2024-08-05 14:17:21 +05:30
Alex
181cb1b1bd Update README.md 2024-08-02 16:08:21 +01:00
Alex
ba2dd2d872 chore: run triage only on main 2024-08-01 17:40:28 +01:00
Alex
4dc5acd68e Update README.md 2024-08-01 17:32:23 +01:00
Alex
f57116afbe Merge pull request #1048 from PeterDaveHelloKitchen/Dockerfile
Optimize Dockerfile to reduce image size and improve build efficiency
2024-08-01 17:16:59 +01:00
Peter Dave Hello
99c41c7e34 Optimize Dockerfile to reduce image size and improve build efficiency
- Merge apt-related RUN commands in builder and final stages
- Remove redundant apt-get clean (automatically handled by base image)
- Consolidate cleanup operations within the same layer for effective
  size reduction

Reference:
- https://docs.docker.com/build/building/best-practices/#apt-get
2024-08-01 23:45:13 +08:00
Alex
d7b38d9513 Change api key name to chatbots 2024-07-31 10:40:35 +01:00
Alex
2c7aad1dcd Merge pull request #1043 from PeterDaveHelloKitchen/Dockerfile
Optimize Dockerfile by unifying download tools to `wget`
2024-07-30 16:42:07 +01:00
Alex
593dba72a8 Merge pull request #1047 from ManishMadan2882/main
UI Fixes
2024-07-30 16:24:11 +01:00
ManishMadan2882
09f2f2a9e7 (fix) dark theme,train modal 2024-07-30 18:01:49 +05:30
Jackson
6c583eedb9 Merge branch 'arc53:main' into Jackson 2024-07-30 17:45:06 +08:00
Jacksonxhx
1c4d7a6ad1 integrated milvus db 2024-07-30 17:44:27 +08:00
Alex
f75ed4bc66 Merge pull request #1039 from ManishMadan2882/main
Feature(Sharing): Introducing Promptable conversations
2024-07-29 21:25:20 +01:00
Alex
f487deb7b9 Merge pull request #1044 from arc53/fix/minor-bugs
fix: delete all method type
2024-07-29 21:25:12 +01:00
ManishMadan2882
ad30a8476c fix(shared): UI 2024-07-30 01:13:47 +05:30
Siddhant Rai
60db807443 Update Navigation.tsx 2024-07-29 12:17:05 +05:30
Siddhant Rai
d1dedff9ca fix: delete all method type 2024-07-29 12:15:18 +05:30
Peter Dave Hello
9a04506b0d Optimize Dockerfile by unifying download tools to wget
Replace curl with wget to streamline the build process. Remove
unnecessary curl installation to reduce dependency complexity and
installation time.
2024-07-29 00:48:16 +08:00
ManishMadan2882
a58369fbb1 share modal :translate option 2024-07-28 20:06:24 +05:30
ManishMadan2882
d63b5d71a1 prevent event bubble, open menu onhover 2024-07-28 19:48:48 +05:30
ManishMadan2882
360d790282 save further queries to localstorage 2024-07-28 00:58:55 +05:30
ManishMadan2882
a0dd8f8e0f integrated stream for shared(prompt) conv 2024-07-27 02:42:33 +05:30
ManishMadan2882
db7c001076 exclude the models only in root 2024-07-26 15:47:51 +05:30
Alex
c96f905d2b Merge pull request #1041 from arc53/dependabot/pip/scripts/langchain-community-0.2.9
chore(deps): bump langchain-community from 0.0.16 to 0.2.9 in /scripts
2024-07-26 11:10:21 +01:00
ManishMadan2882
4b3f04083b synced with upstream 2024-07-26 15:36:27 +05:30
ManishMadan2882
8276b6c9a9 (shared conv) centralised into redux state 2024-07-26 15:19:38 +05:30
Alex
43d6e788dc Merge pull request #1040 from siiddhantt/enhancement/reusable-api-client
enhancement: reusable api client in frontend
2024-07-25 11:48:31 +01:00
Siddhant Rai
0c062a8485 enhancement: implement api client in remaining places 2024-07-24 23:08:42 +05:30
dependabot[bot]
99b649f24e chore(deps): bump langchain-community from 0.0.16 to 0.2.9 in /scripts
Bumps [langchain-community](https://github.com/langchain-ai/langchain) from 0.0.16 to 0.2.9.
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](https://github.com/langchain-ai/langchain/commits/langchain-community==0.2.9)

---
updated-dependencies:
- dependency-name: langchain-community
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-24 17:36:19 +00:00
Siddhant Rai
7c6532f145 enhancement: reusable api client setup + replaced in settings, conversation 2024-07-24 21:33:36 +05:30
ManishMadan2882
052669a0b0 complete_stream: no storing for api key req, fe:update shared conv 2024-07-24 02:03:53 +05:30
ManishMadan2882
0cf86d3bbc share api key through response 2024-07-20 03:35:57 +05:30
Alex
56a16b862a Merge pull request #1038 from nayelimdejesus/ux-ehancement
Ux Enhancement
2024-07-19 12:28:23 +01:00
ManishMadan2882
b2fffb2e23 feat(share): enable route to share promptable conv 2024-07-19 03:24:35 +05:30
Nayeli De Jesus
3f7a27cdbb Updated Ux-enhancement
I worked on deleting the button since we want training modal to close automatically upon completion. I added the set variables into TrainingProgress().
2024-07-18 05:35:12 -07:00
Nayeli De Jesus
58e30b8c88 Created Branch 2024-07-18 05:28:30 -07:00
Alex
4025e55b95 Merge pull request #1028 from utin-francis-peter/fix/issue#1023
Fix: adjusted alignment of submit query icon within its container
2024-07-17 00:25:42 +01:00
utin-francis-peter
e1e63ebd64 Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into fix/issue#1023 2024-07-16 22:05:12 +01:00
utin-francis-peter
8279df48bf removed shrink 2024-07-16 22:04:26 +01:00
Alex
d86a06fab0 Merge pull request #1027 from utin-francis-peter/feat/issue#1017
Feat Implementation for issue#1017
2024-07-16 14:35:18 +01:00
Siddhant Rai
90b24dd915 fix: removed unused TextArea component 2024-07-16 18:20:13 +05:30
Alex
bacd2a6893 Merge pull request #1034 from ManishMadan2882/main
Feat: sharing endpoints
2024-07-16 12:28:59 +01:00
Alex
0f059f247d fix: ruff lint 2024-07-16 12:28:43 +01:00
ManishMadan2882
e2b76d9c29 feat(share): share btn above conversations 2024-07-16 02:09:36 +05:30
ManishMadan2882
1107a2f2bc refactor App.tsx: better convention 2024-07-15 17:56:23 +05:30
ManishMadan2882
efd43013da minor fix 2024-07-15 05:13:28 +05:30
ManishMadan2882
7b8458b47d fix layout 2024-07-15 05:00:13 +05:30
ManishMadan2882
84eed09a17 feedback visible conditioned, update meta info in shared 2024-07-15 02:55:38 +05:30
ManishMadan2882
35b1a40d49 feat(share) translate 2024-07-14 04:13:25 +05:30
ManishMadan2882
81d7fe3fdb refactor App, add /shared/id page 2024-07-14 03:29:06 +05:30
ManishMadan2882
02187fed4e add timetamp in iso, remove sources 2024-07-14 03:27:53 +05:30
ManishMadan2882
019bf013ac add css class: no-scrollbar 2024-07-12 02:51:59 +05:30
ManishMadan2882
d6e59a6a0a conversation tile: add menu, add share modal 2024-07-11 21:45:47 +05:30
utin-francis-peter
46aa862943 Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into feat/issue#1017 2024-07-09 13:34:49 +01:00
utin-francis-peter
0413cab0d9 chore: removed all TextArea related entities from branch as it's outiside scope of branch/issue 2024-07-09 13:32:46 +01:00
Manish Madan
3357ce8f33 Merge branch 'arc53:main' into main 2024-07-09 16:29:04 +05:30
Alex
1776f6e7fd Merge pull request #1024 from blackviking27/feat-bubble-width 2024-07-09 09:06:39 +04:00
ManishMadan2882
edfe5e1156 restrict redundant sharing, add user field 2024-07-08 15:59:19 +05:30
ManishMadan2882
0768992848 add route to share and fetch public conversations 2024-07-08 03:03:46 +05:30
FIRST_NAME LAST_NAME
1224f94879 moved the three icons to the bottom of conversation bubble 2024-07-07 21:52:20 +05:30
Alex
b58c5344b8 Merge pull request #1033 from arc53/dependabot/npm_and_yarn/extensions/web-widget/braces-3.0.3
chore(deps-dev): bump braces from 3.0.2 to 3.0.3 in /extensions/web-widget
2024-07-07 17:24:03 +04:00
dependabot[bot]
7175bc0595 chore(deps-dev): bump braces in /extensions/web-widget
Bumps [braces](https://github.com/micromatch/braces) from 3.0.2 to 3.0.3.
- [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3)

---
updated-dependencies:
- dependency-name: braces
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-07 13:20:00 +00:00
Alex
b7a6f5696d Merge pull request #1032 from utin-francis-peter/fix/issue#1016
FEAT: Auto Language Detection using User's Browser Default
2024-07-07 17:19:32 +04:00
utin-francis-peter
abf5b89c28 refactor: handling applied styles based on colorVariant in a neater manner 2024-07-07 08:33:02 +01:00
utin-francis-peter
d554444b0e chore: updated Input prop from hasSilverBorder to colorVariant 2024-07-06 21:22:41 +01:00
utin-francis-peter
16ae0725e6 chore: took off the option of looking-up docsgpt-locale lang key in localStorage on first load 2024-07-06 20:41:21 +01:00
utin-francis-peter
61feced541 Merge branch 'feat/issue#1017' of https://github.com/utin-francis-peter/DocsGPT into feat/issue#1017 2024-07-05 21:57:46 +01:00
utin-francis-peter
a1d4db2f1e Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into feat/issue#1017 2024-07-05 12:15:38 +01:00
Utin Francis Peter
357e9af627 chore: typo elimination
Co-authored-by: Siddhant Rai <47355538+siiddhantt@users.noreply.github.com>
2024-07-05 12:07:33 +01:00
utin-francis-peter
a41519be63 fix: minor typo 2024-07-05 11:41:12 +01:00
FIRST_NAME LAST_NAME
870e6b07c8 Merge branch 'main' of https://github.com/blackviking27/DocsGPT into feat-bubble-width 2024-07-04 19:12:04 +05:30
utin-francis-peter
6f41759519 Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into fix/issue#1016 2024-07-04 10:11:57 +01:00
utin-francis-peter
6727c42f18 feat: auto browser lang detection on first visit 2024-07-04 10:05:54 +01:00
utin-francis-peter
90c367842f chore: added browser lang detector package by i18next 2024-07-04 09:00:14 +01:00
Alex
a0bb6e370e Merge pull request #1018 from utin-francis-peter/fix/issue#1014 2024-07-04 00:35:29 +04:00
Alex
f2910ab9d1 Merge pull request #1029 from arc53/dependabot/npm_and_yarn/docs/braces-3.0.3
chore(deps): bump braces from 3.0.2 to 3.0.3 in /docs
2024-07-03 23:11:43 +04:00
utin-francis-peter
b4bfed2ccb style: query submission icon centering 2024-07-03 15:46:35 +01:00
dependabot[bot]
2fcde61b6d chore(deps): bump braces from 3.0.2 to 3.0.3 in /docs
Bumps [braces](https://github.com/micromatch/braces) from 3.0.2 to 3.0.3.
- [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3)

---
updated-dependencies:
- dependency-name: braces
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-03 13:10:18 +00:00
Alex
ffddf10de5 Merge pull request #1026 from ManishMadan2882/main 2024-07-03 17:09:46 +04:00
utin-francis-peter
6e3bd5e6f3 fix: adjusted alignment of submit query icon within its container 2024-07-03 13:29:34 +01:00
utin-francis-peter
b21230c4d6 chore: migrated to using custom Input component to address redundant twClasses 2024-07-03 12:34:13 +01:00
utin-francis-peter
0a533b64e1 chore: migrated prop type definition into a types declaration file for components. other components prop types will live here 2024-07-03 11:49:49 +01:00
utin-francis-peter
15b0e321bd chore: TextArea component to replace Div contentEditable for entering prompts 2024-07-03 11:24:29 +01:00
ManishMadan2882
4d749340a2 fix: lint error - semantic ambiguity 2024-07-03 13:25:47 +05:30
utin-francis-peter
0ef6ffa452 gap between y-borders and prompts input + border-radius reduction as prompts input grows 2024-07-02 19:48:19 +01:00
FIRST_NAME LAST_NAME
d7b1310ba3 conversation bubble width fix 2024-07-02 22:11:21 +05:30
utin-francis-peter
7408454a75 chore: prompts input now uses useState hook for state change and inbuilt autoFocus 2024-07-01 19:54:31 +01:00
utin-francis-peter
07b71468cc style: removed custom padding and used twClasses 2024-06-29 20:45:33 +01:00
utin-francis-peter
522e966194 refactor: custom input component is used. inputRef is also replaced with state value 2024-06-29 18:58:13 +01:00
utin-francis-peter
937c60c9cf style: updated custom css class to match textInput component's 2024-06-29 18:55:10 +01:00
utin-francis-peter
bbb1e22163 style: spacings... 2024-06-28 20:19:01 +01:00
utin-francis-peter
a16e83200a style fix: gap between conversations wrapper and prompts input wrapper 2024-06-28 15:16:55 +01:00
utin-francis-peter
d437521710 style fix: response bubble padding and radius 2024-06-28 14:45:14 +01:00
utin-francis-peter
5cbf4cf352 style fix: padding and radius of question bubble 2024-06-28 14:24:34 +01:00
Alex
2985e3b75b Merge pull request #1013 from arc53/fix/singleton-llama-cpp
fix: use singleton in llama_cpp
2024-06-25 18:25:01 +01:00
Alex
f34a75fc5b Merge pull request #1004 from utin-francis-peter/fix/traning-progress
Fix/training progress
2024-06-25 14:57:26 +01:00
Alex
5aa88714b8 refactor: Add thread lock 2024-06-25 14:41:04 +01:00
Alex
ce56a414e0 fix: use singleton 2024-06-25 14:37:00 +01:00
Alex
ba4a7dcd45 Merge pull request #1012 from siiddhantt/fix/input-box-cutting-content
fix: input box improvements
2024-06-25 13:38:08 +01:00
Siddhant Rai
85c648da6c fix: large spacing + padding issue in input box 2024-06-25 17:58:16 +05:30
Alex
483f8eb690 Merge pull request #1011 from arc53/dependabot/npm_and_yarn/braces-3.0.3
chore(deps-dev): bump braces from 3.0.2 to 3.0.3
2024-06-25 13:10:18 +01:00
dependabot[bot]
93c868d698 chore(deps-dev): bump braces from 3.0.2 to 3.0.3
Bumps [braces](https://github.com/micromatch/braces) from 3.0.2 to 3.0.3.
- [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3)

---
updated-dependencies:
- dependency-name: braces
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-25 12:05:49 +00:00
Alex
a14e70e3f4 Merge pull request #1006 from arc53/dependabot/npm_and_yarn/frontend/braces-3.0.3
chore(deps-dev): bump braces from 3.0.2 to 3.0.3 in /frontend
2024-06-25 13:04:35 +01:00
Alex
a6ff606cae Merge pull request #1008 from utin-francis-peter/fix/issue#998
Fix/issue#998
2024-06-24 22:14:24 +01:00
utin-francis-peter
651eb3374c chore: on language change when active tab is general, active tab is persisted as general 2024-06-23 23:33:27 +01:00
utin-francis-peter
68c71adc5a chore: i18n "General" tab title 2024-06-23 23:29:59 +01:00
utin-francis-peter
0c4ca9c94d refactor: selected language gets stored in local state, triggering an effect that updates lang value in local storage and change language 2024-06-23 23:27:43 +01:00
utin-francis-peter
8c04f5b3f1 chore: selected language isn't included in language options 2024-06-23 23:19:14 +01:00
Alex
35b29a0a1e Merge pull request #1005 from siiddhantt/fix/modals-and-sidebar
fix: modals close on clicking outside
2024-06-23 12:51:51 +01:00
dependabot[bot]
d289f432b1 chore(deps-dev): bump braces from 3.0.2 to 3.0.3 in /frontend
Bumps [braces](https://github.com/micromatch/braces) from 3.0.2 to 3.0.3.
- [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3)

---
updated-dependencies:
- dependency-name: braces
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-21 18:49:54 +00:00
Siddhant Rai
e16e269775 fix: dropdown closes on clicking outside 2024-06-21 23:35:03 +05:30
utin-francis-peter
4e5d0c2e84 Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into fix/traning-progress 2024-06-21 18:06:55 +01:00
utin-francis-peter
c9a2034936 chore: adjusted delay time before training starts 2024-06-21 18:04:30 +01:00
Alex
b70fc1151d fix: print error to console 2024-06-21 14:54:32 +01:00
utin-francis-peter
c11034edcd chore: slight delay between uploading and learning progress transition 2024-06-20 23:35:39 +01:00
utin-francis-peter
804d9b42a5 Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into fix/traning-progress 2024-06-20 22:33:44 +01:00
utin-francis-peter
b1bb4e6758 fix: uploading/training progress bar 2024-06-20 22:18:18 +01:00
Alex
76ed8f0ba2 Merge pull request #1002 from ManishMadan2882/main
Better Error handling on /stream endpoint
2024-06-20 20:00:55 +01:00
Alex
4dde7eaea1 feat: Improve error handling in /stream route 2024-06-20 19:51:35 +01:00
Alex
2e2149c110 fix: stream stuff 2024-06-20 19:40:29 +01:00
ManishMadan2882
70bb9477c5 update err msg, if req fails from client 2024-06-20 18:21:19 +05:30
Alex
ec5363e9c1 Merge pull request #1001 from utin-francis-peter/latest-srcdoc-as-active
Fix: Set Uploaded/Trained/Latest Source Doc as Selected/Active Source Doc
2024-06-20 13:31:10 +01:00
ManishMadan2882
dba3b1c559 sort local vectors in latest first order 2024-06-20 17:58:59 +05:30
utin-francis-peter
9606e3f80c chore: handleDeleteClick now accepts only doc as param 2024-06-20 06:00:32 +01:00
utin-francis-peter
7bc7b500f5 refactor/chore: migrated away from manually removing a deleted source doc from UI / latest docs are fetched after deletion to update UI 2024-06-20 05:58:39 +01:00
utin-francis-peter
c6e804fa10 Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into latest-srcdoc-as-active 2024-06-20 00:19:09 +01:00
utin-francis-peter
1cbaf9bd9d chore: updates from upstream 2024-06-20 00:05:14 +01:00
utin-francis-peter
45145685d5 fix: upon successful training of newly uploaded src doc, the latest doc is auto set as selected doc 2024-06-19 23:41:38 +01:00
utin-francis-peter
2fbec6f21f chore: added cleanup fxn for TrainingProgress timeout fxn 2024-06-19 23:39:16 +01:00
ManishMadan2882
ad29d2765f fix: add reducers to raise error, handle complete_stream() 2024-06-20 00:10:29 +05:30
Alex
e47e751142 fix link 2024-06-19 12:35:30 +01:00
Alex
c63d4ccf3e Merge pull request #1000 from arc53/feat/upgrade-ubuntu-docker
upgrade docker to 24.04
2024-06-19 11:57:37 +01:00
Alex
e5c30cf841 upgrade docker to 24.04 2024-06-19 11:45:37 +01:00
Alex
c80678aac5 Merge pull request #994 from xucailiang/fix-celery-import-error
rename celery.py
2024-06-19 09:47:52 +01:00
xucai
1754570057 rename celery_init.py 2024-06-19 16:17:09 +08:00
xucailiang
d87b411193 Merge branch 'arc53:main' into fix-celery-import-error 2024-06-19 15:16:39 +08:00
utin-francis-peter
8fc6284317 chore: on deleting an uploaded doc, default doc gets set as selected source doc 2024-06-18 23:33:49 +01:00
Alex
eae49d2367 Merge pull request #996 from arc53/feat/memory-embedding-singleton
chore: Refactor embeddings instantiation to use a singleton pattern
2024-06-18 11:52:27 +01:00
ManishMadan2882
69287c5198 feat: err handling /stream 2024-06-18 16:12:18 +05:30
Alex
e6b3984f78 Merge pull request #988 from utin-francis-peter/fix/retry-btn
Fix/retry-btn
2024-06-15 11:36:46 +01:00
Alex
547fe888d4 Merge pull request #991 from vedantbhatter/vedant-branch
Adding in Mandarin translation into DocsGPT
2024-06-14 15:13:45 +01:00
Alex
3454309cbc chore: Refactor embeddings instantiation to use a singleton pattern 2024-06-14 12:58:35 +01:00
utin-francis-peter
544c46cd44 chore: retry btn is side-by-side with error mssg 2024-06-14 00:31:33 +01:00
utin-francis-peter
2c100825cc Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into fix/retry-btn 2024-06-13 23:25:33 +01:00
Alex
558ecd84a6 Merge pull request #993 from siiddhantt/fix/input-bar-hidden-safari
fix: input field covered by url bar in safari
2024-06-13 14:18:26 +01:00
utin-francis-peter
df24cfff4f style: improve visibility of bottom-most message bubble 2024-06-12 22:52:43 +01:00
Siddhant Rai
bd5d93a964 fix: unfixed input bar + safe area inset for Safari (iOS) 2024-06-13 00:21:51 +05:30
xucai
ae2ded119f rename celery_init.py 2024-06-12 19:48:28 +08:00
Siddhant Rai
abdb80a6be fix: input field covered by url bar in safari 2024-06-12 15:55:55 +05:30
utin-francis-peter
2f9cbe2bf1 chore: if user types in a new prompt after failed generation (instead of hitting retry btn), the failed query is updated with the new prompt before response is fetched. Ensuring every query object remains useful & relevant 2024-06-11 20:30:12 +01:00
utin-francis-peter
2cca7d60d5 chore: modified "retry" generation flow to give users the option of retrying with prev failed response or entering a new prompt into the provided field 2024-06-11 18:19:35 +01:00
Alex
3df745d1d2 Merge pull request #990 from IlyasOsman/token-format
Denominations on tokens
2024-06-11 10:19:28 +01:00
Alex
9862083e0b Update README.md 2024-06-11 10:11:09 +01:00
Vedant Bhatter
7a4976c470 Addign in Mandarin translation into DocsGPT 2024-06-10 17:47:49 -07:00
ilyasosman
8834a19743 Denominations on tokens 2024-06-10 22:50:35 +03:00
Alex
6e15403f60 Merge pull request #989 from SDanielDev/working
Updated nextra docs with new html code block installation instruction
2024-06-10 10:57:45 +01:00
utin-francis-peter
7e1cf10cb2 style: reduced retry container padding 2024-06-09 13:49:26 +01:00
utin-francis-peter
ee762c3c68 chore: modified handleQuestion params for more clarity 2024-06-09 13:47:51 +01:00
utin-francis-peter
32c06414c5 style: added theme adaptable RetryIcon component to Retry btn 2024-06-08 03:29:18 +01:00
SamDanielDev
e97e1ba4bc Updated nextra docs with new html code block installation instruction 2024-06-07 18:16:50 +01:00
utin-francis-peter
2f580f7800 feat: japan locale config 2024-06-07 17:40:33 +01:00
utin-francis-peter
1ce1459455 Merge branch 'main' of https://github.com/utin-francis-peter/DocsGPT into fix/retry-btn 2024-06-07 17:38:03 +01:00
utin-francis-peter
c26573482e style: retry query generation btn 2024-06-07 17:28:13 +01:00
utin-francis-peter
414ec08dee refactor: modified prepResponseView to prioritize query.response and trigger re-render after a failed generation is retried 2024-06-07 17:26:19 +01:00
Alex
1cc78191eb Merge pull request #987 from charlesnilsson/main
my-japanese-translation
2024-06-07 16:14:25 +01:00
Alex
75c6c6081a feat: Add Japanese translation support fix 2024-06-07 16:08:36 +01:00
utin-francis-peter
8d2ebe9718 feat: "Retry" btn conditionally renders in place of query input when a generation fails. Uses prev query to fetch answer when clicked. 2024-06-07 15:59:56 +01:00
Charles Nilsson
eed974b883 my-japanese-translation 2024-06-07 16:44:16 +02:00
utin-francis-peter
ae846dac4d chore: received changes from upstream 2024-06-07 15:33:24 +01:00
utin-francis-peter
0b09c00b50 chore: modified handleQuestion to favor "Retry" action after a failed response generation 2024-06-07 14:47:29 +01:00
Alex
f7a1874cb3 Merge pull request #979 from arc53/dependabot/pip/application/qdrant-client-1.9.0
chore(deps): bump qdrant-client from 1.8.2 to 1.9.0 in /application
2024-06-04 19:13:55 +01:00
dependabot[bot]
28fb04eb7b chore(deps): bump qdrant-client from 1.8.2 to 1.9.0 in /application
Bumps [qdrant-client](https://github.com/qdrant/qdrant-client) from 1.8.2 to 1.9.0.
- [Release notes](https://github.com/qdrant/qdrant-client/releases)
- [Commits](https://github.com/qdrant/qdrant-client/compare/v1.8.2...v1.9.0)

---
updated-dependencies:
- dependency-name: qdrant-client
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-04 17:53:28 +00:00
Alex
34310cf420 Merge pull request #974 from siiddhantt/fix/pr-960
fix/pr-960
2024-06-03 22:44:36 +01:00
Alex
e1d61d7190 Merge pull request #961 from arc53/dependabot/pip/application/requests-2.32.0
build(deps): bump requests from 2.31.0 to 2.32.0 in /application
2024-06-03 22:43:11 +01:00
Alex
9c14ac84cb Merge pull request #977 from shelar1423/main
DocsGPT link update
2024-06-03 17:26:54 +01:00
Siddhant Rai
1d1ea7b6f2 fix: version update 2024-06-03 20:59:52 +05:30
digvijay shelar
92401f5b7c link fix 2024-06-03 20:59:30 +05:30
Siddhant Rai
38ac9218ec fix: unpkg link in readme 2024-06-03 20:43:43 +05:30
Siddhant Rai
48497c749a fix: dompurify import error 2024-06-03 20:36:52 +05:30
Siddhant Rai
72a1892058 fix: added targets for browser environment 2024-06-03 12:57:53 +05:30
Siddhant Rai
f2c328d212 fix: empty types.d.ts generated during build + updated README.md 2024-06-01 14:10:12 +05:30
Alex
e9eafc40a7 Merge pull request #971 from shelar1423/main
FIX: improved documentation
2024-05-30 15:32:48 +01:00
digvijay shelar
933ca1bf81 updated the llm instructions for OS version 2024-05-30 18:51:56 +05:30
digvijay shelar
b4fc9aa7eb new home demo 2024-05-30 18:27:40 +05:30
Digvijay Shelar
dcc475bbef Merge branch 'arc53:main' into main 2024-05-30 18:22:56 +05:30
Alex
1fe35ad0cd Merge pull request #970 from siiddhantt/feature/link-to-source
feat: remote sources have clickable links to original url
2024-05-30 12:06:05 +01:00
Siddhant Rai
f1ed1e0f14 fix: type error 2024-05-30 15:33:16 +05:30
Alex
fcc746fb98 Merge pull request #972 from ManishMadan2882/main
Fix: added translation for the conversation history dropdown
2024-05-29 18:37:43 +01:00
ManishMadan2882
95934a5b7a (i18n): updated for conv history 2024-05-29 22:54:46 +05:30
Digvijay Shelar
d38b101820 Merge branch 'arc53:main' into main 2024-05-29 19:45:35 +05:30
Siddhant Rai
91d730a7bc feat: remote sources have clickable links 2024-05-29 19:07:08 +05:30
Alex
0cfa77b628 chats word in translations 2024-05-29 11:29:00 +01:00
Alex
ca4881ad51 Merge pull request #969 from ManishMadan2882/main
Internationalisation with i18next
2024-05-29 11:23:45 +01:00
digvijay shelar
8c2c064fe2 updated emoji's 2024-05-29 15:25:23 +05:30
Digvijay Shelar
10646b9b86 Merge branch 'arc53:main' into main 2024-05-29 15:04:16 +05:30
Alex
967b195946 Merge pull request #967 from starkgate/empty-response-after-streaming
Fix empty response in the conversation
2024-05-28 23:06:46 +01:00
ManishMadan2882
1ae7771290 add spacing in general, minor change 2024-05-29 03:27:53 +05:30
ManishMadan2882
a585fe4d54 refactored locale json 2024-05-28 21:38:42 +05:30
ManishMadan2882
fa3a9fe70e fix: minor changes 2024-05-28 21:35:10 +05:30
ManishMadan2882
99952a393f feat(i18n): modals, Hero, Nav 2024-05-28 20:50:07 +05:30
digvijay shelar
920a41e3ca api section fixed 2024-05-28 20:47:22 +05:30
digvijay shelar
e5bec957a1 issue #962 2024-05-28 20:32:35 +05:30
Alex
41cb765255 Update README.md 2024-05-28 10:09:06 +01:00
Alex
2d12a3cd7a Merge pull request #965 from siiddhantt/feature/set-tokens-message-history
feat: dropdown to adjust conversational history limits
2024-05-28 09:43:21 +01:00
starkgate
df4fe0176c Fix empty response in the conversation 2024-05-28 10:40:55 +02:00
ManishMadan2882
4fcc80719e feat(i18n): settings static content 2024-05-28 01:39:37 +05:30
Alex
f6c66f6ee4 Merge pull request #964 from ManishMadan2882/main
Feature: Token count for vectors
2024-05-27 11:44:11 +01:00
Siddhant Rai
220d137e66 feat: dropdown to adjust conversational history limits 2024-05-26 23:13:01 +05:30
Alex
425803a1b6 chore: Refactor source assignment in api_answer route 2024-05-24 16:50:00 +01:00
Manish Madan
c794ea614a Merge branch 'arc53:main' into main 2024-05-24 21:12:07 +05:30
ManishMadan2882
9000838aab (feat:vectors): calc, add token in db 2024-05-24 21:10:50 +05:30
Alex
2790bda1e9 feat: Update Kubernetes deployment instructions for DocsGPT 2024-05-24 16:16:32 +01:00
Alex
e13d4daa9a chore: Remove unused VECTOR_STORE variable in docsgpt-secrets.yaml 2024-05-24 16:09:31 +01:00
Alex
2f504a4e03 Merge pull request #963 from arc53/feat/kubes-deployment
feat: k8s deployment
2024-05-24 14:48:22 +01:00
Alex
598a50a133 feat: Add Kubernetes deployment instructions for DocsGPT 2024-05-24 14:40:28 +01:00
Alex
1b06a5a3e0 feat: k8s deployment 2024-05-23 18:23:01 +01:00
Alex
9f1d3b0269 Update README.md 2024-05-22 16:34:04 +01:00
Alex
a09543d38b Update README.md 2024-05-22 16:33:48 +01:00
dependabot[bot]
2ab3539925 ---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-21 05:53:49 +00:00
Alex
23ddf53abe Update ci.yml 2024-05-20 12:09:11 +01:00
ilyasosman
d8720d0849 Add DocsGPTWidget embedding support for HTML 2024-05-19 22:08:18 +03:00
Alex
6753b55160 Merge pull request #955 from sossost/feat/add_copy_button_on_code_snippet
Feat : add copy button on code snippet
2024-05-18 13:08:25 +01:00
Alex
7f7f48ad56 Merge pull request #958 from arc53/feat-pre-loading-embeds
chore: Update Docker build platforms for application and frontend and…
2024-05-18 12:49:19 +01:00
jang_yoonsu
149ca01029 fix : Add group property to code block parent element and add copy button condition 2024-05-18 20:43:13 +09:00
Alex
5c8133a810 chore: Update Docker build platforms for application and frontend and optimised embedding import 2024-05-18 12:10:24 +01:00
Alex
2adccdd1b0 Merge pull request #957 from ManishMadan2882/main
Update Sidebar
2024-05-17 14:37:44 +01:00
ManishMadan2882
b91068d658 (navbar): shrink navbar 2024-05-17 18:07:06 +05:30
Alex
4534cafd3f Merge pull request #949 from ManishMadan2882/main
Updating Hero
2024-05-16 23:32:49 +01:00
Alex
405e79d729 removed space 2024-05-16 23:32:12 +01:00
ManishMadan2882
4df2349e9d (hero) minor update 2024-05-17 00:59:47 +05:30
jang_yoonsu
a9b61d3e13 design : add style invisible when lg and visible when hover 2024-05-16 23:29:33 +09:00
jang_yoonsu
3767d14e5c feat: add copy button in code snippet 2024-05-16 23:23:46 +09:00
jang_yoonsu
889a050f25 feat : add copy button component 2024-05-16 23:23:06 +09:00
ManishMadan2882
0701fac807 (hero): hover button outline 2024-05-16 18:42:19 +05:30
ManishMadan2882
9fba91069a lint fix 2024-05-16 18:27:36 +05:30
ManishMadan2882
4f9ce70ff8 (hero): demo queries on click 2024-05-16 18:23:45 +05:30
Alex
5e00d4ded7 Merge pull request #953 from shelar1423/main
FIX: Spinner
2024-05-16 10:51:40 +01:00
digvijay shelar
95cd9ee5bb spinner fixed 2024-05-16 15:15:48 +05:30
Alex
40f16f8ef1 Merge pull request #952 from ManishMadan2882/fix-api-key-parse
FIx: API Key Parsing
2024-05-15 16:27:43 +01:00
ManishMadan2882
3d9288f82f fix: override chunks,promps with api-key-data 2024-05-15 20:23:02 +05:30
ManishMadan2882
c51f12f88b (conversation)- taller input field 2024-05-15 16:31:41 +05:30
Alex
0618153390 fix: object id bug 2024-05-14 19:01:45 +01:00
Alex
a7c066291b Update README.md 2024-05-13 17:08:12 +01:00
Alex
a69ac372fa Merge pull request #946 from siiddhantt/refactor/ui-elements
refactor: several small ui refactor for generalisation
2024-05-13 11:47:20 +01:00
Alex
16b2a54981 Merge pull request #936 from Fagner-lourenco/patch-1
Update Dockerfile
2024-05-12 22:36:52 +01:00
Alex
3f68e0d66f chore: Update Dockerfile 2024-05-12 22:33:43 +01:00
Alex
12d483fde6 chore: update documentation links to use the new domain 2024-05-12 11:40:09 +01:00
Siddhant Rai
96034a9712 fix: minor change 2024-05-12 12:56:34 +05:30
Siddhant Rai
d2def4479b refactor: several small ui refactor for generalisation 2024-05-12 12:41:12 +05:30
ManishMadan2882
afbbb913e7 (hero): updating the UI 2024-05-10 16:21:42 +05:30
Alex
ad76f239a3 Merge pull request #943 from arc53/dependabot/npm_and_yarn/docs/next-14.1.1
build(deps): bump next from 14.0.4 to 14.1.1 in /docs
2024-05-10 11:29:37 +01:00
dependabot[bot]
e6b096c9e0 build(deps): bump next from 14.0.4 to 14.1.1 in /docs
Bumps [next](https://github.com/vercel/next.js) from 14.0.4 to 14.1.1.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v14.0.4...v14.1.1)

---
updated-dependencies:
- dependency-name: next
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-10 04:27:40 +00:00
Alex
6e26b4e6c7 Merge pull request #942 from ManishMadan2882/main
Fix: Abnormal overflow on mobile screens and arbitrary word breaks.
2024-05-09 16:29:42 +01:00
ManishMadan2882
ea79494b6d fix(conversation): overflows in sources, removed tagline below input 2024-05-08 20:50:20 +05:30
ManishMadan2882
afb18a3e4d (conversation) makes overflow auto 2024-05-08 16:17:16 +05:30
ManishMadan2882
f9c9853102 fix(conversation) word breaks 2024-05-08 16:07:49 +05:30
ManishMadan2882
b3eb9fb6fa fix(conversation): mobile abnormal overflows 2024-05-08 15:56:52 +05:30
Alex
d3b97bf51a Merge pull request #941 from ManishMadan2882/main
fix(UI):conversation,settings
2024-05-08 09:50:30 +01:00
ManishMadan2882
7a2e491199 fix(UI):conversation,settings 2024-05-07 20:37:05 +05:30
Alex
25efaf08b7 Merge pull request #935 from arc53/dependabot/pip/application/tqdm-4.66.3
build(deps): bump tqdm from 4.66.1 to 4.66.3 in /application
2024-05-07 09:52:09 +01:00
Alex
f893ea6b98 Merge pull request #934 from arc53/dependabot/pip/scripts/tqdm-4.66.3
build(deps): bump tqdm from 4.66.1 to 4.66.3 in /scripts
2024-05-07 09:51:57 +01:00
dependabot[bot]
500745b62c build(deps): bump tqdm from 4.66.1 to 4.66.3 in /application
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.1 to 4.66.3.
- [Release notes](https://github.com/tqdm/tqdm/releases)
- [Commits](https://github.com/tqdm/tqdm/compare/v4.66.1...v4.66.3)

---
updated-dependencies:
- dependency-name: tqdm
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-07 08:24:51 +00:00
Alex
9ebe5bf1a7 Merge pull request #939 from arc53/dependabot/pip/application/werkzeug-3.0.3
build(deps): bump werkzeug from 3.0.1 to 3.0.3 in /application
2024-05-07 09:23:58 +01:00
dependabot[bot]
4aecb86daa build(deps): bump werkzeug from 3.0.1 to 3.0.3 in /application
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-06 19:28:27 +00:00
Fagner-lourenco
6924dd6df6 Update Dockerfile 2024-05-04 20:50:11 -03:00
Alex
431755144e fix: Update count_tokens function in utils.py 2024-05-04 10:39:23 +01:00
dependabot[bot]
d182f81754 build(deps): bump tqdm from 4.66.1 to 4.66.3 in /scripts
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.1 to 4.66.3.
- [Release notes](https://github.com/tqdm/tqdm/releases)
- [Commits](https://github.com/tqdm/tqdm/compare/v4.66.1...v4.66.3)

---
updated-dependencies:
- dependency-name: tqdm
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-03 21:48:38 +00:00
Alex
de0193fffc Merge pull request #933 from siiddhantt/fix/remote-upload-issue
fix: remote upload error
2024-05-03 14:54:12 +01:00
Siddhant Rai
53e86205ad fix: added more headers from default 2024-05-03 18:47:30 +05:30
Siddhant Rai
aa670efe3a fix: connection aborted in WebBaseLoader 2024-05-03 18:25:01 +05:30
Alex
e693fe49a7 fix: fixed Dockerfile python path bug 2024-05-03 11:55:51 +01:00
Alex
7eaa32d85f remove gunicorn from final 2024-05-02 14:43:09 +01:00
Alex
ab40d2c37a remove pip from final container 2024-05-01 14:11:16 +01:00
Alex
784206b39b chore: Update Dockerfile to use Ubuntu mantic as base image and upgrade gunicorn to version 22.0.0 2024-05-01 13:19:16 +01:00
Alex
7c8264e221 Merge pull request #929 from TomasMatarazzo/issue-button-to-clean-chat-history
Issue button to clean chat history
2024-05-01 10:54:34 +01:00
TomasMatarazzo
db7195aa30 Update Navigation.tsx 2024-04-29 17:02:22 -03:00
TomasMatarazzo
eb7bbc1612 TS2741 2024-04-27 11:04:28 -03:00
TomasMatarazzo
ee3792181d probando 2024-04-26 20:35:36 -03:00
TomasMatarazzo
9804965a20 style in button and user in back route delete all conv 2024-04-25 23:43:45 -03:00
TomasMatarazzo
b84842df3d Fixing types 2024-04-22 16:35:44 -03:00
TomasMatarazzo
fc170d3033 Update package.json 2024-04-22 16:19:00 -03:00
TomasMatarazzo
8fa4ec7ad8 delete console.log 2024-04-22 16:17:26 -03:00
TomasMatarazzo
480825ddd7 now is working in settings 2024-04-22 16:16:19 -03:00
TomasMatarazzo
260e328cc1 first change 2024-04-22 14:41:59 -03:00
Alex
8873428b4b Merge pull request #926 from siiddhantt/feature
Feature: Logging token usage info to MongoDB
2024-04-22 12:10:00 +01:00
Alex
ab43c20b8f delete test output 2024-04-22 12:08:11 +01:00
TomasMatarazzo
88d9d4f4a3 Update DeleteConvModal.tsx 2024-04-18 13:56:03 -03:00
TomasMatarazzo
d4840f85c0 change text in modal 2024-04-18 13:50:08 -03:00
TomasMatarazzo
6f9ddeaed0 Button to clean chat history 2024-04-17 19:51:29 -03:00
Siddhant Rai
af5e73c8cb fix: user_api_key capturing 2024-04-16 15:31:11 +05:30
Siddhant Rai
333b6e60e1 fix: anthropic llm positional arguments 2024-04-16 10:02:04 +05:30
Siddhant Rai
1b61337b75 fix: skip logging to db during tests 2024-04-16 01:08:39 +05:30
Siddhant Rai
77991896b4 fix: api_key capturing + pytest errors 2024-04-15 22:32:24 +05:30
Siddhant Rai
60a670ce29 fix: changes to llm classes according to base 2024-04-15 19:47:24 +05:30
Siddhant Rai
c1c69ed22b fix: pytest issues 2024-04-15 19:35:59 +05:30
Siddhant Rai
d71c74c6fb Merge branch 'feature' of https://github.com/siiddhantt/DocsGPT into feature 2024-04-15 18:57:46 +05:30
Siddhant Rai
590aa8b43f update: apply decorator to abstract classes 2024-04-15 18:57:28 +05:30
Siddhant Rai
607e0166f6 Merge branch 'arc53:main' into feature 2024-04-15 18:55:09 +05:30
Alex
130c83ee92 Merge pull request #911 from arc53/dependabot/pip/application/pymongo-4.6.3
Bump pymongo from 4.6.1 to 4.6.3 in /application
2024-04-15 12:57:22 +01:00
Alex
fd5e418abf Merge pull request #919 from arc53/dependabot/npm_and_yarn/docs/multi-4407677fd1
build(deps): bump tar and npm in /docs
2024-04-15 12:29:26 +01:00
Siddhant Rai
262d160314 Merge with branch main 2024-04-15 15:18:48 +05:30
Siddhant Rai
9146827590 fix: removed unused import 2024-04-15 15:14:17 +05:30
Siddhant Rai
062b108259 Merge branch 'arc53:main' into feature 2024-04-15 15:04:10 +05:30
Siddhant Rai
ba796b6be1 feat: logging token usage to database 2024-04-15 15:03:00 +05:30
Alex
3d763235e1 Merge pull request #925 from ManishMadan2882/main
Untraced types in react widget
2024-04-14 11:43:03 +01:00
Manish Madan
c30c6d9f10 Merge branch 'arc53:main' into main 2024-04-13 16:20:56 +05:30
ManishMadan2882
311716ed18 refactored fs, fix: untracked dir 2024-04-13 16:01:46 +05:30
Alex
19bb1b4aa4 Create SECURITY.md 2024-04-12 09:39:33 +01:00
Alex
b8749e36b9 Merge pull request #921 from siiddhantt/bugfix
fix for missing fields in API Keys section
2024-04-10 10:25:26 +01:00
Siddhant Rai
00b6639155 fix: minor ui changes 2024-04-10 12:37:29 +05:30
Siddhant Rai
71d7daaef3 fix: minor ui changes 2024-04-10 12:23:37 +05:30
Siddhant Rai
8654c5d471 Merge branch 'bugfix' of https://github.com/siiddhantt/DocsGPT into bugfix 2024-04-10 12:11:51 +05:30
Siddhant Rai
02124b3d38 fix: missing fields from API Keys section 2024-04-10 12:11:34 +05:30
dependabot[bot]
340dcfb70d build(deps): bump tar and npm in /docs
Removes [tar](https://github.com/isaacs/node-tar). It's no longer used after updating ancestor dependency [npm](https://github.com/npm/cli). These dependencies need to be updated together.


Removes `tar`

Updates `npm` from 10.5.0 to 10.5.1
- [Release notes](https://github.com/npm/cli/releases)
- [Changelog](https://github.com/npm/cli/blob/latest/CHANGELOG.md)
- [Commits](https://github.com/npm/cli/compare/v10.5.0...v10.5.1)

---
updated-dependencies:
- dependency-name: tar
  dependency-type: indirect
- dependency-name: npm
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-09 21:09:48 +00:00
Alex
a37b92223a Merge pull request #915 from arc53/feat/retrievers-class
Update application files and fix LLM models, create new retriever class
2024-04-09 22:09:11 +01:00
Alex
7d2b8cb4fc Merge pull request #917 from arc53/multiple-uploads
Multiple file upload
2024-04-09 18:13:52 +01:00
Alex
8d7a134cb4 lint: ruff 2024-04-09 17:25:08 +01:00
Alex
4b849d7201 Fix SagemakerAPILLM test 2024-04-09 17:20:26 +01:00
Alex
e03e185d30 Add Brave Search retriever and update application files 2024-04-09 17:11:09 +01:00
Pavel
7a02df5588 Multiple uploads 2024-04-09 19:56:07 +04:00
Alex
19494685ba Update application files, fix LLM models, and create new retriever class 2024-04-09 16:38:42 +01:00
Alex
1e26943c3e Update application files, fix LLM models, and create new retriever class 2024-04-09 15:45:24 +01:00
dependabot[bot]
83fa850142 Bump pymongo from 4.6.1 to 4.6.3 in /application
Bumps [pymongo](https://github.com/mongodb/mongo-python-driver) from 4.6.1 to 4.6.3.
- [Release notes](https://github.com/mongodb/mongo-python-driver/releases)
- [Changelog](https://github.com/mongodb/mongo-python-driver/blob/master/doc/changelog.rst)
- [Commits](https://github.com/mongodb/mongo-python-driver/compare/4.6.1...4.6.3)

---
updated-dependencies:
- dependency-name: pymongo
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-09 14:22:15 +00:00
Alex
968a116d14 Merge pull request #916 from siiddhantt/bugfix
fix: updated qdrant-client to v1.8.2
2024-04-09 15:20:46 +01:00
Siddhant Rai
fb55b494d7 Merge branch 'arc53:main' into bugfix 2024-04-09 19:09:44 +05:30
Siddhant Rai
59b6a83d7d fix: issue #884 2024-04-09 19:08:59 +05:30
Alex
aabc4f0d7b Merge pull request #907 from siiddhantt/main
refactor: clean up settings file for better structure
2024-04-09 14:17:56 +01:00
Alex
391f686173 Update application files and fix LLM models, create new retriever class 2024-04-09 14:02:33 +01:00
Siddhant Rai
8e6f6d46ec fix: issue during build 2024-04-09 16:34:51 +05:30
Siddhant Rai
2ba7a55439 Merge branch 'arc53:main' into main 2024-04-09 13:54:48 +05:30
Alex
e07df29ab9 Update FLASK_DEBUG_MODE setting to use value from settings module 2024-04-08 13:27:43 +01:00
Alex
abf24fe60f Update FLASK_DEBUG_MODE setting to use value from settings module 2024-04-08 13:15:58 +01:00
Siddhant Rai
fad5f5b81f fix: added requested changes 2024-04-08 17:45:56 +05:30
Siddhant Rai
6961f49a0c Merge branch 'arc53:main' into main 2024-04-08 17:43:21 +05:30
Alex
6911f8652a Fix vectorstore path in check_docs function 2024-04-08 13:06:05 +01:00
Alex
6658cec6a0 Merge pull request #897 from arc53/dependabot/npm_and_yarn/frontend/vite-5.0.13
Bump vite from 5.0.12 to 5.0.13 in /frontend
2024-04-08 13:03:20 +01:00
Alex
14011b9d84 Merge pull request #891 from arc53/dependabot/npm_and_yarn/mock-backend/express-4.19.2
Bump express from 4.18.2 to 4.19.2 in /mock-backend
2024-04-08 13:02:58 +01:00
Alex
bd2d0b6790 Merge pull request from GHSA-p5qc-vj2x-9rjp
advisory-fix
2024-04-08 12:58:36 +01:00
Alex
d36f58230a advisory-fix 2024-04-08 12:56:27 +01:00
Alex
018f950ca3 Merge pull request #908 from arc53/api-keys-documentation-guide
API keys guide
2024-04-08 10:36:35 +01:00
Alex
db8db9fae9 Add prompt_id and chunks fields in create_api_key function 2024-04-08 10:35:15 +01:00
Pavel
79ce8d6563 guide 2024-04-07 20:14:16 +04:00
Alex
13eaa9a35a Merge pull request #904 from arc53/fix/update-docs-widget
Update api key to new data
2024-04-06 11:39:32 +01:00
Siddhant Rai
39f0d76b4b refactor: clean up settings file for better structure 2024-04-05 23:38:59 +05:30
Siddhant Rai
0a5832ec75 refactor: clean up settings file for better structure 2024-04-05 23:33:27 +05:30
Alex
6e147b3ed2 Update api key to new data 2024-04-05 14:49:32 +01:00
Alex
c162f79daa Merge pull request #903 from arc53/feature/api-key-create
Feature/api key create
2024-04-05 13:18:11 +01:00
Alex
87585be687 Merge branch 'main' into feature/api-key-create 2024-04-05 13:01:42 +01:00
Alex
ea08d6413c Merge pull request #902 from ManishMadan2882/feature/api-key-create
Add Prompt, Chunks in Create Key
2024-04-04 12:45:33 +01:00
Alex
879905edf6 Refactor create_api_key function to include prompt_id and chunks in routes.py 2024-04-04 12:38:23 +01:00
Alex
6fd80a5582 Merge pull request #899 from siiddhantt/main
feat: added prompts section under general in settings
2024-04-04 10:25:08 +01:00
Siddhant Rai
0dc7333563 fix: added API Keys in tabs 2024-04-04 14:42:14 +05:30
Siddhant Rai
f61c3168d2 fix: issue with editing new prompts 2024-04-04 14:29:37 +05:30
Siddhant Rai
9cadd74a96 fix: minor ui changes 2024-04-04 13:42:32 +05:30
Siddhant Rai
729fa2352b feat: added prompts section under general in settings 2024-04-04 00:48:49 +05:30
dependabot[bot]
b673aaf9f0 Bump vite from 5.0.12 to 5.0.13 in /frontend
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.0.12 to 5.0.13.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v5.0.13/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.0.13/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-03 17:53:09 +00:00
Alex
3132cc6005 Merge pull request #895 from sarfarazsiddiquii/new_branch
added feature #887
2024-04-03 17:09:06 +01:00
ManishMadan2882
ac994d3077 add prompt,chunks in create key 2024-04-03 19:19:53 +05:30
sarfaraz siddiqui
02d4f7f2da functions can accept null 2024-04-03 18:08:46 +05:30
Alex
d99569f005 Merge pull request #896 from arc53/api-update
api update
2024-04-01 18:22:13 +01:00
Pavel
ec5166249a api update
A description of 3 more api methods in documentation.
2024-04-01 21:07:27 +04:00
Alex
dadd12adb3 Update API key in DocsGPTWidget.tsx 2024-04-01 11:25:59 +01:00
Alex
88b4fb8c2a Update API key in DocsGPTWidget.tsx 2024-04-01 11:25:31 +01:00
sarfaraz siddiqui
afecae3786 added feature #887 2024-03-31 03:50:11 +05:30
Alex
d18598bc33 Merge pull request #894 from arc53/feature/api-key-create
Feature/api key create
2024-03-29 20:04:26 +00:00
Alex
794fc05ada Merge branch 'main' into feature/api-key-create 2024-03-29 19:59:45 +00:00
Alex
5daeb7f876 Merge pull request #892 from ManishMadan2882/feature/api-key-create
Feature/api key create
2024-03-29 19:57:25 +00:00
ManishMadan2882
53e71c545e api key modal - enhancements 2024-03-29 19:11:40 +05:30
ManishMadan2882
959a55e36c adding dark mode - api key 2024-03-29 04:13:12 +05:30
ManishMadan2882
64572b0024 feat(settings): api key endpoints 2024-03-29 03:26:45 +05:30
Manish Madan
9a0c1caa43 Merge branch 'arc53:feature/api-key-create' into feature/api-key-create 2024-03-28 19:28:23 +05:30
ManishMadan2882
eed6723147 feat(settings): api keys tab 2024-03-28 19:25:35 +05:30
Alex
97fabf51b8 Refactor conversationSlice.ts and conversationApi.ts 2024-03-28 13:43:10 +00:00
Alex
5e5e2b8aee Merge pull request #877 from siiddhantt/main
Added reddit loader
2024-03-27 16:55:01 +00:00
Siddhant Rai
e01071426f feat: field to pass number of posts as a parameter 2024-03-27 19:20:55 +05:30
Siddhant Rai
eed1bfbe50 feat: fields to handle reddit loader + minor changes 2024-03-26 16:07:44 +05:30
Siddhant Rai
0c3970a266 Merge branch 'arc53:main' into main 2024-03-26 16:07:25 +05:30
dependabot[bot]
267cfb621e Bump express from 4.18.2 to 4.19.2 in /mock-backend
Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2.
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.18.2...4.19.2)

---
updated-dependencies:
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-26 10:25:02 +00:00
Alex
0e90febab2 Merge pull request #890 from arc53/dependabot/npm_and_yarn/docs/katex-0.16.10
Bump katex from 0.16.9 to 0.16.10 in /docs
2024-03-26 10:24:19 +00:00
dependabot[bot]
31d947837f Bump katex from 0.16.9 to 0.16.10 in /docs
Bumps [katex](https://github.com/KaTeX/KaTeX) from 0.16.9 to 0.16.10.
- [Release notes](https://github.com/KaTeX/KaTeX/releases)
- [Changelog](https://github.com/KaTeX/KaTeX/blob/main/CHANGELOG.md)
- [Commits](https://github.com/KaTeX/KaTeX/compare/v0.16.9...v0.16.10)

---
updated-dependencies:
- dependency-name: katex
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-25 20:31:43 +00:00
Alex
017b11fbba Merge pull request #888 from arc53/fix/parsing-chunks-issue
Fix parsing issue with chunks in store.ts
2024-03-23 11:47:11 +00:00
Alex
3c492062a9 Fix parsing issue with chunks in store.ts 2024-03-23 11:42:50 +00:00
Alex
b26b49d0ca Merge pull request #883 from arc53/feat/chunks
Add support for setting the number of chunks processed per query
2024-03-22 15:34:09 +00:00
Alex
ed08123550 Add support for setting the number of chunks processed per query 2024-03-22 14:50:56 +00:00
Alex
add2db5b7a Merge pull request #881 from arc53/fix_model_selection_for_openai
Fix model selection at least for openAI LLM_NAME
2024-03-21 16:47:52 +00:00
Siddhant Rai
f272d7121a Merge branch 'arc53:main' into main 2024-03-21 19:38:44 +05:30
Anton Larin
577556678c Fix model selection at least for openAI LLM_NAME 2024-03-21 10:14:48 +01:00
Alex
e146922367 Merge pull request #880 from ManishMadan2882/main
Customised Scrollbar, fixed: Hero wasn't completely scrollable in Mobile
2024-03-20 10:08:22 +00:00
ManishMadan2882
6f1548b7f8 customised scrollbar 2024-03-19 21:40:00 +05:30
ManishMadan2882
9e6fe47b44 fix(hero): not fully scrollable in mobile 2024-03-19 21:39:16 +05:30
Siddhant Rai
60cfea1126 feat: added reddit loader 2024-03-16 20:22:05 +05:30
Alex
80a4a094af lint 2024-03-14 11:37:33 +00:00
Alex
70e1560cb3 fix check on model 2024-03-14 11:37:01 +00:00
Alex
725033659a Merge pull request #876 from ManishMadan2882/main
Pause Auto-scroll on user interrupt
2024-03-14 11:33:07 +00:00
ManishMadan2882
059111fb57 widget: version release 2024-03-14 16:58:57 +05:30
ManishMadan2882
d4a5eadf13 docs: updated version 2024-03-14 16:54:52 +05:30
ManishMadan2882
79cf487ac5 purge unused deps, comments 2024-03-14 04:03:17 +05:30
ManishMadan2882
52ecbab859 purge logs 2024-03-14 04:01:35 +05:30
ManishMadan2882
adfc79bf92 block autoScroll on user interrupt 2024-03-14 02:25:33 +05:30
ManishMadan2882
2447bab924 add listener for wheel, touch events 2024-03-14 01:51:55 +05:30
Alex
1057ca78a6 default remote 2024-03-13 17:01:23 +00:00
ManishMadan2882
7e7f98fd92 sanitize html - add dompurify 2024-03-13 00:21:54 +05:30
ManishMadan2882
64552ce2de add snarkdown: markdown support 2024-03-12 18:12:27 +05:30
ManishMadan2882
7506256f42 fix(lint): 2 errors 2024-03-11 19:42:22 +05:30
ManishMadan2882
db75230521 pause scroll on user action 2024-03-11 19:21:17 +05:30
Alex
f8955d5607 Update README.md 2024-03-11 12:05:35 +00:00
Alex
0bad217b93 Merge pull request #867 from siiddhantt/main
fix: issue #157
2024-03-08 16:06:51 +00:00
Alex
4da400a136 Merge pull request #873 from ManishMadan2882/main 2024-03-07 19:26:33 +00:00
ManishMadan2882
24740bd341 fix(UI) overflow in next 2024-03-08 00:48:18 +05:30
ManishMadan2882
3b6a15de84 version update 2024-03-08 00:46:53 +05:30
ManishMadan2882
ac1f525a6c fix: fine tuned the css 2024-03-07 19:20:03 +05:30
ManishMadan2882
e3999bdb0c updating version 2024-03-07 19:14:55 +05:30
ManishMadan2882
ad3d5a30ec docs update - widget v0.3.3 2024-03-07 15:58:31 +05:30
Alex
e4b5847725 Merge pull request #872 from ManishMadan2882/main
Widget UI fixes
2024-03-07 09:54:51 +00:00
ManishMadan2882
1a91a245a3 ui fixes 2024-03-07 02:50:30 +05:30
Alex
229f62d071 Merge pull request #861 from ManishMadan2882/main
DocsGPT Widget
2024-03-06 17:38:47 +00:00
Alex
b96fe16770 Update docsgpt version to 0.3.0 2024-03-06 17:36:47 +00:00
Alex
97750cb5e2 Update package.json with new version and add repository information 2024-03-06 17:24:23 +00:00
Siddhant Rai
e1a2bd11a9 fix: upload dropdown also combined 2024-03-06 16:01:53 +05:30
ManishMadan2882
229b408252 adding fallback avatar 2024-03-06 01:58:52 +05:30
ManishMadan2882
ae929438a5 shifted to parcel, styled-components 2024-03-05 21:15:58 +05:30
Siddhant Rai
5daaf84e05 fix: combined two dropdowns into a single component 2024-03-05 14:26:08 +05:30
Siddhant Rai
19b09515a1 Merge branch 'arc53:main' into main 2024-03-05 14:22:51 +05:30
Alex
9ce6078c8b Merge pull request #863 from Anush008/main
feat: Qdrant vectorstore support
2024-03-04 12:41:11 +00:00
Siddhant Rai
51f588f4b1 fix: issue #157 2024-03-04 15:45:34 +05:30
Alex
5ee6605703 Merge pull request #835 from arc53/feature/remote-loads
Feature/remote loads
2024-03-01 15:42:42 +00:00
Alex
7ef97cfd81 fix abort 2024-03-01 15:42:22 +00:00
Alex
f4288f0bd4 remove sitemap 2024-03-01 14:41:03 +00:00
Alex
4a701cb993 Merge branch 'main' into feature/remote-loads 2024-03-01 14:38:27 +00:00
Anush008
00dfb07b15 chore: revert to faiss default 2024-02-29 09:48:38 +05:30
ManishMadan2882
5fffa8e9db adding rollup-plugin-import-css 2024-02-29 04:11:47 +05:30
Pavel
54d187a0ad Fixing ingestion metadata grouping 2024-02-28 19:52:58 +03:00
ManishMadan2882
192ce468b7 inline responsive module styles 2024-02-28 19:31:36 +05:30
Anush008
75c0cadb50 feat: Qdrant vector store 2024-02-28 11:49:15 +05:30
ManishMadan2882
5d578d4b3b preparing for npm publish 2024-02-27 21:31:08 +05:30
Alex
325a8889ab update url 2024-02-27 11:52:51 +00:00
ManishMadan2882
9cdd78e68c purge out dist, update gitignore 2024-02-27 15:51:05 +05:30
ManishMadan2882
3a6770a1ae preparing build 2024-02-26 21:10:22 +05:30
ManishMadan2882
8073924056 padding improved at the edges 2024-02-26 20:46:46 +05:30
ManishMadan2882
7b53e1c54b UI enhancement, scroll fix 2024-02-26 20:10:00 +05:30
Alex
c4c0516820 add endpoint 2024-02-26 14:31:54 +00:00
Alex
8d36f8850e Merge pull request #860 from arc53/Fix-ingestion-grouping
Fixing ingestion metadata grouping
2024-02-26 10:16:37 +00:00
ManishMadan2882
abe5f43f3d adding responsive markdown response, error alert 2024-02-26 03:15:31 +05:30
Pavel
c8d8a8d0b5 Fixing ingestion metadata grouping 2024-02-25 16:03:18 +03:00
ManishMadan2882
f60e88573a refactored UI strategy, added prompt response in chat box 2024-02-24 21:02:28 +05:30
Alex
4216671ea2 Update README.md 2024-02-24 12:28:31 +00:00
Alex
ee3ea7a970 Add wget and unzip packages to Dockerfile 2024-02-23 21:19:04 +00:00
Alex
2b644dbb01 Add Rust toolchain and download mpnet-base-v2.zip model 2024-02-23 21:15:26 +00:00
ManishMadan2882
63878e7ffd inititated shadcn 2024-02-19 04:14:09 +05:30
Alex
007cd6cff1 Add conversations to db.json 2024-02-18 19:33:45 +00:00
Alex
4375215baa Update port number in Dockerfile and server.js 2024-02-18 19:12:58 +00:00
Alex
8cc5e9db13 Merge pull request #856 from ManishMadan2882/main
(mock) adding prompt routes
2024-02-16 11:22:40 +00:00
ManishMadan2882
5685f831a7 (mock) adding prompt routes 2024-02-15 05:35:34 +05:30
Alex
0cb3d12d94 Refactor loader classes to accept inputs directly 2024-02-14 15:17:56 +00:00
Alex
0e38c6751b Merge pull request #854 from ManishMadan2882/main
Message Streaming with the Mock Server
2024-02-14 13:50:15 +00:00
ManishMadan2882
70ad1fb3d8 Merge branch 'main' of https://github.com/manishMadan2882/docsgpt 2024-02-14 18:50:02 +05:30
ManishMadan2882
44f27d91a0 purge console logs 2024-02-14 18:48:43 +05:30
Manish Madan
1bb559c285 Merge branch 'arc53:main' into main 2024-02-14 18:40:24 +05:30
ManishMadan2882
7a005ef126 streamed the sample response /stream 2024-02-14 18:39:21 +05:30
Pavel
030c2a740f upload_remote class 2024-02-13 23:41:36 +03:00
Alex
5dcde67ae9 Merge pull request #852 from arc53/feat/premaillm
fix: docsgpt provider
2024-02-13 15:20:05 +00:00
Alex
ee06fa85f1 fix: docsgpt provider 2024-02-13 15:06:52 +00:00
Alex
5b9352a946 Merge pull request #851 from arc53/feat/premaillm
Add PremAI LLM implementation
2024-02-13 14:14:20 +00:00
Alex
b7927d8d75 Add PremAI LLM implementation 2024-02-13 14:08:55 +00:00
Alex
c144f30606 Merge pull request #850 from ManishMadan2882/feature/remote-loads
adding remote uploads tab
2024-02-12 23:46:30 +00:00
ManishMadan2882
d2dba3a0db adding remote uploads tab 2024-02-13 01:53:25 +05:30
Alex
2c991583ff Merge pull request #848 from ManishMadan2882/main
Makes input field absolute in Conversation, fixes delete icon in Settings/Documents
2024-02-09 14:20:02 +00:00
Alex
2e14dec12d Merge pull request #849 from arc53/main
Sync
2024-02-09 14:05:39 +00:00
ManishMadan2882
8826f0ff3c slight UI improvements in input box 2024-02-09 19:17:26 +05:30
ManishMadan2882
9129f7fb33 fix(Conversation): input box UI 2024-02-09 19:12:48 +05:30
ManishMadan2882
c0ed54406f fix(settings): delete button 2024-02-09 18:04:24 +05:30
Alex
18be257e10 Merge pull request #847 from ManishMadan2882/main
Fix : error on changing conversation while streaming answer
2024-02-07 18:00:12 +00:00
ManishMadan2882
615d549494 slight fixes, checking for null case 2024-02-07 05:09:12 +05:30
ManishMadan2882
0ce39e7f52 purge logs and !need code 2024-02-07 05:04:16 +05:30
ManishMadan2882
3c68cbc955 fix(stream err on changing conversation) 2024-02-07 04:42:39 +05:30
ManishMadan2882
300430e2d5 fixes weird bug- dark theme hook 2024-02-06 05:17:43 +05:30
Alex
166a07732a Merge pull request #820 from Quentium-Forks/main
Bump dependencies & support next 14 for docs
2024-02-05 15:13:40 +00:00
Alex
510b517270 Merge pull request #844 from ManishMadan2882/main
Fix: Sidebar Icons update on changing theme
2024-02-01 09:55:50 +00:00
ManishMadan2882
dea385384a fixes, update Nav images on theme toggle 2024-02-01 03:43:05 +05:30
ManishMadan2882
7a1c9101b2 add custom hook for dark theme 2024-02-01 03:42:09 +05:30
Alex
2be523cf77 Fix handling of embeddings_key in api_search() function 2024-01-30 17:22:33 +00:00
Alex
c01e334487 Merge pull request #843 from larinam/fix_application
Fix application + script requirements.txt
2024-01-29 20:59:36 +00:00
Alex
a2418d1373 Add sentence-transformers library to requirements.txt and comment out model_name in base.py 2024-01-29 20:51:28 +00:00
Alex
a697248b26 Merge pull request #841 from ManishMadan2882/main
Message bubble responsiveness
2024-01-29 13:55:29 +00:00
ManishMadan2882
6058939c00 change size in copy, like , dislike icons 2024-01-29 19:10:03 +05:30
Anton Larin
318de530e3 fix openapi-parser requirement 2024-01-27 16:52:33 +01:00
Anton Larin
9e04b7796a application folder related changes:
* optimize content of requirements.txt
* upgrade libs
* fix imports
2024-01-27 16:25:19 +01:00
Anton Larin
e8099c4db5 script folder related changes:
* optmize content of requirements.txt
* upgrade libs
* fix imports
2024-01-27 14:58:08 +01:00
Alex
bf808811cc Update README.md 2024-01-26 12:21:09 +00:00
ManishMadan2882
f0293de1b9 ui adjustments 2024-01-26 03:06:15 +05:30
ManishMadan2882
810dcb90ce refactored the divs, prevent overlap 2024-01-26 02:47:51 +05:30
ManishMadan2882
a2f2b8fabc make responsive msg bubble 2024-01-26 02:33:50 +05:30
Alex
cbc5f47786 Merge pull request #837 from ManishMadan2882/main
Adding Dark Mode
2024-01-23 14:59:22 +00:00
ManishMadan2882
3e3886ced7 slight UI changes 2024-01-23 19:38:22 +05:30
ManishMadan2882
9ce39fd2ba made borders in settings a bit darker 2024-01-22 18:22:46 +05:30
ManishMadan2882
5b08cdedf0 revert changes in docker yaml 2024-01-22 16:22:19 +05:30
ManishMadan2882
67e4d40c49 added dark mode, About page and bubble icons 2024-01-22 16:11:26 +05:30
ManishMadan2882
537a733157 add dark mode - conversation, bubble, UI fixes 2024-01-22 02:56:07 +05:30
ManishMadan2882
5136e7726d added dark mode, Hero component 2024-01-21 17:18:23 +05:30
Alex
6e236ba74d Merge pull request #827 from Juneezee/vite-5
Upgrade to Vite 5
2024-01-19 14:01:40 +00:00
Eng Zer Jun
374b665089 Upgrade to Vite 5
This commit upgrades vite to the latest version 5, and also updates the
vite plugins to the latest version.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2024-01-19 21:34:28 +08:00
ManishMadan2882
ffecc9a0c7 add dark mode, in Settings 2024-01-19 19:01:13 +05:30
ManishMadan2882
0b997418d3 add dark - sidebar 2024-01-19 01:47:23 +05:30
ManishMadan2882
eaad8a4cf5 initialising dark mode 2024-01-18 02:39:40 +05:30
Alex
396b4595f4 Merge pull request #832 from ArnabBCA/main
Fixed Empty Document Name Upload
2024-01-16 17:07:02 +00:00
Arnab Ghosh
0752aae9ef Fixed Empty Document Name Upload 2024-01-16 15:35:48 +05:30
Alex
ad2221a677 Merge pull request #830 from arc53/feature/search-endpoint
Search docs not inside the /stream in the stream request
2024-01-15 16:55:53 +00:00
Alex
1713d693b1 Merge pull request #831 from ManishMadan2882/feature/search-endpoint
integrate /api/search endpoint, get sources post stream
2024-01-15 16:34:48 +00:00
ManishMadan2882
f4f056449f integrate /api/search endpoint, get sources post stream 2024-01-15 20:23:18 +05:30
Alex
6a70e3e45b Commented out unused code in api_search function 2024-01-12 14:39:17 +00:00
Alex
a04cdee33f Refactor source log generation in complete_stream function 2024-01-12 14:38:15 +00:00
Alex
157769eeb4 Add API endpoint for searching documents 2024-01-12 14:35:23 +00:00
Alex
667b66b926 Merge pull request #825 from arc53/feat/mongodb
Public LLM
2024-01-09 14:31:40 +00:00
Alex
c0f7b344d9 Update environment variables and installation instructions 2024-01-09 12:35:18 +00:00
Alex
060c59e97d Update mpnet-base-v2.zip download URL 2024-01-09 11:41:25 +00:00
Alex
b3461b7134 Add MPNet model and update vector store for Hugging Face embeddings 2024-01-09 11:39:32 +00:00
Pavel
001c450abb choice text 2024-01-09 13:05:16 +03:00
Pavel
ceaa5763d4 choice fix 2024-01-09 12:57:19 +03:00
Alex
b45fd58944 Update EMBEDDINGS_NAME in settings.py and test_vector_store.py 2024-01-09 00:34:04 +00:00
Alex
b3149def82 Update EMBEDDINGS_NAME in settings.py 2024-01-09 00:29:02 +00:00
Alex
378d498402 Remove unused imports in docsgpt_provider.py 2024-01-09 00:19:49 +00:00
Alex
98f52b32a3 Update README and Quickstart guide 2024-01-09 00:18:04 +00:00
Alex
0ab32a6f84 Update setup.sh script with new options for language model usage 2024-01-09 00:07:37 +00:00
Alex
71cc22325d Add application files and update setup script 2024-01-09 00:05:44 +00:00
Alex
e1b2991aa6 Update LLM_NAME and EMBEDDINGS_NAME 2024-01-09 00:01:31 +00:00
Alex
033bcf80d0 docsgpt llm provider 2024-01-08 23:35:37 +00:00
Alex
103118d558 Merge pull request #823 from ManishMadan2882/main
fix distortion on different browsers
2024-01-08 11:30:30 +00:00
ManishMadan2882
f91b5fa004 fix distortion on different browsers 2024-01-08 16:03:41 +05:30
Alex
7179bf7b67 Merge pull request #822 from arc53/feat/mongodb
Mongodb integration as vectorstore
2024-01-06 18:30:25 +00:00
Alex
a3e6239e6e fix: remove import 2024-01-06 18:23:20 +00:00
Alex
1fa12e56c6 Remove unused test cases in test_openai.py 2024-01-06 18:04:50 +00:00
Alex
4ff834de76 Refactor MongoDBVectorStore and add delete_index method 2024-01-06 17:59:01 +00:00
QuentiumYT
6db38ad769 Bump dependencies & support next 14 for docs
- Renamed _app.js to mdx (for Next 14)
- Lint next config file & package.json
2024-01-05 18:50:49 +01:00
Alex
293b7b09a9 init tests 2024-01-05 17:16:16 +00:00
Alex
d5945f9ee7 Update README.md 2024-01-05 13:58:22 +00:00
Alex
d1f5a6fc31 Merge pull request #816 from ManishMadan2882/main
adding responsive sidebar
2024-01-04 20:51:30 +00:00
ManishMadan2882
e7b9f5e4c3 adding responsive sidebar 2024-01-05 01:50:52 +05:30
Alex
7870749077 fix openai 2024-01-03 12:09:05 +00:00
Alex
c5352f443a Merge pull request #813 from CBID2/making-alt-text-less-redundant
fix: Making alt text less redundant
2024-01-03 10:52:08 +00:00
Christine
fd8b7aa0f2 fix: change alt text for setting 2024-01-03 04:17:53 +00:00
Christine
458ea266ec fix: change name to alt text for Discord and GitHub 2024-01-03 04:14:06 +00:00
Alex
9748eaba25 Merge pull request #811 from Rutam21/patch-2
Added new Deployment Guide for Kamatera Performance Cloud
2023-12-31 15:37:32 +00:00
Alex
887a3740b2 Update holopin.yml 2023-12-31 15:34:55 +00:00
Rutam Prita Mishra
2e7cfe9cd7 Added new Deployment Guide
This PR adds a new deployment guide for Kamatera Performance Cloud.
2023-12-26 16:28:57 +05:30
Alex
6dbe156a02 Update README.md 2023-12-25 11:52:37 +00:00
Alex
2a9ef6d48e Merge pull request #792 from arc53/dependabot/npm_and_yarn/frontend/vite-4.5.1
Bump vite from 4.5.0 to 4.5.1 in /frontend
2023-12-22 15:51:11 +00:00
Alex
6717ddbd0b Merge pull request #804 from arc53/dependabot/npm_and_yarn/docs/next-13.5.1
Bump next from 13.4.19 to 13.5.1 in /docs
2023-12-22 15:50:49 +00:00
Alex
47c1aab064 Merge pull request #793 from arc53/dependabot/npm_and_yarn/extensions/react-widget/vite-4.4.12
Bump vite from 4.4.9 to 4.4.12 in /extensions/react-widget
2023-12-22 15:50:08 +00:00
Alex
eda41658b9 Merge pull request #806 from arc53/cve/py-removal
fix: Remove py==1.11.0 from requirements.txt
2023-12-22 15:47:45 +00:00
Alex
7f79363944 fix: Remove py==1.11.0 from requirements.txt 2023-12-22 15:44:39 +00:00
Alex
25967f2a09 Merge pull request #805 from arc53/fix/scripts-vulnerabilities
fix: vulns
2023-12-22 15:35:37 +00:00
Alex
4d3963ad67 fix: vulns 2023-12-22 15:27:23 +00:00
dependabot[bot]
f78c5257dc Bump next from 13.4.19 to 13.5.1 in /docs
Bumps [next](https://github.com/vercel/next.js) from 13.4.19 to 13.5.1.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v13.4.19...v13.5.1)

---
updated-dependencies:
- dependency-name: next
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-22 14:14:27 +00:00
Alex
ccc6234ac8 Merge pull request #803 from arc53/fix/dependency-upgrades
fix: cve upgrades
2023-12-22 14:13:41 +00:00
Alex
c81b0200eb fix: Update pydantic_settings version to 2.1.0 2023-12-22 14:08:03 +00:00
Alex
f039d37c8a fix: pydantic 2023-12-22 14:03:43 +00:00
Alex
237975bfef fix: cve upgrades 2023-12-22 13:25:57 +00:00
Alex
015bc7c8c3 hotfix source doc data 2023-12-22 12:10:35 +00:00
Alex
3da2a00ee9 Merge pull request #801 from Victorivus/bug/fix-#800-mistral_models
Update requirements.txt HF Transformers
2023-12-14 15:29:28 +00:00
Victorivus
16eca5bebf Update requirements.txt HF Transformers
Fix 'mistral' models missing
2023-12-12 15:16:31 +01:00
dependabot[bot]
f8ac5e0af3 Bump vite from 4.4.9 to 4.4.12 in /extensions/react-widget
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.4.9 to 4.4.12.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v4.4.12/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v4.4.12/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-06 01:52:56 +00:00
dependabot[bot]
eb48a153d9 Bump vite from 4.5.0 to 4.5.1 in /frontend
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.5.0 to 4.5.1.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v4.5.1/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v4.5.1/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-05 23:44:40 +00:00
Pavel
381a2740ee change input 2023-10-13 21:52:56 +04:00
Alex
8b3b16bce4 inputs 2023-10-13 08:46:35 +01:00
Pavel
024674eef3 List check 2023-10-13 11:42:42 +04:00
Pavel
b7d88b4c0f fix wrong link 2023-10-12 19:45:36 +04:00
Pavel
719ca63ec1 fixes 2023-10-12 19:40:23 +04:00
Pavel
2cfb416fd0 Desc loader 2023-10-12 13:44:32 +04:00
Pavel
50f07f9ef5 limit crawler 2023-10-12 12:53:33 +04:00
Pavel
c517bdd2e1 Crawler + sitemap 2023-10-12 12:35:26 +04:00
Pavel
658867cb46 No crawler, no sitemap 2023-10-12 01:03:40 +04:00
Alex
8f2ad38503 tests 2023-10-11 10:13:51 +01:00
265 changed files with 31935 additions and 11186 deletions

View File

@@ -1,4 +1,5 @@
API_KEY=<LLM api key (for example, open ai key)>
LLM_NAME=docsgpt
VITE_API_STREAMING=true
#For Azure (you can delete it if you don't use Azure)

15
.github/dependabot.yml vendored Normal file
View File

@@ -0,0 +1,15 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
version: 2
updates:
- package-ecosystem: "pip" # See documentation for possible values
directory: "/application" # Location of package manifests
schedule:
interval: "weekly"
- package-ecosystem: "npm" # See documentation for possible values
directory: "/frontend" # Location of package manifests
schedule:
interval: "weekly"

6
.github/holopin.yml vendored
View File

@@ -1,5 +1,5 @@
organization: arc53
defaultSticker: cln9dm7qz164460gk5ksrgr034
defaultSticker: clqmdf0ed34290glbvqh0kzxd
stickers:
- id: cln9dm7qz164460gk5ksrgr034
alias: hacktober
- id: clqmdf0ed34290glbvqh0kzxd
alias: festive

View File

@@ -13,7 +13,6 @@ jobs:
permissions:
contents: read
packages: write
steps:
- uses: actions/checkout@v3
@@ -36,7 +35,6 @@ jobs:
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
# Runs a single command using the runners shell
- name: Build and push Docker images to docker.io and ghcr.io
uses: docker/build-push-action@v4
with:

View File

@@ -8,11 +8,11 @@ on:
jobs:
deploy:
if: github.repository == 'arc53/DocsGPT'
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- uses: actions/checkout@v3
@@ -40,7 +40,7 @@ jobs:
uses: docker/build-push-action@v4
with:
file: './frontend/Dockerfile'
platforms: linux/amd64
platforms: linux/amd64, linux/arm64
context: ./frontend
push: true
tags: |

View File

@@ -4,6 +4,7 @@ on:
- pull_request_target
jobs:
triage:
if: github.repository == 'arc53/DocsGPT'
permissions:
contents: read
pull-requests: write

View File

@@ -6,7 +6,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.9", "3.10", "3.11"]
python-version: ["3.11"]
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
@@ -21,7 +21,7 @@ jobs:
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
- name: Test with pytest and generate coverage report
run: |
python -m pytest --cov=application --cov=scripts --cov=extensions --cov-report=xml
python -m pytest --cov=application --cov-report=xml
- name: Upload coverage reports to Codecov
if: github.event_name == 'pull_request' && matrix.python-version == '3.11'
uses: codecov/codecov-action@v3

4
.gitignore vendored
View File

@@ -75,6 +75,7 @@ target/
# Jupyter Notebook
.ipynb_checkpoints
**/*.ipynb
# IPython
profile_default/
@@ -171,4 +172,5 @@ application/vectors/
node_modules/
.vscode/settings.json
models/
/models/
model/

16
.vscode/launch.json vendored Normal file
View File

@@ -0,0 +1,16 @@
{
"version": "0.2.0",
"configurations": [
{
"name": "Docker Debug Frontend",
"request": "launch",
"type": "chrome",
"preLaunchTask": "docker-compose: debug:frontend",
"url": "http://127.0.0.1:5173",
"webRoot": "${workspaceFolder}/frontend",
"skipFiles": [
"<node_internals>/**"
]
}
]
}

21
.vscode/tasks.json vendored Normal file
View File

@@ -0,0 +1,21 @@
{
"version": "2.0.0",
"tasks": [
{
"type": "docker-compose",
"label": "docker-compose: debug:frontend",
"dockerCompose": {
"up": {
"detached": true,
"services": [
"frontend"
],
"build": true
},
"files": [
"${workspaceFolder}/docker-compose.yaml"
]
}
}
]
}

37
HACKTOBERFEST.md Normal file
View File

@@ -0,0 +1,37 @@
# **🎉 Join the Hacktoberfest with DocsGPT and win a Free T-shirt and other prizes! 🎉**
Welcome, contributors! We're excited to announce that DocsGPT is participating in Hacktoberfest. Get involved by submitting meaningful pull requests.
All contributors with accepted PRs will receive a cool Holopin! 🤩 (Watch out for a reply in your PR to collect it).
### 🏆 Top 50 contributors will recieve a special T-shirt
### 🏆 [LLM Document analysis by LexEU competition](https://github.com/arc53/DocsGPT/blob/main/lexeu-competition.md):
A separate competition is available for those sumbit best new retrieval / workflow method that will analyze a Document using EU laws.
With 200$, 100$, 50$ prize for 1st, 2nd and 3rd place respectively.
You can find more information [here](https://github.com/arc53/DocsGPT/blob/main/lexeu-competition.md)
## 📜 Here's How to Contribute:
```text
🛠️ Code: This is the golden ticket! Make meaningful contributions through PRs.
🧩 API extention: Build an app utilising DocsGPT API. We prefer submissions that showcase original ideas and turn the API into an AI agent.
Non-Code Contributions:
📚 Wiki: Improve our documentation, Create a guide or change existing documentation.
🖥️ Design: Improve the UI/UX or design a new feature.
📝 Blogging or Content Creation: Write articles or create videos to showcase DocsGPT or highlight your contributions!
```
### 📝 Guidelines for Pull Requests:
- Familiarize yourself with the current contributions and our [Roadmap](https://github.com/orgs/arc53/projects/2).
- Before contributing we highly advise that you check existing [issues](https://github.com/arc53/DocsGPT/issues) or [create](https://github.com/arc53/DocsGPT/issues/new/choose) an issue and wait to get assigned.
- Once you are finished with your contribution, please fill in this [form](https://airtable.com/appikMaJwdHhC1SDP/pagoblCJ9W29wf6Hf/form).
- Refer to the [Documentation](https://docs.docsgpt.cloud/).
- Feel free to join our [Discord](https://discord.gg/n5BX8dh8rU) server. We're here to help newcomers, so don't hesitate to jump in! Join us [here](https://discord.gg/n5BX8dh8rU).
Thank you very much for considering contributing to DocsGPT during Hacktoberfest! 🙏 Your contributions (not just simple typo) could earn you a stylish new t-shirt and other prizes as a token of our appreciation. 🎁 Join us, and let's code together! 🚀

View File

@@ -7,9 +7,9 @@
</p>
<p align="left">
<strong><a href="https://docsgpt.arc53.com/">DocsGPT</a></strong> is a cutting-edge open-source solution that streamlines the process of finding information in the project documentation. With its integration of the powerful <strong>GPT</strong> models, developers can easily ask questions about a project and receive accurate answers.
<strong><a href="https://www.docsgpt.cloud/">DocsGPT</a></strong> is a cutting-edge open-source solution that streamlines the process of finding information in the project documentation. With its integration of the powerful <strong>GPT</strong> models, developers can easily ask questions about a project and receive accurate answers.
Say goodbye to time-consuming manual searches, and let <strong><a href="https://docsgpt.arc53.com/">DocsGPT</a></strong> help you quickly find the information you need. Try it out and see how it revolutionizes your project documentation experience. Contribute to its development and be a part of the future of AI-powered assistance.
Say goodbye to time-consuming manual searches, and let <strong><a href="https://www.docsgpt.cloud/">DocsGPT</a></strong> help you quickly find the information you need. Try it out and see how it revolutionizes your project documentation experience. Contribute to its development and be a part of the future of AI-powered assistance.
</p>
<div align="center">
@@ -18,16 +18,20 @@ Say goodbye to time-consuming manual searches, and let <strong><a href="https://
<a href="https://github.com/arc53/DocsGPT">![link to main GitHub showing Forks number](https://img.shields.io/github/forks/arc53/docsgpt?style=social)</a>
<a href="https://github.com/arc53/DocsGPT/blob/main/LICENSE">![link to license file](https://img.shields.io/github/license/arc53/docsgpt)</a>
<a href="https://discord.gg/n5BX8dh8rU">![link to discord](https://img.shields.io/discord/1070046503302877216)</a>
<a href="https://twitter.com/ATushynski">![X (formerly Twitter) URL](https://img.shields.io/twitter/follow/ATushynski)</a>
<a href="https://twitter.com/docsgptai">![X (formerly Twitter) URL](https://img.shields.io/twitter/follow/docsgptai)</a>
</div>
### 🎃 [Hacktoberfest Prizes, Rules & Q&A](https://github.com/arc53/DocsGPT/blob/main/HACKTOBERFEST.md) 🎃
### Our [Livestream to Dive into Hacktoberfest! Prizes, Rules & Q&A 🎉](https://www.youtube.com/watch?v=5QQaFFu9BC8) on 3rd of October
### Production Support / Help for Companies:
We're eager to provide personalized assistance when deploying your DocsGPT to a live environment.
- [Book Demo :wave:](https://airtable.com/appdeaL0F1qV8Bl2C/shrrJF1Ll7btCJRbP)
- [Book Enterprise / teams Demo :wave:](https://cal.com/arc53/docsgpt-demo-b2b?date=2024-09-27&month=2024-09)
- [Send Email :email:](mailto:contact@arc53.com?subject=DocsGPT%20support%2Fsolutions)
![video-example-of-docs-gpt](https://d3dg1063dc54p9.cloudfront.net/videos/demov3.gif)
@@ -40,29 +44,29 @@ You can find our roadmap [here](https://github.com/orgs/arc53/projects/2). Pleas
| Name | Base Model | Requirements (or similar) |
| --------------------------------------------------------------------- | ----------- | ------------------------- |
| [Docsgpt-7b-falcon](https://huggingface.co/Arc53/docsgpt-7b-falcon) | Falcon-7b | 1xA10G gpu |
| [Docsgpt-7b-mistral](https://huggingface.co/Arc53/docsgpt-7b-mistral) | Mistral-7b | 1xA10G gpu |
| [Docsgpt-14b](https://huggingface.co/Arc53/docsgpt-14b) | llama-2-14b | 2xA10 gpu's |
| [Docsgpt-40b-falcon](https://huggingface.co/Arc53/docsgpt-40b-falcon) | falcon-40b | 8xA10G gpu's |
If you don't have enough resources to run it, you can use bitsnbytes to quantize.
## Features
## End to End AI Framework for Information Retrieval
![Main features of DocsGPT showcasing six main features](https://user-images.githubusercontent.com/17906039/220427472-2644cff4-7666-46a5-819f-fc4a521f63c7.png)
![Architecture chart](https://github.com/user-attachments/assets/fc6a7841-ddfc-45e6-b5a0-d05fe648cbe2)
## Useful Links
- :mag: :fire: [Live preview](https://docsgpt.arc53.com/)
- :mag: :fire: [Cloud Version](https://app.docsgpt.cloud/)
- :speech_balloon: :tada: [Join our Discord](https://discord.gg/n5BX8dh8rU)
- :books: :sunglasses: [Guides](https://docs.docsgpt.co.uk/)
- :books: :sunglasses: [Guides](https://docs.docsgpt.cloud/)
- :couple: [Interested in contributing?](https://github.com/arc53/DocsGPT/blob/main/CONTRIBUTING.md)
- :file_folder: :rocket: [How to use any other documentation](https://docs.docsgpt.co.uk/Guides/How-to-train-on-other-documentation)
- :file_folder: :rocket: [How to use any other documentation](https://docs.docsgpt.cloud/Guides/How-to-train-on-other-documentation)
- :house: :closed_lock_with_key: [How to host it locally (so all data will stay on-premises)](https://docs.docsgpt.co.uk/Guides/How-to-use-different-LLM)
- :house: :closed_lock_with_key: [How to host it locally (so all data will stay on-premises)](https://docs.docsgpt.cloud/Guides/How-to-use-different-LLM)
## Project Structure
@@ -83,17 +87,18 @@ On Mac OS or Linux, write:
`./setup.sh`
It will install all the dependencies and allow you to download the local model or use OpenAI.
It will install all the dependencies and allow you to download the local model, use OpenAI or use our LLM API.
Otherwise, refer to this Guide:
Otherwise, refer to this Guide for Windows:
1. Download and open this repository with `git clone https://github.com/arc53/DocsGPT.git`
2. Create a `.env` file in your root directory and set the env variable `API_KEY` with your [OpenAI API key](https://platform.openai.com/account/api-keys) and `VITE_API_STREAMING` to true or false, depending on whether you want streaming answers or not.
2. Create a `.env` file in your root directory and set the env variables and `VITE_API_STREAMING` to true or false, depending on whether you want streaming answers or not.
It should look like this inside:
```
API_KEY=Yourkey
LLM_NAME=[docsgpt or openai or others]
VITE_API_STREAMING=true
API_KEY=[if LLM_NAME is openai]
```
See optional environment variables in the [/.env-template](https://github.com/arc53/DocsGPT/blob/main/.env-template) and [/application/.env_sample](https://github.com/arc53/DocsGPT/blob/main/application/.env_sample) files.
@@ -122,8 +127,8 @@ docker compose -f docker-compose-dev.yaml up -d
> [!Note]
> Make sure you have Python 3.10 or 3.11 installed.
1. Export required environment variables or prepare a `.env` file in the `/application` folder:
- Copy [.env_sample](https://github.com/arc53/DocsGPT/blob/main/application/.env_sample) and create `.env` with your OpenAI API token for the `API_KEY` and `EMBEDDINGS_KEY` fields.
1. Export required environment variables or prepare a `.env` file in the project folder:
- Copy [.env_sample](https://github.com/arc53/DocsGPT/blob/main/application/.env_sample) and create `.env`.
(check out [`application/core/settings.py`](application/core/settings.py) if you want to see more config options.)
@@ -144,14 +149,23 @@ python -m venv venv
venv/Scripts/activate
```
3. Change to the `application/` subdir by the command `cd application/` and install dependencies for the backend:
3. Download embedding model and save it in the `model/` folder:
You can use the script below, or download it manually from [here](https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip), unzip it and save it in the `model/` folder.
```commandline
wget https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip
unzip mpnet-base-v2.zip -d model
rm mpnet-base-v2.zip
```
4. Install dependencies for the backend:
```commandline
pip install -r application/requirements.txt
```
4. Run the app using `flask --app application/app.py run --host=0.0.0.0 --port=7091`.
5. Start worker with `celery -A application.app.celery worker -l INFO`.
5. Run the app using `flask --app application/app.py run --host=0.0.0.0 --port=7091`.
6. Start worker with `celery -A application.app.celery worker -l INFO`.
### Start Frontend

14
SECURITY.md Normal file
View File

@@ -0,0 +1,14 @@
# Security Policy
## Supported Versions
Supported Versions:
Currently, we support security patches by committing changes and bumping the version published on Github.
## Reporting a Vulnerability
Found a vulnerability? Please email us:
security@arc53.com

View File

@@ -1,23 +1,88 @@
FROM python:3.10-slim-bullseye as builder
# Builder Stage
FROM ubuntu:24.04 as builder
# Tiktoken requires Rust toolchain, so build it in a separate stage
RUN apt-get update && apt-get install -y gcc curl
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y && apt-get install --reinstall libc6-dev -y
ENV PATH="/root/.cargo/bin:${PATH}"
RUN pip install --upgrade pip && pip install tiktoken==0.3.3
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get install -y software-properties-common && \
add-apt-repository ppa:deadsnakes/ppa && \
# Install necessary packages and Python
apt-get update && \
apt-get install -y --no-install-recommends gcc wget unzip libc6-dev python3.11 python3.11-distutils python3.11-venv && \
rm -rf /var/lib/apt/lists/*
# Verify Python installation and setup symlink
RUN if [ -f /usr/bin/python3.11 ]; then \
ln -s /usr/bin/python3.11 /usr/bin/python; \
else \
echo "Python 3.11 not found"; exit 1; \
fi
# Download and unzip the model
RUN wget https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip && \
unzip mpnet-base-v2.zip -d model && \
rm mpnet-base-v2.zip
# Install Rust
RUN wget -q -O - https://sh.rustup.rs | sh -s -- -y
# Clean up to reduce container size
RUN apt-get remove --purge -y wget unzip && apt-get autoremove -y && rm -rf /var/lib/apt/lists/*
# Copy requirements.txt
COPY requirements.txt .
RUN pip install -r requirements.txt
FROM python:3.10-slim-bullseye
# Setup Python virtual environment
RUN python3.11 -m venv /venv
# Copy pre-built packages and binaries from builder stage
COPY --from=builder /usr/local/ /usr/local/
# Activate virtual environment and install Python packages
ENV PATH="/venv/bin:$PATH"
# Install Python packages
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir tiktoken && \
pip install --no-cache-dir -r requirements.txt
# Final Stage
FROM ubuntu:24.04 as final
RUN apt-get update && \
apt-get install -y software-properties-common && \
add-apt-repository ppa:deadsnakes/ppa && \
# Install Python
apt-get update && apt-get install -y --no-install-recommends python3.11 && \
ln -s /usr/bin/python3.11 /usr/bin/python && \
rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
COPY . /app/application
ENV FLASK_APP=app.py
ENV FLASK_DEBUG=true
# Create a non-root user: `appuser` (Feel free to choose a name)
RUN groupadd -r appuser && \
useradd -r -g appuser -d /app -s /sbin/nologin -c "Docker image user" appuser
# Copy the virtual environment and model from the builder stage
COPY --from=builder /venv /venv
COPY --from=builder /model /app/model
# Copy your application code
COPY . /app/application
# Change the ownership of the /app directory to the appuser
RUN mkdir -p /app/application/inputs/local
RUN chown -R appuser:appuser /app
# Set environment variables
ENV FLASK_APP=app.py \
FLASK_DEBUG=true \
PATH="/venv/bin:$PATH"
# Expose the port the app runs on
EXPOSE 7091
CMD ["gunicorn", "-w", "2", "--timeout", "120", "--bind", "0.0.0.0:7091", "application.wsgi:app"]
# Switch to non-root user
USER appuser
# Start Gunicorn
CMD ["gunicorn", "-w", "2", "--timeout", "120", "--bind", "0.0.0.0:7091", "application.wsgi:app"]

View File

@@ -1,42 +1,53 @@
import asyncio
import os
from flask import Blueprint, request, Response
import json
import datetime
import json
import logging
import os
import sys
import traceback
from pymongo import MongoClient
from bson.dbref import DBRef
from bson.objectid import ObjectId
from transformers import GPT2TokenizerFast
from flask import Blueprint, current_app, make_response, request, Response
from flask_restx import fields, Namespace, Resource
from pymongo import MongoClient
from application.core.settings import settings
from application.vectorstore.vector_creator import VectorCreator
from application.llm.llm_creator import LLMCreator
from application.error import bad_request
from application.extensions import api
from application.llm.llm_creator import LLMCreator
from application.retriever.retriever_creator import RetrieverCreator
from application.utils import check_required_fields
logger = logging.getLogger(__name__)
mongo = MongoClient(settings.MONGO_URI)
db = mongo["docsgpt"]
conversations_collection = db["conversations"]
vectors_collection = db["vectors"]
sources_collection = db["sources"]
prompts_collection = db["prompts"]
answer = Blueprint('answer', __name__)
api_key_collection = db["api_keys"]
user_logs_collection = db["user_logs"]
if settings.LLM_NAME == "gpt4":
gpt_model = 'gpt-4'
answer = Blueprint("answer", __name__)
answer_ns = Namespace("answer", description="Answer related operations", path="/")
api.add_namespace(answer_ns)
gpt_model = ""
# to have some kind of default behaviour
if settings.LLM_NAME == "openai":
gpt_model = "gpt-3.5-turbo"
elif settings.LLM_NAME == "anthropic":
gpt_model = 'claude-2'
else:
gpt_model = 'gpt-3.5-turbo'
gpt_model = "claude-2"
if settings.MODEL_NAME: # in case there is particular model name configured
gpt_model = settings.MODEL_NAME
# load the prompts
current_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
current_dir = os.path.dirname(
os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
)
with open(os.path.join(current_dir, "prompts", "chat_combine_default.txt"), "r") as f:
chat_combine_template = f.read()
@@ -47,7 +58,7 @@ with open(os.path.join(current_dir, "prompts", "chat_combine_creative.txt"), "r"
chat_combine_creative = f.read()
with open(os.path.join(current_dir, "prompts", "chat_combine_strict.txt"), "r") as f:
chat_combine_strict = f.read()
chat_combine_strict = f.read()
api_key_set = settings.API_KEY is not None
embeddings_key_set = settings.EMBEDDINGS_KEY is not None
@@ -58,11 +69,6 @@ async def async_generate(chain, question, chat_history):
return result
def count_tokens(string):
tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')
return len(tokenizer(string)['input_ids'])
def run_async_chain(chain, question, chat_history):
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
@@ -75,271 +81,538 @@ def run_async_chain(chain, question, chat_history):
return result
def get_vectorstore(data):
if "active_docs" in data:
if data["active_docs"].split("/")[0] == "default":
vectorstore = ""
elif data["active_docs"].split("/")[0] == "local":
vectorstore = "indexes/" + data["active_docs"]
else:
vectorstore = "vectors/" + data["active_docs"]
if data["active_docs"] == "default":
vectorstore = ""
def get_data_from_api_key(api_key):
data = api_key_collection.find_one({"key": api_key})
# # Raise custom exception if the API key is not found
if data is None:
raise Exception("Invalid API Key, please generate new key", 401)
if "retriever" not in data:
data["retriever"] = None
if "source" in data and isinstance(data["source"], DBRef):
source_doc = db.dereference(data["source"])
data["source"] = str(source_doc["_id"])
if "retriever" in source_doc:
data["retriever"] = source_doc["retriever"]
else:
vectorstore = ""
vectorstore = os.path.join("application", vectorstore)
return vectorstore
data["source"] = {}
return data
def get_retriever(source_id: str):
doc = sources_collection.find_one({"_id": ObjectId(source_id)})
if doc is None:
raise Exception("Source document does not exist", 404)
retriever_name = None if "retriever" not in doc else doc["retriever"]
return retriever_name
def is_azure_configured():
return settings.OPENAI_API_BASE and settings.OPENAI_API_VERSION and settings.AZURE_DEPLOYMENT_NAME
return (
settings.OPENAI_API_BASE
and settings.OPENAI_API_VERSION
and settings.AZURE_DEPLOYMENT_NAME
)
def complete_stream(question, docsearch, chat_history, api_key, prompt_id, conversation_id):
llm = LLMCreator.create_llm(settings.LLM_NAME, api_key=api_key)
if prompt_id == 'default':
prompt = chat_combine_template
elif prompt_id == 'creative':
prompt = chat_combine_creative
elif prompt_id == 'strict':
prompt = chat_combine_strict
else:
prompt = prompts_collection.find_one({"_id": ObjectId(prompt_id)})["content"]
docs = docsearch.search(question, k=2)
if settings.LLM_NAME == "llama.cpp":
docs = [docs[0]]
# join all page_content together with a newline
docs_together = "\n".join([doc.page_content for doc in docs])
p_chat_combine = prompt.replace("{summaries}", docs_together)
messages_combine = [{"role": "system", "content": p_chat_combine}]
source_log_docs = []
for doc in docs:
if doc.metadata:
data = json.dumps({"type": "source", "doc": doc.page_content, "metadata": doc.metadata})
source_log_docs.append({"title": doc.metadata['title'].split('/')[-1], "text": doc.page_content})
else:
data = json.dumps({"type": "source", "doc": doc.page_content})
source_log_docs.append({"title": doc.page_content, "text": doc.page_content})
yield f"data:{data}\n\n"
if len(chat_history) > 1:
tokens_current_history = 0
# count tokens in history
chat_history.reverse()
for i in chat_history:
if "prompt" in i and "response" in i:
tokens_batch = count_tokens(i["prompt"]) + count_tokens(i["response"])
if tokens_current_history + tokens_batch < settings.TOKENS_MAX_HISTORY:
tokens_current_history += tokens_batch
messages_combine.append({"role": "user", "content": i["prompt"]})
messages_combine.append({"role": "system", "content": i["response"]})
messages_combine.append({"role": "user", "content": question})
response_full = ""
completion = llm.gen_stream(model=gpt_model, engine=settings.AZURE_DEPLOYMENT_NAME,
messages=messages_combine)
for line in completion:
data = json.dumps({"answer": str(line)})
response_full += str(line)
yield f"data: {data}\n\n"
# save conversation to database
if conversation_id is not None:
def save_conversation(conversation_id, question, response, source_log_docs, llm):
if conversation_id is not None and conversation_id != "None":
conversations_collection.update_one(
{"_id": ObjectId(conversation_id)},
{"$push": {"queries": {"prompt": question, "response": response_full, "sources": source_log_docs}}},
{
"$push": {
"queries": {
"prompt": question,
"response": response,
"sources": source_log_docs,
}
}
},
)
else:
# create new conversation
# generate summary
messages_summary = [{"role": "assistant", "content": "Summarise following conversation in no more than 3 "
"words, respond ONLY with the summary, use the same "
"language as the system \n\nUser: " + question + "\n\n" +
"AI: " +
response_full},
{"role": "user", "content": "Summarise following conversation in no more than 3 words, "
"respond ONLY with the summary, use the same language as the "
"system"}]
messages_summary = [
{
"role": "assistant",
"content": "Summarise following conversation in no more than 3 "
"words, respond ONLY with the summary, use the same "
"language as the system \n\nUser: "
+ question
+ "\n\n"
+ "AI: "
+ response,
},
{
"role": "user",
"content": "Summarise following conversation in no more than 3 words, "
"respond ONLY with the summary, use the same language as the "
"system",
},
]
completion = llm.gen(model=gpt_model, engine=settings.AZURE_DEPLOYMENT_NAME,
messages=messages_summary, max_tokens=30)
completion = llm.gen(model=gpt_model, messages=messages_summary, max_tokens=30)
conversation_id = conversations_collection.insert_one(
{"user": "local",
"date": datetime.datetime.utcnow(),
"name": completion,
"queries": [{"prompt": question, "response": response_full, "sources": source_log_docs}]}
{
"user": "local",
"date": datetime.datetime.utcnow(),
"name": completion,
"queries": [
{
"prompt": question,
"response": response,
"sources": source_log_docs,
}
],
}
).inserted_id
# send data.type = "end" to indicate that the stream has ended as json
data = json.dumps({"type": "id", "id": str(conversation_id)})
yield f"data: {data}\n\n"
data = json.dumps({"type": "end"})
yield f"data: {data}\n\n"
return conversation_id
@answer.route("/stream", methods=["POST"])
def stream():
data = request.get_json()
# get parameter from url question
question = data["question"]
history = data["history"]
# history to json object from string
history = json.loads(history)
conversation_id = data["conversation_id"]
if 'prompt_id' in data:
prompt_id = data["prompt_id"]
else:
prompt_id = 'default'
# check if active_docs is set
if not api_key_set:
api_key = data["api_key"]
else:
api_key = settings.API_KEY
if not embeddings_key_set:
embeddings_key = data["embeddings_key"]
else:
embeddings_key = settings.EMBEDDINGS_KEY
if "active_docs" in data:
vectorstore = get_vectorstore({"active_docs": data["active_docs"]})
else:
vectorstore = ""
docsearch = VectorCreator.create_vectorstore(settings.VECTOR_STORE, vectorstore, embeddings_key)
return Response(
complete_stream(question, docsearch,
chat_history=history, api_key=api_key,
prompt_id=prompt_id,
conversation_id=conversation_id), mimetype="text/event-stream"
)
@answer.route("/api/answer", methods=["POST"])
def api_answer():
data = request.get_json()
question = data["question"]
history = data["history"]
if "conversation_id" not in data:
conversation_id = None
else:
conversation_id = data["conversation_id"]
print("-" * 5)
if not api_key_set:
api_key = data["api_key"]
else:
api_key = settings.API_KEY
if not embeddings_key_set:
embeddings_key = data["embeddings_key"]
else:
embeddings_key = settings.EMBEDDINGS_KEY
if 'prompt_id' in data:
prompt_id = data["prompt_id"]
else:
prompt_id = 'default'
if prompt_id == 'default':
def get_prompt(prompt_id):
if prompt_id == "default":
prompt = chat_combine_template
elif prompt_id == 'creative':
elif prompt_id == "creative":
prompt = chat_combine_creative
elif prompt_id == 'strict':
elif prompt_id == "strict":
prompt = chat_combine_strict
else:
prompt = prompts_collection.find_one({"_id": ObjectId(prompt_id)})["content"]
return prompt
def complete_stream(
question, retriever, conversation_id, user_api_key, isNoneDoc=False
):
# use try and except to check for exception
try:
# check if the vectorstore is set
vectorstore = get_vectorstore(data)
# loading the index and the store and the prompt template
# Note if you have used other embeddings than OpenAI, you need to change the embeddings
docsearch = VectorCreator.create_vectorstore(settings.VECTOR_STORE, vectorstore, embeddings_key)
llm = LLMCreator.create_llm(settings.LLM_NAME, api_key=api_key)
docs = docsearch.search(question, k=2)
# join all page_content together with a newline
docs_together = "\n".join([doc.page_content for doc in docs])
p_chat_combine = prompt.replace("{summaries}", docs_together)
messages_combine = [{"role": "system", "content": p_chat_combine}]
response_full = ""
source_log_docs = []
for doc in docs:
if doc.metadata:
source_log_docs.append({"title": doc.metadata['title'].split('/')[-1], "text": doc.page_content})
else:
source_log_docs.append({"title": doc.page_content, "text": doc.page_content})
# join all page_content together with a newline
answer = retriever.gen()
sources = retriever.search()
for source in sources:
if "text" in source:
source["text"] = source["text"][:100].strip() + "..."
if len(sources) > 0:
data = json.dumps({"type": "source", "source": sources})
yield f"data: {data}\n\n"
for line in answer:
if "answer" in line:
response_full += str(line["answer"])
data = json.dumps(line)
yield f"data: {data}\n\n"
elif "source" in line:
source_log_docs.append(line["source"])
if isNoneDoc:
for doc in source_log_docs:
doc["source"] = "None"
if len(history) > 1:
tokens_current_history = 0
# count tokens in history
history.reverse()
for i in history:
if "prompt" in i and "response" in i:
tokens_batch = count_tokens(i["prompt"]) + count_tokens(i["response"])
if tokens_current_history + tokens_batch < settings.TOKENS_MAX_HISTORY:
tokens_current_history += tokens_batch
messages_combine.append({"role": "user", "content": i["prompt"]})
messages_combine.append({"role": "system", "content": i["response"]})
messages_combine.append({"role": "user", "content": question})
completion = llm.gen(model=gpt_model, engine=settings.AZURE_DEPLOYMENT_NAME,
messages=messages_combine)
result = {"answer": completion, "sources": source_log_docs}
logger.debug(result)
# generate conversationId
if conversation_id is not None:
conversations_collection.update_one(
{"_id": ObjectId(conversation_id)},
{"$push": {"queries": {"prompt": question,
"response": result["answer"], "sources": result['sources']}}},
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=user_api_key
)
if user_api_key is None:
conversation_id = save_conversation(
conversation_id, question, response_full, source_log_docs, llm
)
# send data.type = "end" to indicate that the stream has ended as json
data = json.dumps({"type": "id", "id": str(conversation_id)})
yield f"data: {data}\n\n"
else:
# create new conversation
# generate summary
messages_summary = [
{"role": "assistant", "content": "Summarise following conversation in no more than 3 words, "
"respond ONLY with the summary, use the same language as the system \n\n"
"User: " + question + "\n\n" + "AI: " + result["answer"]},
{"role": "user", "content": "Summarise following conversation in no more than 3 words, "
"respond ONLY with the summary, use the same language as the system"}
]
completion = llm.gen(
model=gpt_model,
engine=settings.AZURE_DEPLOYMENT_NAME,
messages=messages_summary,
max_tokens=30
)
conversation_id = conversations_collection.insert_one(
{"user": "local",
"date": datetime.datetime.utcnow(),
"name": completion,
"queries": [{"prompt": question, "response": result["answer"], "sources": source_log_docs}]}
).inserted_id
result["conversation_id"] = str(conversation_id)
# mock result
# result = {
# "answer": "The answer is 42",
# "sources": ["https://en.wikipedia.org/wiki/42_(number)", "https://en.wikipedia.org/wiki/42_(number)"]
# }
return result
retriever_params = retriever.get_params()
user_logs_collection.insert_one(
{
"action": "stream_answer",
"level": "info",
"user": "local",
"api_key": user_api_key,
"question": question,
"response": response_full,
"sources": source_log_docs,
"retriever_params": retriever_params,
"timestamp": datetime.datetime.now(datetime.timezone.utc),
}
)
data = json.dumps({"type": "end"})
yield f"data: {data}\n\n"
except Exception as e:
# print whole traceback
traceback.print_exc()
print(str(e))
return bad_request(500, str(e))
print("\033[91merr", str(e), file=sys.stderr)
data = json.dumps(
{
"type": "error",
"error": "Please try again later. We apologize for any inconvenience.",
"error_exception": str(e),
}
)
yield f"data: {data}\n\n"
return
@answer_ns.route("/stream")
class Stream(Resource):
stream_model = api.model(
"StreamModel",
{
"question": fields.String(
required=True, description="Question to be asked"
),
"history": fields.List(
fields.String, required=False, description="Chat history"
),
"conversation_id": fields.String(
required=False, description="Conversation ID"
),
"prompt_id": fields.String(
required=False, default="default", description="Prompt ID"
),
"selectedDocs": fields.String(
required=False, description="Selected documents"
),
"chunks": fields.Integer(
required=False, default=2, description="Number of chunks"
),
"token_limit": fields.Integer(required=False, description="Token limit"),
"retriever": fields.String(required=False, description="Retriever type"),
"api_key": fields.String(required=False, description="API key"),
"active_docs": fields.String(
required=False, description="Active documents"
),
"isNoneDoc": fields.Boolean(
required=False, description="Flag indicating if no document is used"
),
},
)
@api.expect(stream_model)
@api.doc(description="Stream a response based on the question and retriever")
def post(self):
data = request.get_json()
required_fields = ["question"]
missing_fields = check_required_fields(data, required_fields)
if missing_fields:
return missing_fields
try:
question = data["question"]
history = data.get("history", [])
history = json.loads(history)
conversation_id = data.get("conversation_id")
prompt_id = data.get("prompt_id", "default")
if "selectedDocs" in data and data["selectedDocs"] is None:
chunks = 0
else:
chunks = int(data.get("chunks", 2))
token_limit = data.get("token_limit", settings.DEFAULT_MAX_HISTORY)
retriever_name = data.get("retriever", "classic")
if "api_key" in data:
data_key = get_data_from_api_key(data["api_key"])
chunks = int(data_key.get("chunks", 2))
prompt_id = data_key.get("prompt_id", "default")
source = {"active_docs": data_key.get("source")}
retriever_name = data_key.get("retriever", retriever_name)
user_api_key = data["api_key"]
elif "active_docs" in data:
source = {"active_docs": data["active_docs"]}
retriever_name = get_retriever(data["active_docs"]) or retriever_name
user_api_key = None
else:
source = {}
user_api_key = None
current_app.logger.info(
f"/stream - request_data: {data}, source: {source}",
extra={"data": json.dumps({"request_data": data, "source": source})},
)
prompt = get_prompt(prompt_id)
retriever = RetrieverCreator.create_retriever(
retriever_name,
question=question,
source=source,
chat_history=history,
prompt=prompt,
chunks=chunks,
token_limit=token_limit,
gpt_model=gpt_model,
user_api_key=user_api_key,
)
return Response(
complete_stream(
question=question,
retriever=retriever,
conversation_id=conversation_id,
user_api_key=user_api_key,
isNoneDoc=data.get("isNoneDoc"),
),
mimetype="text/event-stream",
)
except ValueError:
message = "Malformed request body"
print("\033[91merr", str(message), file=sys.stderr)
return Response(
error_stream_generate(message),
status=400,
mimetype="text/event-stream",
)
except Exception as e:
current_app.logger.error(
f"/stream - error: {str(e)} - traceback: {traceback.format_exc()}",
extra={"error": str(e), "traceback": traceback.format_exc()},
)
message = e.args[0]
status_code = 400
# Custom exceptions with two arguments, index 1 as status code
if len(e.args) >= 2:
status_code = e.args[1]
return Response(
error_stream_generate(message),
status=status_code,
mimetype="text/event-stream",
)
def error_stream_generate(err_response):
data = json.dumps({"type": "error", "error": err_response})
yield f"data: {data}\n\n"
@answer_ns.route("/api/answer")
class Answer(Resource):
answer_model = api.model(
"AnswerModel",
{
"question": fields.String(
required=True, description="The question to answer"
),
"history": fields.List(
fields.String, required=False, description="Conversation history"
),
"conversation_id": fields.String(
required=False, description="Conversation ID"
),
"prompt_id": fields.String(
required=False, default="default", description="Prompt ID"
),
"chunks": fields.Integer(
required=False, default=2, description="Number of chunks"
),
"token_limit": fields.Integer(required=False, description="Token limit"),
"retriever": fields.String(required=False, description="Retriever type"),
"api_key": fields.String(required=False, description="API key"),
"active_docs": fields.String(
required=False, description="Active documents"
),
"isNoneDoc": fields.Boolean(
required=False, description="Flag indicating if no document is used"
),
},
)
@api.expect(answer_model)
@api.doc(description="Provide an answer based on the question and retriever")
def post(self):
data = request.get_json()
required_fields = ["question"]
missing_fields = check_required_fields(data, required_fields)
if missing_fields:
return missing_fields
try:
question = data["question"]
history = data.get("history", [])
conversation_id = data.get("conversation_id")
prompt_id = data.get("prompt_id", "default")
chunks = int(data.get("chunks", 2))
token_limit = data.get("token_limit", settings.DEFAULT_MAX_HISTORY)
retriever_name = data.get("retriever", "classic")
if "api_key" in data:
data_key = get_data_from_api_key(data["api_key"])
chunks = int(data_key.get("chunks", 2))
prompt_id = data_key.get("prompt_id", "default")
source = {"active_docs": data_key.get("source")}
retriever_name = data_key.get("retriever", retriever_name)
user_api_key = data["api_key"]
elif "active_docs" in data:
source = {"active_docs": data["active_docs"]}
retriever_name = get_retriever(data["active_docs"]) or retriever_name
user_api_key = None
else:
source = {}
user_api_key = None
prompt = get_prompt(prompt_id)
current_app.logger.info(
f"/api/answer - request_data: {data}, source: {source}",
extra={"data": json.dumps({"request_data": data, "source": source})},
)
retriever = RetrieverCreator.create_retriever(
retriever_name,
question=question,
source=source,
chat_history=history,
prompt=prompt,
chunks=chunks,
token_limit=token_limit,
gpt_model=gpt_model,
user_api_key=user_api_key,
)
source_log_docs = []
response_full = ""
for line in retriever.gen():
if "source" in line:
source_log_docs.append(line["source"])
elif "answer" in line:
response_full += line["answer"]
if data.get("isNoneDoc"):
for doc in source_log_docs:
doc["source"] = "None"
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=user_api_key
)
result = {"answer": response_full, "sources": source_log_docs}
result["conversation_id"] = str(
save_conversation(
conversation_id, question, response_full, source_log_docs, llm
)
)
retriever_params = retriever.get_params()
user_logs_collection.insert_one(
{
"action": "api_answer",
"level": "info",
"user": "local",
"api_key": user_api_key,
"question": question,
"response": response_full,
"sources": source_log_docs,
"retriever_params": retriever_params,
"timestamp": datetime.datetime.now(datetime.timezone.utc),
}
)
except Exception as e:
current_app.logger.error(
f"/api/answer - error: {str(e)} - traceback: {traceback.format_exc()}",
extra={"error": str(e), "traceback": traceback.format_exc()},
)
return bad_request(500, str(e))
return make_response(result, 200)
@answer_ns.route("/api/search")
class Search(Resource):
search_model = api.model(
"SearchModel",
{
"question": fields.String(
required=True, description="The question to search"
),
"chunks": fields.Integer(
required=False, default=2, description="Number of chunks"
),
"api_key": fields.String(
required=False, description="API key for authentication"
),
"active_docs": fields.String(
required=False, description="Active documents for retrieval"
),
"retriever": fields.String(required=False, description="Retriever type"),
"token_limit": fields.Integer(
required=False, description="Limit for tokens"
),
"isNoneDoc": fields.Boolean(
required=False, description="Flag indicating if no document is used"
),
},
)
@api.expect(search_model)
@api.doc(
description="Search for relevant documents based on the question and retriever"
)
def post(self):
data = request.get_json()
required_fields = ["question"]
missing_fields = check_required_fields(data, required_fields)
if missing_fields:
return missing_fields
try:
question = data["question"]
chunks = int(data.get("chunks", 2))
token_limit = data.get("token_limit", settings.DEFAULT_MAX_HISTORY)
retriever_name = data.get("retriever", "classic")
if "api_key" in data:
data_key = get_data_from_api_key(data["api_key"])
chunks = int(data_key.get("chunks", 2))
source = {"active_docs": data_key.get("source")}
user_api_key = data["api_key"]
elif "active_docs" in data:
source = {"active_docs": data["active_docs"]}
user_api_key = None
else:
source = {}
user_api_key = None
current_app.logger.info(
f"/api/answer - request_data: {data}, source: {source}",
extra={"data": json.dumps({"request_data": data, "source": source})},
)
retriever = RetrieverCreator.create_retriever(
retriever_name,
question=question,
source=source,
chat_history=[],
prompt="default",
chunks=chunks,
token_limit=token_limit,
gpt_model=gpt_model,
user_api_key=user_api_key,
)
docs = retriever.search()
retriever_params = retriever.get_params()
user_logs_collection.insert_one(
{
"action": "api_search",
"level": "info",
"user": "local",
"api_key": user_api_key,
"question": question,
"sources": docs,
"retriever_params": retriever_params,
"timestamp": datetime.datetime.now(datetime.timezone.utc),
}
)
if data.get("isNoneDoc"):
for doc in docs:
doc["source"] = "None"
except Exception as e:
current_app.logger.error(
f"/api/search - error: {str(e)} - traceback: {traceback.format_exc()}",
extra={"error": str(e), "traceback": traceback.format_exc()},
)
return bad_request(500, str(e))
return make_response(docs, 200)

75
application/api/internal/routes.py Normal file → Executable file
View File

@@ -3,18 +3,23 @@ import datetime
from flask import Blueprint, request, send_from_directory
from pymongo import MongoClient
from werkzeug.utils import secure_filename
from bson.objectid import ObjectId
from application.core.settings import settings
mongo = MongoClient(settings.MONGO_URI)
db = mongo["docsgpt"]
conversations_collection = db["conversations"]
vectors_collection = db["vectors"]
sources_collection = db["sources"]
current_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
current_dir = os.path.dirname(
os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
)
internal = Blueprint("internal", __name__)
internal = Blueprint('internal', __name__)
@internal.route("/api/download", methods=["get"])
def download_file():
user = secure_filename(request.args.get("user"))
@@ -24,7 +29,6 @@ def download_file():
return send_from_directory(save_dir, filename, as_attachment=True)
@internal.route("/api/upload_index", methods=["POST"])
def upload_index_files():
"""Upload two files(index.faiss, index.pkl) to the user's folder."""
@@ -34,7 +38,14 @@ def upload_index_files():
if "name" not in request.form:
return {"status": "no name"}
job_name = secure_filename(request.form["name"])
save_dir = os.path.join(current_dir, "indexes", user, job_name)
tokens = secure_filename(request.form["tokens"])
retriever = secure_filename(request.form["retriever"])
id = secure_filename(request.form["id"])
type = secure_filename(request.form["type"])
remote_data = request.form["remote_data"] if "remote_data" in request.form else None
sync_frequency = secure_filename(request.form["sync_frequency"]) if "sync_frequency" in request.form else None
save_dir = os.path.join(current_dir, "indexes", str(id))
if settings.VECTOR_STORE == "faiss":
if "file_faiss" not in request.files:
print("No file part")
@@ -49,21 +60,45 @@ def upload_index_files():
if file_pkl.filename == "":
return {"status": "no file name"}
# saves index files
if not os.path.exists(save_dir):
os.makedirs(save_dir)
file_faiss.save(os.path.join(save_dir, "index.faiss"))
file_pkl.save(os.path.join(save_dir, "index.pkl"))
# create entry in vectors_collection
vectors_collection.insert_one(
{
"user": user,
"name": job_name,
"language": job_name,
"location": save_dir,
"date": datetime.datetime.now().strftime("%d/%m/%Y %H:%M:%S"),
"model": settings.EMBEDDINGS_NAME,
"type": "local",
}
)
return {"status": "ok"}
existing_entry = sources_collection.find_one({"_id": ObjectId(id)})
if existing_entry:
sources_collection.update_one(
{"_id": ObjectId(id)},
{
"$set": {
"user": user,
"name": job_name,
"language": job_name,
"date": datetime.datetime.now().strftime("%d/%m/%Y %H:%M:%S"),
"model": settings.EMBEDDINGS_NAME,
"type": type,
"tokens": tokens,
"retriever": retriever,
"remote_data": remote_data,
"sync_frequency": sync_frequency,
}
},
)
else:
sources_collection.insert_one(
{
"_id": ObjectId(id),
"user": user,
"name": job_name,
"language": job_name,
"date": datetime.datetime.now().strftime("%d/%m/%Y %H:%M:%S"),
"model": settings.EMBEDDINGS_NAME,
"type": type,
"tokens": tokens,
"retriever": retriever,
"remote_data": remote_data,
"sync_frequency": sync_frequency,
}
)
return {"status": "ok"}

File diff suppressed because it is too large Load Diff

View File

@@ -1,7 +1,38 @@
from application.worker import ingest_worker
from application.celery import celery
from datetime import timedelta
from application.celery_init import celery
from application.worker import ingest_worker, remote_worker, sync_worker
@celery.task(bind=True)
def ingest(self, directory, formats, name_job, filename, user):
resp = ingest_worker(self, directory, formats, name_job, filename, user)
return resp
@celery.task(bind=True)
def ingest_remote(self, source_data, job_name, user, loader):
resp = remote_worker(self, source_data, job_name, user, loader)
return resp
@celery.task(bind=True)
def schedule_syncs(self, frequency):
resp = sync_worker(self, frequency)
return resp
@celery.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
sender.add_periodic_task(
timedelta(days=1),
schedule_syncs.s("daily"),
)
sender.add_periodic_task(
timedelta(weeks=1),
schedule_syncs.s("weekly"),
)
sender.add_periodic_task(
timedelta(days=30),
schedule_syncs.s("monthly"),
)

View File

@@ -1,17 +1,23 @@
import platform
import dotenv
from application.celery import celery
from flask import Flask, request, redirect
from application.core.settings import settings
from application.api.user.routes import user
from flask import Flask, redirect, request
from application.api.answer.routes import answer
from application.api.internal.routes import internal
from application.api.user.routes import user
from application.celery_init import celery
from application.core.logging_config import setup_logging
from application.core.settings import settings
from application.extensions import api
if platform.system() == "Windows":
import pathlib
pathlib.PosixPath = pathlib.WindowsPath
dotenv.load_dotenv()
setup_logging()
app = Flask(__name__)
app.register_blueprint(user)
@@ -21,16 +27,19 @@ app.config.update(
UPLOAD_FOLDER="inputs",
CELERY_BROKER_URL=settings.CELERY_BROKER_URL,
CELERY_RESULT_BACKEND=settings.CELERY_RESULT_BACKEND,
MONGO_URI=settings.MONGO_URI
MONGO_URI=settings.MONGO_URI,
)
celery.config_from_object("application.celeryconfig")
api.init_app(app)
@app.route("/")
def home():
if request.remote_addr in ('0.0.0.0', '127.0.0.1', 'localhost', '172.18.0.1'):
return redirect('http://localhost:5173')
if request.remote_addr in ("0.0.0.0", "127.0.0.1", "localhost", "172.18.0.1"):
return redirect("http://localhost:5173")
else:
return 'Welcome to DocsGPT Backend!'
return "Welcome to DocsGPT Backend!"
@app.after_request
def after_request(response):
@@ -39,6 +48,6 @@ def after_request(response):
response.headers.add("Access-Control-Allow-Methods", "GET,PUT,POST,DELETE,OPTIONS")
return response
if __name__ == "__main__":
app.run(debug=True, port=7091)
if __name__ == "__main__":
app.run(debug=settings.FLASK_DEBUG_MODE, port=7091)

View File

@@ -1,9 +1,15 @@
from celery import Celery
from application.core.settings import settings
from celery.signals import setup_logging
def make_celery(app_name=__name__):
celery = Celery(app_name, broker=settings.CELERY_BROKER_URL, backend=settings.CELERY_RESULT_BACKEND)
celery.conf.update(settings)
return celery
@setup_logging.connect
def config_loggers(*args, **kwargs):
from application.core.logging_config import setup_logging
setup_logging()
celery = make_celery()

View File

@@ -0,0 +1,22 @@
from logging.config import dictConfig
def setup_logging():
dictConfig({
'version': 1,
'formatters': {
'default': {
'format': '[%(asctime)s] %(levelname)s in %(module)s: %(message)s',
}
},
"handlers": {
"console": {
"class": "logging.StreamHandler",
"stream": "ext://sys.stdout",
"formatter": "default",
}
},
'root': {
'level': 'INFO',
'handlers': ['console'],
},
})

View File

@@ -1,42 +1,75 @@
from pathlib import Path
from typing import Optional
import os
from pydantic import BaseSettings
from pydantic_settings import BaseSettings
current_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
class Settings(BaseSettings):
LLM_NAME: str = "openai"
EMBEDDINGS_NAME: str = "openai_text-embedding-ada-002"
LLM_NAME: str = "docsgpt"
MODEL_NAME: Optional[str] = None # if LLM_NAME is openai, MODEL_NAME can be gpt-4 or gpt-3.5-turbo
EMBEDDINGS_NAME: str = "huggingface_sentence-transformers/all-mpnet-base-v2"
CELERY_BROKER_URL: str = "redis://localhost:6379/0"
CELERY_RESULT_BACKEND: str = "redis://localhost:6379/1"
MONGO_URI: str = "mongodb://localhost:27017/docsgpt"
MODEL_PATH: str = os.path.join(current_dir, "models/docsgpt-7b-f16.gguf")
TOKENS_MAX_HISTORY: int = 150
DEFAULT_MAX_HISTORY: int = 150
MODEL_TOKEN_LIMITS: dict = {"gpt-3.5-turbo": 4096, "claude-2": 1e5}
UPLOAD_FOLDER: str = "inputs"
VECTOR_STORE: str = "faiss" # "faiss" or "elasticsearch"
VECTOR_STORE: str = "faiss" # "faiss" or "elasticsearch" or "qdrant" or "milvus"
RETRIEVERS_ENABLED: list = ["classic_rag", "duckduck_search"] # also brave_search
API_URL: str = "http://localhost:7091" # backend url for celery worker
API_KEY: str = None # LLM api key
EMBEDDINGS_KEY: str = None # api key for embeddings (if using openai, just copy API_KEY)
OPENAI_API_BASE: str = None # azure openai api base url
OPENAI_API_VERSION: str = None # azure openai api version
AZURE_DEPLOYMENT_NAME: str = None # azure deployment name for answering
AZURE_EMBEDDINGS_DEPLOYMENT_NAME: str = None # azure deployment name for embeddings
API_KEY: Optional[str] = None # LLM api key
EMBEDDINGS_KEY: Optional[str] = None # api key for embeddings (if using openai, just copy API_KEY)
OPENAI_API_BASE: Optional[str] = None # azure openai api base url
OPENAI_API_VERSION: Optional[str] = None # azure openai api version
AZURE_DEPLOYMENT_NAME: Optional[str] = None # azure deployment name for answering
AZURE_EMBEDDINGS_DEPLOYMENT_NAME: Optional[str] = None # azure deployment name for embeddings
OPENAI_BASE_URL: Optional[str] = None # openai base url for open ai compatable models
# elasticsearch
ELASTIC_CLOUD_ID: str = None # cloud id for elasticsearch
ELASTIC_USERNAME: str = None # username for elasticsearch
ELASTIC_PASSWORD: str = None # password for elasticsearch
ELASTIC_URL: str = None # url for elasticsearch
ELASTIC_INDEX: str = "docsgpt" # index name for elasticsearch
ELASTIC_CLOUD_ID: Optional[str] = None # cloud id for elasticsearch
ELASTIC_USERNAME: Optional[str] = None # username for elasticsearch
ELASTIC_PASSWORD: Optional[str] = None # password for elasticsearch
ELASTIC_URL: Optional[str] = None # url for elasticsearch
ELASTIC_INDEX: Optional[str] = "docsgpt" # index name for elasticsearch
# SageMaker config
SAGEMAKER_ENDPOINT: str = None # SageMaker endpoint name
SAGEMAKER_REGION: str = None # SageMaker region name
SAGEMAKER_ACCESS_KEY: str = None # SageMaker access key
SAGEMAKER_SECRET_KEY: str = None # SageMaker secret key
SAGEMAKER_ENDPOINT: Optional[str] = None # SageMaker endpoint name
SAGEMAKER_REGION: Optional[str] = None # SageMaker region name
SAGEMAKER_ACCESS_KEY: Optional[str] = None # SageMaker access key
SAGEMAKER_SECRET_KEY: Optional[str] = None # SageMaker secret key
# prem ai project id
PREMAI_PROJECT_ID: Optional[str] = None
# Qdrant vectorstore config
QDRANT_COLLECTION_NAME: Optional[str] = "docsgpt"
QDRANT_LOCATION: Optional[str] = None
QDRANT_URL: Optional[str] = None
QDRANT_PORT: Optional[int] = 6333
QDRANT_GRPC_PORT: int = 6334
QDRANT_PREFER_GRPC: bool = False
QDRANT_HTTPS: Optional[bool] = None
QDRANT_API_KEY: Optional[str] = None
QDRANT_PREFIX: Optional[str] = None
QDRANT_TIMEOUT: Optional[float] = None
QDRANT_HOST: Optional[str] = None
QDRANT_PATH: Optional[str] = None
QDRANT_DISTANCE_FUNC: str = "Cosine"
# Milvus vectorstore config
MILVUS_COLLECTION_NAME: Optional[str] = "docsgpt"
MILVUS_URI: Optional[str] = "./milvus_local.db" # milvus lite version as default
MILVUS_TOKEN: Optional[str] = ""
BRAVE_SEARCH_API_KEY: Optional[str] = None
FLASK_DEBUG_MODE: bool = False
path = Path(__file__).parent.parent.absolute()

View File

@@ -0,0 +1,7 @@
from flask_restx import Api
api = Api(
version="1.0",
title="DocsGPT API",
description="API for DocsGPT",
)

Binary file not shown.

Binary file not shown.

View File

@@ -1,21 +1,29 @@
from application.llm.base import BaseLLM
from application.core.settings import settings
class AnthropicLLM(BaseLLM):
def __init__(self, api_key=None):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT
self.api_key = api_key or settings.ANTHROPIC_API_KEY # If not provided, use a default from settings
super().__init__(*args, **kwargs)
self.api_key = (
api_key or settings.ANTHROPIC_API_KEY
) # If not provided, use a default from settings
self.user_api_key = user_api_key
self.anthropic = Anthropic(api_key=self.api_key)
self.HUMAN_PROMPT = HUMAN_PROMPT
self.AI_PROMPT = AI_PROMPT
def gen(self, model, messages, engine=None, max_tokens=300, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(
self, baseself, model, messages, stream=False, max_tokens=300, **kwargs
):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Context \n {context} \n ### Question \n {user_question}"
if stream:
return self.gen_stream(model, prompt, max_tokens, **kwargs)
return self.gen_stream(model, prompt, stream, max_tokens, **kwargs)
completion = self.anthropic.completions.create(
model=model,
@@ -25,9 +33,11 @@ class AnthropicLLM(BaseLLM):
)
return completion.completion
def gen_stream(self, model, messages, engine=None, max_tokens=300, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen_stream(
self, baseself, model, messages, stream=True, max_tokens=300, **kwargs
):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Context \n {context} \n ### Question \n {user_question}"
stream_response = self.anthropic.completions.create(
model=model,
@@ -37,4 +47,4 @@ class AnthropicLLM(BaseLLM):
)
for completion in stream_response:
yield completion.completion
yield completion.completion

View File

@@ -1,14 +1,28 @@
from abc import ABC, abstractmethod
from application.usage import gen_token_usage, stream_token_usage
class BaseLLM(ABC):
def __init__(self):
pass
self.token_usage = {"prompt_tokens": 0, "generated_tokens": 0}
def _apply_decorator(self, method, decorator, *args, **kwargs):
return decorator(method, *args, **kwargs)
@abstractmethod
def gen(self, *args, **kwargs):
def _raw_gen(self, model, messages, stream, *args, **kwargs):
pass
def gen(self, model, messages, stream=False, *args, **kwargs):
return self._apply_decorator(self._raw_gen, gen_token_usage)(
self, model=model, messages=messages, stream=stream, *args, **kwargs
)
@abstractmethod
def gen_stream(self, *args, **kwargs):
def _raw_gen_stream(self, model, messages, stream, *args, **kwargs):
pass
def gen_stream(self, model, messages, stream=True, *args, **kwargs):
return self._apply_decorator(self._raw_gen_stream, stream_token_usage)(
self, model=model, messages=messages, stream=stream, *args, **kwargs
)

View File

@@ -0,0 +1,44 @@
from application.llm.base import BaseLLM
import json
import requests
class DocsGPTAPILLM(BaseLLM):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
super().__init__(*args, **kwargs)
self.api_key = api_key
self.user_api_key = user_api_key
self.endpoint = "https://llm.docsgpt.co.uk"
def _raw_gen(self, baseself, model, messages, stream=False, *args, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
response = requests.post(
f"{self.endpoint}/answer", json={"prompt": prompt, "max_new_tokens": 30}
)
response_clean = response.json()["a"].replace("###", "")
return response_clean
def _raw_gen_stream(self, baseself, model, messages, stream=True, *args, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
# send prompt to endpoint /stream
response = requests.post(
f"{self.endpoint}/stream",
json={"prompt": prompt, "max_new_tokens": 256},
stream=True,
)
for line in response.iter_lines():
if line:
# data = json.loads(line)
data_str = line.decode("utf-8")
if data_str.startswith("data: "):
data = json.loads(data_str[6:])
yield data["a"]

View File

@@ -1,44 +1,68 @@
from application.llm.base import BaseLLM
class HuggingFaceLLM(BaseLLM):
def __init__(self, api_key, llm_name='Arc53/DocsGPT-7B',q=False):
def __init__(
self,
api_key=None,
user_api_key=None,
llm_name="Arc53/DocsGPT-7B",
q=False,
*args,
**kwargs,
):
global hf
from langchain.llms import HuggingFacePipeline
if q:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
pipeline,
BitsAndBytesConfig,
)
tokenizer = AutoTokenizer.from_pretrained(llm_name)
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_use_double_quant=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(llm_name,quantization_config=bnb_config)
load_in_4bit=True,
bnb_4bit_use_double_quant=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
llm_name, quantization_config=bnb_config
)
else:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
tokenizer = AutoTokenizer.from_pretrained(llm_name)
model = AutoModelForCausalLM.from_pretrained(llm_name)
super().__init__(*args, **kwargs)
self.api_key = api_key
self.user_api_key = user_api_key
pipe = pipeline(
"text-generation", model=model,
tokenizer=tokenizer, max_new_tokens=2000,
device_map="auto", eos_token_id=tokenizer.eos_token_id
"text-generation",
model=model,
tokenizer=tokenizer,
max_new_tokens=2000,
device_map="auto",
eos_token_id=tokenizer.eos_token_id,
)
hf = HuggingFacePipeline(pipeline=pipe)
def gen(self, model, engine, messages, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
result = hf(prompt)
return result.content
def gen_stream(self, model, engine, messages, stream=True, **kwargs):
def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
raise NotImplementedError("HuggingFaceLLM Streaming is not implemented yet.")

View File

@@ -1,39 +1,55 @@
from application.llm.base import BaseLLM
from application.core.settings import settings
import threading
class LlamaSingleton:
_instances = {}
_lock = threading.Lock() # Add a lock for thread synchronization
@classmethod
def get_instance(cls, llm_name):
if llm_name not in cls._instances:
try:
from llama_cpp import Llama
except ImportError:
raise ImportError(
"Please install llama_cpp using pip install llama-cpp-python"
)
cls._instances[llm_name] = Llama(model_path=llm_name, n_ctx=2048)
return cls._instances[llm_name]
@classmethod
def query_model(cls, llm, prompt, **kwargs):
with cls._lock:
return llm(prompt, **kwargs)
class LlamaCpp(BaseLLM):
def __init__(
self,
api_key=None,
user_api_key=None,
llm_name=settings.MODEL_PATH,
*args,
**kwargs,
):
super().__init__(*args, **kwargs)
self.api_key = api_key
self.user_api_key = user_api_key
self.llama = LlamaSingleton.get_instance(llm_name)
def __init__(self, api_key, llm_name=settings.MODEL_PATH, **kwargs):
global llama
try:
from llama_cpp import Llama
except ImportError:
raise ImportError("Please install llama_cpp using pip install llama-cpp-python")
llama = Llama(model_path=llm_name, n_ctx=2048)
def gen(self, model, engine, messages, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
result = LlamaSingleton.query_model(self.llama, prompt, max_tokens=150, echo=False)
return result["choices"][0]["text"].split("### Answer \n")[-1]
result = llama(prompt, max_tokens=150, echo=False)
# import sys
# print(result['choices'][0]['text'].split('### Answer \n')[-1], file=sys.stderr)
return result['choices'][0]['text'].split('### Answer \n')[-1]
def gen_stream(self, model, engine, messages, stream=True, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
result = llama(prompt, max_tokens=150, echo=False, stream=stream)
# import sys
# print(list(result), file=sys.stderr)
result = LlamaSingleton.query_model(self.llama, prompt, max_tokens=150, echo=False, stream=stream)
for item in result:
for choice in item['choices']:
yield choice['text']
for choice in item["choices"]:
yield choice["text"]

View File

@@ -3,22 +3,25 @@ from application.llm.sagemaker import SagemakerAPILLM
from application.llm.huggingface import HuggingFaceLLM
from application.llm.llama_cpp import LlamaCpp
from application.llm.anthropic import AnthropicLLM
from application.llm.docsgpt_provider import DocsGPTAPILLM
from application.llm.premai import PremAILLM
class LLMCreator:
llms = {
'openai': OpenAILLM,
'azure_openai': AzureOpenAILLM,
'sagemaker': SagemakerAPILLM,
'huggingface': HuggingFaceLLM,
'llama.cpp': LlamaCpp,
'anthropic': AnthropicLLM
"openai": OpenAILLM,
"azure_openai": AzureOpenAILLM,
"sagemaker": SagemakerAPILLM,
"huggingface": HuggingFaceLLM,
"llama.cpp": LlamaCpp,
"anthropic": AnthropicLLM,
"docsgpt": DocsGPTAPILLM,
"premai": PremAILLM,
}
@classmethod
def create_llm(cls, type, *args, **kwargs):
def create_llm(cls, type, api_key, user_api_key, *args, **kwargs):
llm_class = cls.llms.get(type.lower())
if not llm_class:
raise ValueError(f"No LLM class found for type {type}")
return llm_class(*args, **kwargs)
return llm_class(api_key, user_api_key, *args, **kwargs)

View File

@@ -1,57 +1,73 @@
from application.llm.base import BaseLLM
from application.core.settings import settings
class OpenAILLM(BaseLLM):
def __init__(self, api_key):
global openai
import openai
openai.api_key = api_key
self.api_key = api_key # Save the API key to be used later
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
from openai import OpenAI
def _get_openai(self):
# Import openai when needed
import openai
# Set the API key every time you import openai
openai.api_key = self.api_key
return openai
super().__init__(*args, **kwargs)
if settings.OPENAI_BASE_URL:
self.client = OpenAI(
api_key=api_key,
base_url=settings.OPENAI_BASE_URL
)
else:
self.client = OpenAI(api_key=api_key)
self.api_key = api_key
self.user_api_key = user_api_key
def gen(self, model, engine, messages, stream=False, **kwargs):
response = openai.ChatCompletion.create(
model=model,
engine=engine,
messages=messages,
stream=stream,
**kwargs
def _raw_gen(
self,
baseself,
model,
messages,
stream=False,
engine=settings.AZURE_DEPLOYMENT_NAME,
**kwargs
):
response = self.client.chat.completions.create(
model=model, messages=messages, stream=stream, **kwargs
)
return response["choices"][0]["message"]["content"]
return response.choices[0].message.content
def gen_stream(self, model, engine, messages, stream=True, **kwargs):
response = openai.ChatCompletion.create(
model=model,
engine=engine,
messages=messages,
stream=stream,
**kwargs
def _raw_gen_stream(
self,
baseself,
model,
messages,
stream=True,
engine=settings.AZURE_DEPLOYMENT_NAME,
**kwargs
):
response = self.client.chat.completions.create(
model=model, messages=messages, stream=stream, **kwargs
)
for line in response:
if "content" in line["choices"][0]["delta"]:
yield line["choices"][0]["delta"]["content"]
# import sys
# print(line.choices[0].delta.content, file=sys.stderr)
if line.choices[0].delta.content is not None:
yield line.choices[0].delta.content
class AzureOpenAILLM(OpenAILLM):
def __init__(self, openai_api_key, openai_api_base, openai_api_version, deployment_name):
def __init__(
self, openai_api_key, openai_api_base, openai_api_version, deployment_name
):
super().__init__(openai_api_key)
self.api_base = settings.OPENAI_API_BASE,
self.api_version = settings.OPENAI_API_VERSION,
self.deployment_name = settings.AZURE_DEPLOYMENT_NAME,
self.api_base = (settings.OPENAI_API_BASE,)
self.api_version = (settings.OPENAI_API_VERSION,)
self.deployment_name = (settings.AZURE_DEPLOYMENT_NAME,)
from openai import AzureOpenAI
def _get_openai(self):
openai = super()._get_openai()
openai.api_base = self.api_base
openai.api_version = self.api_version
openai.api_type = "azure"
return openai
self.client = AzureOpenAI(
api_key=openai_api_key,
api_version=settings.OPENAI_API_VERSION,
api_base=settings.OPENAI_API_BASE,
deployment_name=settings.AZURE_DEPLOYMENT_NAME,
)

38
application/llm/premai.py Normal file
View File

@@ -0,0 +1,38 @@
from application.llm.base import BaseLLM
from application.core.settings import settings
class PremAILLM(BaseLLM):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
from premai import Prem
super().__init__(*args, **kwargs)
self.client = Prem(api_key=api_key)
self.api_key = api_key
self.user_api_key = user_api_key
self.project_id = settings.PREMAI_PROJECT_ID
def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
response = self.client.chat.completions.create(
model=model,
project_id=self.project_id,
messages=messages,
stream=stream,
**kwargs
)
return response.choices[0].message["content"]
def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
response = self.client.chat.completions.create(
model=model,
project_id=self.project_id,
messages=messages,
stream=stream,
**kwargs
)
for line in response:
if line.choices[0].delta["content"] is not None:
yield line.choices[0].delta["content"]

View File

@@ -4,11 +4,10 @@ import json
import io
class LineIterator:
"""
A helper class for parsing the byte stream input.
A helper class for parsing the byte stream input.
The output of the model will be in the following format:
```
b'{"outputs": [" a"]}\n'
@@ -16,21 +15,21 @@ class LineIterator:
b'{"outputs": [" problem"]}\n'
...
```
While usually each PayloadPart event from the event stream will contain a byte array
While usually each PayloadPart event from the event stream will contain a byte array
with a full json, this is not guaranteed and some of the json objects may be split across
PayloadPart events. For example:
```
{'PayloadPart': {'Bytes': b'{"outputs": '}}
{'PayloadPart': {'Bytes': b'[" problem"]}\n'}}
```
This class accounts for this by concatenating bytes written via the 'write' function
and then exposing a method which will return lines (ending with a '\n' character) within
the buffer via the 'scan_lines' function. It maintains the position of the last read
position to ensure that previous bytes are not exposed again.
the buffer via the 'scan_lines' function. It maintains the position of the last read
position to ensure that previous bytes are not exposed again.
"""
def __init__(self, stream):
self.byte_iterator = iter(stream)
self.buffer = io.BytesIO()
@@ -43,7 +42,7 @@ class LineIterator:
while True:
self.buffer.seek(self.read_pos)
line = self.buffer.readline()
if line and line[-1] == ord('\n'):
if line and line[-1] == ord("\n"):
self.read_pos += len(line)
return line[:-1]
try:
@@ -52,33 +51,35 @@ class LineIterator:
if self.read_pos < self.buffer.getbuffer().nbytes:
continue
raise
if 'PayloadPart' not in chunk:
print('Unknown event type:' + chunk)
if "PayloadPart" not in chunk:
print("Unknown event type:" + chunk)
continue
self.buffer.seek(0, io.SEEK_END)
self.buffer.write(chunk['PayloadPart']['Bytes'])
self.buffer.write(chunk["PayloadPart"]["Bytes"])
class SagemakerAPILLM(BaseLLM):
def __init__(self, *args, **kwargs):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
import boto3
runtime = boto3.client(
'runtime.sagemaker',
aws_access_key_id='xxx',
aws_secret_access_key='xxx',
region_name='us-west-2'
"runtime.sagemaker",
aws_access_key_id="xxx",
aws_secret_access_key="xxx",
region_name="us-west-2",
)
self.endpoint = settings.SAGEMAKER_ENDPOINT
super().__init__(*args, **kwargs)
self.api_key = api_key
self.user_api_key = user_api_key
self.endpoint = settings.SAGEMAKER_ENDPOINT
self.runtime = runtime
def gen(self, model, engine, messages, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
# Construct payload for endpoint
payload = {
@@ -89,25 +90,25 @@ class SagemakerAPILLM(BaseLLM):
"temperature": 0.1,
"max_new_tokens": 30,
"repetition_penalty": 1.03,
"stop": ["</s>", "###"]
}
"stop": ["</s>", "###"],
},
}
body_bytes = json.dumps(payload).encode('utf-8')
body_bytes = json.dumps(payload).encode("utf-8")
# Invoke the endpoint
response = self.runtime.invoke_endpoint(EndpointName=self.endpoint,
ContentType='application/json',
Body=body_bytes)
result = json.loads(response['Body'].read().decode())
response = self.runtime.invoke_endpoint(
EndpointName=self.endpoint, ContentType="application/json", Body=body_bytes
)
result = json.loads(response["Body"].read().decode())
import sys
print(result[0]['generated_text'], file=sys.stderr)
return result[0]['generated_text'][len(prompt):]
def gen_stream(self, model, engine, messages, stream=True, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
print(result[0]["generated_text"], file=sys.stderr)
return result[0]["generated_text"][len(prompt) :]
def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
# Construct payload for endpoint
payload = {
@@ -118,22 +119,22 @@ class SagemakerAPILLM(BaseLLM):
"temperature": 0.1,
"max_new_tokens": 512,
"repetition_penalty": 1.03,
"stop": ["</s>", "###"]
}
"stop": ["</s>", "###"],
},
}
body_bytes = json.dumps(payload).encode('utf-8')
body_bytes = json.dumps(payload).encode("utf-8")
# Invoke the endpoint
response = self.runtime.invoke_endpoint_with_response_stream(EndpointName=self.endpoint,
ContentType='application/json',
Body=body_bytes)
#result = json.loads(response['Body'].read().decode())
event_stream = response['Body']
start_json = b'{'
response = self.runtime.invoke_endpoint_with_response_stream(
EndpointName=self.endpoint, ContentType="application/json", Body=body_bytes
)
# result = json.loads(response['Body'].read().decode())
event_stream = response["Body"]
start_json = b"{"
for line in LineIterator(event_stream):
if line != b'' and start_json in line:
#print(line)
data = json.loads(line[line.find(start_json):].decode('utf-8'))
if data['token']['text'] not in ["</s>", "###"]:
print(data['token']['text'],end='')
yield data['token']['text']
if line != b"" and start_json in line:
# print(line)
data = json.loads(line[line.find(start_json) :].decode("utf-8"))
if data["token"]["text"] not in ["</s>", "###"]:
print(data["token"]["text"], end="")
yield data["token"]["text"]

View File

@@ -62,7 +62,6 @@ class SimpleDirectoryReader(BaseReader):
file_extractor: Optional[Dict[str, BaseParser]] = None,
num_files_limit: Optional[int] = None,
file_metadata: Optional[Callable[[str], Dict]] = None,
chunk_size_max: int = 2048,
) -> None:
"""Initialize with parameters."""
super().__init__()
@@ -148,12 +147,24 @@ class SimpleDirectoryReader(BaseReader):
# do standard read
with open(input_file, "r", errors=self.errors) as f:
data = f.read()
if isinstance(data, List):
data_list.extend(data)
else:
data_list.append(str(data))
# Prepare metadata for this file
if self.file_metadata is not None:
metadata_list.append(self.file_metadata(str(input_file)))
file_metadata = self.file_metadata(str(input_file))
else:
# Provide a default empty metadata
file_metadata = {'title': '', 'store': ''}
# TODO: Find a case with no metadata and check if breaks anything
if isinstance(data, List):
# Extend data_list with each item in the data list
data_list.extend([str(d) for d in data])
# For each item in the data list, add the file's metadata to metadata_list
metadata_list.extend([file_metadata for _ in data])
else:
# Add the single piece of data to data_list
data_list.append(str(data))
# Add the file's metadata to metadata_list
metadata_list.append(file_metadata)
if concatenate:
return [Document("\n".join(data_list))]

View File

@@ -3,7 +3,6 @@
Contains parser for html files.
"""
import re
from pathlib import Path
from typing import Dict, Union
@@ -18,66 +17,8 @@ class HTMLParser(BaseParser):
return {}
def parse_file(self, file: Path, errors: str = "ignore") -> Union[str, list[str]]:
"""Parse file.
from langchain_community.document_loaders import BSHTMLLoader
Returns:
Union[str, List[str]]: a string or a List of strings.
"""
try:
from unstructured.partition.html import partition_html
from unstructured.staging.base import convert_to_isd
from unstructured.cleaners.core import clean
except ImportError:
raise ValueError("unstructured package is required to parse HTML files.")
# Using the unstructured library to convert the html to isd format
# isd sample : isd = [
# {"text": "My Title", "type": "Title"},
# {"text": "My Narrative", "type": "NarrativeText"}
# ]
with open(file, "r", encoding="utf-8") as fp:
elements = partition_html(file=fp)
isd = convert_to_isd(elements)
# Removing non ascii charactwers from isd_el['text']
for isd_el in isd:
isd_el['text'] = isd_el['text'].encode("ascii", "ignore").decode()
# Removing all the \n characters from isd_el['text'] using regex and replace with single space
# Removing all the extra spaces from isd_el['text'] using regex and replace with single space
for isd_el in isd:
isd_el['text'] = re.sub(r'\n', ' ', isd_el['text'], flags=re.MULTILINE | re.DOTALL)
isd_el['text'] = re.sub(r"\s{2,}", " ", isd_el['text'], flags=re.MULTILINE | re.DOTALL)
# more cleaning: extra_whitespaces, dashes, bullets, trailing_punctuation
for isd_el in isd:
clean(isd_el['text'], extra_whitespace=True, dashes=True, bullets=True, trailing_punctuation=True)
# Creating a list of all the indexes of isd_el['type'] = 'Title'
title_indexes = [i for i, isd_el in enumerate(isd) if isd_el['type'] == 'Title']
# Creating 'Chunks' - List of lists of strings
# each list starting with isd_el['type'] = 'Title' and all the data till the next 'Title'
# Each Chunk can be thought of as an individual set of data, which can be sent to the model
# Where Each Title is grouped together with the data under it
Chunks = [[]]
final_chunks = list(list())
for i, isd_el in enumerate(isd):
if i in title_indexes:
Chunks.append([])
Chunks[-1].append(isd_el['text'])
# Removing all the chunks with sum of length of all the strings in the chunk < 25
# TODO: This value can be an user defined variable
for chunk in Chunks:
# sum of length of all the strings in the chunk
sum = 0
sum += len(str(chunk))
if sum < 25:
Chunks.remove(chunk)
else:
# appending all the approved chunks to final_chunks as a single string
final_chunks.append(" ".join([str(item) for item in chunk]))
return final_chunks
loader = BSHTMLLoader(file)
data = loader.load()
return data

75
application/parser/open_ai_func.py Normal file → Executable file
View File

@@ -1,38 +1,33 @@
import os
import tiktoken
from application.vectorstore.vector_creator import VectorCreator
from application.core.settings import settings
from retry import retry
from application.core.settings import settings
# from langchain.embeddings import HuggingFaceEmbeddings
# from langchain.embeddings import HuggingFaceInstructEmbeddings
# from langchain.embeddings import CohereEmbeddings
from application.vectorstore.vector_creator import VectorCreator
def num_tokens_from_string(string: str, encoding_name: str) -> int:
# Function to convert string to tokens and estimate user cost.
encoding = tiktoken.get_encoding(encoding_name)
num_tokens = len(encoding.encode(string))
total_price = ((num_tokens / 1000) * 0.0004)
return num_tokens, total_price
# from langchain_community.embeddings import HuggingFaceEmbeddings
# from langchain_community.embeddings import HuggingFaceInstructEmbeddings
# from langchain_community.embeddings import CohereEmbeddings
@retry(tries=10, delay=60)
def store_add_texts_with_retry(store, i):
def store_add_texts_with_retry(store, i, id):
# add source_id to the metadata
i.metadata["source_id"] = str(id)
store.add_texts([i.page_content], metadatas=[i.metadata])
# store_pine.add_texts([i.page_content], metadatas=[i.metadata])
def call_openai_api(docs, folder_name, task_status):
# Function to create a vector store from the documents and save it to disk.
def call_openai_api(docs, folder_name, id, task_status):
# Function to create a vector store from the documents and save it to disk
# create output folder if it doesn't exist
if not os.path.exists(f"{folder_name}"):
os.makedirs(f"{folder_name}")
from tqdm import tqdm
c1 = 0
if settings.VECTOR_STORE == "faiss":
docs_init = [docs[0]]
@@ -40,26 +35,34 @@ def call_openai_api(docs, folder_name, task_status):
store = VectorCreator.create_vectorstore(
settings.VECTOR_STORE,
docs_init = docs_init,
path=f"{folder_name}",
embeddings_key=os.getenv("EMBEDDINGS_KEY")
docs_init=docs_init,
source_id=f"{folder_name}",
embeddings_key=os.getenv("EMBEDDINGS_KEY"),
)
else:
store = VectorCreator.create_vectorstore(
settings.VECTOR_STORE,
path=f"{folder_name}",
embeddings_key=os.getenv("EMBEDDINGS_KEY")
source_id=str(id),
embeddings_key=os.getenv("EMBEDDINGS_KEY"),
)
store.delete_index()
# Uncomment for MPNet embeddings
# model_name = "sentence-transformers/all-mpnet-base-v2"
# hf = HuggingFaceEmbeddings(model_name=model_name)
# store = FAISS.from_documents(docs_test, hf)
s1 = len(docs)
for i in tqdm(docs, desc="Embedding 🦖", unit="docs", total=len(docs),
bar_format='{l_bar}{bar}| Time Left: {remaining}'):
for i in tqdm(
docs,
desc="Embedding 🦖",
unit="docs",
total=len(docs),
bar_format="{l_bar}{bar}| Time Left: {remaining}",
):
try:
task_status.update_state(state='PROGRESS', meta={'current': int((c1 / s1) * 100)})
store_add_texts_with_retry(store, i)
task_status.update_state(
state="PROGRESS", meta={"current": int((c1 / s1) * 100)}
)
store_add_texts_with_retry(store, i, id)
except Exception as e:
print(e)
print("Error on ", i)
@@ -70,25 +73,3 @@ def call_openai_api(docs, folder_name, task_status):
c1 += 1
if settings.VECTOR_STORE == "faiss":
store.save_local(f"{folder_name}")
def get_user_permission(docs, folder_name):
# Function to ask user permission to call the OpenAI api and spend their OpenAI funds.
# Here we convert the docs list to a string and calculate the number of OpenAI tokens the string represents.
# docs_content = (" ".join(docs))
docs_content = ""
for doc in docs:
docs_content += doc.page_content
tokens, total_price = num_tokens_from_string(string=docs_content, encoding_name="cl100k_base")
# Here we print the number of tokens and the approx user cost with some visually appealing formatting.
print(f"Number of Tokens = {format(tokens, ',d')}")
print(f"Approx Cost = ${format(total_price, ',.2f')}")
# Here we check for user permission before calling the API.
user_input = input("Price Okay? (Y/N) \n").lower()
if user_input == "y":
call_openai_api(docs, folder_name)
elif user_input == "":
call_openai_api(docs, folder_name)
else:
print("The API was not called. No money was spent.")

View File

@@ -0,0 +1,19 @@
"""Base reader class."""
from abc import abstractmethod
from typing import Any, List
from langchain.docstore.document import Document as LCDocument
from application.parser.schema.base import Document
class BaseRemote:
"""Utilities for loading data from a directory."""
@abstractmethod
def load_data(self, *args: Any, **load_kwargs: Any) -> List[Document]:
"""Load data from the input directory."""
def load_langchain_documents(self, **load_kwargs: Any) -> List[LCDocument]:
"""Load data in LangChain document format."""
docs = self.load_data(**load_kwargs)
return [d.to_langchain_format() for d in docs]

View File

@@ -0,0 +1,59 @@
import requests
from urllib.parse import urlparse, urljoin
from bs4 import BeautifulSoup
from application.parser.remote.base import BaseRemote
class CrawlerLoader(BaseRemote):
def __init__(self, limit=10):
from langchain_community.document_loaders import WebBaseLoader
self.loader = WebBaseLoader # Initialize the document loader
self.limit = limit # Set the limit for the number of pages to scrape
def load_data(self, inputs):
url = inputs
# Check if the input is a list and if it is, use the first element
if isinstance(url, list) and url:
url = url[0]
# Check if the URL scheme is provided, if not, assume http
if not urlparse(url).scheme:
url = "http://" + url
visited_urls = set() # Keep track of URLs that have been visited
base_url = urlparse(url).scheme + "://" + urlparse(url).hostname # Extract the base URL
urls_to_visit = [url] # List of URLs to be visited, starting with the initial URL
loaded_content = [] # Store the loaded content from each URL
# Continue crawling until there are no more URLs to visit
while urls_to_visit:
current_url = urls_to_visit.pop(0) # Get the next URL to visit
visited_urls.add(current_url) # Mark the URL as visited
# Try to load and process the content from the current URL
try:
response = requests.get(current_url) # Fetch the content of the current URL
response.raise_for_status() # Raise an exception for HTTP errors
loader = self.loader([current_url]) # Initialize the document loader for the current URL
loaded_content.extend(loader.load()) # Load the content and add it to the loaded_content list
except Exception as e:
# Print an error message if loading or processing fails and continue with the next URL
print(f"Error processing URL {current_url}: {e}")
continue
# Parse the HTML content to extract all links
soup = BeautifulSoup(response.text, 'html.parser')
all_links = [
urljoin(current_url, a['href'])
for a in soup.find_all('a', href=True)
if base_url in urljoin(current_url, a['href']) # Ensure links are from the same domain
]
# Add new links to the list of URLs to visit if they haven't been visited yet
urls_to_visit.extend([link for link in all_links if link not in visited_urls])
urls_to_visit = list(set(urls_to_visit)) # Remove duplicate URLs
# Stop crawling if the limit of pages to scrape is reached
if self.limit is not None and len(visited_urls) >= self.limit:
break
return loaded_content # Return the loaded content from all visited URLs

View File

@@ -0,0 +1,26 @@
from application.parser.remote.base import BaseRemote
from langchain_community.document_loaders import RedditPostsLoader
class RedditPostsLoaderRemote(BaseRemote):
def load_data(self, inputs):
data = eval(inputs)
client_id = data.get("client_id")
client_secret = data.get("client_secret")
user_agent = data.get("user_agent")
categories = data.get("categories", ["new", "hot"])
mode = data.get("mode", "subreddit")
search_queries = data.get("search_queries")
number_posts = data.get("number_posts", 10)
self.loader = RedditPostsLoader(
client_id=client_id,
client_secret=client_secret,
user_agent=user_agent,
categories=categories,
mode=mode,
search_queries=search_queries,
number_posts=number_posts,
)
documents = self.loader.load()
print(f"Loaded {len(documents)} documents from Reddit")
return documents

View File

@@ -0,0 +1,20 @@
from application.parser.remote.sitemap_loader import SitemapLoader
from application.parser.remote.crawler_loader import CrawlerLoader
from application.parser.remote.web_loader import WebLoader
from application.parser.remote.reddit_loader import RedditPostsLoaderRemote
class RemoteCreator:
loaders = {
"url": WebLoader,
"sitemap": SitemapLoader,
"crawler": CrawlerLoader,
"reddit": RedditPostsLoaderRemote,
}
@classmethod
def create_loader(cls, type, *args, **kwargs):
loader_class = cls.loaders.get(type.lower())
if not loader_class:
raise ValueError(f"No LLM class found for type {type}")
return loader_class(*args, **kwargs)

View File

@@ -0,0 +1,81 @@
import requests
import re # Import regular expression library
import xml.etree.ElementTree as ET
from application.parser.remote.base import BaseRemote
class SitemapLoader(BaseRemote):
def __init__(self, limit=20):
from langchain_community.document_loaders import WebBaseLoader
self.loader = WebBaseLoader
self.limit = limit # Adding limit to control the number of URLs to process
def load_data(self, inputs):
sitemap_url= inputs
# Check if the input is a list and if it is, use the first element
if isinstance(sitemap_url, list) and sitemap_url:
url = sitemap_url[0]
urls = self._extract_urls(sitemap_url)
if not urls:
print(f"No URLs found in the sitemap: {sitemap_url}")
return []
# Load content of extracted URLs
documents = []
processed_urls = 0 # Counter for processed URLs
for url in urls:
if self.limit is not None and processed_urls >= self.limit:
break # Stop processing if the limit is reached
try:
loader = self.loader([url])
documents.extend(loader.load())
processed_urls += 1 # Increment the counter after processing each URL
except Exception as e:
print(f"Error processing URL {url}: {e}")
continue
return documents
def _extract_urls(self, sitemap_url):
try:
response = requests.get(sitemap_url)
response.raise_for_status() # Raise an exception for HTTP errors
except (requests.exceptions.HTTPError, requests.exceptions.ConnectionError) as e:
print(f"Failed to fetch sitemap: {sitemap_url}. Error: {e}")
return []
# Determine if this is a sitemap or a URL
if self._is_sitemap(response):
# It's a sitemap, so parse it and extract URLs
return self._parse_sitemap(response.content)
else:
# It's not a sitemap, return the URL itself
return [sitemap_url]
def _is_sitemap(self, response):
content_type = response.headers.get('Content-Type', '')
if 'xml' in content_type or response.url.endswith('.xml'):
return True
if '<sitemapindex' in response.text or '<urlset' in response.text:
return True
return False
def _parse_sitemap(self, sitemap_content):
# Remove namespaces
sitemap_content = re.sub(' xmlns="[^"]+"', '', sitemap_content.decode('utf-8'), count=1)
root = ET.fromstring(sitemap_content)
urls = []
for loc in root.findall('.//url/loc'):
urls.append(loc.text)
# Check for nested sitemaps
for sitemap in root.findall('.//sitemap/loc'):
nested_sitemap_url = sitemap.text
urls.extend(self._extract_urls(nested_sitemap_url))
return urls

View File

@@ -0,0 +1,11 @@
from langchain.document_loader import TelegramChatApiLoader
from application.parser.remote.base import BaseRemote
class TelegramChatApiRemote(BaseRemote):
def _init_parser(self, *args, **load_kwargs):
self.loader = TelegramChatApiLoader(**load_kwargs)
return {}
def parse_file(self, *args, **load_kwargs):
return

View File

@@ -0,0 +1,32 @@
from application.parser.remote.base import BaseRemote
from langchain_community.document_loaders import WebBaseLoader
headers = {
"User-Agent": "Mozilla/5.0",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*"
";q=0.8",
"Accept-Language": "en-US,en;q=0.5",
"Referer": "https://www.google.com/",
"DNT": "1",
"Connection": "keep-alive",
"Upgrade-Insecure-Requests": "1",
}
class WebLoader(BaseRemote):
def __init__(self):
self.loader = WebBaseLoader
def load_data(self, inputs):
urls = inputs
if isinstance(urls, str):
urls = [urls]
documents = []
for url in urls:
try:
loader = self.loader([url], header_template=headers)
documents.extend(loader.load())
except Exception as e:
print(f"Error processing URL {url}: {e}")
continue
return documents

View File

@@ -21,16 +21,18 @@ def group_documents(documents: List[Document], min_tokens: int, max_tokens: int)
for doc in documents:
doc_len = len(tiktoken.get_encoding("cl100k_base").encode(doc.text))
if current_group is None:
current_group = Document(text=doc.text, doc_id=doc.doc_id, embedding=doc.embedding,
extra_info=doc.extra_info)
elif len(tiktoken.get_encoding("cl100k_base").encode(
current_group.text)) + doc_len < max_tokens and doc_len < min_tokens:
current_group.text += " " + doc.text
# Check if current group is empty or if the document can be added based on token count and matching metadata
if (current_group is None or
(len(tiktoken.get_encoding("cl100k_base").encode(current_group.text)) + doc_len < max_tokens and
doc_len < min_tokens and
current_group.extra_info == doc.extra_info)):
if current_group is None:
current_group = doc # Use the document directly to retain its metadata
else:
current_group.text += " " + doc.text # Append text to the current group
else:
docs.append(current_group)
current_group = Document(text=doc.text, doc_id=doc.doc_id, embedding=doc.embedding,
extra_info=doc.extra_info)
current_group = doc # Start a new group with the current document
if current_group is not None:
docs.append(current_group)

View File

@@ -1,110 +1,86 @@
aiodns==3.0.0
aiohttp==3.8.6
aiohttp-retry==2.8.3
aiosignal==1.3.1
aleph-alpha-client==2.16.1
amqp==5.1.1
anthropic==0.5.0
async-timeout==4.0.2
attrs==22.2.0
billiard==3.6.4.0
blobfile==2.0.1
boto3==1.28.20
celery==5.2.7
cffi==1.15.1
charset-normalizer==3.1.0
click==8.1.3
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
cryptography==41.0.4
dataclasses-json==0.5.7
decorator==5.1.1
anthropic==0.34.2
boto3==1.34.153
beautifulsoup4==4.12.3
celery==5.3.6
dataclasses-json==0.6.7
docx2txt==0.8
dill==0.3.6
dnspython==2.3.0
ecdsa==0.18.0
elasticsearch==8.9.0
entrypoints==0.4
faiss-cpu==1.7.3
filelock==3.9.0
Flask==2.2.5
Flask-Cors==3.0.10
frozenlist==1.3.3
geojson==2.5.0
gunicorn==20.1.0
greenlet==2.0.2
gpt4all==0.1.7
huggingface-hub==0.15.1
humbug==0.3.2
idna==3.4
itsdangerous==2.1.2
Jinja2==3.1.2
duckduckgo-search==6.2.6
ebooklib==0.18
elastic-transport==8.15.0
elasticsearch==8.15.1
escodegen==1.0.11
esprima==4.0.1
esutils==1.0.1
Flask==3.0.3
faiss-cpu==1.8.0.post1
flask-restx==1.3.0
gunicorn==23.0.0
html2text==2024.2.26
javalang==0.13.0
jinja2==3.1.4
jiter==0.5.0
jmespath==1.0.1
joblib==1.2.0
kombu==5.2.4
langchain==0.0.312
loguru==0.6.0
lxml==4.9.2
MarkupSafe==2.1.2
marshmallow==3.19.0
marshmallow-enum==1.5.1
joblib==1.4.2
jsonpatch==1.33
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-spec==0.2.4
jsonschema-specifications==2023.7.1
kombu==5.4.2
langchain==0.3.0
langchain-community==0.3.0
langchain-core==0.3.2
langchain-openai==0.2.0
langchain-text-splitters==0.3.0
langsmith==0.1.125
lazy-object-proxy==1.10.0
lxml==5.3.0
markupsafe==2.1.5
marshmallow==3.22.0
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
multidict==6.1.0
mypy-extensions==1.0.0
networkx==3.0
npx
nltk==3.8.1
numcodecs==0.11.0
numpy==1.24.2
openai==0.27.8
openapi3-parser==1.1.14
packaging==23.0
pathos==0.3.0
Pillow==10.0.1
pox==0.3.2
ppft==1.7.6.6
prompt-toolkit==3.0.38
networkx==3.3
numpy==1.26.4
openai==1.46.1
openapi-schema-validator==0.6.2
openapi-spec-validator==0.6.0
openapi3-parser==1.1.18
orjson==3.10.7
packaging==24.1
pandas==2.2.3
pathable==0.4.3
pillow==10.4.0
portalocker==2.10.1
prance==23.6.21.0
primp==0.6.2
prompt-toolkit==3.0.47
protobuf==5.28.2
py==1.11.0
pyasn1==0.4.8
pycares==4.3.0
pycparser==2.21
pycryptodomex==3.17
pycryptodome==3.19.0
pydantic==1.10.5
PyJWT==2.6.0
pymongo==4.3.3
pyowm==3.3.0
PyPDF2==3.0.1
PySocks==1.7.1
pytest
python-dateutil==2.8.2
python-dotenv==1.0.0
python-jose==3.3.0
pytz==2022.7.1
PyYAML==6.0
redis==4.5.4
regex==2022.10.31
requests==2.31.0
pydantic==2.9.2
pydantic-core==2.23.4
pydantic-settings==2.4.0
pymongo==4.8.0
pypdf2==3.0.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
qdrant-client==1.11.0
redis==5.0.1
referencing==0.30.2
regex==2024.9.11
requests==2.32.3
retry==0.9.2
rsa==4.9
scikit-learn==1.2.2
scipy==1.10.1
sentencepiece
six==1.16.0
SQLAlchemy==1.4.46
sympy==1.11.1
tenacity==8.2.2
threadpoolctl==3.1.0
tiktoken
tqdm==4.65.0
transformers==4.30.0
typer==0.7.0
typing-inspect==0.8.0
typing_extensions==4.5.0
urllib3==1.26.18
vine==5.0.0
wcwidth==0.2.6
yarl==1.8.2
sentence-transformers==2.2.2
sentence-transformers==3.0.1
tiktoken==0.7.0
tokenizers==0.19.1
torch==2.4.1
tqdm==4.66.5
transformers==4.44.2
typing-extensions==4.12.2
typing-inspect==0.9.0
tzdata==2024.2
urllib3==2.2.3
vine==5.1.0
wcwidth==0.2.13
werkzeug==3.0.4
yarl==1.11.1

View File

View File

@@ -0,0 +1,18 @@
from abc import ABC, abstractmethod
class BaseRetriever(ABC):
def __init__(self):
pass
@abstractmethod
def gen(self, *args, **kwargs):
pass
@abstractmethod
def search(self, *args, **kwargs):
pass
@abstractmethod
def get_params(self):
pass

View File

@@ -0,0 +1,115 @@
import json
from application.retriever.base import BaseRetriever
from application.core.settings import settings
from application.llm.llm_creator import LLMCreator
from application.utils import num_tokens_from_string
from langchain_community.tools import BraveSearch
class BraveRetSearch(BaseRetriever):
def __init__(
self,
question,
source,
chat_history,
prompt,
chunks=2,
token_limit=150,
gpt_model="docsgpt",
user_api_key=None,
):
self.question = question
self.source = source
self.chat_history = chat_history
self.prompt = prompt
self.chunks = chunks
self.gpt_model = gpt_model
self.token_limit = (
token_limit
if token_limit
< settings.MODEL_TOKEN_LIMITS.get(
self.gpt_model, settings.DEFAULT_MAX_HISTORY
)
else settings.MODEL_TOKEN_LIMITS.get(
self.gpt_model, settings.DEFAULT_MAX_HISTORY
)
)
self.user_api_key = user_api_key
def _get_data(self):
if self.chunks == 0:
docs = []
else:
search = BraveSearch.from_api_key(
api_key=settings.BRAVE_SEARCH_API_KEY,
search_kwargs={"count": int(self.chunks)},
)
results = search.run(self.question)
results = json.loads(results)
docs = []
for i in results:
try:
title = i["title"]
link = i["link"]
snippet = i["snippet"]
docs.append({"text": snippet, "title": title, "link": link})
except IndexError:
pass
if settings.LLM_NAME == "llama.cpp":
docs = [docs[0]]
return docs
def gen(self):
docs = self._get_data()
# join all page_content together with a newline
docs_together = "\n".join([doc["text"] for doc in docs])
p_chat_combine = self.prompt.replace("{summaries}", docs_together)
messages_combine = [{"role": "system", "content": p_chat_combine}]
for doc in docs:
yield {"source": doc}
if len(self.chat_history) > 1:
tokens_current_history = 0
# count tokens in history
self.chat_history.reverse()
for i in self.chat_history:
if "prompt" in i and "response" in i:
tokens_batch = num_tokens_from_string(i["prompt"]) + num_tokens_from_string(
i["response"]
)
if tokens_current_history + tokens_batch < self.token_limit:
tokens_current_history += tokens_batch
messages_combine.append(
{"role": "user", "content": i["prompt"]}
)
messages_combine.append(
{"role": "system", "content": i["response"]}
)
messages_combine.append({"role": "user", "content": self.question})
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=self.user_api_key
)
completion = llm.gen_stream(model=self.gpt_model, messages=messages_combine)
for line in completion:
yield {"answer": str(line)}
def search(self):
return self._get_data()
def get_params(self):
return {
"question": self.question,
"source": self.source,
"chat_history": self.chat_history,
"prompt": self.prompt,
"chunks": self.chunks,
"token_limit": self.token_limit,
"gpt_model": self.gpt_model,
"user_api_key": self.user_api_key
}

View File

@@ -0,0 +1,118 @@
from application.retriever.base import BaseRetriever
from application.core.settings import settings
from application.vectorstore.vector_creator import VectorCreator
from application.llm.llm_creator import LLMCreator
from application.utils import num_tokens_from_string
class ClassicRAG(BaseRetriever):
def __init__(
self,
question,
source,
chat_history,
prompt,
chunks=2,
token_limit=150,
gpt_model="docsgpt",
user_api_key=None,
):
self.question = question
self.vectorstore = source['active_docs'] if 'active_docs' in source else None
self.chat_history = chat_history
self.prompt = prompt
self.chunks = chunks
self.gpt_model = gpt_model
self.token_limit = (
token_limit
if token_limit
< settings.MODEL_TOKEN_LIMITS.get(
self.gpt_model, settings.DEFAULT_MAX_HISTORY
)
else settings.MODEL_TOKEN_LIMITS.get(
self.gpt_model, settings.DEFAULT_MAX_HISTORY
)
)
self.user_api_key = user_api_key
def _get_data(self):
if self.chunks == 0:
docs = []
else:
docsearch = VectorCreator.create_vectorstore(
settings.VECTOR_STORE, self.vectorstore, settings.EMBEDDINGS_KEY
)
docs_temp = docsearch.search(self.question, k=self.chunks)
print(docs_temp)
docs = [
{
"title": i.metadata.get(
"title", i.metadata.get("post_title", i.page_content)
).split("/")[-1],
"text": i.page_content,
"source": (
i.metadata.get("source")
if i.metadata.get("source")
else "local"
),
}
for i in docs_temp
]
if settings.LLM_NAME == "llama.cpp":
docs = [docs[0]]
return docs
def gen(self):
docs = self._get_data()
# join all page_content together with a newline
docs_together = "\n".join([doc["text"] for doc in docs])
p_chat_combine = self.prompt.replace("{summaries}", docs_together)
messages_combine = [{"role": "system", "content": p_chat_combine}]
for doc in docs:
yield {"source": doc}
if len(self.chat_history) > 1:
tokens_current_history = 0
# count tokens in history
self.chat_history.reverse()
for i in self.chat_history:
if "prompt" in i and "response" in i:
tokens_batch = num_tokens_from_string(i["prompt"]) + num_tokens_from_string(
i["response"]
)
if tokens_current_history + tokens_batch < self.token_limit:
tokens_current_history += tokens_batch
messages_combine.append(
{"role": "user", "content": i["prompt"]}
)
messages_combine.append(
{"role": "system", "content": i["response"]}
)
messages_combine.append({"role": "user", "content": self.question})
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=self.user_api_key
)
completion = llm.gen_stream(model=self.gpt_model, messages=messages_combine)
for line in completion:
yield {"answer": str(line)}
def search(self):
return self._get_data()
def get_params(self):
return {
"question": self.question,
"source": self.vectorstore,
"chat_history": self.chat_history,
"prompt": self.prompt,
"chunks": self.chunks,
"token_limit": self.token_limit,
"gpt_model": self.gpt_model,
"user_api_key": self.user_api_key
}

View File

@@ -0,0 +1,132 @@
from application.retriever.base import BaseRetriever
from application.core.settings import settings
from application.llm.llm_creator import LLMCreator
from application.utils import num_tokens_from_string
from langchain_community.tools import DuckDuckGoSearchResults
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
class DuckDuckSearch(BaseRetriever):
def __init__(
self,
question,
source,
chat_history,
prompt,
chunks=2,
token_limit=150,
gpt_model="docsgpt",
user_api_key=None,
):
self.question = question
self.source = source
self.chat_history = chat_history
self.prompt = prompt
self.chunks = chunks
self.gpt_model = gpt_model
self.token_limit = (
token_limit
if token_limit
< settings.MODEL_TOKEN_LIMITS.get(
self.gpt_model, settings.DEFAULT_MAX_HISTORY
)
else settings.MODEL_TOKEN_LIMITS.get(
self.gpt_model, settings.DEFAULT_MAX_HISTORY
)
)
self.user_api_key = user_api_key
def _parse_lang_string(self, input_string):
result = []
current_item = ""
inside_brackets = False
for char in input_string:
if char == "[":
inside_brackets = True
elif char == "]":
inside_brackets = False
result.append(current_item)
current_item = ""
elif inside_brackets:
current_item += char
if inside_brackets:
result.append(current_item)
return result
def _get_data(self):
if self.chunks == 0:
docs = []
else:
wrapper = DuckDuckGoSearchAPIWrapper(max_results=self.chunks)
search = DuckDuckGoSearchResults(api_wrapper=wrapper)
results = search.run(self.question)
results = self._parse_lang_string(results)
docs = []
for i in results:
try:
text = i.split("title:")[0]
title = i.split("title:")[1].split("link:")[0]
link = i.split("link:")[1]
docs.append({"text": text, "title": title, "link": link})
except IndexError:
pass
if settings.LLM_NAME == "llama.cpp":
docs = [docs[0]]
return docs
def gen(self):
docs = self._get_data()
# join all page_content together with a newline
docs_together = "\n".join([doc["text"] for doc in docs])
p_chat_combine = self.prompt.replace("{summaries}", docs_together)
messages_combine = [{"role": "system", "content": p_chat_combine}]
for doc in docs:
yield {"source": doc}
if len(self.chat_history) > 1:
tokens_current_history = 0
# count tokens in history
self.chat_history.reverse()
for i in self.chat_history:
if "prompt" in i and "response" in i:
tokens_batch = num_tokens_from_string(i["prompt"]) + num_tokens_from_string(
i["response"]
)
if tokens_current_history + tokens_batch < self.token_limit:
tokens_current_history += tokens_batch
messages_combine.append(
{"role": "user", "content": i["prompt"]}
)
messages_combine.append(
{"role": "system", "content": i["response"]}
)
messages_combine.append({"role": "user", "content": self.question})
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=self.user_api_key
)
completion = llm.gen_stream(model=self.gpt_model, messages=messages_combine)
for line in completion:
yield {"answer": str(line)}
def search(self):
return self._get_data()
def get_params(self):
return {
"question": self.question,
"source": self.source,
"chat_history": self.chat_history,
"prompt": self.prompt,
"chunks": self.chunks,
"token_limit": self.token_limit,
"gpt_model": self.gpt_model,
"user_api_key": self.user_api_key
}

View File

@@ -0,0 +1,20 @@
from application.retriever.classic_rag import ClassicRAG
from application.retriever.duckduck_search import DuckDuckSearch
from application.retriever.brave_search import BraveRetSearch
class RetrieverCreator:
retrievers = {
'classic': ClassicRAG,
'duckduck_search': DuckDuckSearch,
'brave_search': BraveRetSearch,
'default': ClassicRAG
}
@classmethod
def create_retriever(cls, type, *args, **kwargs):
retiever_class = cls.retrievers.get(type.lower())
if not retiever_class:
raise ValueError(f"No retievers class found for type {type}")
return retiever_class(*args, **kwargs)

49
application/usage.py Normal file
View File

@@ -0,0 +1,49 @@
import sys
from pymongo import MongoClient
from datetime import datetime
from application.core.settings import settings
from application.utils import num_tokens_from_string
mongo = MongoClient(settings.MONGO_URI)
db = mongo["docsgpt"]
usage_collection = db["token_usage"]
def update_token_usage(user_api_key, token_usage):
if "pytest" in sys.modules:
return
usage_data = {
"api_key": user_api_key,
"prompt_tokens": token_usage["prompt_tokens"],
"generated_tokens": token_usage["generated_tokens"],
"timestamp": datetime.now(),
}
usage_collection.insert_one(usage_data)
def gen_token_usage(func):
def wrapper(self, model, messages, stream, **kwargs):
for message in messages:
self.token_usage["prompt_tokens"] += num_tokens_from_string(message["content"])
result = func(self, model, messages, stream, **kwargs)
self.token_usage["generated_tokens"] += num_tokens_from_string(result)
update_token_usage(self.user_api_key, self.token_usage)
return result
return wrapper
def stream_token_usage(func):
def wrapper(self, model, messages, stream, **kwargs):
for message in messages:
self.token_usage["prompt_tokens"] += num_tokens_from_string(message["content"])
batch = []
result = func(self, model, messages, stream, **kwargs)
for r in result:
batch.append(r)
yield r
for line in batch:
self.token_usage["generated_tokens"] += num_tokens_from_string(line)
update_token_usage(self.user_api_key, self.token_usage)
return wrapper

41
application/utils.py Normal file
View File

@@ -0,0 +1,41 @@
import tiktoken
from flask import jsonify, make_response
_encoding = None
def get_encoding():
global _encoding
if _encoding is None:
_encoding = tiktoken.get_encoding("cl100k_base")
return _encoding
def num_tokens_from_string(string: str) -> int:
encoding = get_encoding()
num_tokens = len(encoding.encode(string))
return num_tokens
def count_tokens_docs(docs):
docs_content = ""
for doc in docs:
docs_content += doc.page_content
tokens = num_tokens_from_string(docs_content)
return tokens
def check_required_fields(data, required_fields):
missing_fields = [field for field in required_fields if field not in data]
if missing_fields:
return make_response(
jsonify(
{
"success": False,
"message": f"Missing fields: {', '.join(missing_fields)}",
}
),
400,
)
return None

View File

@@ -1,13 +1,55 @@
from abc import ABC, abstractmethod
import os
from langchain.embeddings import (
OpenAIEmbeddings,
HuggingFaceEmbeddings,
CohereEmbeddings,
HuggingFaceInstructEmbeddings,
)
from sentence_transformers import SentenceTransformer
from langchain_openai import OpenAIEmbeddings
from application.core.settings import settings
class EmbeddingsWrapper:
def __init__(self, model_name, *args, **kwargs):
self.model = SentenceTransformer(model_name, config_kwargs={'allow_dangerous_deserialization': True}, *args, **kwargs)
self.dimension = self.model.get_sentence_embedding_dimension()
def embed_query(self, query: str):
return self.model.encode(query).tolist()
def embed_documents(self, documents: list):
return self.model.encode(documents).tolist()
def __call__(self, text):
if isinstance(text, str):
return self.embed_query(text)
elif isinstance(text, list):
return self.embed_documents(text)
else:
raise ValueError("Input must be a string or a list of strings")
class EmbeddingsSingleton:
_instances = {}
@staticmethod
def get_instance(embeddings_name, *args, **kwargs):
if embeddings_name not in EmbeddingsSingleton._instances:
EmbeddingsSingleton._instances[embeddings_name] = EmbeddingsSingleton._create_instance(
embeddings_name, *args, **kwargs
)
return EmbeddingsSingleton._instances[embeddings_name]
@staticmethod
def _create_instance(embeddings_name, *args, **kwargs):
embeddings_factory = {
"openai_text-embedding-ada-002": OpenAIEmbeddings,
"huggingface_sentence-transformers/all-mpnet-base-v2": lambda: EmbeddingsWrapper("sentence-transformers/all-mpnet-base-v2"),
"huggingface_sentence-transformers-all-mpnet-base-v2": lambda: EmbeddingsWrapper("sentence-transformers/all-mpnet-base-v2"),
"huggingface_hkunlp/instructor-large": lambda: EmbeddingsWrapper("hkunlp/instructor-large"),
}
if embeddings_name in embeddings_factory:
return embeddings_factory[embeddings_name](*args, **kwargs)
else:
return EmbeddingsWrapper(embeddings_name, *args, **kwargs)
class BaseVectorStore(ABC):
def __init__(self):
pass
@@ -20,32 +62,28 @@ class BaseVectorStore(ABC):
return settings.OPENAI_API_BASE and settings.OPENAI_API_VERSION and settings.AZURE_DEPLOYMENT_NAME
def _get_embeddings(self, embeddings_name, embeddings_key=None):
embeddings_factory = {
"openai_text-embedding-ada-002": OpenAIEmbeddings,
"huggingface_sentence-transformers/all-mpnet-base-v2": HuggingFaceEmbeddings,
"huggingface_hkunlp/instructor-large": HuggingFaceInstructEmbeddings,
"cohere_medium": CohereEmbeddings
}
if embeddings_name not in embeddings_factory:
raise ValueError(f"Invalid embeddings_name: {embeddings_name}")
if embeddings_name == "openai_text-embedding-ada-002":
if self.is_azure_configured():
os.environ["OPENAI_API_TYPE"] = "azure"
embedding_instance = embeddings_factory[embeddings_name](
embedding_instance = EmbeddingsSingleton.get_instance(
embeddings_name,
model=settings.AZURE_EMBEDDINGS_DEPLOYMENT_NAME
)
else:
embedding_instance = embeddings_factory[embeddings_name](
embedding_instance = EmbeddingsSingleton.get_instance(
embeddings_name,
openai_api_key=embeddings_key
)
elif embeddings_name == "cohere_medium":
embedding_instance = embeddings_factory[embeddings_name](
cohere_api_key=embeddings_key
)
elif embeddings_name == "huggingface_sentence-transformers/all-mpnet-base-v2":
if os.path.exists("./model/all-mpnet-base-v2"):
embedding_instance = EmbeddingsSingleton.get_instance(
embeddings_name="./model/all-mpnet-base-v2",
)
else:
embedding_instance = EmbeddingsSingleton.get_instance(
embeddings_name,
)
else:
embedding_instance = embeddings_factory[embeddings_name]()
return embedding_instance
embedding_instance = EmbeddingsSingleton.get_instance(embeddings_name)
return embedding_instance

View File

@@ -0,0 +1,8 @@
class Document(str):
"""Class for storing a piece of text and associated metadata."""
def __new__(cls, page_content: str, metadata: dict):
instance = super().__new__(cls, page_content)
instance.page_content = page_content
instance.metadata = metadata
return instance

View File

@@ -1,25 +1,17 @@
from application.vectorstore.base import BaseVectorStore
from application.core.settings import settings
from application.vectorstore.document_class import Document
import elasticsearch
class Document(str):
"""Class for storing a piece of text and associated metadata."""
def __new__(cls, page_content: str, metadata: dict):
instance = super().__new__(cls, page_content)
instance.page_content = page_content
instance.metadata = metadata
return instance
class ElasticsearchStore(BaseVectorStore):
_es_connection = None # Class attribute to hold the Elasticsearch connection
def __init__(self, path, embeddings_key, index_name=settings.ELASTIC_INDEX):
def __init__(self, source_id, embeddings_key, index_name=settings.ELASTIC_INDEX):
super().__init__()
self.path = path.replace("application/indexes/", "").rstrip("/")
self.source_id = source_id.replace("application/indexes/", "").rstrip("/")
self.embeddings_key = embeddings_key
self.index_name = index_name
@@ -89,7 +81,7 @@ class ElasticsearchStore(BaseVectorStore):
embeddings = self._get_embeddings(settings.EMBEDDINGS_NAME, self.embeddings_key)
vector = embeddings.embed_query(question)
knn = {
"filter": [{"match": {"metadata.store.keyword": self.path}}],
"filter": [{"match": {"metadata.source_id.keyword": self.source_id}}],
"field": "vector",
"k": k,
"num_candidates": 100,
@@ -108,7 +100,7 @@ class ElasticsearchStore(BaseVectorStore):
}
}
],
"filter": [{"match": {"metadata.store.keyword": self.path}}],
"filter": [{"match": {"metadata.source_id.keyword": self.source_id}}],
}
},
"rank": {"rrf": {}},
@@ -217,5 +209,4 @@ class ElasticsearchStore(BaseVectorStore):
def delete_index(self):
self._es_connection.delete_by_query(index=self.index_name, query={"match": {
"metadata.store.keyword": self.path}},)
"metadata.source_id.keyword": self.source_id}},)

View File

@@ -1,12 +1,22 @@
from langchain.vectorstores import FAISS
from langchain_community.vectorstores import FAISS
from application.vectorstore.base import BaseVectorStore
from application.core.settings import settings
import os
def get_vectorstore(path):
if path:
vectorstore = "indexes/"+path
vectorstore = os.path.join("application", vectorstore)
else:
vectorstore = os.path.join("application")
return vectorstore
class FaissStore(BaseVectorStore):
def __init__(self, path, embeddings_key, docs_init=None):
def __init__(self, source_id, embeddings_key, docs_init=None):
super().__init__()
self.path = path
self.path = get_vectorstore(source_id)
embeddings = self._get_embeddings(settings.EMBEDDINGS_NAME, embeddings_key)
if docs_init:
self.docsearch = FAISS.from_documents(
@@ -14,7 +24,8 @@ class FaissStore(BaseVectorStore):
)
else:
self.docsearch = FAISS.load_local(
self.path, embeddings
self.path, embeddings,
allow_dangerous_deserialization=True
)
self.assert_embedding_dimensions(embeddings)
@@ -37,10 +48,10 @@ class FaissStore(BaseVectorStore):
"""
if settings.EMBEDDINGS_NAME == "huggingface_sentence-transformers/all-mpnet-base-v2":
try:
word_embedding_dimension = embeddings.client[1].word_embedding_dimension
word_embedding_dimension = embeddings.dimension
except AttributeError as e:
raise AttributeError("word_embedding_dimension not found in embeddings.client[1]") from e
raise AttributeError("'dimension' attribute not found in embeddings instance. Make sure the embeddings object is properly initialized.") from e
docsearch_index_dimension = self.docsearch.index.d
if word_embedding_dimension != docsearch_index_dimension:
raise ValueError(f"word_embedding_dimension ({word_embedding_dimension}) " +
f"!= docsearch_index_word_embedding_dimension ({docsearch_index_dimension})")
raise ValueError(f"Embedding dimension mismatch: embeddings.dimension ({word_embedding_dimension}) " +
f"!= docsearch index dimension ({docsearch_index_dimension})")

View File

@@ -0,0 +1,37 @@
from typing import List, Optional
from uuid import uuid4
from application.core.settings import settings
from application.vectorstore.base import BaseVectorStore
class MilvusStore(BaseVectorStore):
def __init__(self, path: str = "", embeddings_key: str = "embeddings"):
super().__init__()
from langchain_milvus import Milvus
connection_args = {
"uri": settings.MILVUS_URI,
"token": settings.MILVUS_TOKEN,
}
self._docsearch = Milvus(
embedding_function=self._get_embeddings(settings.EMBEDDINGS_NAME, embeddings_key),
collection_name=settings.MILVUS_COLLECTION_NAME,
connection_args=connection_args,
)
self._path = path
def search(self, question, k=2, *args, **kwargs):
return self._docsearch.similarity_search(query=question, k=k, filter={"path": self._path} *args, **kwargs)
def add_texts(self, texts: List[str], metadatas: Optional[List[dict]], *args, **kwargs):
ids = [str(uuid4()) for _ in range(len(texts))]
return self._docsearch.add_texts(texts=texts, metadatas=metadatas, ids=ids, *args, **kwargs)
def save_local(self, *args, **kwargs):
pass
def delete_index(self, *args, **kwargs):
pass

View File

@@ -0,0 +1,126 @@
from application.core.settings import settings
from application.vectorstore.base import BaseVectorStore
from application.vectorstore.document_class import Document
class MongoDBVectorStore(BaseVectorStore):
def __init__(
self,
source_id: str = "",
embeddings_key: str = "embeddings",
collection: str = "documents",
index_name: str = "vector_search_index",
text_key: str = "text",
embedding_key: str = "embedding",
database: str = "docsgpt",
):
self._index_name = index_name
self._text_key = text_key
self._embedding_key = embedding_key
self._embeddings_key = embeddings_key
self._mongo_uri = settings.MONGO_URI
self._source_id = source_id.replace("application/indexes/", "").rstrip("/")
self._embedding = self._get_embeddings(settings.EMBEDDINGS_NAME, embeddings_key)
try:
import pymongo
except ImportError:
raise ImportError(
"Could not import pymongo python package. "
"Please install it with `pip install pymongo`."
)
self._client = pymongo.MongoClient(self._mongo_uri)
self._database = self._client[database]
self._collection = self._database[collection]
def search(self, question, k=2, *args, **kwargs):
query_vector = self._embedding.embed_query(question)
pipeline = [
{
"$vectorSearch": {
"queryVector": query_vector,
"path": self._embedding_key,
"limit": k,
"numCandidates": k * 10,
"index": self._index_name,
"filter": {"source_id": {"$eq": self._source_id}},
}
}
]
cursor = self._collection.aggregate(pipeline)
results = []
for doc in cursor:
text = doc[self._text_key]
doc.pop("_id")
doc.pop(self._text_key)
doc.pop(self._embedding_key)
metadata = doc
results.append(Document(text, metadata))
return results
def _insert_texts(self, texts, metadatas):
if not texts:
return []
embeddings = self._embedding.embed_documents(texts)
to_insert = [
{self._text_key: t, self._embedding_key: embedding, **m}
for t, m, embedding in zip(texts, metadatas, embeddings)
]
insert_result = self._collection.insert_many(to_insert)
return insert_result.inserted_ids
def add_texts(
self,
texts,
metadatas=None,
ids=None,
refresh_indices=True,
create_index_if_not_exists=True,
bulk_kwargs=None,
**kwargs,
):
# dims = self._embedding.client[1].word_embedding_dimension
# # check if index exists
# if create_index_if_not_exists:
# # check if index exists
# info = self._collection.index_information()
# if self._index_name not in info:
# index_mongo = {
# "fields": [{
# "type": "vector",
# "path": self._embedding_key,
# "numDimensions": dims,
# "similarity": "cosine",
# },
# {
# "type": "filter",
# "path": "store"
# }]
# }
# self._collection.create_index(self._index_name, index_mongo)
batch_size = 100
_metadatas = metadatas or ({} for _ in texts)
texts_batch = []
metadatas_batch = []
result_ids = []
for i, (text, metadata) in enumerate(zip(texts, _metadatas)):
texts_batch.append(text)
metadatas_batch.append(metadata)
if (i + 1) % batch_size == 0:
result_ids.extend(self._insert_texts(texts_batch, metadatas_batch))
texts_batch = []
metadatas_batch = []
if texts_batch:
result_ids.extend(self._insert_texts(texts_batch, metadatas_batch))
return result_ids
def delete_index(self, *args, **kwargs):
self._collection.delete_many({"source_id": self._source_id})

View File

@@ -0,0 +1,47 @@
from langchain_community.vectorstores.qdrant import Qdrant
from application.vectorstore.base import BaseVectorStore
from application.core.settings import settings
from qdrant_client import models
class QdrantStore(BaseVectorStore):
def __init__(self, source_id: str = "", embeddings_key: str = "embeddings"):
self._filter = models.Filter(
must=[
models.FieldCondition(
key="metadata.source_id",
match=models.MatchValue(value=source_id.replace("application/indexes/", "").rstrip("/")),
)
]
)
self._docsearch = Qdrant.construct_instance(
["TEXT_TO_OBTAIN_EMBEDDINGS_DIMENSION"],
embedding=self._get_embeddings(settings.EMBEDDINGS_NAME, embeddings_key),
collection_name=settings.QDRANT_COLLECTION_NAME,
location=settings.QDRANT_LOCATION,
url=settings.QDRANT_URL,
port=settings.QDRANT_PORT,
grpc_port=settings.QDRANT_GRPC_PORT,
https=settings.QDRANT_HTTPS,
prefer_grpc=settings.QDRANT_PREFER_GRPC,
api_key=settings.QDRANT_API_KEY,
prefix=settings.QDRANT_PREFIX,
timeout=settings.QDRANT_TIMEOUT,
path=settings.QDRANT_PATH,
distance_func=settings.QDRANT_DISTANCE_FUNC,
)
def search(self, *args, **kwargs):
return self._docsearch.similarity_search(filter=self._filter, *args, **kwargs)
def add_texts(self, *args, **kwargs):
return self._docsearch.add_texts(*args, **kwargs)
def save_local(self, *args, **kwargs):
pass
def delete_index(self, *args, **kwargs):
return self._docsearch.client.delete(
collection_name=settings.QDRANT_COLLECTION_NAME, points_selector=self._filter
)

View File

@@ -1,11 +1,17 @@
from application.vectorstore.faiss import FaissStore
from application.vectorstore.elasticsearch import ElasticsearchStore
from application.vectorstore.milvus import MilvusStore
from application.vectorstore.mongodb import MongoDBVectorStore
from application.vectorstore.qdrant import QdrantStore
class VectorCreator:
vectorstores = {
'faiss': FaissStore,
'elasticsearch':ElasticsearchStore
"faiss": FaissStore,
"elasticsearch": ElasticsearchStore,
"mongodb": MongoDBVectorStore,
"qdrant": QdrantStore,
"milvus": MilvusStore,
}
@classmethod
@@ -13,4 +19,4 @@ class VectorCreator:
vectorstore_class = cls.vectorstores.get(type.lower())
if not vectorstore_class:
raise ValueError(f"No vectorstore class found for type {type}")
return vectorstore_class(*args, **kwargs)
return vectorstore_class(*args, **kwargs)

277
application/worker.py Normal file → Executable file
View File

@@ -1,39 +1,74 @@
import logging
import os
import shutil
import string
import zipfile
from collections import Counter
from urllib.parse import urljoin
import nltk
import requests
from bson.objectid import ObjectId
from pymongo import MongoClient
from application.core.settings import settings
from application.parser.file.bulk import SimpleDirectoryReader
from application.parser.open_ai_func import call_openai_api
from application.parser.remote.remote_creator import RemoteCreator
from application.parser.schema.base import Document
from application.parser.token_func import group_split
from application.utils import count_tokens_docs
try:
nltk.download('punkt', quiet=True)
nltk.download('averaged_perceptron_tagger', quiet=True)
except FileExistsError:
pass
mongo = MongoClient(settings.MONGO_URI)
db = mongo["docsgpt"]
sources_collection = db["sources"]
# Define a function to extract metadata from a given filename.
def metadata_from_filename(title):
store = '/'.join(title.split('/')[1:3])
return {'title': title, 'store': store}
return {"title": title}
# Define a function to generate a random string of a given length.
def generate_random_string(length):
return ''.join([string.ascii_letters[i % 52] for i in range(length)])
return "".join([string.ascii_letters[i % 52] for i in range(length)])
current_dir = os.path.dirname(
os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
)
def extract_zip_recursive(zip_path, extract_to, current_depth=0, max_depth=5):
"""
Recursively extract zip files with a limit on recursion depth.
Args:
zip_path (str): Path to the zip file to be extracted.
extract_to (str): Destination path for extracted files.
current_depth (int): Current depth of recursion.
max_depth (int): Maximum allowed depth of recursion to prevent infinite loops.
"""
if current_depth > max_depth:
logging.warning(f"Reached maximum recursion depth of {max_depth}")
return
with zipfile.ZipFile(zip_path, "r") as zip_ref:
zip_ref.extractall(extract_to)
os.remove(zip_path) # Remove the zip file after extracting
# Check for nested zip files and extract them
for root, dirs, files in os.walk(extract_to):
for file in files:
if file.endswith(".zip"):
# If a nested zip file is found, extract it recursively
file_path = os.path.join(root, file)
extract_zip_recursive(file_path, root, current_depth + 1, max_depth)
current_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
# Define the main function for ingesting and processing documents.
def ingest_worker(self, directory, formats, name_job, filename, user):
def ingest_worker(
self, directory, formats, name_job, filename, user, retriever="classic"
):
"""
Ingest and process documents.
@@ -44,6 +79,7 @@ def ingest_worker(self, directory, formats, name_job, filename, user):
name_job (str): Name of the job for this ingestion task.
filename (str): Name of the file to be ingested.
user (str): Identifier for the user initiating the ingestion.
retriever (str): Type of retriever to use for processing the documents.
Returns:
dict: Information about the completed ingestion task, including input parameters and a "limited" flag.
@@ -61,63 +97,216 @@ def ingest_worker(self, directory, formats, name_job, filename, user):
token_check = True
min_tokens = 150
max_tokens = 1250
full_path = directory + '/' + user + '/' + name_job
import sys
print(full_path, file=sys.stderr)
recursion_depth = 2
full_path = os.path.join(directory, user, name_job)
logging.info(f"Ingest file: {full_path}", extra={"user": user, "job": name_job})
# check if API_URL env variable is set
file_data = {'name': name_job, 'file': filename, 'user': user}
response = requests.get(urljoin(settings.API_URL, "/api/download"), params=file_data)
# check if file is in the response
print(response, file=sys.stderr)
file_data = {"name": name_job, "file": filename, "user": user}
response = requests.get(
urljoin(settings.API_URL, "/api/download"), params=file_data
)
file = response.content
if not os.path.exists(full_path):
os.makedirs(full_path)
with open(full_path + '/' + filename, 'wb') as f:
with open(os.path.join(full_path, filename), "wb") as f:
f.write(file)
# check if file is .zip and extract it
if filename.endswith('.zip'):
with zipfile.ZipFile(full_path + '/' + filename, 'r') as zip_ref:
zip_ref.extractall(full_path)
os.remove(full_path + '/' + filename)
if filename.endswith(".zip"):
extract_zip_recursive(
os.path.join(full_path, filename), full_path, 0, recursion_depth
)
self.update_state(state='PROGRESS', meta={'current': 1})
self.update_state(state="PROGRESS", meta={"current": 1})
raw_docs = SimpleDirectoryReader(input_dir=full_path, input_files=input_files, recursive=recursive,
required_exts=formats, num_files_limit=limit,
exclude_hidden=exclude, file_metadata=metadata_from_filename).load_data()
raw_docs = group_split(documents=raw_docs, min_tokens=min_tokens, max_tokens=max_tokens, token_check=token_check)
raw_docs = SimpleDirectoryReader(
input_dir=full_path,
input_files=input_files,
recursive=recursive,
required_exts=formats,
num_files_limit=limit,
exclude_hidden=exclude,
file_metadata=metadata_from_filename,
).load_data()
raw_docs = group_split(
documents=raw_docs,
min_tokens=min_tokens,
max_tokens=max_tokens,
token_check=token_check,
)
docs = [Document.to_langchain_format(raw_doc) for raw_doc in raw_docs]
id = ObjectId()
call_openai_api(docs, full_path, self)
self.update_state(state='PROGRESS', meta={'current': 100})
call_openai_api(docs, full_path, id, self)
tokens = count_tokens_docs(docs)
self.update_state(state="PROGRESS", meta={"current": 100})
if sample:
for i in range(min(5, len(raw_docs))):
print(raw_docs[i].text)
logging.info(f"Sample document {i}: {raw_docs[i]}")
# get files from outputs/inputs/index.faiss and outputs/inputs/index.pkl
# and send them to the server (provide user and name in form)
file_data = {'name': name_job, 'user': user}
file_data = {
"name": name_job,
"user": user,
"tokens": tokens,
"retriever": retriever,
"id": str(id),
"type": "local",
}
if settings.VECTOR_STORE == "faiss":
files = {'file_faiss': open(full_path + '/index.faiss', 'rb'),
'file_pkl': open(full_path + '/index.pkl', 'rb')}
response = requests.post(urljoin(settings.API_URL, "/api/upload_index"), files=files, data=file_data)
response = requests.get(urljoin(settings.API_URL, "/api/delete_old?path=" + full_path))
files = {
"file_faiss": open(full_path + "/index.faiss", "rb"),
"file_pkl": open(full_path + "/index.pkl", "rb"),
}
response = requests.post(
urljoin(settings.API_URL, "/api/upload_index"), files=files, data=file_data
)
else:
response = requests.post(urljoin(settings.API_URL, "/api/upload_index"), data=file_data)
response = requests.post(
urljoin(settings.API_URL, "/api/upload_index"), data=file_data
)
# delete local
shutil.rmtree(full_path)
return {
'directory': directory,
'formats': formats,
'name_job': name_job,
'filename': filename,
'user': user,
'limited': False
"directory": directory,
"formats": formats,
"name_job": name_job,
"filename": filename,
"user": user,
"limited": False,
}
def remote_worker(
self,
source_data,
name_job,
user,
loader,
directory="temp",
retriever="classic",
sync_frequency="never",
operation_mode="upload",
doc_id=None,
):
token_check = True
min_tokens = 150
max_tokens = 1250
full_path = directory + "/" + user + "/" + name_job
if not os.path.exists(full_path):
os.makedirs(full_path)
self.update_state(state="PROGRESS", meta={"current": 1})
logging.info(
f"Remote job: {full_path}",
extra={"user": user, "job": name_job, source_data: source_data},
)
remote_loader = RemoteCreator.create_loader(loader)
raw_docs = remote_loader.load_data(source_data)
docs = group_split(
documents=raw_docs,
min_tokens=min_tokens,
max_tokens=max_tokens,
token_check=token_check,
)
# docs = [Document.to_langchain_format(raw_doc) for raw_doc in raw_docs]
tokens = count_tokens_docs(docs)
if operation_mode == "upload":
id = ObjectId()
call_openai_api(docs, full_path, id, self)
elif operation_mode == "sync":
if not doc_id or not ObjectId.is_valid(doc_id):
raise ValueError("doc_id must be provided for sync operation.")
id = ObjectId(doc_id)
call_openai_api(docs, full_path, id, self)
self.update_state(state="PROGRESS", meta={"current": 100})
# Proceed with uploading and cleaning as in the original function
file_data = {
"name": name_job,
"user": user,
"tokens": tokens,
"retriever": retriever,
"id": str(id),
"type": loader,
"remote_data": source_data,
"sync_frequency": sync_frequency,
}
if settings.VECTOR_STORE == "faiss":
files = {
"file_faiss": open(full_path + "/index.faiss", "rb"),
"file_pkl": open(full_path + "/index.pkl", "rb"),
}
requests.post(
urljoin(settings.API_URL, "/api/upload_index"), files=files, data=file_data
)
else:
requests.post(urljoin(settings.API_URL, "/api/upload_index"), data=file_data)
shutil.rmtree(full_path)
return {"urls": source_data, "name_job": name_job, "user": user, "limited": False}
def sync(
self,
source_data,
name_job,
user,
loader,
sync_frequency,
retriever,
doc_id=None,
directory="temp",
):
try:
remote_worker(
self,
source_data,
name_job,
user,
loader,
directory,
retriever,
sync_frequency,
"sync",
doc_id,
)
except Exception as e:
return {"status": "error", "error": str(e)}
return {"status": "success"}
def sync_worker(self, frequency):
sync_counts = Counter()
sources = sources_collection.find()
for doc in sources:
if doc.get("sync_frequency") == frequency:
name = doc.get("name")
user = doc.get("user")
source_type = doc.get("type")
source_data = doc.get("remote_data")
retriever = doc.get("retriever")
doc_id = str(doc.get("_id"))
resp = sync(
self, source_data, name, user, source_type, frequency, retriever, doc_id
)
sync_counts["total_sync_count"] += 1
sync_counts[
"sync_success" if resp["status"] == "success" else "sync_failure"
] += 1
return {
key: sync_counts[key]
for key in ["total_sync_count", "sync_success", "sync_failure"]
}

View File

@@ -1,4 +1,5 @@
from application.app import app
from application.core.settings import settings
if __name__ == "__main__":
app.run(debug=True, port=7091)
app.run(debug=settings.FLASK_DEBUG_MODE, port=7091)

View File

@@ -1,5 +1,3 @@
version: "3.9"
services:
frontend:
build: ./frontend

View File

@@ -1,5 +1,3 @@
version: "3.9"
services:
redis:

View File

@@ -1,8 +1,8 @@
version: "3.9"
services:
frontend:
build: ./frontend
volumes:
- ./frontend/src:/app/src
environment:
- VITE_API_HOST=http://localhost:7091
- VITE_API_STREAMING=$VITE_API_STREAMING

View File

@@ -1,5 +1,3 @@
version: "3.9"
services:
frontend:
build: ./frontend

View File

@@ -1,8 +1,8 @@
version: "3.9"
services:
frontend:
build: ./frontend
volumes:
- ./frontend/src:/app/src
environment:
- VITE_API_HOST=http://localhost:7091
- VITE_API_STREAMING=$VITE_API_STREAMING
@@ -32,7 +32,7 @@ services:
worker:
build: ./application
command: celery -A application.app.celery worker -l INFO
command: celery -A application.app.celery worker -l INFO -B
environment:
- API_KEY=$API_KEY
- EMBEDDINGS_KEY=$API_KEY

View File

@@ -1,9 +1,9 @@
const withNextra = require('nextra')({
theme: 'nextra-theme-docs',
themeConfig: './theme.config.jsx'
})
theme: 'nextra-theme-docs',
themeConfig: './theme.config.jsx'
})
module.exports = withNextra()
module.exports = withNextra()
// If you have other Next.js configurations, you can pass them as the parameter:
// module.exports = withNextra({ /* other next.js config */ })
// If you have other Next.js configurations, you can pass them as the parameter:
// module.exports = withNextra({ /* other next.js config */ })

7595
docs/package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,17 +1,17 @@
{
"scripts":{
"dev": "next dev",
"scripts": {
"dev": "next dev",
"build": "next build",
"start": "next start"
},
"license": "MIT",
"license": "MIT",
"dependencies": {
"@vercel/analytics": "^1.0.2",
"docsgpt": "^0.2.4",
"next": "^13.4.19",
"nextra": "^2.12.3",
"nextra-theme-docs": "^2.12.3",
"@vercel/analytics": "^1.1.1",
"docsgpt": "^0.4.1",
"next": "^14.2.12",
"nextra": "^2.13.2",
"nextra-theme-docs": "^2.13.2",
"react": "^18.2.0",
"react-dom": "^18.2.0"
}
}
}

View File

@@ -227,3 +227,124 @@ JSON response indicating the status of the operation:
```json
{ "status": "ok" }
```
### 7. /api/get_api_keys
**Description:**
The endpoint retrieves a list of API keys for the user.
**Request:**
**Method**: `GET`
**Sample JavaScript Fetch Request:**
```js
// get_api_keys (GET http://127.0.0.1:5000/api/get_api_keys)
fetch("http://localhost:5001/api/get_api_keys", {
"method": "GET",
"headers": {
"Content-Type": "application/json; charset=utf-8"
},
})
.then((res) => res.text())
.then(console.log.bind(console))
```
**Response:**
JSON response with a list of created API keys:
```json
[
{
"id": "string",
"name": "string",
"key": "string",
"source": "string"
},
...
]
```
### 8. /api/create_api_key
**Description:**
Create a new API key for the user.
**Request:**
**Method**: `POST`
**Headers**: Content-Type should be set to `application/json; charset=utf-8`
**Request Body**: JSON object with the following fields:
* `name` — A name for the API key.
* `source` — The source documents that will be used.
* `prompt_id` — The prompt ID.
* `chunks` — The number of chunks used to process an answer.
Here is a JavaScript Fetch Request example:
```js
// create_api_key (POST http://127.0.0.1:5000/api/create_api_key)
fetch("http://127.0.0.1:5000/api/create_api_key", {
"method": "POST",
"headers": {
"Content-Type": "application/json; charset=utf-8"
},
"body": JSON.stringify({"name":"Example Key Name",
"source":"Example Source",
"prompt_id":"creative",
"chunks":"2"})
})
.then((res) => res.json())
.then(console.log.bind(console))
```
**Response**
In response, you will get a JSON document containing the `id` and `key`:
```json
{
"id": "string",
"key": "string"
}
```
### 9. /api/delete_api_key
**Description:**
Delete an API key for the user.
**Request:**
**Method**: `POST`
**Headers**: Content-Type should be set to `application/json; charset=utf-8`
**Request Body**: JSON object with the field:
* `id` — The unique identifier of the API key to be deleted.
Here is a JavaScript Fetch Request example:
```js
// delete_api_key (POST http://127.0.0.1:5000/api/delete_api_key)
fetch("http://127.0.0.1:5000/api/delete_api_key", {
"method": "POST",
"headers": {
"Content-Type": "application/json; charset=utf-8"
},
"body": JSON.stringify({"id":"API_KEY_ID"})
})
.then((res) => res.json())
.then(console.log.bind(console))
```
**Response:**
In response, you will get a JSON document indicating the status of the operation:
```json
{
"status": "ok"
}
```

10
docs/pages/API/_meta.json Normal file
View File

@@ -0,0 +1,10 @@
{
"API-docs": {
"title": "🗂️️ API-docs",
"href": "/API/API-docs"
},
"api-key-guide": {
"title": "🔐 API Keys guide",
"href": "/API/api-key-guide"
}
}

View File

@@ -0,0 +1,30 @@
## Guide to DocsGPT API Keys
DocsGPT API keys are essential for developers and users who wish to integrate the DocsGPT models into external applications, such as the our widget. This guide will walk you through the steps of obtaining an API key, starting from uploading your document to understanding the key variables associated with API keys.
### Uploading Your Document
Before creating your first API key, you must upload the document that will be linked to this key. You can upload your document through two methods:
- **GUI Web App Upload:** A user-friendly graphical interface that allows for easy upload and management of documents.
- **Using `/api/upload` Method:** For users comfortable with API calls, this method provides a direct way to upload documents.
### Obtaining Your API Key
After uploading your document, you can obtain an API key either through the graphical user interface or via an API call:
- **Graphical User Interface:** Navigate to the Settings section of the DocsGPT web app, find the API Keys option, and press 'Create New' to generate your key.
- **API Call:** Alternatively, you can use the `/api/create_api_key` endpoint to create a new API key. For detailed instructions, visit [DocsGPT API Documentation](https://docs.docsgpt.cloud/API/API-docs#8-apicreate_api_key).
### Understanding Key Variables
Upon creating your API key, you will encounter several key variables. Each serves a specific purpose:
- **Name:** Assign a name to your API key for easy identification.
- **Source:** Indicates the source document(s) linked to your API key, which DocsGPT will use to generate responses.
- **ID:** A unique identifier for your API key. You can view this by making a call to `/api/get_api_keys`.
- **Key:** The API key itself, which will be used in your application to authenticate API requests.
With your API key ready, you can now integrate DocsGPT into your application, such as the DocsGPT Widget or any other software, via `/api/answer` or `/stream` endpoints. The source document is preset with the API key, allowing you to bypass fields like `selectDocs` and `active_docs` during implementation.
Congratulations on taking the first step towards enhancing your applications with DocsGPT! With this guide, you're now equipped to navigate the process of obtaining and understanding DocsGPT API keys.

View File

@@ -107,3 +107,4 @@ Your instance is now available at your Public IP Address on port 5173. Enjoy usi
- [Deploy DocsGPT on Civo Compute Cloud](https://dev.to/rutamhere/deploying-docsgpt-on-civo-compute-c)
- [Deploy DocsGPT on DigitalOcean Droplet](https://dev.to/rutamhere/deploying-docsgpt-on-digitalocean-droplet-50ea)
- [Deploy DocsGPT on Kamatera Performance Cloud](https://dev.to/rutamhere/deploying-docsgpt-on-kamatera-performance-cloud-1bj)

View File

@@ -0,0 +1,100 @@
# Self-hosting DocsGPT on Kubernetes
This guide will walk you through deploying DocsGPT on Kubernetes.
## Prerequisites
Ensure you have the following installed before proceeding:
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- Access to a Kubernetes cluster
## Folder Structure
The `k8s` folder contains the necessary deployment and service configuration files:
- `deployments/`
- `services/`
- `docsgpt-secrets.yaml`
## Deployment Instructions
1. **Clone the Repository**
```sh
git clone https://github.com/arc53/DocsGPT.git
cd docsgpt/k8s
```
2. **Configure Secrets (optional)**
Ensure that you have all the necessary secrets in `docsgpt-secrets.yaml`. Update it with your secrets before applying if you want. By default we will use qdrant as a vectorstore and public docsgpt llm as llm for inference.
3. **Apply Kubernetes Deployments**
Deploy your DocsGPT resources using the following commands:
```sh
kubectl apply -f deployments/
```
4. **Apply Kubernetes Services**
Set up your services using the following commands:
```sh
kubectl apply -f services/
```
5. **Apply Secrets**
Apply the secret configurations:
```sh
kubectl apply -f docsgpt-secrets.yaml
```
6. **Substitute API URL**
After deploying the services, you need to update the environment variable `VITE_API_HOST` in your deployment file `deployments/docsgpt-deploy.yaml` with the actual endpoint URL created by your `docsgpt-api-service`.
```sh
kubectl get services/docsgpt-api-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}' | xargs -I {} sed -i "s|<your-api-endpoint>|{}|g" deployments/docsgpt-deploy.yaml
```
7. **Rerun Deployment**
After making the changes, reapply the deployment configuration to update the environment variables:
```sh
kubectl apply -f deployments/
```
## Verifying the Deployment
To verify if everything is set up correctly, you can run the following:
```sh
kubectl get pods
kubectl get services
```
Ensure that the pods are running and the services are available.
## Accessing DocsGPT
To access DocsGPT, you need to find the external IP address of the frontend service. You can do this by running:
```sh
kubectl get services/docsgpt-frontend-service | awk 'NR>1 {print "http://" $4}'
```
## Troubleshooting
If you encounter any issues, you can check the logs of the pods for more details:
```sh
kubectl logs <pod-name>
```
Replace `<pod-name>` with the actual name of your DocsGPT pod.

View File

@@ -8,7 +8,7 @@ Just run the following command:
./setup.sh
```
This command will install all the necessary dependencies and provide you with an option to download the local model or use OpenAI.
This command will install all the necessary dependencies and provide you with an option to use our LLM API, download the local model or use OpenAI.
If you prefer to follow manual steps, refer to this guide:
@@ -16,7 +16,7 @@ If you prefer to follow manual steps, refer to this guide:
```bash
git clone https://github.com/arc53/DocsGPT.git
```
2. Create a `.env` file in your root directory and set your `API_KEY` with your [OpenAI API key](https://platform.openai.com/account/api-keys).
2. Create a `.env` file in your root directory and set your `API_KEY` with your [OpenAI API key](https://platform.openai.com/account/api-keys). (optional in case you want to use OpenAI)
3. Run the following commands:
```bash
docker-compose build && docker-compose up
@@ -67,62 +67,3 @@ To run the setup on Windows, you have two options: using the Windows Subsystem f
These steps should help you set up and run the project on Windows using either WSL or Git Bash/Command Prompt.
**Important:** Ensure that Docker is installed and properly configured on your Windows system for these steps to work.
For WINDOWS:
To run the given setup on Windows, you can use the Windows Subsystem for Linux (WSL) or a Git Bash terminal to execute similar commands. Here are the steps adapted for Windows:
Option 1: Using Windows Subsystem for Linux (WSL):
1. Install WSL if you haven't already. You can follow the official Microsoft documentation for installation: (https://learn.microsoft.com/en-us/windows/wsl/install).
2. After setting up WSL, open the WSL terminal.
3. Clone the repository and create the `.env` file:
```bash
git clone https://github.com/arc53/DocsGPT.git
cd DocsGPT
echo "API_KEY=Yourkey" > .env
echo "VITE_API_STREAMING=true" >> .env
```
4. Run the following command to start the setup with Docker Compose:
```bash
./run-with-docker-compose.sh
```
5. Open your web browser and navigate to http://localhost:5173/.
6. To stop the setup, just press **Ctrl + C** in the WSL terminal.
Option 2: Using Git Bash or Command Prompt (CMD):
1. Install Git for Windows if you haven't already. You can download it from the official website: (https://gitforwindows.org/).
2. Open Git Bash or Command Prompt.
3. Clone the repository and create the `.env` file:
```bash
git clone https://github.com/arc53/DocsGPT.git
cd DocsGPT
echo "API_KEY=Yourkey" > .env
echo "VITE_API_STREAMING=true" >> .env
```
4. Run the following command to start the setup with Docker Compose:
```bash
./run-with-docker-compose.sh
```
5. Open your web browser and navigate to http://localhost:5173/.
6. To stop the setup, just press **Ctrl + C** in the Git Bash or Command Prompt terminal.
These steps should help you set up and run the project on Windows using either WSL or Git Bash/Command Prompt. Make sure you have Docker installed and properly configured on your Windows system for this to work.
### Chrome Extension
#### Installing the Chrome extension:
To enhance your DocsGPT experience, you can install the DocsGPT Chrome extension. Here's how:
1. In the DocsGPT GitHub repository, click on the **Code** button and select **Download ZIP**.
2. Unzip the downloaded file to a location you can easily access.
3. Open the Google Chrome browser and click on the three dots menu (upper right corner).
4. Select **More Tools** and then **Extensions**.
5. Turn on the **Developer mode** switch in the top right corner of the **Extensions page**.
6. Click on the **Load unpacked** button.
7. Select the **Chrome** folder where the DocsGPT files have been unzipped (docsgpt-main > extensions > chrome).
8. The extension should now be added to Google Chrome and can be managed on the Extensions page.
9. To disable or remove the extension, simply turn off the toggle switch on the extension card or click the **Remove** button.

View File

@@ -10,5 +10,9 @@
"Railway-Deploying": {
"title": "🚂Deploying on Railway",
"href": "/Deploying/Railway-Deploying"
},
"Kubernetes-Deploying": {
"title": "☸Deploying on Kubernetes",
"href": "/Deploying/Kubernetes-Deploying"
}
}
}

View File

@@ -1,6 +0,0 @@
{
"API-docs": {
"title": "🗂️️ API-docs",
"href": "/Developing/API-docs"
}
}

View File

@@ -0,0 +1,34 @@
import {Steps} from 'nextra/components'
import { Callout } from 'nextra/components'
## Chrome Extension Setup Guide
To enhance your DocsGPT experience, you can install the DocsGPT Chrome extension. Here's how:
<Steps >
### Step 1
In the DocsGPT GitHub repository, click on the **Code** button and select **Download ZIP**.
### Step 2
Unzip the downloaded file to a location you can easily access.
### Step 3
Open the Google Chrome browser and click on the three dots menu (upper right corner).
### Step 4
Select **More Tools** and then **Extensions**.
### Step 5
Turn on the **Developer mode** switch in the top right corner of the **Extensions page**.
### Step 6
Click on the **Load unpacked** button.
### Step 7
7. Select the **Chrome** folder where the DocsGPT files have been unzipped (docsgpt-main > extensions > chrome).
### Step 8
The extension should now be added to Google Chrome and can be managed on the Extensions page.
### Step 9
To disable or remove the extension, simply turn off the toggle switch on the extension card or click the **Remove** button.
</Steps>

View File

@@ -4,7 +4,11 @@
"href": "/Extensions/Chatwoot-extension"
},
"react-widget": {
"title": "🏗️ Widget setup",
"href": "/Extensions/react-widget"
}
"title": "🏗️ Widget setup",
"href": "/Extensions/react-widget"
},
"Chrome-extension": {
"title": "🌐 Chrome Extension",
"href": "/Extensions/Chrome-extension"
}
}

View File

@@ -10,7 +10,6 @@ First, make sure you have Node.js and npm installed in your project. Then go to
In the file where you want to use the widget, import it and include the CSS file:
```js
import { DocsGPTWidget } from "docsgpt";
import "docsgpt/dist/style.css";
```
@@ -18,20 +17,36 @@ Now, you can use the widget in your component like this :
```jsx
<DocsGPTWidget
apiHost="https://your-docsgpt-api.com"
selectDocs="local/docs.zip"
apiKey=""
avatar = "https://d3dg1063dc54p9.cloudfront.net/cute-docsgpt.png"
title = "Get AI assistance"
description = "DocsGPT's AI Chatbot is here to help"
heroTitle = "Welcome to DocsGPT !"
heroDescription="This chatbot is built with DocsGPT and utilises GenAI,
please review important information using sources."
theme = "dark"
buttonIcon = "https://your-icon"
buttonBg = "#222327"
/>
```
DocsGPTWidget takes 3 **props**:
To tailor the widget to your needs, you can configure the following props in your component:
1. `apiHost` — The URL of your DocsGPT API.
2. `selectDocs` — The documentation source that you want to use for your widget (e.g. `default` or `local/docs1.zip`).
2. `theme` — Allows to select your specific theme (dark or light).
3. `apiKey` — Usually, it's empty.
4. `avatar`: Specifies the URL of the avatar or image representing the chatbot.
5. `title`: Sets the title text displayed in the chatbot interface.
6. `description`: Provides a brief description of the chatbot's purpose or functionality.
7. `heroTitle`: Displays a welcome title when users interact with the chatbot.
8. `heroDescription`: Provide additional introductory text or information about the chatbot's capabilities.
9. `buttonIcon`: Specifies the url of the icon image for the widget.
10. `buttonBg`: Allows to specify the Background color of the widget.
11. `size`: Sets the size of the widget ( small, medium).
### How to use DocsGPTWidget with [Nextra](https://nextra.site/) (Next.js + MDX)
Install your widget as described above and then go to your `pages/` folder and create a new file `_app.js` with the following content:
```js
import { DocsGPTWidget } from "docsgpt";
import "docsgpt/dist/style.css";
export default function MyApp({ Component, pageProps }) {
return (
@@ -42,6 +57,69 @@ export default function MyApp({ Component, pageProps }) {
)
}
```
### How to use DocsGPTWidget with HTML
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="X-UA-Compatible" content="ie=edge" />
<title>HTML + CSS</title>
<link rel="stylesheet" href="styles.css" />
</head>
<body>
<h1>This is a simple HTML + CSS template!</h1>
<div id="app"></div>
<!-- Include the widget script from dist/modern or dist/legacy -->
<script
src="https://unpkg.com/docsgpt/dist/modern/main.js"
type="module"
></script>
<script type="module">
window.onload = function () {
renderDocsGPTWidget("app", {
apiKey: "",
size: "medium",
});
};
</script>
</body>
</html>
```
To link the widget to your api and your documents you can pass parameters to the renderDocsGPTWidget('div id', { parameters }).
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>DocsGPT Widget</title>
<script src="https://unpkg.com/docsgpt/dist/modern/main.js" type="module"></script>
</head>
<body>
<div id="app"></div>
<!-- Include the widget script from dist/modern or dist/legacy -->
<script type="module">
window.onload = function() {
renderDocsGPTWidget('app', {
apiHost: 'http://localhost:7001',
apiKey:"",
avatar: 'https://d3dg1063dc54p9.cloudfront.net/cute-docsgpt.png',
title: 'Get AI assistance',
description: "DocsGPT's AI Chatbot is here to help",
heroTitle: 'Welcome to DocsGPT!',
heroDescription: 'This chatbot is built with DocsGPT and utilises GenAI, please review important information using sources.',
theme:"dark",
buttonIcon:"https://your-icon",
buttonBg:"#222327"
});
}
</script>
</body>
</html>
```
For more information about React, refer to this [link here](https://react.dev/learn)

View File

@@ -1,10 +1,25 @@
import Image from 'next/image'
# Customizing the Main Prompt
Customizing the main prompt for DocsGPT gives you the ability to tailor the AI's responses to your specific requirements. By modifying the prompt text, you can achieve more accurate and relevant answers. Here's how you can do it:
1. Navigate to `/application/prompts/combine_prompt.txt`.
1. Navigate to `SideBar -> Settings`.
2.In Settings select the `Active Prompt` now you will be able to see various prompts style.x
3.Click on the `edit icon` on the prompt of your choice and you will be able to see the current prompt for it,you can now customise the prompt as per your choice.
### Video Demo
<Image src="/prompts.gif" alt="prompts" width={800} height={500} />
2. Open the `combine_prompt.txt` file and modify the prompt text to suit your needs. You can experiment with different phrasings and structures to observe how the model responds. The main prompt serves as guidance to the AI model on how to generate responses.
## Example Prompt Modification

View File

@@ -1,63 +0,0 @@
## How to train on other documentation
This AI can utilize any documentation, but it requires preparation for similarity search. Follow these steps to get your documentation ready:
**Step 1: Prepare Your Documentation**
![video-example-of-how-to-do-it](https://d3dg1063dc54p9.cloudfront.net/videos/how-to-vectorise.gif)
Start by going to `/scripts/` folder.
If you open this file, you will see that it uses RST files from the folder to create a `index.faiss` and `index.pkl`.
It currently uses OPENAI to create the vector store, so make sure your documentation is not too large. Using Pandas cost me around $3-$4.
You can typically find documentation on GitHub in the `docs/` folder for most open-source projects.
### 1. Find documentation in .rst/.md format and create a folder with it in your scripts directory.
- Name it `inputs/`.
- Put all your .rst/.md files in there.
- The search is recursive, so you don't need to flatten them.
If there are no .rst/.md files, convert whatever you find to a .txt file and feed it. (Don't forget to change the extension in the script).
### Step 2: Configure Your OpenAI API Key
1. Create a .env file in the scripts/ folder.
- Add your OpenAI API key inside: OPENAI_API_KEY=<your-api-key>.
### Step 3: Run the Ingestion Script
`python ingest.py ingest`
It will provide you with the estimated cost.
### Step 4: Move `index.faiss` and `index.pkl` generated in `scripts/output` to `application/` folder.
### Step 5: Run the Web App
Once you run it, it will use new context relevant to your documentation.Make sure you select default in the dropdown in the UI.
## Customization
You can learn more about options while running ingest.py by running:
- Make sure you select 'default' from the dropdown in the UI.
## Customization
You can learn more about options while running ingest.py by executing:
`python ingest.py --help`
| Options | |
|:--------------------------------:|:------------------------------------------------------------------------------------------------------------------------------:|
| **ingest** | Runs 'ingest' function, converting documentation to Faiss plus Index format |
| --dir TEXT | List of paths to directory for index creation. E.g. --dir inputs --dir inputs2 [default: inputs] |
| --file TEXT | File paths to use (Optional; overrides directory) E.g. --files inputs/1.md --files inputs/2.md |
| --recursive / --no-recursive | Whether to recursively search in subdirectories [default: recursive] |
| --limit INTEGER | Maximum number of files to read |
| --formats TEXT | List of required extensions (list with .) Currently supported: .rst, .md, .pdf, .docx, .csv, .epub, .html [default: .rst, .md] |
| --exclude / --no-exclude | Whether to exclude hidden files (dotfiles) [default: exclude] |
| -y, --yes | Whether to skip price confirmation |
| --sample / --no-sample | Whether to output sample of the first 5 split documents. [default: no-sample] |
| --token-check / --no-token-check | Whether to group small documents and split large. Improves semantics. [default: token-check] |
| --min_tokens INTEGER | Minimum number of tokens to not group. [default: 150] |
| --max_tokens INTEGER | Maximum number of tokens to not split. [default: 2000] |
| | |
| **convert** | Creates documentation in .md format from source code |
| --dir TEXT | Path to a directory with source code. E.g. --dir inputs [default: inputs] |
| --formats TEXT | Source code language from which to create documentation. Supports py, js and java. E.g. --formats py [default: py] |

View File

@@ -0,0 +1,44 @@
import { Callout } from 'nextra/components'
import Image from 'next/image'
import { Steps } from 'nextra/components'
## How to train on other documentation
Training on other documentation sources can greatly enhance the versatility and depth of DocsGPT's knowledge. By incorporating diverse materials, you can broaden the AI's understanding and improve its ability to generate insightful responses across a range of topics. Here's a step-by-step guide on how to effectively train DocsGPT on additional documentation sources:
**Get your document ready**:
Make sure you have the document on which you want to train on ready with you on the device which you are using .You can also use links to the documentation to train on.
<Callout type="warning" emoji="⚠️">
Note: The document should be either of the given file formats .pdf, .txt, .rst, .docx, .md, .zip and limited to 25mb.You can also train using the link of the documentation.
</Callout>
### Video Demo
<Image src="/docs.gif" alt="prompts" width={800} height={500} />
<Steps>
### Step1
Navigate to the sidebar where you will find `Source Docs` option,here you will find 3 options built in which are default,Web Search and None.
### Step 2
Click on the `Upload icon` just beside the source docs options,now borwse and upload the document which you want to train on or select the `remote` option if you have to insert the link of the documentation.
### Step 3
Now you will be able to see the name of the file uploaded under the Uploaded Files ,now click on `Train`,once you click on train it might take some time to train on the document. You will be able to see the `Training progress` and once the training is completed you can click the `finish` button and there you go your docuemnt is uploaded.
### Step 4
Go to `New chat` and from the side bar select the document you uploaded under the `Source Docs` and go ahead with your chat, now you can ask qestions regarding the document you uploaded and you will get the effective answer based on it.
</Steps>

View File

@@ -1,48 +0,0 @@
# Setting Up Local Language Models for Your App
Your app relies on two essential models: Embeddings and Text Generation. While OpenAI's default models work seamlessly, you have the flexibility to switch providers or even run the models locally.
## Step 1: Configure Environment Variables
Navigate to the `.env` file or set the following environment variables:
```env
LLM_NAME=<your Text Generation model>
API_KEY=<API key for Text Generation>
EMBEDDINGS_NAME=<LLM for Embeddings>
EMBEDDINGS_KEY=<API key for Embeddings>
VITE_API_STREAMING=<true or false>
```
You can omit the keys if users provide their own. Ensure you set `LLM_NAME` and `EMBEDDINGS_NAME`.
## Step 2: Choose Your Models
**Options for `LLM_NAME`:**
- openai ([More details](https://platform.openai.com/docs/models))
- anthropic ([More details](https://docs.anthropic.com/claude/reference/selecting-a-model))
- manifest ([More details](https://python.langchain.com/docs/integrations/llms/manifest))
- cohere ([More details](https://docs.cohere.com/docs/llmu))
- llama.cpp ([More details](https://python.langchain.com/docs/integrations/llms/llamacpp))
- huggingface (Arc53/DocsGPT-7B by default)
- sagemaker ([Mode details](https://aws.amazon.com/sagemaker/))
Note: for huggingface you can choose any model inside application/llm/huggingface.py or pass llm_name on init, loads
**Options for `EMBEDDINGS_NAME`:**
- openai_text-embedding-ada-002
- huggingface_sentence-transformers/all-mpnet-base-v2
- huggingface_hkunlp/instructor-large
- cohere_medium
If you want to be completely local, set `EMBEDDINGS_NAME` to `huggingface_sentence-transformers/all-mpnet-base-v2`.
For llama.cpp Download the required model and place it in the `models/` folder.
Alternatively, for local Llama setup, run `setup.sh` and choose option 1. The script handles the DocsGPT model addition.
## Step 3: Local Hosting for Privacy
If working with sensitive data, host everything locally by setting `LLM_NAME`, llama.cpp or huggingface, use any model available on Hugging Face, for llama.cpp you need to convert it into gguf format.
That's it! Your app is now configured for local and private hosting, ensuring optimal security for critical data.

View File

@@ -0,0 +1,49 @@
import { Callout } from 'nextra/components'
import Image from 'next/image'
import { Steps } from 'nextra/components'
# Setting Up Local Language Models for Your App
Setting up local language models for your app can significantly enhance its capabilities, enabling it to understand and generate text in multiple languages without relying on external APIs. By integrating local language models, you can improve privacy, reduce latency, and ensure continuous functionality even in offline environments. Here's a comprehensive guide on how to set up local language models for your application:
## Steps:
### For cloud version LLM change:
<Steps >
### Step 1
Visit the chat screen and you will be to see the default LLM selected.
### Step 2
Click on it and you will get a drop down of various LLM's available to choose.
### Step 3
Choose the LLM of your choice.
</Steps>
### Video Demo
<Image src="/llms.gif" alt="prompts" width={800} height={500} />
### For Open source llm change:
<Steps >
### Step 1
For open source you have to edit .env file with LLM_NAME with their desired LLM name.
### Step 2
All the supported LLM providers are here application/llm and you can check what env variable are needed for each
List of latest supported LLMs are https://github.com/arc53/DocsGPT/blob/main/application/llm/llm_creator.py
### Step 3
Visit application/llm and select the file of your selected llm and there you will find the speicifc requirements needed to be filled in order to use it,i.e API key of that llm.
</Steps>
### For OpenAI-Compatible Endpoints:
DocsGPT supports the use of OpenAI-compatible endpoints through base URL substitution. This feature allows you to use alternative AI models or services that implement the OpenAI API interface.
Set the OPENAI_BASE_URL in your environment. You can change .env file with OPENAI_BASE_URL with the desired base URL or docker-compose.yml file and add the environment variable to the backend container.
> Make sure you have the right API_KEY and correct LLM_NAME.

View File

@@ -1,6 +1,6 @@
{
"Customising-prompts": {
"title": "🏗️ Customising Prompts",
"title": "💻 Customising Prompts",
"href": "/Guides/Customising-prompts"
},
"How-to-train-on-other-documentation": {
@@ -8,7 +8,7 @@
"href": "/Guides/How-to-train-on-other-documentation"
},
"How-to-use-different-LLM": {
"title": "⚙️ How to use different LLM's",
"title": "🤖 How to use different LLM's",
"href": "/Guides/How-to-use-different-LLM"
},
"My-AI-answers-questions-using-external-knowledge": {

View File

@@ -1,11 +1,10 @@
import { DocsGPTWidget } from "docsgpt";
import "docsgpt/dist/style.css";
export default function MyApp({ Component, pageProps }) {
return (
<>
<Component {...pageProps} />
<DocsGPTWidget selectDocs="local/docsgpt-sep.zip/"/>
<DocsGPTWidget apiKey="d61a020c-ac8f-4f23-bb98-458e4da3c240" theme="dark" />
</>
)
}

View File

@@ -2,14 +2,16 @@
title: 'Home'
---
import { Cards, Card } from 'nextra/components'
import Image from 'next/image'
import deployingGuides from './Deploying/_meta.json';
import developingGuides from './Developing/_meta.json';
import developingGuides from './API/_meta.json';
import extensionGuides from './Extensions/_meta.json';
import mainGuides from './Guides/_meta.json';
export const allGuides = {
...deployingGuides,
...developingGuides,
@@ -21,9 +23,12 @@ export const allGuides = {
DocsGPT 🦖 is an innovative open-source tool designed to simplify the retrieval of information from project documentation using advanced GPT models 🤖. Eliminate lengthy manual searches 🔍 and enhance your documentation experience with DocsGPT, and consider contributing to its AI-powered future 🚀.
![video-example-of-docs-gpt](https://d3dg1063dc54p9.cloudfront.net/videos/demov3.gif)
Try it yourself: [https://docsgpt.arc53.com/](https://docsgpt.arc53.com/)
<Image src="/homevideo.gif" alt="homedemo" width={800} height={500}/>
Try it yourself: [https://www.docsgpt.cloud/](https://www.docsgpt.cloud/)
<Cards
num={3}

BIN
docs/public/docs.gif Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 839 KiB

BIN
docs/public/homevideo.gif Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 23 MiB

BIN
docs/public/llms.gif Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 500 KiB

BIN
docs/public/prompts.gif Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 974 KiB

View File

@@ -107,12 +107,12 @@
}
},
"node_modules/braces": {
"version": "3.0.2",
"resolved": "https://registry.npmjs.org/braces/-/braces-3.0.2.tgz",
"integrity": "sha512-b8um+L1RzM3WDSzvhm6gIz1yfTbBt6YTlcEKAvsmqCZZFw46z626lVj9j1yEPW33H5H+lBQpZMP1k8l+78Ha0A==",
"version": "3.0.3",
"resolved": "https://registry.npmjs.org/braces/-/braces-3.0.3.tgz",
"integrity": "sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA==",
"dev": true,
"dependencies": {
"fill-range": "^7.0.1"
"fill-range": "^7.1.1"
},
"engines": {
"node": ">=8"
@@ -260,9 +260,9 @@
}
},
"node_modules/fill-range": {
"version": "7.0.1",
"resolved": "https://registry.npmjs.org/fill-range/-/fill-range-7.0.1.tgz",
"integrity": "sha512-qOo9F+dMUmC2Lcb4BbVvnKJxTPjCm+RRpe4gDuGrzkL7mEVl/djYSu2OdQ2Pa302N4oqkSg9ir6jaLWJ2USVpQ==",
"version": "7.1.1",
"resolved": "https://registry.npmjs.org/fill-range/-/fill-range-7.1.1.tgz",
"integrity": "sha512-YsGpe3WHLK8ZYi4tWDg2Jy3ebRz2rXowDxnld4bkQB00cc/1Zw9AWnC0i9ztDJitivtQvaI9KaLyKrc+hBW0yg==",
"dev": true,
"dependencies": {
"to-regex-range": "^5.0.1"
@@ -884,12 +884,12 @@
"dev": true
},
"braces": {
"version": "3.0.2",
"resolved": "https://registry.npmjs.org/braces/-/braces-3.0.2.tgz",
"integrity": "sha512-b8um+L1RzM3WDSzvhm6gIz1yfTbBt6YTlcEKAvsmqCZZFw46z626lVj9j1yEPW33H5H+lBQpZMP1k8l+78Ha0A==",
"version": "3.0.3",
"resolved": "https://registry.npmjs.org/braces/-/braces-3.0.3.tgz",
"integrity": "sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA==",
"dev": true,
"requires": {
"fill-range": "^7.0.1"
"fill-range": "^7.1.1"
}
},
"camelcase-css": {
@@ -1000,9 +1000,9 @@
}
},
"fill-range": {
"version": "7.0.1",
"resolved": "https://registry.npmjs.org/fill-range/-/fill-range-7.0.1.tgz",
"integrity": "sha512-qOo9F+dMUmC2Lcb4BbVvnKJxTPjCm+RRpe4gDuGrzkL7mEVl/djYSu2OdQ2Pa302N4oqkSg9ir6jaLWJ2USVpQ==",
"version": "7.1.1",
"resolved": "https://registry.npmjs.org/fill-range/-/fill-range-7.1.1.tgz",
"integrity": "sha512-YsGpe3WHLK8ZYi4tWDg2Jy3ebRz2rXowDxnld4bkQB00cc/1Zw9AWnC0i9ztDJitivtQvaI9KaLyKrc+hBW0yg==",
"dev": true,
"requires": {
"to-regex-range": "^5.0.1"

3
extensions/react-widget/.gitignore vendored Normal file
View File

@@ -0,0 +1,3 @@
node_modules
dist
.parcel-cache

View File

@@ -0,0 +1,10 @@
{
"extends": "@parcel/config-default",
"resolvers": ["@parcel/resolver-glob","..."],
"transformers": {
"*.svg": ["...", "@parcel/transformer-svg-react", "@parcel/transformer-typescript-tsc"]
},
"validators": {
"*.{ts,tsx}": ["@parcel/validator-typescript"]
}
}

View File

@@ -1,6 +1,5 @@
# DocsGPT react widget
This widget will allow you to embed a DocsGPT assistant in your React app.
## Installation
@@ -11,9 +10,10 @@ npm install docsgpt
## Usage
### React
```javascript
import { DocsGPTWidget } from "docsgpt";
import "docsgpt/dist/style.css";
const App = () => {
return <DocsGPTWidget />;
@@ -24,17 +24,85 @@ To link the widget to your api and your documents you can pass parameters to the
```javascript
import { DocsGPTWidget } from "docsgpt";
import "docsgpt/dist/style.css";
const App = () => {
return <DocsGPTWidget apiHost="http://localhost:7001" selectDocs='default' apiKey=''/>;
return <DocsGPTWidget
apiHost="https://your-docsgpt-api.com"
apiKey=""
avatar = "https://d3dg1063dc54p9.cloudfront.net/cute-docsgpt.png"
title = "Get AI assistance"
description = "DocsGPT's AI Chatbot is here to help"
heroTitle = "Welcome to DocsGPT !"
heroDescription="This chatbot is built with DocsGPT and utilises GenAI,
please review important information using sources."
theme = "dark"
buttonIcon = "https://your-icon"
buttonBg = "#222327"
/>;
};
```
### Html
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>DocsGPT Widget</title>
</head>
<body>
<div id="app"></div>
<!-- Include the widget script from dist/modern or dist/legacy -->
<script src="https://unpkg.com/docsgpt/dist/modern/main.js" type="module"></script>
<script type="module">
window.onload = function() {
renderDocsGPTWidget('app');
}
</script>
</body>
</html>
```
To link the widget to your api and your documents you can pass parameters to the **renderDocsGPTWidget('div id', { parameters })**.
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>DocsGPT Widget</title>
</head>
<body>
<div id="app"></div>
<!-- Include the widget script from dist/modern or dist/legacy -->
<script src="https://unpkg.com/docsgpt/dist/modern/main.js" type="module"></script>
<script type="module">
window.onload = function() {
renderDocsGPTWidget('app', {
apiHost: 'http://localhost:7001',
apiKey:"",
avatar: 'https://d3dg1063dc54p9.cloudfront.net/cute-docsgpt.png',
title: 'Get AI assistance',
description: "DocsGPT's AI Chatbot is here to help",
heroTitle: 'Welcome to DocsGPT!',
heroDescription: 'This chatbot is built with DocsGPT and utilises GenAI, please review important information using sources.',
theme:"dark",
buttonIcon:"https://your-icon.svg",
buttonBg:"#222327"
});
}
</script>
</body>
</html>
```
## Our github
[DocsGPT](https://github.com/arc53/DocsGPT)
You can find the source code in the extensions/react-widget folder.

Some files were not shown because too many files have changed in this diff Show More