Pavel
5a891647bf
parser functions change
token_func: proposed change to chunking. open_ai_func: proposed change to embedding_pipeline. First implementation of late chunking; requires further testing.
2024-11-20 21:40:57 +04:00
Alex
63b547ea13
fix: delete old files
2024-11-17 12:59:34 +00:00
Alex
2245f4690e
fix: reddit loader validation
2024-11-15 11:02:27 +00:00
JeevaRamanathan M
5c756348a5
feat: Presentation parser implementation
Signed-off-by: JeevaRamanathan M <jeevaramanathan.m@infosys.com>
2024-10-31 11:47:12 +00:00
Alex
1c791f240a
Merge pull request #1377 from JeevaRamanathan/feature/file-json
feat: JSON Parser Implementation
2024-10-26 17:28:57 +01:00
JeevaRamanathan M
c77d415893
feat: JSON parser implementation
Signed-off-by: JeevaRamanathan M <jeevaramanathan.m@infosys.com>
2024-10-24 20:36:47 +00:00
devendra.parihar
d3238de8ab
fix: lint error
2024-10-18 12:23:17 +05:30
devendra.parihar
09a2705311
fix: GitHubLoader to Handle Binary Files
2024-10-18 12:08:08 +05:30
devendra.parihar
a4c0861cf4
fix: GitHubLoader to Handle Binary Files
2024-10-18 12:07:44 +05:30
Alex
c9e95a9146
Merge pull request #1184 from Devparihar5/ExcelParser
new: added ExcelParser(tested) to read .xlsx files
2024-10-06 23:19:37 +01:00
Alex
6932c7e3e9
feat: add filename to the top
2024-10-05 21:56:47 +01:00
Alex
c04687fdd1
fix: github loader metadata clickable
2024-10-05 21:53:30 +01:00
Alex
7717242112
fix(lint): ruff var
2024-10-05 21:37:55 +01:00
Alex
1ad82c22d9
fix: headers
2024-10-05 21:36:04 +01:00
Alex
8fa88175c1
fix: translation + auth
2024-10-05 21:33:58 +01:00
Alex
2611550ffd
2024-10-02 23:44:29 +01:00
devendra.parihar
7794129929
new: added ExcelParser(tested) to read .xlsx files
2024-10-01 22:03:10 +05:30
Siddhant Rai
3d292aa485
feat: sync remote sources through celery periodic tasks
2024-09-25 15:20:11 +05:30
Alex
44d225e6ca
Merge branch 'main' into 1059-migrating-database-to-new-model
2024-09-09 23:55:25 +01:00
Alex
2f9c72c1cf
feat: migrate store to source_id
2024-09-09 15:46:18 +01:00
Alex
1bb81614a5
fix: metadata things
2024-09-09 13:37:11 +01:00
Alex
8166642ff9
fix: write id instead of old path on remote db's
2024-09-09 12:00:59 +01:00
Alex
c49b7613e0
fix: langchain warning
2024-08-31 12:53:37 +01:00
Alex
16aedd61da
fix: ruff lint
2024-08-12 16:37:03 +01:00
Alex
5a2f3ad616
feat: remove dep
2024-08-12 16:35:23 +01:00
ManishMadan2882
9000838aab
(feat:vectors): calc, add token in db
2024-05-24 21:10:50 +05:30
Siddhant Rai
53e86205ad
fix: added more headers from default
2024-05-03 18:47:30 +05:30
Siddhant Rai
aa670efe3a
fix: connection aborted in WebBaseLoader
2024-05-03 18:25:01 +05:30
Alex
8d7a134cb4
lint: ruff
2024-04-09 17:25:08 +01:00
Siddhant Rai
e01071426f
feat: field to pass number of posts as a parameter
2024-03-27 19:20:55 +05:30
Siddhant Rai
eed1bfbe50
feat: fields to handle reddit loader + minor changes
2024-03-26 16:07:44 +05:30
Siddhant Rai
60cfea1126
feat: added reddit loader
2024-03-16 20:22:05 +05:30
Alex
4a701cb993
Merge branch 'main' into feature/remote-loads
2024-03-01 14:38:27 +00:00
Pavel
54d187a0ad
Fixing ingestion metadata grouping
2024-02-28 19:52:58 +03:00
Pavel
c8d8a8d0b5
Fixing ingestion metadata grouping
2024-02-25 16:03:18 +03:00
Alex
0cb3d12d94
Refactor loader classes to accept inputs directly
2024-02-14 15:17:56 +00:00
Alex
2e14dec12d
Merge pull request #849 from arc53/main
Sync
2024-02-09 14:05:39 +00:00
Anton Larin
9e04b7796a
application folder related changes:
* optimize content of requirements.txt
* upgrade libs
* fix imports
2024-01-27 16:25:19 +01:00
Anton Larin
e8099c4db5
script folder related changes:
* optimize content of requirements.txt
* upgrade libs
* fix imports
2024-01-27 14:58:08 +01:00
Exterminator11
f3540aac0f
Changed import
2023-10-25 17:07:47 +05:30
Exterminator11
889ce984a9
Made changes
2023-10-25 16:50:01 +05:30
Pavel
381a2740ee
change input
2023-10-13 21:52:56 +04:00
Pavel
024674eef3
List check
2023-10-13 11:42:42 +04:00
Pavel
b7d88b4c0f
fix wrong link
2023-10-12 19:45:36 +04:00
Pavel
719ca63ec1
fixes
2023-10-12 19:40:23 +04:00
Pavel
2cfb416fd0
Desc loader
2023-10-12 13:44:32 +04:00
Pavel
50f07f9ef5
limit crawler
2023-10-12 12:53:33 +04:00
Pavel
c517bdd2e1
Crawler + sitemap
2023-10-12 12:35:26 +04:00
Pavel
658867cb46
No crawler, no sitemap
2023-10-12 01:03:40 +04:00
Alex
8f2ad38503
tests
2023-10-11 10:13:51 +01:00