* feat: add support for reading XLSX files
* feat: add pandas excel dependencies
* fix: prevent lambda function from capturing loop variable in EmbeddingIndexer
* Use Executor.submit() args
* feat: rename XlsxReader to ExcelReader for broader Excel file support
* refactor: renaming XlsxSplitter and fixing mypy errors
* refactor: rename config classes for consistency
* Update release version to dev-testing
* Cast TextNodes directly
* Simplify model_source if-else
* Remove implicit port conversion in config_to_env to avoid stringifying None
* Improve Qdrant configuration and performance with environment variables and gRPC support
* Remove unused environment variables and hardcode embedding concurrency and boto3 max pool connections
* minor fixes for configuration
* Reduce EmbeddingIndexer batch size and add botocore config to BedrockModelProvider
* Adjust batch size in EmbeddingIndexer based on reader type to prevent Qdrant timeouts
* Add support for CSVReader in EmbeddingIndexer and adjust batch size accordingly
* Refactor batch sizes and sampling for EmbeddingIndexer and SummaryIndexer to improve performance with tabular documents
* Enhance ExcelReader to handle empty workbooks and ensure JSON serialization compatibility
* Enhance ExcelReader to handle null dataframes and improve JSON serialization
* fix: Use non-deprecated `map` function over `applymap` in ExcelReader
* Refactor batch sizes in EmbeddingIndexer and SummaryIndexer to use Qdrant-safe batches
* Adjust batch sizes in LlamaIndexQdrantVectorStore
* fix: mypy errors
* Update release version to dev-testing
* Refactor Qdrant configuration and ExcelReader for improved performance and compatibility
* fix: more mypy issues
* Update release version to dev-testing
* Enable Git LFS for prebuilt artifacts
* merge origin/main
* Update prebuilt artifacts with new versions
* Update batch sizes for Qdrant vector store and indexing
* fix: Increase memory for application to allow Excel use cases
* Update llm-service/app/ai/indexing/readers/base_reader.py
Co-authored-by: mliu-cloudera <[email protected]>
* Update llm-service/app/config.py
Co-authored-by: mliu-cloudera <[email protected]>
* Update llm-service/app/ai/vector_stores/qdrant.py
Co-authored-by: mliu-cloudera <[email protected]>
* Update llm-service/app/ai/indexing/embedding_indexer.py
Co-authored-by: mliu-cloudera <[email protected]>
* fix: minor fixes and adjustments for consistency
* refactor: simplify batch size logic in embedding and summary indexers
* Update llm-service/app/ai/indexing/readers/base_reader.py
Co-authored-by: mliu-cloudera <[email protected]>
* Update llm-service/app/ai/indexing/readers/base_reader.py
Co-authored-by: mliu-cloudera <[email protected]>
* Update llm-service/app/ai/indexing/summary_indexer.py
Co-authored-by: mliu-cloudera <[email protected]>
* refactor: reverting variable name batch_size to max_samples
* Update .DS_Store file in llm-service directory
---------
Co-authored-by: Michael Liu <[email protected]>
Co-authored-by: actions-user <[email protected]>
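
The "prevent lambda function from capturing loop variable" and "Use Executor.submit() args" entries in the commit message above describe a common Python pitfall. The following is a minimal sketch of that pitfall and one way to avoid it, not the actual EmbeddingIndexer code: `embed` and `chunks` are hypothetical stand-ins, and the thread pool size is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def embed(chunk: str) -> str:
    """Hypothetical stand-in for an embedding call (assumption, not from the repo)."""
    return chunk.upper()


chunks = ["a", "b", "c"]

# Pitfall: each lambda looks up `chunk` when it is called, not when it is
# created, so after the loop ends every lambda sees the last value.
late_bound = []
for chunk in chunks:
    late_bound.append(lambda: embed(chunk))
late_results = [fn() for fn in late_bound]

# Fix: hand the value to Executor.submit() as an argument, which binds it
# at submission time instead of at call time.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(embed, chunk) for chunk in chunks]
    results = [future.result() for future in futures]

print(late_results)
print(results)
```

Passing arguments through `submit()` avoids the need for a default-argument workaround such as `lambda chunk=chunk: embed(chunk)`, and keeps the submitted callable a plain function.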
.project-metadata.yaml (5 additions, 5 deletions)

@@ -1,16 +1,16 @@
 name: RAG Studio
-description: |
+description: |
   "Build a RAG application to ask questions about your documents. Configuration for access to models will be available inside the application itself once it has been deployed."
 author: "Cloudera"
 date: "2024-09-10"
 specification_version: 1.0
 prototype_version: 1.0

 environment_variables:
-  UV_HTTP_TIMEOUT:
-    description: "Timeout for UV processing in seconds."
-    default: "60000"
-    required: false
+  UV_HTTP_TIMEOUT:
+    description: "Timeout for UV processing in seconds."