
Commit c35a20d

Set indexing error when embedding model returns incorrect number of embeddings (#321)
* Raise error when embeddings and text chunks don't match
* Elaborate on error message
1 parent 464d8d7 · commit c35a20d

File tree: 1 file changed (+8 −2 lines)


llm-service/app/ai/indexing/embedding_indexer.py

Lines changed: 8 additions & 2 deletions
@@ -130,6 +130,12 @@ def _compute_embeddings(
         logger.debug(f"Waiting for {len(futures)} futures")
         for future in as_completed(futures):
             i, batch_embeddings = future.result()
-            for chunk, embedding in zip(batched_chunks[i], batch_embeddings):
+            batch_chunks = batched_chunks[i]
+            if len(batch_chunks) != len(batch_embeddings):
+                raise ValueError(
+                    f"Expected {len(batch_chunks)} embedding vectors for this batch of chunks,"
+                    + f" but got {len(batch_embeddings)} from {self.embedding_model.model_name}"
+                )
+            for chunk, embedding in zip(batch_chunks, batch_embeddings):
                 chunk.embedding = embedding
-            yield batched_chunks[i]
+            yield batch_chunks
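
A minimal, self-contained sketch of the behavior this guard introduces. Chunk, FakeEmbeddingModel, and embed_batches below are hypothetical stand-ins, not the repository's classes; the real check runs inside _compute_embeddings against the configured embedding model and a thread pool.

    # Sketch only: demonstrates the mismatch check, not the actual indexer code.
    from dataclasses import dataclass
    from typing import Iterator, List, Optional


    @dataclass
    class Chunk:
        text: str
        embedding: Optional[List[float]] = None


    class FakeEmbeddingModel:
        """Stand-in model that drops one vector to simulate a misbehaving backend."""

        model_name = "fake-embedder"

        def embed(self, texts: List[str]) -> List[List[float]]:
            # Returns one embedding too few on purpose.
            return [[0.0, 1.0] for _ in texts[:-1]]


    def embed_batches(
        batched_chunks: List[List[Chunk]], model: FakeEmbeddingModel
    ) -> Iterator[List[Chunk]]:
        for batch_chunks in batched_chunks:
            batch_embeddings = model.embed([c.text for c in batch_chunks])
            # The guard added by this commit: fail loudly on a count mismatch.
            if len(batch_chunks) != len(batch_embeddings):
                raise ValueError(
                    f"Expected {len(batch_chunks)} embedding vectors for this batch of chunks,"
                    f" but got {len(batch_embeddings)} from {model.model_name}"
                )
            for chunk, embedding in zip(batch_chunks, batch_embeddings):
                chunk.embedding = embedding
            yield batch_chunks


    if __name__ == "__main__":
        batches = [[Chunk("a"), Chunk("b"), Chunk("c")]]
        try:
            list(embed_batches(batches, FakeEmbeddingModel()))
        except ValueError as err:
            print(err)  # Expected 3 embedding vectors ... but got 2 from fake-embedder

Without the check, zip() silently truncates to the shorter sequence, so a short embedding response could leave some chunks indexed with no embedding and the problem would only surface later, if at all.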
