Skip to content

fix: wrap group_ids OR clause in parentheses for BM25 fulltext query#1311

Open
giulio-leone wants to merge 1 commit intogetzep:mainfrom
giulio-leone:fix/bm25-query-parenthesis
Open

fix: wrap group_ids OR clause in parentheses for BM25 fulltext query#1311
giulio-leone wants to merge 1 commit intogetzep:mainfrom
giulio-leone:fix/bm25-query-parenthesis

Conversation

@giulio-leone
Copy link

Summary

Fixes #1249 — BM25 fulltext search with multiple group_ids only filtered by the last group due to missing parentheses around the OR clause.

Problem

Without parentheses, Lucene operator precedence causes AND to bind tighter than OR:

group_id:"1" OR group_id:"2" OR group_id:"3" AND (query)

is parsed as:

group_id:"1" OR group_id:"2" OR (group_id:"3" AND query)

Fix

Wrap the OR clause in parentheses in fulltext_query() (search_utils.py line 101):

# Before
group_ids_filter += ' AND ' if group_ids_filter else ''

# After
group_ids_filter = f'({group_ids_filter}) AND ' if group_ids_filter else ''

Now produces the correct query:

(group_id:"1" OR group_id:"2" OR group_id:"3") AND (query)

Tests

Added tests/utils/search/test_fulltext_query.py with 4 test cases covering single, multiple, no, and empty group_ids.

Without parentheses around the OR-joined group_id filters, Lucene's
operator precedence causes AND to bind tighter than OR. This means
only the last group_id is effectively combined with the search query.

Before: group_id:"1" OR group_id:"2" OR group_id:"3" AND (query)
After:  (group_id:"1" OR group_id:"2" OR group_id:"3") AND (query)

Closes getzep#1249
@danielchalef
Copy link
Member


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.


I have read the CLA Document and I hereby sign the CLA


giulio-leone seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@giulio-leone
Copy link
Author

I have read the CLA Document and I hereby sign the CLA

@giulio-leone
Copy link
Author

I have read the CLA Document and I hereby sign the CLA

@giulio-leone
Copy link
Author

I have read the Zep CLA and I sign and agree to its terms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Missing parenthesis for BM25 search method query only searching with the last group_id setted

2 participants