Skip to content

Conversation

jpountz
Copy link
Contributor

@jpountz jpountz commented Aug 17, 2025

I ran experiments locally that suggest that some of the performance decrease from type pollution (mikemccand/luceneutil#436) can be attributed to calls to SimScorer#score no longer being inlinable since they are polymorphic. This change helps BM25Scorer remain inlinable using similar tricks that we are applying for Bits#get and ImpactsEnum#nextDoc/ImpactsEnum#advance.

Hopefully changes such as #15039 will help improve performance with other similarities as well in the future.

I ran experiments locally that suggest that some of the performance decrease
from type pollution (mikemccand/luceneutil#436)
can be attributed to calls to `SimScorer#score` no longer being inlinable since
they are polymorphic. This change helps `BM25Scorer` remain inlinable using
similar tricks that we are applying for `Bits#get` and
`ImpactsEnum#nextDoc`/`ImpactsEnum#advance`.

Hopefully changes such as apache#15039 will help improve performance with other
similarities as well in the future.
Copy link
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

Copy link
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

Copy link
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@rmuir
Copy link
Member

rmuir commented Aug 18, 2025

Why not just commit #15039 instead of this, which will also solve this problem and make it bimorphic?

@jpountz
Copy link
Contributor Author

jpountz commented Aug 18, 2025

I was hesitating which one to commit first. #15039 helps less in my benchmarks for now, likely because it only helps the first clause of conjunctive queries, because it doesn't help with computing impact scores, and because it only helps when scores are computed via the BulkScorer API instead of Scorer API (so e.g. AndHighOrMedMed doesn't see a speedup). I can start with #15039 plus follow-ups and see how much this PR still helps when I get stuck.

@rmuir
Copy link
Member

rmuir commented Aug 18, 2025

But this PR is also incomplete and ooesn't hook in the logic for things like SynonymQuery, PhraseQuery, which led to my confusion :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants