
Conversation

@mayya-sharipova (Contributor) commented Oct 21, 2025

For int8_hnsw, during merges we get quantized vectors from Lucene files but drop each quantized vector's correction factor. For the cosine and euclidean metrics this correction factor is not important, but for dot_product and max_inner_product it is. This means that currently, for dot_product and max_inner_product, GPU graph building doesn't work well and may produce bad recall. This PR does the following:

  • disallows max_inner_product for int8
  • internally substitutes dot_product with cosine for the GPU graph build

Alternatives: for most datasets (the vast majority), we can substitute dot_product with cosine. But some datasets require max_inner_product; for those, "int8_hnsw" will not work and "hnsw" should be used instead.
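
As background on why the dropped correction matters for some metrics and not others, here is a brief sketch. It assumes the usual affine scalar quantization, where each float component is reconstructed from its stored byte as $x_i \approx \alpha b_i + m$ (scale $\alpha$, offset $m$); this is illustrative background, not part of the change itself. Expanding the dot product of two reconstructed vectors of dimension $d$:

$$
x \cdot y \;\approx\; \alpha^2 (b_x \cdot b_y) \;+\; \alpha m \sum_i b_{x,i} \;+\; \alpha m \sum_i b_{y,i} \;+\; d\,m^2
$$

The byte-sum terms are what the per-vector correction captures, so dropping it shifts every dot_product and max_inner_product score by a vector-dependent amount, which can reorder neighbors while the graph is built. For euclidean the offset cancels term by term,

$$
\lVert x - y \rVert^2 \;\approx\; \alpha^2 \lVert b_x - b_y \rVert^2,
$$

so the quantized distance is just a scaled version of the true one and no per-vector correction is needed, consistent with only dot_product and max_inner_product being affected.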

@mayya-sharipova added the >bug, :Search Relevance/Search, v9.2.1, v9.3.0, test-gpu, and auto-backport labels on Oct 21, 2025
@elasticsearchmachine added the Team:Search Relevance label on Oct 21, 2025
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine (Collaborator) commented:

Hi @mayya-sharipova, I've created a changelog YAML for you.

@mayya-sharipova (Contributor Author) commented Oct 21, 2025

KnnIndexTester
openai: 2.6M docs; 1536 dims; 8 indexing threads

Notice low recall for dot_product on gpu, while recall recovers for cosine.

int8 dot_product

| index_type | index_time (ms) | force_merge_time (ms) | QPS (multiple segs) | recall (multiple segs) | QPS (force_merge, 5 segs) | recall (force_merge, 5 segs) |
|---|---|---|---|---|---|---|
| cpu | 697877 | 152161 | 95 | 0.84 | 148 | 0.84 |
| gpu | 235097 | 29671 | 117 | 0.71 | 185 | 0.48 |

int8 cosine

| index_type | index_time (ms) | force_merge_time (ms) | QPS (multiple segs) | recall (multiple segs) | QPS (force_merge, 5 segs) | recall (force_merge, 5 segs) |
|---|---|---|---|---|---|---|
| cpu | 739290 | 151696 | 96 | 0.85 | 151 | 0.83 |
| gpu | 286267 | 45025 | 99 | 0.99 | 136 | 0.99 |

But for other datasets, there is no difference:

hotpotqa-arctic: 5.2M docs; 768 dims; 8 indexing threads

int8 dot_product

| index_type | index_time (ms) | force_merge_time (ms) | QPS (multiple segs) | recall (multiple segs) | QPS (force_merge, 5 segs) | recall (force_merge, 5 segs) |
|---|---|---|---|---|---|---|
| cpu | 565207 | 177059 | 147 | 0.67 | 224 | 0.69 |
| gpu | 357982 | 58152 | 124 | 0.88 | 167 | 0.89 |

int8 cosine

| index_type | index_time (ms) | force_merge_time (ms) | QPS (multiple segs) | recall (multiple segs) | QPS (force_merge, 5 segs) | recall (force_merge, 5 segs) |
|---|---|---|---|---|---|---|
| cpu | 600691 | 175769 | 154 | 0.68 | 217 | 0.68 |
| gpu | 303617 | 54235 | 117 | 0.87 | 160 | 0.86 |

```java
    }
    return new ES92GpuHnswVectorsFormat(hnswIndexOptions.m(), efConstruction);
} else if (indexOptions.getType() == DenseVectorFieldMapper.VectorIndexType.INT8_HNSW) {
    if (similarity == DenseVectorFieldMapper.VectorSimilarity.DOT_PRODUCT
```
A Member commented on the snippet above:
I think this should just silently change to cosine when using the GPU integration to build the index, then using DOT_PRODUCT on search would work just fine.

this way we only disallow max-inner-product.
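
A self-contained sketch of the behavior this suggests, with hypothetical names (this is not the actual Elasticsearch code nor the change in 60e1dc9): for the GPU int8_hnsw path, max_inner_product is rejected, dot_product is silently mapped to cosine for graph construction (float dot_product vectors are expected to be unit-length, for which the two orderings agree), and search-time scoring keeps the configured similarity.

```java
// Hypothetical sketch of similarity selection for the GPU int8_hnsw graph build.
// Enum and class names are illustrative, not Elasticsearch's real types.
enum Similarity { COSINE, EUCLIDEAN, DOT_PRODUCT, MAX_INNER_PRODUCT }

final class GpuInt8BuildSimilarity {

    /** Similarity to use when building the GPU graph from quantized byte vectors. */
    static Similarity forGraphBuild(Similarity configured) {
        switch (configured) {
            case MAX_INNER_PRODUCT:
                // Without per-vector corrections, quantized bytes cannot rank MIP reliably.
                throw new IllegalArgumentException(
                    "max_inner_product is not supported with int8_hnsw on GPU; use hnsw instead");
            case DOT_PRODUCT:
                // dot_product vectors are unit-length, so cosine gives the same ordering;
                // search-time scoring still uses dot_product.
                return Similarity.COSINE;
            default:
                // cosine and euclidean are unaffected by the dropped correction.
                return configured;
        }
    }

    public static void main(String[] args) {
        System.out.println(forGraphBuild(Similarity.DOT_PRODUCT)); // COSINE
        System.out.println(forGraphBuild(Similarity.EUCLIDEAN));   // EUCLIDEAN
    }
}
```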

@mayya-sharipova (Contributor Author) replied:

Good suggestion, I will follow up on this.

@mayya-sharipova (Contributor Author) replied:

Addressed in 60e1dc9

@mayya-sharipova removed the test-gpu label on Oct 22, 2025
@benwtrent (Member) left a comment:

I think this is good. I have one concern about element_type: byte vs. quantized element_type: float, which would both use CuVSMatrix.DataType.BYTE, right?

@mayya-sharipova added the auto-merge-without-approval label on Oct 22, 2025
@elasticsearchmachine merged commit 9634fd6 into elastic:main on Oct 22, 2025
34 checks passed
@mayya-sharipova deleted the gpu-avoid-dot-product-int8 branch on October 22, 2025 at 21:19
@elasticsearchmachine (Collaborator) commented:

💔 Backport failed

| Branch | Result |
|---|---|
| 9.2 | Commit could not be cherrypicked due to conflicts |

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 136881`.
