JSON_EXTRACT: zero-copy byte slicing for object, array, and number extraction by quackaplop · Pull Request #143702 · elastic/elasticsearch

quackaplop · 2026-03-05T17:18:07Z

Summary

Optimizes JSON_EXTRACT to use zero-copy byte slicing instead of copyCurrentStructure()
re-serialization when extracting objects, arrays, and numbers from JSON input. This builds on
the byte offset API exposed in #143501.

What changed

Object/array extraction — Previously, extracting a nested object or array walked every token
in the subtree and rebuilt JSON from scratch via XContentBuilder.copyCurrentStructure(). Now it
slices bytes directly from the input buffer using getTokenLocation().byteOffset() →
skipChildren() → getCurrentLocation().byteOffset(). Zero allocation, zero re-parsing.
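
The offset arithmetic behind the object/array path can be illustrated without Jackson. The sketch below is not the PR's code: `SubtreeSlicer`, `ByteSlice`, and `sliceContainer` are hypothetical names, `ByteSlice` stands in for a (bytes, offset, length) view such as Lucene's `BytesRef`, and a hand-rolled depth counter plays the role that `skipChildren()` plays in the real parser. It assumes well-formed UTF-8 JSON and a start offset that points at `{` or `[`.

```java
import java.nio.charset.StandardCharsets;

final class SubtreeSlicer {
    /** Hypothetical (bytes, offset, length) view; shares the input array instead of copying it. */
    static final class ByteSlice {
        final byte[] bytes;
        final int offset;
        final int length;

        ByteSlice(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }

        /** Decodes the slice; only needed when a caller wants text, not for the slicing itself. */
        String utf8() {
            return new String(bytes, offset, length, StandardCharsets.UTF_8);
        }
    }

    /** Returns a view over the object or array that starts at {@code start}. */
    static ByteSlice sliceContainer(byte[] json, int start) {
        if (json[start] != '{' && json[start] != '[') {
            throw new IllegalArgumentException("not an object or array at offset " + start);
        }
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < json.length; i++) {
            byte b = json[i];
            if (inString) {
                if (b == '\\') {
                    i++; // skip the byte that follows the backslash
                } else if (b == '"') {
                    inString = false;
                }
            } else if (b == '"') {
                inString = true;
            } else if (b == '{' || b == '[') {
                depth++;
            } else if (b == '}' || b == ']') {
                if (--depth == 0) {
                    // End offset found: the value is json[start..i], no tokens rebuilt.
                    return new ByteSlice(json, start, i - start + 1);
                }
            }
        }
        throw new IllegalArgumentException("unterminated JSON value");
    }

    public static void main(String[] args) {
        byte[] doc = "{\"a\":{\"b\":[1,\"]}\"],\"c\":true},\"d\":2}".getBytes(StandardCharsets.UTF_8);
        int start = 5; // offset of the '{' that opens the value of "a"
        System.out.println(sliceContainer(doc, start).utf8());
    }
}
```

The returned view points into the caller's array, so the cost is one pass to find the end offset rather than one write per token.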

Number extraction — Previously called parser.text() which makes Jackson convert the number
to a Java String, then wraps in BytesRef. Now byte-slices the number literal directly from
the input array, avoiding the String allocation entirely.
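
A minimal sketch of the number case, under the assumption that a parser has already validated the token and reported its start offset. `NumberSlicer` and `numberEnd` are hypothetical names, not methods from the PR or from Jackson; the point is only that the literal is fully described by two offsets into the input array.

```java
import java.nio.charset.StandardCharsets;

final class NumberSlicer {
    /**
     * Returns the exclusive end offset of the number literal that starts at {@code start}.
     * This finds the extent of the token; it does not validate JSON number syntax.
     */
    static int numberEnd(byte[] json, int start) {
        int end = start;
        while (end < json.length && isNumberByte(json[end])) {
            end++;
        }
        return end;
    }

    private static boolean isNumberByte(byte b) {
        return (b >= '0' && b <= '9') || b == '-' || b == '+' || b == '.' || b == 'e' || b == 'E';
    }

    public static void main(String[] args) {
        byte[] doc = "{\"n\":-12.5e3,\"m\":7}".getBytes(StandardCharsets.UTF_8);
        int start = 5; // offset of '-' in the value of "n"
        int end = numberEnd(doc, start);
        // (doc, start, end - start) is the whole result; the String below is for display only.
        System.out.println(new String(doc, start, end - start, StandardCharsets.UTF_8));
    }
}
```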

Boolean extraction — Reuses static TRUE_BYTES / FALSE_BYTES constants instead of
allocating a new BytesRef("true") / BytesRef("false") per call.
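
The constant-reuse idea can be sketched as follows. This is an illustration only: the PR names the constants `TRUE_BYTES` and `FALSE_BYTES`, but the `byte[]` type, the `BooleanBytes` class, and the `of` method here are assumptions made to keep the example free of Lucene dependencies.

```java
import java.nio.charset.StandardCharsets;

final class BooleanBytes {
    // Encoded once at class initialization and shared by every call.
    private static final byte[] TRUE_BYTES = "true".getBytes(StandardCharsets.UTF_8);
    private static final byte[] FALSE_BYTES = "false".getBytes(StandardCharsets.UTF_8);

    /** Returns a shared array; callers must treat it as read-only. */
    static byte[] of(boolean value) {
        return value ? TRUE_BYTES : FALSE_BYTES;
    }

    public static void main(String[] args) {
        // Both calls return the same array instance, so nothing is allocated per call.
        System.out.println(of(true) == of(true));
    }
}
```

Sharing is safe only as long as no consumer mutates the returned bytes.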

Navigation refactoring — Replaced recursive descent (extractValue → navigateObject →
extractValue → ...) with an iterative loop. Navigation methods are now pure parser-positioning
helpers that don't need the byte-slicing context, keeping raw byte access confined to the
extraction point.
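
The shape of the iterative loop can be shown on an in-memory tree. This is a simplification and not the PR's code: the real navigation positions a streaming parser, while this sketch walks nested `Map`/`List` values so that it runs on the standard library alone. `PathNavigator`, `navigate`, and the segment format (plain keys, decimal array indices) are hypothetical.

```java
import java.util.List;
import java.util.Map;

final class PathNavigator {
    /** Walks the path one segment per loop iteration; returns null when a segment is missing. */
    static Object navigate(Object root, List<String> segments) {
        Object current = root;
        for (String segment : segments) {
            if (current instanceof Map) {
                current = ((Map<?, ?>) current).get(segment);
            } else if (current instanceof List) {
                List<?> list = (List<?>) current;
                int index = parseIndex(segment);
                current = (index >= 0 && index < list.size()) ? list.get(index) : null;
            } else {
                return null; // reached a scalar before the path ended
            }
            if (current == null) {
                return null;
            }
        }
        // Navigation only positions; extracting the value is the caller's job.
        return current;
    }

    private static int parseIndex(String segment) {
        try {
            return Integer.parseInt(segment);
        } catch (NumberFormatException e) {
            return -1; // not an array index
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("a", Map.of("b", List.of(10, 20, 30)));
        System.out.println(navigate(doc, List.of("a", "b", "1")));
    }
}
```

Because the loop carries only the current position, the helper needs no builder or byte-buffer arguments, which matches the separation described above.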

Non-JSON _source formats (SMILE/CBOR/YAML) fall back to copyCurrentStructure().

Benchmarks

Also adds json_extract and json_extract_object scenarios to EvalBenchmark, and a
dedicated JsonExtractBenchmark with 10 scenarios through the full eval pipeline (EvalMapper →
Layout → Page → Evaluator).

Environment: Apple M3 Max, JDK 25.0.1, JMH 1.37, warmup 3×2s, measurement 5×2s.

Scenario	Before (ns/op)	After (ns/op)	Change
small_object (30B)	222.0 ± 2.8	115.9 ± 3.1	-47.8%
medium_object (500B)	1,275.9 ± 27.0	662.2 ± 15.7	-48.1%
large_object (4KB)	24,531.3 ± 1,641	15,938.0 ± 721	-35.0%
large_nested_extract (10KB doc)	12,323.1 ± 458	6,664.0 ± 180	-45.9%
array_of_objects ([25] of 50)	4,253.0 ± 76	3,853.5 ± 68	-9.4%
nested_scalar (5 levels)	206.2 ± 4.9	178.9 ± 3.1	-13.2%
deep_nesting (10 levels)	478.7 ± 54	324.9 ± 10.7	-32.1%
number	160.0 ± 2.4	133.1 ± 3.2	-16.8%
boolean	106.0 ± 2.1	100.6 ± 2.5	-5.1%
string	107.6 ± 2.8	103.1 ± 2.0	-4.2%

Largest wins on object/array extraction (35–48%) where copyCurrentStructure was the hot path.
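
For readers checking the table, the Change column is consistent with (after - before) / before. A small sketch of that arithmetic, with the hypothetical helper name `percentChange`:

```java
final class BenchmarkChange {
    /** Relative change in percent; negative values mean the "after" run is faster. */
    static double percentChange(double beforeNs, double afterNs) {
        return (afterNs - beforeNs) / beforeNs * 100.0;
    }

    public static void main(String[] args) {
        // Mean values taken from the small_object and number rows of the table.
        System.out.println(percentChange(222.0, 115.9));
        System.out.println(percentChange(160.0, 133.1));
    }
}
```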

Relates #142873

…traction Replace copyCurrentStructure() re-serialization with zero-copy byte slicing for JSON input. When the extracted value is an object, array, or number, slice bytes directly from the input buffer using XContentLocation.byteOffset() offsets (exposed in elastic#143501). Also refactors navigation from recursive descent to iterative loop, confining raw byte access to the extraction point. Adds JMH benchmarks for JSON_EXTRACT through the full eval pipeline.

elasticsearchmachine · 2026-03-05T17:18:32Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

coderabbitai · 2026-03-05T23:54:44Z

Important

Review skipped

Auto reviews are limited based on label configuration.

🏷️ Required labels (at least one) (2)

Team:Delivery
Team:Search - Inference

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

coderabbitai · 2026-03-06T00:15:31Z

Note

Unit test generation is a public access feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Generating unit tests... This may take up to 20 minutes.

coderabbitai · 2026-03-06T00:19:59Z

❌ Failed to create PR with unit tests: AGENT_CHAT: Failed to open pull request

quackaplop added >enhancement :Analytics/ES|QL AKA ESQL labels Mar 5, 2026

elasticsearchmachine added v9.4.0 Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Mar 5, 2026

quackaplop and others added 4 commits March 5, 2026 17:19

Add changelog for elastic#143702

8de1155

[CI] Auto commit changes from spotless

56f0451

Clean up navigation helpers to avoid threading unused parameters

8d49e7c

Navigation methods now only position the parser — they no longer carry builder, segments, depth, rawBytes, or rawOffset.

Merge branch 'main' into feature/json-extract-byte-slicing

20101a2

quackaplop requested a review from nik9000 March 5, 2026 23:53

Use full variable names instead of abbreviations

cdf2e27

quackaplop wants to merge 6 commits into elastic:main from quackaplop:feature/json-extract-byte-slicing
