
Allow compressed batch sizes up to 32767 #9230

Open

akuzm wants to merge 9 commits into timescale:main from akuzm:bigger-batch

Conversation

@akuzm
Member

@akuzm akuzm commented Feb 4, 2026

Batch sizes up to 32767 should already be supported everywhere in the code. We have a GUC that can reduce the limit below the current default of 1000, and this PR also allows the GUC to go up.

The GUC is timescaledb.compression_batch_size_limit.

I'm not going to advertise this in the changelog for now, until we get a better understanding of the implications. But this GUC will be useful for experimenting with various batch sizes.

Disable-check: force-changelog-file
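For illustration, here is a rough sketch of how the GUC could be exercised. The hypertable and its compression settings are made up for the example; the assumption is that the limit is read when batches are created, so it can be raised per session before compressing:

```sql
-- Illustrative hypertable; only the GUC name comes from this PR.
CREATE TABLE metrics (time timestamptz NOT NULL, device_id int, value float8, label text);
SELECT create_hypertable('metrics', 'time');
ALTER TABLE metrics SET (timescaledb.compress, timescaledb.compress_segmentby = 'device_id');

-- Assumption: the limit applies when batches are built, so raising it for
-- this session affects the subsequent compression call.
SET timescaledb.compression_batch_size_limit = 32767;
SELECT compress_chunk(c) FROM show_chunks('metrics') AS c;
RESET timescaledb.compression_batch_size_limit;
```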

@akuzm akuzm requested a review from a team February 4, 2026 13:55
@github-actions github-actions bot requested review from dbeck and melihmutlu February 4, 2026 13:55
@github-actions

github-actions bot commented Feb 4, 2026

@melihmutlu, @dbeck: please review this pull request.


@codecov

codecov bot commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 90.47619% with 2 lines in your changes missing coverage. Please review.

Files with missing lines          | Patch %  | Lines
tsl/src/compression/compression.c | 87.50%   | 1 Missing and 1 partial ⚠️


@svenklemm
Member

Did you test compressing something with 32k and then querying it with 1k?
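Roughly this scenario, sketched on the same illustrative metrics hypertable as above, assuming the limit only matters when batches are written, so already-compressed larger batches must still read back correctly:

```sql
-- Compress with the raised limit...
SET timescaledb.compression_batch_size_limit = 32767;
SELECT compress_chunk(c) FROM show_chunks('metrics') AS c;

-- ...then query (and hence decompress) with the limit back at the old default.
SET timescaledb.compression_batch_size_limit = 1000;
SELECT count(*), max(value) FROM metrics;
```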

@dbeck
Member

dbeck commented Feb 4, 2026

I think we should do comprehensive testing across all algorithms and a good number of data types. The algorithm tests should include fallbacks, like dictionary to array and UUID to dictionary.
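One hypothetical shape for such a fallback test, again on the illustrative metrics table, assuming that mostly-unique text values within a single large batch push the dictionary compressor to fall back to array compression (data shape and thresholds are guesses, not taken from the code):

```sql
SET timescaledb.compression_batch_size_limit = 32767;

-- More than 32767 rows of mostly-unique labels in one segment.
INSERT INTO metrics (time, device_id, label)
SELECT t, 1, md5(random()::text)
FROM generate_series('2026-01-01'::timestamptz, '2026-01-03', '5 seconds') AS t;

SELECT compress_chunk(c) FROM show_chunks('metrics') AS c;

-- Verify the data round-trips after compression.
SELECT count(DISTINCT label) FROM metrics;
```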

@akuzm
Member Author

akuzm commented Feb 5, 2026

Benchmarked with 32767 default batch size just for fun: https://grafana.dev-us-east-1.ops.dev.timescale.com/d/fasYic_4z/compare-benchmark-runs?orgId=1&var-run1=5309&var-run2=5310&var-postgres=16&var-branch=All&var-threshold=0.02&var-use_historical_thresholds=true&var-threshold_expression=2.0%20%2A%20percentile_cont%280.90%29&var-exact_suite_version=true

Lots of regressions, mostly due to DML slowdowns and worse selectivity of the compressed metadata filters. Some queries with aggregation are up to 40% faster. Interestingly, there are also some big improvements in some last-point queries.

@akuzm
Member Author

akuzm commented Feb 5, 2026

@dbeck @svenklemm I added more tests and had to fix a couple of places where we didn't actually support the bigger batches.

Member

@dbeck dbeck left a comment


lgtm

