Skip to content

Conversation

colega
Copy link
Contributor

@colega colega commented Mar 26, 2025

This adds an experimental BatchSeriesRefs method in the appender that should be used to create all series from a scrape/write request in a single call, reducing the mutex contention on MemPostings.Add for customers with a lot of churn and for Mimir ingesters that have recently started (or went out of read-only mode) and are receiving avalanches of new series.

@colega colega force-pushed the batch-appender branch 2 times, most recently from e0efb52 to bf5711f Compare March 27, 2025 14:27
@colega colega changed the title storage.BatchAppender with BatchSeriesRef to create all series in a single call storage.BatchAppender with BatchSeriesRefs to create all series in a single call Mar 31, 2025
@colega colega changed the title storage.BatchAppender with BatchSeriesRefs to create all series in a single call storage.BatchSeriesReferencer to create all series in a single call Mar 31, 2025
colega added 7 commits March 31, 2025 17:07
`storage.BatchAppender` adds the BatchSeriesRefs method to the Appender().
This method can be used to make sure that all the series from the
incoming scrape or write request already exist (not wired here, as I'm
implementing it for Mimir, but can be used for Prometheus as well).

The reason to add this is to take advantage of a single operation to
reduce locking time on MemPostings among others: right now during Mimir
startup we have to call `Add()` millions of times within a few seconds,
which causes lots of lock contention, but most of the time is actually
spent locking & unlocking rather than adding series.

This method does not implement the optimization yet: it only focuses on
the exposed interface.

I couldn't make Head.Appender() return a BatchAppender because it returns
the initAppender which initializes the head on the first append call:
if I wanted initAppender to implement BatchAppender we would need to add
a timestamp to BatchSeriesRefs, which doesn't really belong here. So we are
exposing HeadInitialized() and InitializeHead() methods and move the
initialization responisibility to the caller, so it can be sure that
BatchAppender() method has always an initialized appender.

Signed-off-by: Oleg Zaytsev <[email protected]>
This goes one step further, and does batch series creation in
Appender.BatchSeriesRef as well as creates a new implementation of
getOrCreateBatch().

One of the advantages of the getOrCreateBatch() is that it's now adding
new seires to MemPostings in a batch, probably in a more sorted fashion
than previous implementation.

MemPostings.AddBatch was added but it trivially uses Add() so far.

Signed-off-by: Oleg Zaytsev <[email protected]>
This implements AddBatch() in a simple way: take the mutex, create
series using the existing mechanisms, and make pauses every 512 series
to allow reads to proceed: we don't want them to wait forever.

This optimizes the locking/unlocking time: effectively taking it 512x
times less.

I think we could still do better: if batch is large enough (1000
elements?) we could send it through some workers optimizing the locked
time.

We can just ensure from the beginning that all labels exist, and then
shard the labels across GOMAXPROCS workers, and send each label to it's
specific worker, without having to coordinate them and still holding a
single mutex.

Signed-off-by: Oleg Zaytsev <[email protected]>
This modifies the method to copy the provided labels before creating
them in the index and Head, and return the existing stored labels if
possible.

This is quite specific to our Mimir optimizations, but it's really
needed if we want to use this method with all the current optimizations.

Signed-off-by: Oleg Zaytsev <[email protected]>
Adding series to a.series will make sure they're tracked in the WAL.
There's an increased risk of losing these series if gc() happens in
between, that should be fixed once prometheus#16333 is merged.

Signed-off-by: Oleg Zaytsev <[email protected]>
colega added 4 commits March 31, 2025 17:31
Signed-off-by: Oleg Zaytsev <[email protected]>
Signed-off-by: Oleg Zaytsev <[email protected]>
Signed-off-by: Oleg Zaytsev <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant