@tonynajjar
Description

Summary

This PR preserves messages from topics with TRANSIENT_LOCAL QoS durability when recording in snapshot mode. Previously, messages on transient local topics (such as /tf_static, parameter events, or latched topics) could be lost when the circular buffer rolled over, even though they carry critical state information. With this change, the latest message on each such topic is always included in snapshots.

Problem

In snapshot mode, rosbag2 uses a circular buffer to maintain a rolling window of messages. When a snapshot is triggered, the buffer contents are dumped to disk. However, topics with TRANSIENT_LOCAL durability, which are meant to keep their latest message available for late-joining subscribers, were not treated specially, so their messages could be evicted from the buffer before a snapshot was taken.
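
To make the failure mode concrete, here is a minimal sketch of a rolling window. `Msg` and `RollingWindow` are hypothetical stand-ins, not rosbag2 types, and the window is bounded by message count purely for simplicity; the real cache may bound it differently. The only point is that eviction does not look at a topic's durability.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical stand-in for a recorded message; not the rosbag2 type.
struct Msg
{
  std::string topic_name;
  long recv_timestamp;
};

// Toy rolling window bounded by message count (an assumption made for brevity).
class RollingWindow
{
public:
  explicit RollingWindow(std::size_t capacity)
  : capacity_(capacity)
  {
    assert(capacity_ > 0);
  }

  void push(Msg msg)
  {
    if (buffer_.size() == capacity_) {
      // The oldest message is dropped regardless of its topic's QoS.
      buffer_.pop_front();
    }
    buffer_.push_back(std::move(msg));
  }

  bool contains_topic(const std::string & topic) const
  {
    for (const auto & m : buffer_) {
      if (m.topic_name == topic) {
        return true;
      }
    }
    return false;
  }

  std::size_t size() const {return buffer_.size();}

private:
  std::size_t capacity_;
  std::deque<Msg> buffer_;
};
```

With a capacity of three, a single `/tf_static` message published at startup is gone as soon as three later messages arrive on any other topic.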

Solution

The implementation adds a separate storage mechanism for transient local messages:

1. Cache Layer (CircularMessageCache)

  • Added transient_local_messages_ map to store latest message per transient local topic
  • Added push_transient_local() method for separate message routing
  • Modified swap_buffers() to merge transient local messages into snapshots with adjusted timestamps
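
The cache-layer idea can be sketched roughly as follows. This is not the code in this PR: `SnapshotCacheSketch` and `take_snapshot()` are hypothetical names, the double-buffering of the real cache is omitted, and leaving timestamps untouched when the rolling window is empty is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serialized bag message.
struct Msg
{
  std::string topic_name;
  std::int64_t recv_timestamp;
  std::int64_t send_timestamp;
};
using MsgPtr = std::shared_ptr<const Msg>;

class SnapshotCacheSketch
{
public:
  // Regular path: messages go into the rolling window (eviction omitted here).
  void push(MsgPtr msg)
  {
    assert(msg != nullptr);
    std::lock_guard<std::mutex> lock(mutex_);
    rolling_.push_back(std::move(msg));
  }

  // Protected path: keep only the latest message per topic.
  void push_transient_local(MsgPtr msg)
  {
    assert(msg != nullptr);
    std::lock_guard<std::mutex> lock(mutex_);
    const std::string topic = msg->topic_name;
    latest_[topic] = std::move(msg);
  }

  // Emit the protected messages first, re-stamped to the start of the window,
  // followed by the rolling window itself. The stored originals are not modified.
  std::vector<MsgPtr> take_snapshot()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    std::vector<MsgPtr> out;
    out.reserve(latest_.size() + rolling_.size());
    for (const auto & entry : latest_) {
      auto copy = std::make_shared<Msg>(*entry.second);
      if (!rolling_.empty()) {
        copy->recv_timestamp = rolling_.front()->recv_timestamp;
        copy->send_timestamp = copy->recv_timestamp;
      }
      out.push_back(std::move(copy));
    }
    out.insert(out.end(), rolling_.begin(), rolling_.end());
    rolling_.clear();
    return out;
  }

private:
  std::mutex mutex_;
  std::deque<MsgPtr> rolling_;
  std::unordered_map<std::string, MsgPtr> latest_;
};
```

The sketch does not de-duplicate: if a transient local message is still inside the rolling window, it appears in the snapshot twice.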

2. Writer Layer (SequentialWriter, Writer)

  • Added mark_topic_as_transient_local() API to register topics
  • Modified write() to route transient local messages to protected storage
  • Added transient_local_topics_ set to track registered topics
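
A rough sketch of the writer-side routing, mirroring the `write()` hunk quoted further down in this conversation: every message still goes through the regular `push()`, and messages on registered topics are additionally handed to the protected store. `WriterSketch` and `RecordingCache` are hypothetical stand-ins for illustration only.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a bag message.
struct Msg
{
  std::string topic_name;
  std::string payload;
};

// Hypothetical cache that simply records which path each message took.
struct RecordingCache
{
  std::vector<Msg> rolling;
  std::vector<Msg> transient_local;

  bool push(const Msg & msg)
  {
    rolling.push_back(msg);
    return true;
  }

  void push_transient_local(const Msg & msg)
  {
    transient_local.push_back(msg);
  }
};

class WriterSketch
{
public:
  explicit WriterSketch(RecordingCache & cache)
  : cache_(cache) {}

  // Register a topic whose messages must survive buffer rollover.
  void mark_topic_as_transient_local(const std::string & topic)
  {
    assert(!topic.empty());
    transient_local_topics_.insert(topic);
  }

  // Returns false if the regular cache rejected the message.
  bool write(const Msg & msg)
  {
    const bool accepted = cache_.push(msg);
    if (transient_local_topics_.count(msg.topic_name) > 0) {
      cache_.push_transient_local(msg);
    }
    return accepted;
  }

private:
  RecordingCache & cache_;
  std::unordered_set<std::string> transient_local_topics_;
};
```

An `std::unordered_set` keeps the per-message lookup at average constant time, which matters because `write()` sits on the hot path of every recorded message.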

3. Recorder Integration

  • Automatic detection of TransientLocal QoS durability during subscription
  • Automatic registration of detected topics with writer

4. Interface Updates

  • Added virtual push_transient_local() to MessageCacheInterface with default fallback
  • Added mark_topic_as_transient_local() to BaseWriterInterface
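
The interface change can be illustrated with the sketch below. The description does not spell out what the default fallback does; this sketch assumes a no-op, on the reasoning that `write()` already calls `push()` before `push_transient_local()`, so forwarding to `push()` would record the message twice. All class names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bag message.
struct Msg
{
  std::string topic_name;
};
using MsgPtr = std::shared_ptr<const Msg>;

class CacheInterfaceSketch
{
public:
  virtual ~CacheInterfaceSketch() = default;

  virtual bool push(MsgPtr msg) = 0;

  // ASSUMPTION: the default does nothing, so existing cache implementations
  // keep compiling and behaving as before without overriding this method.
  virtual void push_transient_local(MsgPtr msg) {(void)msg;}
};

// A cache that does not care about transient local messages.
class PlainCache : public CacheInterfaceSketch
{
public:
  bool push(MsgPtr msg) override
  {
    assert(msg != nullptr);
    messages.push_back(std::move(msg));
    return true;
  }

  std::vector<MsgPtr> messages;
};

// A cache that opts in by overriding the new virtual method.
class SnapshotCache : public PlainCache
{
public:
  void push_transient_local(MsgPtr msg) override
  {
    assert(msg != nullptr);
    latched.push_back(std::move(msg));
  }

  std::vector<MsgPtr> latched;
};
```

Giving the new method a body instead of making it pure virtual is what keeps out-of-tree cache implementations source-compatible.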

Fixes #1886

Is this user-facing behavior change?

No

Did you use Generative AI?

Claude Sonnet 4.5

Additional Information

tonynajjar changed the title from "add transient local persistence for snapshot mode" to "Persist Transient Local topics in snapshot mode" on Dec 22, 2025
message_lost = !message_cache_->push(converted_msg);
// Check if this topic uses transient_local durability
if (transient_local_topics_.find(message->topic_name) != transient_local_topics_.end()) {
  message_cache_->push_transient_local(converted_msg);
Author

TODO: handling of lost messages?

std::lock_guard<std::mutex> cache_lock(transient_local_buffer_mutex_);
// Store/update the latest message for this topic
// This ensures we always have the most recent state for transient local topics
transient_local_messages_[msg->topic_name] = std::move(msg);
Author

TODO: store more than just the last msg?
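
One possible direction for this TODO, sketched under assumptions: keep a bounded history per topic instead of a single slot. `BoundedTransientLocalStore` and its `depth` parameter are hypothetical and not part of this PR; how the depth would be chosen (for example from the publisher's history depth) is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a bag message; seq only exists to make order visible.
struct Msg
{
  std::string topic_name;
  int seq;
};
using MsgPtr = std::shared_ptr<const Msg>;

class BoundedTransientLocalStore
{
public:
  explicit BoundedTransientLocalStore(std::size_t depth)
  : depth_(depth)
  {
    assert(depth_ > 0);
  }

  // Keep at most depth_ messages per topic, dropping the oldest first.
  void push(MsgPtr msg)
  {
    assert(msg != nullptr);
    auto & history = per_topic_[msg->topic_name];
    if (history.size() == depth_) {
      history.pop_front();
    }
    history.push_back(std::move(msg));
  }

  // Returns a copy of the stored history, oldest first; empty if unknown topic.
  std::deque<MsgPtr> history(const std::string & topic) const
  {
    const auto it = per_topic_.find(topic);
    return it == per_topic_.end() ? std::deque<MsgPtr>{} : it->second;
  }

private:
  std::size_t depth_;
  std::unordered_map<std::string, std::deque<MsgPtr>> per_topic_;
};
```

With a depth of 1 this behaves like the single-slot map in the hunk above; larger depths trade memory for a longer protected history.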

Comment on lines +120 to +126
// Use the front message timestamp as the snapshot start time for transient local messages.
// This may not be the absolute earliest timestamp in the buffer (messages can arrive
// out of order), but it's close enough - typically within 1-2ms of the true minimum.
// This avoids the overhead of searching through all messages for the exact minimum.
if (!consumer_data.empty()) {
  snapshot_start_time = consumer_data.front()->recv_timestamp;
}
Author

TODO: is this acceptable?
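
For comparison, the exact alternative is a single linear scan with `std::min_element`. The sketch below puts both options side by side; the container type (`std::vector` of shared pointers) and the function names are assumptions, since the hunk does not show the type of `consumer_data`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

// Hypothetical stand-in for a bag message.
struct Msg
{
  std::int64_t recv_timestamp;
};
using MsgPtr = std::shared_ptr<const Msg>;

// O(1): what the hunk above does. May be slightly later than the true
// minimum if messages were received out of order.
std::optional<std::int64_t> front_timestamp(const std::vector<MsgPtr> & data)
{
  if (data.empty()) {
    return std::nullopt;
  }
  return data.front()->recv_timestamp;
}

// O(n): exact minimum over the whole buffer.
std::optional<std::int64_t> min_timestamp(const std::vector<MsgPtr> & data)
{
  const auto it = std::min_element(
    data.begin(), data.end(),
    [](const MsgPtr & a, const MsgPtr & b) {
      assert(a != nullptr && b != nullptr);
      return a->recv_timestamp < b->recv_timestamp;
    });
  if (it == data.end()) {
    return std::nullopt;
  }
  return (*it)->recv_timestamp;
}
```

If the front-based value is kept, any message that arrived out of order with an earlier timestamp would sort before the re-stamped transient local messages in the resulting bag; the scan avoids that at the cost of one pass over the buffer per snapshot.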

Comment on lines +134 to +135
updated_msg->recv_timestamp = snapshot_start_time;
updated_msg->send_timestamp = snapshot_start_time;
Author

TODO: not sure if they should have the same timestamp

Development

Successfully merging this pull request may close these issues.

Snapshot mode throws away transient local topics when the buffer is full (in other words, tf_static is lost)
