kewang1024
Collaborator

@kewang1024 kewang1024 commented Oct 17, 2025

Exchange uses a two-phase protocol (#21926):

  1. The first request is a get-data-size request. It uses the long-poll protocol and waits
    until the exchange source has data or the request times out.
  2. The second request is a get-data request. It is a non-blocking call that transmits the data.

Recording separate counters for the two phases makes it possible to tell the time spent
waiting on the server apart from the time spent transmitting data.
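
As a rough illustration of the split described above, here is a minimal sketch, not the actual change. `MetricSink` and `recordExchangeRequest` are hypothetical names; the map keys reuse the counter identifiers from this PR as plain strings, and recording page size only for get-data requests is an assumption based on the description.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory sink standing in for the real metrics backend:
// metric name -> recorded samples.
using MetricSink = std::map<std::string, std::vector<int64_t>>;

// Records one completed exchange request. Duration and try count go to
// phase-specific histograms; page size is recorded only for get-data
// (an assumption in this sketch).
inline void recordExchangeRequest(
    MetricSink& sink,
    bool isGetDataSizeRequest,
    int64_t durationMs,
    int64_t numTries,
    int64_t pageBytes) {
  assert(durationMs >= 0 && numTries >= 1);
  if (isGetDataSizeRequest) {
    // Long-poll phase: dominated by time spent waiting for data.
    sink["kCounterExchangeGetDataSizeDuration"].push_back(durationMs);
    sink["kCounterExchangeGetDataSizeNumTries"].push_back(numTries);
  } else {
    // Non-blocking phase: dominated by time spent transmitting data.
    sink["kCounterExchangeRequestDuration"].push_back(durationMs);
    sink["kCounterExchangeRequestNumTries"].push_back(numTries);
    sink["kCounterExchangeRequestPageSize"].push_back(pageBytes);
  }
}
```

With this shape, a slow long poll no longer inflates the get-data duration histogram, which is the separation the PR is after.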

== NO RELEASE NOTE ==

@kewang1024 kewang1024 requested review from a team as code owners October 17, 2025 09:34
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label Oct 17, 2025
@sourcery-ai
Contributor

sourcery-ai bot commented Oct 17, 2025

Reviewer's Guide

This PR separates exchange protocol metrics by introducing distinct histograms for the initial get-data-size phase vs the data retrieval phase, refines existing histogram parameters, and updates PrestoExchangeSource to tag and record metrics based on request type.

Sequence diagram for differentiated metric recording in exchange protocol

sequenceDiagram
    participant Client
    participant PrestoExchangeSource
    participant Metrics
    Client->>PrestoExchangeSource: get-data-size request (long-poll)
    PrestoExchangeSource->>Metrics: Record kCounterExchangeGetDataSizeDuration
    PrestoExchangeSource->>Metrics: Record kCounterExchangeGetDataSizeNumTries
    Client->>PrestoExchangeSource: get-data request (non-blocking)
    PrestoExchangeSource->>Metrics: Record kCounterExchangeRequestDuration
    PrestoExchangeSource->>Metrics: Record kCounterExchangeRequestNumTries
    PrestoExchangeSource->>Metrics: Record kCounterExchangeRequestPageSize

Class diagram for updated PrestoExchangeSource and Counters

classDiagram
    class PrestoExchangeSource {
        +void processDataResponse(std::unique_ptr<http::HttpResponse> response, bool getDataSize)
    }
    class Counters {
        +kCounterExchangeRequestDuration
        +kCounterExchangeRequestNumTries
        +kCounterExchangeRequestPageSize
        +kCounterExchangeGetDataSizeDuration
        +kCounterExchangeGetDataSizeNumTries
    }

File-Level Changes

Change: Refined existing exchange request metrics and added page size histogram
Details:
  • Updated exchange request duration histogram to use 20ms buckets over 0–10s
  • Introduced page size histogram with 10KB buckets up to 20MB
Files:
  • presto-native-execution/presto_cpp/main/common/Counters.cpp
  • presto-native-execution/presto_cpp/main/common/Counters.h

Change: Added dedicated histograms for get-data-size requests
Details:
  • Defined exchange.get-data-size.duration histogram
  • Defined exchange.get-data-size.num_tries histogram
Files:
  • presto-native-execution/presto_cpp/main/common/Counters.cpp
  • presto-native-execution/presto_cpp/main/common/Counters.h

Change: Differentiated request phases in PrestoExchangeSource
Details:
  • Passed a getDataSize flag (maxBytes==0) into handleDataResponse
  • Extended processDataResponse signature to accept the phase flag
  • Conditionally record size vs data histograms and page size based on getDataSize
Files:
  • presto-native-execution/presto_cpp/main/PrestoExchangeSource.cpp
  • presto-native-execution/presto_cpp/main/PrestoExchangeSource.h
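
To make the bucket parameters listed above concrete, the sketch below works out how many fixed-width buckets each histogram would need. `HistogramSpec`, `numBuckets`, and `bucketIndex` are hypothetical helpers, not the real histogram definition API. The page size figures assume decimal units (10KB = 10,000 bytes, 20MB = 20,000,000 bytes), and clamping out-of-range values into the edge buckets is a simplification.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a fixed-width histogram.
struct HistogramSpec {
  int64_t bucketWidth;
  int64_t min;
  int64_t max;
};

// Number of fixed-width buckets needed to cover [min, max).
inline int64_t numBuckets(const HistogramSpec& spec) {
  assert(spec.bucketWidth > 0 && spec.max > spec.min);
  return (spec.max - spec.min + spec.bucketWidth - 1) / spec.bucketWidth;
}

// Index of the bucket a value falls into. Out-of-range values are clamped
// to the first or last bucket (a simplification in this sketch).
inline int64_t bucketIndex(const HistogramSpec& spec, int64_t value) {
  if (value < spec.min) {
    return 0;
  }
  if (value >= spec.max) {
    return numBuckets(spec) - 1;
  }
  return (value - spec.min) / spec.bucketWidth;
}

// Exchange request duration in milliseconds: 20ms buckets over 0-10s.
const HistogramSpec kDurationMsSpec{20, 0, 10000};
// Page size in bytes: 10KB buckets up to 20MB, assuming decimal units.
const HistogramSpec kPageSizeBytesSpec{10000, 0, 20000000};
```

Under these assumptions the duration histogram has 500 buckets and the page size histogram has 2000, so a 45ms request falls into bucket 2 and any request of 10s or longer lands in the last bucket.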

@kewang1024 kewang1024 changed the title feat(native): Separate exchange get-data-size vs get-data counters fo… feat(native): Separate exchange get-data-size vs get-data counters Oct 17, 2025
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • Using a boolean flag in processDataResponse to differentiate request types can be error-prone; consider splitting into two explicit handler methods or using an enum for better clarity.
  • Inferring get-data-size requests via maxBytes == 0 is implicit and may lead to confusion; introducing an explicit parameter or separate request path would make the intent clearer.
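
The enum alternative mentioned in the feedback above might look roughly like the following sketch. `ExchangeRequestType`, `requestTypeFor`, and `durationCounterFor` are hypothetical names that do not appear in the PR; the returned strings are the counter identifiers from this PR used as plain labels.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical explicit request type replacing the boolean flag.
enum class ExchangeRequestType { kGetDataSize, kGetData };

// Keeps the maxBytes == 0 convention in exactly one place, so callers
// pass a named type instead of a bare bool.
inline ExchangeRequestType requestTypeFor(int64_t maxBytes) {
  assert(maxBytes >= 0);
  return maxBytes == 0 ? ExchangeRequestType::kGetDataSize
                       : ExchangeRequestType::kGetData;
}

// Selects the duration counter for a request type. A switch over the enum
// lets the compiler warn if a new request type is added but not handled.
inline std::string durationCounterFor(ExchangeRequestType type) {
  switch (type) {
    case ExchangeRequestType::kGetDataSize:
      return "kCounterExchangeGetDataSizeDuration";
    case ExchangeRequestType::kGetData:
      return "kCounterExchangeRequestDuration";
  }
  return "";  // Not reached for valid enum values.
}
```

A call site would then read `processDataResponse(std::move(response), requestTypeFor(maxBytes))`, assuming the signature were changed to take the enum.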
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/PrestoExchangeSource.cpp:249` </location>
<code_context>
         processDataError(httpRequestPath, maxBytes, maxWait, response->error());
       } else {
-        processDataResponse(std::move(response));
+        processDataResponse(std::move(response), maxBytes == 0);
       }
     } catch (const std::exception& e) {
</code_context>

<issue_to_address>
**suggestion:** The use of 'maxBytes == 0' to signal get-data-size requests could be error-prone.

Using 'maxBytes == 0' as a signal may cause confusion or bugs if the parameter's meaning changes or if zero becomes a valid value. An explicit flag or method would improve clarity and future maintainability.

Suggested implementation:

```cpp
      } else {
        const bool isGetDataSizeRequest = (maxBytes == 0);
        processDataResponse(std::move(response), isGetDataSizeRequest);
      }

```

```cpp
void PrestoExchangeSource::processDataResponse(
    std::unique_ptr<http::HttpResponse> response, bool isGetDataSizeRequest) {

```
</issue_to_address>


@kewang1024 kewang1024 force-pushed the improve-exchange-stats branch 3 times, most recently from be73073 to 7bee4cc Compare October 17, 2025 09:57
if (isGetDataSizeRequest) {
  RECORD_HISTOGRAM_METRIC_VALUE(
      kCounterExchangeGetDataSizeDuration,
      dataRequestRetryState_.durationMs());
Member

Not sure how useful the overall duration is. Should we add a metric that excludes the duration of the long poll (waiting for pages)?

Collaborator Author

I'm also debating with myself whether this is needed, but when I look at this counter, one useful insight it gives is a ballpark of how long we have to wait for data.

Collaborator Author

I spent some time thinking about how we could exclude the long-poll duration, but didn't find an easy way:

  • We would need to include the "get-data-size server ready time" in the HTTP response.
  • Since get-data-size is a very frequent call, that would introduce a non-trivial amount of overhead.

Any ideas?
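
For reference, the subtraction discussed above could be sketched as follows if the server did report how long it held the request. This is purely illustrative: `serverWaitMs` is a hypothetical value that no current response carries, and `durationExcludingLongPollMs` is not a function in the code base.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Client-observed duration minus the time the server spent holding the
// long poll. serverWaitMs is hypothetical: it assumes the server reports
// its own wait as a duration, which avoids comparing clocks across hosts.
inline int64_t durationExcludingLongPollMs(
    int64_t totalDurationMs,
    int64_t serverWaitMs) {
  assert(totalDurationMs >= 0 && serverWaitMs >= 0);
  // Clamp at zero in case the reported wait exceeds the observed total.
  return std::max<int64_t>(0, totalDurationMs - serverWaitMs);
}
```

The arithmetic is trivial; the cost raised in this thread is in carrying the extra value on every get-data-size response.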

      dataRequestRetryState_.durationMs());
  RECORD_HISTOGRAM_METRIC_VALUE(
      kCounterExchangeGetDataSizeNumTries,
      dataRequestRetryState_.numTries());
Member

Does the duration include all retries? Do we have (want to have?) metrics for a single request?

Collaborator Author

Yes, it includes all retries. So far, in my observation, get-data requests have not needed any retries at all.
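
To illustrate the difference between a duration that spans all retries and a per-request duration, here is a minimal sketch. `SimpleRetryState` is a hypothetical class, not the retry state used by `dataRequestRetryState_`; timestamps are passed in explicitly so the behavior is deterministic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical retry tracker: remembers when the first attempt started and
// when the most recent attempt started.
class SimpleRetryState {
 public:
  explicit SimpleRetryState(int64_t startMs)
      : startMs_(startMs), lastAttemptStartMs_(startMs) {}

  // Call at the start of every attempt, including the first one.
  void recordAttempt(int64_t attemptStartMs) {
    assert(attemptStartMs >= lastAttemptStartMs_);
    ++numTries_;
    lastAttemptStartMs_ = attemptStartMs;
  }

  int64_t numTries() const {
    return numTries_;
  }

  // Time since the first attempt started, so it includes all retries.
  int64_t durationMs(int64_t nowMs) const {
    return nowMs - startMs_;
  }

  // Time since the latest attempt started, i.e. a single request.
  int64_t lastAttemptDurationMs(int64_t nowMs) const {
    return nowMs - lastAttemptStartMs_;
  }

 private:
  int64_t startMs_;
  int64_t lastAttemptStartMs_;
  int64_t numTries_{0};
};
```

When there are no retries the two durations are identical, which fits the observation above for get-data.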

@kewang1024 kewang1024 requested a review from arhimondr October 20, 2025 16:32
Contributor

@aditi-pandit aditi-pandit left a comment

Thanks. We'll enable this in our clusters as well and share our observations with you at the next native worker group meeting.

@aditi-pandit aditi-pandit merged commit 0085f7f into prestodb:master Oct 20, 2025
97 of 103 checks passed