perf: Change shuffle read API to return a row batch instead of io buffer #26322

xiaoxmeng · 2025-10-15T19:03:54Z

Summary: Extend the shuffle read API to return a row batch instead of an iobuf, so that we can avoid redundant parsing.

== NO RELEASE NOTE ==

sourcery-ai · 2025-10-15T19:04:01Z

Reviewer's Guide

This PR refactors the shuffle read API to return parsed row batches (ReadBatch) instead of raw IO buffers. It introduces a ReadBatch type, updates the ShuffleReader.next signature and its implementations in LocalPersistentShuffleReader and CompactRowExchangeSource, and renames UnsafeRowExchangeSource to CompactRowExchangeSource (adding a CompactRowBatch wrapper). Tests, ExchangeSource registration, and CMake build files are updated accordingly.

Class diagram for updated ShuffleReader and ReadBatch

classDiagram
    class ShuffleReader {
        <<interface>>
        +next() : SemiFuture<ReadBatch>
        +noMoreData(success: bool)
    }
    class ReadBatch {
        +rows: vector<string_view>
        +data: BufferPtr
        +ReadBatch(rows, data)
    }
    ShuffleReader --> ReadBatch
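
The ownership relationship in this diagram can be sketched as follows. This is a minimal, self-contained illustration rather than the PR's code: `Buffer` is a hypothetical stand-in for Velox's `BufferPtr`, and `makeFixedWidthBatch` is an illustrative helper that does not exist in the PR. The point it shows is that each `string_view` in `rows` points into `data`, so the batch has to keep `data` alive for as long as the rows are in use.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for Velox's BufferPtr: a shared, owning byte buffer.
using Buffer = std::shared_ptr<const std::string>;

// Mirrors the shape in the diagram: non-owning row views plus the buffer
// that owns the bytes they point into.
struct ReadBatch {
  std::vector<std::string_view> rows;
  Buffer data;

  ReadBatch(std::vector<std::string_view> rows, Buffer data)
      : rows(std::move(rows)), data(std::move(data)) {}
};

// Illustrative helper (not part of the PR): slices `bytes` into
// fixed-width rows that all view the same owning buffer.
std::unique_ptr<ReadBatch> makeFixedWidthBatch(
    std::string bytes,
    std::size_t width) {
  Buffer data = std::make_shared<std::string>(std::move(bytes));
  std::vector<std::string_view> rows;
  if (width > 0) {
    for (std::size_t off = 0; off + width <= data->size(); off += width) {
      rows.emplace_back(data->data() + off, width);
    }
  }
  return std::make_unique<ReadBatch>(std::move(rows), std::move(data));
}
```

Because the views and the buffer travel together, a consumer can hold the batch alone and never needs to copy row bytes out of it.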

Class diagram for CompactRowExchangeSource and CompactRowBatch

classDiagram
    class CompactRowBatch {
        +CompactRowBatch(rowBatch: ReadBatch)
        +rows() : vector<string_view>
        -rowBatch_: unique_ptr<ReadBatch>
    }
    class CompactRowExchangeSource {
        +CompactRowExchangeSource(taskId, destination, queue, shuffleReader, pool)
        +request(maxBytes, maxWait) : SemiFuture<Response>
        +requestDataSizes(maxWait) : SemiFuture<Response>
        +stats() : F14FastMap<string, int64_t>
        +createExchangeSource(url, destination, queue, pool) : shared_ptr<ExchangeSource>
        -shuffleReader_: shared_ptr<ShuffleReader>
    }
    CompactRowExchangeSource --> CompactRowBatch
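
A rough sketch of the wrapper relationship shown above, under simplifying assumptions: `ReadBatch` is reduced to a plain aggregate with a shared string in place of `BufferPtr`, the wrapper does not derive from any Velox type, and `totalRowBytes` is a hypothetical consumer added only for illustration. It shows how a consumer iterates rows that were already parsed by the reader instead of re-scanning the raw bytes.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Simplified stand-in for the PR's ReadBatch; `data` is a shared string
// here rather than Velox's BufferPtr.
struct ReadBatch {
  std::vector<std::string_view> rows;
  std::shared_ptr<const std::string> data;
};

// Wrapper in the spirit of CompactRowBatch: owns the batch and hands out
// the already-parsed rows.
class CompactRowBatch {
 public:
  explicit CompactRowBatch(std::unique_ptr<ReadBatch> rowBatch)
      : rowBatch_(std::move(rowBatch)) {}

  const std::vector<std::string_view>& rows() const {
    return rowBatch_->rows;
  }

 private:
  std::unique_ptr<ReadBatch> rowBatch_;
};

// Hypothetical consumer: sums the payload bytes across all rows without
// touching the size prefixes in the underlying buffer.
std::size_t totalRowBytes(const CompactRowBatch& batch) {
  std::size_t total = 0;
  for (std::string_view row : batch.rows()) {
    total += row.size();
  }
  return total;
}
```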

Class diagram for LocalPersistentShuffleReader changes

classDiagram
    class LocalPersistentShuffleReader {
        +LocalPersistentShuffleReader(rootPath, queryId, partitionIds, pool)
        +next() : SemiFuture<ReadBatch>
        +noMoreData(success: bool)
        -pool_: MemoryPool*
        -readPartitionFileIndex_: size_t
        -readPartitionFiles_: vector<string>
    }
    LocalPersistentShuffleReader --> ReadBatch

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Introduce ReadBatch and change ShuffleReader.next to return row batches | Add ReadBatch struct in ShuffleInterface.h; update ShuffleReader interface and next() signatures to return unique_ptr; modify LocalPersistentShuffleReader.next() to parse raw buffer into ReadBatch(rows, data) | `presto_cpp/main/operators/ShuffleInterface.h` `presto_cpp/main/operators/LocalPersistentShuffle.cpp` `presto_cpp/main/operators/LocalPersistentShuffle.h` |
| Replace UnsafeRowExchangeSource with CompactRowExchangeSource using row batches | Rename UnsafeRowExchangeSource class to CompactRowExchangeSource and update request/requestDataSizes to use ReadBatch; introduce CompactRowBatch wrapper around ReadBatch for ExchangeSource; update ExchangeSource factory registration in PrestoServer | `presto_cpp/main/operators/CompactRowExchangeSource.cpp` `presto_cpp/main/operators/CompactRowExchangeSource.h` `presto_cpp/main/PrestoServer.cpp` |
| Update tests and CMakeLists for CompactRow shuffle | Rename UnsafeRowShuffleTest to CompactRowShuffleTest and update includes/references; update operators and tests CMakeLists.txt to use CompactRowExchangeSource and new test file | `presto_cpp/main/operators/tests/CompactRowShuffleTest.cpp` `presto_cpp/main/operators/CMakeLists.txt` `presto_cpp/main/operators/tests/CMakeLists.txt` |


sourcery-ai

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/operators/LocalPersistentShuffle.cpp:185-203` </location>
<code_context>
+  size_t offset = 0;
+  const size_t totalSize = buffer->size();
+
+  while (offset + sizeof(TRowSize) <= totalSize) {
+    // Read row size (stored in big endian).
+    TRowSize rowSize = folly::Endian::big(*(TRowSize*)(data + offset));
+    offset += sizeof(TRowSize);
+
+    if (offset + rowSize > totalSize) {
+      VELOX_FAIL(
+          "Invalid row data: row size {} exceeds remaining buffer size {}",
+          rowSize,
+          totalSize - offset);
+    }
+
+    // Create a Row with empty key and the row data as value.
+    rows.emplace_back(std::string_view{data + offset, rowSize});
+    offset += rowSize;
+  }
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Consider validating rowSize for zero or excessive values.

The code should also check for rowSize == 0 and enforce a reasonable maximum to prevent processing invalid or corrupted data.

```suggestion
  size_t offset = 0;
  const size_t totalSize = buffer->size();
  constexpr size_t kMaxRowSize = 16 * 1024 * 1024; // 16MB max row size

  while (offset + sizeof(TRowSize) <= totalSize) {
    // Read row size (stored in big endian).
    TRowSize rowSize = folly::Endian::big(*(TRowSize*)(data + offset));
    offset += sizeof(TRowSize);

    if (rowSize == 0) {
      VELOX_FAIL("Invalid row data: row size is zero at offset {}", offset - sizeof(TRowSize));
    }

    if (rowSize > kMaxRowSize) {
      VELOX_FAIL(
          "Invalid row data: row size {} exceeds maximum allowed row size {}",
          rowSize,
          kMaxRowSize);
    }

    if (offset + rowSize > totalSize) {
      VELOX_FAIL(
          "Invalid row data: row size {} exceeds remaining buffer size {}",
          rowSize,
          totalSize - offset);
    }

    // Create a Row with empty key and the row data as value.
    rows.emplace_back(std::string_view{data + offset, rowSize});
    offset += rowSize;
  }
```
</issue_to_address>

### Comment 2
<location> `presto-native-execution/presto_cpp/main/operators/LocalPersistentShuffle.cpp:190` </location>
<code_context>
+
+  while (offset + sizeof(TRowSize) <= totalSize) {
+    // Read row size (stored in big endian).
+    TRowSize rowSize = folly::Endian::big(*(TRowSize*)(data + offset));
+    offset += sizeof(TRowSize);
+
</code_context>

<issue_to_address>
**issue (bug_risk):** Direct pointer casting for rowSize extraction may be unsafe on platforms with strict alignment requirements.

Using memcpy to extract the row size avoids potential undefined behavior from misaligned pointer casts.
</issue_to_address>

### Comment 3
<location> `presto-native-execution/presto_cpp/main/operators/tests/UnsafeRowShuffleTest.cpp:306` </location>
<code_context>
 }
 } // namespace

-class UnsafeRowShuffleTest : public exec::test::OperatorTestBase {
+class CompactRowShuffleTest : public exec::test::OperatorTestBase {
  public:
   std::string testShuffleInfo(
</code_context>

<issue_to_address>
**suggestion (testing):** Test class and test names updated, but no new tests for row batch semantics.

Please add or update tests to cover the new row batch behavior, including edge cases like empty batches, single-row batches, and invalid row sizes.

Suggested implementation:

```cpp
class CompactRowShuffleTest : public exec::test::OperatorTestBase {
 public:
  // ... existing members (testShuffleInfo, etc.) unchanged ...
};

// TEST_F bodies must be defined at namespace scope, after the fixture class.

// Test: Empty batch should produce no output.
TEST_F(CompactRowShuffleTest, emptyBatch) {
  auto emptyData = makeRowVector({
      makeFlatVector<int32_t>({}),
      makeFlatVector<int64_t>({}),
  });
  TestShuffleWriter::reset();
  // Assuming runShuffle is the method to execute shuffle.
  auto result = runShuffle(emptyData, /*numPartitions=*/4);
  ASSERT_EQ(result->size(), 0) << "Empty batch should produce no output";
}

// Test: Single-row batch should produce correct output.
TEST_F(CompactRowShuffleTest, singleRowBatch) {
  auto singleData = makeRowVector({
      makeFlatVector<int32_t>({42}),
      makeFlatVector<int64_t>({420}),
  });
  TestShuffleWriter::reset();
  auto result = runShuffle(singleData, /*numPartitions=*/4);
  ASSERT_EQ(result->size(), 1) << "Single-row batch should produce one output row";
  ASSERT_EQ(result->childAt(0)->as<FlatVector<int32_t>>()->getValue(0), 42);
  ASSERT_EQ(result->childAt(1)->as<FlatVector<int64_t>>()->getValue(0), 420);
}

// Test: Invalid row size (zero columns).
TEST_F(CompactRowShuffleTest, invalidRowSizeZeroColumns) {
  auto invalidData = makeRowVector({});
  TestShuffleWriter::reset();
  EXPECT_THROW(runShuffle(invalidData, /*numPartitions=*/4), VeloxException);
}

// Test: Invalid row size (exceeding max allowed).
TEST_F(CompactRowShuffleTest, invalidRowSizeTooLarge) {
  // Assuming max allowed columns is 100.
  std::vector<VectorPtr> columns;
  for (int i = 0; i < 101; ++i) {
    columns.push_back(makeFlatVector<int32_t>({i}));
  }
  auto invalidData = makeRowVector(columns);
  TestShuffleWriter::reset();
  EXPECT_THROW(runShuffle(invalidData, /*numPartitions=*/4), VeloxException);
}

```

- You may need to adjust the `runShuffle` function name and signature to match your actual shuffle execution method.
- If your test framework uses a different exception type than `VeloxException`, replace it accordingly.
- If your row vector creation utility differs, adapt the test data creation to your codebase.
- Ensure the test class is registered with your test runner (e.g., GoogleTest).
</issue_to_address>

sourcery-ai · 2025-10-15T19:05:08Z

presto-native-execution/presto_cpp/main/operators/LocalPersistentShuffle.cpp

+  size_t offset = 0;
+  const size_t totalSize = buffer->size();
+
+  while (offset + sizeof(TRowSize) <= totalSize) {
+    // Read row size (stored in big endian).
+    TRowSize rowSize = folly::Endian::big(*(TRowSize*)(data + offset));
+    offset += sizeof(TRowSize);
+
+    if (offset + rowSize > totalSize) {
+      VELOX_FAIL(
+          "Invalid row data: row size {} exceeds remaining buffer size {}",
+          rowSize,
+          totalSize - offset);
+    }
+
+    // Create a Row with empty key and the row data as value.
+    rows.emplace_back(std::string_view{data + offset, rowSize});
+    offset += rowSize;
+  }


suggestion (bug_risk): Consider validating rowSize for zero or excessive values.

The code should also check for rowSize == 0 and enforce a reasonable maximum to prevent processing invalid or corrupted data.

Suggested change

+  size_t offset = 0;
+  const size_t totalSize = buffer->size();
+  constexpr size_t kMaxRowSize = 16 * 1024 * 1024; // 16MB max row size
+
+  while (offset + sizeof(TRowSize) <= totalSize) {
+    // Read row size (stored in big endian).
+    TRowSize rowSize = folly::Endian::big(*(TRowSize*)(data + offset));
+    offset += sizeof(TRowSize);
+
+    if (rowSize == 0) {
+      VELOX_FAIL("Invalid row data: row size is zero at offset {}", offset - sizeof(TRowSize));
+    }
+
+    if (rowSize > kMaxRowSize) {
+      VELOX_FAIL(
+          "Invalid row data: row size {} exceeds maximum allowed row size {}",
+          rowSize,
+          kMaxRowSize);
+    }
+
+    if (offset + rowSize > totalSize) {
+      VELOX_FAIL(
+          "Invalid row data: row size {} exceeds remaining buffer size {}",
+          rowSize,
+          totalSize - offset);
+    }
+
+    // Create a Row with empty key and the row data as value.
+    rows.emplace_back(std::string_view{data + offset, rowSize});
+    offset += rowSize;
+  }

sourcery-ai · 2025-10-15T19:05:08Z

presto-native-execution/presto_cpp/main/operators/LocalPersistentShuffle.cpp

+
+  while (offset + sizeof(TRowSize) <= totalSize) {
+    // Read row size (stored in big endian).
+    TRowSize rowSize = folly::Endian::big(*(TRowSize*)(data + offset));


issue (bug_risk): Direct pointer casting for rowSize extraction may be unsafe on platforms with strict alignment requirements.

Using memcpy to extract the row size avoids potential undefined behavior from misaligned pointer casts.
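
One way the memcpy-based read could look is sketched below. It is a standalone illustration, not the PR's implementation: it assumes `TRowSize` is a 4-byte unsigned integer, throws `std::runtime_error` where the PR uses `VELOX_FAIL`, and decodes big-endian bytes by hand where the PR's code would presumably keep `folly::Endian::big` on the copied value. `loadBigEndianRowSize` and `parseRows` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string_view>
#include <vector>

// Assumption: the row-size prefix is a 4-byte unsigned integer.
using TRowSize = std::uint32_t;

// Reads a big-endian TRowSize from a possibly unaligned address. Copying
// into a local byte array avoids dereferencing a misaligned pointer.
inline TRowSize loadBigEndianRowSize(const char* p) {
  unsigned char bytes[sizeof(TRowSize)];
  std::memcpy(bytes, p, sizeof(bytes));
  return (TRowSize{bytes[0]} << 24) | (TRowSize{bytes[1]} << 16) |
      (TRowSize{bytes[2]} << 8) | TRowSize{bytes[3]};
}

// Splits a buffer of [size][payload] records into non-owning row views.
std::vector<std::string_view> parseRows(std::string_view buffer) {
  std::vector<std::string_view> rows;
  std::size_t offset = 0;
  while (buffer.size() - offset >= sizeof(TRowSize)) {
    const TRowSize rowSize = loadBigEndianRowSize(buffer.data() + offset);
    offset += sizeof(TRowSize);
    // Compare against the remaining bytes so the check cannot overflow.
    if (rowSize > buffer.size() - offset) {
      throw std::runtime_error("row size exceeds remaining buffer size");
    }
    rows.emplace_back(buffer.data() + offset, rowSize);
    offset += rowSize;
  }
  return rows;
}
```

This sketch accepts zero-length rows and, like the reviewed loop, ignores trailing bytes shorter than a size prefix; whether either should be rejected is the question raised in the first review comment and is left open here.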

xiaoxmeng requested review from a team as code owners October 15, 2025 19:03

prestodb-ci added the from:Meta PR from Meta label Oct 15, 2025

sourcery-ai bot reviewed Oct 15, 2025

xiaoxmeng force-pushed the export-D84737440 branch from c34745e to 0c74da0 Compare October 15, 2025 20:37

xiaoxmeng force-pushed the export-D84737440 branch from 0c74da0 to 95e4451 Compare October 15, 2025 22:20

tanjialiang previously approved these changes Oct 15, 2025

xiaoxmeng force-pushed the export-D84737440 branch from 95e4451 to 120bd7a Compare October 16, 2025 02:35

xiaoxmeng dismissed tanjialiang’s stale review via b2b0eed October 16, 2025 04:13

xiaoxmeng force-pushed the export-D84737440 branch from 120bd7a to b2b0eed Compare October 16, 2025 04:13

xiaoxmeng force-pushed the export-D84737440 branch from b2b0eed to 1825011 Compare October 16, 2025 04:13

xiaoxmeng force-pushed the export-D84737440 branch from 1825011 to 7c2b7b3 Compare October 16, 2025 04:19

xiaoxmeng force-pushed the export-D84737440 branch from 7c2b7b3 to e4979df Compare October 16, 2025 05:54

xiaoxmeng changed the title ~~[sv-cosco]opt: Change shuffle read API to return a row batch instead if io buffer~~ perf: Change shuffle read API to return a row batch instead if io buffer Oct 16, 2025

[sv-cosco]opt: Change shuffle read API to return a row batch instead …

6b31d83

…if io buffer (prestodb#26322) Summary: Extend shuffle read API to return a row batch instead of a iobuf so that we can avoid redundant parsing Differential Revision: D84737440

xiaoxmeng force-pushed the export-D84737440 branch from e4979df to 6b31d83 Compare October 16, 2025 13:50
