
⚡ Bolt: [performance improvement] Optimize D1 SQL generation #215

Open
bashandbone wants to merge 1 commit into main from bolt/optimize-d1-sql-generation-13575450059674256912

Conversation

@bashandbone

@bashandbone bashandbone commented May 8, 2026

💡 What: Refactored D1 SQL query generation in crates/flow/src/targets/d1.rs to utilize String::with_capacity and the write! macro instead of generating intermediate Vec collections and relying on .join().

🎯 Why: Building SQL with format! and mapping over slices into .collect::<Vec<_>>().join(", ") allocates an intermediate String for each column plus the Vec that holds them, and join() then copies everything once more. That overhead grows with the number of columns or setup changes being processed.

📊 Impact: Reduces heap allocations and memory churn when computing SQL queries for table and index creation in the D1 target, leading to slightly faster initialization times for targets.

🔬 Measurement: Run cargo test -p thread-flow --test d1_target_tests and cargo test -p thread-flow --test d1_minimal_tests. Both confirm exact parity of the generated SQL for D1 exports.
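
To make the parity claim concrete, here is a minimal sketch that compares the join-based style with the incremental style on a small column list. The `Column` struct, both function names, and the sample data are hypothetical stand-ins and are not taken from crates/flow/src/targets/d1.rs.

```rust
use std::fmt::Write;

/// Hypothetical stand-in for a column definition (name plus SQL type).
struct Column {
    name: String,
    sql_type: String,
}

/// Join-based style: one String per column, a Vec to hold them, then a joined copy.
fn column_list_joined(cols: &[Column]) -> String {
    cols.iter()
        .map(|c| format!("{} {}", c.name, c.sql_type))
        .collect::<Vec<_>>()
        .join(", ")
}

/// Incremental style: every column is written into a single pre-sized buffer.
fn column_list_incremental(cols: &[Column]) -> String {
    // 24 bytes per column is an arbitrary guess for this sketch.
    let mut out = String::with_capacity(cols.len() * 24);
    for (i, c) in cols.iter().enumerate() {
        if i > 0 {
            out.push_str(", ");
        }
        // String's fmt::Write impl does not fail, and the arguments are plain strings.
        write!(out, "{} {}", c.name, c.sql_type).unwrap();
    }
    out
}

/// Sample data used only for this illustration.
fn sample_columns() -> Vec<Column> {
    vec![
        Column { name: "id".to_string(), sql_type: "INTEGER".to_string() },
        Column { name: "body".to_string(), sql_type: "TEXT".to_string() },
    ]
}

fn main() {
    let cols = sample_columns();
    // Both styles must produce byte-identical SQL fragments.
    assert_eq!(column_list_joined(&cols), column_list_incremental(&cols));
    println!("{}", column_list_incremental(&cols));
}
```

A check of this shape only shows that the output is unchanged; allocation or timing differences would need a separate benchmark.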


PR created automatically by Jules for task 13575450059674256912 started by @bashandbone

Summary by Sourcery

Optimize SQL string generation for D1 table and index creation to reduce allocations and improve performance.

Enhancements:

  • Refactor D1 table creation SQL builder to construct the final string incrementally without intermediate collections.
  • Refactor D1 index creation SQL generation to avoid Vec-based joins and minimize string allocations.

Refactored D1 SQL query generation in crates/flow/src/targets/d1.rs to utilize `String::with_capacity` and the `write!` macro instead of generating intermediate Vec collections and relying on `.join()`. This eliminates unnecessary heap allocations and string copies.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 8, 2026 18:33
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@sourcery-ai

sourcery-ai Bot commented May 8, 2026

Reviewer's Guide

Refactors D1 SQL generation for table and index creation to build SQL strings incrementally using pre-allocated String buffers and write!/push APIs instead of intermediate Vec collections and join, reducing allocations while preserving existing SQL semantics.

Flow diagram for optimized create_table_sql generation

flowchart TD
  A[Start create_table_sql] --> B[Init sql with capacity 256]
  B --> C[Append CREATE TABLE IF NOT EXISTS and table_name and opening parenthesis]
  C --> D[Iterate key_columns and value_columns]
  D --> E{First column?}
  E -- Yes --> F[Set first to false]
  E -- No --> G[Append comma and space]
  G --> H[Write column name and sql_type]
  F --> H[Write column name and sql_type]
  H --> I{Column nullable?}
  I -- No --> J[Append NOT NULL]
  I -- Yes --> K[Do nothing]
  J --> L{More columns?}
  K --> L{More columns?}
  L -- Yes --> D
  L -- No --> M{Any key_columns?}
  M -- No --> T[Append closing parenthesis]
  M -- Yes --> N[Append comma and PRIMARY KEY and opening parenthesis]
  N --> O[Iterate key_columns]
  O --> P{First primary key column?}
  P -- Yes --> Q[Set first_pk to false]
  P -- No --> R[Append comma and space]
  R --> S[Append key column name]
  Q --> S[Append key column name]
  S --> U{More key_columns?}
  U -- Yes --> O
  U -- No --> V[Append closing parenthesis for PRIMARY KEY]
  V --> T[Append closing parenthesis]
  T --> W[Return sql]
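
The flow above can be sketched in Rust roughly as follows. This is an illustration under assumptions, not the code from the PR: `ColumnSchema`, the free-function signature, the `col` helper, and the exact SQL literals (for example `" NOT NULL"` and `", PRIMARY KEY ("`) are hypothetical; only the field names `name`, `sql_type`, and `nullable` appear in the review snippets.

```rust
use std::fmt::Write;

/// Hypothetical column schema; field names follow the review snippets.
struct ColumnSchema {
    name: String,
    sql_type: String,
    nullable: bool,
}

/// Hypothetical helper for building sample columns.
fn col(name: &str, sql_type: &str, nullable: bool) -> ColumnSchema {
    ColumnSchema { name: name.to_string(), sql_type: sql_type.to_string(), nullable }
}

/// Builds a CREATE TABLE statement in one buffer, following the flowchart.
fn create_table_sql(
    table_name: &str,
    key_columns: &[ColumnSchema],
    value_columns: &[ColumnSchema],
) -> String {
    let mut sql = String::with_capacity(256);
    write!(sql, "CREATE TABLE IF NOT EXISTS {} (", table_name).unwrap();

    // Column definitions: key columns first, then value columns.
    let mut first = true;
    for c in key_columns.iter().chain(value_columns.iter()) {
        if !first {
            sql.push_str(", ");
        }
        first = false;
        write!(sql, "{} {}", c.name, c.sql_type).unwrap();
        if !c.nullable {
            sql.push_str(" NOT NULL");
        }
    }

    // PRIMARY KEY clause, emitted only when key columns exist.
    if !key_columns.is_empty() {
        sql.push_str(", PRIMARY KEY (");
        let mut first_pk = true;
        for c in key_columns {
            if !first_pk {
                sql.push_str(", ");
            }
            first_pk = false;
            sql.push_str(&c.name);
        }
        sql.push(')');
    }

    sql.push(')');
    sql
}

fn main() {
    let keys = vec![col("id", "INTEGER", false)];
    let values = vec![col("body", "TEXT", true)];
    println!("{}", create_table_sql("docs", &keys, &values));
}
```

The `first` and `first_pk` flags mirror the "First column?" and "First primary key column?" decisions in the diagram; an `enumerate()`-based index check would work equally well.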

File-Level Changes

Change: Optimize create_table_sql to build SQL incrementally without intermediate column definition vectors or joins.
Details:
  • Introduce local import of std::fmt::Write to use the write! macro on String buffers.
  • Pre-allocate the CREATE TABLE SQL string with String::with_capacity and write the fixed prefix once.
  • Iterate over key and value columns, appending column definitions directly to the SQL string with manual comma management instead of pushing formatted column strings into a Vec.
  • Append NOT NULL directly for non-nullable columns without constructing separate strings.
  • Generate the PRIMARY KEY clause by iterating key_columns and appending column names with manual comma handling, then closing the CREATE TABLE statement with a final parenthesis.
Files: crates/flow/src/targets/d1.rs

Change: Optimize create_indexes_sql to construct each index creation SQL string with a pre-allocated buffer and manual column list assembly.
Details:
  • Import std::fmt::Write in create_indexes_sql to support write! on String.
  • Replace format! with a pre-allocated String::with_capacity and a write! call for the fixed portion of the CREATE INDEX statement, including unique flag, index name, and table name.
  • Append index column names by iterating idx.columns and manually inserting commas, instead of using join on the column slice.
  • Return the collected Vec of incrementally built SQL index statements, preserving previous behavior but with fewer allocations.
Files: crates/flow/src/targets/d1.rs
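
The index-side change described above might look like the following sketch. `IndexSchema` and the free-function signature are assumptions made for illustration; the statement prefix and the 128-byte capacity are the ones quoted in the review snippets.

```rust
use std::fmt::Write;

/// Hypothetical index schema; field names follow the review snippets.
struct IndexSchema {
    name: String,
    columns: Vec<String>,
    unique: bool,
}

/// Builds one CREATE INDEX statement per index, each in its own pre-sized buffer.
fn create_indexes_sql(table_name: &str, indexes: &[IndexSchema]) -> Vec<String> {
    indexes
        .iter()
        .map(|idx| {
            let mut sql = String::with_capacity(128);
            let unique = if idx.unique { "UNIQUE " } else { "" };
            write!(
                sql,
                "CREATE {}INDEX IF NOT EXISTS {} ON {} (",
                unique, idx.name, table_name
            )
            .unwrap();
            // Append the column list by hand instead of calling join().
            for (i, column) in idx.columns.iter().enumerate() {
                if i > 0 {
                    sql.push_str(", ");
                }
                sql.push_str(column);
            }
            sql.push(')');
            sql
        })
        .collect()
}

fn main() {
    let indexes = vec![IndexSchema {
        name: "idx_docs_body".to_string(),
        columns: vec!["body".to_string(), "lang".to_string()],
        unique: true,
    }];
    for statement in create_indexes_sql("docs", &indexes) {
        println!("{}", statement);
    }
}
```

The outer Vec of statements is still allocated, as in the description above; what goes away is the temporary joined column string per index.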



@sourcery-ai sourcery-ai Bot left a comment


Hey - I've left some high-level feedback:

  • The write! calls currently discard their Result via let _ = ...; since writing into a String is infallible, consider using .unwrap() or a small helper to make this explicit and avoid silently ignoring potential errors (and clippy warnings).
  • The fixed with_capacity(256) / with_capacity(128) values are somewhat arbitrary; you might want to derive these from the number of columns/index columns (e.g., base + per_column * len) to avoid under- or over-allocating when tables vary significantly in size.
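
Both suggestions can be sketched together as follows. The constants `BASE_CAPACITY` and `PER_COLUMN_CAPACITY`, the helper `estimated_capacity`, and the function `column_list_sql` are hypothetical names with guessed values; they illustrate the `base + per_column * len` idea and an explicit `expect` in place of `let _ =`, and are not code from the PR.

```rust
use std::fmt::Write;

/// Assumed sizing constants for this sketch; real values would need tuning.
const BASE_CAPACITY: usize = 64;
const PER_COLUMN_CAPACITY: usize = 32;

/// Capacity estimate of the form base + per_column * len.
fn estimated_capacity(column_count: usize) -> usize {
    BASE_CAPACITY + PER_COLUMN_CAPACITY * column_count
}

/// Builds a "name type, name type" list in a buffer sized from the column count.
fn column_list_sql(columns: &[(&str, &str)]) -> String {
    let mut sql = String::with_capacity(estimated_capacity(columns.len()));
    for (i, (name, sql_type)) in columns.iter().enumerate() {
        if i > 0 {
            sql.push_str(", ");
        }
        // String's fmt::Write impl does not return an error, and these arguments
        // are plain &str values, so expect() documents the invariant explicitly.
        write!(sql, "{} {}", name, sql_type).expect("writing to a String cannot fail");
    }
    sql
}

fn main() {
    let columns = [("id", "INTEGER"), ("body", "TEXT")];
    let sql = column_list_sql(&columns);
    println!("{} (capacity {})", sql, sql.capacity());
}
```

One caveat on the first suggestion: `write!` can still surface an error if a `Display` implementation among the arguments returns one, so `expect` (or `unwrap`) is only a safe choice when the formatted values are known not to fail.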


Copilot AI left a comment


Pull request overview

Refactors D1 SQL generation in the Flow D1 target to reduce intermediate allocations while building CREATE TABLE and CREATE INDEX statements.

Changes:

  • Rewrote D1SetupState::create_table_sql() to build SQL incrementally using String::with_capacity, write!, and push_str instead of Vec + .join().
  • Rewrote D1SetupState::create_indexes_sql() similarly to avoid allocating joined column lists.


Comment on lines +541 to 553
+ // ⚡ Bolt: Use String::with_capacity and write! to avoid intermediate allocations
+ let mut sql = String::with_capacity(256);
+ let _ = write!(sql, "CREATE TABLE IF NOT EXISTS {} (", self.table_id.table_name);
+
+ let mut first = true;
  for col in self.key_columns.iter().chain(self.value_columns.iter()) {
-     let mut col_def = format!("{} {}", col.name, col.sql_type);
+     if !first {
+         sql.push_str(", ");
+     }
+     first = false;
+
+     let _ = write!(sql, "{} {}", col.name, col.sql_type);
      if !col.nullable {
Comment on lines +581 to +588
+ // ⚡ Bolt: Use String::with_capacity and write! for index SQL generation
+ let mut sql = String::with_capacity(128);
  let unique = if idx.unique { "UNIQUE " } else { "" };
- format!(
-     "CREATE {}INDEX IF NOT EXISTS {} ON {} ({})",
-     unique,
-     idx.name,
-     self.table_id.table_name,
-     idx.columns.join(", ")
- )
+ let _ = write!(
+     sql,
+     "CREATE {}INDEX IF NOT EXISTS {} ON {} (",
+     unique, idx.name, self.table_id.table_name
+ );