@ahkcs ahkcs commented Oct 7, 2025

Description

Implement the replace command in PPL to replace text patterns in specified fields. This PR includes the grammar implementation and basic replacement functionality, and reuses the existing replace core logic. Regex support and other missing features will be added in a separate PR.

Original PR: #4248


Syntax

<source> | replace '<pattern>' WITH '<replacement>' IN field_list
  • pattern: The text pattern to search for (case-sensitive)
  • replacement: The text to replace matches with
  • field_list: Comma-separated list of fields to perform replacement in

The replace command modifies the specified fields in-place with the replaced text.

Semantics

Expected Behavior

  • Action: Modifies specified fields in-place with replaced text values
  • Scope: Operates only on the specified fields
  • Data Modification: Specified fields are updated with replacement text
  • Case Sensitivity: Text literal matching is case-sensitive
  • Pattern Type: Only literal string patterns are supported (wildcard and regex support will be added later)

Implementation Approach

  • Performs literal string replacement in the specified fields, modifying them in-place
  • Other fields remain unchanged
  • Validates the existence of fields and correctness of pattern / replacement parameters

Example Queries

-- Replace in a single field  
source=logs | replace 'error' WITH 'ERROR' IN message  

-- Replace in multiple fields  
source=logs | replace 'USA' WITH 'United States' IN country, state  

-- Combine with other commands  
source=logs  
  | where level = 'error'  
  | replace 'error' WITH 'ERROR' IN message  
  | sort @timestamp  

-- Replace and select specific fields  
source=logs | replace 'error' WITH 'ERROR' IN message | fields message, level  

Output Schema

For each field specified in the IN clause, the field is modified in-place:

  • Specified fields: modified with replacement text
  • Other fields: remain unchanged

Example
Input:

message = "error occurred", level = "error"

After replace 'error' WITH 'ERROR' IN message, level:

message = "ERROR occurred"  
level = "ERROR"
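
The semantics described above can be illustrated with a minimal Java sketch. This is not the code from this PR: the class name `ReplaceSketch`, the method `replaceInFields`, the use of a plain `Map` as a row, and the extra `host` field are all assumptions made for illustration. It only shows literal, case-sensitive replacement limited to the listed fields, with a check that each field exists.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class ReplaceSketch {

    /**
     * Hypothetical helper: returns a copy of the row in which every literal
     * occurrence of pattern is replaced in the named fields only.
     */
    public static Map<String, String> replaceInFields(
            Map<String, String> row, String pattern, String replacement, List<String> fields) {
        Objects.requireNonNull(pattern, "pattern");
        Objects.requireNonNull(replacement, "replacement");

        Map<String, String> result = new LinkedHashMap<>(row);
        for (String field : fields) {
            // Validate that the field exists before touching it.
            if (!row.containsKey(field)) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            String value = row.get(field);
            if (value != null) {
                // String.replace(CharSequence, CharSequence) matches literally
                // and case-sensitively; no regex is involved.
                result.put(field, value.replace(pattern, replacement));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("message", "error occurred");
        row.put("level", "error");
        row.put("host", "error-prone-host"); // hypothetical field, not in the IN list

        // Prints {message=ERROR occurred, level=ERROR, host=error-prone-host}
        System.out.println(replaceInFields(row, "error", "ERROR", List.of("message", "level")));
    }
}
```

The `host` value is left untouched because it is not named in the field list, which mirrors the "other fields remain unchanged" rule.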

Resolves

#3975


Check List

  • New functionality includes tests
  • New functionality has documentation
  • Javadoc added for new components
  • User manual / user documentation updated
  • PPL command checklist confirmed
  • Companion API changes PR created (if applicable)
  • Commits are signed per DCO (--signoff or -s)
  • Public documentation / issue or PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@ToString
@EqualsAndHashCode(callSuper = false)
public class Replace extends UnresolvedPlan {
    private final UnresolvedExpression pattern;
Collaborator

suggestion: We should annotate these with @NotNull.

Contributor Author

Changed to @Nullable; I received a similar comment before: #3878 (comment)

Collaborator

Fwiw, I also prefer using optional in fields, despite the warning. I'm not sure why that warning exists. 🤷

But why mark it nullable if we validate that they can't be null?

Contributor Author

We marked child as @Nullable, and child isn't covered by the validate logic.

UnresolvedExpression pattern = internalVisitExpression(ctx.pattern);
UnresolvedExpression replacement = internalVisitExpression(ctx.replacement);

List<Field> fieldList =
Collaborator

@Swiddis Swiddis Oct 7, 2025

suggestion: Should we have a global method for turning this into a LinkedHashSet?

This is currently checked in validate(), but I don't think this is the only command to rely on a fieldList. If we make Replace take a Set instead of a List, we simplify the validation logic there, and we can push the responsibility of deduplicating fieldLists to shared parsing code.

Contributor Author

Updated to accept Set instead of List, removing duplicate validation logic
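
The deduplication discussed in this thread could look roughly like the sketch below. It is an assumption-laden illustration, not the shared parsing code from the repository: the class `FieldListDedup` and the method `toOrderedSet` are hypothetical names, and plain `String` field names stand in for the `Field` type. A `LinkedHashSet` drops duplicates while keeping the order in which fields were first written in the query.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class FieldListDedup {

    /**
     * Hypothetical helper: collapses a parsed field list into a set that
     * preserves first-seen order and silently drops repeated names.
     */
    public static Set<String> toOrderedSet(List<String> fieldNames) {
        return new LinkedHashSet<>(fieldNames);
    }

    public static void main(String[] args) {
        // "country" appears twice, as in: ... IN country, state, country
        List<String> parsed = List.of("country", "state", "country");

        // Prints [country, state]
        System.out.println(toOrderedSet(parsed));
    }
}
```

With the set built at parse time, a command that receives it no longer needs its own duplicate check.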

verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test(expected = Exception.class)
Collaborator

@Swiddis Swiddis Oct 7, 2025

nit: Catch a more specific Exception

Contributor Author

updated to be more specific

@Swiddis Swiddis added the enhancement New feature or request label Oct 7, 2025
RyanL1997 previously approved these changes Oct 9, 2025
ahkcs added 12 commits October 15, 2025 13:00
Signed-off-by: Kai Huang <[email protected]>
Comment on lines 84 to 87
@Deprecated
public UnresolvedExpression getReplacement() {
    if (replacePairs.isEmpty()) {
        throw new IllegalStateException("No replacement pairs available");
    }
    return replacePairs.get(0).getReplacement();
}
Collaborator

Let's delete instead of deprecate.

Contributor Author

Updated

Comment on lines 70 to 73
@Deprecated
public UnresolvedExpression getPattern() {
    if (replacePairs.isEmpty()) {
        throw new IllegalStateException("No replacement pairs available");
    }
    return replacePairs.get(0).getPattern();
}
Collaborator

Let's delete instead of deprecate.

Contributor Author

Updated

Comment on lines 51 to 50
public Replace(
UnresolvedExpression pattern, UnresolvedExpression replacement, Set<Field> fieldList) {
Collaborator

Is this used?

Contributor Author

Updated to remove unused parameter

dai-chen previously approved these changes Oct 15, 2025
Collaborator

@dai-chen dai-chen left a comment

Thanks for the changes!

Signed-off-by: Kai Huang <[email protected]>
Comment on lines +15 to +18
@Getter
@AllArgsConstructor
@EqualsAndHashCode
@ToString
Collaborator

optional: consider using @Value

@dai-chen dai-chen added PPL Piped processing language backport 2.19-dev labels Oct 16, 2025
@dai-chen dai-chen merged commit 5677765 into opensearch-project:main Oct 16, 2025
37 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4451-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5677765e6d2f0203fc99014e5ebc4aa27424b57d
# Push it to GitHub
git push --set-upstream origin backport/backport-4451-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4451-to-2.19-dev.

ahkcs added a commit to ahkcs/sql that referenced this pull request Oct 20, 2025
* Add replace command with Calcite

Signed-off-by: Kai Huang <[email protected]>

---------

Signed-off-by: Kai Huang <[email protected]>
Co-authored-by: Manasvini B S <[email protected]>
(cherry picked from commit 5677765)
Signed-off-by: Kai Huang <[email protected]>
penghuo pushed a commit that referenced this pull request Oct 22, 2025
* Add replace command with Calcite (#4451)

* Add replace command with Calcite

Signed-off-by: Kai Huang <[email protected]>

---------

Signed-off-by: Kai Huang <[email protected]>
Co-authored-by: Manasvini B S <[email protected]>
(cherry picked from commit 5677765)
Signed-off-by: Kai Huang <[email protected]>

* fix compile

Signed-off-by: Kai Huang <[email protected]>

* backporting backslash handling from main and fix tests

Signed-off-by: Kai Huang <[email protected]>

* Fix tests

Signed-off-by: Kai Huang <[email protected]>

* fix tests

Signed-off-by: Kai Huang <[email protected]>

* compatibility across Java versions

Signed-off-by: Kai Huang <[email protected]>

---------

Signed-off-by: Kai Huang <[email protected]>
Co-authored-by: Manasvini B S <[email protected]>
@LantaoJin LantaoJin added the backport-manually Filed a PR to backport manually. label Oct 23, 2025
Labels

  • backport 2.19-dev
  • backport-failed
  • backport-manually: Filed a PR to backport manually.
  • enhancement: New feature or request
  • PPL: Piped processing language

7 participants