ebyhr
Member

@ebyhr ebyhr commented Oct 4, 2025

Description

This is a proposal to add support for chained function calls, similar to BigQuery and DuckDB.
It improves the readability of deeply nested expressions. It also eliminates the need to move the cursor backward when wrapping an expression in another function call.

SELECT
  REPLACE(
    REPLACE(
      REPLACE(
        REPLACE(
          REPLACE('one two three four five', 'one', '1'),
          'two', '2'),
        'three', '3'),
      'four', '4'),
    'five', '5');

SELECT
  ('one two three four five')
  .REPLACE('one', '1')
  .REPLACE('two', '2')
  .REPLACE('three', '3')
  .REPLACE('four', '4')
  .REPLACE('five', '5');

Release notes

## General
* Add support for chained function calls. ({issue}`26841`)

@ebyhr ebyhr requested a review from martint October 4, 2025 00:13
@ebyhr ebyhr added the syntax-needs-review and needs-docs labels Oct 4, 2025
@cla-bot cla-bot bot added the cla-signed label Oct 4, 2025
@ebyhr ebyhr requested review from Praveen2112 and kasiafi October 4, 2025 00:31
@ebyhr ebyhr force-pushed the ebi/chained-function-call branch from 76a957d to 9ebb391 Compare October 4, 2025 00:33
@ebyhr ebyhr requested a review from Copilot October 4, 2025 23:55

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for chained function calls to the Trino SQL parser, allowing expressions like ('hello').upper().concat(' world!') instead of nested function calls. This improves readability for deeply nested function expressions, similar to the BigQuery and DuckDB implementations.

  • Adds grammar support for dot notation chained function calls
  • Implements AST building logic to transform chained calls into nested FunctionCall objects
  • Updates parser error messages to reflect new grammar tokens
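The desugaring described in the overview can be illustrated with a minimal sketch. It is not the PR's AstBuilder code: the class name, the Call helper, and the string-based representation are all hypothetical, chosen only to show how each link in a chain becomes the first argument of the next enclosing call.

```java
import java.util.List;

// Hypothetical sketch, not Trino's AstBuilder: expressions are plain strings here.
final class ChainDesugarSketch {
    /** One link in a chain, e.g. .replace('one', '1'). */
    static final class Call {
        final String name;
        final List<String> arguments;

        Call(String name, String... arguments) {
            this.name = name;
            this.arguments = List.of(arguments);
        }
    }

    /** Rewrites receiver.f(a).g(b) into g(f(receiver, a), b). */
    static String desugar(String receiver, List<Call> chain) {
        String expression = receiver;
        for (Call call : chain) {
            // The expression built so far becomes the first argument of this call.
            StringBuilder nested = new StringBuilder(call.name).append('(').append(expression);
            for (String argument : call.arguments) {
                nested.append(", ").append(argument);
            }
            expression = nested.append(')').toString();
        }
        return expression;
    }

    public static void main(String[] args) {
        // Prints: concat(upper('hello'), ' world!')
        System.out.println(desugar("'hello'", List.of(
                new Call("upper"),
                new Call("concat", "' world!'"))));
    }
}
```

The loop walks the chain left to right, so the innermost nested call corresponds to the first method in the chain, which matches the two equivalent REPLACE queries in the description.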

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| SqlBase.g4 | Adds grammar rules for chained function calls and refactors function call parsing |
| AstBuilder.java | Implements visitChainedFunctionCalls method to build nested FunctionCall AST nodes |
| TestSqlParser.java | Adds unit tests for chained function call parsing |
| TestChainedFunctionCalls.java | Adds comprehensive integration tests for chained function call execution |
| TestSqlParserErrorHandling.java | Updates expected error message to include new grammar tokens |

@kasiafi
Member

kasiafi commented Oct 6, 2025

For the expression foo.bar(baz) how do we know if it's:

  • function foo.bar applied to argument baz
  • function bar applied to arguments foo, baz?

For an expression like a.b.c.d(), how do we know where the argument ends and the function name starts?

@findepi
Member

findepi commented Oct 6, 2025

This is a proposal to add support for chained function calls, similar to BigQuery and DuckDB.

Are these the only ones which support this syntax sugar?

It improves the readability of deeply nested expressions.

So do "lateral column aliases".
Does the proposed syntax solve a problem that is hard to solve without it?

Coming from object-oriented languages, I very much like the idea of SQL values having methods, but I am not convinced that val.f(...) should always mean f(val, ...).
The replace case looks nice (but is also likely a logic error: one usually wants to do a multi-replace in one shot, so that the replacement values from the first pattern are not considered when looking for the second pattern).
Another case where one could want to chain calls is date_add and/or other date/time functions, but there the value being operated on doesn't come first.
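The concern about chained replacements can be made concrete with a small sketch, written in Java rather than SQL so that it is runnable on its own. The class and method names are hypothetical, and the one-pass variant is just one possible way to do a multi-replace; it assumes a non-empty map whose keys are not prefixes of one another.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration of chained vs. one-shot replacement.
final class MultiReplaceSketch {
    /** Applies each replacement in turn, like a chain of replace calls. */
    static String replaceSequentially(String input, Map<String, String> replacements) {
        String result = input;
        for (Map.Entry<String, String> entry : replacements.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    /** Scans the input once, so replacement text is never matched again. */
    static String replaceInOnePass(String input, Map<String, String> replacements) {
        // Assumes a non-empty map; keys are quoted so they match literally.
        String alternation = replacements.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(alternation)
                .matcher(input)
                .replaceAll(match -> Matcher.quoteReplacement(replacements.get(match.group())));
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("one", "two");
        replacements.put("two", "2");
        System.out.println(replaceSequentially("one two", replacements)); // prints "2 2"
        System.out.println(replaceInOnePass("one two", replacements));    // prints "two 2"
    }
}
```

In the sequential version, the "two" produced by the first replacement is picked up by the second one, which is the kind of surprise the comment warns about.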

@ebyhr
Member Author

ebyhr commented Oct 7, 2025

For the expression foo.bar(baz)

A function requires (), so in my opinion the expression is always the foo.bar function with a baz parameter.
"function bar applied to arguments foo, baz" would be written as foo().bar(baz). Am I missing something?

Are these the only ones which support this syntax sugar?

Yes, as far as I could confirm.

So do "lateral column aliases".
Does the proposed syntax solve a problem that is hard to solve without it?

I don't think LCA and chained function calls actually solve any "problems". Those syntaxes exist to improve the user experience.
For non-first values, we could consider introducing a special symbol or keyword if truly needed.

@kasiafi
Member

kasiafi commented Oct 7, 2025

For the expression foo.bar(baz)

Function names are QualifiedNames. Column names are also QualifiedNames. With the proposed chained calling syntax, we can chain the column name (if the function operates on a column) with the function name, and create a multi-part qualified name, which is ambiguous to the parser.

foo.bar(baz) can be function bar(baz) applied to column foo. But it can also be function foo.bar(baz) in the traditional calling convention. In the latter case, foo is the schema name.

In a.b.c.d() there are even more possibilities:

  • Column a, function b.c.d()
  • Column a.b, function c.d()
  • Column a.b.c, function d()

In fact, field names can contain even more than 3 parts.
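The ambiguity can be enumerated mechanically. The sketch below is a hypothetical helper, not parser code from the PR: it lists every way a dotted name followed by () could be read if a bare column reference were allowed at the start of a chain. Arguments are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that lists the candidate readings of a dotted call.
final class QualifiedNameSplitsSketch {
    static List<String> readings(String dottedName) {
        String[] parts = dottedName.split("\\.");
        List<String> result = new ArrayList<>();
        // Traditional calling convention: the whole name is the function name.
        result.add("function " + dottedName + "()");
        // Chained convention: any non-empty prefix could be the column reference.
        for (int split = 1; split < parts.length; split++) {
            String column = String.join(".", Arrays.copyOfRange(parts, 0, split));
            String function = String.join(".", Arrays.copyOfRange(parts, split, parts.length));
            result.add("column " + column + ", function " + function + "()");
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints four readings for a.b.c.d, one per line.
        readings("a.b.c.d").forEach(System.out::println);
    }
}
```

A name with n parts has n candidate readings, so the parser cannot pick one from syntax alone; it needs either a syntactic marker or catalog information.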

@wendigo
Contributor

wendigo commented Oct 7, 2025

But if we use another syntax (i.e., :: instead of a dot), it shouldn't be ambiguous anymore, right?

@kasiafi
Member

kasiafi commented Oct 7, 2025

But if we use another syntax (i.e., :: instead of a dot), it shouldn't be ambiguous anymore, right?

Yes.
Regarding ::, however, I recall that it's reserved for function calls on types (see #23795 (comment)), so this would probably be ambiguous as well.

@ebyhr
Member Author

ebyhr commented Oct 7, 2025

Can we resolve the ambiguity by requiring () when we want to use a column name at the beginning of the chain?

That's BigQuery's syntax:

SELECT (name).lower() FROM tpch.region;

The query below fails with "Function not found: name.lower":

SELECT name.lower() FROM tpch.region;

@kasiafi
Member

kasiafi commented Oct 7, 2025

Can we resolve the ambiguity by requiring () when we want to use a column name at the beginning of the chain?

We could require (). I think that after the proposed grammar change it's already required. I'm not sure how intuitive using parentheses in this context is to users. Column names / qualified names would be the only primary expressions that require parentheses for disambiguation. There's a chance of getting unexpected output when a user omits the parentheses and the column name overlaps with a function name prefix, but this does not seem a likely scenario.

@martint
Member

martint commented Oct 7, 2025

There's a whole section in the SQL spec for instance method invocations. We don't need to reinvent any wheels.

However, as @findepi pointed out above, we can't blindly interpret every function as a method invocation.

Labels

cla-signed, needs-docs, syntax-needs-review

5 participants