Skip to content

feat(postgresql): add support for table creation DDL that contains a primary key alongside the INCLUDE keyword #5425

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

amosbiras
Copy link
Contributor

Hi there! 👋

First things first, sqlglot is such an amazing library - my team and I have been using it so far for SQL code transpiling into Python code, on which we rely on the most for our database infrastructure management, and it is been so easy, developer friendly and useful that I wanted to say how much it is appreciated as a library out there.

I ran into an issue the other day with the current version of sqlglot that led me to open this PR.
The issue is as follows:
When I tried to run the following code:

 ```
import sqlglot
from sqlglot.dialects.postgres import Postgres
sql = "CREATE TABLE t (i INT, a TEXT PRIMARY KEY (i) INCLUDE (a))"
result = sqlglot.parse(sql, dialect=Postgres) 
``` 

I received this error:
sqlglot.errors.ParseError: Expecting ). Line 1, Col: 53. CREATE TABLE t (i INT, a TEXT PRIMARY KEY (i) INCLUDE (a))

I went ahead and tried to run a debugger with the code, specifically I reached a point that I realized that the tokenizing process in the code leveraged by Rust might have been the issue as more tokens were created for this SQL as opposed to the same SQL query except for the INCLUDE keyword - but then I found out that it was essentially just a matter of the PrimaryKey Expression class definition.

Would appreciate this PR.
BTW - you might be wondering why we attempt to use a Primary Key on PostgreSQL alongside with the INCLUDE keyword. The reason for that is that we basically create along with the PK the same unique index that is merely different by the fact that it has this INCLUDE keyword, nothing more than that. So we realized that we might be able to save the burden on the database by leveraging the primary key with other columns as part of the B+ data-structure that composes the index, but are not actually indexed.

Copy link
Collaborator

@VaggelisD VaggelisD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @amosbiras, thank you for your kind words and for the contribution!

Leaving a few comments to further improve this:

Comment on lines 6300 to 6302
include = None
if self._match_text_seq("INCLUDE"):
include = self._parse_wrapped_id_vars()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should reuse parser::_parse_index_params() here, it contains the INCLUDE parsing & generation so you'll get them for free

Comment on lines +3000 to +3001
include = self.expressions(expression, key="include", flat=True)
include = f" INCLUDE ({include})" if include else ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, if we have an exp.IndexParameters node here, all we'll have to do is:

include = self.sql(expression, "include")
include = f" {include}" if include else ""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately here when I run this suggested code as is, I get the following error:
ValueError: Expected an Expression. Received <class 'list'>: [Identifier(this=a, quoted=False)]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your version is the correct one– you need to use self.expressions if the child is a list.

@@ -987,6 +987,7 @@ def test_ddl(self):
self.validate_identity(
"CREATE TABLE t (vid INT NOT NULL, CONSTRAINT ht_vid_nid_fid_idx EXCLUDE (INT4RANGE(vid, nid) WITH &&, INT4RANGE(fid, fid, '[]') WITH &&))"
)
self.validate_identity("CREATE TABLE t (i INT, a TEXT PRIMARY KEY (i) INCLUDE (a))")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried running this in PSQL and I'm getting a parse error there as well, this needs a comma between the column def list and the primary key afaict:

... a TEXT, PRIMARY KEY (i) ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, I must've missed it, thanks for thoroughly checking it.
I at first cloned the repository, checked out to a branch of mine and committed the changes but when I tried to push it, I received a 403 - so I went to see other PRs and I noticed that some of them came from forked repos, so I literally did it and copied my changes.
Maybe it's worth to add a description in the README for developers in the library for all the required steps, tho developing here was pretty much straightforward.

@@ -987,6 +987,7 @@ def test_ddl(self):
self.validate_identity(
"CREATE TABLE t (vid INT NOT NULL, CONSTRAINT ht_vid_nid_fid_idx EXCLUDE (INT4RANGE(vid, nid) WITH &&, INT4RANGE(fid, fid, '[]') WITH &&))"
)
self.validate_identity("CREATE TABLE t (i INT, a TEXT PRIMARY KEY (i) INCLUDE (a))")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, I must've missed it, thanks for thoroughly checking it.
I at first cloned the repository, checked out to a branch of mine and committed the changes but when I tried to push it, I received a 403 - so I went to see other PRs and I noticed that some of them came from forked repos, so I literally did it and copied my changes.
Maybe it's worth to add a description in the README for developers in the library for all the required steps, tho developing here was pretty much straightforward.

Comment on lines +3000 to +3001
include = self.expressions(expression, key="include", flat=True)
include = f" INCLUDE ({include})" if include else ""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately here when I run this suggested code as is, I get the following error:
ValueError: Expected an Expression. Received <class 'list'>: [Identifier(this=a, quoted=False)]

@amosbiras
Copy link
Contributor Author

Hey @amosbiras, thank you for your kind words and for the contribution!

Leaving a few comments to further improve this:

Thanks for the CR!
I'd also like to know when the library is expected to have the next version release :)

@amosbiras amosbiras requested a review from VaggelisD July 20, 2025 18:07
@VaggelisD
Copy link
Collaborator

VaggelisD commented Jul 21, 2025

Hey @amosbiras, thank you once again! I think we can further clean this up a bit, I'll merge this and follow up on a different commit.

As for the releases we perform them often, I'd say once every few days is typical.

@VaggelisD VaggelisD merged commit 732548f into tobymao:main Jul 21, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants