A comprehensive SQL parser for Amazon Redshift built with ANTLR 4, optimized for Redshift-specific syntax.
This project is a Go-based SQL parser specifically designed for Amazon Redshift. It originated as a fork of the PostgreSQL parser but has been restructured to focus exclusively on Redshift's unique syntax requirements.
- Complete SQL Support: Parses 200+ SQL statement types including DDL, DML, and advanced constructs
- Redshift-Specific Syntax: Full support for Redshift extensions like IDENTITY columns, DISTKEY, SORTKEY, and more
- Redshift-Optimized: Parser optimized exclusively for Redshift syntax and features
- Comprehensive Testing: 200+ test cases covering real-world SQL scenarios
- High Performance: Optimized for production use with parser reuse and efficient error handling
go get github.com/bytebase/redshift-parser
package main

import (
	"fmt"

	"github.com/antlr4-go/antlr/v4"
	parser "github.com/bytebase/redshift-parser"
)

func main() {
	// Parse a Redshift-specific CREATE TABLE statement
	sql := `CREATE TABLE users (
	id INT IDENTITY(1,1),
	name VARCHAR(100),
	email VARCHAR(255) UNIQUE
) DISTKEY(id) SORTKEY(name);`

	// Create the lexer and parser
	input := antlr.NewInputStream(sql)
	lexer := parser.NewRedshiftLexer(input)
	stream := antlr.NewCommonTokenStream(lexer, antlr.TokenDefaultChannel)
	p := parser.NewRedshiftParser(stream)

	// Parse the SQL starting from the grammar's entry rule
	tree := p.Root()

	// Print the parse tree in LISP-style form. Syntax errors, if any, are
	// reported on stderr by ANTLR's default error listener.
	fmt.Println(tree.ToStringTree(nil, p))
}
- CREATE TABLE with Redshift-specific options (DISTKEY, SORTKEY, IDENTITY)
- ALTER TABLE with column modifications and constraints
- CREATE INDEX with various index types
- CREATE VIEW and materialized views
- CREATE FUNCTION and stored procedures
- SELECT with complex joins, subqueries, and window functions
- INSERT with conflict resolution (ON CONFLICT)
- UPDATE with joins and CTEs
- DELETE with complex conditions
- MERGE statements
- Common Table Expressions (CTEs)
- Window functions and analytics
- JSON operations and path expressions
- Array operations
- Regular expressions
- Full-text search
The parser is optimized for Redshift's unique SQL extensions:
- IDENTITY columns: CREATE TABLE t (id INT IDENTITY(1,1))
- Distribution keys: DISTKEY(column_name)
- Sort keys: SORTKEY(column_name)
- Redshift built-in functions: Comprehensive support for Redshift-specific functions
- Data types: All Redshift-supported data types including extensions
- Go 1.21+
- ANTLR 4.13.2+
- Clone the repository:
git clone https://github.com/bytebase/redshift-parser.git
cd redshift-parser
- Generate parser code:
./build.sh
- Run tests:
go test -v
redshift-parser/
├── RedshiftLexer.g4 # ANTLR lexer grammar
├── RedshiftParser.g4 # ANTLR parser grammar
├── build.sh # Code generation script
├── redshift_lexer_base.go # Base lexer implementation
├── redshift_parser_base.go # Base parser implementation
├── keywords.go # 600+ SQL keywords
├── builtin_function.go # Built-in function definitions
├── examples/ # 200+ SQL test files
├── parser_test.go # Main test suite
└── CLAUDE.md # Development guide
The project includes comprehensive test coverage:
# Run all tests
go test -v
# Run specific test
go test -run TestRedshiftParser -v
# Run benchmarks
go test -bench=. -v
Test files are located in the examples/ directory and cover:
- Basic SQL operations
- Complex queries with joins and subqueries
- Redshift-specific syntax
- Error handling scenarios
- Performance benchmarks
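As a rough illustration of how a directory of SQL files can drive a test run, the sketch below walks a directory, reads every .sql file, and hands its contents to a check function. It is not the project's actual test suite: checkSQLFiles, placeholderCheck, and runDemo are hypothetical names, and placeholderCheck only rejects empty input where a real harness would invoke the parser.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// checkSQLFiles walks dir, passes the contents of each .sql file to check,
// and returns how many files were checked plus the paths that failed.
func checkSQLFiles(dir string, check func(sql string) error) (int, []string, error) {
	checked := 0
	var failed []string
	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() || filepath.Ext(path) != ".sql" {
			return nil // skip directories and non-SQL files
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		checked++
		if err := check(string(data)); err != nil {
			failed = append(failed, path)
		}
		return nil
	})
	return checked, failed, err
}

// placeholderCheck stands in for the real parser (assumption): it only
// rejects files that contain no statement at all.
func placeholderCheck(sql string) error {
	if strings.TrimSpace(sql) == "" {
		return errors.New("empty statement")
	}
	return nil
}

// runDemo creates a temporary directory with sample files and checks it.
func runDemo() (int, []string, error) {
	dir, err := os.MkdirTemp("", "sql-examples")
	if err != nil {
		return 0, nil, err
	}
	defer os.RemoveAll(dir)

	files := map[string]string{
		"create_table.sql": "CREATE TABLE t (id INT IDENTITY(1,1)) DISTKEY(id);",
		"empty.sql":        "  \n",
		"notes.txt":        "not SQL, skipped by the walker",
	}
	for name, body := range files {
		if err := os.WriteFile(filepath.Join(dir, name), []byte(body), 0o644); err != nil {
			return 0, nil, err
		}
	}
	return checkSQLFiles(dir, placeholderCheck)
}

func main() {
	checked, failed, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("checked %d file(s), %d failed\n", checked, len(failed))
}
```

Passing the check as a function keeps the directory walk independent of the parser, so the same loop can serve both correctness tests and benchmarks.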
The parser is built using ANTLR 4 grammars:
- RedshiftLexer.g4: Tokenization rules for SQL keywords, operators, and literals
- RedshiftParser.g4: Grammar rules for SQL statement parsing
After modifying grammar files, regenerate the Go code:
./build.sh
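Forgetting to regenerate after a grammar edit means tests run against stale parser code. One way to catch this, sketched below under assumptions, is to compare file modification times. The helper needsRegeneration and the generated file name redshift_parser.go are hypothetical; substitute whatever build.sh actually produces.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// needsRegeneration reports whether the generated file is missing or older
// than any of the given grammar files.
func needsRegeneration(generated string, grammars ...string) (bool, error) {
	genInfo, err := os.Stat(generated)
	if os.IsNotExist(err) {
		return true, nil
	}
	if err != nil {
		return false, err
	}
	for _, g := range grammars {
		info, err := os.Stat(g)
		if err != nil {
			return false, err
		}
		if info.ModTime().After(genInfo.ModTime()) {
			return true, nil
		}
	}
	return false, nil
}

// runDemo checks staleness before and after a simulated grammar edit
// (the grammar file's modification time is moved forward).
func runDemo() (staleBefore, staleAfter bool, err error) {
	dir, err := os.MkdirTemp("", "grammar-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	grammar := filepath.Join(dir, "RedshiftParser.g4")
	generated := filepath.Join(dir, "redshift_parser.go") // hypothetical output name
	for _, path := range []string{grammar, generated} {
		if err := os.WriteFile(path, nil, 0o644); err != nil {
			return false, false, err
		}
	}

	now := time.Now()
	// Grammar last edited two hours ago, code generated one hour ago.
	if err := os.Chtimes(grammar, now, now.Add(-2*time.Hour)); err != nil {
		return false, false, err
	}
	if err := os.Chtimes(generated, now, now.Add(-time.Hour)); err != nil {
		return false, false, err
	}
	staleBefore, err = needsRegeneration(generated, grammar)
	if err != nil {
		return false, false, err
	}

	// Simulate editing the grammar just now.
	if err := os.Chtimes(grammar, now, now); err != nil {
		return false, false, err
	}
	staleAfter, err = needsRegeneration(generated, grammar)
	return staleBefore, staleAfter, err
}

func main() {
	before, after, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("before edit: stale=%v, after edit: stale=%v\n", before, after)
}
```

Modification times are only a heuristic (a fresh checkout can reset them), so a check like this is a reminder to run ./build.sh and not a guarantee that the generated code is current.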
- Fork the repository
- Create a feature branch
- Add tests for your changes
- Ensure all tests pass
- Update documentation as needed
- Submit a pull request
- Always run ./build.sh before testing after grammar changes
- Add test cases for new SQL syntax support
- Follow existing code patterns and conventions
- Use AWS Redshift documentation for syntax reference
- Test against both PostgreSQL and Redshift engines
This project is licensed under the MIT License. See the grammar files for additional license information from the original PostgreSQL grammar contributors.
- Based on the PostgreSQL grammar from Tunnel Vision Labs
- Forked from Bytebase PostgreSQL Parser
- Built with ANTLR 4
- Bytebase - Database DevOps platform
- PostgreSQL Parser - Original PostgreSQL parser
- ANTLR 4 - Parser generator toolkit
- GitHub Issues - Bug reports and feature requests
- AWS Redshift Documentation - SQL syntax reference
- ANTLR Documentation - Grammar development guide