A type-safe lexer and parser library for Scala 3, featuring compile-time validation and a pattern-matching DSL.
- Type-safe lexer and parser — catch errors at compile time with Scala 3's type system
- Pattern-matching DSL — define lexers and parsers using intuitive `case` syntax
- Compile-time validation — regex patterns and grammar rules are checked during compilation
- Macro-based code generation — Scala 3 macros generate efficient tokenizers and parse tables
- Context-aware — lexical and parsing contexts with type-safe state management
- LR(1) parsing — automatic parse table generation with conflict detection
Add Alpaca as a dependency in your build.mill:
```scala
//| mill-version: 1.1.3
//| mill-jvm-version: 21
import mill._
import mill.scalalib._

object myproject extends ScalaModule {
  def scalaVersion = "3.8.3-RC1"
  def scalacOptions = Seq("-Yretain-trees")
  def mvnDeps = Seq(
    mvn"io.github.halotukozak::alpaca:0.1.0"
  )
}
```
Add Alpaca to your build.sbt:
```scala
libraryDependencies += "io.github.halotukozak" %% "alpaca" % "0.1.0"
```
Make sure you're using Scala 3.8.3-RC1 or later and enable the required compiler flag:
```scala
scalaVersion := "3.8.3-RC1"
scalacOptions += "-Yretain-trees"
```
Use Alpaca directly in your Scala CLI scripts:
```scala
//> using scala "3.8.3-RC1"
//> using dep "io.github.halotukozak::alpaca:0.1.0"
//> using option "-Yretain-trees"

import alpaca.*

// Your code here
```
Define a lexer using pattern matching with regex patterns:
```scala
import alpaca.*

val MyLexer = lexer:
  case num @ "[0-9]+" => Token["NUM"](num.toDouble)
  case "\\+" => Token["PLUS"]
  case "-" => Token["MINUS"]
  case "\\*" => Token["STAR"]
  case "/" => Token["SLASH"]
  case "\\(" => Token["LP"]
  case "\\)" => Token["RP"]
  case "\\s+" => Token.Ignored
```
Define a parser by extending the Parser class and defining grammar rules:
```scala
import alpaca.*

object MyParser extends Parser:
  val root: Rule[Double] = rule { case Expr(e) => e }

  val Expr: Rule[Double] = rule(
    { case (Expr(l), MyLexer.PLUS(_), Term(r)) => l + r },
    { case (Expr(l), MyLexer.MINUS(_), Term(r)) => l - r },
    { case Term(t) => t }
  )

  val Term: Rule[Double] = rule(
    { case (Term(l), MyLexer.STAR(_), Factor(r)) => l * r },
    { case (Term(l), MyLexer.SLASH(_), Factor(r)) => l / r },
    { case Factor(f) => f }
  )

  val Factor: Rule[Double] = rule(
    { case MyLexer.NUM(n) => n.value },
    { case (MyLexer.LP(_), Expr(e), MyLexer.RP(_)) => e }
  )
```
```scala
import alpaca.*

val input = "2 + 3 * 4"
val (_, lexemes) = MyLexer.tokenize(input)
val (_, result) = MyParser.parse(lexemes)
println(result) // 14.0
```
- Getting Started — build a BrainFuck interpreter step by step
- Lexer — the full lexer DSL reference
- Parser — grammar rules, EBNF operators, conflict resolution
- Theory — formal foundations: finite automata, LR parsing, parse tables
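To make the walkthrough concrete, here is a plain-Scala sketch of what the lexer and grammar above express: a first-match regex tokenizer and a recursive-descent evaluator with the same Expr/Term/Factor precedence. This is not Alpaca's generated code (Alpaca builds LR(1) parse tables at compile time) and the helper names here are illustrative; it only shows why this grammar yields 14.0 for `"2 + 3 * 4"`.

```scala
import scala.util.matching.Regex

enum Tok:
  case Num(value: Double)
  case Plus, Minus, Star, Slash, LP, RP

// First-match tokenization: try each pattern at the current position in
// declaration order; whitespace matches are dropped (like Token.Ignored).
def tokenize(input: String): List[Tok] =
  val patterns: List[(Regex, String => Option[Tok])] = List(
    "[0-9]+".r -> (s => Some(Tok.Num(s.toDouble))),
    "\\+".r    -> (_ => Some(Tok.Plus)),
    "-".r      -> (_ => Some(Tok.Minus)),
    "\\*".r    -> (_ => Some(Tok.Star)),
    "/".r      -> (_ => Some(Tok.Slash)),
    "\\(".r    -> (_ => Some(Tok.LP)),
    "\\)".r    -> (_ => Some(Tok.RP)),
    "\\s+".r   -> (_ => None)
  )
  def loop(rest: String, acc: List[Tok]): List[Tok] =
    if rest.isEmpty then acc.reverse
    else
      patterns.iterator
        .map((re, mk) => (re.findPrefixOf(rest), mk))
        .collectFirst { case (Some(m), mk) => (m, mk(m)) } match
        case Some((m, tok)) => loop(rest.drop(m.length), tok.fold(acc)(_ :: acc))
        case None           => sys.error(s"no pattern matches: $rest")
  loop(input, Nil)

// Recursive descent mirroring the precedence encoded by the rules above:
// Expr handles + and -, Term handles * and /, Factor handles numbers
// and parenthesized sub-expressions.
def parse(tokens: List[Tok]): Double =
  var rest = tokens
  def factor(): Double = rest match
    case Tok.Num(v) :: tl => rest = tl; v
    case Tok.LP :: tl =>
      rest = tl
      val e = expr()
      rest = rest.tail // sketch: assume the matching RP is here
      e
    case _ => sys.error("expected a factor")
  def term(): Double =
    var acc = factor()
    while rest.headOption.exists(t => t == Tok.Star || t == Tok.Slash) do
      val op = rest.head
      rest = rest.tail
      acc = if op == Tok.Star then acc * factor() else acc / factor()
    acc
  def expr(): Double =
    var acc = term()
    while rest.headOption.exists(t => t == Tok.Plus || t == Tok.Minus) do
      val op = rest.head
      rest = rest.tail
      acc = if op == Tok.Plus then acc + term() else acc - term()
    acc
  expr()

@main def demo(): Unit =
  println(parse(tokenize("2 + 3 * 4")))   // prints 14.0
  println(parse(tokenize("(2 + 3) * 4"))) // prints 20.0
```

Because `Term` sits below `Expr` in the grammar, `3 * 4` reduces first and the addition sees `2 + 12`; Alpaca's LR(1) tables encode the same precedence automatically.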
Runtime benchmarks are not run automatically in CI on push or pull requests. They can be triggered manually:
- GitHub Actions — go to Actions > Runtime Benchmark > Run workflow and select the branch.
- Locally — run all benchmarks (JMH + Python) from the repository root:
  ```shell
  ./mill benchmarks.runAll
  ```
  Or run individual JMH suites directly:
  ```shell
  ./mill benchmarks.alpaca.runJmh
  ./mill benchmarks.fastparse.runJmh
  ```
Results are written to `benchmarks/outputs/`.
- JDK 21 or later
- Mill 1.1.3 or later
```shell
# Compile the project
./mill compile

# Run tests
./mill test

# Generate documentation
./mill docJar

# Run test coverage
./mill test.scoverage.htmlReport
```
This project was developed as a Bachelor's Thesis. The full text is available in the thesis.pdf file, and the LaTeX source files are on the thesis branch. The thesis is written in Polish and does not represent the current state of the project.
Contributions are welcome. Please feel free to submit a Pull Request.
Created by halotukozak and Corvette653
Made with ❤️ and coffee