egglog: The Next-Generation Equality Saturation Engine


This is the repo for the core of the egglog engine, which combines the power of equality saturation and Datalog.

For getting started, try out the egglog tutorial!

You can also run egglog in your web browser or check out the documentation.

For a "batteries-included" experience, we recommend egglog-experimental, which provides more features through additional egglog plugins.

If you want to cite egglog, please use this citation.


The following instructions are for using/developing the core directly.

Prerequisites & compilation

Install Cargo.

git clone git@github.com:egraphs-good/egglog.git
cargo install --path=egglog

Usage

The core can be used in REPL mode with:

cargo run --release

The standard mode processes an input file:

cargo run --release -- [-f fact-directory] [--to-dot] [--to-svg] [-j|--threads <THREADS>] <files.egg>
  • The --to-dot flag saves a Graphviz dot file at the end of the program, replacing the .egg extension with .dot.
  • The --to-svg flag, which requires Graphviz to be installed, saves a Graphviz SVG file at the end of the program, replacing the .egg extension with .svg.
  • Set RUST_LOG=INFO to see more logging messages, as we use env-logger defaulting to warn.
  • The -j option specifies the number of threads to use for parallel execution. The default value is 1, which runs everything in a single thread. Passing 0 will use the maximum inferred parallelism available on the current system.
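
As a rough illustration of how these options combine, the sketch below derives the output names that --to-dot and --to-svg would produce and shows one possible invocation. The input path examples/demo.egg is hypothetical, and the sketch assumes the output is written next to the input file.

```shell
# Hypothetical input file; substitute your own program.
egg_file="examples/demo.egg"

# --to-dot and --to-svg reuse the input path with the extension swapped
# (assumption: the output lands next to the input file).
dot_file="${egg_file%.egg}.dot"
svg_file="${egg_file%.egg}.svg"
echo "graph outputs: $dot_file $svg_file"

# Uncomment to run from the repository root: INFO-level logging, all
# available cores (-j 0), and a dot file written when the program finishes.
# RUST_LOG=INFO cargo run --release -- --to-dot -j 0 "$egg_file"
```

The `--` separates Cargo's own options from the arguments passed to egglog.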

You can also use egglog as a Rust library by adding the following to your Cargo.toml:

[dependencies]
egglog = "2.0.0"

See also the Python binding for using egglog in Python.

egglog can also be compiled to WebAssembly; see ./wasm-example for more information.

Development

To view documentation in a browser, run cargo doc --open.

Run cargo test to run the core egglog tests.

Community extensions

The community has maintained egglog extensions for IDEs. However, they are outdated at the time of writing.

Parallelism

egglog supports running programs in parallel via the -j flag. This support is relatively new and most users just run egglog single-threaded; the codspeed benchmarks only evaluate single-threaded performance. However, please avoid changes that pessimize parallel performance, such as adding coarse-grained locks.

We use rayon's global thread pool for parallelism, and the number of threads used is set to 1 by default when egglog's CLI is run. If you use egglog as a library, you can control the level of parallelism by setting rayon's num_threads.
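
For library users who would rather not change code, rayon's global pool also reads the RAYON_NUM_THREADS environment variable when the application has not set num_threads explicitly. A minimal sketch, assuming a hypothetical binary my-egglog-app that links egglog and leaves rayon's pool at its defaults:

```shell
# Pick a thread count: the number of online cores, falling back to 1
# if the system cannot report it.
threads="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# rayon's global pool consults RAYON_NUM_THREADS only when the program
# itself has not requested a specific num_threads.
export RAYON_NUM_THREADS="$threads"
echo "rayon will use up to $RAYON_NUM_THREADS threads"

# ./my-egglog-app input.egg   # hypothetical application built on the egglog crate
```

This does not apply to the egglog CLI, which sets the thread count itself through -j.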

Benchmarks

All PRs use codspeed to evaluate the performance of a change against a suite of micro-benchmarks. You should see a "performance report" from codspeed a few minutes after posting a PR for review. Generally speaking, PRs should only improve performance or leave it unchanged, though exceptions are possible when warranted.

To debug performance issues, we recommend looking at the codspeed profiles, or running your own using samply, flamegraph-rs, cargo-instruments (on macOS) or perf (on Linux). All codspeed benchmarks correspond to named .egg files, usually in the tests/ directory. For example, to debug an issue with extract-vec-bench, you can run the following commands:

# install samply
cargo install --locked samply
# build a profile build which includes debug symbols
cargo build --profile profiling
# run the egglog file and profile
samply record ./target/profiling/egglog tests/extract-vec-bench.egg
# [optional] run the egglog file without logging or printed messages, which reduces the output on stdout
# when you are profiling the extraction of a large expression
env RUST_LOG=error samply record ./target/profiling/egglog --no-messages tests/extract-vec-bench.egg

CodSpeed

We run all of our "examples" as benchmarks in CodSpeed. These run in CI for every commit to main and for all PRs. CodSpeed runs the examples with extra instrumentation so that it can capture a single trace of the CPU interactions. As CodSpeed describes it:

CodSpeed instruments your benchmarks to measure the performance of your code. A benchmark will be run only once and the CPU behavior will be simulated. This ensures that the measurement is as accurate as possible, taking into account not only the instructions executed but also the cache and memory access patterns. The simulation gives us an equivalent of the CPU cycles that includes cache and memory access.

Since many of the shorter-running benchmarks have unstable timings due to nondeterministic performance (such as in the memory allocator), we "ignore" them in CodSpeed. That way, we still capture their performance, but their timings don't show up in our reports by default.

We currently use 50 ms as our cutoff; any benchmark shorter than that is ignored. This number was selected to exclude benchmarks whose timings change by more than 1% when they haven't been modified. Note that all the ignoring is done manually, so if you add another short example, an admin on the CodSpeed project will need to ignore it manually.
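
To make the cutoff rule concrete, here is a small sketch that lists which benchmarks would fall under it. The benchmark names and timings are made up purely for illustration; they are not real measurements, and CodSpeed itself offers no such script.

```shell
# Made-up (name, milliseconds) pairs for illustration only.
timings='example-a 12
example-b 50
example-c 340'
cutoff_ms=50

# Print the names whose time is strictly below the cutoff; these are the
# benchmarks an admin would need to ignore by hand in CodSpeed.
ignored="$(printf '%s\n' "$timings" | awk -v cutoff="$cutoff_ms" '$2 < cutoff { print $1 }')"
echo "$ignored"
```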

Code Coverage

To generate code coverage reports locally, first install cargo-llvm-cov:

cargo install cargo-llvm-cov

Then run:

make coverage

This will generate a coverage report using nextest and output it to lcov.info. The coverage report is automatically generated and uploaded to Codecov in CI for all pull requests and commits to main. To visualize coverage in VS Code, we recommend using the Coverage Gutters extension. After running make coverage, click "Watch" in the status bar to see coverage highlighted in your editor.
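
For a quick overall number without opening an editor, the line totals can be read straight from the report. The helper below is a hypothetical convenience, not part of the repository; it relies only on the LF (lines found) and LH (lines hit) records of the lcov tracefile format.

```shell
# Summarize overall line coverage from an lcov tracefile.
lcov_summary() {
  awk -F: '
    $1 == "LF" { found += $2 }   # instrumented lines in one source file
    $1 == "LH" { hit += $2 }     # lines executed at least once
    END {
      if (found > 0) printf "%d/%d lines (%.1f%%)\n", hit, found, 100 * hit / found
      else print "no line records found"
    }
  ' "$1"
}

# Usage after `make coverage`:
#   lcov_summary lcov.info
```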

Contributing

We are open to new contributors helping with egglog!

A group of core egglog developers is responsible for final decisions on what code will be accepted.

We organize our issues into three stages:

  1. status: needs discussion: More work refining this should happen, online or offline, before it's ready to be considered.
  2. status: needs decision: There is a concrete proposal here which needs to be considered by the core developers.
  3. status: ready for work: This is ready to be implemented and a PR to address it would be supported.

If you are looking for an issue to solve, choosing one labeled status: ready for work is more likely to result in a PR that could be accepted. We also try to maintain a set of good first issues which may be easier to approach.

The core developers regularly review all open untriaged issues to categorize them.

If you are unsure whether a feature is desired, feel free to open an issue on it first to get feedback before spending time implementing it.
