Add BDD-based rules engine trait #2703

mtdowling · 2025-07-15T21:47:48Z

This commit updates the smithy-rules-engine package to support binary decision diagrams (BDD) to more efficiently resolve endpoints.

We create the BDD by converting the decision tree into a control flow graph (CFG), then compiling the CFG to a BDD. The CFG canonicalizes conditions for better sharing (e.g., it sorts commutative functions and expands simple string templates), and strips all conditions from results and hash-conses them as well. Later, we'll migrate to emitting the BDD directly in order to shave off many conditions and results that can be simplified.

Our decision-tree-based rules engine requires deep branching logic to find results. When evaluating the path to a result based on given input, decision trees require descending into a branch, and if at any point a condition in the branch fails, you bail out and go back up to the next branch. This can cause pathological searches of a tree (e.g., 60+ repeated checks on things like isSet and booleanEquals to resolve S3 endpoints). In fact, there are currently ~73,000 unique paths through the current decision tree for S3 rules.

Using a BDD (a fully reduced one at least) guarantees that we only evaluate any given condition at most once, and only when that condition actually discriminates the result. This is achieved by recursively converting the CFG into BDD nodes using ITE (if-then-else) operations, choosing a variable ordering that honors dependencies between conditions and variable bindings. The BDD builder applies Shannon expansion during ITE operations and uses hash-consing to share common subgraphs.
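As a rough illustration of the hash-consing and reduction mentioned above, here is a minimal sketch of a unique table. It is not the PR's BddBuilder: the class name `UniqueTableSketch`, the `mk` method, and the two-terminal node layout are assumptions made only for this example, and ITE itself is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a unique table that hash-conses (var, high, low) triples.
final class UniqueTableSketch {
    private final List<int[]> nodes = new ArrayList<>();
    private final Map<List<Integer>, Integer> unique = new HashMap<>();

    UniqueTableSketch() {
        nodes.add(new int[] {-1, 0, 0}); // index 0: FALSE terminal (assumed layout)
        nodes.add(new int[] {-1, 1, 1}); // index 1: TRUE terminal (assumed layout)
    }

    // Returns the index of the node testing `var`, creating it only if needed.
    int mk(int var, int high, int low) {
        if (high == low) {
            return high; // reduction: the test does not discriminate, so skip it
        }
        List<Integer> key = Arrays.asList(var, high, low);
        Integer existing = unique.get(key);
        if (existing != null) {
            return existing; // hash-consing: reuse the identical subgraph
        }
        nodes.add(new int[] {var, high, low});
        int index = nodes.size() - 1;
        unique.put(key, index);
        return index;
    }

    int size() {
        return nodes.size();
    }
}
```

Because every node is created through `mk`, two structurally identical subgraphs always end up with the same index, which is what allows a reduced BDD to share them.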

The "bdd" trait has most of the same information as the endpointRuleset trait, but doesn't include "rules". Instead, it contains a base64-encoded "nodes" value that holds the zig-zag variable-length encoded node triples, one after the other (this is much more compact and efficient to decode than 1000+ JSON array nodes).
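To make the encoding concrete, here is a minimal sketch of zig-zag variable-length encoding followed by base64. The class and method names are hypothetical, and the exact byte layout used by the trait is not confirmed by this description, so treat it as an illustration of the general technique only.

```java
import java.io.ByteArrayOutputStream;
import java.util.Base64;

// Hypothetical sketch of zig-zag varint encoding for a flat list of ints.
final class ZigZagVarint {
    // Maps small negative and positive values to small unsigned values.
    static int zigZag(int n) {
        return (n << 1) ^ (n >> 31);
    }

    static int unZigZag(int z) {
        return (z >>> 1) ^ -(z & 1);
    }

    // Writes each value as 7-bit groups, low bits first, with a continuation bit.
    static byte[] encode(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            int z = zigZag(v);
            while ((z & ~0x7F) != 0) {
                out.write((z & 0x7F) | 0x80);
                z >>>= 7;
            }
            out.write(z);
        }
        return out.toByteArray();
    }

    // Reads `count` values back out of the byte array.
    static int[] decode(byte[] bytes, int count) {
        int[] values = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            int z = 0;
            int shift = 0;
            int b;
            do {
                b = bytes[pos++] & 0xFF;
                z |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            values[i] = unZigZag(z);
        }
        return values;
    }

    static String toBase64(int[] values) {
        return Base64.getEncoder().encodeToString(encode(values));
    }
}
```

Small references of either sign take a single byte, which is where the size advantage over JSON arrays of numbers comes from.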

The BDD implementation uses CUDD-style complement edges where negative node references represent logical NOT, further reducing BDD size.
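The following sketch shows how a negative reference can stand for logical NOT during evaluation. The node layout here (index 1 as the only terminal, meaning TRUE, and nodes stored as `{variable, high, low}`) is an assumption for illustration and may differ from the layout used in this PR.

```java
// Hypothetical sketch of evaluating a BDD that uses complement edges.
final class ComplementEdgeSketch {
    // nodes[i] = {variable, high, low}; index 0 is unused, index 1 is TRUE.
    // A negative reference -r means "NOT r", so FALSE is written as -1.
    static boolean eval(int[][] nodes, int root, boolean[] vars) {
        int ref = root;
        boolean negate = false;
        while (true) {
            if (ref < 0) {
                negate = !negate; // crossing a complement edge flips the result
                ref = -ref;
            }
            if (ref == 1) {
                return !negate; // TRUE terminal, possibly complemented
            }
            int[] node = nodes[ref];
            ref = vars[node[0]] ? node[1] : node[2];
        }
    }
}
```

With this convention a function and its negation share every node: if reference `3` encodes `x0 AND x1`, then `-3` encodes `NOT (x0 AND x1)` without adding anything to the node table.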

BDD output examples

AWS Connect BDD output

Bdd{
  conditions (8):
     C0: isSet(Endpoint)
     C1: isSet(Region)
     C2: PartitionResult = aws.partition(Region)
     C3: booleanEquals(UseFIPS, true)
     C4: booleanEquals(UseDualStack, true)
     C5: booleanEquals(PartitionResult#supportsDualStack, true)
     C6: booleanEquals(PartitionResult#supportsFIPS, true)
     C7: stringEquals("aws-us-gov", PartitionResult#name)
  results (13):
     R0: NoMatchRule
     R1: Error: "Invalid Configuration: FIPS and custom endpoint are not supported"
     R2: Error: "Invalid Configuration: Dualstack and custom endpoint are not supported"
     R3: Endpoint: Endpoint
     R4: Endpoint: "https://connect-fips.{Region}.{PartitionResult#dualStackDnsSuffix}"
     R5: Error: "FIPS and DualStack are enabled, but this partition does not support one or both"
     R6: Endpoint: "https://connect.{Region}.amazonaws.com"
     R7: Endpoint: "https://connect-fips.{Region}.{PartitionResult#dnsSuffix}"
     R8: Error: "FIPS is enabled but this partition does not support FIPS"
     R9: Endpoint: "https://connect.{Region}.{PartitionResult#dualStackDnsSuffix}"
    R10: Error: "DualStack is enabled but this partition does not support DualStack"
    R11: Endpoint: "https://connect.{Region}.{PartitionResult#dnsSuffix}"
    R12: Error: "Invalid Configuration: Missing Region"
  root: 1
  nodes (14):
     0: terminal
     1: [ C0,     12,      2]
     2: [ C1,      3,    R12]
     3: [ C2,      4,    R12]
     4: [ C3,      7,      5]
     5: [ C4,      6,    R11]
     6: [ C5,     R9,    R10]
     7: [ C4,     10,      8]
     8: [ C6,      9,     R8]
     9: [ C7,     R6,     R7]
    10: [ C5,     11,     R5]
    11: [ C6,     R4,     R5]
    12: [ C3,     R1,     13]
    13: [ C4,     R2,     R3]
}
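
To show how such a node table is walked, here is a small sketch that hard-codes the node triples from the dump above. The `R` offset used to mark result references and the `IntPredicate` callback that answers "does condition Cn hold?" are assumptions made for this example; they are not the representation used by the Bdd class.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch: walking the AWS Connect node table shown above.
final class ConnectBddWalk {
    // Assumed offset that distinguishes result references from node references.
    static final int R = 1000;

    // Each row is [condition, high, low], copied from the dump.
    static final int[][] NODES = {
        {-1, -1, -1},        //  0: terminal (never referenced in this dump)
        {0, 12, 2},          //  1
        {1, 3, R + 12},      //  2
        {2, 4, R + 12},      //  3
        {3, 7, 5},           //  4
        {4, 6, R + 11},      //  5
        {5, R + 9, R + 10},  //  6
        {4, 10, 8},          //  7
        {6, 9, R + 8},       //  8
        {7, R + 6, R + 7},   //  9
        {5, 11, R + 5},      // 10
        {6, R + 4, R + 5},   // 11
        {3, R + 1, 13},      // 12
        {4, R + 2, R + 3},   // 13
    };

    // Follows high/low branches from the root until a result is reached.
    static int resolve(IntPredicate condition) {
        int ref = 1; // root: 1
        while (ref < R) {
            int[] node = NODES[ref];
            ref = condition.test(node[0]) ? node[1] : node[2];
        }
        return ref - R; // index into the results list
    }
}
```

With no custom Endpoint, a Region that resolves to a partition, and both UseFIPS and UseDualStack false, the walk visits C0, C1, C2, C3, and C4 once each and lands on R11.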

bdd trait

{
    "version": "1.3",
    "parameters": {
        "Region": {
            "builtIn": "AWS::Region",
            "required": false,
            "documentation": "The AWS region used to dispatch the request.",
            "type": "String"
        },
        "UseDualStack": {
            "builtIn": "AWS::UseDualStack",
            "required": true,
            "default": false,
            "documentation": "When true, use the dual-stack endpoint. If the configured endpoint does not support dual-stack, dispatching the request MAY return an error.",
            "type": "Boolean"
        },
        "UseFIPS": {
            "builtIn": "AWS::UseFIPS",
            "required": true,
            "default": false,
            "documentation": "When true, send this request to the FIPS-compliant regional endpoint. If the configured endpoint does not have a FIPS compliant endpoint, dispatching the request will return an error.",
            "type": "Boolean"
        },
        "Endpoint": {
            "builtIn": "SDK::Endpoint",
            "required": false,
            "documentation": "Override the endpoint used to send this request",
            "type": "String"
        }
    },
    "conditions": [
        {
            "fn": "isSet",
            "argv": [
                {
                    "ref": "Endpoint"
                }
            ]
        },
        {
            "fn": "isSet",
            "argv": [
                {
                    "ref": "Region"
                }
            ]
        },
        {
            "fn": "aws.partition",
            "argv": [
                {
                    "ref": "Region"
                }
            ],
            "assign": "PartitionResult"
        },
        {
            "fn": "booleanEquals",
            "argv": [
                {
                    "ref": "UseFIPS"
                },
                true
            ]
        },
        {
            "fn": "booleanEquals",
            "argv": [
                {
                    "ref": "UseDualStack"
                },
                true
            ]
        },
        {
            "fn": "booleanEquals",
            "argv": [
                {
                    "fn": "getAttr",
                    "argv": [
                        {
                            "ref": "PartitionResult"
                        },
                        "supportsDualStack"
                    ]
                },
                true
            ]
        },
        {
            "fn": "booleanEquals",
            "argv": [
                {
                    "fn": "getAttr",
                    "argv": [
                        {
                            "ref": "PartitionResult"
                        },
                        "supportsFIPS"
                    ]
                },
                true
            ]
        },
        {
            "fn": "stringEquals",
            "argv": [
                "aws-us-gov",
                {
                    "fn": "getAttr",
                    "argv": [
                        {
                            "ref": "PartitionResult"
                        },
                        "name"
                    ]
                }
            ]
        }
    ],
    "results": [
        {},
        {
            "error": "Invalid Configuration: FIPS and custom endpoint are not supported",
            "type": "error"
        },
        {
            "error": "Invalid Configuration: Dualstack and custom endpoint are not supported",
            "type": "error"
        },
        {
            "endpoint": {
                "url": {
                    "ref": "Endpoint"
                },
                "properties": {},
                "headers": {}
            },
            "type": "endpoint"
        },
        {
            "endpoint": {
                "url": "https://connect-fips.{Region}.{PartitionResult#dualStackDnsSuffix}",
                "properties": {},
                "headers": {}
            },
            "type": "endpoint"
        },
        {
            "error": "FIPS and DualStack are enabled, but this partition does not support one or both",
            "type": "error"
        },
        {
            "endpoint": {
                "url": "https://connect.{Region}.amazonaws.com",
                "properties": {},
                "headers": {}
            },
            "type": "endpoint"
        },
        {
            "endpoint": {
                "url": "https://connect-fips.{Region}.{PartitionResult#dnsSuffix}",
                "properties": {},
                "headers": {}
            },
            "type": "endpoint"
        },
        {
            "error": "FIPS is enabled but this partition does not support FIPS",
            "type": "error"
        },
        {
            "endpoint": {
                "url": "https://connect.{Region}.{PartitionResult#dualStackDnsSuffix}",
                "properties": {},
                "headers": {}
            },
            "type": "endpoint"
        },
        {
            "error": "DualStack is enabled but this partition does not support DualStack",
            "type": "error"
        },
        {
            "endpoint": {
                "url": "https://connect.{Region}.{PartitionResult#dnsSuffix}",
                "properties": {},
                "headers": {}
            },
            "type": "endpoint"
        },
        {
            "error": "Invalid Configuration: Missing Region",
            "type": "error"
        }
    ],
    "root": 2,
    "nodes": "AQIBACwGAggKBAwKKAIBBhgOCBIQJgIBChYUJAIBIgIBCCQaDB4cIAIBDiIgHgIBHAIBCiYoDCooGgIBGAIBBjQuCDIwFgIBFAIBEgIB"
}

Endpoint rules: BDD vs Decision tree size comparison

Regional service

BDD: Pretty=4.4 KB; Minified=2.8 KB
Decision tree: Pretty=9.7 KB; Minified=3.7 KB

S3

BDD: Pretty=67 KB; Minified=42 KB
Decision tree: Pretty=427 KB; Minified=96 KB

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

JordonPhillips

I'm not done reviewing, but I thought I should at least post what I have. I still need to look at the bdd sifting and tests.

overall looks great

...ware/amazon/smithy/rulesengine/language/syntax/expressions/functions/FunctionDefinition.java

smithy-rules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/Bdd.java

smithy-rules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/cfg/Cfg.java

smithy-rules-engine/src/main/resources/META-INF/smithy/smithy.rules.smithy

smithy-rules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/BddBuilder.java

...gine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/DefaultOrderingStrategy.java

...engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/BddEquivalenceChecker.java

...-rules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/BddNodeHelpers.java

JordonPhillips

When are we going to be running the BDD optimizations? I think it would make sense to do it either prior to code generation or, better, as a sort of pre-compile/formatting step. The latter would make sure it's only done once, but maybe a generator wouldn't want to trust that.

...ules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/OrderConstraints.java

mtdowling · 2025-07-18T17:21:07Z

When are we going to be running the BDD optimizations

I don't think anyone doing code generation from a Bdd trait will want to optimize at all. We'll only ship already-optimized BDDs.

In the future, I want us to eventually ship just the BDD trait and not the current decision tree trait. We'd do the optimizations at the end of the build process that computes the BDD (sifting, reversal, etc).

When building BDDs manually because you just have the decision tree and no BDD trait, you can choose to either optimize or not based on your "budget".

smithy-rules-engine/src/main/resources/META-INF/smithy/smithy.rules.smithy

kstich

Partial review, will continue tomorrow but posting comments so they can be discussed/addressed.

...e/src/main/java/software/amazon/smithy/rulesengine/language/syntax/parameters/Parameter.java

...rules-engine/src/main/java/software/amazon/smithy/rulesengine/language/syntax/rule/Rule.java

smithy-rules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/BddBuilder.java

smithy-rules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/BddCompiler.java

...engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/BddEquivalenceChecker.java

smithy-rules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/BddTrait.java

Rather than have the Bdd class contain Conditions, Results, Parameters, etc., it now just deals with nodes. It also now hides the implementation detail of how the BDD nodes are laid out internally. BDD evaluation is internalized to the BDD as well rather than living in a separate BddEvaluator. This change provides faster evaluation, makes it possible to change the internal node data layout if necessary, and cleans up all the interactions we had with BddTrait (no need to always reach into Bdd).

We were using the wrong condition ordering in BddTrait after compiling a Bdd from the CFG, leading to a totally broken BDD. This also adds some tests and fixes, and generalizes BddTrait transforms.

This also revealed a bug in the BDD compilation process that was causing negated nodes to get added twice.

The varint encoding does help compact the binary node array, but adds maybe a bit too much decoding complexity for only a 20-30% size reduction, and most of the size comes from conditions and results.

Our previous initial ordering could produce pathological orderings if it decided to move something from very early in the CFG to very late. This is in fact what happened when I added a coalesce method: it moved an early discriminating condition to very late, which blew up the BDD from ~40K nodes to 5.1M. This taught me that we really shouldn't throw away the ordering found in the CFG, and should instead leverage it when determining the initial ordering, since it inherently gates logic and keeps related conditions together. So now the initial ordering is based on the CFG ordering and also on cone analysis (basically how many downstream nodes a node affects). We now get an initial ordering of ~3K nodes, and with the coalesce method, we can now sift S3 down to ~800 nodes instead of ~1000.

The coalesce function is added here so that we can fold bind-then-test conditions into a single condition. The current endpoints type system has strict nullability requirements, so you can't do a substring test and pass that directly into something that expects a non-null value. You have to first call the nullable function, then assign the result to a variable; the next condition is then inherently guarded and only called if the assigned value is non-null (the assignment acts as an implicit guard). The coalesce function allows us to identify these patterns and inline the test into a single condition by defaulting null to the zero value of the return type (integer=0, string="", array=[]). We only coalesce when the comparison is not to literally the zero value. When coalesce was added, it uncovered the original brittle ordering, leading to the much improved ordering in this PR.
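The bind-then-test fold can be sketched in plain Java as follows. The `substring` helper is a hypothetical stand-in for a nullable rules-engine function, and the literal `"abcd"` is made up for the example; neither is taken from a real ruleset.

```java
// Hypothetical sketch: folding a bind-then-test pair into one condition.
final class BindThenTest {
    // Stand-in for a nullable function: returns null when the range is invalid.
    static String substring(String s, int start, int stop) {
        if (s == null || start < 0 || stop > s.length() || start >= stop) {
            return null;
        }
        return s.substring(start, stop);
    }

    // Two conditions: the assignment guards the test that follows it.
    static boolean bindThenTest(String input) {
        String prefix = substring(input, 0, 4); // condition 1: bind, fails on null
        if (prefix == null) {
            return false;
        }
        return prefix.equals("abcd");           // condition 2: test the bound value
    }

    // One condition: null defaults to the zero value of the type ("").
    static boolean folded(String input) {
        String prefix = substring(input, 0, 4);
        String value = prefix != null ? prefix : "";
        // Safe only because "abcd" is not the zero value; comparing against ""
        // would make a null result pass here but fail in bindThenTest.
        return value.equals("abcd");
    }
}
```

Both forms agree on every input as long as the compared literal is not the zero value, which mirrors the restriction described above.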

Rename CfgGuidedOrdering to InitialOrdering

When we detect that results or errors are the same except for the variables used in template placeholders at the same position, we now automatically insert phi nodes using coalesce functions to reduce the number of duplicate results. For example, this removed the result duplication from S3Express SSA variable versioning entirely, down from 158 results to 121.

We can now track what version an endpointBdd trait uses and attach minimum version requirements to all syntax elements of the rules engine, including functions. The coalesce function is now marked as available since version 1.1. Next, I'll add validation to ensure the version requirements of every syntax element of an endpointRuleSet or endpointBdd meet the version of the trait.

This now ensures that syntax elements, functions, etc. all have an availableSince version that does not exceed the version declared on a rules engine trait.
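A minimal sketch of such a check, assuming versions are dotted numeric strings like "1.1" and "1.3". The class and method names are hypothetical and do not reflect the validator added in this PR.

```java
// Hypothetical sketch of an availableSince check against a trait version.
final class VersionCheckSketch {
    // Compares dotted versions numerically, so "1.10" is newer than "1.9".
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int length = Math.max(x.length, y.length);
        for (int i = 0; i < length; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // A syntax element is usable only if it was available by the trait's version.
    static boolean isAllowed(String availableSince, String traitVersion) {
        return compare(availableSince, traitVersion) <= 0;
    }
}
```

Under this sketch, a function available since 1.1 would be rejected in a trait that declares version 1.0 and accepted in one that declares 1.3.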

This commit adds support for variadic functions in the rules engine and makes coalesce variadic. This makes things like phi functions a shallow list of expressions rather than a massive list of nested binary expressions. This also makes it easier to optimize phi nodes without requiring peephole checks in compilers for optimization.
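As a rough sketch of the difference, a variadic coalesce takes a flat argument list instead of nesting binary calls. This example models an "empty" value as Java `null`, which is an assumption; the class and method names are also hypothetical.

```java
// Hypothetical sketch of a variadic coalesce over nullable values.
final class VariadicCoalesce {
    // Returns the first non-null argument, otherwise the last argument.
    @SafeVarargs
    static <T> T coalesce(T... args) {
        T last = null;
        for (T arg : args) {
            last = arg;
            if (arg != null) {
                return arg;
            }
        }
        return last;
    }
}
```

A phi over three values is then `coalesce(a, b, c)` rather than `coalesce(a, coalesce(b, c))`. Note that `Boolean.FALSE` is a present value in this sketch and is returned as-is rather than skipped.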

Trying to coalesce SSA nodes with a result phi node is causing BDD resolution issues. Removing for now.

docs/source-2.0/additional-specs/rules-engine/specification.rst

alextwoods · 2025-08-24T18:36:54Z

docs/source-2.0/additional-specs/rules-engine/standard-library.rst

+=====================
+
+Summary
+    Evaluates arguments in order and returns the first non-empty result, otherwise returns the result of the last


We should make sure that we have test cases for implementors that cover coalesce with boolean types to ensure that handling of False values is consistent.

Where would such a test be added?

I think we've generally added them under the test rule files (e.g., parse-url.smithy) as well as in internal documentation.

docs/source-2.0/additional-specs/rules-engine/standard-library.rst

mtdowling requested a review from a team as a code owner July 15, 2025 21:47

mtdowling requested a review from JordonPhillips July 15, 2025 21:47

mtdowling force-pushed the mtbdd branch 4 times, most recently from 25c0e7f to 16503fe July 16, 2025 20:37

JordonPhillips requested changes Jul 17, 2025

JordonPhillips requested changes Jul 18, 2025

...ules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/OrderConstraints.java (outdated, resolved)

...ules-engine/src/main/java/software/amazon/smithy/rulesengine/logic/bdd/OrderConstraints.java (outdated, resolved)

mtdowling force-pushed the mtbdd branch from 90beebf to e9a7616 July 18, 2025 17:41

JordonPhillips requested changes Jul 21, 2025

smithy-rules-engine/src/main/resources/META-INF/smithy/smithy.rules.smithy (resolved)

mtdowling requested a review from JordonPhillips July 21, 2025 19:56

JordonPhillips approved these changes Jul 22, 2025

JordonPhillips approved these changes Jul 23, 2025

mtdowling force-pushed the mtbdd branch 2 times, most recently from 8ed7d56 to f453664 July 28, 2025 15:29

kstich requested changes Jul 29, 2025

...e/src/main/java/software/amazon/smithy/rulesengine/language/syntax/parameters/Parameter.java (resolved)

...rules-engine/src/main/java/software/amazon/smithy/rulesengine/language/syntax/rule/Rule.java (outdated, resolved)

JordonPhillips approved these changes Jul 30, 2025

mtdowling force-pushed the mtbdd branch 3 times, most recently from c648fa5 to 219e5e1 July 31, 2025 21:30

mtdowling requested a review from kstich August 1, 2025 17:41

kstich reviewed Aug 1, 2025

mtdowling requested a review from kstich August 2, 2025 00:40

mtdowling force-pushed the mtbdd branch from 7eb40ae to 32426ea August 4, 2025 18:02

kstich approved these changes Aug 4, 2025

mtdowling force-pushed the mtbdd branch 2 times, most recently from 415fcc5 to 238d75c August 14, 2025 02:53

mtdowling added 2 commits August 20, 2025 11:44

Add separate BddFormatter

448d89f

mtdowling added 19 commits August 20, 2025 11:44

Add BDD validation, same as ruleset trait

af9cd9a

Always serialize conditions... even when empty

73b44f2

Fix BddTrait logic issue, using wrong conditions

0d8abb9

Create UniqueTable cache and cleanup BddBuilder

3428992

Simplify condition handling in CFG and BDD

493349f

Remove unused methods from ConditionReference

777203d

Remove varint encoding and address PR feedback

8afec2d

Add coalesce function

79b1cb3

Improve sifting

0dfebc8

Improve transforms

0df404a

Rename CfgGuidedOrdering to InitialOrdering

Rename bdd trait to endpointBdd

67a02ec

Simplify and document coalesce

5bbd38b

Add version validation to rules engine

7c5ee43

Use NO_MATCH rule for false nodes instead

4a8aa35

mtdowling force-pushed the mtbdd branch from fca7c6f to 9986fc8 August 20, 2025 18:11

mtdowling added 2 commits August 21, 2025 17:11

Remove result coalescing

96d73de

Add BDD trait docs

bed1c9a

alextwoods reviewed Aug 24, 2025

mtdowling added 3 commits August 25, 2025 13:45

Add documentation clarifications

bec5965

Add BDD coverage checker

e73a1e4

Add test cases that cover boolean in coalesce

2224ba8


mtdowling Aug 25, 2025


alextwoods Aug 26, 2025
