|
| 1 | +# Defining Allocation Sites |
| 2 | + |
| 3 | +Boomerang provides an interface that allows the definition of individual allocation sites. An allocation site is a value that should be considered as a points-to object. |
| 4 | + |
| 5 | + |
| 6 | +## Allocation Site Interface |
| 7 | + |
| 8 | +To define an individual allocation site, we have to implement the `IAllocationSite` interface and override its method `getAllocationSite(...)`, which returns an optional `AllocVal`. |
| 9 | +An `AllocVal` represents an allocation site and acts as a wrapper for the allocation site statement and value. |
| 10 | +If the optional is present, the `AllocVal` is added to the resulting allocation sites. |
| 11 | + |
| 12 | +When performing a backward analysis, Boomerang calls this method on each statement on each data-flow path. |
| 13 | +It provides three parameters to the method `getAllocationSite`: |
| 14 | + |
| 15 | +- Method: The current method |
| 16 | +- Statement: The current statement that may contain an allocation site |
| 17 | +- Val: The current propagated data-flow fact |
| 18 | + |
| 19 | +These parameters necessitate two checks that should be part of each allocation site implementation: |
| 20 | + |
| 21 | +- Check whether the statement is an assignment |
| 22 | +- Check whether the left operand of the assignment is equal to the propagated data-flow fact |
| 23 | + |
| 24 | +The first check is needed because an allocation site is defined as an assignment. |
| 25 | +The second check avoids returning statements that do not matter for the points-to analysis. |
| 26 | +Boomerang propagates only data-flow facts that are relevant to or alias with the query variable. |
| 27 | +Therefore, one can exclude irrelevant assignments with the second check. |
| 28 | + |
| 29 | +To this end, a self-defined allocation site should have at least the following code: |
| 30 | + |
| 31 | +```java |
| 32 | +public class ExtendedAllocationSite implements IAllocationSite { |
| 33 | + |
| 34 | + @Override |
| 35 | + public Optional<AllocVal> getAllocationSite(Method method, Statement statement, Val fact) { |
| 36 | + // Check for assignments |
| 37 | + if (!statement.isAssignStmt()) { |
| 38 | + return Optional.empty(); |
| 39 | + } |
| 40 | + |
| 41 | + Val leftOp = statement.getLeftOp(); |
| 42 | + Val rightOp = statement.getRightOp(); |
| 43 | + // Check for correct data-flow fact |
| 44 | + if (!leftOp.equals(fact)) { |
| 45 | + return Optional.empty(); |
| 46 | + } |
| 47 | + |
| 48 | + // rightOp is a potential allocation site |
| 49 | + ... |
| 50 | + } |
| 51 | +} |
| 52 | +``` |
| 53 | + |
| 54 | +Last, to use our self-defined allocation site, we need to add it to the options: |
| 55 | + |
| 56 | +```java |
| 57 | +BoomerangOptions options = |
| 58 | + BoomerangOptions.builder() |
| 59 | + .withAllocationSite(new ExtendedAllocationSite()) |
| 60 | + ... |
| 61 | + .build(); |
| 62 | +``` |
| 63 | + |
| 64 | +## Simple Allocation Site |
| 65 | + |
| 66 | +To show what an implementation of the `IAllocationSite` interface may look like, we consider the following simple example: |
| 67 | + |
| 68 | +Assume our program requires *constants* and *new expressions* as allocation sites. |
| 69 | +Then, the interface implementation may look like this: |
| 70 | + |
| 71 | +```java |
| 72 | +public class SimpleAllocationSite implements IAllocationSite { |
| 73 | + |
| 74 | + @Override |
| 75 | + public Optional<AllocVal> getAllocationSite(Method method, Statement statement, Val fact) { |
| 76 | + // Check for assignments |
| 77 | + if (!statement.isAssignStmt()) { |
| 78 | + return Optional.empty(); |
| 79 | + } |
| 80 | + |
| 81 | + Val leftOp = statement.getLeftOp(); |
| 82 | + Val rightOp = statement.getRightOp(); |
| 83 | + // Check for correct data-flow fact |
| 84 | + if (!leftOp.equals(fact)) { |
| 85 | + return Optional.empty(); |
| 86 | + } |
| 87 | + |
| 88 | + // Constant allocation sites: var = <constant> |
| 89 | + if (rightOp.isConstant()) { |
| 90 | + AllocVal allocVal = new AllocVal(leftOp, statement, rightOp); |
| 91 | + return Optional.of(allocVal); |
| 92 | + } |
| 93 | + |
| 94 | + // New expressions: var = new java.lang.Object |
| 95 | + if (rightOp.isNewExpr()) { |
| 96 | + AllocVal allocVal = new AllocVal(leftOp, statement, rightOp); |
| 97 | + return Optional.of(allocVal); |
| 98 | + } |
| 99 | + |
| 100 | + return Optional.empty(); |
| 101 | + } |
| 102 | +} |
| 103 | +``` |
| 104 | + |
| 105 | +Using this allocation site implementation, Boomerang returns values that are either *new expressions* (e.g. `new java.lang.Object`) or *constants* (e.g. `int` or `String` constants). |
| 106 | + |
| 107 | +## Allocation Site with DataFlowScope |
| 108 | + |
| 109 | +In many cases, we are interested in finding an allocation site to analyze it. |
| 110 | +However, Boomerang commonly fails to find an allocation site when a data-flow path ends at a call to a function that is not part of the application. |
| 111 | +For example, using the `SimpleAllocationSite` from the previous section, Boomerang would not find an allocation site in the following program: |
| 112 | + |
| 113 | +```java |
| 114 | +String s = System.getProperty("property"); // Most precise allocation site |
| 115 | +... |
| 116 | +queryFor(s); |
| 117 | +``` |
| 118 | + |
| 119 | +Boomerang does not compute an allocation site because `System.getProperty("property")` is not a *constant* or a *new expression*. |
| 120 | +Additionally, we may be interested in analyzing only our own application, that is, we do not load the JDK class `java.lang.System` and exclude it in the `DataFlowScope`. |
| 121 | +In this case, Boomerang returns an empty result set because the data-flow path ends at the call `System.getProperty("property")`. |
| 122 | + |
| 123 | +To cover these scenarios, we can include the `DataFlowScope` in the allocation site implementation. |
| 124 | +For example, we can extend the [DefaultAllocationSite](https://github.com/secure-software-engineering/Boomerang/blob/develop/boomerangPDS/src/main/java/boomerang/options/DefaultAllocationSite.java) as follows: |
| 125 | + |
| 126 | +```java |
| 127 | +public class ExtendedDataFlowScope extends DefaultAllocationSite { |
| 128 | + |
| 129 | + private final DataFlowScope dataFlowScope; |
| 130 | + |
| 131 | + public ExtendedDataFlowScope(DataFlowScope dataFlowScope) { |
| 132 | + this.dataFlowScope = dataFlowScope; |
| 133 | + } |
| 134 | + |
| 135 | + @Override |
| 136 | + public Optional<AllocVal> getAllocationSite(Method method, Statement statement, Val fact) { |
| 137 | + // Check for assignments |
| 138 | + if (!statement.isAssignStmt()) { |
| 139 | + return Optional.empty(); |
| 140 | + } |
| 141 | + |
| 142 | + Val leftOp = statement.getLeftOp(); |
| 143 | + Val rightOp = statement.getRightOp(); |
| 144 | + // Check for correct data-flow fact |
| 145 | + if (!leftOp.equals(fact)) { |
| 146 | + return Optional.empty(); |
| 147 | + } |
| 148 | + |
| 149 | + // Check for function calls that would end the data-flow path |
| 150 | + // If the function call is not excluded, Boomerang can continue with the analysis |
| 151 | + if (statement.containsInvokeExpr()) { |
| 152 | + InvokeExpr invokeExpr = statement.getInvokeExpr(); |
| 153 | + DeclaredMethod declaredMethod = invokeExpr.getDeclaredMethod(); |
| 154 | + |
| 155 | + if (dataFlowScope.isExcluded(declaredMethod)) { |
| 156 | + // rightOp is the invoke expression |
| 157 | + AllocVal allocVal = new AllocVal(leftOp, statement, rightOp); |
| 158 | + return Optional.of(allocVal); |
| 159 | + } |
| 160 | + } |
| 161 | + |
| 162 | + // If the statement does not contain a call to an excluded method, we continue with the default behavior |
| 163 | + return super.getAllocationSite(method, statement, fact); |
| 164 | + } |
| 165 | +} |
| 166 | +``` |
| 167 | + |
| 168 | +With this implementation, we cover function calls that would end the analysis, and we can conclude that the allocation site cannot be computed precisely. |
| 169 | +For example, having `System.getProperty("property")` as the allocation site indicates that the query variable points to some object that depends on system properties at runtime. |
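
To make the order of the checks in `ExtendedDataFlowScope` easier to follow, the decision logic can be tried out without Boomerang on the classpath. The following is only a minimal sketch: `SketchStatement` and `AllocationSiteSketch` are hypothetical stand-ins and not Boomerang classes, statements and facts are modeled as plain strings, and the fallback that accepts only `new` expressions is a simplification rather than the actual behavior of `DefaultAllocationSite`.

```java
import java.util.Collections;
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for a statement; NOT a Boomerang class. */
final class SketchStatement {
  final boolean isAssign;    // true if the statement is an assignment
  final String leftOp;       // variable that is defined
  final String rightOp;      // textual right-hand side
  final String calledClass;  // declaring class of the callee, or null if there is no call

  SketchStatement(boolean isAssign, String leftOp, String rightOp, String calledClass) {
    this.isAssign = isAssign;
    this.leftOp = leftOp;
    this.rightOp = rightOp;
    this.calledClass = calledClass;
  }
}

/** Mirrors the order of checks in ExtendedDataFlowScope using the stand-in type above. */
final class AllocationSiteSketch {
  private final Set<String> excludedClasses; // plays the role of the DataFlowScope

  AllocationSiteSketch(Set<String> excludedClasses) {
    this.excludedClasses = excludedClasses;
  }

  /** Returns the right-hand side if the statement counts as an allocation site for the fact. */
  Optional<String> getAllocationSite(SketchStatement stmt, String fact) {
    // Check 1: only assignments can be allocation sites
    if (!stmt.isAssign) {
      return Optional.empty();
    }
    // Check 2: the assignment has to define the propagated data-flow fact
    if (!stmt.leftOp.equals(fact)) {
      return Optional.empty();
    }
    // A call into an excluded class ends the data-flow path, so the call itself is reported
    if (stmt.calledClass != null && excludedClasses.contains(stmt.calledClass)) {
      return Optional.of(stmt.rightOp);
    }
    // Simplified fallback (assumption): accept only new expressions
    if (stmt.rightOp.startsWith("new ")) {
      return Optional.of(stmt.rightOp);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    AllocationSiteSketch sketch =
        new AllocationSiteSketch(Collections.singleton("java.lang.System"));
    SketchStatement call =
        new SketchStatement(true, "s", "System.getProperty(\"property\")", "java.lang.System");

    // The fact matches the left operand and the callee is excluded: the call is reported
    System.out.println(sketch.getAllocationSite(call, "s"));
    // The fact does not match the left operand: the statement is skipped
    System.out.println(sketch.getAllocationSite(call, "t"));
  }
}
```

In this sketch, the fact check runs before the exclusion check, so an excluded call is only reported when it defines the variable that is currently propagated. This matches the order used in `ExtendedDataFlowScope` above.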