@demurgos demurgos commented Feb 21, 2020

This commit updates the AVM1 libraries to version `0.10`.
With this version, the general structure of these libraries should be complete.

Initially, AVM1 support was included in the SWF libraries (`swf-types`, `swf-parser`). It used the model defined by Adobe's SWF spec, where you can parse actions into a vec by simply reading them sequentially. It quickly became apparent that this model was too simple and did not reflect how the player interprets bytecode. Adobe's interpreter supports jumps to arbitrary offsets: it treats the bytecode as an opaque buffer and reads only one action at a time. Because of this, parsing AVM1 actions ahead of time became much more complicated, and AVM1 support was moved to its own libraries.

At first, the AVM1 libraries only supported reading a single low-level action at a time. This was the only API provided by the version used by Flashback until now, and Flashback used it to read the actions sequentially (as in Adobe's spec). The issue with this version was that you were on your own to perform static analysis of AVM1 bytecode. It also meant that control-flow actions such as `Jump` or `If` were tightly coupled to the actual encoding because they included byte offsets. The subsequent versions of the AVM1 libraries focused on bringing back support for static analysis of AVM1 bytecode.
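To illustrate why byte offsets tie control flow to the encoding, here is a minimal sketch of reading one action at a time from an opaque buffer. It is not the API of the AVM1 libraries: `RawAction`, `read_action` and `jump_target` are hypothetical names, and the layout (opcodes `0x80` and above carry a little-endian `u16` length, `0x99` is `Jump`, `0x9D` is `If`, offsets are relative to the end of the action) is my reading of Adobe's SWF spec and should be treated as an assumption.

```rust
/// Hypothetical, simplified view of a raw action (not the real library type).
#[derive(Debug, PartialEq)]
pub enum RawAction {
    End,
    Jump { offset: i16 },
    If { offset: i16 },
    Other { code: u8, payload_len: usize },
}

/// Reads a single action starting at byte `pos`.
/// Returns the action and the byte position right after it,
/// or `None` if the buffer is too short.
pub fn read_action(code: &[u8], pos: usize) -> Option<(RawAction, usize)> {
    let op = *code.get(pos)?;
    if op < 0x80 {
        // Assumed layout: short actions are a single opcode byte.
        let action = if op == 0x00 {
            RawAction::End
        } else {
            RawAction::Other { code: op, payload_len: 0 }
        };
        return Some((action, pos + 1));
    }
    // Assumed layout: long actions carry a little-endian u16 payload length.
    let len = u16::from_le_bytes([*code.get(pos + 1)?, *code.get(pos + 2)?]) as usize;
    let start = pos + 3;
    let end = start.checked_add(len)?;
    let payload = code.get(start..end)?;
    let action = match (op, payload) {
        (0x99, [lo, hi]) => RawAction::Jump { offset: i16::from_le_bytes([*lo, *hi]) },
        (0x9D, [lo, hi]) => RawAction::If { offset: i16::from_le_bytes([*lo, *hi]) },
        _ => RawAction::Other { code: op, payload_len: len },
    };
    Some((action, end))
}

/// Resolves a jump: the offset is applied to the position after the jump action.
/// Returns `None` if the target would fall before the start of the buffer.
pub fn jump_target(next: usize, offset: i16) -> Option<usize> {
    let target = next as i64 + i64::from(offset);
    if target < 0 { None } else { Some(target as usize) }
}
```

With the buffer `[0x99, 0x02, 0x00, 0x01, 0x00, 0x26, 0x00]`, a sequential reader visits the action at byte 5 after the jump, while an interpreter following the jump skips it and resumes at byte 6. Any re-encoding that changes action sizes would invalidate the stored offset.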

The latest versions solve this by providing two "modes" to view actions: `raw` or `cfg`. The raw mode corresponds to how the interpreter reads bytecode: a single action at a time, using byte offsets for control flow. The Control Flow Graph (CFG) mode represents the code as a graph where nodes correspond to linear sections of code (where you can safely advance through the sequence of actions) and edges are jumps in the code. The graph itself is represented as a non-empty vector of blocks. Each block has a unique label, a list of simple actions (with no impact on control flow) and a flow action. The flow action describes the outgoing edges and how they are chosen. The target of a jump is identified by its label; the value `None` means the end of the current function. The two main variants are `CfgFlow::Simple` for unconditional jumps and `CfgFlow::If` for jumps based on the truthiness of the top of the stack.
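The block structure described above could be modeled roughly as follows. This is a sketch for illustration only: apart from the variant names `CfgFlow::Simple` and `CfgFlow::If`, every type, field and function name here (`Cfg`, `CfgBlock`, `head`, `tail`, `next`, `true_target`, `false_target`, `successors`, `example_cfg`) is hypothetical and may differ from the actual AVM1 libraries.

```rust
/// Hypothetical label type; blocks are identified by a unique label.
pub type CfgLabel = String;

/// Stand-in for simple actions, which have no impact on control flow.
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Push(i32),
    Trace,
}

/// Outgoing edges of a block. A `None` target means "end of the current function".
#[derive(Debug, Clone, PartialEq)]
pub enum CfgFlow {
    /// Unconditional jump.
    Simple { next: Option<CfgLabel> },
    /// Jump chosen by the truthiness of the top of the stack.
    If { true_target: Option<CfgLabel>, false_target: Option<CfgLabel> },
}

#[derive(Debug, Clone, PartialEq)]
pub struct CfgBlock {
    pub label: CfgLabel,
    pub actions: Vec<Action>,
    pub flow: CfgFlow,
}

/// Non-empty list of blocks, modeled here as a head block plus a tail.
#[derive(Debug, Clone, PartialEq)]
pub struct Cfg {
    pub head: CfgBlock,
    pub tail: Vec<CfgBlock>,
}

impl Cfg {
    /// Iterates over all blocks, starting with the entry block.
    pub fn blocks(&self) -> impl Iterator<Item = &CfgBlock> {
        std::iter::once(&self.head).chain(self.tail.iter())
    }

    /// Looks up a block by label.
    pub fn find(&self, label: &str) -> Option<&CfgBlock> {
        self.blocks().find(|b| b.label == label)
    }
}

/// Lists the targets of the outgoing edges of a block.
pub fn successors(block: &CfgBlock) -> Vec<Option<&CfgLabel>> {
    match &block.flow {
        CfgFlow::Simple { next } => vec![next.as_ref()],
        CfgFlow::If { true_target, false_target } => {
            vec![true_target.as_ref(), false_target.as_ref()]
        }
    }
}

/// Builds a small graph: an entry block branching to two blocks that both end the function.
pub fn example_cfg() -> Cfg {
    let end = CfgFlow::Simple { next: None };
    Cfg {
        head: CfgBlock {
            label: "entry".to_string(),
            actions: vec![Action::Push(1)],
            flow: CfgFlow::If {
                true_target: Some("then".to_string()),
                false_target: Some("else".to_string()),
            },
        },
        tail: vec![
            CfgBlock { label: "then".to_string(), actions: vec![Action::Trace], flow: end.clone() },
            CfgBlock {
                label: "else".to_string(),
                actions: vec![Action::Push(2), Action::Trace],
                flow: end,
            },
        ],
    }
}
```

Because edges refer to labels instead of byte offsets, a consumer can walk or rewrite the blocks without caring how the actions are encoded.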

In the case of Flashback, AVM1 support was minimal. Thanks to this, updating the AVM1 libraries to their latest version was fairly easy. For AVM1 bytecode that does not use any form of control flow (`Jump`, `If`, `WaitForFrame`, `Throw`, etc.), the behavior should be the same. For AVM1 bytecode _with_ control flow, this commit introduces a difference. The previous implementation ignored any form of control flow and just ran everything (e.g. both branches of an `If` were executed). The new CFG representation forces consumers to handle it properly. In this commit, the compiler simply stops when it hits its first control-flow action. Support for control flow may require larger changes that are, in my opinion, best left for a future commit.
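The "stop at the first control-flow action" behavior can be sketched as below. This is not Flashback's compiler: the types are simplified stand-ins, and `compile_until_control_flow`, `Compiled` and the `complete` flag are hypothetical names. The sketch also assumes that an unconditional jump to `None` (end of function) does not count as dropped control flow.

```rust
/// Simplified stand-ins for the CFG types (hypothetical field names).
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Push(i32),
    Trace,
}

#[derive(Debug, Clone, PartialEq)]
pub enum CfgFlow {
    Simple { next: Option<String> },
    If { true_target: Option<String>, false_target: Option<String> },
}

#[derive(Debug, Clone, PartialEq)]
pub struct CfgBlock {
    pub label: String,
    pub actions: Vec<Action>,
    pub flow: CfgFlow,
}

/// Output of the toy compiler: emitted operations, plus whether the whole
/// function was covered or compilation stopped early at a control-flow action.
#[derive(Debug, PartialEq)]
pub struct Compiled {
    pub ops: Vec<String>,
    pub complete: bool,
}

/// Emits the simple actions of the entry block, then stops at its flow action.
/// Only a plain "end of function" flow is treated as fully handled.
pub fn compile_until_control_flow(entry: &CfgBlock) -> Compiled {
    let ops: Vec<String> = entry
        .actions
        .iter()
        .map(|action| match action {
            Action::Push(value) => format!("push {}", value),
            Action::Trace => "trace".to_string(),
        })
        .collect();
    let complete = matches!(entry.flow, CfgFlow::Simple { next: None });
    Compiled { ops, complete }
}
```

Straight-line code compiles as before, while a block ending in `CfgFlow::If` yields only the actions before the branch, with `complete` set to `false` so the caller can tell that something was skipped.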

@demurgos
Contributor Author

The CI failure on nightly seems unrelated to the changes in this commit.

@demurgos
Contributor Author

I triggered CI again now that the fixed nightly is published: it's green.
