Releases: rocky/python-xdis
5 short of Dad's Birthday
- PR #73 from mitre:
  Allow an alternate opmap - adds the capability to disassemble python bytecode that has
  been frozen with a custom opcode mapping. This is particularly useful for disassembling
  malware that uses custom opcode mappings in an attempt to hinder disassembly with standard
  tools. The updates in this pull request are used by pydecipher, a tool to unfreeze and deobfuscate frozen python code.
- Add Python versions 3.8.8 and 3.9.2
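To illustrate what a custom opcode mapping means, here is a minimal sketch that uses only the standard library. It is not the API added in the pull request; the helper remap_opcodes() and the shuffled mapping are hypothetical, and it assumes the two-byte "wordcode" layout of Python 3.6 and later.

```python
import random


def remap_opcodes(code_bytes, mapping):
    """Return a copy of wordcode with each opcode byte sent through `mapping`."""
    out = bytearray(code_bytes)
    # In 3.6+ wordcode, even offsets hold opcodes and odd offsets hold arguments.
    for i in range(0, len(out), 2):
        out[i] = mapping[out[i]]
    return bytes(out)


def sample(a, b):
    return a + b


# Hypothetical custom mapping: a fixed permutation of all 256 byte values.
rng = random.Random(73)
custom = list(range(256))
rng.shuffle(custom)

# The inverse mapping is what a disassembler needs in order to undo the scrambling.
inverse = [0] * 256
for standard, scrambled in enumerate(custom):
    inverse[scrambled] = standard

original = sample.__code__.co_code
obfuscated = remap_opcodes(original, custom)
recovered = remap_opcodes(obfuscated, inverse)
```

Standard tools would misread the obfuscated bytes because they assume the stock opcode numbering; supplying the alternate mapping lets a disassembler interpret them correctly.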
Sam on his own
- Add Python 3.8.7
5.0.6
65
5.0.4
Get ready for release 5.0.3
- Add versions 3.8.5, 3.7.8, and 3.6.11
- Clarify changes to 3.8 ROT_FOUR
- Update 3.9 magics and opcodes
5.0.2
5.0.1
Two small improvements that are useful in the forthcoming trepan3k release:
- interpret RAISE_VARARGS's argc parameter. Some other formatting was extended too
- check_object_path() is more lenient in the path name (it doesn't have to end in .py anymore), but it is
  more stringent about what constitutes Python source (it compiles the text to determine validity)
- In the above, is_python_source() and is_bytecode_extension() are used. They are also exported.
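The idea of compiling the text to decide whether it is Python source can be sketched with the built-in compile(). This is only an illustration of the approach, not xdis's implementation; the name looks_like_python_source() is hypothetical.

```python
def looks_like_python_source(text, filename="<string>"):
    """Return True if `text` compiles as a Python module, False otherwise."""
    try:
        compile(text, filename, "exec")
    except (SyntaxError, ValueError):
        # SyntaxError: not valid Python for the running interpreter.
        # ValueError: raised by some versions for text containing null bytes.
        return False
    return True


ok = looks_like_python_source("x = 1\n")
bad = looks_like_python_source("x = = 1\n")
```

Note that a check like this judges validity against the grammar of the interpreter running it, so source written for a different Python version may be rejected.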
5.0.0
Disassembly format and options have been simplified and improved.
I had this "Aha!" moment working on the cross-version interpreter x-python. It can show a better disassembly because it has materialized stack entries.
So for example when a COMPARE_OP instruction is run it can show what operands are getting compared.
It was then that I realized that this is also true much of the time statically. For example you'll often find a LOAD_CONST instruction before a RETURN_VALUE, and when you do, we can show exactly what is getting returned. Although cute, the place where something like this is most appreciated and needed is in calling functions such as via CALL_FUNCTION. The situation here is that the name of the function is on the stack and it can be several instructions back depending on the number of parameters. However in a large number of cases, by tracking use of stack effects (added in a previous release), we can often locate the LOAD_CONST of that function name.
Note though that we don't attempt to work across basic blocks to track down information. Nor do we even attempt to recreate expression trees. We don't track across a call that has a parameter which is the return value of another call. Still, I find this all very useful.
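The stack-effect idea described above can be sketched on a hand-written instruction list. This is a simplified illustration, not xdis's code: the Instr tuple, pops_pushes(), and find_callee() are hypothetical names, only a handful of opcodes are modeled, and the walk assumes straight-line code within one basic block.

```python
from collections import namedtuple

Instr = namedtuple("Instr", "opname arg argrepr")


def pops_pushes(instr):
    """Simplified (pops, pushes) pair for the few opcodes modeled here."""
    if instr.opname.startswith("LOAD_"):
        return 0, 1
    if instr.opname == "BINARY_ADD":
        return 2, 1
    if instr.opname == "CALL_FUNCTION":
        return instr.arg + 1, 1  # pops the arguments plus the callee
    return None  # unknown opcode: the caller gives up


def find_callee(instrs, call_index):
    """Walk backwards from a CALL_FUNCTION to the instruction that pushed the callee."""
    need = instrs[call_index].arg  # stack entries sitting above the callee
    for j in range(call_index - 1, -1, -1):
        effect = pops_pushes(instrs[j])
        if effect is None:
            return None
        pops, pushes = effect
        if need < pushes:
            return instrs[j]  # this instruction produced the slot we want
        need = need - pushes + pops
    return None


# Instructions for the call f(a + b, c)
program = [
    Instr("LOAD_GLOBAL", 0, "f"),
    Instr("LOAD_FAST", 0, "a"),
    Instr("LOAD_FAST", 1, "b"),
    Instr("BINARY_ADD", None, ""),
    Instr("LOAD_FAST", 2, "c"),
    Instr("CALL_FUNCTION", 2, ""),
]
callee = find_callee(program, 5)
```

Here callee is the LOAD_GLOBAL of f, which is the name an extended listing could print next to the CALL_FUNCTION.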
This is not shown by default though. Instead we use a mode called "classic". To get the extended output, in pydisasm use --format extended or --format extended-bytes.
And that brings up a second change in formatting. Before, we had separate flags and command-line options for whether to show just the header, and whether to include bytecode ops in the output. Now there is just a single parameter called asm_format, and a single choice option --format (short option -F).
As a result this release is incompatible with prior releases, hence the version bump.
A slight change was made in "classic" output. Before we had shown the index into some code table, like co_consts or co_varnames. That no longer appears. If you want that information select either the bytes or extended-bytes formats.
A bug was fixed in all offsets in the recently-added xdis.lineoffsets module.
Fleetwood66
Routines for extracting line and offset information from code objects were added.
Specifically in module xdis.lineoffsets:
* classes: LineOffsetInfo, LineOffsets, and LineOffsetsCompact
* functions: lineoffsets_in_file(), lineoffsets_in_module()
This is needed to better support debugging, which is done via module pyficache.
In the future, I intend to make use of this to disambiguate which offset to break at when there are several for a line. Or to indicate better which function or module the line is located in when reporting lines.
For example in:
    z = lambda x, y: x + y
there are two offsets associated with that line. The first is to the assignment of z while the second is to the addition expression inside the lambda.
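The two locations can be seen with the standard library alone. The sketch below does not use the xdis.lineoffsets API; offsets_by_line() is a hypothetical helper built on dis.findlinestarts() that also descends into nested code objects.

```python
import dis
import types


def offsets_by_line(code, result=None):
    """Map line number -> list of (code object name, starting offset)."""
    if result is None:
        result = {}
    for offset, lineno in dis.findlinestarts(code):
        if lineno is not None:
            result.setdefault(lineno, []).append((code.co_name, offset))
    # Nested code objects (lambdas, functions, classes) live in co_consts.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            offsets_by_line(const, result)
    return result


code = compile("z = lambda x, y: x + y\n", "<example>", "exec")
table = offsets_by_line(code)
```

For line 1, table holds one entry from the module's code object and one from the lambda's, which is the ambiguity a debugger has to resolve when asked to stop at that line. The exact offsets vary between Python versions.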
In other news, a long-standing bug was fixed to handle bytestring constants in 3.x. We had been erroneously converting bytestrings into strings in 3.x. However when decompiling 1.x or 2.x bytecode from 3.x we still need to convert bytestrings into strings.
Also, operand formatting in assembly for BUILD_MAP_UNPACK_WITH_CALL has been improved, and
we note how the operand encoding has changed between 3.5 and 3.6.
Disassembly now properly marks offsets where the line number doesn't change from the previous entry.