|
180 | 180 | "We have to reverse engineer a packed Python script.\n", |
181 | 181 | "The first layer is a sequence of zlib-decompression and unmarshaling;\n", |
182 | 182 | "here's a [peek][] at the code without the useless comments;\n", |
183 | | - "It first uses [resplit][] to split the input at line breaks, and then filters out all lines that contain a comment using [iffx][] within the [frame][]. The unit [sep][] re-inserts line breaks before fusing the lines at the end of that frame, and [trim][] strips all surrounding whitespace of the document.\n", |
184 | 183 | "\n", |
185 | | - "[resplit]: https://binref.github.io/#refinery.resplit\n", |
186 | | - "[iffx]: https://binref.github.io/#refinery.iffx\n", |
187 | | - "[sep]: https://binref.github.io/#refinery.sep\n", |
188 | | - "[trim]: https://binref.github.io/#refinery.trim\n", |
189 | | - "[peek]: https://binref.github.io/#refinery.peek\n", |
190 | | - "[frame]: https://binref.github.io/lib/frame.html" |
| 184 | + "[peek]: https://binref.github.io/#refinery.peek" |
191 | 185 | ] |
192 | 186 | }, |
193 | 187 | { |
|
220 | 214 | "cell_type": "markdown", |
221 | 215 | "metadata": {}, |
222 | 216 | "source": [ |
223 | | - "We use [csd][], another shortcut for [carve][], which includes both the `-s` and `-d` switches.\n", |
224 | | - "The former means to carve the single largest buffer, while `-d` instructs the unit to decode the buffer.\n", |
225 | | - "For a string literal, this means to remove the surrounding quotes and resolve any escape sequences.\n", |
| 217 | + "*(This pipeline first uses [resplit][] to split the input at line breaks, and then filters out all lines that contain a comment using [iffx][] within the [frame][]. The unit [sep][] re-inserts line breaks before fusing the lines at the end of that frame, and [trim][] strips all surrounding whitespace of the document.)*\n", |
| 218 | + "\n", |
| 219 | + "[resplit]: https://binref.github.io/#refinery.resplit\n", |
| 220 | + "[iffx]: https://binref.github.io/#refinery.iffx\n", |
| 221 | + "[sep]: https://binref.github.io/#refinery.sep\n", |
| 222 | + "[trim]: https://binref.github.io/#refinery.trim\n", |
| 223 | + "[peek]: https://binref.github.io/#refinery.peek\n", |
| 224 | + "[frame]: https://binref.github.io/lib/frame.html" |
| 225 | + ] |
| 226 | + }, |
| 227 | + { |
| 228 | + "cell_type": "markdown", |
| 229 | + "metadata": {}, |
| 230 | + "source": [ |
| 231 | + "We next use [esc][], which decodes string literals and resolves backslash escape sequences.\n", |
226 | 232 | "Then, we use [zl][] to decompress the buffer.\n", |
227 | | - "There is also a refinery unit [pym][] for unmarshaling which attempts to dump code objects with the correct magic.\n", |
| 233 | + "Finally, there is a unit [pym][] for unmarshaling which attempts to dump code objects with the correct magic.\n", |
228 | 234 | "This doesn't always work well, but after tweaking the unit during Flare On, it should definitely continue to work for this specific sample as I'm adding a regression test for it:\n", |
229 | 235 | "\n", |
230 | | - "\n", |
231 | | - "[csd]: https://binref.github.io/#refinery.csd\n", |
| 236 | + "[esc]: https://binref.github.io/#refinery.esc\n", |
232 | 237 | "[carve]: https://binref.github.io/#refinery.carve\n", |
233 | 238 | "[pym]: https://binref.github.io/#refinery.pym\n", |
234 | 239 | "[zl]: https://binref.github.io/#refinery.zl" |
|
240 | 245 | "metadata": {}, |
241 | 246 | "outputs": [], |
242 | 247 | "source": [ |
243 | | - "%emit project_chimera.py | csd str | zl | pym | dump stage1.pyc" |
| 248 | + "%emit project_chimera.py | csb str | esc | zl | pym | dump stage1.pyc" |
244 | 249 | ] |
245 | 250 | }, |
246 | 251 | { |
|
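For readers who want to see what the `esc | zl | pym` part of the pipeline does conceptually, here is a minimal sketch in plain Python. It is not how the refinery units are implemented; it only mirrors the three steps described in the markdown cell (decode the string literal, zlib-decompress, unmarshal). The function name `unpack_layer` and the tiny stand-in payload are hypothetical and exist only for illustration. Note that `marshal.loads` only understands data written by a compatible Python version, which is part of the reason a dedicated unit like [pym][] is needed for real samples.

```python
import ast
import marshal
import zlib


def unpack_layer(literal_source: str):
    """Decode a bytes literal, zlib-decompress it, and unmarshal the result.

    Hypothetical helper: a rough analogue of `esc | zl | pym`, not the
    actual refinery implementation.
    """
    # Step 1: turn the literal's source text into raw bytes, resolving
    # the surrounding quotes and any backslash escape sequences.
    raw = ast.literal_eval(literal_source)
    # Step 2: undo the zlib compression.
    decompressed = zlib.decompress(raw)
    # Step 3: unmarshal; this only works when the data was marshaled by
    # a compatible Python version.
    return marshal.loads(decompressed)


# Stand-in payload built locally so the example is self-contained; it is
# NOT the data from project_chimera.py.
code = compile("answer = 6 * 7", "<stage1>", "exec")
packed = repr(zlib.compress(marshal.dumps(code)))

stage1 = unpack_layer(packed)
namespace = {}
exec(stage1, namespace)
```

After running this, `stage1` is a code object and `namespace["answer"]` holds the value computed by the unpacked code. Never `exec` an untrusted sample like this; for a real packed script, dump the code object to disk and inspect it with a disassembler or decompiler instead.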