Skip to content

Conversation

@mverver-google
Copy link

Instead of increasing address every iteration of the loop, we can just keep
it fixed and double the number of bytes copied every iteration. This works
because the string to be generated is periodic.

For example, if we want to copy 9 bytes starting 2 bytes back:

   |-------| size 9
abc.........
 ^ ^
 | |
 | target_bytes_decoded: 3
 address: 1

Then the old version of the code would generate these intermediate states:

abc.........
 ^ ^
abcbc....... (1)
   ^ ^
abcbcbc..... (2)
     ^ ^
abcbcbcbc... (3)
       ^ ^
abcbcbcbcbc. (4)
         ^ ^
abcbcbcbcbcb (5, outside the loop)

While the new version would double the range to be copied every time:

abc.........
 ^ ^
abcbc....... (1)
 ^   ^
abcbcbcbc... (2)
 ^       ^
abcbcbcbcbcb (3, outside the loop)

In general, if s = size and d = (target_bytes_decoded - address), then the
number of calls to CopyBytes is reduced from (s/d) + 1 to log(s/d)/log(2) + 1.

The total time complexity is still O(s) because CopyBytes is presumably linear
in the number of bytes copied, but we end up doing fewer calls in total, which
is likely to be faster in practice, especially if s is large and d is small.

Instead of increasing `address` every iteration of the loop, we can just keep
it fixed and double the number of bytes copied every iteration. This works
because the string to be generated is periodic.

For example, if we want to copy 9 bytes starting 2 bytes back:

       |-------| size 9
    abc.........
     ^ ^
     | |
     | target_bytes_decoded: 3
     address: 1

Then the old version of the code would generate these intermediate states:

    abc.........
     ^ ^
    abcbc....... (1)
       ^ ^
    abcbcbc..... (2)
         ^ ^
    abcbcbcbc... (3)
           ^ ^
    abcbcbcbcbc. (4)
             ^ ^
    abcbcbcbcbcb (5, outside the loop)

While the new version would double the range to be copied every time:

    abc.........
     ^ ^
    abcbc....... (1)
     ^   ^
    abcbcbcbc... (2)
     ^       ^
    abcbcbcbcbcb (3, outside the loop)

In general, if s = size and d = (target_bytes_decoded - address), then the
number of calls to CopyBytes is reduced from (s/d) + 1 to log(s/d)/log(2) + 1.

The total time complexity is still O(s) because CopyBytes is presumably linear
in the number of bytes copied, but we end up doing fewer calls in total, which
is likely to be faster in practice, especially if `s` is large and `d` is small.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant