toke.c: don't call libc's memcmp() to test 1 byte in Perl_scan_str() #23533
+20
−11
delim_byte_len is almost always 1, and open_delim_str is almost always '"' or '\'' or something similar. I'm not sure which exact string of PP code will make delim_byte_len something other than 1. That case is too rare to optimize for, but it must still be supported.
Just test the char directly if its length is 1. Invoking libc memcmp() requires 4 ABI inputs on any CPU. Most of the code paths above the memEQ() lines set constants directly inside Perl_scan_str(), but one branch uses "utf8_to_uv_or_die(,,&delim_byte_len)", which optimizes to Perl_utf8_to_uvchr_buf_helper(,,,&delim_byte_len). That makes the value in STRLEN delim_byte_len unbounded as far as any CC can tell: every CC must assume the value Perl_utf8_to_uvchr_buf_helper() put inside delim_byte_len could be as large as a 4.7GB DVD or 25GB BD .iso file.
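A minimal sketch of the one-byte fast path described above, in plain C rather than the real toke.c code. The helper name delim_matches() and its parameters are hypothetical and chosen only for illustration; the actual change operates on the variables inside Perl_scan_str().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Perl source.
 * Returns true if the bytes at s start with the delimiter. */
static bool
delim_matches(const char *s, const char *delim, size_t delim_len)
{
    assert(delim_len > 0);

    /* Common case: a single-byte delimiter such as '"'.
     * One byte compare, no libc call, no argument setup. */
    if (delim_len == 1)
        return *s == *delim;

    /* Rare case: multi-byte delimiter. Still supported, just not
     * optimized; fall back to the general comparison. */
    return memcmp(s, delim, delim_len) == 0;
}
```

The multi-byte branch keeps the original behavior, so the rare case costs one extra predictable branch and nothing more.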
Put the retval of SvGROW() to use.
Don't let the C auto var delim_byte_len escape via the "&" op through utf8_to_uv_or_die(). If it escapes, no CC can ever keep delim_byte_len in a register again, and it must be reread from the C stack after every possible call.
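The escape problem can be sketched in plain C as follows. All names here (decode_first(), count_delims(), tmp_len) are hypothetical stand-ins, not the real Perl functions. The sketch assumes the decoder lives in another translation unit in real code, so the compiler cannot see what it does with the pointer; it is defined locally here only so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a decoder that reports a byte length
 * through an out-pointer. This toy version always reports 1 byte. */
static unsigned long
decode_first(const unsigned char *s, size_t *len_out)
{
    assert(len_out != NULL);
    *len_out = 1;
    return s[0];
}

/* Counts how many delim_len-sized steps through s begin with the
 * delimiter taken from the first byte of s. */
static size_t
count_delims(const unsigned char *s, size_t n)
{
    size_t tmp_len;                       /* only this temporary has its address taken */
    unsigned long delim = decode_first(s, &tmp_len);

    /* Copy into a local whose address is never taken. The compiler
     * is free to keep delim_len in a register across later calls. */
    const size_t delim_len = tmp_len;

    size_t count = 0;
    for (size_t i = 0; i + delim_len <= n; i += delim_len) {
        if (s[i] == (unsigned char)delim)
            count++;
    }
    return count;
}
```

The design choice is to confine the "&" to a short-lived temporary, so the long-lived variable used in the hot loop stays out of memory as far as the compiler's alias analysis is concerned.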