-
-
Notifications
You must be signed in to change notification settings - Fork 87
Description
Hi all,
I semi-recently came across an oddity to do with PHP 8.4, which brought a larger issue to my attention. It's part of the reason why I haven't released PHPCS 4.0 yet, as I'd like to make sure we get this right before the final release.
Setting the scene
PHP 8.0 changed the tokenization of namespaced names from multi-token to single-token.
For PHPCS 3.x, this was "undone" as of PHPCS 3.5.7, so sniff authors did not have to deal with this breaking change.
For PHPCS 4.x, the new tokenization will be adopted.
So far, so good and in 98% of all cases, this is fine.
The problem
The problem I discovered is related to the PHP 8.4 "exit as a function call" change in PHP.
The exit
/die
keywords are normally tokenized as T_EXIT
.
However, as the keywords can now be used as "function calls", they can be used in fully qualified form (when used as function call, not when used as a "constant")..., i.e. \exit()
and \die()
. Demo: https://3v4l.org/uebpB
In that case, PHP no longer tokenizes the keyword as T_EXIT
on PHP 8.0+, but as T_NAME_FULLY_QUALIFIED
.
So, the PHP native tokenization is as follows:
PHP < 8.0 | PHP 8.0+ | |
---|---|---|
exit() |
T_EXIT |
T_EXIT |
die() |
T_EXIT |
T_EXIT |
\exit() |
T_NS_SEPARATOR + T_EXIT |
T_NAME_FULLY_QUALIFIED |
\die() |
T_NS_SEPARATOR + T_EXIT |
T_NAME_FULLY_QUALIFIED |
Demo: https://3v4l.org/N4cMI
In PHPCS, the tokenization - at this time - is as follows (on all supported PHP versions):
PHPCS 3.x | PHPCS 4.x | |
---|---|---|
exit() |
T_EXIT |
T_EXIT |
die() |
T_EXIT |
T_EXIT |
\exit() |
T_NS_SEPARATOR + T_STRING |
T_NAME_FULLY_QUALIFIED |
\die() |
T_NS_SEPARATOR + T_STRING |
T_NAME_FULLY_QUALIFIED |
In my opinion, the tokenization in PHPCS 3.x is incorrect and should be changed to T_NS_SEPARATOR
+ T_EXIT
.
And while the tokenization for PHPCS 4.x is consistent with PHP, for a specialized token like T_EXIT
, it also feels counter-intuitive and would mean that any sniff/utility function which handles T_EXIT
in any way, would also need to check T_NAME_FULLY_QUALIFIED
+ the token content to see if that happens to be an FQN exit/die.
That includes PHPCS itself and means that something like the T_EXIT
being a parenthesis owner in PHPCS 4.x change is currently broken for FQN exit/die.
Not an isolated problem
While thinking this over, I realized that this is not a problem isolated to exit
/die
, but that it also applies to true
/false
/null
when used as fully qualified constants.
I'd previously already realized this was the case for PHPCS 4.x and made the necessary sniff updates for PHPCS, but what I didn't fully realize at the time, is that PHPCS 3.x tokenization for FQN true
/false
/null
is broken in the same way as FQN exit
/die
.
PHPCS 3.x | PHPCS 4.x | |
---|---|---|
true |
T_TRUE |
T_TRUE |
false |
T_FALSE |
T_FALSE |
null |
T_NULL |
T_NULL |
\true |
T_NS_SEPARATOR + T_STRING |
T_NAME_FULLY_QUALIFIED |
\false |
T_NS_SEPARATOR + T_STRING |
T_NAME_FULLY_QUALIFIED |
\null |
T_NS_SEPARATOR + T_STRING |
T_NAME_FULLY_QUALIFIED |
Note: this problem only applies to these five keywords (exit
/die
/true
/false
/null
). Other keywords cannot be used as "fully qualified" unless they aren't being used as the keyword (which is what the PHP 8.0 tokenizer change was all about).
Proposed solution
While FQN true
/false
/null
is still something which can be worked around - though it does raise the cognitive load for sniff developers -, FQN exit
/die
becomes more problematic to work around.
So, having thought this over, I'm now inclined to change the tokenization for these specific cases as follows:
PHPCS 3.x (3.13.3+) | PHPCS 4.0+ | |
---|---|---|
\exit() |
T_NS_SEPARATOR + T_EXIT |
T_EXIT (content: \exit ) |
\die() |
T_NS_SEPARATOR + T_EXIT |
T_EXIT (content: \die ) |
\true |
T_NS_SEPARATOR + T_TRUE |
T_TRUE (content: \true ) |
\false |
T_NS_SEPARATOR + T_FALSE |
T_FALSE (content: \false ) |
\null |
T_NS_SEPARATOR + T_NULL |
T_NULL (content: \null ) |
The upside of this would be that sniffs which handle T_EXIT
/T_TRUE
/T_FALSE
/T_NULL
tokens should function correctly, even when these keywords are used as fully qualified names.
Fewer sniff changes would be needed for PHPCS 4.0 for those sniffs and it most likely fixes some false positives/negatives for sniffs on PHPCS 3.x.
The downside - though I expect that to be rarely problematic - would be that sniffs which examine the 'content'
index for these tokens will need to take into account that the content could have a leading backslash.
Opinions
So, time for some opinions....
- Any concerns or objections to the above proposal ?
- When I'd make these changes (soonish), should there be a new 4.0.0-RC(2) release before the final PHPCS 4.0 release ?
/cc @asispts @dingo-d @fredden @GaryJones @greg0ire @kukulich @michalbundyra @Ocramius @sirbrillig @stronk7 @weierophinney @wimg