Skip to content

Proposal to fix Tokenizer issue with FQN exit/die/true/false/null #1201

@jrfnl

Description

@jrfnl

Hi all,

I semi-recently came across an oddity to do with PHP 8.4, which brought a larger issue to my attention. It's part of the reason why I haven't released PHPCS 4.0 yet, as I'd like to make sure we get this right before the final release.

Setting the scene

PHP 8.0 changed the tokenization of namespaced names from multi-token to single-token.

For PHPCS 3.x, this was "undone" as of PHPCS 3.5.7, so sniff authors did not have to deal with this breaking change.
For PHPCS 4.x, the new tokenization will be adopted.

So far, so good and in 98% of all cases, this is fine.

The problem

The problem I discovered is related to the PHP 8.4 "exit as a function call" change in PHP.

The exit/die keywords are normally tokenized as T_EXIT.
However, as the keywords can now be used as "function calls", they can be used in fully qualified form (when used as function call, not when used as a "constant")..., i.e. \exit() and \die(). Demo: https://3v4l.org/uebpB
In that case, PHP no longer tokenizes the keyword as T_EXIT on PHP 8.0+, but as T_NAME_FULLY_QUALIFIED.

So, the PHP native tokenization is as follows:

PHP < 8.0 PHP 8.0+
exit() T_EXIT T_EXIT
die() T_EXIT T_EXIT
\exit() T_NS_SEPARATOR + T_EXIT T_NAME_FULLY_QUALIFIED
\die() T_NS_SEPARATOR + T_EXIT T_NAME_FULLY_QUALIFIED

Demo: https://3v4l.org/N4cMI

In PHPCS, the tokenization - at this time - is as follows (on all supported PHP versions):

PHPCS 3.x PHPCS 4.x
exit() T_EXIT T_EXIT
die() T_EXIT T_EXIT
\exit() T_NS_SEPARATOR + T_STRING T_NAME_FULLY_QUALIFIED
\die() T_NS_SEPARATOR + T_STRING T_NAME_FULLY_QUALIFIED

In my opinion, the tokenization in PHPCS 3.x is incorrect and should be changed to T_NS_SEPARATOR + T_EXIT.

And while the tokenization for PHPCS 4.x is consistent with PHP, for a specialized token like T_EXIT, it also feels counter-intuitive and would mean that any sniff/utility function which handles T_EXIT in any way, would also need to check T_NAME_FULLY_QUALIFIED + the token content to see if that happens to be an FQN exit/die.

That includes PHPCS itself and means that something like the T_EXIT being a parenthesis owner in PHPCS 4.x change is currently broken for FQN exit/die.

Not an isolated problem

While thinking this over, I realized that this is not a problem isolated to exit/die, but that it also applies to true/false/null when used as fully qualified constants.

I'd previously already realized this was the case for PHPCS 4.x and made the necessary sniff updates for PHPCS, but what I didn't fully realize at the time, is that PHPCS 3.x tokenization for FQN true/false/null is broken in the same way as FQN exit/die.

PHPCS 3.x PHPCS 4.x
true T_TRUE T_TRUE
false T_FALSE T_FALSE
null T_NULL T_NULL
\true T_NS_SEPARATOR + T_STRING T_NAME_FULLY_QUALIFIED
\false T_NS_SEPARATOR + T_STRING T_NAME_FULLY_QUALIFIED
\null T_NS_SEPARATOR + T_STRING T_NAME_FULLY_QUALIFIED

Note: this problem only applies to these five keywords (exit/die/true/false/null). Other keywords cannot be used as "fully qualified" unless they aren't being used as the keyword (which is what the PHP 8.0 tokenizer change was all about).

Proposed solution

While FQN true/false/null is still something which can be worked around - though it does raise the cognitive load for sniff developers -, FQN exit/die becomes more problematic to work around.

So, having thought this over, I'm now inclined to change the tokenization for these specific cases as follows:

PHPCS 3.x (3.13.3+) PHPCS 4.0+
\exit() T_NS_SEPARATOR + T_EXIT T_EXIT (content: \exit)
\die() T_NS_SEPARATOR + T_EXIT T_EXIT (content: \die)
\true T_NS_SEPARATOR + T_TRUE T_TRUE (content: \true)
\false T_NS_SEPARATOR + T_FALSE T_FALSE (content: \false)
\null T_NS_SEPARATOR + T_NULL T_NULL (content: \null)

The upside of this would be that sniffs which handle T_EXIT/T_TRUE/T_FALSE/T_NULL tokens should function correctly, even when these keywords are used as fully qualified names.
Fewer sniff changes would be needed for PHPCS 4.0 for those sniffs and it most likely fixes some false positives/negatives for sniffs on PHPCS 3.x.

The downside - though I expect that to be rarely problematic - would be that sniffs which examine the 'content' index for these tokens will need to take into account that the content could have a leading backslash.

Opinions

So, time for some opinions....

  • Any concerns or objections to the above proposal ?
  • When I'd make these changes (soonish), should there be a new 4.0.0-RC(2) release before the final PHPCS 4.0 release ?

/cc @asispts @dingo-d @fredden @GaryJones @greg0ire @kukulich @michalbundyra @Ocramius @sirbrillig @stronk7 @weierophinney @wimg

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions