FAQ and Glossary

Short answers to common questions plus quick definitions of core terms used throughout RegexParser documentation.

Frequently Asked Questions

General Questions

Does RegexParser execute regexes?

No. RegexParser parses and analyzes patterns statically. It never actually runs the regex against input. Runtime validation is optional and uses a safe compile check with preg_match().

use RegexParser\Regex;

// Static analysis - does NOT execute
$analysis = Regex::create()->redos('/(a+)+b/');
echo $analysis->severity->value;  // 'critical' - without running it!

// Runtime validation (optional, uses PCRE)
$regex = Regex::create(['runtime_pcre_validation' => true]);
$result = $regex->validate('/test/');

Is this PCRE2-only?

Yes. RegexParser targets PHP's preg_* engine, which uses PCRE2. Patterns are validated against PCRE2 semantics.

// PCRE2-specific features work
preg_match('/\p{L}/u', $text);  // Unicode properties

// PCRE1 patterns may not work
// preg_match('/\g{0}/', $text);  // Invalid in PCRE2

Does this guarantee ReDoS safety?

No. RegexParser detects known risky structures and suggests safer alternatives, but safety depends on:

Input patterns
Flags used
Runtime limits set by PHP or the application

// RegexParser will warn about this:
$analysis = Regex::create()->redos('/(a+)+b/');
// severity: 'critical'

// But you must still:
// 1. Apply runtime limits
// 2. Validate input length
// 3. Consider using atomic groups

What is tolerant parsing?

Tolerant parsing returns a partial AST plus errors, allowing tools to continue even when patterns are partially invalid.

use RegexParser\Regex;

// Strict parsing - throws on error
$ast = Regex::create()->parse('/[broken/');  // Throws ParserException

// Tolerant parsing - returns partial AST
$result = Regex::create()->parse('/[broken/', true);
echo $result->ast instanceof \RegexParser\Node\RegexNode;  // true (partial)
echo count($result->errors);  // 1

Can I use this in CI?

Yes. RegexParser is designed for CI/CD integration.

# CLI linting
vendor/bin/regex lint src/ --format=json > regex-issues.json

# Check for critical issues
if jq '[.results[] | .issues[] | select(.type == "error")] | length' regex-issues.json | grep -q 0; then
    echo "No critical regex issues found"
else
    echo "Critical issues found!"
    exit 1
fi

# GitHub Actions example
- name: Run RegexParser
  run: vendor/bin/regex lint src/ --format=json > regex-report.json
- name: Check report
  uses: dawidd6/action-json-to-coverage@v1
  with:
    report: regex-report.json
    min_coverage: 95

Why an AST?

The Abstract Syntax Tree provides:

Benefit	Description
Precision	Exact error locations, not just "somewhere in pattern"
Analysis	Detect complex issues like ReDoS structurally
Transformation	Refactor patterns safely without string hacking
Tooling	Support IDEs, linters, formatters

// String-based tools can only guess:
preg_match('/test/', $pattern);  // What if 'test' is escaped?

// AST knows the structure:
$ast = Regex::create()->parse('/test/');
$sequence = $ast->pattern;  // Exact structure known

Usage Questions

How do I check if a pattern is safe from ReDoS?

use RegexParser\Regex;

$analysis = Regex::create()->redos('/(a+)+b/');

echo $analysis->severity->value;      // 'critical', 'safe', 'low', 'medium'
echo $analysis->confidence->value;    // 'high', 'medium', 'low'
echo $analysis->recommendations[0];   // Suggested fix

How do I optimize a pattern?

use RegexParser\Regex;

$result = Regex::create()->optimize('/[0-9]+/');

echo $result->original;    // '/[0-9]+/'
echo $result->optimized;   // '/\d+/'
echo $result->changes[0];  // 'Replaced [0-9] with \d'

How do I explain a pattern to users?

use RegexParser\Regex;

$explanation = Regex::create()->explain('/\d{3}-\d{4}/');
echo $explanation;
/*
Match exactly 3 digits, then hyphen, then exactly 4 digits.
*/

How do I generate a matching sample?

use RegexParser\Regex;

$sample = Regex::create()->generate('/[A-Z][a-z]{3,5}\d{2}/');
echo $sample;  // e.g., "Word12"

Technical Questions

What's the difference between validate() and parse()?

Method	Returns	On Error
`parse()`	RegexNode (AST)	Throws exception
`validate()`	ValidationResult	Returns result with `isValid = false`

use RegexParser\Regex;

// parse() - throws
try {
    $ast = Regex::create()->parse('/[broken/');
} catch (\Exception $e) {
    echo "Parse failed: {$e->getMessage()}";
}

// validate() - returns result
$result = Regex::create()->validate('/[broken/');
echo $result->isValid ? 'Valid' : "Invalid: {$result->error}";

How does caching work?

use RegexParser\Regex;

// Default: Filesystem cache
$regex = Regex::create();  // Uses temp directory cache

// Custom cache location
$regex = Regex::create(['cache' => '/my/app/cache']);

// Disable cache
$regex = Regex::create(['cache' => null]);

// Long-running processes should clear cache
$regex->clearValidatorCaches();

What PHP versions are supported?

PHP 8.2 and above
Uses modern PHP features (readonly classes, enums, etc.)

// Requires PHP 8.2+
$regex = Regex::create([
    'php_version' => '8.2',  // Target version
]);

Glossary

Term	Definition
AST	Abstract Syntax Tree - structured representation of the regex
Node	Single element in the AST (literal, group, quantifier, etc.)
Visitor	Algorithm that traverses the AST (compile, explain, lint)
PCRE2	Perl Compatible Regular Expressions - the engine PHP uses
ReDoS	Regular Expression Denial of Service - catastrophic backtracking
Backtracking	Engine behavior that retries alternative paths on failure
Lookaround	Zero-width assertion like `(?=...)` or `(?<=...)`
Atomic group	`(?>...)` - prevents backtracking inside the group
Possessive quantifier	`*+`, `++`, `{m,n}+` - no backtracking
Branch reset	`(?
Subroutine	`(?1)` or `(?&uses a group definition
Lexer	Tokenizesname)` - re the pattern string into tokens
Parser	Builds the AST from tokens
Tokenizer	Same as Lexer
Delimiter	Character marking pattern boundaries (e.g., `/` in `/pattern/`)
Flag	Modifier like `i` (case-insensitive) or `s` (dotall)
Quantifier	`*`, `+`, `?`, `{m,n}` - specifies repetition
Greedy	Default quantifier behavior - matches as much as possible
Lazy	`*?`, `+?`, `??` - matches as little as possible
Capturing group	`(...)` - captures matched text
Non-capturing group	`(?:...)` - groups without capture
Named group	`(?<name>...)` - captures with a name
Backreference	`\1`, `\k<name>` - refers to previous capture
Escape sequence	`\d`, `\w`, `\x{...}` - special character representation
Character class	`[...]` - matches one character from a set
Negated class	`[^...]` - matches any character NOT in the set
Shorthand class	`\d`, `\w`, `\s` - common character classes
Anchor	`^`, `$`, `\A`, `\z` - matches position, not characters
Assertion	Zero-width check like `\b` or `(?=...)`
Word boundary	`\b` - transition between word and non-word characters
Unicode property	`\p{L}`, `\p{N}` - characters matching Unicode properties

Pattern Quick Reference

Pattern	Meaning	Example
`.`	Any character except newline	`/.+/`
`\d`	Digit [0-9]	`/\d{3}/`
`\w`	Word character [a-zA-Z0-9_]	`/\w+/`
`\s`	Whitespace	`/\s*/`
`\b`	Word boundary	`/\bword\b/`
`^`	Start of string	`/^start/`
`$`	End of string	`/end$/`
`*`	0 or more	`/a*/`
`+`	1 or more	`/a+/`
`?`	0 or 1	`/a?/`
`{n,m}`	Between n and m	`/a{2,4}/`
`[abc]`	Any of a, b, c	`/[abc]/`
`[^abc]`	Not a, b, or c	`/[^abc]/`
`(...)`	Capturing group	`/(\w+)/`
`(?:...)`	Non-capturing	`/(?:foo)/`
`(?=...)`	Positive lookahead	`/(?=\d)/`
`(?!...)`	Negative lookahead	`/(?!\d)/`
`(?<=...)`	Positive lookbehind	`/(?<=\d)/`
`(?<!...)`	Negative lookbehind	`/(?<!\d)/`
`\|`	Alternation	`/a\|b/`
`\1`	Backreference	`/(\w+)\1/`
`(?>...)`	Atomic group	`/(?>a+)/`
`*+`	Possessive quantifier	`/a*+/`

Previous: Diagnostics | Next: Diagnostics Cheat Sheet

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

FAQ and Glossary

Frequently Asked Questions

General Questions

Does RegexParser execute regexes?

Is this PCRE2-only?

Does this guarantee ReDoS safety?

What is tolerant parsing?

Can I use this in CI?

Why an AST?

Usage Questions

How do I check if a pattern is safe from ReDoS?

How do I optimize a pattern?

How do I explain a pattern to users?

How do I generate a matching sample?

Technical Questions

What's the difference between validate() and parse()?

How does caching work?

What PHP versions are supported?

Glossary

Pattern Quick Reference

Uh oh!

FilesExpand file tree

faq-glossary.md

Latest commit

History

faq-glossary.md

File metadata and controls

FAQ and Glossary

Frequently Asked Questions

General Questions

Does RegexParser execute regexes?

Is this PCRE2-only?

Does this guarantee ReDoS safety?

What is tolerant parsing?

Can I use this in CI?

Why an AST?

Usage Questions

How do I check if a pattern is safe from ReDoS?

How do I optimize a pattern?

How do I explain a pattern to users?

How do I generate a matching sample?

Technical Questions

What's the difference between validate() and parse()?

How does caching work?

What PHP versions are supported?

Glossary

Pattern Quick Reference