Skip to content

Latest commit

 

History

History
403 lines (312 loc) · 9.69 KB

File metadata and controls

403 lines (312 loc) · 9.69 KB

Upgrading RegexParser

This guide helps you upgrade RegexParser between versions. For detailed changes, see CHANGELOG.md.

[Unreleased] → 1.3.0

Breaking Changes

1. Lint Configuration Migration

Before:

// config/regex.php
return [
    'rules' => [
        'suspiciousAsciiRange' => true,
        'duplicateCharacterClass' => true,
    ],
    'redosMode' => 'theoretical',
    'redosThreshold' => 'high',
    'redosNoJit' => false,
    'optimizations' => [
        'digits' => true,
        'word' => true,
        'ranges' => true,
        'canonicalizeCharClasses' => true,
    ],
    'minSavings' => 10,
];

After:

// config/regex.php
return [
    'checks' => [
        'lint' => [
            'suspiciousAsciiRange' => true,
            'duplicateCharacterClass' => true,
        ],
        'redos' => [
            'mode' => 'theoretical',  // 'theoretical' or 'confirmed'
            'threshold' => 'high',  // 'safe', 'low', 'medium', 'high', 'critical'
            'noJit' => false,
        ],
        'optimization' => [
            'digits' => true,
            'word' => true,
            'ranges' => true,
            'canonicalizeCharClasses' => true,
            'minSavings' => 10,
        ],
    ],
];

Migration Steps:

  1. Move rules.*checks.lint.*
  2. Move redosModechecks.redos.mode
  3. Move redosThresholdchecks.redos.threshold
  4. Move redosNoJitchecks.redos.noJit
  5. Move optimizations.*checks.optimization.*
  6. Move minSavingschecks.optimization.minSavings

2. Composer Script Replacement

Before:

{
    "scripts": {
        "regex:lint": "bin/regex lint --rules=suspiciousAsciiRange,duplicateCharacterClass"
    }
}

After:

{
    "scripts": {
        "regex:lint": "bin/regex lint --checks.lint.suspiciousAsciiRange --checks.lint.duplicateCharacterClass"
    }
}

3. PHPStan Configuration Migration

Before:

parameters:
    regexParser:
        rules:
            suspiciousAsciiRange: true
        redosMode: theoretical
        redosThreshold: high

After:

parameters:
    regexParser:
        checks:
            lint:
                suspiciousAsciiRange: true
            redos:
                mode: theoretical
                threshold: high

Deprecated Features

The following features are deprecated and will be removed in 2.0.0:

  • rules config key (use checks.lint)
  • redosMode config key (use checks.redos.mode)
  • redosThreshold config key (use checks.redos.threshold)
  • redosNoJit config key (use checks.redos.noJit)
  • optimizations config key (use checks.optimization)
  • minSavings config key (use checks.optimization.minSavings)

New Features

1. Structured Lint Configuration

All lint checks are now organized under checks namespace for better organization:

$regex = Regex::create();
$result = $regex->lint($pattern, [
    'checks' => [
        'lint' => [
            'suspiciousAsciiRange' => true,
            'duplicateCharacterClass' => true,
            'uselessRange' => true,
            'zeroQuantifier' => true,
            'redundantQuantifier' => true,
            'emptyAlternative' => true,
            'duplicateDisjunction' => true,
            'uselessBackreference' => true,
            'optimalQuantifierConcatenation' => true,
        ],
        'redos' => [
            'mode' => 'confirmed',  // NEW: Run actual tests
            'threshold' => ReDoSSeverity::MEDIUM,
            'maxTestStrings' => 1000,
            'noJit' => false,
        ],
        'optimization' => [
            'digits' => true,
            'word' => true,
            'ranges' => true,
            'canonicalizeCharClasses' => true,
            'autoPossessify' => true,
            'allowAlternationFactorization' => false,
            'minQuantifierCount' => 4,
            'minSavings' => 10,
            'verifyWithAutomata' => true,  // NEW: Verify equivalence
        ],
    ],
]);

2. Automata-Based Optimization Verification

New verifyWithAutomata option ensures optimized patterns are mathematically equivalent:

$regex = Regex::create();
$result = $regex->optimize($pattern, [
    'verifyWithAutomata' => true,  // NEW: Guarantee equivalence
]);

if ($result->optimizedPattern !== $pattern) {
    echo "Optimized safely: {$result->optimizedPattern}\n";
} else {
    echo "Pattern already optimal or cannot guarantee equivalence\n";
}

Migration Checklist

Use this checklist to ensure your upgrade is complete:

  • Replace all rules.* config with checks.lint.*
  • Replace redosMode with checks.redos.mode
  • Replace redosThreshold with checks.redos.threshold
  • Replace redosNoJit with checks.redos.noJit
  • Move optimizations.* to checks.optimization.*
  • Move minSavings to checks.optimization.minSavings
  • Update PHPStan configuration (if used)
  • Update composer scripts (if using custom flags)
  • Test your regex patterns after upgrade
  • Run bin/regex lint on your codebase
  • Review ReDoS warnings with new confirmed mode if needed

Testing Your Upgrade

After upgrading, run these commands to verify everything works:

# 1. Verify linting works with new config
bin/regex lint src/ --config=config/regex.php

# 2. Test ReDoS detection with new mode
bin/regex analyze '/(a+)+$/' --checks.redos.mode=confirmed

# 3. Verify optimization with automata verification
bin/regex optimize '/[0-9]{3}-[0-9]{3}-[0-9]{4}/' --checks.optimization.verifyWithAutomata

# 4. Run full test suite
composer phpunit

[1.2.0] → 1.3.0

Breaking Changes

1. Transpiler API Signature Change

Before:

use RegexParser\Transpiler\RegexTranspiler;

$transpiler = new RegexTranspiler($regex);
$result = $transpiler->transpile('/\d+/', 'javascript');

After:

use RegexParser\Regex;

$regex = Regex::create();
$result = $regex->transpile('/\d+/', 'javascript');

// Or with options
use RegexParser\Transpiler\TranspileOptions;

$options = new TranspileOptions(
    targetLanguage: 'javascript',
    ensureBackwardCompatibility: true,
);
$result = $regex->transpile('/\d+/', $options);

Migration Steps:

  1. Replace new RegexTranspiler($regex) with Regex::create()
  2. Call $regex->transpile() instead of $transpiler->transpile()
  3. Use TranspileOptions for advanced configurations

2. Automata Solver API Changes

Before:

use RegexParser\Automata\Solver\RegexSolver;

$solver = new RegexSolver($regex);
$result = $solver->equivalent($pattern1, $pattern2);

After:

use RegexParser\Automata\RegexLanguageSolver;

$language = RegexLanguageSolver::create();
$result = $language->equivalent($pattern1, $pattern2, new SolverOptions());

Migration Steps:

  1. Replace RegexSolver with RegexLanguageSolver
  2. Use static factory method RegexLanguageSolver::create()
  3. Pass SolverOptions for fine-tuning

New Features

1. Unicode-Aware Automata

Automata solver now supports /u flag patterns with proper Unicode handling:

$language = RegexLanguageSolver::create();
$result = $language->subset('/[α-ω]+/u', '/[a-zA-Z]+/');
// Correctly handles Unicode code points

2. CLI Graph Command

New command to export AST as DOT/Mermaid diagrams:

# Export as DOT format (Graphviz)
bin/regex graph '/\d{4}-\d{2}/' --format=dot

# Export as Mermaid format
bin/regex graph '/\d{4}-\d{2}/' --format=mermaid

# Save to file
bin/regex graph '/\d{4}-\d{2}/' --format=mermaid > diagram.mmd

[1.1.0] → 1.2.0

Breaking Changes

None. This is a feature release.

New Features

1. Symfony Bridge Commands

New Symfony bundle commands:

# Analyze Symfony routes for conflicts
php bin/console regex:routes

# Analyze Symfony security config
php bin/console regex:security

# Run all bridge analyzers
php bin/console regex:analyze

[1.0.0] → 1.1.0

Breaking Changes

1. ReDoS Severity Enum Change

Before:

use RegexParser\ReDoS\ReDoSSeverity;

$result = $regex->redos($pattern, 'high');  // string

After:

use RegexParser\ReDoS\ReDoSSeverity;

$result = $regex->redos($pattern, ReDoSSeverity::HIGH);  // enum

Migration Steps:

  1. Replace string thresholds with enum values:
    • 'safe'ReDoSSeverity::SAFE
    • 'low'ReDoSSeverity::LOW
    • 'medium'ReDoSSeverity::MEDIUM
    • 'high'ReDoSSeverity::HIGH
    • 'critical'ReDoSSeverity::CRITICAL
  2. Update type hints to use ReDoSSeverity type

2. Lint Result Structure Changes

Before:

$issues = $linter->lint($pattern);

foreach ($issues as $issue) {
    echo "{$issue['type']}: {$issue['message']}\n";
}

After:

use RegexParser\Lint\LintIssue;

$report = $linter->lint($pattern);

/** @var LintIssue $issue */
foreach ($report->issues as $issue) {
    echo "{$issue->type->value}: {$issue->message}\n";
}

Migration Steps:

  1. Access properties via object properties instead of array keys
  2. Use typed properties: $issue->type, $issue->message, $issue->position
  3. Import LintIssue for type hints

New Features

1. Compare Command

New command for pattern comparison:

# Check intersection
bin/regex compare '/edit/' '/[a-z]+/'

# Check subset
bin/regex compare '/edit/' '/[a-z]+/' --method=subset

# Check equivalence
bin/regex compare '/[0-9]+/' '/\d+/' --method=equivalence

Need Help?

If you encounter issues during upgrade:

  1. Check Troubleshooting Guide
  2. Review CHANGELOG.md for detailed changes
  3. Search GitHub Issues
  4. Create new issue with your specific upgrade problem