@aapng aapng commented Jun 6, 2020

Regular-expression-based parsing is slower than Scanner-based parsing, because a Scanner uses each piece of data as soon as it has scanned it.

Additionally, the current implementation occasionally crashes in NSRegularExpression.firstMatch, which seems to be an issue in Apple's implementation. The crashes are rare but noticeable.

All regular expressions (SVGParserRegexHelper) are removed. Parsing is built upon Scanner.
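
To illustrate the approach, here is a minimal sketch of Scanner-based parsing for a single SVG transform string. It is not code from this PR: `parseTransform` is a hypothetical helper, and it assumes the Optional-returning Scanner API (`scanDouble()`, `scanString(_:)`, `scanCharacters(from:)`), which needs iOS 13 / macOS 10.15 or later. The actual implementation may well use the older `scan…(into:)` variants.

```swift
import Foundation

// Hypothetical helper for illustration only (not part of the PR).
// Parses one SVG transform such as "translate(10, 20.5)" into its
// name and numeric arguments, or returns nil if the text is malformed.
func parseTransform(_ text: String) -> (name: String, values: [Double])? {
    let scanner = Scanner(string: text)
    // Treat whitespace and commas as separators between tokens.
    scanner.charactersToBeSkipped = CharacterSet.whitespacesAndNewlines
        .union(CharacterSet(charactersIn: ","))

    // Read the function name, then require an opening parenthesis.
    guard let name = scanner.scanCharacters(from: .letters),
          scanner.scanString("(") != nil else {
        return nil
    }

    // Consume numbers one by one; each value is usable as soon as it is read,
    // so no second pass over captured substrings is needed.
    var values: [Double] = []
    while let value = scanner.scanDouble() {
        values.append(value)
    }

    // The argument list must be closed properly.
    guard scanner.scanString(")") != nil else {
        return nil
    }
    return (name, values)
}
```

Calling `parseTransform("translate(10, 20.5)")` should yield the name `translate` with the values `10` and `20.5`, while an unterminated input such as `"translate(10"` is rejected with `nil`. A regular-expression version would typically match the whole string first and then convert each captured substring to a number in a second step.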

Re-parsing of XML during <tspan> processing is also dropped; the already existing XMLIndexer information is processed instead. The change does not affect the current behaviour for non-nested tspans. Nested tspans (a tspan within another tspan) are clearly not supported by the implementation at the moment. There are no new failing tests.

With these changes, SVG parsing is more efficient, shorter, and a bit more resilient, with less branching, less recursion, and shallower indentation. Personally, though, my main reason for this minor refactoring is to prevent the rare crashes.

@ystrot ystrot self-assigned this Jun 9, 2020
@ystrot ystrot added this to the 0.9.7 milestone Jun 9, 2020
@ystrot ystrot merged commit 1d392e9 into exyte:master Jun 9, 2020
@ystrot ystrot (Member) commented Jun 9, 2020

Thanks @aapng, it's a huge improvement!

@aapng aapng deleted the feature/efficient-parsing branch June 9, 2020 12:57