preserve_handlebar_syntax regex should be improved #271

@IwanBurg

Problem:
preserve_handlebar_syntax=True only preserves handlebars when the expression is the entire attribute value. If the value has any characters before or after the handlebars, they are not preserved.

Examples of HTML where the handlebars are not preserved:
a href="mailto:{{ Test }}"
a href="{{ Test }}?subject=x"

Current regex code:

stripped = re.sub(
    r'="{{(.*?)}}"',
    lambda match: '="{{' + escape(match.groups()[0]) + '}}"',
    stripped,
)
out = re.sub(
    r'="%7B%7B(.+?)%7D%7D"',
    lambda match: '="{{' + unescape(unquote(match.groups()[0])) + '}}"',
    out,
)
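
To show why the two examples above are missed, here is a minimal check of the first pattern on its own. It only uses the standard `re` module; the `samples` list and the surrounding `<a ...>` tags are my own illustration, not code from the library.

```python
import re

# Pattern copied from the current code: it requires the handlebars
# expression to sit directly between the opening and closing quotes.
CURRENT = re.compile(r'="{{(.*?)}}"')

# Hypothetical sample inputs (the two failing cases plus one that works).
samples = [
    '<a href="{{ Test }}">',
    '<a href="mailto:{{ Test }}">',
    '<a href="{{ Test }}?subject=x">',
]

results = [bool(CURRENT.search(html)) for html in samples]
for html, matched in zip(samples, results):
    print(matched, html)
```

Only the first sample matches, because `="{{` and `}}"` leave no room for a prefix such as `mailto:` or a suffix such as `?subject=x`.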

Proposed regex code:

stripped = re.sub(
    r'="([^"]*){{(.*?)}}([^"]*?)"',
    lambda match: '="' +
                  match.groups()[0] +
                  '{{' + escape(match.groups()[1]) + '}}' +
                  match.groups()[2] + '"',
    stripped,
)

https://regex101.com/r/tLC41B/2

out = re.sub(
    r'="([^"]*)%7B%7B(.+?)%7D%7D([^"]*?)"',
    lambda match: '="' +
                  match.groups()[0] +
                  '{{' + unescape(unquote(match.groups()[1])) + '}}' +
                  match.groups()[2] + '"',
    out,
)

https://regex101.com/r/ADvQjO/1
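
As a rough sanity check, the two proposed substitutions can be run in isolation on the failing examples. This is only a sketch: the wrapper names `protect` and `restore` are hypothetical, and I am assuming `escape`/`unescape` come from the standard `html` module and `unquote` from `urllib.parse`, which may differ from what the library actually imports.

```python
import re
from html import escape, unescape      # assumption: stdlib html helpers
from urllib.parse import unquote       # assumption: stdlib unquote


def protect(html):
    """Hypothetical wrapper around the first proposed substitution."""
    return re.sub(
        r'="([^"]*){{(.*?)}}([^"]*?)"',
        lambda m: '="' + m.group(1)
                  + '{{' + escape(m.group(2)) + '}}'
                  + m.group(3) + '"',
        html,
    )


def restore(html):
    """Hypothetical wrapper around the second proposed substitution."""
    return re.sub(
        r'="([^"]*)%7B%7B(.+?)%7D%7D([^"]*?)"',
        lambda m: '="' + m.group(1)
                  + '{{' + unescape(unquote(m.group(2))) + '}}'
                  + m.group(3) + '"',
        html,
    )


# Prefix and suffix around the handlebars are now kept.
escaped = protect('<a href="mailto:{{ a < b }}">')
restored_prefix = restore('<a href="mailto:%7B%7B%20Test%20%7D%7D">')
restored_suffix = restore('<a href="%7B%7B%20Test%20%7D%7D?subject=x">')
print(escaped)
print(restored_prefix)
print(restored_suffix)
```

One thing that may be worth checking in review: the first group `([^"]*)` is greedy and each match consumes the whole attribute value, so an attribute containing two handlebars expressions would have only the last one passed through `escape` or `unquote`.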

I tried to make a pull request for this: #270
