This is an HTML template parser. It is a modified version of Python's HTMLParser library, expanded to handle template tags.

pip install html-template-parser
# or
poetry add html-template-parser

A basic usage example is remarkably similar to Python's HTMLParser:

from HtmlTemplateParser import Htp
from HtmlTemplateParser import AttributeParser


class MyAttributeParser(AttributeParser):
    def handle_starttag_curly_perc(self, tag, attrs, props):
        print("starttag_curly_perc", tag, attrs, props)

        # get the position of the element relative to the original html
        print(self.getpos())

        # get the original html text
        print(self.get_element_text())

    def handle_endtag_curly_perc(self, tag, attrs, props):
        print("endtag_curly_perc", tag, attrs, props)

    def handle_value(self, value):
        print("value", value)


class MyHTMLParser(Htp):
    def handle_starttag(self, tag, attrs):
        print("Encountered a start tag:", tag)
        print(self.getpos())

        MyAttributeParser(attrs).parse()

    def handle_endtag(self, tag):
        print("Encountered an end tag :", tag)

    def handle_data(self, data):
        print("Encountered some data :", data)


parser = MyHTMLParser()
parser.feed('<html><head><title>Test</title></head>'
            '<body {% if this %}ok{% endif %}><h1>Parse me!</h1></body></html>')

Naming conventions for the handlers, with the syntax each one covers:

- comment: <!-- -->
- comment_curly_hash: {# data #}
- comment_curly_two_exlaim: {{! data }}
- starttag_comment_curly_perc: {% comment "attrs" %}
- endtag_comment_curly_perc: {% endcomment %}
- comment_at_star: @* data *@

- startendtag: < />
- starttag: <
- starttag_curly_perc: {% ... %}
- starttag_curly_two_hash: {{#...}}
- starttag_curly_four: {{{{...}}}}

- endtag: </...>
- endtag_curly_perc: {% end.. %}
- endtag_curly_two_slash: {{/...}}
- endtag_curly_four_slash: {{{{/...}}}}

- unknown_decl
- charref
- entityref
- data
- curly_two: {{ ... }}
- slash_curly_two: \{{ ... }}
- curly_three: {{{ ... }}}
- decl
- pi
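
To make the naming scheme concrete, here is a minimal sketch that maps a single template token to the name it would fall under in the list above. It is not how the library tokenizes input: `PATTERNS` and `classify` are hypothetical names, the regular expressions are simplified guesses, and modifiers, HTML tags and `slash_curly_two` are not handled.

```python
import re

# Hypothetical lookup table of (name, pattern), checked in order.
# More specific delimiters come first so that, for example,
# "{{{ x }}}" is not mistaken for "{{ x }}".
PATTERNS = [
    ("comment_curly_hash", r"\{#.*?#\}"),
    ("comment_curly_two_exlaim", r"\{\{!.*?\}\}"),
    ("comment_at_star", r"@\*.*?\*@"),
    ("endtag_comment_curly_perc", r"\{%\s*endcomment\s*%\}"),
    ("starttag_comment_curly_perc", r"\{%\s*comment\b.*?%\}"),
    ("endtag_curly_perc", r"\{%\s*end\w*.*?%\}"),
    ("starttag_curly_perc", r"\{%.*?%\}"),
    ("endtag_curly_four_slash", r"\{\{\{\{/.*?\}\}\}\}"),
    ("starttag_curly_four", r"\{\{\{\{.*?\}\}\}\}"),
    ("starttag_curly_two_hash", r"\{\{#.*?\}\}"),
    ("endtag_curly_two_slash", r"\{\{/.*?\}\}"),
    ("curly_three", r"\{\{\{.*?\}\}\}"),
    ("curly_two", r"\{\{.*?\}\}"),
]
COMPILED = [(name, re.compile(pattern, re.S)) for name, pattern in PATTERNS]


def classify(token):
    """Return the convention name for one token, or "data" if nothing matches."""
    for name, pattern in COMPILED:
        if pattern.fullmatch(token):
            return name
    return "data"


for sample in ["{% if this %}", "{% endif %}", "{{#each items}}", "{{ name }}", "plain text"]:
    print(sample, "->", classify(sample))
```

The order of the table matters more than the individual patterns: the four-brace and three-brace forms have to be tried before `curly_two`, and the `end...` form before the generic `{% ... %}` start tag.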
Modifiers such as ~, !--, -, +, > will show up as props on the tags.
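
As an illustration of what a modifier is, the sketch below pulls whitespace-control characters off a `{% ... %}` tag and returns them separately from the tag name. This is an assumption-laden stand-in, not the library's code: `split_tag` is a hypothetical helper, only the `-`, `+` and `~` modifiers are covered, and the actual shape of `props` passed to the handlers may differ.

```python
import re

# Hypothetical pattern: an optional modifier right after "{%" and right
# before "%}". The "!--" and ">" modifiers are left out of this sketch.
TAG = re.compile(
    r"\{%(?P<open>[-+~]?)\s*(?P<name>\w+)(?P<body>.*?)\s*(?P<close>[-+~]?)%\}",
    re.S,
)


def split_tag(text):
    """Return (name, body, props) for a single {% ... %} tag."""
    match = TAG.fullmatch(text)
    if match is None:
        raise ValueError(f"not a curly-percent tag: {text!r}")
    # Keep only the modifiers that were actually present.
    props = [p for p in (match["open"], match["close"]) if p]
    return match["name"], match["body"].strip(), props


print(split_tag("{%- if user.ok -%}"))
print(split_tag("{% endif %}"))
```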
Htp passes each tag's attributes to the handler as one unparsed string, which can then be parsed with the attribute parser.
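
The reason for handing over a raw string is that attributes can contain template tags, as in `<body {% if this %}ok{% endif %}>` from the usage example. The sketch below shows the general idea of separating such a string into template tags and plain values. It is only an illustration: `split_attribute_string` is a hypothetical function, it recognizes `{% ... %}` tags only, and it does not reproduce what `AttributeParser` does internally.

```python
import re

# Capturing group so that re.split keeps the template tags in the result.
TEMPLATE_TAG = re.compile(r"(\{%.*?%\})", re.S)


def split_attribute_string(raw):
    """Split a raw attribute string into ("tag", text) and ("value", text) pieces."""
    pieces = []
    for part in TEMPLATE_TAG.split(raw):
        part = part.strip()
        if not part:
            continue  # skip empty strings left between adjacent tags
        kind = "tag" if TEMPLATE_TAG.fullmatch(part) else "value"
        pieces.append((kind, part))
    return pieces


for kind, text in split_attribute_string('class="a" {% if this %}ok{% endif %}'):
    print(kind, text)
```

The three kinds of callback in the usage example (`handle_starttag_curly_perc`, `handle_value`, `handle_endtag_curly_perc`) correspond roughly to the pieces this sketch produces for the same input.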