Description
Problem
Our current content type detection, which uses the python-magic library, misidentifies HTML content as text/plain if the <html> tag is missing, even when <div> or other HTML tags are present. As a result, HTML fragments are handled incorrectly.
Solution
We'll enhance detection by manually checking the content for <html> or <div> tags. If either is found, we'll explicitly set the MIME type to text/html, overriding the value returned by python-magic.
mime = magic.from_buffer(content, mime=True)
# If the content contains HTML tags, override the detected MIME type with text/html
lowered = content.lower()  # lowercase once so the match is case-insensitive
if b"<html" in lowered or b"<div" in lowered:
    mime = "text/html"
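The override above could also be factored into a small helper so it can be tested without python-magic installed. The following is only a sketch: the names `refine_mime` and `HTML_MARKERS` are hypothetical, and limiting the override to content already detected as text/plain is an assumption that goes slightly beyond the snippet above, which overrides unconditionally.

```python
# Hypothetical helper; the names here are illustrative, not from the codebase.
# Byte prefixes that are treated as evidence of HTML content.
HTML_MARKERS = (b"<html", b"<div")


def refine_mime(content: bytes, detected: str) -> str:
    """Return text/html when text/plain content contains an HTML marker.

    Assumption: only a text/plain result is overridden, so binary types
    that happen to contain the bytes "<div" keep their detected type.
    """
    if detected == "text/plain":
        lowered = content.lower()  # case-insensitive match
        if any(marker in lowered for marker in HTML_MARKERS):
            return "text/html"
    return detected


# Intended call site, assuming python-magic is available:
#   mime = refine_mime(content, magic.from_buffer(content, mime=True))
```

Keeping the detector call outside the helper means the marker check can be exercised with plain byte strings, and further markers can be added to `HTML_MARKERS` if other fragments turn out to be misdetected.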