bug: _output writes files with system default encoding instead of UTF-8 on Windows #26

@injen-jb

Description of the bug

When using the -o flag on Windows, the output file is written with the system's default encoding (e.g. cp1252) instead of UTF-8. This corrupts any non-ASCII characters in the output — most notably the en-dash (–, U+2013) used as a separator between object names and their docstring summaries.

To Reproduce

On Windows (with a non-UTF-8 system locale, which is the default):

griffe2md my_package -o docs/output.md

The resulting file contains \x96 (cp1252 en-dash) instead of the proper UTF-8 bytes \xe2\x80\x93, causing – to display as � in UTF-8 readers.
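The byte-level difference described above can be checked directly in Python on any platform. This is only a minimal illustration of the two encodings, not code from griffe2md; the variable names are made up for the example.

```python
# The en-dash (U+2013) as a Python string.
EN_DASH = "\u2013"

# cp1252 maps the en-dash to the single byte 0x96.
cp1252_bytes = EN_DASH.encode("cp1252")

# UTF-8 encodes the same character as three bytes.
utf8_bytes = EN_DASH.encode("utf-8")

# A lone 0x96 is a continuation byte in UTF-8, so a lenient UTF-8 reader
# substitutes U+FFFD (the replacement character), while a strict one raises.
as_seen_by_utf8_reader = cp1252_bytes.decode("utf-8", errors="replace")

print(cp1252_bytes, utf8_bytes, repr(as_seen_by_utf8_reader))
```
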

Environment

  • griffe2md 1.3.3
  • Python 3.12.2 (CPython) on Windows 10
  • System locale: cp1252

Root cause

In src/griffe2md/_internal/main.py, the _output function opens the file without specifying an encoding:

def _output(text: str, to: IO | str | None = None) -> None:
    if isinstance(to, str):
        with Path(to).open("w") as output:  # <-- no encoding specified
            output.write(text)

On Windows, Path.open("w") without an explicit encoding falls back to the locale encoding (cp1252 on this system) rather than UTF-8, unless Python's UTF-8 mode is enabled.
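For anyone trying to confirm this without a Windows machine, the sketch below prints the encoding Python would fall back to, then simulates the Windows default by passing cp1252 explicitly. The file name and sample text are placeholders chosen for the example, not actual griffe2md output.

```python
import locale
import tempfile
from pathlib import Path

# The encoding used for text files when none is given.
# On a default Windows install this is typically "cp1252"; on most
# Linux and macOS systems it is some spelling of "UTF-8".
default_encoding = locale.getpreferredencoding(False)
print(f"default text encoding: {default_encoding}")

# Simulate the Windows behaviour on any platform by passing cp1252 explicitly.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "output.md"
    with target.open("w", encoding="cp1252") as output:
        output.write("my_function \u2013 Summary line.")

    # Inspect the raw bytes, then try to read the file back as UTF-8.
    raw = target.read_bytes()
    try:
        target.read_text(encoding="utf-8")
        readable_as_utf8 = True
    except UnicodeDecodeError:
        readable_as_utf8 = False

print(raw)
print(readable_as_utf8)
```

A strict UTF-8 reader rejects the file outright; viewers that decode leniently show the replacement character instead.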

Suggested fix:

with Path(to).open("w", encoding="utf-8") as output:
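A rough, self-contained sketch of how the patched function might look, together with a quick check of the bytes it writes. Only the `str` branch is taken from the snippet above; the stream branch is an assumption made so the example runs, and the real function may handle that case differently.

```python
import sys
import tempfile
from pathlib import Path
from typing import IO, Union


# Union[IO, str, None] is the same type as the original `IO | str | None`.
def _output(text: str, to: Union[IO, str, None] = None) -> None:
    """Write text to a path or stream (illustrative sketch only)."""
    if isinstance(to, str):
        # The fix: always write UTF-8, regardless of the system locale.
        with Path(to).open("w", encoding="utf-8") as output:
            output.write(text)
    else:
        # Assumption: anything else is a writable stream, defaulting to stdout.
        stream = sys.stdout if to is None else to
        stream.write(text)


# Write a line containing an en-dash and inspect the resulting bytes.
with tempfile.TemporaryDirectory() as tmp:
    target = str(Path(tmp) / "output.md")
    _output("my_function \u2013 Summary line.", target)
    written = Path(target).read_bytes()

print(written)
```

With the explicit encoding, the en-dash is stored as the three UTF-8 bytes on every platform, so the output no longer depends on the machine's locale.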
- __System__: Windows-10-10.0.19045-SP0
- __Python__: cpython 3.12.2 (J:\injen.io\redacted\project\packages\docs\.venv\Scripts\python.exe)
- __Environment variables__:
- __Installed packages__:
  - `griffe2md` v1.3.3
