Convert HWP files to HWPX format: the only pip-installable HWP→HWPX converter.
HWP is the legacy binary format used by Hangul (한글), the dominant word processor in South Korea. HWPX is the modern XML-based format (OWPML, an ODF-like ZIP archive). This package converts HWP to HWPX programmatically, with no Hangul installation or GUI required.
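Because an HWPX file is an ordinary ZIP archive, Python's standard `zipfile` module can be used to look inside a converted file. The sketch below is illustrative only: the helper names `list_hwpx_parts` and `is_zip_container` are not part of this package, and the entry names in a real file depend on the document.

```python
import zipfile


def list_hwpx_parts(path):
    """Return the entry names of an HWPX (ZIP) archive, sorted for stable output."""
    with zipfile.ZipFile(path) as archive:
        return sorted(archive.namelist())


def is_zip_container(path):
    """True if the file is a readable ZIP archive, as an HWPX file should be."""
    return zipfile.is_zipfile(path)
```

Calling `list_hwpx_parts("document.hwpx")` on a converted file is a quick sanity check that the output is a real archive rather than a renamed binary.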
| Tool | What it does | Limitation |
|---|---|---|
| Hangul GUI | Open HWP → Save As HWPX | Manual, not scriptable |
| HwpxConverter.exe | Bundled with Hangul, GUI only | No CLI, Windows only |
| kordoc | Parses HWP → Markdown/JSON | Extracts content, doesn't convert format |
| hwp2hwpx ← this | Converts HWP → HWPX (valid ZIP/XML) | Needs Java runtime |
If you need to read HWP content → use kordoc. If you need a real HWPX file you can open/edit in Hangul → use this.
pip install hwp2hwpx

Requires Java Runtime (JRE) 8+:
# Windows
winget install EclipseAdoptium.Temurin.21.JDK
# macOS
brew install temurin
# Linux (Debian/Ubuntu)
apt install default-jre

# Single file
hwp2hwpx document.hwp
# Multiple files
hwp2hwpx *.hwp
# Output directory
hwp2hwpx document.hwp -o output/
# Recursive folder conversion
hwp2hwpx ./documents/ -r

from hwp2hwpx import convert, convert_batch
# Single file, default output path
output_path = convert("document.hwp")
# Single file, explicit output path
output_path = convert("document.hwp", "output.hwpx")
# Batch
results = convert_batch(["a.hwp", "b.hwp"], output_dir="output/")
for input_path, output_path, error in results:
    if error:
        print(f"FAIL: {input_path}: {error}")
    else:
        print(f"OK: {output_path}")

Bundles neolord0/hwp2hwpx Java library as a fat JAR:
- hwplib — reads HWP binary (OLE2/CFB compound document)
- hwpxlib — writes HWPX XML (ZIP archive with OWPML structure)
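The two containers named above can be told apart by their leading bytes: OLE2/CFB files start with the signature `D0 CF 11 E0 A1 B1 1A E1`, and ZIP archives with entries start with `PK\x03\x04`. A minimal sketch, assuming a hypothetical helper `sniff_format` that is not part of this package. It identifies only the container type, so other OLE2 files (for example legacy `.doc`) would also be reported as `"hwp"`.

```python
CFB_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")  # OLE2/CFB compound file signature
ZIP_MAGIC = b"PK\x03\x04"  # ZIP local file header signature


def sniff_format(path):
    """Guess 'hwp', 'hwpx', or 'unknown' from the first bytes of a file."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(CFB_MAGIC):
        return "hwp"  # container check only; not proof of a valid HWP document
    if head.startswith(ZIP_MAGIC):
        return "hwpx"  # container check only; not proof of valid OWPML content
    return "unknown"
```

A check like this is useful before batch conversion, when a folder mixes files whose extensions do not match their contents.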
Pure file-format conversion. No Hangul installation, no COM API, no DRM issues.
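The one external dependency is a Java runtime, so a script may want to check for it before starting a long batch. The sketch below uses only the standard library. `find_java` and `require_java` are hypothetical helper names, and whether the package performs a similar check internally is not stated here.

```python
import shutil


def find_java(which=shutil.which):
    """Return the path of the `java` executable on PATH, or None if absent."""
    return which("java")


def require_java(which=shutil.which):
    """Return the `java` path, or raise a clear error if no runtime is on PATH."""
    path = find_java(which)
    if path is None:
        raise RuntimeError("Java runtime not found on PATH; install a JRE 8+ first")
    return path
```

The `which` parameter exists only so the lookup can be replaced in tests. In normal use, call `require_java()` with no arguments.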
On Windows, Korean file paths are handled automatically through a temp-file workaround that bypasses a JVM path-encoding issue.
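The general shape of such a temp-file workaround can be sketched as follows: copy the input to an ASCII-named temporary file, run the conversion on that copy, then move the result to the requested destination. This is an illustration under assumptions, not the package's actual implementation. `convert_via_ascii_temp` is a hypothetical name, and `run_converter` is a hypothetical callable standing in for the JVM call.

```python
import shutil
import tempfile
from pathlib import Path


def convert_via_ascii_temp(src, dst, run_converter):
    """Run `run_converter(tmp_in, tmp_out)` on ASCII-named temporary files.

    `run_converter` is a hypothetical callable standing in for the JVM call.
    """
    src, dst = Path(src), Path(dst)
    with tempfile.TemporaryDirectory(prefix="hwp2hwpx_") as tmp:
        tmp_in = Path(tmp) / "input.hwp"
        tmp_out = Path(tmp) / "output.hwpx"
        shutil.copyfile(src, tmp_in)  # ASCII-named copy of the source
        run_converter(tmp_in, tmp_out)  # converter sees only ASCII file names
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(tmp_out), str(dst))  # place result at the real destination
    return dst
```

The temporary directory itself can still contain non-ASCII characters, for instance under a Korean Windows user name, so a real implementation may need to choose the temp location deliberately.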
pip install -e ".[test]"
pytest

Apache License 2.0
Based on the Java libraries by neolord0: hwplib, hwpxlib, and hwp2hwpx.
A Python package that converts HWP (Hangul word processor) files to HWPX (OWPML) format.
Install with the single line pip install hwp2hwpx and use it right away. No Hangul installation required.
pip install hwp2hwpx

Java is required: winget install EclipseAdoptium.Temurin.21.JDK
hwp2hwpx 문서.hwp
hwp2hwpx *.hwp -o 출력폴더/

from hwp2hwpx import convert
convert("문서.hwp")

- kordoc: reads HWP and extracts Markdown/JSON (text parsing)
- hwp2hwpx: converts HWP into an HWPX file (a complete document that opens in Hangul)