A Flask-based microservice that converts Word documents (.docx, .doc) to HTML format. Perfect for integration with React applications or any other frontend framework.
- ✅ Convert .docx and .doc files to HTML
- ✅ Handle headings, paragraphs, and tables
- ✅ Automatic file cleanup after conversion
- ✅ CORS enabled for frontend integration
- ✅ File size validation (16MB max)
- ✅ Error handling and validation
- ✅ Clean, responsive HTML output with CSS styling
- ✅ Beautiful web interface with drag & drop
- ✅ Real-time conversion with progress indicators
- ✅ Copy to clipboard functionality
- ✅ Conversion statistics and metrics
- ✅ NEW: RESTful API endpoints for integration
- ✅ NEW: No authentication required
- ✅ NEW: Support for both file upload and base64 data
- ✅ NEW: Comprehensive API documentation
- ✅ NEW: Test client and examples
- Clone or navigate to the project directory:
  cd docx2html-service
- Activate virtual environment:
  # Windows
  venv\Scripts\activate
  # Linux/Mac
  source venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
- Start the service:
  python main.py

The service will start on http://localhost:8000
Once the service is running, open your browser and go to:
http://localhost:8000/
This will show you a beautiful web interface where you can:
- Upload Word documents (.docx, .doc)
- Convert them to HTML instantly
- View the converted HTML
- Copy the HTML to clipboard
- See conversion statistics
GET /
Access the beautiful web interface for file upload and conversion.

POST /convert
Content-Type: multipart/form-data
Request Body: file: Word document file (.docx or .doc)

GET /api/status
Check API status and get service information.

POST /api/convert
Two methods supported:

Method 1: Multipart Form Data
Content-Type: multipart/form-data
Body: file field with document

Method 2: Base64 JSON
Content-Type: application/json
Body: {"file_data": "base64_string", "filename": "document.docx"}

Response:
{
  "success": true,
  "html": "<!DOCTYPE html>...",
  "filename": "document.docx",
  "message": "Document converted successfully",
  "conversion_time": 1.23,
  "file_size_bytes": 1024000,
  "file_size_mb": 1.0
}

The API endpoints are designed to be easily integrated into other applications without requiring authentication tokens.
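As an illustration of the Base64 JSON method, here is a minimal Python sketch that uses only the standard library. The helper names (`build_convert_request`, `convert_file`) are hypothetical, and the base URL assumes the service is listening on the port given in the startup note; adjust it if your setup differs.

```python
import base64
import json
import os
import urllib.request

# Assumption: the service is reachable at the address from the startup note.
BASE_URL = "http://localhost:8000"


def build_convert_request(file_bytes, filename, base_url=BASE_URL):
    """Build (but do not send) a Base64 JSON request for POST /api/convert."""
    payload = {
        # The document bytes travel as a Base64 string inside the JSON body.
        "file_data": base64.b64encode(file_bytes).decode("ascii"),
        "filename": filename,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/convert",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def convert_file(path, base_url=BASE_URL):
    """Send a document to the service and return the decoded JSON response."""
    with open(path, "rb") as fh:
        request = build_convert_request(fh.read(), os.path.basename(path), base_url)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `convert_file("document.docx")` against a running service should return a dictionary shaped like the response above, so the HTML would be available as `result["html"]` when `result["success"]` is true.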
- Python Test Client: Use python test_api.py <document.docx> to test the API
- Web Test Interface: Open test_api.html in your browser to test interactively
- cURL Examples: See the comprehensive examples in API_DOCUMENTATION.md
See API_DOCUMENTATION.md for complete examples in:
- Python
- Node.js/JavaScript
- PHP
- cURL
import React, { useState } from 'react';

function DocumentConverter() {
  const [file, setFile] = useState(null);
  const [html, setHtml] = useState('');
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState('');

  // Remember the selected file and clear any previous error.
  const handleFileChange = (e) => {
    setFile(e.target.files[0]);
    setError('');
  };

  const convertDocument = async () => {
    if (!file) {
      setError('Please select a file');
      return;
    }

    setLoading(true);
    setError('');

    // The service expects the document in a multipart field named "file".
    const formData = new FormData();
    formData.append('file', file);

    try {
      // Port 8000 matches the startup note in this README;
      // adjust it if you changed the port in main.py.
      const response = await fetch('http://localhost:8000/convert', {
        method: 'POST',
        body: formData,
      });
      const data = await response.json();

      if (data.success) {
        setHtml(data.html);
      } else {
        setError(data.error || 'Conversion failed');
      }
    } catch (err) {
      setError('Network error: ' + err.message);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <h1>Word to HTML Converter</h1>
      <input
        type="file"
        accept=".docx,.doc"
        onChange={handleFileChange}
      />
      <button onClick={convertDocument} disabled={loading}>
        {loading ? 'Converting...' : 'Convert to HTML'}
      </button>
      {error && <p style={{color: 'red'}}>{error}</p>}
      {html && (
        <div>
          <h2>Converted HTML:</h2>
          <div dangerouslySetInnerHTML={{ __html: html }} />
        </div>
      )}
    </div>
  );
}

export default DocumentConverter;

The service handles various error scenarios:
- 400: Invalid file type, no file uploaded
- 413: File too large (over 16MB)
- 500: Internal server error during conversion
- 404: Endpoint not found
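A client can turn the status codes above into readable messages. The sketch below is one hypothetical way to do that in Python; `describe_failure` is an illustrative name, and it assumes failed responses may carry an `error` field in their JSON body, as the React example suggests.

```python
from typing import Optional

# Meanings copied from the list above; the fallback text is illustrative.
ERROR_HINTS = {
    400: "Invalid file type or no file uploaded",
    404: "Endpoint not found",
    413: "File too large (over 16MB)",
    500: "Internal server error during conversion",
}


def describe_failure(status_code: int, body: Optional[dict] = None) -> str:
    """Return a readable message for a failed conversion request."""
    # Prefer the server's own "error" field when the JSON body carries one.
    if body and body.get("error"):
        return f"{status_code}: {body['error']}"
    hint = ERROR_HINTS.get(status_code, "Unexpected response")
    return f"{status_code}: {hint}"
```

Falling back to a local table keeps messages useful even when the response body is empty or is not JSON.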
File support:
- Supported formats: .docx, .doc
- Maximum file size: 16MB
- Output: Clean HTML with embedded CSS styling

Security:
- File extension validation
- Secure filename handling
- Automatic file cleanup after processing
- CORS configuration for controlled access
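The README does not show how the service implements these checks, so the following is only a sketch of what extension validation, filename sanitising, and the size limit might look like in plain Python. All function names are hypothetical, and 16MB is assumed to mean 16 * 1024 * 1024 bytes. Flask applications often use `werkzeug.utils.secure_filename` for the filename step; a standard-library stand-in is used here to keep the example self-contained.

```python
import os
import re
from typing import Optional

ALLOWED_EXTENSIONS = {".docx", ".doc"}
MAX_FILE_SIZE_BYTES = 16 * 1024 * 1024  # assumed interpretation of "16MB"


def has_allowed_extension(filename: str) -> bool:
    """Check the extension case-insensitively against the supported formats."""
    return os.path.splitext(filename)[1].lower() in ALLOWED_EXTENSIONS


def sanitize_filename(filename: str) -> str:
    """Strip directory components and unusual characters from a client-supplied name."""
    # Normalise Windows separators so basename handles both path styles.
    name = os.path.basename(filename.replace("\\", "/"))
    # Keep letters, digits, dot, dash and underscore; replace anything else.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Drop leading dots so the result is never hidden or a ".." reference.
    return name.lstrip(".") or "upload"


def validate_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return an error message, or None when the upload passes both checks."""
    if not has_allowed_extension(filename):
        return "Invalid file type"
    if size_bytes > MAX_FILE_SIZE_BYTES:
        return "File too large"
    return None
```

Sanitising the name before writing to disk matters because the filename comes from the client and could otherwise point outside the uploads directory.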
To run in development mode with auto-reload:
python main.py

The service will automatically reload when you make changes to the code.
For production deployment, consider:
- Using a production WSGI server like Gunicorn
- Setting up proper logging
- Implementing rate limiting
- Adding authentication if needed
- Using environment variables for configuration
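For the environment-variable item, one possible approach is to read all settings in a single place at startup. The variable names below (`DOCX2HTML_HOST`, `DOCX2HTML_PORT`, `DOCX2HTML_DEBUG`, `DOCX2HTML_MAX_UPLOAD_MB`) and the `ServiceConfig` class are hypothetical; the project does not define them. The defaults mirror the port and size limit mentioned in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServiceConfig:
    host: str
    port: int
    debug: bool
    max_upload_mb: int


def load_config(env: Mapping[str, str] = os.environ) -> ServiceConfig:
    """Read settings from environment variables, falling back to defaults."""
    debug_raw = env.get("DOCX2HTML_DEBUG", "false")
    return ServiceConfig(
        host=env.get("DOCX2HTML_HOST", "127.0.0.1"),
        port=int(env.get("DOCX2HTML_PORT", "8000")),
        # Debug stays off unless explicitly enabled, which suits production.
        debug=debug_raw.strip().lower() in {"1", "true", "yes"},
        max_upload_mb=int(env.get("DOCX2HTML_MAX_UPLOAD_MB", "16")),
    )
```

The resulting values could then be passed to Flask, for example `app.run(host=cfg.host, port=cfg.port, debug=cfg.debug)`, so that a deployment changes behaviour without editing main.py.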
- Import errors: Make sure all dependencies are installed
- Port conflicts: Change the port in main.py if the default port is busy
- File permissions: Ensure the uploads directory is writable
- CORS issues: Check if your React app is running on the correct port
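To diagnose a port conflict before starting the service, a quick check like the following may help. It is a small standard-library sketch, not part of the project, and `is_port_in_use` is a hypothetical helper name.

```python
import socket


def is_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

For example, `is_port_in_use(8000)` returning True before the service has been started suggests another process already holds that port.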
The service runs in debug mode by default. For production, set debug=False in the app.run() call.
This project is open source and available under the MIT License.