
diyapratheep commented Oct 8, 2025

Closes #126

Description

This pull request adds support for parsing and evaluating resumes in .docx format, making the hiring agent more versatile. It also includes a bug fix to make the evaluation model more robust.

Changes Made

  • New Dependency: Added python-docx to requirements.txt.
  • File Handling: Modified score.py to check the input file's extension and call the appropriate parsing function (.pdf or .docx).
  • Caching: Fixed the caching logic in score.py to correctly generate cache filenames for any supported file type.
  • DOCX Parsing: Created a new extract_json_from_docx method in pdf.py that leverages the existing LLM pipeline to ensure consistent, structured data output.
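The extension check and cache naming described above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `cache_filename`, `parse_resume`, the `cache` directory, and the `parsers` mapping are hypothetical names, and the actual logic in score.py may differ.

```python
from pathlib import Path
from typing import Callable, Dict

# Extensions the PR description says are supported.
SUPPORTED_EXTENSIONS = (".pdf", ".docx")


def cache_filename(resume_path: str, cache_dir: str = "cache") -> Path:
    """Build a cache path for a resume (hypothetical naming scheme)."""
    path = Path(resume_path)
    ext = path.suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported resume format: {ext or '(none)'}")
    # Keep the extension in the name so resume.pdf and resume.docx
    # do not overwrite each other's cached results.
    return Path(cache_dir) / f"{path.stem}_{ext.lstrip('.')}.json"


def parse_resume(resume_path: str, parsers: Dict[str, Callable[[str], dict]]) -> dict:
    """Route the file to the parser registered for its extension."""
    ext = Path(resume_path).suffix.lower()
    try:
        parser = parsers[ext]
    except KeyError:
        raise ValueError(f"Unsupported resume format: {ext or '(none)'}") from None
    return parser(resume_path)
```

In this sketch the caller would register one parser per extension, for example `{".pdf": ..., ".docx": ...}`, so adding another format later only means adding another entry.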
[Screenshot: proof of .docx parsing working]

How to Test

  1. Ensure the new dependency is installed: pip install python-docx
  2. Run the agent with a .docx file: python score.py /path/to/your/resume.docx
  3. Verify that the evaluation completes successfully.

This change introduces the capability to parse and evaluate resumes in .docx format, expanding the agent's functionality beyond PDFs.
- Added the `python-docx` library to handle .docx file parsing.
- Modified `score.py` to dynamically detect the file extension (.pdf or .docx) and route it to the appropriate parsing function.
- Updated the caching mechanism in `score.py` to generate correct filenames for both file types.
- Refactored `pdf.py` by creating a new `extract_json_from_docx` method. This method reuses the existing core LLM logic to convert the extracted text into the structured JSONResume format, ensuring consistency.
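
For reviewers unfamiliar with `python-docx`, the text-extraction step that would feed the LLM pipeline might look something like the sketch below. It is an assumption about the approach, not the PR's actual code: `extract_text_from_docx` is a hypothetical helper name, and the real `extract_json_from_docx` may gather text differently.

```python
def extract_text_from_docx(docx_path: str) -> str:
    """Return the visible text of a .docx file as newline-separated lines."""
    # Imported here so PDF-only runs do not need python-docx installed.
    from docx import Document

    document = Document(docx_path)

    # Body paragraphs, skipping empty ones.
    lines = [p.text.strip() for p in document.paragraphs if p.text.strip()]

    # Resumes often place skills or dates in tables, which are not
    # part of document.paragraphs, so walk the table cells as well.
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                text = cell.text.strip()
                if text:
                    lines.append(text)

    return "\n".join(lines)
```

The returned string could then be passed to the same prompt-and-parse step already used for PDF text, which is what keeps the JSONResume output consistent across formats.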

diyapratheep commented Oct 8, 2025

@sp2hari @anxkhn-hacker This PR is ready for review. It adds support for .docx files.
FYI, I used a dummy resume for testing, so the low evaluation score is expected and can be ignored. The core parsing works.

Let me know if any changes are needed! If not, I'd like to request that the issue and PR be closed out soon. Thank you!


Development

Successfully merging this pull request may close these issues.

Feature Request: Add support for .docx resume files
