darrrio/xml-invoices-extractor

XML Invoices Extractor

A Python tool to extract data from Italian electronic invoices (XML format) and create formatted Excel reports.

Features

  • Processes Italian electronic invoices in XML format
  • Extracts key invoice information including:
    • Invoice numbers and dates
    • Supplier and customer details
    • Line items with quantities and prices
    • VAT rates and totals
  • Generates formatted Excel reports with:
    • Proper data types (dates, currencies, percentages)
    • Auto-sized columns
    • Formatted headers
    • Euro currency symbols
    • Thousand separators

Prerequisites

  • Docker Desktop (for development container)
  • VS Code with Remote Development extension
  • Python 3.6 or higher

Development Environment

  1. Clone the repository
  2. Open in VS Code
  3. When prompted, click "Reopen in Container", or open the Command Palette (F1) and run "Remote-Containers: Reopen in Container" (listed as "Dev Containers: Reopen in Container" in current VS Code releases)
  4. Wait for the container to build and start

Project Structure

xml-invoices-extractor/
├── input-unsigned/     # Place your XML invoice files here
├── process_invoices.sh # Bash script to process invoices
├── xml_invoices_extractor.py
└── README.md

Usage

Using the Bash Script (required for signed *.p7m invoices)

  1. Place your signed P7M invoice files in the input folder
  2. Make the script executable:
chmod +x process_invoices.sh
  3. Run the bash script:
./process_invoices.sh
  4. Unsigned invoices will be available in the input-unsigned folder
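
This README does not say how process_invoices.sh strips the signature. One common way to unwrap a signed .p7m file is `openssl smime -verify -noverify`, and the sketch below builds such a command in Python. It assumes DER-encoded input (some .p7m files are Base64-encoded and would need decoding first); the function name `unsign_command` is hypothetical and not part of this repository.

```python
from pathlib import Path


def unsign_command(p7m_path, out_dir="input-unsigned"):
    """Build an openssl command that extracts the XML payload of a .p7m file.

    Assumption: the file is a DER-encoded PKCS#7 envelope. This is an
    illustration only, not necessarily what process_invoices.sh does.
    """
    p7m_path = Path(p7m_path)
    # invoice.xml.p7m -> invoice.xml (drop only the last suffix)
    out_path = Path(out_dir) / p7m_path.with_suffix("").name
    return [
        "openssl", "smime",
        "-verify",      # unwrap the signed content
        "-noverify",    # do not validate the signer's certificate chain
        "-inform", "DER",
        "-in", str(p7m_path),
        "-out", str(out_path),
    ]


if __name__ == "__main__":
    print(" ".join(unsign_command("input/example.xml.p7m")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where openssl is installed.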

Using the Python Script Directly

  1. Place your XML invoice files in the input-unsigned folder (Optional if you used the Bash script)
  2. Run the script:
python xml_invoices_extractor.py
  3. Find the generated Excel report (invoice_report.xlsx) in the project root directory

Input File Format

The script expects Italian electronic invoices in XML format with the following structure:

  • Root element: <p:FatturaElettronica>
  • Namespace: http://ivaservizi.agenziaentrate.gov.it/docs/xsd/fatture/v1.2
  • Key elements:
    • Numero (Invoice number)
    • Data (Invoice date)
    • DettaglioLinee (Line items)
    • ImportoTotaleDocumento (Total amount)
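
As a rough illustration of that structure, the sketch below reads the listed elements with the standard library. The container name `DatiGeneraliDocumento`, the line-item child elements in the sample, and the function `parse_invoice` are assumptions for the example; the actual script may navigate the document differently.

```python
import xml.etree.ElementTree as ET

NS = "http://ivaservizi.agenziaentrate.gov.it/docs/xsd/fatture/v1.2"

# Tiny hand-written sample; only the root element carries the namespace.
SAMPLE = """<p:FatturaElettronica xmlns:p="http://ivaservizi.agenziaentrate.gov.it/docs/xsd/fatture/v1.2">
  <FatturaElettronicaBody>
    <DatiGenerali>
      <DatiGeneraliDocumento>
        <Data>2024-01-15</Data>
        <Numero>42</Numero>
        <ImportoTotaleDocumento>122.00</ImportoTotaleDocumento>
      </DatiGeneraliDocumento>
    </DatiGenerali>
    <DatiBeniServizi>
      <DettaglioLinee>
        <NumeroLinea>1</NumeroLinea>
        <Descrizione>Consulenza</Descrizione>
        <Quantita>2.00</Quantita>
        <PrezzoUnitario>50.00</PrezzoUnitario>
        <PrezzoTotale>100.00</PrezzoTotale>
        <AliquotaIVA>22.00</AliquotaIVA>
      </DettaglioLinee>
    </DatiBeniServizi>
  </FatturaElettronicaBody>
</p:FatturaElettronica>"""


def parse_invoice(xml_text):
    """Return the key fields of one invoice as raw strings."""
    root = ET.fromstring(xml_text)
    # ElementTree expands the p: prefix to {namespace}localname.
    if root.tag != f"{{{NS}}}FatturaElettronica":
        raise ValueError(f"unexpected root element: {root.tag}")
    general = root.find(".//DatiGeneraliDocumento")
    if general is None:
        raise ValueError("DatiGeneraliDocumento not found")
    # Each line item becomes a dict of child tag -> text.
    lines = [
        {child.tag: child.text for child in line}
        for line in root.iter("DettaglioLinee")
    ]
    return {
        "Numero": general.findtext("Numero"),
        "Data": general.findtext("Data"),
        "ImportoTotaleDocumento": general.findtext("ImportoTotaleDocumento"),
        "DettaglioLinee": lines,
    }


if __name__ == "__main__":
    print(parse_invoice(SAMPLE))
```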

Output Format

The generated Excel file includes the following columns:

  • Invoice Number (text)
  • Invoice Date (date format)
  • Supplier Name (text)
  • Customer Name (text)
  • Line Number (number)
  • Description (text)
  • Quantity (number)
  • Unit Price (currency)
  • Total Price (currency)
  • VAT Rate (percentage)
  • Invoice Total Amount (currency)
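
To show how raw XML strings might map onto these typed columns, here is a minimal sketch that builds one report row per line item. The input dictionary keys and the function `build_row` are hypothetical, and storing the VAT rate as a fraction (22.00 becomes 0.22) is an assumption made so that a percentage cell format would display 22%.

```python
from datetime import datetime
from decimal import Decimal

COLUMNS = [
    "Invoice Number", "Invoice Date", "Supplier Name", "Customer Name",
    "Line Number", "Description", "Quantity", "Unit Price", "Total Price",
    "VAT Rate", "Invoice Total Amount",
]


def build_row(invoice, line):
    """Convert raw string fields into typed values, keyed by column name."""
    return {
        "Invoice Number": invoice["number"],  # kept as text
        "Invoice Date": datetime.strptime(invoice["date"], "%Y-%m-%d").date(),
        "Supplier Name": invoice["supplier"],
        "Customer Name": invoice["customer"],
        "Line Number": int(line["number"]),
        "Description": line["description"],
        "Quantity": Decimal(line["quantity"]),
        "Unit Price": Decimal(line["unit_price"]),
        "Total Price": Decimal(line["total_price"]),
        # Assumption: percentages are stored as fractions for cell formatting.
        "VAT Rate": Decimal(line["vat_rate"]) / 100,
        "Invoice Total Amount": Decimal(invoice["total"]),
    }


if __name__ == "__main__":
    invoice = {"number": "42", "date": "2024-01-15", "supplier": "Alfa S.r.l.",
               "customer": "Beta S.p.A.", "total": "122.00"}
    line = {"number": "1", "description": "Consulenza", "quantity": "2.00",
            "unit_price": "50.00", "total_price": "100.00", "vat_rate": "22.00"}
    print(build_row(invoice, line))
```

Decimal is used here instead of float to avoid rounding surprises with currency amounts; that is a design choice of the sketch, not something the README specifies.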

Error Handling

The script includes error handling for:

  • Missing XML files or folders
  • Invalid XML structure
  • Missing data fields
  • Non-numeric values in numeric fields
  • File access and permissions issues
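
The cases above could be handled along the following lines. This is a minimal sketch under assumptions, not the script's actual code: the helper names `to_number` and `load_invoices`, the fallback value, and the error message wording are all hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def to_number(text, default=0.0):
    """Convert a field to float, falling back on missing or non-numeric input."""
    try:
        return float(text)
    except (TypeError, ValueError):  # None (missing field) or text like "abc"
        return default


def load_invoices(folder):
    """Parse every *.xml file in folder; return (roots, error_messages)."""
    folder = Path(folder)
    if not folder.is_dir():
        return [], [f"{folder}: folder not found"]
    roots, errors = [], []
    for path in sorted(folder.glob("*.xml")):
        try:
            roots.append(ET.parse(str(path)).getroot())
        except ET.ParseError as exc:  # invalid XML structure
            errors.append(f"{path.name}: invalid XML ({exc})")
        except OSError as exc:  # file access and permission problems
            errors.append(f"{path.name}: cannot read file ({exc})")
    if not roots and not errors:
        errors.append(f"{folder}: no XML files found")
    return roots, errors


if __name__ == "__main__":
    parsed, problems = load_invoices("input-unsigned")
    print(f"{len(parsed)} invoice(s) parsed")
    for message in problems:
        print("warning:", message)
```

Collecting errors in a list rather than raising on the first failure lets one bad file be reported without stopping the rest of the batch.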

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a new Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
