Automagically generate German alt texts for web images using AI.
- Streamlit Web App and CLI Tool: Input any URL to analyze all images
- AI-powered Alt-Text Generation: German alt-text using LLMs via OpenRouter
- Context Analysis: Considers image content, surrounding text (headings, paragraphs, captions) and existing alt-text
- Excel Export: Export all results as an Excel file
- Parallel Processing: Fast batch processing with configurable worker threads
- Multiple Image Formats: Support for JPEG, PNG, and WebP formats
- Batch Processing of Multiple Web Pages: CLI tool can process URL lists
# Clone the repository
git clone https://github.com/statistikZH/alt-text-generator.git
cd alt-text-generator
# Install uv (if not already installed)
pip3 install uv
# Create virtual environment and install dependencies
uv venv
source .venv/bin/activate # On macOS/Linux
# or .venv\Scripts\activate # On Windows
uv sync
Configure the environment variables:
- The app automatically loads environment variables from
.env
or the configured path inconfig.yaml
- Set your OpenRouter API key:
export OPENROUTER_API_KEY="your_api_key_here"
Or create a .env
file in the project root:
OPENROUTER_API_KEY=your_api_key_here
Currently, the application uses Google Gemini 2.5 Flash via OpenRouter. You can easily change this by simply inserting a different OpenRouter model ID in config.yaml
.
Warning
Your data will be processed through OpenRouter by third parties. Please ensure you only input non-sensitive information.
- Start the Streamlit app:
streamlit run main.py
- Open in browser (http://localhost:8501 by default)
The CLI provides various processing capabilities with four main commands:
Using the convenience script (recommended):
# Generate alt-text for a single image
./generate-alt-text-cli single image.jpg
# Process multiple images from URL list of images
./generate-alt-text-cli batch urls.txt --output results.json
# Process all images from a webpage
./generate-alt-text-cli scrape https://example.com --output webpage_images.xlsx --format excel
# Batch process images from multiple webpages
./generate-alt-text-cli websites website_urls.txt --output all_websites.xlsx --format excel
Direct usage with uv:
uv run python _cli/cli.py single image.jpg --context "Swiss building in Zurich from 2025"
uv run python _cli/cli.py batch urls.txt --workers 10 --output results.json
Generate alt text for a single image file or URL.
./generate-alt-text-cli single <image> [options]
Arguments:
image
(required): Path to local image file or image URL
Options:
--context
,-c
: Additional context information for the image (default: empty)--alt-text
,-a
: Existing alt text to improve upon (default: empty)
Examples:
./generate-alt-text-cli single photo.jpg
./generate-alt-text-cli single https://example.com/image.png --context "Swiss building in Zurich from 2025"
./generate-alt-text-cli single image.jpg --alt-text "Old description" --context "Swiss building in Zurich from 2025"
Process multiple images from a text file containing image URLs.
./generate-alt-text-cli batch <file> [options]
Arguments:
file
(required): Text file containing image URLs (one per line)
Options:
--context
,-c
: Additional context for all images (default: empty)--workers
,-w
: Number of parallel workers (default: configured max_workers)--output
,-o
: Output file for results--format
,-f
: Output format, choices:json
,excel
(default: json)
Examples:
./generate-alt-text-cli batch image_urls.txt
./generate-alt-text-cli batch urls.txt --output results.json --workers 5
./generate-alt-text-cli batch urls.txt --context "Participants of an administration meeting" --format excel
Scrape and process all images found on a specific webpage.
./generate-alt-text-cli scrape <url> [options]
Arguments:
url
(required): URL of the webpage to scrape for images
Options:
--workers
,-w
: Number of parallel workers (default: configured max_workers)--output
,-o
: Output file for results--format
,-f
: Output format, choices:json
,excel
(default: json)
Examples:
./generate-alt-text-cli scrape https://example.com
./generate-alt-text-cli scrape https://example.com --output webpage_images.xlsx --format excel
./generate-alt-text-cli scrape https://example.com --workers 8 --output results.json
Process images from multiple websites listed in a text file.
./generate-alt-text-cli websites <file> [options]
Arguments:
file
(required): Text file containing website URLs (one per line)
Options:
--workers
,-w
: Number of parallel workers (default: configured max_workers)--output
,-o
: Output file for results--format
,-f
: Output format, choices:json
,excel
(default: json)
Examples:
./generate-alt-text-cli websites website_urls.txt
./generate-alt-text-cli websites sites.txt --output all_websites.xlsx --format excel
./generate-alt-text-cli websites urls.txt --workers 4 --output combined_results.json
All commands support the following patterns:
- Output formats: Choose between JSON or Excel formats
- Parallel processing: Adjust the number of worker threads for faster processing
- Context information: Provide additional context to improve alt text quality
The project is organized into several core modules:
-
_core/
: Core functionality modulescli_processor.py
: Parallel processing of multiple images or web pagesllm_processing.py
: AI-powered alt-text generation using OpenRouterweb_scraper.py
: Web page analysis and image extractionexporter.py
: Export functionality (Excel, JSON)models.py
: Data models for image informationconfig.py
: Configuration managementlogger.py
: Logging utilitiesprompts.py
: AI prompts for alt-text generationutils.py
: Utility functionsapp_info.py
: Configure UI textsample_urls.py
: Configure example URLstext_to_remove.py
: List of texts to remove from parsed content surrounding the image, e.g., text of navigational elements
-
_cli/
: Command-line interfacecli.py
: Main CLI application
-
_tests/
: Test suite -
main.py
: Streamlit web app -
config.yaml
: Application configuration -
generate-alt-text-cli
: Convenience script for CLI usage
This tool was developed for the cantonal administration to help content creators generate accessible alt-texts for web content. It is designed to create drafts for German alt-texts that comply with web accessibility standards.
- Simone Luchetta, Hannes Weber - Team Informationszugang & Dialog, Staatskanzlei
- Céline Colombo - Koordinationsstelle Teilhabe, Statistisches Amt
- Chantal Amrhein, Patrick Arnecke – Statistisches Amt Zürich, Team Data
Thanks to Corinna Grobe, Thomas Hofer and Roger Zedi for their valuable contributions.
We welcome feedback and contributions! Email us or open an issue or pull request.
We use ruff
for linting and formatting.
Install pre-commit hooks for automatic checks before committing:
pre-commit install
# Run the test suite
uv run python -m pytest
# Run with coverage
uv run python -m pytest --cov=_core
This project is licensed under the MIT License. See the LICENSE file for details.
This software incorporates AI models and has been developed according to and with the intent to be used under Swiss law. Please be aware that the EU Artificial Intelligence Act (EU AI Act) may, under certain circumstances, be applicable to your use of the software. You are solely responsible for ensuring that your use of the software as well as the underlying AI models complies with all applicable local, national and international laws and regulations. By using this software, you acknowledge and agree (a) that it is your responsibility to assess which laws and regulations, in particular regarding the use of AI technologies, are applicable to your intended use and to comply therewith, and (b) that you will hold us harmless from any action, claims, liability or loss in respect of your use of the software.