A Gradio-powered app using LangChain and Hugging Face models for multilingual translation, supporting 10+ language pairs. Features enhanced Cantonese translation, dual modes, GPU acceleration, a user-friendly UI, and batch processing capabilities for seamless text translation.

WWIIITT/Language_Translation_Model

Language Translation App with LangChain

A powerful multilingual translation application built with Gradio, LangChain, and Hugging Face Transformers. This app provides an intuitive web interface for translating text between multiple languages, with special support for Cantonese translation.

✨ Features

  • Multiple Language Support: Translate across 10+ language pairs covering English, French, Spanish, German, Italian, Portuguese, Chinese, and Cantonese
  • Cantonese Specialization: Enhanced Cantonese translation using Facebook's NLLB-200 model
  • Dual Translation Modes: Switch between LangChain integration and direct HuggingFace pipeline
  • GPU Acceleration: Automatic GPU detection and utilization for faster translations
  • User-Friendly Interface: Clean, modern Gradio interface with examples and clear instructions
  • Batch Processing Ready: Infrastructure for batch translation capabilities
  • Context-Aware Translation: Framework for context-sensitive translations

🚀 Quick Start

Prerequisites

  • Python 3.9 or higher
  • CUDA-capable GPU (optional, for acceleration)

Installation

  1. Clone the repository:
git clone https://github.com/WWIIITT/Language_Translation_Model.git
cd Language_Translation_Model
  2. Create a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt

Running the Application

python LTM.py

The app will launch and print a local URL (typically http://localhost:7860). If share=True is passed to Gradio's launch() call, it will also print a temporary public URL for sharing.

📋 Requirements

Create a requirements.txt file with:

gradio>=4.0.0
langchain>=0.1.0
transformers>=4.30.0
torch>=2.0.0
sentencepiece>=0.1.99
protobuf>=3.20.0
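
To confirm that an environment meets these minimums before launching the app, a small standard-library check can help. This is a sketch rather than part of the project: the helper names (`version_tuple`, `check_requirements`) are made up for illustration, and the version comparison is deliberately simple (it only looks at the leading numeric components).

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements list above.
MINIMUMS = {
    "gradio": "4.0.0",
    "langchain": "0.1.0",
    "transformers": "4.30.0",
    "torch": "2.0.0",
    "sentencepiece": "0.1.99",
    "protobuf": "3.20.0",
}


def version_tuple(version, width=4):
    """Turn '2.1.0+cu118' into (2, 1, 0, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad with zeros so that '2' and '2.0.0' compare as equal.
    parts.extend([0] * (width - len(parts)))
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return {package: 'ok' | 'missing' | 'outdated (...)'} for each entry."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if version_tuple(installed) >= version_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = f"outdated ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Any package reported as missing or outdated can then be fixed with pip install -r requirements.txt.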

🗺️ Supported Translation Pairs

Standard Models (Helsinki-NLP OPUS-MT)

  • English ↔ French
  • English ↔ Spanish
  • English ↔ German
  • English ↔ Italian
  • English ↔ Portuguese
  • English ↔ Chinese (Simplified/Traditional)
  • Cantonese/Chinese → English

Enhanced Cantonese Support (Facebook NLLB-200)

  • English ↔ Cantonese (Alternative) - Better quality for Cantonese

💡 Usage

Basic Translation

  1. Select your translation direction from the dropdown
  2. Enter the text you want to translate
  3. Click the "🔄 Translate" button
  4. View the translated result

Advanced Options

  • LangChain Mode: Toggle between LangChain integration and direct pipeline for different processing approaches
  • Examples: Click on any example to quickly test the translation

API Usage

You can also use the translation functionality programmatically:

from LTM import TranslationApp

# Initialize the app
app = TranslationApp()

# Translate text
translated = app.translate_with_langchain(
    "Hello, world!", 
    "English to French"
)
print(translated)  # e.g. "Bonjour, le monde!" (exact wording depends on the model)

๐Ÿ—๏ธ Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”     โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”     โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   Gradio UI     โ”‚โ”€โ”€โ”€โ”€โ–ถโ”‚  TranslationApp  โ”‚โ”€โ”€โ”€โ”€โ–ถโ”‚  HuggingFace    โ”‚
โ”‚   Interface      โ”‚     โ”‚     Class        โ”‚     โ”‚   Models        โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜     โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜     โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                               โ”‚                           โ”‚
                               โ–ผ                           โ”‚
                        โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”                   โ”‚
                        โ”‚  LangChain   โ”‚โ—€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                        โ”‚  Pipeline    โ”‚
                        โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

🔧 Configuration

Model Configuration

Models are defined in the TRANSLATION_MODELS and ALTERNATIVE_MODELS dictionaries. To add new models:

TRANSLATION_MODELS["English to NewLanguage"] = "model-name-here"
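
As a rough illustration of how the two dictionaries might fit together, the sketch below registers a new direction and looks up the model for a given direction. The dictionary entries and the `register_model` and `resolve_model` helpers are assumptions made for this example; the README names the dictionaries but does not list their contents. The model identifiers are real Hugging Face model IDs, though whether the app uses exactly these is not confirmed here.

```python
# Hypothetical registry contents, for illustration only.
TRANSLATION_MODELS = {
    "English to French": "Helsinki-NLP/opus-mt-en-fr",
    "French to English": "Helsinki-NLP/opus-mt-fr-en",
}
ALTERNATIVE_MODELS = {
    "English to Cantonese (Alternative)": "facebook/nllb-200-distilled-600M",
}


def register_model(direction, model_name, registry=None):
    """Add a direction -> model mapping, refusing blanks and silent overwrites."""
    if registry is None:
        registry = TRANSLATION_MODELS
    if not direction.strip() or not model_name.strip():
        raise ValueError("direction and model_name must be non-empty")
    if direction in registry:
        raise KeyError(f"{direction!r} is already registered")
    registry[direction] = model_name


def resolve_model(direction):
    """Look up a direction in the standard models first, then the alternatives."""
    for registry in (TRANSLATION_MODELS, ALTERNATIVE_MODELS):
        if direction in registry:
            return registry[direction]
    raise KeyError(f"No model configured for {direction!r}")


if __name__ == "__main__":
    register_model("English to German", "Helsinki-NLP/opus-mt-en-de")
    print(resolve_model("English to German"))
    print(resolve_model("English to Cantonese (Alternative)"))
```

Checking the standard dictionary before the alternative one is a design choice of this sketch; the app itself may select between them differently.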

Performance Tuning

  • GPU Usage: Automatically detected and used when available
  • Max Length: Default 512 tokens, adjustable in pipeline configuration
  • Batch Size: Can be modified for batch processing needs
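
One way to experiment with batch size is to split the inputs into fixed-size chunks and hand each chunk to a translation callable. The sketch below is not the app's batch infrastructure; `chunked` and `translate_batch` are hypothetical helpers, and `translate_fn` stands in for whatever function takes a list of strings and returns a list of translations.

```python
def chunked(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def translate_batch(texts, translate_fn, batch_size=8):
    """Translate texts in chunks, preserving the input order."""
    results = []
    for batch in chunked(texts, batch_size):
        results.extend(translate_fn(batch))
    return results


if __name__ == "__main__":
    # Stand-in for a real model call so the sketch runs without downloads.
    def fake_translate(batch):
        return [text.upper() for text in batch]

    print(translate_batch(["hello", "good morning", "thank you"], fake_translate, batch_size=2))
```

Smaller batches reduce peak memory use, while larger ones usually improve throughput on a GPU, so the right value depends on the hardware and the model.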

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Development Setup

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Format code
black LTM.py

๐Ÿ› Troubleshooting

Common Issues

  1. Model Loading Errors

    • Ensure you have stable internet for first-time model downloads
    • Check available disk space (models can be 1-2GB each)
  2. GPU Not Detected

    • Verify CUDA installation: python -c "import torch; print(torch.cuda.is_available())"
    • Install appropriate PyTorch version for your CUDA version
  3. Cantonese Translation Issues

    • NLLB model requires additional memory (4GB+)
    • Ensure sentencepiece is properly installed
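
Several of the checks above can be bundled into one quick diagnostic. The following is a sketch under stated assumptions: `diagnose` is a hypothetical helper, the 2 GB disk threshold is taken from the upper end of the per-model size quoted above, and available RAM is not checked because the standard library offers no portable way to do so.

```python
import importlib.util
import shutil


def diagnose(cache_dir=".", min_free_gb=2.0):
    """Collect basic environment facts relevant to the issues listed above."""
    report = {}

    # Disk space on the volume that holds cache_dir.
    free_gb = shutil.disk_usage(cache_dir).free / 1024**3
    report["free_disk_gb"] = round(free_gb, 1)
    report["enough_disk"] = free_gb >= min_free_gb

    # sentencepiece is needed by the tokenizers of many translation models.
    report["sentencepiece_installed"] = (
        importlib.util.find_spec("sentencepiece") is not None
    )

    # Only import torch if it is installed; None means "could not check".
    if importlib.util.find_spec("torch") is None:
        report["cuda_available"] = None
    else:
        import torch

        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(f"{key}: {value}")
```

Pass the directory where models are cached as cache_dir if it lives on a different volume from the project.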

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

🔮 Roadmap

  • Add more language pairs
  • Implement document translation
  • Add translation quality metrics
  • Create REST API endpoint
  • Add translation history
  • Implement custom model fine-tuning
  • Add support for audio translation

⭐️ If you find this project useful, please consider giving it a star!
