This is a local Python application that evaluates spoken German sentences using OpenAI's Whisper model. Users upload audio (`.webm`, typically sent from a frontend), and the app returns a pronunciation score and feedback against a set of expected sentences.
```
├── main.py            # Backend logic (or whisper_utils.py for modular usage)
├── whisper_utils.py   # (Optional) Modular Whisper processing
├── requirements.txt   # Python dependencies
├── render.yaml        # (Ignore if running locally)
├── temp_audio.webm    # Temp uploaded audio file (auto-created/deleted)
└── __pycache__/       # Python cache
```
- Accepts an audio recording (WebM format).
- Transcribes the audio using OpenAI Whisper.
- Compares it against a predefined sentence (based on `phrase_id`).
- Scores the pronunciation using character-level similarity (see the sketch after this list).
- Outputs:
  - Expected vs spoken sentence
  - Accuracy score
  - Mispronounced letters
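The scoring step can be pictured as a character-level diff. A minimal sketch, assuming Python's standard `difflib` (the function name `score_pronunciation` is illustrative, not the app's actual API):

```python
from difflib import SequenceMatcher

def score_pronunciation(expected: str, spoken: str) -> tuple[float, list[str]]:
    """Return a 0-100 similarity score plus expected letters never matched."""
    matcher = SequenceMatcher(None, expected.lower(), spoken.lower())
    score = round(matcher.ratio() * 100, 1)

    # Indices of expected characters covered by some matching block.
    matched = set()
    for block in matcher.get_matching_blocks():
        matched.update(range(block.a, block.a + block.size))

    missed = [ch for i, ch in enumerate(expected.lower())
              if i not in matched and ch.isalpha()]
    return score, missed
```

With the README's example inputs (`"Ich bin müde"` vs `"und bin mut"`) this yields the mispronounced letters i, c, h, ü, d, e; the exact score depends on the similarity formula used, so the app's 50.0 may be computed slightly differently.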
```bash
git clone https://github.com/krish-1010/whisper-backend
cd whisper-backend
python -m venv venv
source venv/bin/activate   # on Windows: venv\Scripts\activate
pip install -r requirements.txt
```
Dependencies include:
- `openai-whisper`
- `ffmpeg-python` (ensure system `ffmpeg` is installed)
- `uvicorn`, `fastapi` (for API usage)
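A plausible `requirements.txt` for this stack might look like the following (versions omitted; the repo's actual file may differ, and `python-multipart` is listed because FastAPI requires it for multipart file uploads):

```
openai-whisper
ffmpeg-python
fastapi
uvicorn
python-multipart
```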
Download ffmpeg:
- Windows: https://www.gyan.dev/ffmpeg/builds/
- Linux/macOS: via `brew` or `apt`

Ensure `ffmpeg` is on your system PATH.
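To verify the install, run:

```bash
ffmpeg -version   # should print version info if ffmpeg is on PATH
```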
```bash
uvicorn main:app --reload
```
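If you are wiring this up yourself, the endpoint in `main.py` presumably looks roughly like the sketch below. This is a minimal illustration, not the repo's exact code: it assumes FastAPI's `UploadFile` and the standard `whisper` package, and `score_pronunciation` is the hypothetical helper sketched earlier.

```python
import whisper
from fastapi import FastAPI, File, HTTPException, UploadFile

app = FastAPI()
model = whisper.load_model("base")  # "tiny"/"small" are faster alternatives

EXPECTED_PHRASES = {"1-1": "Ich bin müde", "2-3": "Kannst du helfen?"}

@app.post("/evaluate/{phrase_id}")
async def evaluate(phrase_id: str, file: UploadFile = File(...)):
    if phrase_id not in EXPECTED_PHRASES:
        raise HTTPException(status_code=404, detail="Unknown phrase_id")
    # Whisper transcribes from a file path, so persist the upload temporarily.
    with open("temp_audio.webm", "wb") as f:
        f.write(await file.read())
    result = model.transcribe("temp_audio.webm", language="de")
    spoken = result["text"].strip()
    expected = EXPECTED_PHRASES[phrase_id]
    score, missed = score_pronunciation(expected, spoken)  # see sketch above
    return {"expected": expected, "spoken": spoken, "score": score,
            "feedback": f"🗣️ Mispronounced letters: {', '.join(missed)}"}
```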
```bash
curl -X POST "http://localhost:8000/evaluate/1-1" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@your_audio_file.webm"
```
You can find all supported sentence IDs and phrases inside `main.py` (or `whisper_utils.py`) under `EXPECTED_PHRASES`. Example:

```python
"1-1": "Ich bin müde",
"2-3": "Kannst du helfen?",
```
```json
{
  "expected": "Ich bin müde",
  "spoken": "und bin mut",
  "score": 50.0,
  "feedback": "🗣️ Mispronounced letters: i, c, h, ü, d, e"
}
```
- WebM audio is expected (convert if needed).
- You may extend this with a frontend or voice recording interface.
- For offline usage, the `tiny` or `small` Whisper models are ideal (see the snippet below).
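For example, a quick offline transcription check with the tiny model (standard `whisper` API; the model is downloaded once, then cached locally):

```python
import whisper

model = whisper.load_model("tiny")  # smallest model; fast and cache-friendly
result = model.transcribe("temp_audio.webm", language="de")
print(result["text"])
```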
MIT