Image-to-text-to-speech

Extract text from images, correct it using NLP, and synthesize it into audio.

Overview

This project extracts text from images, including images in which the text sits on a board or inside another large rectangular area. When text is detected inside a rectangle, the program selects that region automatically; if no text is detected, the region can be selected manually. The extracted text is then processed into coherent paragraphs.

Key Features

Text extraction from images with automatic or manual region selection.
Paragraph region extraction based on vertical histograms.
Line determination within paragraphs.
Word segmentation for each phrase.
Character recognition using a Convolutional Neural Network (CNN) model.
Reconstruction of paragraphs from recognized characters.
Use of ChatGPT for Natural Language Processing (NLP), with web scraping capabilities.
Correction of phrases for improved readability and correctness.
Generation of audio output for the extracted and corrected text.
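
The histogram-based segmentation listed above can be illustrated with a projection profile: count the ink pixels in each row to find line bands, then count the ink pixels in each column of a band to find words. The following is a minimal sketch in pure Python, not the repository's implementation. It assumes a binarized image stored as a list of rows (1 = ink, 0 = background), and the function names and the `word_gap` parameter are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed binary image: 1 = ink, 0 = background
Span = Tuple[int, int]   # (start, end) indices, end exclusive


def row_profile(image: Image) -> List[int]:
    """Count ink pixels in each row."""
    return [sum(row) for row in image]


def column_profile(image: Image) -> List[int]:
    """Count ink pixels in each column."""
    if not image:
        return []
    return [sum(column) for column in zip(*image)]


def find_runs(profile: List[int], min_gap: int = 1) -> List[Span]:
    """Return spans of consecutive non-zero profile values.

    Spans separated by fewer than `min_gap` empty positions are merged,
    so small gaps between characters do not split a word.
    """
    runs: List[Span] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged: List[Span] = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)
    return merged


def segment_lines_and_words(
    image: Image, word_gap: int = 2
) -> List[Tuple[Span, List[Span]]]:
    """Return each line's row span together with its word column spans."""
    result = []
    for top, bottom in find_runs(row_profile(image)):
        band = image[top:bottom]
        words = find_runs(column_profile(band), min_gap=word_gap)
        result.append(((top, bottom), words))
    return result
```

On real photographs the profile is rarely exactly zero between lines, so a noise threshold in place of the `value > 0` test would likely be needed.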

Usage

Install the dependencies listed in requirements.txt.
Run the program and provide an image containing text as input.
The program automatically detects text regions within rectangles. If no text is detected, select the region manually.
The extracted text is segmented into paragraphs, lines, and words.
The CNN model recognizes the characters in each word.
The recognized characters are used to rebuild the paragraphs.
ChatGPT-based NLP corrects and improves the text.
The final corrected text is converted into audio output for accessibility.
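
The step that turns recognized characters back into paragraphs can be sketched as a nested join. This is only an illustration: it assumes the recognizer returns a nested list of paragraphs, lines, words, and characters, and the function name `rebuild_text` is hypothetical and not taken from the repository.

```python
from typing import List

Word = List[str]        # recognized characters of one word
Line = List[Word]       # words of one line
Paragraph = List[Line]  # lines of one paragraph


def rebuild_text(paragraphs: List[Paragraph]) -> str:
    """Join characters into words, words into lines, lines into paragraphs.

    Lines of the same paragraph are joined with a space, and paragraphs
    are separated by a blank line.
    """
    rebuilt = []
    for paragraph in paragraphs:
        lines = [" ".join("".join(word) for word in line) for line in paragraph]
        rebuilt.append(" ".join(lines))
    return "\n\n".join(rebuilt)
```

The resulting string is what would then be passed to the correction step and, afterwards, to speech synthesis.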

