1+ [![Packagist](https://img.shields.io/packagist/v/nilgems/laravel-textract)](https://packagist.org/packages/nilgems/laravel-textract)
12# Laravel Textract
2- A [ Laravel 9 ] ( https://laravel.com/docs/9.x/ ) package to extract text from files like DOC, XL , Image, Pdf and more. I've developed this package by inspiring [ "npm textract" ] ( https://www.npmjs.com/package/textract ) .
3+ A [Laravel](https://laravel.com) package to extract text from files such as DOC, Excel, images, PDF and more.
34
4- [ ![ PHP Version] ( https://img.shields.io/packagist/php-v/NilGems/laravel-textract )] ( https://www.php.net/downloads )
5- [ ![ Language Used] ( https://img.shields.io/github/languages/count/NilGems/laravel-textract )] ( https://github.com/NilGems/laravel-textract/search?l=php )
6- [ ![ Packagist] ( https://img.shields.io/packagist/v/nilgems/laravel-textract )] ( https://packagist.org/packages/nilgems/laravel-textract )
5+ # Versions and compatibility
76
7+ - [Laravel 8](https://laravel.com) or higher is required.
8+ - PHP 7.4 or higher is required.
9+
10+ ### <img src="./blobs/danger.png?raw=true" alt="Note" width="18"> [Laravel 9](https://laravel.com) support has been added.
811
912### Supported file formats
1013The following file formats are currently supported. You need to install the proper extensions
1114on your server to work with all of the file types listed below. The package will
12- check file content MIME type before execute. So with the extension you have maintain
13- the current content type to work this package-
14- - HTML
15- - TEXT
16- - DOC
17- - DOCX
18- - XLS, XLSX, XLSM, XLTX, XLTM, XLT
19- - CSV
20- - PDF
21- - Image
22- - Jpeg
23- - Pdf
24- - ODT
25- - ODS
26- - RTF
27-
28- ### Install
15+ check the file content's MIME type before executing.
16+ - **HTML**
17+ - **TEXT**
18+ - **DOC**
19+ - **DOCX**
20+ - **XLS**, **XLSX**, **XLSM**, **XLTX**, **XLTM**, **XLT**
21+ - **CSV**
22+ - **PDF**
23+ - **Image**
24+   - _jpeg_
25+   - _png_
26+ - **ODT**
27+ - **ODS**
28+ - **RTF**
29+
30+ <img src="./blobs/warning.png?raw=true" alt="Note" width="12"> ***GIF*** and ***PPT*** support is under development.
31+
32+ **We are working hard to make this Laravel package useful. If you find an issue, please post it in the discussions.**
33+
34+ ### Installation
35+
2936```
3037composer require nilgems/laravel-textract
3138```
39+ Once installed, you can use it like this:
40+ ```
41+ # Run the extractor
42+ $output = Textract::run('/path/to/file.extension');
43+
44+ # Display the extracted text
45+ echo $output->text;
46+
47+ # Display the extracted text word count
48+ echo $output->word_count;
49+
50+ # Display the extracted text with direct string conversion
51+ echo (string) $output;
52+ ```
53+
3254### Configuration
33- You don't need to anything special for your laravel application to work with this
34- package.
55+
56+ - You can add the **provider** in ```app.php``` under the ```config``` folder of your
57+ [Laravel](https://laravel.com) project. This is optional: the package automatically loads the service provider in your application.
58+ ```
59+ 'providers' => [
60+ ...
61+ Nilgems\PhpTextract\Providers\ServiceProvider::class,
62+ ...
63+ ]
64+ ```
65+ - Add the **alias** in ```app.php``` under the ```config``` folder of your
66+ [Laravel](https://laravel.com) project. This is optional: the package automatically loads the ```facade``` in your application.
67+ ```
68+ 'aliases' => [
69+ ...
70+ 'Textract' => Nilgems\PhpTextract\Textract::class,
71+ ...
72+ ]
73+ ```
3574### Example
3675Use the ``` Nilge\Textract\Textract ``` facade to run the extractor.
3776```
@@ -47,8 +86,7 @@ In ```php``` there have a restriction of execution time and memory limit defined
4786use Nilge\Textract\Textract;
4887
4988Route::get('/textract', function(){
50- $file_path = ...;
51- return Textract::run($file_path);
89+ return Textract::run('/path/to/image/example.png');
5290});
5391........
5492```
@@ -60,8 +98,7 @@ If you need to specify languages in image file for better extraction output from
6098use Nilge\Textract\Textract;
6199
62100Route::get('/textract', function(){
63- $image_file_path = ...;
64- return Textract::run($file_path, null, [
101+ return Textract::run('/path/to/image/example.png', null, [
65102 'lang' => ['eng', 'jpn', 'spa']
66103 ]);
67104});
@@ -71,10 +108,10 @@ Route::get('/textract', function(){
71108- To enable the image extraction feature you need to install [ Tesseract OCR] ( https://github.com/tesseract-ocr/tesseract )
72109- To enable the PDF extraction feature you need to install [ pdftotext] ( http://www.xpdfreader.com/download.html )
73110- To work properly, your server must have the following PHP extensions installed:
74- - ext-fileinfo
75- - ext-zip
76- - ext-gd or ext-imagick
77- - ext-xml
111+ - **ext-fileinfo**
112+ - **ext-zip**
113+ - **ext-gd** or **ext-imagick**
114+ - **ext-xml**
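The requirements above can be checked from a terminal before using the package. The following is a minimal sketch, not part of the package: `check_command` and `check_php_ext` are hypothetical helper names, and the script assumes a POSIX shell with the `php` CLI on the `PATH`. It only reports what it finds and never aborts. It also assumes the `lang` option relies on Tesseract's installed language data, which is why it lists the installed languages.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on PATH.
check_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Hypothetical helper: report whether a PHP extension is loaded.
# Reports "missing" when the php CLI itself is not installed.
check_php_ext() {
    if php -r "exit(extension_loaded('$1') ? 0 : 1);" >/dev/null 2>&1; then
        echo "ok: ext-$1"
    else
        echo "missing: ext-$1"
        return 1
    fi
}

# External tools used for image (OCR) and PDF extraction.
if check_command tesseract; then
    # List installed language data (e.g. eng, jpn, spa).
    tesseract --list-langs || true
fi
check_command pdftotext || true

# Required PHP extensions.
for ext in fileinfo zip xml; do
    check_php_ext "$ext" || true
done

# Either gd or imagick satisfies the image requirement.
if check_php_ext gd >/dev/null || check_php_ext imagick >/dev/null; then
    echo "ok: ext-gd or ext-imagick"
else
    echo "missing: ext-gd or ext-imagick"
fi
```

Any line starting with `missing:` points at a tool or extension to install before the corresponding file types can be processed.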
78115### Tesseract OCR Installation
79116#### <img src="https://raw.githubusercontent.com/NilGems/laravel-textract/master/blobs/ubuntu.png" width="12" alt="Ubuntu" /> Ubuntu
80117- Update the system: ``` sudo apt update ```
@@ -103,9 +140,6 @@ Route::get('/textract', function(){
103140---
104141## 💻 Tech Stack
105142                    
106- ## 📊 GitHub Stats :
107- ![ ] ( https://github-readme-stats.vercel.app/api?username=NilGems&theme=radical&hide_border=false&include_all_commits=false&count_private=false ) <br />
108- ![ ] ( https://github-readme-stats.vercel.app/api/top-langs/?username=NilGems&theme=radical&hide_border=false&include_all_commits=false&count_private=false&layout=compact )
109143
110144---
111145[![](https://visitcount.itsvg.in/api?id=NilGems&icon=0&color=0)](https://visitcount.itsvg.in)