GitHub - siddhant-08/topic_model_R

author_scrap

This function takes an author's name as input, searches for the author on the Project Gutenberg site, and saves all of the author's works as a text file.
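As a rough illustration of the first and last steps, the sketch below builds a search URL for an author and a file path for the saved text. The helper names `gutenberg_search_url` and `author_file_path` are hypothetical and are not taken from author_scrap.R. The search URL pattern is an assumption about the Project Gutenberg site, and the scraping of the result pages is left out.

```r
# Hypothetical helpers for illustration; not the code in author_scrap.R.

# Build a search URL for an author.
# The URL pattern is an assumption about the Project Gutenberg site.
gutenberg_search_url <- function(author) {
  query <- utils::URLencode(trimws(author), reserved = TRUE)
  paste0("https://www.gutenberg.org/ebooks/search/?query=", query)
}

# Derive a filesystem-friendly output path from the author's name.
author_file_path <- function(author, dir = ".") {
  slug <- gsub("[^a-z0-9]+", "_", tolower(trimws(author)))
  slug <- gsub("^_+|_+$", "", slug)  # drop leading/trailing underscores
  file.path(dir, paste0(slug, ".txt"))
}

print(gutenberg_search_url("Jane Austen"))
print(author_file_path("Jane Austen", dir = "works"))
```

The downloaded texts could then be written to the returned path with `writeLines()`.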

pos_tagger

pos_tagger is an R function that performs part-of-speech tagging on a document and extracts the nouns from it. It uses koRpus, an R wrapper for TreeTagger.
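The noun-extraction step might look roughly like the sketch below. It is not the code in pos_tagger.R: the helper `extract_nouns`, the TreeTagger path `~/treetagger`, and the file name `document.txt` are placeholders. It assumes TreeTagger's English tag set (NN, NNS, NP, NPS for nouns) and that recent koRpus releases take their English support from the separate `koRpus.lang.en` package. The koRpus part runs only if those packages and paths are present.

```r
# Keep only noun tokens from a tagged-token data frame.
# Assumes columns `token` and `tag`, and TreeTagger's English tags
# (NN, NNS = common nouns; NP, NPS = proper nouns).
extract_nouns <- function(tagged_df, noun_tags = c("NN", "NNS", "NP", "NPS")) {
  tagged_df$token[tagged_df$tag %in% noun_tags]
}

# Small hand-made example so the filter can be tried without TreeTagger.
demo <- data.frame(
  token = c("The", "cat", "chased", "mice", "in", "London"),
  tag   = c("DT", "NN", "VVD", "NNS", "IN", "NP"),
  stringsAsFactors = FALSE
)
print(extract_nouns(demo))

# With koRpus and TreeTagger installed, the same filter could be applied to
# real tagger output. The path and file name below are placeholders.
tt_path <- "~/treetagger"
if (requireNamespace("koRpus", quietly = TRUE) &&
    requireNamespace("koRpus.lang.en", quietly = TRUE) &&
    dir.exists(path.expand(tt_path)) && file.exists("document.txt")) {
  library(koRpus.lang.en)  # also attaches koRpus
  tagged <- koRpus::treetag(
    "document.txt",
    treetagger = "manual",
    lang = "en",
    TT.options = list(path = tt_path, preset = "en")
  )
  print(extract_nouns(koRpus::taggedText(tagged)))
}
```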

koRpus is installed from within the function itself. The user only has to download TreeTagger before running the function.

The user should follow the instructions here to download TreeTagger. Follow them exactly as stated so that koRpus can locate all the necessary files. Install all the files in a single directory, and install the parameter file for the language of your document before running the function.

pos_tagger was successfully tested on English texts on Ubuntu 14.04 LTS, Windows 8.1 and Windows 10.

N.B. TreeTagger installs the English parameter file as 'english-utf8.par', whereas koRpus looks for 'english.par', so rename the file in the 'lib' folder of your TreeTagger installation before proceeding. The same problem may exist for other languages, so check the parameter file name for your language as well.
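The parameter file fix can also be done from R. The following is a minimal sketch, with a hypothetical helper `ensure_par_file` and a placeholder installation path. Note that it copies the file instead of renaming it, so the original 'english-utf8.par' stays in place.

```r
# Make the parameter file available under the name koRpus expects.
# Copies rather than renames, so the original file is left untouched.
ensure_par_file <- function(tt_dir,
                            from = "english-utf8.par",
                            to = "english.par") {
  src <- file.path(tt_dir, "lib", from)
  dst <- file.path(tt_dir, "lib", to)
  if (file.exists(dst)) return(TRUE)    # already present, nothing to do
  if (!file.exists(src)) return(FALSE)  # source parameter file not found
  file.copy(src, dst)
}

# Placeholder path; point this at your own TreeTagger directory.
ensure_par_file("~/treetagger")
```

For another language, pass the corresponding file names through `from` and `to`.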

mallet_integration

This is a complete workflow that takes a corpus as input, cleans it, and performs topic modelling with the 'mallet' package, which is based on the Latent Dirichlet Allocation (LDA) algorithm.
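A compact version of such a workflow is sketched below. It is not the code in mallet_integration.R: the cleaning rules, the three toy documents, the stop word list, and the parameter values (2 topics, 100 training iterations) are all assumptions chosen for illustration. The topic modelling part runs only if the 'mallet' package, which needs Java, can be loaded.

```r
# Basic cleaning: lower-case, then collapse every run of non-letters
# (digits, punctuation, whitespace) into a single space.
clean_text <- function(x) {
  x <- tolower(x)
  trimws(gsub("[^a-z]+", " ", x))
}

# Toy corpus, made up for this example.
docs <- c(
  d1 = "The cat sat on the mat; the cat purred.",
  d2 = "Stocks fell 3% as markets reacted to the news.",
  d3 = "A dog and a cat shared the garden."
)
cleaned <- clean_text(docs)
print(cleaned)

# Topic modelling only runs if the 'mallet' package (and Java) is available.
if (requireNamespace("mallet", quietly = TRUE)) {
  library(mallet)

  # mallet.import can read stop words from a file, one word per line.
  stop_file <- tempfile(fileext = ".txt")
  writeLines(c("the", "a", "and", "on", "to", "as"), stop_file)

  instances <- mallet.import(names(docs), cleaned, stop_file)
  model <- MalletLDA(num.topics = 2)
  model$loadDocuments(instances)
  model$train(100)

  # Word weights per topic, then the top words of the first topic.
  topic_words <- mallet.topic.words(model, smoothed = TRUE, normalized = TRUE)
  print(mallet.top.words(model, topic_words[1, ], 5))
}
```

On a real corpus the number of topics and iterations would need tuning, and the stop word list would normally be much longer.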

Do let me know of any bugs or suggestions by dropping me a mail at [email protected]

