LingPeer is a web app that suggests potential reviewers for manuscripts in theoretical linguistics based on data from Lingbuzz. You can access it by visiting the following URL:
LingPeer works on data from Lingbuzz.net, collected with the lingbuzz_scraper tool.
The models LingPeer currently runs on consider all authors and co-authors who have uploaded at least two manuscripts to Lingbuzz since January 2016.
LingPeer was designed to be used as a web application. You just need to provide the title of a manuscript, its keywords and abstract to get a list of potential reviewers. It is possible to get recommendations based on partial data (e.g., title and keywords only), but this provides less accurate results.
It is also possible to run LingPeer as a script. First, you need to clone this repository to your local machine.
git clone https://github.com/cmunozperez/LingPeer.git
You run the script by executing the main.py file in the project directory.
python main.py
Alternatively, you can import the main module and call its get_peers function. It takes three string arguments, as shown below.
# Run this from the project directory so that main.py is importable.
from main import get_peers
title = 'This is the title of the manuscript'
keywords = 'keyword, another keyword, a third keyword, a final keyword'
abstract = 'This is an abstract describing some aspect of some language.'
peers = get_peers(title, keywords, abstract)
The get_peers function returns a list of tuples. Each tuple contains (i) the name of a potential reviewer, (ii) a list of keywords matching that author with the provided abstract, (iii) a sample manuscript by the author that is related to the provided abstract, (iv) the Lingbuzz ID of that manuscript, and (v) the cosine similarity between the provided abstract and the retrieved manuscript.
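To make the tuple layout concrete, here is a minimal sketch of how the returned list could be ranked and printed. The helper `format_peers`, the `Peer` type alias and the sample data are hypothetical and not part of LingPeer. The sketch also assumes that the Lingbuzz ID is a string and the similarity is a float, which the description above does not confirm.

```python
from typing import List, Tuple

# Assumed layout, following the description above:
# (name, matching keywords, sample manuscript, Lingbuzz ID, cosine similarity)
Peer = Tuple[str, List[str], str, str, float]


def format_peers(peers: List[Peer], top_n: int = 5) -> List[str]:
    """Return one summary line per reviewer, highest similarity first."""
    # Sort explicitly, since we do not assume get_peers returns a ranked list.
    ranked = sorted(peers, key=lambda peer: peer[4], reverse=True)
    lines = []
    for name, matched_keywords, manuscript, lingbuzz_id, similarity in ranked[:top_n]:
        lines.append(
            f"{name} (similarity {similarity:.2f}): {manuscript} "
            f"[{lingbuzz_id}]; keywords: {', '.join(matched_keywords)}"
        )
    return lines


# Placeholder data standing in for the output of get_peers (not real results).
sample = [
    ("Author A", ["syntax", "ellipsis"], "A placeholder title", "000001", 0.42),
    ("Author B", ["phonology"], "Another placeholder title", "000002", 0.57),
]

for line in format_peers(sample):
    print(line)
```

In practice you would pass the list returned by get_peers instead of `sample`.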
You can retrain the models used by LingPeer by providing a new dataset from Lingbuzz, for instance a more recent one. To do this, follow these steps:
- Obtain a new dataset from Lingbuzz by running the lingbuzz_scraper tool.
- Place the newly generated csv file in the project directory.
- Run the script with the -newdata flag:
python main.py -newdata
This retrains the models using the new dataset.
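The retraining steps can also be wrapped in a small script. The following is only a sketch: the helpers `find_csv_files` and `retrain` are hypothetical and not part of LingPeer. It checks only that some csv file is present in the project directory; it does not verify that the file came from lingbuzz_scraper or that it has the layout main.py expects.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def find_csv_files(project_dir: str) -> List[Path]:
    """List the csv files sitting directly in the project directory."""
    return sorted(Path(project_dir).glob("*.csv"))


def retrain(project_dir: str = ".") -> None:
    """Run `python main.py -newdata` if a csv file is present."""
    if not find_csv_files(project_dir):
        raise FileNotFoundError(
            f"No csv file found in {project_dir!r}; run lingbuzz_scraper first."
        )
    # Same command as shown above, run with the current Python interpreter.
    subprocess.run(
        [sys.executable, "main.py", "-newdata"],
        cwd=project_dir,
        check=True,
    )


# Example call, from the project directory:
# retrain(".")
```

Failing early when the dataset is missing gives a clearer error message than whatever the script itself might report.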