This project builds a tweet search engine. The goal is to crawl tweets from Twitter, index them, and provide a graphical user interface for searching them. The project also analyzes the corpus to show the user the impact of the tweets in the Twitter sphere.
| Platform | Tech-Stack |
|---|---|
| Front-End | ReactJS and Redux, CSS, HTML |
| Back-End | Django, Python |
| Search Platform | Solr/Lucene |
| Translation Platform | Microsoft Azure |
| Analytics | Plotly |
| News Scraping | News API, PRAW |
| Server Instance | Amazon EC2 |
- Search: DogDogGo lets the user search for words, phrases, hashtags, and more, with a rich, flexible set of search features.
- Translation: The search engine lets the user search in multiple languages.
- Highlighting: Matches are highlighted in the results.
- More Like This: When a user finds a relevant document, they can find similar tweets by clicking "More like this". This is similar to the "More Like This" feature in Google News.
- Custom Search: The user can customize a search with filters on person of interest (POI), location, hashtags, sentiment, language, and source.
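
As a rough sketch of how such filters might be sent to Solr, the snippet below builds the parameters for a select request, adding one `fq` (filter query) parameter per active filter. The field names (`poi_name`, `country`, `hashtags`, `sentiment`, `tweet_lang`, `source`) and the helper `build_filter_params` are hypothetical; the project's actual schema may differ.

```python
from urllib.parse import urlencode

# Hypothetical mapping from UI filter names to Solr field names.
FILTER_FIELDS = {
    "poi": "poi_name",
    "location": "country",
    "hashtag": "hashtags",
    "sentiment": "sentiment",
    "language": "tweet_lang",
    "source": "source",
}


def build_filter_params(query, filters):
    """Return (name, value) pairs for a Solr select request."""
    params = [("q", query)]
    for name, value in filters.items():
        if not value:
            continue  # skip filters the user left empty
        field = FILTER_FIELDS[name]
        # Escape backslashes and quotes, then quote the value so that
        # multi-word values are matched as a phrase.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{escaped}"'))
    return params


params = build_filter_params(
    "vaccine", {"language": "en", "sentiment": "positive", "poi": ""}
)
query_string = urlencode(params)  # ready to append to the select URL
```

Keeping each filter in its own `fq` parameter, rather than folding it into `q`, lets Solr cache the filters independently of the main query.
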
- Analytics: We use the SentimentIntensityAnalyzer from the vaderSentiment library to analyze the sentiment of each tweet and of the search results overall. Sentiment is color-coded: red for negative, green for positive, orange for neutral.
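
The color coding can be sketched as a small mapping from VADER's compound score (a value between -1 and 1) to a label and a color. The ±0.05 cutoffs are the thresholds commonly suggested in the vaderSentiment documentation and are assumed here; the project may use different ones. The function name `classify_sentiment` is hypothetical. In a real pipeline the score would come from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`.

```python
# Assumed cutoffs, following the convention suggested by vaderSentiment.
POSITIVE_CUTOFF = 0.05
NEGATIVE_CUTOFF = -0.05


def classify_sentiment(compound):
    """Map a VADER compound score to a (label, color) pair."""
    if compound >= POSITIVE_CUTOFF:
        return ("Positive", "green")
    if compound <= NEGATIVE_CUTOFF:
        return ("Negative", "red")
    return ("Neutral", "orange")


# Example scores standing in for real analyzer output.
labels = [classify_sentiment(score) for score in (0.62, -0.40, 0.0)]
```
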
- Dynamic Search Result Analysis: Several analyses are computed from the search results.
  - Location Distribution: Locations of the tweets that match the query term.
  - Sentiment Analysis: Sentiment of the fetched results.
  - Person of Interest Distribution: Frequency of the query term on each POI's Twitter handle.
  - Distribution of Devices: Devices from which the tweets were posted.
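
One simple way to compute distributions like these is to count a field's values over the documents returned for a query. The sketch below does this in plain Python with `collections.Counter`. The document fields (`country`, `source`), the helper name, and the sample data are invented for illustration, and the project may well compute these with Solr faceting instead.

```python
from collections import Counter


def field_distribution(docs, field):
    """Count how often each value of `field` occurs, most frequent first."""
    counts = Counter(doc[field] for doc in docs if doc.get(field))
    return counts.most_common()


# Made-up search results; real documents would come from the search back end.
results = [
    {"country": "India", "source": "Twitter for Android"},
    {"country": "USA", "source": "Twitter for iPhone"},
    {"country": "India", "source": "Twitter for Android"},
    {"country": "Brazil"},  # documents missing a field are skipped
]

location_dist = field_distribution(results, "country")
device_dist = field_distribution(results, "source")
```

The resulting list of (value, count) pairs can be passed directly to a bar or pie chart.
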
- Tweet Corpus Analysis: The user can also visualize statistics of the tweet corpus.
  - World Twitter Usage: Geographic map of tweets around the world.
  - Country Time Series: Twitter usage by country over time.
  - POI Time Series: Twitter usage by person of interest over time.
  - Sentiment Time Series: Sentiment of tweets over time.
  - Location Distribution: Distribution of tweets by location.
- Relevant News Articles: The user can view news articles related to a tweet and open the original article.
- Tweet Replies: The user can view the replies to a particular tweet.
- Additional Features: A few additional features are included to improve the user experience.
  - Total result count and search response time
  - Pagination
  - Interactive plots
  - Clean user interface
  - Phrase search
  - Sentiment, retweet count, reply count, and article count shown for each tweet
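
Pagination and the result count can be sketched with Solr's standard `start` and `rows` parameters and the `numFound` and `QTime` fields of a JSON response. The helper names, the page size, and the sample response below are hypothetical and only illustrate the idea.

```python
import math


def page_params(page, page_size=10):
    """Translate a 1-based page number into Solr start/rows parameters."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {"start": (page - 1) * page_size, "rows": page_size}


def summarize(response, page_size=10):
    """Extract the hit count, page count, and query time from a Solr JSON response."""
    total = response["response"]["numFound"]
    return {
        "total_results": total,
        "total_pages": math.ceil(total / page_size),
        "query_time_ms": response["responseHeader"]["QTime"],
    }


# Hand-written stand-in for a parsed Solr JSON response.
sample_response = {
    "responseHeader": {"status": 0, "QTime": 12},
    "response": {"numFound": 47, "start": 20, "docs": []},
}

params = page_params(3)
summary = summarize(sample_response)
```

Note that `QTime` covers only the time Solr spent on the query, so a response time shown to the user would also need to account for network and rendering overhead.
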
Front-end: Anirudh
Back-end: Snigdha and Raunaq
Analytics: Raunaq and Anirudh
News Scraping: Deepesh
Translation and Solr Querying: Snigdha, Deepesh and Raunaq
Documentation: Snigdha