Skip to content

raunaqjain/DogDogGo

Repository files navigation

DogDogGo - Information Retrieval Project 4

This project is aimed at creating a tweet search engine. The goal of the project is to successfully crawl tweets from twitter, index them and then create a graphic user interface allowing a user to search tweets. Furthermore, the project contributes in analyzing the corpus informing the user about the impact of the tweets in the twitter sphere.

Platform Tech-Stack
Front-End ReactJS and Redux, CSS, HTML
Back-End Django, Python
Search Platform Solr/Lucene
Translation Platform Microsoft Azure
Analytics Plotly
News Scraping News API Praw
Server Instance Amazon EC2

Search Engine Features

  • Search
    DogDogGo allows the user to search words, phrases, hashtags etc. It offers a rich, flexible set of features for search.

  • Translation
    The search engine allows the user to search in multiple languages.

  • Highlighting
    Found results are highlighted.

  • More Like This
    When a user finds a document relevant the user can search similar tweets by clicking on “More like this”. This feature is similar to Google's "More Like This feature in Google News.

  • Custom Search
    The user can customise its search and use filters. We allow the user to filter its search on the basis of POI, Location, Hashtags, Sentiment, Language, and Source.

  • Analytics
    We use the SentimentIntensityAnalyser of VadarSentiment library to analyse the sentiment of each tweet as well as for the searched results. Red: Negative, Green: Positive, Orange: Neutral.

  • Dynamic Search Result Analysis
    On the basis of the search results, a number of analysis is provided.

    • Location Distribution: Location of tweets that match the query term.
    • Sentiment Analysis: Sentiment analysis of the fetched results.
    • Person of Interest Distribution: frequency of query term on the POI’s twitter handle
    • Distribution of Devices: Devices from which the tweets were posted.
  • Tweet Corpus Analysis
    The user can also visualize the statistics of the tweet corpus.

    • World Twitter Usage: Geo mapping of tweets around the world.
    • Country Time Series: Twitter usage based on country over the time.
    • POI Time Series: Twitter usage based on Person of Interest over the time.
    • Sentiment Time Series: Sentiment of tweets over the time.
    • Location Distribution: Distribution of tweets by location.
  • Relevant News Articles
    The user can also view articles related to the tweet. The user can also view the original article.

  • Tweet Replies
    The user can also view the replies for a particular tweet.

  • Additional Features
    A few additional features are also included to enhance the user experience.

    • Total search count and response time of search engine.
    • Pagination
    • Interactive plots
    • Clean user interface
    • Phrase search
    • Sentiment analysis and display retweet count, reply count and article count for each tweet.

This project was built by four students of University at Buffalo

Front-end: Anirudh
Back-end: Snigdha and Raunaq
Analytics: Raunaq and Anirudh
News Scraping: Deepesh
Translation and Solr Querying: Snigdha, Deepesh and Raunaq
Documentation: Snigdha

About

Analyzing the impact of political rhetoric in traditional and social media

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages