🎓 Sentiment & Theme Analysis of High- vs. Low-Performing Schools Using Online Reviews and Discussions

ADS509 - Applied Large Language Models for Data Science

ADS-509 Group 3

🚀 Live Demo

A working demo of this project is available at TJAlytix.streamlit.app (pronounced "tee-jah-lyticks").

💻 Installation

To get started, clone the repository and change into its directory:

git clone https://github.com/junclemente/ads509-final_project.git
cd ads509-final_project

🌱 Environment Setup

This project uses a conda environment specified in a YAML file for reproducibility and consistent development. Ensure you have Anaconda or Miniconda installed.

Create the Environment

Run the following:

conda env create -f environment/ads509-streamlit.yaml

Update the Environment (if needed)

If the environment file changes, update your existing environment with:

conda env update -f environment/ads509-streamlit.yaml --prune

The --prune option removes installed packages that are no longer listed in the environment file.

👩‍💻👨‍💻 Contributors

🔀 Development Workflow

main → stable, production-ready branch (protected).
develop → active development branch where new features are merged.
feature/* → short-lived branches for specific tasks.

How to Contribute

Create a feature branch from develop.
Commit your changes with clear messages.
Open a Pull Request into develop.
Once reviewed, your changes will be merged into develop.
At milestones, develop is merged into main.

👉 See CONTRIBUTING.md for full guidelines.

⚙️ Methods

Exploratory Data Analysis
Text Cleaning
Topic Modeling
Sentiment Analysis
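
As an illustration of the text cleaning step, here is a minimal sketch using only the Python standard library. The function name `clean_text`, the regular expressions, and the tiny stopword list are all assumptions made for this example; the project's notebooks may clean text differently, for instance with NLTK's stopword list.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
# A real pipeline would likely use a fuller list (e.g. the one shipped with NLTK).
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "it", "of", "the", "to", "was"}

URL_RE = re.compile(r"https?://\S+")   # matches http/https links
TOKEN_RE = re.compile(r"[a-z']+")      # runs of lowercase letters and apostrophes


def clean_text(text: str) -> list[str]:
    """Lowercase, drop URLs, tokenize on letters, and remove stopwords."""
    text = URL_RE.sub(" ", text.lower())
    # Strip stray leading/trailing apostrophes left over from quoting.
    tokens = (t.strip("'") for t in TOKEN_RE.findall(text))
    return [t for t in tokens if t and t not in STOPWORDS]


tokens = clean_text("The teachers are GREAT! See https://example.com/x for details.")
print(tokens)
```

The cleaned token lists can then feed later steps such as topic modeling or keyword counts.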

🛠️ Technologies

Python 3.11+
Pandas
NLTK
NumPy
Jupyter Notebook
Matplotlib / Seaborn
Scikit-learn
Streamlit
ChatGPT
VS Code

🎯 Objective

This project looks at how people talk about schools in high-performing vs. low-performing districts. We're pulling reviews and discussions from Reddit to see both the overall sentiment (positive or negative) and the main themes that come up in these conversations.

Our goals are to:

Run sentiment analysis to check whether high-performing schools are discussed more positively than low-performing ones.
Use topic modeling to pull out key themes in each group (like academics, safety, teacher quality, resources).
Compare the sentiment and themes between high- and low-performing schools to get a clearer picture of how school quality is being perceived online.
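
The comparison in the last goal can be sketched as follows. This is a toy example under stated assumptions: the `LEXICON` dictionary, the `score` and `mean_sentiment_by_group` functions, and the sample posts are all invented for illustration and are not the project's model or data. A simple word-lexicon scorer stands in here for whatever sentiment model the project actually uses.

```python
from statistics import mean

# Hypothetical toy lexicon, for illustration only (not the project's model).
LEXICON = {
    "great": 1.0, "supportive": 1.0, "safe": 0.5,
    "crowded": -0.5, "unsafe": -1.0, "terrible": -1.0,
}


def score(text: str) -> float:
    """Average lexicon score of the words in `text`; 0.0 if no word matches."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return mean(hits) if hits else 0.0


def mean_sentiment_by_group(posts: list[tuple[str, str]]) -> dict[str, float]:
    """Map each group label to the mean sentiment score of its posts."""
    by_group: dict[str, list[float]] = {}
    for group, text in posts:
        by_group.setdefault(group, []).append(score(text))
    return {group: mean(scores) for group, scores in by_group.items()}


# Made-up example posts, not project data.
posts = [
    ("high", "great teachers and supportive staff"),
    ("high", "classes feel crowded"),
    ("low", "the campus feels unsafe"),
    ("low", "terrible funding but supportive teachers"),
]
result = mean_sentiment_by_group(posts)
print(result)  # {'high': 0.25, 'low': -0.5}
```

Grouping before averaging keeps the comparison at the level of the two school groups rather than individual posts.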

🗂️ Data Sources

The text data for this project was gathered through the Reddit API.

⚠️ Disclaimer

This project uses aggregated Reddit data obtained via the Reddit API.

No raw Reddit posts or user comments are displayed, stored, or redistributed.
Only aggregated outputs such as keywords, topics, and sentiment scores are shown.
All analysis is for academic, non-commercial research purposes.
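
One way to picture the aggregation described above is a keyword count that returns only totals and keeps no reference to the original posts. The sketch below is an assumption about how such a step could look; `top_keywords` and the sample tokens are hypothetical and not taken from the project's code.

```python
from collections import Counter


def top_keywords(token_lists, n=3):
    """Return the n most common tokens across all posts as (token, count) pairs.

    Only aggregate counts leave this function; the per-post token lists
    are not stored or returned.
    """
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return counts.most_common(n)


# Made-up, already-cleaned tokens standing in for three posts.
sample = [
    ["teachers", "funding"],
    ["teachers", "safety"],
    ["funding", "teachers"],
]
print(top_keywords(sample, n=2))  # [('teachers', 3), ('funding', 2)]
```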

📖 References

Project Repository

📽️ Presentations and Projects

Web Application: TJAlytix.streamlit.app
Project Presentation: Canva.com (TBD)
Project Repository: https://github.com/junclemente/ads509-final_project

🤖 AI Assistance Disclosure

Parts of this project were developed with help from ChatGPT (OpenAI):

Debugging Python functions and pipeline logic
Drafting/rewriting docstrings and short notebook summaries
Creating small code snippets

All generated code and text were reviewed and edited by the authors.
