DU SPIO Competitor Benchmarking - Documentation

Welcome to the technical documentation for the University Course & Fee Scraper application. This documentation is structured to help you understand the architecture, maintain the codebase, and extend its functionality.

Product handover: The completed handover confirmation document is available here: Product_Handover_Confirmation.pdf.

📚 Documentation Structure

The documentation is split into three main layers:

  1. Frontend Documentation
    • Learn about the React application, component structure, and API integration.
  2. Database Documentation
    • Understand the Prisma schema, data models, and relationships.
  3. Backend Documentation
    • Application & API: Details on the Express.js server, routing, and authentication.
    • Scraper Engine: How UCAS import, Prisma rows, manager.ts, config.ts, and custom adapters work together.

🚀 Quick Start

Prerequisites

  • Node.js 20+
  • PostgreSQL
  • npm or yarn

Installation

  1. Clone the repository

  2. Install dependencies for both backend and frontend:

    cd backend && npm install
    cd ../frontend && npm install
  3. Set Up Environment Variables

    • Create a .env file in the backend directory based on .env.example.
    • At minimum, it should contain:
      DATABASE_URL=postgresql://user:password@localhost:5432/courses_dev
      PORT=5001
  4. Initialize Database

    cd backend
    npx prisma generate
    npx prisma migrate deploy

    For a clean local reset, use:

    cd backend
    npx prisma migrate reset --force

    This drops the local database, deleting all of its data, and reapplies every migration. The --force flag skips the confirmation prompt.

  5. Import UCAS Data

    cd backend
    npx ts-node src/ucas_job.ts

    This imports universities, courses, course URLs, and course options from UCAS. The scraper works from these database rows and fills missing fee values.

  6. Run the App

    To start backend and frontend together:

    cd backend
    npm run dev

    Or run them separately:

    # Terminal 1
    cd backend
    npm run dev:server
    
    # Terminal 2
    cd frontend
    npm run dev
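A missing or malformed variable from step 3 can surface later as a confusing database or port error. The following is a minimal TypeScript sketch of a startup check for the two variables listed there. It is not taken from the repository: `BackendEnv` and `readBackendEnv` are hypothetical names, and the fallback to port 5001 is an assumption based only on the example value.

```typescript
// Hypothetical shape of the validated backend settings (not from the codebase).
interface BackendEnv {
  databaseUrl: string;
  port: number;
}

// Takes the environment as a plain object so it can be checked in isolation.
function readBackendEnv(env: Record<string, string | undefined>): BackendEnv {
  const databaseUrl = env["DATABASE_URL"];
  // Accept both the postgresql:// and postgres:// schemes.
  if (!databaseUrl || !/^postgres(ql)?:\/\//.test(databaseUrl)) {
    throw new Error("DATABASE_URL must be a postgresql:// connection string");
  }

  // Assumption: fall back to 5001, the value shown in the .env example.
  const port = Number(env["PORT"] ?? "5001");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  return { databaseUrl, port };
}
```

In a Node.js entry point this could be called as `readBackendEnv(process.env)` before the server starts, so that a bad .env file fails fast with a readable message.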

Running The Scraper

Scrape missing fees for one university:

cd backend
npx ts-node src/scrapers/manager.ts --universityIds="UNIVERSITY_ID"

Scrape one course for one university:

cd backend
npx ts-node src/scrapers/manager.ts --universityIds="UNIVERSITY_ID" --q="Course Name"

Browse the database in Prisma Studio to find university IDs:

cd backend
npx prisma studio

Scraper logs are written to backend/logs/scrape-*.log.

For the full scraper workflow, see Scraper Engine.
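To illustrate the two flags used in the commands above, here is a minimal TypeScript sketch of parsing a `--name=value` command line. It is not the code in manager.ts: `ScraperArgs` and `parseScraperArgs` are hypothetical names, and treating `--universityIds` as a comma-separated list is an assumption based only on the plural flag name.

```typescript
// Hypothetical parsed form of the two documented flags.
interface ScraperArgs {
  universityIds: string[];
  q?: string;
}

function parseScraperArgs(argv: string[]): ScraperArgs {
  const flags: Record<string, string> = {};
  for (const arg of argv) {
    // Only the --name=value form is handled, matching the documented commands.
    if (!arg.startsWith("--")) continue;
    const eq = arg.indexOf("=");
    if (eq === -1) continue;
    flags[arg.slice(2, eq)] = arg.slice(eq + 1);
  }

  // Assumption: several IDs may be given separated by commas.
  const universityIds = (flags["universityIds"] ?? "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
  if (universityIds.length === 0) {
    throw new Error("--universityIds is required");
  }

  const result: ScraperArgs = { universityIds };
  const q = flags["q"];
  if (q !== undefined && q !== "") result.q = q;
  return result;
}
```

The shell removes the surrounding quotes before the program sees the arguments, so `--q="Course Name"` arrives as the single string `--q=Course Name`. In a Node.js script the input would typically be `process.argv.slice(2)`.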

🛠 Project Overview

The application scrapes course information and tuition fees from UK university websites. A dashboard lets you view the scraped data, manage scraping tasks, and export the results.

Key Features

  • Automated Scraping: Configurable scrapers for different university website structures.
  • Data Standardization: Normalizes diverse fee structures into a common format.
  • Dashboard: A user-friendly interface to trigger scrapes and view results.
  • Excel Export: Download scraped data for analysis.
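As an illustration of the Data Standardization feature, the sketch below shows one way free-text fee strings could be reduced to a common shape. It is a simplified assumption, not the project's actual normalizer: the `NormalizedFee` fields, the `normalizeFee` function, and the keyword rules for detecting the billing period are all hypothetical, and only pound amounts are handled.

```typescript
// Hypothetical common format for a fee; field names are illustrative only.
interface NormalizedFee {
  amountGbp: number;
  period: "year" | "course" | "unknown";
}

function normalizeFee(raw: string): NormalizedFee | null {
  // Find a pound amount such as "£9,250" or "£9250.50".
  const match = /£\s*(\d[\d,]*(?:\.\d+)?)/.exec(raw);
  const digits = match?.[1];
  if (!digits) return null;

  // Strip thousands separators before converting to a number.
  const amountGbp = Number(digits.replace(/,/g, ""));
  if (!Number.isFinite(amountGbp)) return null;

  // Guess the billing period from nearby wording (a deliberately rough heuristic).
  const lower = raw.toLowerCase();
  let period: NormalizedFee["period"] = "unknown";
  if (/\b(per|a|each)\s+year\b|\/\s*year|annual|per annum/.test(lower)) {
    period = "year";
  } else if (/full course|whole course|total/.test(lower)) {
    period = "course";
  }

  return { amountGbp, period };
}
```

Returning `null` when no amount is found keeps "fee not detected" distinct from a genuine zero, which matters if missing fees are to be retried on a later scrape.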

