Pipe199x/Tokyo-Azure-Spark

Tokyo-Azure-Spark

This project uses Azure, Apache Spark, and Python to process and analyze Tokyo Olympic data. The main technologies and components are listed below.

Technologies Used

  • Azure: Microsoft's cloud platform used for data storage and processing.
  • Apache Spark: A unified analytics engine for processing large volumes of data.
  • Python: The programming language used to write data processing and analysis scripts.

Project Description

The goal of this project is to process and analyze Olympic data to extract information about athletes, coaches, teams, and events. The data is stored in Azure and processed with Apache Spark, which distributes the work so that large datasets can be handled efficiently.
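
As a rough illustration of the storage-to-Spark step, the sketch below builds a storage URI and reads one CSV file from it. It assumes the data sits in Azure Data Lake Storage Gen2, which this README does not confirm. The container name `olympics`, the account name `examplestorage`, and the helpers `abfss_uri` and `read_csv` are hypothetical and are not taken from the notebook.

```python
def abfss_uri(container: str, account: str, path: str) -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>.

    Assumption: the project stores its files in ADLS Gen2. The container and
    account names passed in are placeholders, not values from this repository.
    """
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def read_csv(spark, uri: str):
    """Read one CSV file into a Spark DataFrame.

    `spark` is an existing SparkSession whose access to the storage account
    has already been configured (for example by the hosting Azure service).
    """
    # header=True takes column names from the first row;
    # inferSchema=True makes Spark scan the data to guess column types.
    return spark.read.csv(uri, header=True, inferSchema=True)
```

With a configured session, a call such as `read_csv(spark, abfss_uri("olympics", "examplestorage", "CSVs/Athletes.csv"))` would return a DataFrame for the athletes file.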

Project Structure

  • CSVs/: Contains all the CSV files with Olympic data.
    • Athletes.csv
    • Coaches.csv
    • EntriesGender.csv
    • Medals.csv
    • Teams.csv
  • tokyo_olympic.ipynb: The Jupyter notebook containing the data processing and analysis scripts.
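
The layout above could be loaded in one pass, for instance as in the following sketch. It is not the notebook's actual code: the helper names `table_paths` and `load_tables`, the lowercase table names, and the choice to read from a local `CSVs/` directory are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# File names as listed in the project structure.
CSV_FILES = [
    "Athletes.csv",
    "Coaches.csv",
    "EntriesGender.csv",
    "Medals.csv",
    "Teams.csv",
]


def table_paths(base_dir: str = "CSVs") -> dict:
    """Map a lowercase table name (hypothetical convention) to its CSV path."""
    return {
        PurePosixPath(name).stem.lower(): str(PurePosixPath(base_dir) / name)
        for name in CSV_FILES
    }


def load_tables(spark, base_dir: str = "CSVs") -> dict:
    """Read every CSV into a Spark DataFrame, keyed by table name.

    `spark` is an existing SparkSession; nothing is read until this is called.
    """
    return {
        name: spark.read.csv(path, header=True, inferSchema=True)
        for name, path in table_paths(base_dir).items()
    }
```

Keeping the path mapping separate from the Spark call means the same table names can be reused if `base_dir` is later pointed at a cloud storage location instead of the local folder.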

About

Data analysis project using Azure, Apache Spark, and Python to process Tokyo Olympic data.
