This repository hosts the code and files for the tutorial "Data Cleaning: How the pros do the dirty work".
In this tutorial, you'll dive straight into the trenches of transforming messy, real-world data into something pristine and usable.
The easiest way to get started on the tutorial is by clicking on this binder link:

To run the tutorial locally instead:

- Install Python 3.x and pip
- Create a virtual environment: `python3 -m venv data-cleaning-tutorial-env`
- Activate the virtual environment: `source data-cleaning-tutorial-env/bin/activate`
- Install dependencies: `pip install -r requirements.txt`
- Run JupyterLab from the project root: `jupyter-lab`
- Open `index.ipynb`, follow the instructions, and run the code!
- Follow the tutorial!
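
If the notebook fails to start or imports break, a quick sanity check of the setup steps above can help. The sketch below is not part of the tutorial; the helper names `in_virtualenv` and `missing_packages` are hypothetical, and the package list is an assumption (only `jupyterlab` is implied by the steps, since the contents of `requirements.txt` are not shown here).

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list: adjust it to match requirements.txt.
    expected = ["jupyterlab"]
    print("virtual environment active:", in_virtualenv())
    print("missing packages:", missing_packages(expected))
```

Run it with the environment activated; if the first line reports `False`, the activation step was most likely skipped in the current shell.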