A comprehensive data science project analyzing Guyana's development across economic, social, and environmental dimensions spanning 100+ years.
This project uses machine learning, satellite imagery analysis, and data visualization to tell the compelling story of Guyana's development from the 1920s to present day. It covers 12 key dimensions:
- GDP & wealth trends (100 years)
- Export economy evolution (50 years)
- Gold & diamond production
- Oil production (2019-present)
- Happiness index
- Economic inequality (Gini coefficient)
- Literacy rates and education
- Biodiversity metrics
- Forest cover & deforestation
- Water resources dynamics
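As a rough illustration of the inequality dimension, the Gini coefficient can be computed from a list of incomes with the standard rank-weighted formula. This is a minimal sketch, not the notebook's code: the function name `gini` is hypothetical, and the project itself takes published Gini values from the World Bank rather than deriving them.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means perfect equality; the maximum for n values is (n - 1) / n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where x_i are sorted ascending and i runs from 1 to n.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

For example, `gini([1, 1, 1, 1])` returns `0.0`, and `gini([0, 0, 0, 10])` returns `0.75`, the maximum for four values.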
- Time Series Forecasting: ARIMA, Prophet, VAR, LSTM models for economic predictions
- Clustering Analysis: K-means, DBSCAN for pattern recognition
- Satellite Imagery: Google Earth Engine integration for forest and water analysis
- Computer Vision: U-Net CNN for deforestation detection
- Interactive Visualizations: Plotly, Folium maps
- Correlation & Causal Analysis: Multi-dimensional relationship exploration
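The correlation analysis listed above rests on pairwise coefficients between indicator series. A minimal pure-Python sketch of the Pearson coefficient follows; the function name `pearson` is illustrative only, and the notebook may well rely on pandas or NumPy for this instead.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and the root sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)
```

A coefficient near 1 or -1 indicates a strong linear relationship between two indicators; it says nothing by itself about causation, which is why the project lists causal methods separately.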
- Python 3.9.6 or higher
- Google Earth Engine account (sign up at earthengine.google.com)
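To confirm the interpreter meets the version prerequisite before installing anything, a small check along these lines may help. The names `MIN_VERSION` and `python_ok` are hypothetical and not part of the repository.

```python
import sys

# Minimum version stated in the prerequisites above.
MIN_VERSION = (3, 9, 6)


def python_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor, micro) version is at least `minimum`."""
    return tuple(version_info[:3]) >= minimum


if __name__ == "__main__":
    status = "OK" if python_ok() else "too old"
    print(f"Python {sys.version.split()[0]}: {status}")
```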
- Clone the repository:
    git clone <your-repo-url>
    cd guyana-development-analysis
- Create and activate virtual environment:
    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install core dependencies:
    pip install -r requirements.txt
- (Optional) Install geospatial packages:
  First install GDAL:
    # On macOS
    brew install gdal
    # On Ubuntu/Debian
    sudo apt-get install gdal-bin libgdal-dev
  Then install geospatial requirements:
    pip install -r requirements-geo.txt
- Authenticate with Google Earth Engine:
    earthengine authenticate

guyana-development-analysis/
│
├── README.md                 # This file
├── requirements.txt          # Core Python dependencies
├── requirements-geo.txt      # Geospatial dependencies
├── .gitignore                # Git ignore rules
│
├── guyana_analysis.ipynb     # MAIN COMPREHENSIVE NOTEBOOK
│
├── data/                     # Data directory (gitignored)
│   ├── raw/                  # Original downloaded data
│   │   ├── economic/         # GDP, exports, wealth
│   │   ├── social/           # Happiness, literacy
│   │   ├── resources/        # Gold, diamond, oil
│   │   ├── environmental/    # Biodiversity
│   │   └── satellite/        # GEE downloads, rasters
│   ├── processed/            # Cleaned data
│   └── metadata/             # Data source documentation
│
├── outputs/                  # Generated artifacts (gitignored)
│   ├── figures/              # Static plots (PNG, SVG)
│   ├── interactive/          # Interactive HTML plots
│   ├── models/               # Saved ML models
│   └── reports/              # Summary statistics
│
└── src/                      # Helper modules
    ├── data_fetchers.py      # API/data download functions
    ├── preprocessing.py      # Data cleaning utilities
    ├── satellite_utils.py    # GEE and raster processing
    └── visualization.py      # Reusable plot functions
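Because `data/` and `outputs/` are gitignored, a fresh clone will not contain them. One way to recreate the layout shown above is sketched here; the `scaffold` helper and the `SUBDIRS` list are hypothetical, and the notebook may already create these directories on its first run.

```python
from pathlib import Path

# Gitignored directories from the project tree above.
SUBDIRS = [
    "data/raw/economic",
    "data/raw/social",
    "data/raw/resources",
    "data/raw/environmental",
    "data/raw/satellite",
    "data/processed",
    "data/metadata",
    "outputs/figures",
    "outputs/interactive",
    "outputs/models",
    "outputs/reports",
]


def scaffold(root="."):
    """Create any missing data/ and outputs/ subdirectories under `root`."""
    root = Path(root)
    created = []
    for sub in SUBDIRS:
        path = root / sub
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `scaffold()` from the repository root is idempotent, so it can be run again after pulling updates without touching existing files.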
All data comes from free, publicly available sources:
- World Bank Open Data: GDP, exports, poverty indicators
- UN Comtrade: Export composition by commodity
- IMF: Historical economic estimates
- USGS Minerals Yearbook: Gold & diamond production
- U.S. EIA: Oil production statistics
- World Bank Commodity Prices: Price data for context
- World Happiness Report: Happiness scores and components
- UNESCO: Literacy rates and education statistics
- World Bank Poverty & Equity: Gini, income distribution
- GBIF: Species occurrence records
- IUCN Red List: Threatened species
- Living Planet Index: Population trends
- Hansen Global Forest Change: Tree cover (2000-2023)
- MODIS: NDVI vegetation indices
- JRC Global Surface Water: Water occurrence (1984-2024)
- Sentinel-2: High-resolution imagery
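Several of the sources above expose public APIs. As an illustration, the sketch below builds a World Bank Indicators API (v2) request URL for Guyana and parses the JSON response into a year-to-value mapping. The helper names are hypothetical and are not taken from `src/data_fetchers.py`; the indicator code `NY.GDP.MKTP.CD` (GDP in current US$) and the two-element response shape (metadata, then records) reflect the public API as I understand it and should be checked against the World Bank documentation.

```python
import json
from urllib.parse import urlencode

WB_BASE = "https://api.worldbank.org/v2"


def indicator_url(indicator, country="GUY", per_page=200):
    """Build a World Bank API v2 URL for one indicator and one country."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{WB_BASE}/country/{country}/indicator/{indicator}?{query}"


def parse_indicator(payload):
    """Convert a response (JSON text or decoded list) into {year: value}.

    Assumes the shape [metadata, [records]], where each record carries a
    "date" string and a "value" that may be null. Missing values are skipped.
    """
    data = json.loads(payload) if isinstance(payload, str) else payload
    if not isinstance(data, list) or len(data) < 2 or data[1] is None:
        return {}
    return {
        int(record["date"]): record["value"]
        for record in data[1]
        if record.get("value") is not None
    }
```

The URL can be fetched with `urllib.request.urlopen` or `requests.get`, and the body passed to `parse_indicator`. Network access is left out of the sketch so that it runs offline.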
- Start Jupyter Notebook:
    jupyter notebook
- Open guyana_analysis.ipynb
- Run cells sequentially (the first run downloads data and may take some time)
- Outputs will be saved to the outputs/ directory
The notebook implements:
- Time Series: ARIMA, SARIMAX, Prophet, VAR, LSTM
- Clustering: K-means, DBSCAN, Hierarchical
- Classification: Random Forest, XGBoost
- Regression: Random Forest, GWR (Geographically Weighted)
- Computer Vision: U-Net for satellite image segmentation
- Anomaly Detection: Isolation Forest, LSTM Autoencoder
- Causal Inference: Difference-in-Differences, Granger Causality
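Of the methods listed, Difference-in-Differences is the simplest to show in a few lines: it compares the change in a treated group with the change in a control group over the same period. The sketch below is the basic two-group, two-period estimator with a hypothetical function name; a real analysis would normally use a regression with controls and standard errors, and the notebook's actual specification is not shown in this README.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    Returns (change in treated mean) minus (change in control mean).
    The estimate is only meaningful under the parallel-trends assumption.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change
```

With placeholder numbers, `did_estimate([1, 1], [4, 4], [2, 2], [3, 3])` returns `2`: the treated group rose by 3 while the control group rose by 1.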
- GDP trajectories and forecasts
- Export composition changes
- Production trends
- Deforestation risk maps
- Biodiversity hotspots
- Correlation heatmaps
- GDP forecast with confidence intervals
- Animated GDP vs happiness scatter
- Forest change maps
- Flood risk maps
- ARIMA (GDP forecasting)
- Prophet (exports)
- Random Forest (deforestation risk, happiness drivers)
- XGBoost (flood risk)
- U-Net weights (forest change detection)
- Week 1: Project setup (complete)
- Weeks 2-3: Data acquisition
- Week 4: Data preprocessing
- Week 5: Exploratory data analysis
- Weeks 6-7: ML analysis
- Week 8: Integrated analysis
- Week 9: Narrative & polish
- Week 10: Documentation & testing
This is a personal/research project. If you'd like to contribute or have suggestions, please open an issue.
MIT License - See LICENSE file for details
If you use this analysis in your research, please cite:
Guyana Development Analysis: A Multi-Dimensional Perspective (2026)
Repository: <your-repo-url>
- Data Providers: World Bank, UN, IMF, NASA, ESA, USGS, UNESCO, GBIF, IUCN
- Tools: Python, Jupyter, scikit-learn, TensorFlow, Google Earth Engine, Plotly
- Inspiration: Showcasing Guyana's rich heritage and development journey
For questions or collaboration opportunities, please open an issue on GitHub.
Note: This project requires significant computational resources for satellite imagery analysis. Consider using cloud platforms (Google Colab, Kaggle) for GPU-accelerated tasks.