Add GitHub Codespaces support #17

Open
thiago-grabe wants to merge 11 commits into udacity:master from thiago-grabe:feat/codespaces

@thiago-grabe thiago-grabe commented Jan 18, 2026

PR Description:

This PR adds GitHub Codespaces support to the repository.

Files Added

  • .devcontainer/ - Codespaces configuration and setup scripts
    • Dockerfile - Container definition with Python 3.13 and conda
    • devcontainer.json - VS Code Codespaces configuration
    • scripts/ - Automated setup, cleanup, and environment initialization scripts
  • .github/workflows/codespaces-prebuild.yml - Prebuild workflow for faster startup
  • CODESPACES_QUICKSTART.md - Student-focused quick start guide for Codespaces
  • .vscode/tasks.json - VS Code tasks for common operations
  • environment.yml - Base conda environment specification

Files Modified

  • README.md - Added Codespaces setup instructions and links
  • requirements.txt - Updated library versions for Python 3.13 compatibility
  • All conda.yml files across lessons - Updated Python version to 3.13 and library versions
  • .gitignore - Added Codespaces-specific ignore patterns

Changes

  • Python upgraded to 3.13
  • Updated library versions:
    • pandas 2.2.1 → 2.3.2
    • scikit-learn 1.4.1 → 1.7.2
    • mlflow 2.8.1 → 3.3.2
    • wandb updated to 0.24.0
  • Added automated environment setup and W&B authentication
  • Added prebuild workflow for faster Codespace creation
  • Included cleanup scripts for managing MLflow environments

@thiago-grabe thiago-grabe changed the title Feat/codespaces Add GitHub Codespaces support Jan 19, 2026
@thiago-grabe thiago-grabe marked this pull request as ready for review January 19, 2026 00:11
@thiago-grabe thiago-grabe requested a review from a team as a code owner January 19, 2026 00:11
@thiago-grabe thiago-grabe requested review from SudKul and removed request for a team January 19, 2026 00:11

@rayryeng rayryeng left a comment

Thanks for doing this! I have some general comments and a couple of essential ones. They won't necessarily break the experience, but there are inconsistencies between what the quick start guide claims and what I experienced.

@@ -0,0 +1,573 @@
# GitHub Codespaces Quick Start Guide

As per my previous comment, we should lobby Udacity to make a video of this comprehensive guide.

Also:

Consider adding a section for when the student needs to resume work. Right now, it only mentions first-time setup, but if the user navigates away from the GitHub page, we should mention that they can resume the running workspace by clicking the Codespaces tab and choosing the already running environment.

@@ -0,0 +1,573 @@
# GitHub Codespaces Quick Start Guide

Add a Table of Contents to allow hot-linking to the individual sections for better readability:

## Table of Contents

- [Quick Decision Guide](#quick-decision-guide)
  - [Your Learning Journey](#your-learning-journey)
- [First Time Setup](#first-time-setup)
  - [1. Create Your Codespace](#1-create-your-codespace)
    - [Choosing Your Branch](#choosing-your-branch)
    - [Switching Branches After Creation](#switching-branches-after-creation)
  - [2. Add Your Weights & Biases API Key](#2-add-your-weights--biases-api-key)
  - [3. Verify Setup](#3-verify-setup)
- [Terminal Setup](#terminal-setup)
  - [Default Shell](#default-shell)
  - [Conda Environment Auto-Activation](#conda-environment-auto-activation)
  - [Environment Details](#environment-details)
- [Working with Exercises](#working-with-exercises)
  - [Lesson Structure](#lesson-structure)
  - [Exercise Workflow](#exercise-workflow)
    - [Navigate to Exercise](#navigate-to-exercise)
    - [Create Exercise Environment (First Time Only)](#create-exercise-environment-first-time-only)
    - [Activate Environment](#activate-environment)
    - [Run Exercise](#run-exercise)
- [Lesson-Specific Instructions](#lesson-specific-instructions)
  - [Lesson 1: Machine Learning Pipelines](#lesson-1-machine-learning-pipelines)
  - [Lesson 2: Data Exploration and Preparation](#lesson-2-data-exploration-and-preparation)
  - [Lesson 3: Data Validation](#lesson-3-data-validation)
  - [Lesson 4: Training & Experiment Tracking](#lesson-4-training--experiment-tracking)
  - [Lesson 5: Full Pipeline](#lesson-5-full-pipeline)
- [Common Issues](#common-issues)
  - ["WANDB_API_KEY not set"](#wandb_api_key-not-set)
  - ["conda: command not found"](#conda-command-not-found)
  - ["Disk space full"](#disk-space-full)
  - ["Environment not found"](#environment--not-found)
  - ["Port 8888 already in use"](#port-8888-already-in-use)
  - [Slow execution / timeouts](#slow-execution--timeouts)
  - ["Your local changes would be overwritten by checkout"](#your-local-changes-would-be-overwritten-by-checkout)
  - [Wrong branch / Need to start over](#wrong-branch--need-to-start-over)
- [VS Code Tasks](#vs-code-tasks)
- [Viewing Results](#viewing-results)
  - [MLflow UI](#mlflow-ui)
  - [Weights & Biases](#weights--biases)
- [Github Codespaces Limits](#github-codespaces-limits)
  - [Essential Habits](#essential-habits)
- [Tips & Best Practices](#tips--best-practices)
  - [Environment Management](#environment-management)
  - [Saving Your Work](#saving-your-work)
  - [Working with Starter Files](#working-with-starter-files)
  - [Jupyter Notebooks](#jupyter-notebooks)
  - [Port Forwarding](#port-forwarding)
- [Getting Help](#getting-help)
  - [Documentation](#documentation)
  - [Common Commands Reference](#common-commands-reference)

This Codespace uses **Zsh** as the default shell with Oh My Zsh for enhanced functionality.

### Conda Environment Auto-Activation
The `ml_workflow_base` conda environment is configured to auto-activate in new terminals.

As per the previous PR comment, this still didn't activate for me. I had to run `conda init` so that `.bashrc` was modified, and then it finally worked.

fi

# Configure for bash (fallback)
if [ -f ~/.bashrc ]; then

nit: Use elif

3. Rebuild your container (`Cmd+Shift+P` → "Rebuild Container")

### "conda: command not found"
**Problem**: Conda not activated

Running `conda init` once in a terminal and then opening a new terminal fixed this for me. Before that, I had to source the script every time, which is not optimal.

@ruddyscent ruddyscent requested review from Copilot and removed request for SudKul January 27, 2026 12:48

Copilot AI left a comment

Pull request overview

Adds GitHub Codespaces/devcontainer support for the repo and aligns exercise environments with a Python 3.13-compatible dependency set.

Changes:

  • Added a devcontainer (Dockerfile + devcontainer.json) and setup/cleanup scripts for Codespaces.
  • Added Codespaces-focused documentation and VS Code tasks for common workflows.
  • Updated requirements and many lesson conda environments (notably W&B/MLflow) for the new baseline.

Reviewed changes

Copilot reviewed 63 out of 65 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| requirements.txt | Expands/pins core Python package requirements. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/starter/segregate/conda.yml | Bumps mlflow/wandb versions for Python 3.13 baseline. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/starter/random_forest/conda.yml | Bumps mlflow/wandb versions for Python 3.13 baseline. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/starter/preprocess/conda.yml | Bumps wandb version. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/starter/evaluate/conda.yml | Bumps mlflow/wandb versions. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/starter/download/conda.yml | Bumps wandb version. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/starter/conda.yml | Bumps mlflow/wandb versions. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/starter/check_data/conda.yml | Bumps wandb version. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/solution/segregate/conda.yml | Bumps mlflow/wandb versions for Python 3.13 baseline. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/solution/random_forest/conda.yml | Bumps mlflow/wandb versions for Python 3.13 baseline. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/solution/preprocess/conda.yml | Bumps wandb version. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/solution/evaluate/conda.yml | Bumps mlflow/wandb versions. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/solution/download/conda.yml | Bumps wandb version. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/solution/conda.yml | Bumps mlflow/wandb versions. |
| lesson-5-final-pipeline-release-and-deploy/exercises/exercise_14/solution/check_data/conda.yml | Bumps wandb version. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_13/starter/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_13/solution/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_12/starter/random_forest/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_12/starter/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_12/solution/random_forest/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_12/solution/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_11/starter/random_forest/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_11/starter/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_10/starter/random_forest/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_10/starter/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_10/solution/random_forest/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/exercises/exercise_10/solution/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/demo/sklearn_pipeline/conda.yml | Bumps wandb version. |
| lesson-4-training-validation-experiment-tracking/demo/pytorch/conda.yml | Bumps wandb version. |
| lesson-4-training-validation-experiment-tracking/demo/hydra_sweeps/conda.yml | Bumps mlflow/wandb versions. |
| lesson-4-training-validation-experiment-tracking/demo/hydra_sweeps/component/conda.yml | Bumps mlflow/wandb versions. |
| lesson-3-data-validation/exercises/exercise_9/starter/test_data.py | Updates pytest test signature to use fixtures. |
| lesson-3-data-validation/exercises/exercise_9/starter/conda.yml | Bumps wandb version. |
| lesson-3-data-validation/exercises/exercise_9/solution/conda.yml | Bumps wandb version. |
| lesson-3-data-validation/exercises/exercise_8/starter/conda.yml | Bumps wandb version. |
| lesson-3-data-validation/exercises/exercise_8/solution/conda.yml | Bumps wandb version. |
| lesson-3-data-validation/exercises/exercise_7/starter/conda.yml | Fixes/updates wandb pip pin. |
| lesson-3-data-validation/exercises/exercise_7/solution/conda.yml | Bumps wandb version. |
| lesson-3-data-validation/demo/fixtures/conda.yml | Bumps wandb version. |
| lesson-2-data-exploration-and-preparation/exercises/exercise_6/starter/conda.yml | Bumps mlflow/wandb versions. |
| lesson-2-data-exploration-and-preparation/exercises/exercise_6/solution/conda.yml | Bumps mlflow/wandb versions. |
| lesson-2-data-exploration-and-preparation/exercises/exercise_5/solution/conda.yml | Bumps mlflow/wandb versions. |
| lesson-2-data-exploration-and-preparation/exercises/exercise_4/starter/conda.yml | Fixes ipywidgets pin + bumps wandb version. |
| lesson-2-data-exploration-and-preparation/exercises/exercise_4/starter/MLproject | Adds MLflow project entry for launching the EDA notebook. |
| lesson-2-data-exploration-and-preparation/exercises/exercise_4/solution/conda.yml | Fixes ipywidgets pin + bumps wandb version. |
| lesson-2-data-exploration-and-preparation/demo/ydata_profiling/conda.yml | Bumps wandb version. |
| lesson-1-machine-learning-pipelines/exercises/exercise_3/starter/process_data/conda.yml | Bumps wandb version. |
| lesson-1-machine-learning-pipelines/exercises/exercise_3/starter/main.py | Updates hydra decorator config path for compatibility. |
| lesson-1-machine-learning-pipelines/exercises/exercise_3/starter/download_data/conda.yml | Bumps wandb version. |
| lesson-1-machine-learning-pipelines/exercises/exercise_3/solution/process_data/conda.yml | Bumps wandb version. |
| lesson-1-machine-learning-pipelines/exercises/exercise_3/solution/download_data/conda.yml | Bumps wandb version. |
| lesson-1-machine-learning-pipelines/exercises/exercise_3/solution/conda.yml | Bumps mlflow/wandb versions. |
| lesson-1-machine-learning-pipelines/exercises/exercise_2/solution/conda.yml | Bumps wandb version. |
| environment.yml | Adds base conda environment used by the devcontainer. |
| README.md | Adds Codespaces entrypoint and quickstart link. |
| CODESPACES_QUICKSTART.md | Adds a detailed Codespaces usage guide for students. |
| .vscode/tasks.json | Adds VS Code tasks for common Codespaces workflows. |
| .gitignore | Adds ignores for Codespaces/MLflow/Hydra/W&B outputs and artifacts. |
| .github/workflows/codespaces-prebuild.yml | Adds a workflow intended to support Codespaces prebuilds. |
| .devcontainer/scripts/post-start.sh | Prints startup banner and activates base conda env in the script context. |
| .devcontainer/scripts/post-create.sh | Performs post-create setup incl. W&B login and shell auto-activation tweaks. |
| .devcontainer/scripts/on-create.sh | Initializes basic working directories in the repo. |
| .devcontainer/scripts/cleanup-mlflow-envs.sh | Adds helper to remove MLflow-created conda envs to reclaim disk. |
| .devcontainer/devcontainer.json | Defines Codespaces/devcontainer configuration, extensions, ports, scripts. |
| .devcontainer/Dockerfile | Builds the devcontainer image and base conda env. |

# Check if already configured (avoid duplicates on rebuilds)
if ! grep -q "conda activate ml_workflow_base" ~/.zshrc; then
echo "" >> ~/.zshrc
echo "# Auto-activate ml_workflow_base conda environment" >> ~/.zshrc

Copilot AI Jan 27, 2026

The shell auto-activation snippet only appends `conda activate ...`, but new shells may not have conda initialized (especially since `conda init` ran as root in the Dockerfile). Add a conda init hook/source line (e.g., `source /opt/conda/etc/profile.d/conda.sh` or `eval "$(/opt/conda/bin/conda shell.zsh hook)"`) before `conda activate` so activation actually works.

Suggested change
- echo "# Auto-activate ml_workflow_base conda environment" >> ~/.zshrc
+ echo "# Auto-activate ml_workflow_base conda environment" >> ~/.zshrc
+ echo '# Initialize conda for this shell (needed if conda is not already initialized)' >> ~/.zshrc
+ echo 'if [ -f "/opt/conda/etc/profile.d/conda.sh" ]; then' >> ~/.zshrc
+ echo ' . "/opt/conda/etc/profile.d/conda.sh"' >> ~/.zshrc
+ echo 'fi' >> ~/.zshrc
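
A possible way to apply this suggestion without a long series of `echo` lines is a small idempotent helper. This is only a sketch under stated assumptions, not the PR's actual script: the function name `ensure_conda_autoactivate`, the marker comment, and the `CONDA_SH` variable are hypothetical, while the `/opt/conda/etc/profile.d/conda.sh` path is the one named in the comment above.

```shell
#!/bin/sh
# Sketch: append a "source conda.sh, then activate" block to an rc file once.
# CONDA_SH and MARKER are assumptions made for this illustration.
CONDA_SH="${CONDA_SH:-/opt/conda/etc/profile.d/conda.sh}"
MARKER="# >>> ml_workflow_base auto-activate >>>"

ensure_conda_autoactivate() {
  rc_file="$1"
  env_name="${2:-ml_workflow_base}"
  touch "$rc_file"
  # Skip if the block is already present (avoids duplicates on rebuilds).
  if grep -qF "$MARKER" "$rc_file"; then
    return 0
  fi
  {
    printf '\n%s\n' "$MARKER"
    printf 'if [ -f "%s" ]; then\n' "$CONDA_SH"
    printf '  . "%s"\n' "$CONDA_SH"          # initialize conda for this shell
    printf '  conda activate %s\n' "$env_name"  # only after conda is initialized
    printf 'fi\n'
  } >> "$rc_file"
}

# Hypothetical usage in post-create.sh:
#   ensure_conda_autoactivate "$HOME/.zshrc"
#   ensure_conda_autoactivate "$HOME/.bashrc"
```

Keeping the `conda activate` call inside the same guard means a shell without conda installed starts cleanly instead of printing an error on every new terminal.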

"GitHub.copilot"
],
"settings": {
"python.defaultInterpreterPath": "/opt/conda/bin/python",

Copilot AI Jan 27, 2026

python.defaultInterpreterPath points at /opt/conda/bin/python (the base env), but the installed packages live in the ml_workflow_base env created from environment.yml. Point this to /opt/conda/envs/ml_workflow_base/bin/python so VS Code uses the correct interpreter and finds installed deps.

Suggested change
- "python.defaultInterpreterPath": "/opt/conda/bin/python",
+ "python.defaultInterpreterPath": "/opt/conda/envs/ml_workflow_base/bin/python",

Comment on lines +20 to +23
- name: Trigger Prebuild
run: |
echo "Prebuild trigger for Codespaces"
echo "Configure in: Settings → Codespaces → Prebuilds"

Copilot AI Jan 27, 2026

This workflow is named as a “Prebuild” but it doesn’t perform any devcontainer build/prebuild work (it only echoes text). Either remove it (and rely on Codespaces prebuild settings) or implement an actual build/prebuild step (e.g., using the devcontainers/ci action) so it matches its purpose and doesn’t consume Actions runs for no effect.

---

## Github Codespaces Limits

Copilot AI Jan 27, 2026

Branding/capitalization: use “GitHub” instead of “Github”.

Suggested change
- ## Github Codespaces Limits
+ ## GitHub Codespaces Limits

Comment on lines +11 to +16
"label": "Create Exercise Environment",
"type": "shell",
"command": "conda env create -f conda.yml",
"group": "build",
"problemMatcher": []
},

Copilot AI Jan 27, 2026

These tasks run from the workspace root by default, but commands like `conda env create -f conda.yml` and `mlflow run .` require running inside an exercise directory (where `conda.yml`/`MLproject` exist). Consider setting `options.cwd` (e.g., `${fileDirname}`) or wrapping each command to `cd` into the intended folder to avoid immediate task failures.
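
The "wrap each command" option mentioned here could look roughly like the following. This is a hedged sketch, not part of the PR: the helper name `run_in_exercise_dir` and its argument order are hypothetical. It checks that the expected file exists before running anything, so a task started from the wrong folder fails with a readable message.

```shell
#!/bin/sh
# Sketch: run a command from inside an exercise directory.
# run_in_exercise_dir <dir> <required-file> <command...>  (hypothetical helper)
run_in_exercise_dir() {
  exercise_dir="$1"
  required_file="$2"
  shift 2
  if [ ! -f "$exercise_dir/$required_file" ]; then
    echo "No $required_file found in $exercise_dir; open a file inside an exercise first." >&2
    return 1
  fi
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$exercise_dir" && "$@" )
}

# Hypothetical usage from a task command line:
#   run_in_exercise_dir "$PWD" conda.yml conda env create -f conda.yml
#   run_in_exercise_dir "$PWD" MLproject mlflow run .
```

Setting `options.cwd` in the task definition achieves the same effect declaratively; the wrapper is mainly useful when the same check should also apply outside VS Code.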

Comment on lines +35 to +39
echo "⚠️ Legacy API key detected. Consider upgrading to v1 format."
fi
wandb login --relogin <<< "$WANDB_API_KEY"
echo "✓ W&B login successful!"
fi

Copilot AI Jan 27, 2026

With set -e, a transient failure in wandb login (network issues, invalid key, W&B outage) will abort post-create.sh and can leave Codespaces setup in a failed state. Handle wandb login failures gracefully (e.g., capture exit code and print instructions) so setup can continue.
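
One way to handle this under `set -e` is to run the login inside an `if` condition, where a non-zero exit status does not abort the script. The sketch below assumes a hypothetical helper named `run_optional_step`; the commented usage mirrors the `wandb login --relogin` call from the diff and is not taken from the PR itself.

```shell
#!/bin/sh
# Sketch: run a non-critical setup step without letting `set -e` abort the script.
run_optional_step() {
  label="$1"
  shift
  # A command tested by `if` does not trigger `set -e` when it fails.
  if "$@"; then
    echo "✓ $label succeeded"
  else
    status=$?
    echo "⚠️ $label failed (exit $status); continuing setup." >&2
    echo "   You can retry it manually from a terminal later." >&2
  fi
  return 0
}

# Hypothetical usage in post-create.sh:
#   wandb_login() { printf '%s\n' "$WANDB_API_KEY" | wandb login --relogin; }
#   run_optional_step "W&B login" wandb_login
```

With this shape the success message is only printed when the login actually succeeds, which the original unconditional `echo "✓ W&B login successful!"` does not guarantee once failures are tolerated.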

Comment on lines +7 to +11
"hostRequirements": {
"cpus": 4,
"memory": "16gb",
"storage": "32gb"
},

Copilot AI Jan 27, 2026

These hostRequirements force fairly large resources (4 CPU / 16GB / 32GB). That can prevent Codespaces from being created on smaller/free-tier machine types and conflicts with the quickstart guidance that 2-core is sufficient. Consider lowering/removing these requirements or updating the docs to match the enforced minimum.

- **Stop when done**: Click Codespace name → "Stop codespace" (don't leave it running)
- **Auto-stop**: Set timeout to 30 minutes (GitHub Settings → Codespaces)
- **Delete old codespaces**: Keep only your active one
- **Use 2-core machine**: Sufficient for *All Lessons**

Copilot AI Jan 27, 2026

Markdown formatting is broken here (*All Lessons** has mismatched asterisks). Use consistent emphasis (e.g., **All Lessons**) so the rendered guide reads correctly.

Suggested change
- - **Use 2-core machine**: Sufficient for *All Lessons**
+ - **Use 2-core machine**: Sufficient for **All Lessons**

# Run exercises
mlflow run .
mlflow run . -P steps=download

Copilot AI Jan 27, 2026

`mlflow run . -P steps=download` won’t work for these exercises: the MLproject files define a `hydra_options` parameter, not `steps`. Update this example to use the documented pattern (e.g., `-P hydra_options="main.execute_steps='download'"`).

Suggested change
- mlflow run . -P steps=download
+ mlflow run . -P hydra_options="main.execute_steps='download'"
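
Because the nested quoting in this argument is easy to mistype, a tiny helper can build it. This is only an illustrative sketch: the function name `hydra_steps_param` is hypothetical, and it simply reproduces the `main.execute_steps` pattern from the suggestion above for a given step name.

```shell
#!/bin/sh
# Sketch: build the value for `mlflow run -P` that selects a pipeline step.
# hydra_steps_param is a hypothetical helper, not part of the repository.
hydra_steps_param() {
  printf "hydra_options=main.execute_steps='%s'\n" "$1"
}

# Hypothetical usage inside an exercise directory:
#   mlflow run . -P "$(hydra_steps_param download)"
```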

README.md Outdated
@@ -1,5 +1,19 @@
# Build a Reproducible Model Workflow - Exercises

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new)

Copilot AI Jan 27, 2026

The Codespaces badge link points to https://codespaces.new without specifying the repository, so it won’t reliably open/create a codespace for this repo. Update the link target to https://codespaces.new/<owner>/<repo> (or the full repo URL form recommended by GitHub) so the badge is repo-specific.

Suggested change
- [![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new)
+ [![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/<owner>/<repo>)
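
To fill in `<owner>/<repo>` without guessing, the slug can be derived from a Git remote URL. The following is a minimal sketch under assumptions: the function name `codespaces_url_from_remote` is hypothetical, it only handles the common `https://github.com/...` and `git@github.com:...` remote forms, and the `octo-org/demo` repository in the usage comment is a made-up example.

```shell
#!/bin/sh
# Sketch: turn a GitHub remote URL into a repo-specific codespaces.new link.
codespaces_url_from_remote() {
  remote_url="$1"
  # Strip the host prefix (SSH or HTTPS form) and an optional trailing ".git".
  slug="$(printf '%s\n' "$remote_url" \
    | sed -e 's#^git@github\.com:##' -e 's#^https://github\.com/##' -e 's#\.git$##')"
  printf 'https://codespaces.new/%s\n' "$slug"
}

# Hypothetical usage:
#   codespaces_url_from_remote "git@github.com:octo-org/demo.git"
#   codespaces_url_from_remote "$(git remote get-url origin)"
```

Note that on a fork the `origin` remote usually points at the fork, so the upstream repository's URL may be the better input for a badge in the README.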

3 participants