@Adityarya11 Adityarya11 commented Oct 22, 2025

This PR fixes the "Invalid Notebook" rendering error on GitHub (issue #200), which is caused by invalid or missing metadata in cell outputs: the renderer requires certain keys in display_data outputs and disallows unexpected metadata fields.

What I changed

  • Cleaned the output metadata of the notebook in Build-reasoning-model, which had invalid or missing entries (e.g. metadata: {"tags": null}, or no metadata at all on display_data outputs).
  • Ensured notebooks pass nbformat.validate() locally.
  • Added scripts in scripts/:
    • scripts/fix_notebooks.py — scans and fixes output metadata issues for all .ipynb files.
    • scripts/validate_nb.py — validates notebooks using nbformat.

  • Ran python3 scripts/fix_notebooks.py.
  • Ran python3 scripts/validate_nb.py — all notebooks show VALID.
  • Opened notebook preview in my fork's GitHub and confirmed rendering.

- However, the fix trimmed the notebook and removed some of the original cell outputs, since those outputs were the main cause of the rendering problem.

- If this helps, let me know and I will go through each of these issues and try to fix them.

coderabbitai bot commented Oct 22, 2025

Walkthrough

Two new utility scripts are added to the Build-reasoning-model directory. fix_notebooks.py repairs metadata issues in Jupyter notebook outputs by replacing invalid metadata with empty dictionaries, while validate_nb.py validates notebook structure across the repository using nbformat validation.
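
The validation step described above can be sketched roughly as follows. This is a minimal illustration, not the contents of validate_nb.py, which are not shown in this thread. The function names find_notebooks, validate_notebook, and report are hypothetical, and skipping .ipynb_checkpoints is an assumption. The nbformat calls used (nbformat.read, nbformat.validate, nbformat.ValidationError) are part of the library's public API.

```python
from pathlib import Path

try:
    import nbformat  # third-party dependency: pip install nbformat
except ImportError:
    nbformat = None  # lets the module import; validate_notebook() reports the problem


def find_notebooks(root):
    """Return all .ipynb files under root, skipping Jupyter checkpoint copies."""
    # Assumption: checkpoint copies are not worth validating.
    return sorted(
        path
        for path in Path(root).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )


def validate_notebook(path):
    """Return None if the notebook validates, otherwise a short error message."""
    if nbformat is None:
        raise RuntimeError("nbformat is not installed")
    with open(path, encoding="utf-8") as handle:
        nb = nbformat.read(handle, as_version=4)
    try:
        nbformat.validate(nb)
    except nbformat.ValidationError as exc:
        lines = str(exc).splitlines()
        return lines[0] if lines else "validation error"
    return None


def report(root):
    """Print VALID or INVALID per notebook and return the number of invalid ones."""
    invalid = 0
    for path in find_notebooks(root):
        error = validate_notebook(path)
        if error is None:
            print(f"VALID    {path}")
        else:
            invalid += 1
            print(f"INVALID  {path}: {error}")
    return invalid
```

Calling report(".") from the repository root prints one line per notebook, and the returned count could serve as a process exit status in CI.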

Changes

| Cohort / File(s) | Summary |
|---|---|
| Notebook utilities: Build-reasoning-model/fix_notebooks.py, Build-reasoning-model/validate_nb.py | New scripts: fix_notebooks.py repairs metadata issues in notebook outputs by handling None values and missing metadata fields; validate_nb.py traverses repository to validate notebook structure and reports validity status per file |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Hop, hop, through notebooks we go,
Metadata bugs begone, validation's the show!
Broken cells fixed, structures are sound,
A clean Jupyter kingdom is what we have found! ✨📓

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The pull request title "fix(notebooks): clean output metadata and validate notebooks for GitHub rendering" is directly related to the main changes in the changeset. The title clearly captures both primary additions: the fix_notebooks.py script for cleaning metadata and the validate_nb.py script for validating notebooks. The title is concise, specific, and provides meaningful context about the purpose of the changes (fixing GitHub rendering errors), allowing teammates to quickly understand the core intent of the PR. The language is clear and avoids vague or generic terms. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (1)
Build-reasoning-model/fix_notebooks.py (1)

10-26: Metadata fixing logic is sound.

The logic correctly handles the two cases mentioned in the PR objectives:

  1. Replacing invalid metadata (None or containing None values) with empty dicts
  2. Adding missing metadata for display_data outputs

The use of changed flag and defensive nb.get("cells", []) is good practice.

For additional robustness, consider adding error handling for unexpected notebook structures:

for cell in nb.get("cells", []):
    outputs = cell.get("outputs")
    if not outputs or not isinstance(outputs, list):
        continue
    for output in outputs:
        if not isinstance(output, dict):
            continue
        # existing logic...
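
Combining the two cases listed above with these defensive checks, the fix could look roughly like the sketch below. It is an illustration under stated assumptions rather than the actual fix_notebooks.py, which is not reproduced in this thread. The names is_invalid_metadata, fix_output_metadata, and fix_notebook_file are hypothetical, and writing the file back with indent=1 is an assumption about the desired on-disk format.

```python
import json
from pathlib import Path


def is_invalid_metadata(metadata):
    """True if metadata is not a dict or holds a None value, e.g. {"tags": None}."""
    if not isinstance(metadata, dict):
        return True
    return any(value is None for value in metadata.values())


def fix_output_metadata(nb):
    """Repair output metadata in place; return True if anything was changed."""
    changed = False
    for cell in nb.get("cells", []):
        if not isinstance(cell, dict):
            continue
        outputs = cell.get("outputs")
        if not isinstance(outputs, list):
            continue  # markdown cells and malformed cells have no output list
        for output in outputs:
            if not isinstance(output, dict):
                continue
            if "metadata" in output:
                # Case 1: invalid metadata is replaced with an empty dict.
                if is_invalid_metadata(output["metadata"]):
                    output["metadata"] = {}
                    changed = True
            elif output.get("output_type") == "display_data":
                # Case 2: display_data outputs must carry a metadata key.
                output["metadata"] = {}
                changed = True
    return changed


def fix_notebook_file(path):
    """Fix one notebook on disk; rewrite it only if something changed."""
    path = Path(path)
    nb = json.loads(path.read_text(encoding="utf-8"))
    changed = fix_output_metadata(nb)
    if changed:
        # Assumption: one-space indentation, matching Jupyter's usual output.
        text = json.dumps(nb, indent=1, ensure_ascii=False) + "\n"
        path.write_text(text, encoding="utf-8")
    return changed
```

Returning the changed flag lets a caller report which notebooks were rewritten and leaves untouched files byte-for-byte identical, which keeps the diff small.
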
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ff981f0 and 4df8b95.

📒 Files selected for processing (2)
  • Build-reasoning-model/fix_notebooks.py (1 hunks)
  • Build-reasoning-model/validate_nb.py (1 hunks)
🔇 Additional comments (1)
Build-reasoning-model/validate_nb.py (1)

1-3: LGTM!

The imports are appropriate for the notebook validation functionality.
