docs: add python notebooks guide #190
amoeba
left a comment
Thanks @esadek! This looks pretty good. I left some comments.
Print the table:

```python
print(table)
```
How about adding the output? It may be helpful if users aren't that familiar with PyArrow or the dataset.
docs/guides/python_notebooks.md
Outdated
# Python Notebooks

dbc can be installed and used directly in Python notebooks (such as Jupyter or Google Colab).
Considering something like marimo, I wonder if it's appropriate to use the term "Python Notebook" to refer only to Jupyter / ipynb.
(IIUC, Colab is also Jupyter.)
So, I think this page should actually be called Jupyter instead of Python Notebook.
Or even better: @esadek, could you test dbc in a Marimo notebook and include instructions for how to use it there?
On marimo, we should use something like `import subprocess; subprocess.run(["dbc", "install", "duckdb"])`
https://docs.marimo.io/guides/coming_from/jupyter/#magic-commands
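For reference, a slightly more defensive version of that one-liner might look like the sketch below. `run_dbc` is a hypothetical helper name, not part of dbc, and the sketch assumes the `dbc` executable is already on the notebook kernel's `PATH`.

```python
import shutil
import subprocess


def run_dbc(*args, executable="dbc"):
    """Run a dbc subcommand from a notebook cell (hypothetical helper).

    Raises FileNotFoundError if the executable cannot be found, and
    subprocess.CalledProcessError if the command exits with a non-zero status.
    """
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable!r} not found on PATH; install dbc first")
    # capture_output/text let the cell display the command's output as a string
    return subprocess.run(
        [executable, *args], check=True, capture_output=True, text=True
    )
```

In a cell, `print(run_dbc("install", "duckdb").stdout)` would then stand in for the `!dbc install duckdb` shell escape used in Jupyter.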
Yeah, when reviewing this PR I looked at and tested marimo and I don't know if it makes sense to do more than add a note about it. I doubt we want to tell users to run dbc like the above since it's not very convenient.
Yep, I think a note like this would be sufficient:
If you're using a Python notebook that doesn't support magics (`%`) or shell escapes (`!`), then use `subprocess.run` to run dbc commands from your Python code.
After testing marimo, I've concluded that a dedicated guide or section would be best.
In marimo, packages are typically managed via the terminal (external or internal), inline script metadata, or the integrated "Manage packages" panel. This differs from Jupyter or Google Colab, where `!pip install` is common.
Drivers can be installed with dbc via the terminal (external or internal) or in a cell using `subprocess.run`.
Locally, `dbapi.connect(driver="duckdb")` works as expected. However, on molab (marimo’s cloud platform), `adbc_driver_manager` throws an error. Instead, the full driver path must be used: `dbapi.connect(driver="/tmp/uv-venv/etc/adbc/drivers/duckdb")`.
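One possible way to handle both environments in a single cell is to try the short driver name first and fall back to the full path. This is only a sketch: `connect_with_fallback` is a hypothetical helper, and it takes the connect function as an argument so it does not depend on any particular `adbc_driver_manager` behavior.

```python
def connect_with_fallback(connect, candidates):
    """Try each driver name or path in order and return the first connection.

    `connect` is any callable accepting a `driver=` keyword, for example
    `adbc_driver_manager.dbapi.connect`. Hypothetical helper for illustration.
    """
    errors = []
    for driver in candidates:
        try:
            return connect(driver=driver)
        except Exception as exc:  # collect the error and try the next candidate
            errors.append((driver, exc))
    details = "; ".join(f"{d!r}: {e}" for d, e in errors)
    raise RuntimeError(f"no driver candidate worked ({details})")
```

With `from adbc_driver_manager import dbapi`, the call would be `connect_with_fallback(dbapi.connect, ["duckdb", "/tmp/uv-venv/etc/adbc/drivers/duckdb"])`, where the second entry is the molab path reported above and may differ in other environments.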
Thanks for looking into this @esadek. Could you open a separate issue for us to look into the molab problem? Thanks.
Co-authored-by: Bryce Mecum <[email protected]>
Add a guide for installing and using dbc in Python notebooks.
The included code has been tested in both a local notebook and Google Colab.
Closes #116