feat: add cols_label_with() method
#626
base: main
Changes from 5 commits
d53ae18
43766cc
f89e19b
22d93b1
ee637b8
2852f26
1ae6f8a
2ee205a
a060849
73c08f9
b3564d2
1d4b4e6
25e661c
50542f8
ce5d5f1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,9 +1,9 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from typing import TYPE_CHECKING | ||
| from typing import Callable, TYPE_CHECKING | ||
|
|
||
| from ._locations import resolve_cols_c | ||
| from ._utils import _assert_list_is_subset | ||
| from ._utils import _assert_list_is_subset, _handle_units_syntax | ||
| from ._tbl_data import SelectExpr | ||
| from ._text import BaseText | ||
|
|
||
|
|
@@ -114,8 +114,6 @@ def cols_label( | |
| ) | ||
| ``` | ||
| """ | ||
| from great_tables._helpers import UnitStr | ||
|
|
||
| cases = cases if cases is not None else {} | ||
| new_cases = cases | kwargs | ||
|
|
||
|
|
@@ -132,24 +130,80 @@ def cols_label( | |
| _assert_list_is_subset(mod_columns, set_list=column_names) | ||
|
|
||
| # Handle units syntax in labels (e.g., "Density ({{ppl / mi^2}})") | ||
| new_kwargs: dict[str, UnitStr | str | BaseText] = {} | ||
| new_kwargs = _handle_units_syntax(new_cases) | ||
|
|
||
| boxhead = self._boxhead._set_column_labels(new_kwargs) | ||
|
|
||
| return self._replace(_boxhead=boxhead) | ||
|
|
||
|
|
||
| def cols_label_with(self: GTSelf, fn: Callable[[str], str], columns: SelectExpr = None) -> GTSelf: | ||
| """ | ||
| Relabel one or more columns using a function. | ||
|
|
||
| The `cols_label_with()` function allows for modification of column labels through a supplied | ||
| function. By default, the function will be invoked on all column labels but this can be limited | ||
| to a subset via the `columns` parameter. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| fn | ||
| A function that accepts a column label as input and returns a transformed label as output. | ||
|
|
||
| columns | ||
| The columns to target. Can either be a single column name or a series of column names | ||
| provided in a list. | ||
|
|
||
| Returns | ||
| ------- | ||
| GT | ||
| The GT object is returned. This is the same object that the method is called on so that we | ||
| can facilitate method chaining. | ||
|
|
||
| Notes | ||
| ----- | ||
| GT always selects columns using their name in the underlying data. This means that a column's | ||
| label is purely for final presentation. | ||
|
|
||
| for k, v in new_cases.items(): | ||
| if isinstance(v, str): | ||
| unitstr_v = UnitStr.from_str(v) | ||
| Examples | ||
| -------- | ||
| Let's use a subset of the `sp500` dataset to create a gt table. | ||
| ```{python} | ||
| from great_tables import GT, md | ||
| from great_tables.data import sp500 | ||
|
|
||
| if len(unitstr_v.units_str) == 1 and isinstance(unitstr_v.units_str[0], str): | ||
| new_kwargs[k] = unitstr_v.units_str[0] | ||
| else: | ||
| new_kwargs[k] = unitstr_v | ||
| gt = GT(sp500.head()) | ||
| gt | ||
| ``` | ||
|
|
||
| elif isinstance(v, BaseText): | ||
| new_kwargs[k] = v | ||
| We can pass `str.upper` to the `fn` parameter to convert all column labels to uppercase. | ||
| ```{python} | ||
| gt.cols_label_with(str.upper) | ||
| ``` | ||
|
|
||
| A useful application is formatting column labels with `md()`, provided by **Great Tables**. | ||
| For example, the following code makes the `date` and `adj_close` column labels | ||
| bold using Markdown syntax. | ||
| ```{python} | ||
| gt.cols_label_with(lambda x: md(f"**{x}**"), columns=["date", "adj_close"]) | ||
| ``` | ||
|
|
||
| else: | ||
| raise ValueError( | ||
| "Column labels must be strings or BaseText objects. Use `md()` or `html()` for formatting." | ||
| ) | ||
| """ | ||
| # Get the full list of column names for the data | ||
| column_names = self._boxhead._get_columns() | ||
|
|
||
| if isinstance(columns, str): | ||
| columns = [columns] | ||
| _assert_list_is_subset(columns, set_list=column_names) | ||
| elif columns is None: | ||
| columns = column_names | ||
|
|
||
| sel_cols = resolve_cols_c(data=self, expr=columns) | ||
|
|
||
| new_cases = {col: fn(col) for col in sel_cols} | ||
|
|
||
| # Handle units syntax in labels (e.g., "Density ({{ppl / mi^2}})") | ||
| new_kwargs = _handle_units_syntax(new_cases) | ||
|
||
|
|
||
| boxhead = self._boxhead._set_column_labels(new_kwargs) | ||
|
|
||
|
|
||
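The core of the new method in the diff above is a dictionary comprehension that applies `fn` to each selected column name. Below is a minimal sketch of that logic using plain Python only. The `relabel_columns` helper is hypothetical and stands in for the method body. It skips `resolve_cols_c` and `_handle_units_syntax`, which are internal to the library, and it validates list selections as well as single names, which the diff leaves to column resolution.

```python
from __future__ import annotations

from typing import Callable, Sequence


def relabel_columns(
    column_names: Sequence[str],
    fn: Callable[[str], str],
    columns: str | Sequence[str] | None = None,
) -> dict[str, str]:
    """Hypothetical stand-in for the mapping step of cols_label_with()."""
    if isinstance(columns, str):
        # A single column name is treated as a one-element selection.
        columns = [columns]
    elif columns is None:
        # No selection means every column is relabeled.
        columns = list(column_names)

    # Assumption: reject names that are not in the table.
    unknown = [col for col in columns if col not in column_names]
    if unknown:
        raise ValueError(f"Unknown columns: {unknown}")

    # Map each column name to its transformed label.
    return {col: fn(col) for col in columns}


names = ["date", "open", "adj_close"]
all_upper = relabel_columns(names, str.upper)
bold_date = relabel_columns(names, lambda x: f"**{x}**", columns="date")
```

Note that `fn` receives the column name, not any label set earlier, which matches the comprehension `{col: fn(col) for col in sel_cols}` in the diff.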
From pairing w/ Rich: @jrycw, WDYT of switching the order of the parameters, so `columns` is before `fn`?

Looking through the `cols_*()` methods in the R library, column selection usually comes first. In the case of `cols_align()`, where it doesn't, it sounds like it's a historical artifact. Keeping `columns` first might help cement a pattern for `cols_*()` methods.

It makes sense you put it first in the PR, since it doesn't have a default. WDYT of us setting it to `fn = None`, and then erroring if `fn` is None?
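One way to read the suggestion above is sketched below. This is not the merged implementation: the function name `cols_label_with_sketch` and the error message are assumptions made for illustration. The point is only that `fn` can sit after `columns` once it has a `None` default, with an explicit check replacing the missing-argument error Python would otherwise raise.

```python
from __future__ import annotations

from typing import Callable, Sequence


def cols_label_with_sketch(
    column_names: Sequence[str],
    columns: str | Sequence[str] | None = None,
    fn: Callable[[str], str] | None = None,
) -> dict[str, str]:
    # Hypothetical signature: `columns` comes first, `fn` defaults to None.
    if fn is None:
        # Assumed wording; the real message may differ.
        raise ValueError("`fn=` must be provided.")

    if columns is None:
        targets = list(column_names)
    elif isinstance(columns, str):
        targets = [columns]
    else:
        targets = list(columns)

    return {col: fn(col) for col in targets}


# Callers would typically pass `fn` by keyword under this ordering.
result = cols_label_with_sketch(["date", "open"], columns="open", fn=str.title)
```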
@machow, nice suggestion! I definitely like your idea of introducing `fn=` to `cols_label()` using Polars syntax.
Thanks to @machow for the comment! I think this is a good opportunity to introduce `fn=` to accept a callable in `cols_label()` as well.

By the way, I noticed that we have two test files for boxhead, `test_boxhead.py` and `test__boxhead.py`. I picked one to add the new test, but perhaps we should consider consolidating them in the future.
As a side note, we might consider using the pl.Expr.name attribute to handle Polars syntax. For example:
Well, I've attempted to implement this in the latest commit. I admit it's a bit of a bold move, so feel free to set it aside for now.
Here’s a simple example:
When passing a list of Polars expressions, if a column is referenced multiple times, we can either superimpose the transformations or follow a "last one wins" approach (current implementation).
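The two options can be illustrated without Polars. In the sketch below, each `(column, fn)` pair is a hypothetical stand-in for one Polars expression that targets a column and renames it. Neither function is taken from the PR; they only contrast the two behaviors described above.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical stand-in for a renaming expression: (column name, transform).
Rename = tuple[str, Callable[[str], str]]


def last_one_wins(renames: list[Rename]) -> dict[str, str]:
    # Later entries overwrite earlier ones for the same column.
    chosen: dict[str, Callable[[str], str]] = {}
    for col, fn in renames:
        chosen[col] = fn
    return {col: fn(col) for col, fn in chosen.items()}


def superimpose(renames: list[Rename]) -> dict[str, str]:
    # Each transform is applied to the result of the previous one.
    labels: dict[str, str] = {}
    for col, fn in renames:
        labels[col] = fn(labels.get(col, col))
    return labels


renames = [("open", str.upper), ("open", lambda s: f"{s}_usd")]
overwritten = last_one_wins(renames)  # only the second transform applies
stacked = superimpose(renames)        # both transforms apply, in order
```

Under "last one wins" the label for `open` comes from the final transform alone, whereas superimposing applies the uppercase step first and the suffix second.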
Thanks for experimenting with this. If it's okay with you, I think we need a little bit more time to chew on this approach / run it past some folks, since it's a bit cutting edge in terms of external libraries using Polars selectors. I definitely think it'll be useful to figure out in the long run, though.