[FEATURE] Narwhals migration for dataframe-agnostic codebase #658

@FBruzzesi

Description

Creating this issue to keep track of which classes/functions could benefit from adopting Narwhals.

| Class/Function/Module | Status | Related | Notes |
| --- | --- | --- | --- |
| preprocessing.ColumnDropper | ✅ | Solved in #655 | |
| preprocessing.ColumnSelector | ✅ | Solved in #659 | |
| preprocessing.PandasTypeSelector | ✅ | Solved in #670 | Consider creating another class TypeSelector and deprecating this one |
| common.TrainOnlyTransformerMixin | 🔲 | | Uses pandas-specific hashing functionality. I wonder how crucial it is to hash the index as well. If it is not, we could just use .to_numpy() and hash the array data? |
| model_selection.TimeGapSplit | ✅ | Solved in #668 | |
| model_selection.GroupedTimeSeriesSplit | 🔲 | Related to #605 | |
| projections.InformationFilter | ✅ | Solved in commit | |
| meta.RegressionOutlierDetector | ✅ | Solved in #665 | |
| meta.hierarchical_predictor.py | ✅ | Solved in #667 | |
| meta.grouped_transformer.py | ✅ | Solved in #667 | |
| meta.grouped_predictor.py | ✅ | Solved in #667 | |
| linear_models._FairClassifier | ✅ | Solved in #669 | |
| pandas_utils.py | 🚧 | Partially #661 | (*) |
| datasets.py | 🚫 | | Would require a read_csv function |

Personally, I would wait until at least preprocessing.pandastransformers.py is entirely migrated before bumping to v0.9.0.

cc @MarcoGorelli @anopsy

(*) Regarding pandas_utils

Changing log_step to narwhals is fairly easy (~4 lines of code). However, since this decorator is supposed to work for any function that operates on pandas objects, doing so would limit its functionality. It could be reasonable to add a second decorator which, although restricted to narwhals methods, can interoperate with all narwhals-compatible dataframes.

Legend

✅ Done
🚧 WIP
🔲 Not Started
🚫 Won't do
