Open
Labels
performance: Something related to how fast the library can handle data
Milestone
Description
While working with a relatively large dataset (~10M rows), I observed that rendering the table for a pivot result is extremely slow: about 10 seconds for a 10M-row dataframe.
After a brief investigation, I found that the problem lies in converting the pivot object to a dataframe. The conversion is currently implemented as: pivot.frames().toDataFrame()
TODO: investigate what the bottleneck is and optimize the Pivot -> DataFrame conversion.
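As a starting point for the investigation, here is a minimal timing sketch, assuming the code runs on Kotlin/JVM (which the call syntax suggests). The helper `medianMillis` is hypothetical and not part of the library; it uses only the Kotlin standard library. The workload in `main` is a stand-in so the sketch runs on its own. In a real measurement the lambda would wrap the conversion call quoted above.

```kotlin
import kotlin.system.measureNanoTime

// Hypothetical helper (not part of the library): runs `block` a few times
// to warm up the JIT, then returns the middle sample of the timed runs
// in milliseconds (the median when `runs` is odd).
fun <T> medianMillis(warmup: Int = 3, runs: Int = 5, block: () -> T): Double {
    require(runs > 0) { "runs must be positive" }
    repeat(warmup) { block() }
    val samples = List(runs) { measureNanoTime { block() } / 1_000_000.0 }.sorted()
    return samples[samples.size / 2]
}

fun main() {
    // Stand-in workload so the sketch is self-contained. To time the
    // conversion from this issue, replace the lambda body with:
    //     pivot.frames().toDataFrame()
    val millis = medianMillis { (1..1_000_000).sumOf { it.toLong() } }
    println("median: $millis ms")
}
```

Timing `pivot.frames()` and the subsequent `toDataFrame()` in separate `medianMillis` calls would show which of the two steps dominates. For anything beyond a rough first look, a profiler or a proper benchmark harness would give more reliable numbers than wall-clock timing.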