lorenzorubi-db
Contributor

Adds a new DataExplorer method, map_chunked, as an alternative to map:

  • map processes the tables one by one
  • map_chunked processes the tables in chunks of size tables_per_chunk
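The difference between the two modes can be sketched as follows. This is only an illustration, not the code in this PR: the helper names map_one_by_one and map_in_chunks, the table names, and the assumption that f receives a list of tables per chunk are all hypothetical.

```python
from typing import Any, Callable


def map_one_by_one(tables: list[str], f: Callable[[str], Any]) -> list[Any]:
    # Hypothetical stand-in for map: one call to f per table.
    return [f(table) for table in tables]


def map_in_chunks(
    tables: list[str], f: Callable[[list[str]], Any], tables_per_chunk: int
) -> list[Any]:
    # Hypothetical stand-in for map_chunked: one call to f per chunk of tables.
    if tables_per_chunk < 1:
        raise ValueError("tables_per_chunk must be at least 1")
    results = []
    for start in range(0, len(tables), tables_per_chunk):
        chunk = tables[start:start + tables_per_chunk]
        results.append(f(chunk))
    return results


# Made-up table names, used only to show the shape of the results.
tables = ["a.b.t1", "a.b.t2", "a.b.t3", "a.b.t4", "a.b.t5"]
per_table = map_one_by_one(tables, len)  # len of each table name
per_chunk = map_in_chunks(tables, len, tables_per_chunk=2)  # size of each chunk
```

With five tables and tables_per_chunk=2, f is called three times instead of five, and the last chunk holds the single leftover table.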


return res

def map_chunked(self, f: Callable, tables_per_chunk: int, **kwargs) -> list[any]:
Collaborator

Suggested change
- def map_chunked(self, f: Callable, tables_per_chunk: int, **kwargs) -> list[any]:
+ def map_chunked(self, f: Callable, tables_per_chunk: int, **kwargs) -> list[Any]:

any is a built-in function, not a type; use typing.Any instead.
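A small sketch of why this matters, assuming Python 3.9 or later (needed for the list[...] syntax). The function name map_chunked_stub is hypothetical and stands in for the method under review. Note that list[any] does not fail at runtime, which is why the mistake is easy to miss; static type checkers are what reject it.

```python
from typing import Any, Callable, get_type_hints

# The built-in any is a function that tests an iterable for truthy items.
builtin_any_is_type = isinstance(any, type)
builtin_any_result = any([False, True])


def map_chunked_stub(f: Callable, tables_per_chunk: int, **kwargs) -> list[Any]:
    # Hypothetical stub: only the annotation is of interest here.
    return []


# With typing.Any, the return annotation resolves to a proper generic alias.
return_hint = get_type_hints(map_chunked_stub)["return"]
```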

setup.py Outdated
"delta-spark>=2.2.0",
"pandas<2.0.0", # From 2.0.0 onwards, pandas does not support iteritems() anymore, spark.createDataFrame will fail
"numpy<1.24", # From 1.24 onwards, module 'numpy' has no attribute 'bool'.
"more_itertools",
Collaborator

Create an LPP ticket for this; otherwise, re-implement the single function you need. Don't add a whole library for the sake of one function.
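As a sketch of the second option, and assuming the function needed from more_itertools is something like its chunked helper (the thread does not say which one is used), a standard-library replacement could look like this:

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(iterable: Iterable[T], n: int) -> Iterator[list[T]]:
    # Yield successive lists of up to n items; the last list may be shorter.
    if n < 1:
        raise ValueError("n must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


chunks = list(chunked(range(5), 2))
```

This works on any iterable, not only lists, and needs nothing beyond itertools, so the extra entry in setup.py would no longer be required under that assumption.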

@lorenzorubi-db lorenzorubi-db requested a review from nfx February 3, 2024 20:58
2 participants