Open
Labels
Arrow (pyarrow functionality), Performance (Memory or execution speed performance), Strings (String extension data type and string data)
Description
In [1]: import pandas as pd
In [2]: pd.Series(["a"]).array
> /pandas/core/construction.py(695)sanitize_array()
-> subarr = maybe_convert_platform(data)
(Pdb) data
['a']
(Pdb) n
> /pandas/core/construction.py(696)sanitize_array()
-> if subarr.dtype == object:
(Pdb) subarr
array(['a'], dtype=object)
(Pdb) c
Out[5]:
<ArrowStringArray>
['a']
Length: 1, dtype: str
In [3]: pd.Series(["a"], dtype=pd.StringDtype("pyarrow"))
Out[3]:
0 a
dtype: string

maybe_convert_platform calls construct_1d_object_array_from_listlike to convert the list to ndarray[object] for further inference. I wonder if we can perform type inference without converting to object, such that for strings we can pass the list directly to ArrowStringArray._from_sequence_of_strings.
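To make the idea concrete, here is a rough sketch of what inference without the object conversion might look like. It is pure Python and does not touch pandas internals: `is_all_strings` and `choose_path` are hypothetical names made up for illustration, and treating `None` as the only missing value is a simplifying assumption. The point is only that a single pass over the list can decide whether the string fast path applies before any ndarray[object] is built.

```python
from typing import Any, Sequence


def is_all_strings(values: Sequence[Any]) -> bool:
    """Return True if every element is a str or None and at least one is a str.

    Hypothetical helper, not part of pandas: it scans the Python list directly
    instead of first materializing an ndarray[object].
    """
    seen_str = False
    for value in values:
        if isinstance(value, str):
            seen_str = True
        elif value is not None:
            # Any non-string, non-missing element rules out the fast path.
            return False
    return seen_str


def choose_path(values: Sequence[Any]) -> str:
    """Name the construction path a caller could take (illustrative only)."""
    if isinstance(values, list) and is_all_strings(values):
        # Here the list could be handed straight to
        # ArrowStringArray._from_sequence_of_strings, as proposed above.
        return "arrow-string-fast-path"
    # Otherwise fall back to the existing object-array inference.
    return "object-array-inference"
```

Whether a Python-level scan like this is actually cheaper than the current object-array route would need benchmarking; a real version would presumably live in Cython next to the existing inference code.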