You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: doc/source/whatsnew/v2.3.0.rst
-35Lines changed: 0 additions & 35 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -31,39 +31,6 @@ Other enhancements
31
31
- The :meth:`~Series.cumsum`, :meth:`~Series.cummin`, and :meth:`~Series.cummax` reductions are now implemented for :class:`StringDtype` columns (:issue:`60633`)
32
32
- The :meth:`~Series.sum` reduction is now implemented for :class:`StringDtype` columns (:issue:`59853`)
In previous versions, comparing :class:`Series` of different string dtypes (e.g. ``pd.StringDtype("pyarrow", na_value=pd.NA)`` against ``pd.StringDtype("python", na_value=np.nan)``) would result in inconsistent resulting dtype or incorrectly raise. pandas will now use the hierarchy
in determining the result dtype when there are different string dtypes compared. Some examples:
52
-
53
-
- When ``pd.StringDtype("pyarrow", na_value=pd.NA)`` is compared against any other string dtype, the result will always be ``boolean[pyarrow]``.
54
-
- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("pyarrow", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
55
-
- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("python", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
56
-
57
-
.. _whatsnew_230.api_changes:
58
-
59
-
API changes
60
-
~~~~~~~~~~~
61
-
62
-
- When enabling the ``future.infer_string`` option, :class:`Index` set operations (like
63
-
union or intersection) will now ignore the dtype of an empty :class:`RangeIndex` or
64
-
empty :class:`Index` with ``object`` dtype when determining the dtype of the resulting
- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
89
-
- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
90
55
- Bug in :meth:`Series.__pos__` and :meth:`DataFrame.__pos__` where an ``Exception`` was not raised for :class:`StringDtype` with ``storage="pyarrow"`` (:issue:`60710`)
91
56
- Bug in :meth:`Series.rank` for :class:`StringDtype` with ``storage="pyarrow"`` that incorrectly returned integer results with ``method="average"`` and raised an error if it would truncate results (:issue:`59768`)
92
57
- Bug in :meth:`Series.replace` with :class:`StringDtype` when replacing with a non-string value was not upcasting to ``object`` dtype (:issue:`60282`)
In previous versions, comparing :class:`Series` of different string dtypes (e.g. ``pd.StringDtype("pyarrow", na_value=pd.NA)`` against ``pd.StringDtype("python", na_value=np.nan)``) would result in inconsistent resulting dtype or incorrectly raise. pandas will now use the hierarchy
in determining the result dtype when there are different string dtypes compared. Some examples:
27
+
28
+
- When ``pd.StringDtype("pyarrow", na_value=pd.NA)`` is compared against any other string dtype, the result will always be ``boolean[pyarrow]``.
29
+
- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("pyarrow", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
30
+
- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("python", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
31
+
32
+
.. _whatsnew_231.string_fixes.ignore_empty:
33
+
34
+
Index set operations ignore empty RangeIndex and object dtype Index
When enabling the ``future.infer_string`` option, :class:`Index` set operations (like
38
+
union or intersection) will now ignore the dtype of an empty :class:`RangeIndex` or
39
+
empty :class:`Index` with ``object`` dtype when determining the dtype of the resulting
40
+
Index (:issue:`60797`).
41
+
42
+
This ensures that combining such empty Index with strings will infer the string dtype
43
+
correctly, rather than defaulting to ``object`` dtype. For example:
44
+
45
+
.. code-block:: python
46
+
47
+
>>> pd.options.mode.infer_string =True
48
+
>>> df = pd.DataFrame()
49
+
>>> df.columns.dtype
50
+
dtype('int64') # default RangeIndex for empty columns
51
+
>>> df["a"] = [1, 2, 3]
52
+
>>> df.columns.dtype
53
+
<StringDtype(na_value=nan)># new columns use string dtype instead of object dtype
54
+
55
+
.. _whatsnew_231.string_fixes.bugs:
56
+
57
+
Bug fixes
58
+
^^^^^^^^^
59
+
- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
60
+
- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
61
+
- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)
13
62
14
-
Enhancements
15
-
~~~~~~~~~~~~
16
-
-
17
63
18
64
.. _whatsnew_231.regressions:
19
65
@@ -26,7 +72,7 @@ Fixed regressions
26
72
27
73
Bug fixes
28
74
~~~~~~~~~
29
-
- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)
Copy file name to clipboardExpand all lines: doc/source/whatsnew/v3.0.0.rst
+8Lines changed: 8 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -28,6 +28,9 @@ Enhancement2
28
28
29
29
Other enhancements
30
30
^^^^^^^^^^^^^^^^^^
31
+
- :func:`pandas.merge` propagates the ``attrs`` attribute to the result if all
32
+
inputs have identical ``attrs``, as has so far already been the case for
33
+
:func:`pandas.concat`.
31
34
- :class:`pandas.api.typing.FrozenList` is available for typing the outputs of :attr:`MultiIndex.names`, :attr:`MultiIndex.codes` and :attr:`MultiIndex.levels` (:issue:`58237`)
32
35
- :class:`pandas.api.typing.SASReader` is available for typing the output of :func:`read_sas` (:issue:`55689`)
33
36
- Added :meth:`.Styler.to_typst` to write Styler objects to file, buffer or string in Typst format (:issue:`57617`)
@@ -318,6 +321,8 @@ Optional libraries below the lowest tested version may still work, but are not c
318
321
+------------------------+---------------------+
319
322
| Package | New Minimum Version |
320
323
+========================+=====================+
324
+
| pyarrow | 12.0.1 |
325
+
+------------------------+---------------------+
321
326
| pytz | 2023.4 |
322
327
+------------------------+---------------------+
323
328
| fastparquet | 2024.2.0 |
@@ -745,9 +750,11 @@ Indexing
745
750
- Bug in :meth:`DataFrame.__getitem__` returning modified columns when called with ``slice`` in Python 3.12 (:issue:`57500`)
746
751
- Bug in :meth:`DataFrame.__getitem__` when slicing a :class:`DataFrame` with many rows raised an ``OverflowError`` (:issue:`59531`)
747
752
- Bug in :meth:`DataFrame.from_records` throwing a ``ValueError`` when passed an empty list in ``index`` (:issue:`58594`)
753
+
- Bug in :meth:`DataFrame.loc` and :meth:`DataFrame.iloc` returning incorrect dtype when selecting from a :class:`DataFrame` with mixed data types. (:issue:`60600`)
748
754
- Bug in :meth:`DataFrame.loc` with inconsistent behavior of loc-set with 2 given indexes to Series (:issue:`59933`)
749
755
- Bug in :meth:`Index.get_indexer` and similar methods when ``NaN`` is located at or after position 128 (:issue:`58924`)
750
756
- Bug in :meth:`MultiIndex.insert` when a new value inserted to a datetime-like level gets cast to ``NaT`` and fails indexing (:issue:`60388`)
757
+
- Bug in :meth:`Series.__setitem__` when assigning boolean series with boolean indexer will raise ``LossySetitemError`` (:issue:`57338`)
751
758
- Bug in printing :attr:`Index.names` and :attr:`MultiIndex.levels` would not escape single quotes (:issue:`60190`)
752
759
- Bug in reindexing of :class:`DataFrame` with :class:`PeriodDtype` columns in case of consolidated block (:issue:`60980`, :issue:`60273`)
753
760
@@ -777,6 +784,7 @@ I/O
777
784
- Bug in :meth:`DataFrame.to_excel` when writing empty :class:`DataFrame` with :class:`MultiIndex` on both axes (:issue:`57696`)
778
785
- Bug in :meth:`DataFrame.to_excel` where the :class:`MultiIndex` index with a period level was not a date (:issue:`60099`)
779
786
- Bug in :meth:`DataFrame.to_stata` when exporting a column containing both long strings (Stata strL) and :class:`pd.NA` values (:issue:`23633`)
787
+
- Bug in :meth:`DataFrame.to_stata` when input encoded length and normal length are mismatched (:issue:`61583`)
780
788
- Bug in :meth:`DataFrame.to_stata` when writing :class:`DataFrame` and ``byteorder=`big```. (:issue:`58969`)
781
789
- Bug in :meth:`DataFrame.to_stata` when writing more than 32,000 value labels. (:issue:`60107`)
782
790
- Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
0 commit comments