Commit 497fe85

BUG: fix convert_dtypes dropping values from sliced mixed-dtype DataFrames (#64712)
Co-authored-by: moktamd <moktamd@users.noreply.github.com>
1 parent 346ce92 commit 497fe85

3 files changed: 16 additions and 2 deletions

doc/source/whatsnew/v3.1.0.rst

Lines changed: 1 addition & 0 deletions
@@ -166,6 +166,7 @@ Numeric
 Conversion
 ^^^^^^^^^^
 - Fixed :func:`pandas.array` to preserve mask information when converting NumPy masked arrays, converting masked values to missing values (:issue:`63879`).
+- Fixed bug in :meth:`DataFrame.convert_dtypes` where values were dropped from sliced :class:`DataFrame` objects with mixed dtypes when the internal block structure spanned multiple columns (:issue:`64702`)
 - Fixed bug in :meth:`DataFrame.from_records` where ``exclude`` was ignored when ``data`` was an iterator and ``nrows=0`` (:issue:`63774`)
 -

pandas/core/internals/blocks.py

Lines changed: 2 additions & 2 deletions
@@ -544,7 +544,7 @@ def convert_dtypes(
         for blk in blks:
             # Determine dtype column by column
             sub_blks = (
-                [blk] if blk.ndim == 1 or self.shape[0] == 1 else list(blk._split())
+                [blk] if blk.ndim == 1 or blk.shape[0] == 1 else list(blk._split())
             )
             dtypes = [
                 convert_dtypes(
@@ -558,7 +558,7 @@ def convert_dtypes(
                 )
                 for b in sub_blks
             ]
-            if all(dtype == self.dtype for dtype in dtypes):
+            if all(dtype == blk.dtype for dtype in dtypes):
                 # Avoid block splitting if no dtype changes
                 rbs.append(blk.copy(deep=False))
                 continue
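
The two changed lines in `pandas/core/internals/blocks.py` swap `self` for the loop variable `blk` in both the shape check and the dtype comparison. The toy model below is only a sketch of why that matters when the blocks being iterated differ from the block that produced them. `ToyBlock`, `infer_dtype`, and `unchanged_blocks` are hypothetical names, not pandas internals, and the example does not reproduce the value loss itself. It only shows that the "no dtype changes" shortcut reaches a different decision depending on which object is consulted.

```python
from dataclasses import dataclass


@dataclass
class ToyBlock:
    """Hypothetical stand-in for an internal block (not the pandas class)."""

    dtype: str
    columns: list  # one inner list of values per column

    @property
    def shape(self):
        n_rows = len(self.columns[0]) if self.columns else 0
        return (len(self.columns), n_rows)

    def split(self):
        # One single-column block per column.
        return [ToyBlock(self.dtype, [col]) for col in self.columns]


def infer_dtype(block):
    """Toy dtype inference: all ints -> Int64, all strs -> string, else object."""
    values = [v for col in block.columns for v in col]
    if all(isinstance(v, int) for v in values):
        return "Int64"
    if all(isinstance(v, str) for v in values):
        return "string"
    return "object"


def unchanged_blocks(parent, blks, compare_to_parent):
    """Return the blocks that the 'no dtype changes' shortcut would keep as-is."""
    kept = []
    for blk in blks:
        # Before the fix the checks consulted the parent; after it, each blk.
        reference = parent if compare_to_parent else blk
        sub_blks = [blk] if reference.shape[0] == 1 else blk.split()
        dtypes = [infer_dtype(b) for b in sub_blks]
        if all(dtype == reference.dtype for dtype in dtypes):
            kept.append(blk)
    return kept


# Assumed scenario: a two-column object block whose earlier conversion step
# already produced one single-column block per column.
parent = ToyBlock("object", [[1, 2], ["a", "b"]])
blks = [ToyBlock("Int64", [[1, 2]]), ToyBlock("object", [["a", "b"]])]

old_kept = unchanged_blocks(parent, blks, compare_to_parent=True)
new_kept = unchanged_blocks(parent, blks, compare_to_parent=False)
print([b.dtype for b in old_kept], [b.dtype for b in new_kept])
```

In this sketch the parent-based check never recognises the already-converted `Int64` block as unchanged, while the per-block check does.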

pandas/tests/frame/methods/test_convert_dtypes.py

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -240,3 +240,16 @@ def test_convert_dtypes_complex(self):
         )
         result = df.convert_dtypes()
         tm.assert_frame_equal(result, expected)
+
+    def test_convert_dtypes_mixed_column_after_slice(self):
+        # GH#64702
+        df = pd.DataFrame(data=[[1, "a"], [2, "b"], ["c", 3]], columns=["col1", "col2"])
+        df = df.loc[[0, 1]].copy()
+        result = df.convert_dtypes()
+        expected = pd.DataFrame(
+            {
+                "col1": pd.array([1, 2], dtype="Int64"),
+                "col2": pd.array(["a", "b"], dtype="string"),
+            }
+        )
+        tm.assert_frame_equal(result, expected)
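
Outside the test suite, the same scenario can be used to check whether an installed pandas build keeps every value through `convert_dtypes()`. The helper below is a minimal sketch. `values_preserved` is a hypothetical name, not part of pandas, and it compares values only (after casting both frames to `object`), so it says nothing about which dtypes were chosen. No expected output is shown because the result depends on the pandas version in use.

```python
import pandas as pd


def values_preserved(frame: pd.DataFrame) -> bool:
    """Return True if convert_dtypes() keeps the shape and every value of `frame`."""
    converted = frame.convert_dtypes()
    if converted.shape != frame.shape:
        return False
    # Cast both sides to object so that only values, not dtypes, are compared.
    return bool(frame.astype(object).equals(converted.astype(object)))


# Same construction as the regression test: mixed columns, then a row slice.
df = pd.DataFrame(data=[[1, "a"], [2, "b"], ["c", 3]], columns=["col1", "col2"])
sliced = df.loc[[0, 1]].copy()

print(values_preserved(sliced))
```

A `False` result on this input would suggest that values were lost or altered during conversion, which is the symptom described in GH#64702.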