Commit f746ec3

BUG: df.loc[i] with Categorical column containing NaN
1 parent 1feacde commit f746ec3

3 files changed, +42 -1 lines changed

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -947,6 +947,7 @@ Indexing
 - Bug in :meth:`Series.__setitem__` when assigning boolean series with boolean indexer will raise ``LossySetitemError`` (:issue:`57338`)
 - Bug in printing :attr:`Index.names` and :attr:`MultiIndex.levels` would not escape single quotes (:issue:`60190`)
 - Bug in reindexing of :class:`DataFrame` with :class:`PeriodDtype` columns in case of consolidated block (:issue:`60980`, :issue:`60273`)
+- Bug in :meth:`DataFrame.loc.__getitem__` and :meth:`DataFrame.iloc.__getitem__` with a :class:`CategoricalDtype` column with integer categories raising when trying to index a row containing a ``NaN`` entry (:issue:`58954`)
 - Bug in :meth:`Index.__getitem__` incorrectly raising with a 0-dim ``np.ndarray`` key (:issue:`55601`)
 
 Missing

pandas/core/internals/managers.py

Lines changed: 19 additions & 1 deletion
@@ -50,6 +50,7 @@
     is_list_like,
 )
 from pandas.core.dtypes.dtypes import (
+    CategoricalDtype,
     DatetimeTZDtype,
     ExtensionDtype,
     SparseDtype,
@@ -1138,7 +1139,24 @@ def fast_xs(self, loc: int) -> SingleBlockManager:
             # Such assignment may incorrectly coerce NaT to None
             # result[blk.mgr_locs] = blk._slice((slice(None), loc))
             for i, rl in enumerate(blk.mgr_locs):
-                result[rl] = blk.iget((i, loc))
+                item = blk.iget((i, loc))
+                if (
+                    result.dtype.kind in "iub"
+                    and lib.is_float(item)
+                    and isna(item)
+                    and isinstance(blk.dtype, CategoricalDtype)
+                ):
+                    # GH#58954 caused bc interleaved_dtype is wrong for Categorical
+                    # TODO(GH#38240) this will be unnecessary
+                    # Note that doing this in a try/except would work for the
+                    # integer case, but not for bool, which will cast the NaN
+                    # entry to True.
+                    if result.dtype.kind == "b":
+                        new_dtype = object
+                    else:
+                        new_dtype = np.float64
+                    result = result.astype(new_dtype)
+                result[rl] = item
 
         if isinstance(dtype, ExtensionDtype):
             cls = dtype.construct_array_type()
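
The code comment in the hunk above notes that a try/except would only catch the integer case. The sketch below illustrates that point with plain NumPy: assigning NaN into an integer array raises, while assigning it into a boolean array silently stores True. The helper `assign_with_nan_upcast` is a hypothetical stand-in written for this illustration and is not part of pandas; it mirrors only the upcast step and omits the `CategoricalDtype` check that the real `fast_xs` change performs.

```python
import numpy as np


def assign_with_nan_upcast(result: np.ndarray, pos: int, item) -> np.ndarray:
    """Hypothetical helper: upcast an int/uint/bool array before storing NaN.

    Simplified from the diff above; the real code also requires the source
    block to have a CategoricalDtype before upcasting.
    """
    is_float_nan = isinstance(item, (float, np.floating)) and np.isnan(item)
    if result.dtype.kind in "iub" and is_float_nan:
        # bool has no NaN-capable numeric counterpart here, so fall back to object
        new_dtype = object if result.dtype.kind == "b" else np.float64
        result = result.astype(new_dtype)
    result[pos] = item
    return result


# Integer case: the naive assignment raises, so a try/except could detect it.
int_row = np.array([2, 0], dtype=np.int64)
try:
    int_row[1] = np.nan
    int_assignment_raised = False
except ValueError:
    int_assignment_raised = True

# Bool case: the naive assignment does not raise; NaN is stored as True.
bool_row = np.array([False, False])
bool_row[1] = np.nan

# With the explicit upcast, NaN survives in both cases.
fixed_int = assign_with_nan_upcast(np.array([2, 0], dtype=np.int64), 1, np.nan)
fixed_bool = assign_with_nan_upcast(np.array([False, False]), 1, np.nan)
```

Checking the dtype up front, rather than reacting to an exception, is what lets the boolean case be handled at all, since NumPy gives no error to react to there.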

pandas/tests/indexing/test_categorical.py

Lines changed: 22 additions & 0 deletions
@@ -571,3 +571,25 @@ def test_getitem_categorical_with_nan(self):
         df = DataFrame(ser)
         assert df.loc[np.nan, 0] == 2
         assert df.loc[np.nan][0] == 2
+
+    def test_getitem_row_categorical_with_nan(self):
+        # GH#58954
+        df = DataFrame({"a": [1, 2], "b": CategoricalIndex([1, None])})
+
+        res = df.iloc[1]
+        expected = Series([2, np.nan], index=df.columns, name=1)
+        tm.assert_series_equal(res, expected)
+
+        res = df.loc[1]
+        tm.assert_series_equal(res, expected)
+
+    def test_getitem_row_categorical_with_nan_bool(self):
+        # GH#58954
+        df = DataFrame({"a": [True, False], "b": CategoricalIndex([False, None])})
+
+        res = df.iloc[1]
+        expected = Series([False, np.nan], index=df.columns, dtype=object, name=1)
+        tm.assert_series_equal(res, expected)
+
+        res = df.loc[1]
+        tm.assert_series_equal(res, expected)
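
For readers who want to try the scenario from the tests above outside the pandas test suite, here is a minimal reproduction sketch. It assumes pandas is installed and uses `pd.Categorical` instead of the `CategoricalIndex` constructor used in the tests; the helper name `select_row` is hypothetical. The outcome depends on the installed pandas version: builds without this commit are expected to raise when the row is selected, and builds with it should return a row whose `"b"` entry is missing.

```python
import pandas as pd


def select_row(frame: pd.DataFrame, position: int):
    """Hypothetical helper: return the selected row, or the error it raised."""
    try:
        return frame.iloc[position]
    except (ValueError, TypeError) as err:
        # Expected only on pandas builds that predate the fix for GH#58954
        return err


# An integer column next to a categorical column with integer categories,
# where the second row of the categorical column is missing.
df = pd.DataFrame({"a": [1, 2], "b": pd.Categorical([1, None])})

outcome = select_row(df, 1)
if isinstance(outcome, Exception):
    print(f"row selection raised: {outcome!r}")
else:
    print(outcome)
```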
