Commit 73b5578

DOC: Added extra sentences to clarify series.GroupBy snippets in examples (#59331)
* Added messages for each relevant snippet
* some small corrections to clarify further
* removed trailing whitespace
* more formatting correction
* more cleanup
* reverting changes
* trying to format documentation correctly
* removed some part of added text
* testing if removing list works
* reverting some changes
* reverting changes
* checking if minor changes also lead to failures
* reverting all changes to pass the tests
* checking if small changes cause errors as well
* pushing the changes back
1 parent 88a5668 commit 73b5578

1 file changed: +26 -0 lines changed

pandas/core/series.py

Lines changed: 26 additions & 0 deletions
@@ -1815,14 +1815,30 @@ def _set_name(
 Parrot 30.0
 Parrot 20.0
 Name: Max Speed, dtype: float64
+
+We can pass a list of values to group the Series data by custom labels:
+
 >>> ser.groupby(["a", "b", "a", "b"]).mean()
 a 210.0
 b 185.0
 Name: Max Speed, dtype: float64
+
+Grouping by numeric labels yields similar results:
+
+>>> ser.groupby([0, 1, 0, 1]).mean()
+0 210.0
+1 185.0
+Name: Max Speed, dtype: float64
+
+We can group by a level of the index:
+
 >>> ser.groupby(level=0).mean()
 Falcon 370.0
 Parrot 25.0
 Name: Max Speed, dtype: float64
+
+We can group by a condition applied to the Series values:
+
 >>> ser.groupby(ser > 100).mean()
 Max Speed
 False 25.0
@@ -1845,11 +1861,16 @@ def _set_name(
 Parrot Captive 30.0
 Wild 20.0
 Name: Max Speed, dtype: float64
+
 >>> ser.groupby(level=0).mean()
 Animal
 Falcon 370.0
 Parrot 25.0
 Name: Max Speed, dtype: float64
+
+We can also group by the 'Type' level of the hierarchical index
+to get the mean speed for each type:
+
 >>> ser.groupby(level="Type").mean()
 Type
 Captive 210.0
@@ -1865,12 +1886,17 @@ def _set_name(
 b 3
 dtype: int64
 
+To include `NA` values in the group keys, set `dropna=False`:
+
 >>> ser.groupby(level=0, dropna=False).sum()
 a 3
 b 3
 NaN 3
 dtype: int64
 
+We can also group by a custom list with NaN values to handle
+missing group labels:
+
 >>> arrays = ['Falcon', 'Falcon', 'Parrot', 'Parrot']
 >>> ser = pd.Series([390., 350., 30., 20.], index=arrays, name="Max Speed")
 >>> ser.groupby(["a", "b", "a", np.nan]).mean()

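
The hierarchical-index and `dropna` examples touched by this commit can be sketched in the same way. The snippet below assumes a two-level index named `Animal` and `Type`, as suggested by the docstring output, and uses an illustrative flat Series (`flat`) for the missing-label case. The names `by_animal`, `by_type`, `dropped`, and `kept` are hypothetical.

```python
import numpy as np
import pandas as pd

# Two-level index: animal name and whether it is captive or wild.
index = pd.MultiIndex.from_arrays(
    [
        ["Falcon", "Falcon", "Parrot", "Parrot"],
        ["Captive", "Wild", "Captive", "Wild"],
    ],
    names=("Animal", "Type"),
)
ser = pd.Series([390.0, 350.0, 30.0, 20.0], index=index, name="Max Speed")

# A level can be selected by name instead of by position.
by_animal = ser.groupby(level="Animal").mean()
by_type = ser.groupby(level="Type").mean()

# Grouping by a list that contains NaN: the row with the missing
# label is dropped by default and kept when dropna=False.
flat = pd.Series([390.0, 350.0, 30.0, 20.0], name="Max Speed")
labels = ["a", "b", "a", np.nan]
dropped = flat.groupby(labels).mean()
kept = flat.groupby(labels, dropna=False).mean()

print(by_animal)
print(by_type)
print(dropped)
print(kept)
```

With the default `dropna=True`, the last row does not contribute to any group, so `dropped` has only the `a` and `b` groups. With `dropna=False`, `kept` gains a third group keyed by `NaN`.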