pandas-dev
diff --git a/.github/workflows/unit-tests.yml b/.github/workflows/unit-tests.yml
Lines changed: 0 additions & 3 deletions
diff --git a/AUTHORS.md b/AUTHORS.md
Lines changed: 6 additions & 6 deletions
diff --git a/ci/deps/actions-311-downstream_compat.yaml b/ci/deps/actions-311-downstream_compat.yaml
Lines changed: 2 additions & 1 deletion
diff --git a/doc/source/reference/indexing.rst b/doc/source/reference/indexing.rst
Lines changed: 1 addition & 0 deletions
diff --git a/doc/source/user_guide/io.rst b/doc/source/user_guide/io.rst
Lines changed: 1 addition & 1 deletion
diff --git a/doc/source/whatsnew/v2.3.0.rst b/doc/source/whatsnew/v2.3.0.rst
Lines changed: 0 additions & 35 deletions
diff --git a/doc/source/whatsnew/v2.3.1.rst b/doc/source/whatsnew/v2.3.1.rst
Lines changed: 51 additions & 5 deletions
diff --git a/doc/source/whatsnew/v3.0.0.rst b/doc/source/whatsnew/v3.0.0.rst
Lines changed: 8 additions & 0 deletions
diff --git a/environment.yml b/environment.yml
Lines changed: 1 addition & 2 deletions
diff --git a/pandas/_libs/src/datetime/pd_datetime.c b/pandas/_libs/src/datetime/pd_datetime.c
Lines changed: 4 additions & 0 deletions
@@ -139,9 +139,6 @@ jobs:
 
       moto:
         image: motoserver/moto:5.0.27
-        env:
-          AWS_ACCESS_KEY_ID: foobar_key
-          AWS_SECRET_ACCESS_KEY: foobar_secret
         ports:
           - 5000:5000
 
 
@@ -7,12 +7,12 @@ About the Copyright Holders
     led by Wes McKinney. AQR released the source under this license in 2009.
 *   Copyright (c) 2011-2012, Lambda Foundry, Inc.
 
-    Wes is now an employee of Lambda Foundry, and remains the pandas project
+    Wes became an employee of Lambda Foundry, and remained the pandas project
     lead.
 *   Copyright (c) 2011-2012, PyData Development Team
 
     The PyData Development Team is the collection of developers of the PyData
-    project. This includes all of the PyData sub-projects, including pandas. The
+    project. This includes all of the PyData sub-projects, such as pandas. The
     core team that coordinates development on GitHub can be found here:
     https://github.com/pydata.
 
@@ -23,11 +23,11 @@ Our Copyright Policy
 
 PyData uses a shared copyright model. Each contributor maintains copyright
 over their contributions to PyData. However, it is important to note that
-these contributions are typically only changes to the repositories. Thus,
+these contributions are typically limited to changes to the repositories. Thus,
 the PyData source code, in its entirety, is not the copyright of any single
 person or institution. Instead, it is the collective copyright of the
 entire PyData Development Team. If individual contributors want to maintain
-a record of what changes/contributions they have specific copyright on,
+a record of the specific changes or contributions they hold copyright to,
 they should indicate their copyright in the commit message of the change
 when they commit the change to one of the PyData repositories.
 
@@ -50,7 +50,7 @@ Other licenses can be found in the LICENSES directory.
 License
 =======
 
-pandas is distributed under a 3-clause ("Simplified" or "New") BSD
+pandas is distributed under the 3-clause ("Simplified" or "New") BSD
 license. Parts of NumPy, SciPy, numpydoc, bottleneck, which all have
-BSD-compatible licenses, are included. Their licenses follow the pandas
+BSD-compatible licenses, are included. Their licenses are compatible with the pandas
 license.
@@ -50,7 +50,8 @@ dependencies:
   - pytz>=2023.4
   - pyxlsb>=1.0.10
   - s3fs>=2023.12.2
-  - scipy>=1.12.0
+  # TEMP upper pin for scipy (https://github.com/statsmodels/statsmodels/issues/9584)
+  - scipy>=1.12.0,<1.16
   - sqlalchemy>=2.0.0
   - tabulate>=0.9.0
   - xarray>=2024.1.1
 
@@ -98,6 +98,7 @@ Conversion
    :toctree: api/
 
    Index.astype
+   Index.infer_objects
    Index.item
    Index.map
    Index.ravel
 
@@ -5432,7 +5432,7 @@ A simple example loading all data from an Iceberg table ``my_table`` defined in
     df = pd.read_iceberg("my_table", catalog_name="my_catalog")
 
 Catalogs must be defined in the ``.pyiceberg.yaml`` file, usually in the home directory.
-It is possible to to change properties of the catalog definition with the
+It is possible to change properties of the catalog definition with the
 ``catalog_properties`` parameter:
 
 .. code-block:: python
 
@@ -31,39 +31,6 @@ Other enhancements
 - The :meth:`~Series.cumsum`, :meth:`~Series.cummin`, and :meth:`~Series.cummax` reductions are now implemented for :class:`StringDtype` columns (:issue:`60633`)
 - The :meth:`~Series.sum` reduction is now implemented for :class:`StringDtype` columns (:issue:`59853`)
 
-.. ---------------------------------------------------------------------------
-.. _whatsnew_230.notable_bug_fixes:
-
-Notable bug fixes
-~~~~~~~~~~~~~~~~~
-
-These are bug fixes that might have notable behavior changes.
-
-.. _whatsnew_230.notable_bug_fixes.string_comparisons:
-
-Comparisons between different string dtypes
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-In previous versions, comparing :class:`Series` of different string dtypes (e.g. ``pd.StringDtype("pyarrow", na_value=pd.NA)`` against ``pd.StringDtype("python", na_value=np.nan)``) would result in inconsistent resulting dtype or incorrectly raise. pandas will now use the hierarchy
-
-    object < (python, NaN) < (pyarrow, NaN) < (python, NA) < (pyarrow, NA)
-
-in determining the result dtype when there are different string dtypes compared. Some examples:
-
-- When ``pd.StringDtype("pyarrow", na_value=pd.NA)`` is compared against any other string dtype, the result will always be ``boolean[pyarrow]``.
-- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("pyarrow", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
-- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("python", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
-
-.. _whatsnew_230.api_changes:
-
-API changes
-~~~~~~~~~~~
-
-- When enabling the ``future.infer_string`` option, :class:`Index` set operations (like
-  union or intersection) will now ignore the dtype of an empty :class:`RangeIndex` or
-  empty :class:`Index` with ``object`` dtype when determining the dtype of the resulting
-  Index (:issue:`60797`)
-
 .. ---------------------------------------------------------------------------
 .. _whatsnew_230.deprecations:
 
@@ -85,8 +52,6 @@ Numeric
 
 Strings
 ^^^^^^^
-- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
-- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
 - Bug in :meth:`Series.__pos__` and :meth:`DataFrame.__pos__` where an ``Exception`` was not raised for :class:`StringDtype` with ``storage="pyarrow"`` (:issue:`60710`)
 - Bug in :meth:`Series.rank` for :class:`StringDtype` with ``storage="pyarrow"`` that incorrectly returned integer results with ``method="average"`` and raised an error if it would truncate results (:issue:`59768`)
 - Bug in :meth:`Series.replace` with :class:`StringDtype` when replacing with a non-string value was not upcasting to ``object`` dtype (:issue:`60282`)
 
@@ -9,11 +9,57 @@ including other versions of pandas.
 {{ header }}
 
 .. ---------------------------------------------------------------------------
-.. _whatsnew_231.enhancements:
+.. _whatsnew_231.string_fixes:
+
+Improvements and fixes for the StringDtype
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. _whatsnew_231.string_fixes.string_comparisons:
+
+Comparisons between different string dtypes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In previous versions, comparing :class:`Series` of different string dtypes (e.g. ``pd.StringDtype("pyarrow", na_value=pd.NA)`` against ``pd.StringDtype("python", na_value=np.nan)``) would produce an inconsistent result dtype or incorrectly raise. pandas will now use the hierarchy
+
+    object < (python, NaN) < (pyarrow, NaN) < (python, NA) < (pyarrow, NA)
+
+to determine the result dtype when different string dtypes are compared. Some examples:
+
+- When ``pd.StringDtype("pyarrow", na_value=pd.NA)`` is compared against any other string dtype, the result will always be ``boolean[pyarrow]``.
+- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("pyarrow", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
+- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("python", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
+
+.. _whatsnew_231.string_fixes.ignore_empty:
+
+Index set operations ignore empty RangeIndex and object dtype Index
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When enabling the ``future.infer_string`` option, :class:`Index` set operations (like
+union or intersection) will now ignore the dtype of an empty :class:`RangeIndex` or
+empty :class:`Index` with ``object`` dtype when determining the dtype of the resulting
+Index (:issue:`60797`).
+
+This ensures that combining such an empty Index with strings infers the string dtype
+correctly, rather than defaulting to ``object`` dtype. For example:
+
+.. code-block:: python
+
+    >>> pd.options.mode.infer_string = True
+    >>> df = pd.DataFrame()
+    >>> df.columns.dtype
+    dtype('int64')               # default RangeIndex for empty columns
+    >>> df["a"] = [1, 2, 3]
+    >>> df.columns.dtype
+    <StringDtype(na_value=nan)>  # new columns use string dtype instead of object dtype
+
+.. _whatsnew_231.string_fixes.bugs:
+
+Bug fixes
+^^^^^^^^^
+- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
+- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
+- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)
 
-Enhancements
-~~~~~~~~~~~~
--
 
 .. _whatsnew_231.regressions:
 
@@ -26,7 +72,7 @@ Fixed regressions
 
 Bug fixes
 ~~~~~~~~~
-- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)
+-
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_231.other:
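
Review note on the string-comparison hierarchy moved into the v2.3.1 notes above: the rule can be read as "the higher-ranked operand decides the result dtype". The following is a minimal sketch of that reading in plain Python, not the pandas implementation. The names `HIERARCHY`, `STATED_RESULTS`, `dominant_dtype` and `stated_result_dtype` are hypothetical, dtypes are modelled as simple `(storage, na_value)` tuples, and only the result dtypes that the note states explicitly are mapped.

```python
# Hypothetical model of the hierarchy quoted in the release note:
#   object < (python, NaN) < (pyarrow, NaN) < (python, NA) < (pyarrow, NA)
# Dtypes are represented as plain tuples here; this is NOT pandas internals.
HIERARCHY = [
    "object",
    ("python", "NaN"),
    ("pyarrow", "NaN"),
    ("python", "NA"),
    ("pyarrow", "NA"),
]

# Only the result dtypes the note spells out; other cases are left unmapped
# because the note does not state them.
STATED_RESULTS = {
    ("pyarrow", "NA"): "boolean[pyarrow]",
    ("python", "NA"): "boolean",
}


def dominant_dtype(left, right):
    """Return whichever operand ranks higher in HIERARCHY."""
    return max(left, right, key=HIERARCHY.index)


def stated_result_dtype(left, right):
    """Return the comparison result dtype if the note states it, else None."""
    return STATED_RESULTS.get(dominant_dtype(left, right))


if __name__ == "__main__":
    print(stated_result_dtype(("pyarrow", "NA"), ("python", "NaN")))
    print(stated_result_dtype(("python", "NA"), ("pyarrow", "NaN")))
```

Under this reading, the three bullet examples in the note all follow from picking the higher-ranked operand first and then looking up its result dtype.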
 
@@ -28,6 +28,9 @@ Enhancement2
 
 Other enhancements
 ^^^^^^^^^^^^^^^^^^
+- :func:`pandas.merge` propagates the ``attrs`` attribute to the result if all
+  inputs have identical ``attrs``, as is already the case for
+  :func:`pandas.concat`.
 - :class:`pandas.api.typing.FrozenList` is available for typing the outputs of :attr:`MultiIndex.names`, :attr:`MultiIndex.codes` and :attr:`MultiIndex.levels` (:issue:`58237`)
 - :class:`pandas.api.typing.SASReader` is available for typing the output of :func:`read_sas` (:issue:`55689`)
 - Added :meth:`.Styler.to_typst` to write Styler objects to file, buffer or string in Typst format (:issue:`57617`)
@@ -318,6 +321,8 @@ Optional libraries below the lowest tested version may still work, but are not c
 +------------------------+---------------------+
 | Package                | New Minimum Version |
 +========================+=====================+
+| pyarrow                | 12.0.1              |
++------------------------+---------------------+
 | pytz                   | 2023.4              |
 +------------------------+---------------------+
 | fastparquet            | 2024.2.0            |
@@ -745,9 +750,11 @@ Indexing
 - Bug in :meth:`DataFrame.__getitem__` returning modified columns when called with ``slice`` in Python 3.12 (:issue:`57500`)
 - Bug in :meth:`DataFrame.__getitem__` when slicing a :class:`DataFrame` with many rows raised an ``OverflowError`` (:issue:`59531`)
 - Bug in :meth:`DataFrame.from_records` throwing a ``ValueError`` when passed an empty list in ``index`` (:issue:`58594`)
+- Bug in :meth:`DataFrame.loc` and :meth:`DataFrame.iloc` returning incorrect dtype when selecting from a :class:`DataFrame` with mixed data types (:issue:`60600`)
 - Bug in :meth:`DataFrame.loc` with inconsistent behavior of loc-set with 2 given indexes to Series (:issue:`59933`)
 - Bug in :meth:`Index.get_indexer` and similar methods when ``NaN`` is located at or after position 128 (:issue:`58924`)
 - Bug in :meth:`MultiIndex.insert` when a new value inserted to a datetime-like level gets cast to ``NaT`` and fails indexing (:issue:`60388`)
+- Bug in :meth:`Series.__setitem__` where assigning a boolean :class:`Series` with a boolean indexer raised ``LossySetitemError`` (:issue:`57338`)
 - Bug in printing :attr:`Index.names` and :attr:`MultiIndex.levels` would not escape single quotes (:issue:`60190`)
 - Bug in reindexing of :class:`DataFrame` with :class:`PeriodDtype` columns in case of consolidated block (:issue:`60980`, :issue:`60273`)
 
@@ -777,6 +784,7 @@ I/O
 - Bug in :meth:`DataFrame.to_excel` when writing empty :class:`DataFrame` with :class:`MultiIndex` on both axes (:issue:`57696`)
 - Bug in :meth:`DataFrame.to_excel` where the :class:`MultiIndex` index with a period level was not a date (:issue:`60099`)
 - Bug in :meth:`DataFrame.to_stata` when exporting a column containing both long strings (Stata strL) and :class:`pd.NA` values (:issue:`23633`)
+- Bug in :meth:`DataFrame.to_stata` when input encoded length and normal length are mismatched (:issue:`61583`)
 - Bug in :meth:`DataFrame.to_stata` when writing :class:`DataFrame` and ``byteorder=`big```. (:issue:`58969`)
 - Bug in :meth:`DataFrame.to_stata` when writing more than 32,000 value labels. (:issue:`60107`)
 - Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
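
Review note on the `attrs` entry added to the v3.0.0 enhancements above: the stated rule is that `attrs` carry over to the result only when every input has identical `attrs`. A minimal sketch of that rule in plain Python follows. The helper name `propagate_attrs` is hypothetical, and returning an empty dict when the inputs differ, as well as returning a deep copy, are assumptions of this sketch rather than behavior confirmed by the entry.

```python
from copy import deepcopy


def propagate_attrs(*inputs_attrs):
    """Return a copy of the shared attrs if all inputs agree, else {}.

    Hypothetical helper illustrating the release-note rule; it does not
    call pandas and is not the pandas implementation.
    """
    if not inputs_attrs:
        return {}
    first = inputs_attrs[0]
    # Propagate only when every other input's attrs equals the first one.
    if all(attrs == first for attrs in inputs_attrs[1:]):
        return deepcopy(first)  # assumption: the result gets its own copy
    return {}  # assumption: differing attrs yield empty attrs


if __name__ == "__main__":
    left = {"source": "sensor-a"}
    right = {"source": "sensor-a"}
    print(propagate_attrs(left, right))
    print(propagate_attrs(left, {"source": "sensor-b"}))
```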
 
@@ -64,9 +64,8 @@ dependencies:
   - dask-core
   - seaborn-base
 
-  # local testing dependencies
+  # Mocking s3 tests
   - moto
-  - flask
 
   # benchmarks
   - asv>=0.6.1
 
@@ -192,6 +192,10 @@ static npy_datetime PyDateTimeToEpoch(PyObject *dt, NPY_DATETIMEUNIT base) {
   return npy_dt;
 }
 
+/* Initializes and exposes a custom datetime C-API from the pandas library
+ * by creating a PyCapsule that stores function pointers, which can be accessed
+ * later by other C code or Cython code that imports the capsule.
+ */
 static int pandas_datetime_exec(PyObject *Py_UNUSED(module)) {
   PyDateTime_IMPORT;
   PandasDateTime_CAPI *capi = PyMem_Malloc(sizeof(PandasDateTime_CAPI));