Summary
Both DataFrame paths call model_dump(mode="python") on every item. For a 1000-row trades_history page this means 1000 nested-dict allocations plus per-field Python-level conversion, after which pandas/polars re-infers dtypes from the records. This is roughly an order of magnitude slower than column-oriented construction.
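One way to check the claimed gap is a small timing script. The sketch below is an assumption-laden illustration, not the project's benchmark: Trade is a hypothetical stand-in for a trades_history item, and the script times only the Python-side extraction (per-row model_dump versus per-column attribute access), not the pandas/polars construction that follows.

```python
import timeit
from decimal import Decimal

from pydantic import BaseModel


class Trade(BaseModel):
    # Hypothetical stand-in for a trades_history item (not the real model).
    ticker: str
    price: Decimal
    count: int


# A synthetic 1000-item page.
items = [Trade(ticker=f"T{i}", price=Decimal("0.50"), count=i) for i in range(1000)]


def row_oriented():
    # Current approach: one dict per item via model_dump().
    return [item.model_dump(mode="python") for item in items]


def column_oriented():
    # Proposed approach: one list per field via plain attribute access.
    return {name: [getattr(item, name) for item in items] for name in Trade.model_fields}


if __name__ == "__main__":
    runs = 100
    for fn in (row_oriented, column_oriented):
        seconds = timeit.timeit(fn, number=runs)
        print(f"{fn.__name__}: {seconds / runs * 1e3:.3f} ms per 1000-item page")
```

Absolute numbers will depend on the machine, the pydantic version, and how deeply nested the real item models are, so treat the output as indicative only.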
Location
kalshi/models/common.py:44-55 — to_dataframe
kalshi/models/common.py:57-68 — to_polars
Evidence
def to_dataframe(self) -> pandas.DataFrame:
    ...
    records = [item.model_dump(mode="python") for item in self.items]
    return pd.DataFrame(records)

def to_polars(self) -> polars.DataFrame:
    ...
    records = [item.model_dump(mode="python") for item in self.items]
    return pl.DataFrame(records)
Recommended fix
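Build each column once ({field: [getattr(item, field) for item in self.items] for field in item_cls.model_fields}, where item_cls is the item model class, since cls is not bound inside these instance methods) and pass the result to pd.DataFrame(columns_dict) / pl.from_dict(...). This preserves the #225 Decimal contract and avoids per-row dict construction. Add a bench_page_to_dataframe.py benchmark to measure the difference.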
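The column-oriented approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: Trade and build_columns are hypothetical names, and the field names are taken from the class of the first item, which assumes every item on a page is the same model. Note one behavioural difference from model_dump(mode="python"): getattr returns nested models as model instances rather than dicts, so nested fields would need separate handling.

```python
from decimal import Decimal
from typing import Any, Sequence

from pydantic import BaseModel


class Trade(BaseModel):
    # Hypothetical item model, for illustration only.
    ticker: str
    price: Decimal
    count: int


def build_columns(items: Sequence[BaseModel]) -> dict[str, list[Any]]:
    """Return {field_name: [value per item]} without calling model_dump()."""
    if not items:
        # Assumption: with no items there is no class to read field names
        # from, so the result has no columns.
        return {}
    # Field names come from the item model class, not from the page class.
    fields = type(items[0]).model_fields
    return {name: [getattr(item, name) for item in items] for name in fields}


def to_dataframe(items: Sequence[BaseModel]):
    # Imported lazily so that pandas remains an optional dependency.
    import pandas as pd

    return pd.DataFrame(build_columns(items))


def to_polars(items: Sequence[BaseModel]):
    # Imported lazily so that polars remains an optional dependency.
    import polars as pl

    return pl.from_dict(build_columns(items))
```

Because the values are read directly from the model attributes, Decimal fields reach pandas/polars as Decimal objects, with no intermediate conversion.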
Severity & category
medium / performance