Test coverage: Page.to_dataframe / to_polars nested-model serialization unpinned #101

Description

@TexasCoding

From Wave 5 — F-Q-11 and the perf-side F-R-08 (same code, different angle). Severity: medium.

Gap

`tests/test_page_dataframe.py` only exercises a flat `_Row` (str, Decimal, int). Real SDK pages return models with nested structures: `Market` has `OrderbookLevel` sub-objects; `Candlestick` has nested OHLC; multivariate results have lists of dicts. `model_dump(mode="python")` leaves these as nested dicts and lists in the DataFrame column, and each engine handles them differently:

  • pandas: object column with dicts.
  • polars: struct column.

Neither is asserted.

Why it matters

Users who run `client.markets.list().to_dataframe()` and then query a nested column get engine-specific behavior that is neither documented nor tested. If a future refactor flips `mode="python"` to `mode="json"`, nested `Decimal`s become strings and silently break `.sum()` on price columns. The smoke tests don't catch this: they assert that Decimal stays Decimal, but only for top-level fields.
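
A small illustration of the regression described above, assuming Pydantic v2. The `_Level` and `_Quote` models are hypothetical stand-ins, not SDK classes.

```python
from decimal import Decimal

from pydantic import BaseModel


class _Level(BaseModel):
    # Hypothetical nested sub-model with a Decimal price.
    price: Decimal


class _Quote(BaseModel):
    ticker: str
    best: _Level


quote = _Quote(ticker="AAA", best=_Level(price=Decimal("0.55")))

# mode="python" keeps the nested value as a Decimal.
python_dump = quote.model_dump(mode="python")

# mode="json" converts it to a JSON-compatible string.
json_dump = quote.model_dump(mode="json")
```

Both dumps have the same keys and nesting, so a check that only looks at column names or shapes would not notice the switch.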

Test

Add a `_NestedRow(BaseModel)` with a nested `_Inner` BaseModel field plus a `list[Decimal]` field. Assert:

  • `to_dataframe()` produces an object-dtype column containing the nested dict (not a string).
  • `to_polars()` produces a struct or list column.
  • A top-level Decimal column still has Decimal values (not str).

This pins the `mode="python"` contract, which currently lives only in a docstring.

Adjacent perf observation

F-R-08 notes that the list comprehension `[item.model_dump(mode="python") for item in self.items]` allocates N intermediate dicts. For large pages, `pl.from_dicts([m.__dict__ for m in self.items])` would skip Pydantic's serializer. Worth measuring if a user reports slow dataframe construction. Out of scope for this test-only issue.
