[AAASM-2640] ♻️ (aa-ffi-python): Delegate RuntimeClient transport to aa-sdk-client #79

Merged: Chisanan232 merged 4 commits into master from v0.0.1/AAASM-2640/refactor/thin_shim on Jun 6, 2026

@Chisanan232
Contributor

Description

Make the Python SDK's pyo3 binding (rust/aa-ffi-python) a thin shim over the shared aa-sdk-client crate, per ADR 0002 (Epic AAASM-2552, "SDK security boundary + FFI consolidation"). The runtime-client logic — UDS transport, IPC wire codec, and the AssemblyClient lifecycle — now lives once in aa-sdk-client instead of being reimplemented in the binding.

  • RuntimeClient.connect / send_event / close delegate to aa_sdk_client::AssemblyClient (spawn_ipc_thread + AssemblyClient::new / report_event / shutdown).
  • The local tokio worker loop, frame codec, varint helpers, and the synchronous query_policy round-trip (PolicyResult / PolicyTimeoutError) are removed. Policy / approval are server-side in the trust model, and advisory, non-authoritative credential preflight is provided transitively by aa-sdk-client — the shim holds no security authority.
  • Type translation (GovernanceEvent, audit_event_to/from_wire_bytes) is retained.
  • Shared crates (aa-core / aa-proto / aa-sdk-client) are pinned to one agent-assembly SHA (9cf8a033) — git-SHA distribution per ADR 0002. tokio / once_cell dropped; the new aa-proto ActionType::ToolResult variant is handled.
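
The delegation described in the bullets above can be pictured with a small Python analogy. This is only a sketch of the thin-shim pattern, not the Rust binding itself: `SharedClient` is a hypothetical stand-in for `aa_sdk_client::AssemblyClient`, and its in-memory `sent` list replaces the real UDS transport.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SharedClient:
    """Hypothetical stand-in for aa_sdk_client::AssemblyClient (assumption)."""

    socket_path: str
    sent: list[dict] = field(default_factory=list)
    closed: bool = False

    def report_event(self, event: dict) -> None:
        # The real client encodes and ships the event; here we only record it.
        if self.closed:
            raise RuntimeError("client is shut down")
        self.sent.append(event)

    def shutdown(self) -> None:
        self.closed = True


class RuntimeClient:
    """Thin shim: no transport or codec logic, only delegation."""

    def __init__(self, inner: SharedClient) -> None:
        self._inner = inner

    @classmethod
    def connect(cls, socket_path: str) -> RuntimeClient:
        # In the real binding this step spawns the IPC thread and may raise.
        return cls(SharedClient(socket_path))

    def send_event(self, event: dict) -> None:
        self._inner.report_event(event)

    def close(self) -> None:
        self._inner.shutdown()
```

Every public method is a one-line forward to the shared client, which is the property the refactor is after: the shim holds no transport state of its own.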

Net: rust/aa-ffi-python/src/lib.rs shrinks from 719 → ~290 lines (−497 / +64).

The Python-facing export/test alignment (dropping PolicyResult / PolicyTimeoutError from _core exports + updating the gated native tests) follows in AAASM-2641, stacked on this PR.

Type of Change

  • ♻️ Refactoring

Breaking Changes

  • Yes (please describe below)

The native _core module drops query_policy, PolicyResult, and PolicyTimeoutError. No pure-Python caller uses them (policy checks go through the httpx gateway client); they were an SDK-side reimplementation of runtime-client logic that this Epic retires. The documented public API (init_assemblycontext.client.*) is unaffected.

Related Issues

  • Related JIRA ticket: AAASM-2640 (Story AAASM-2561, Epic AAASM-2552)

Testing

  • cargo build compiles the shim against aa-sdk-client with zero warnings; the PyO3 extension links (-undefined dynamic_lookup, as maturin sets).
  • uv sync + pytest green (418 passed, 11 skipped — native + optional-framework tests skip in a pure-Python install).
  • No tests required (explain why)

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Comments added for complex logic
  • Documentation updated if needed
  • All tests passing

Bump the aa-core / aa-proto git-SHA pin to the agent-assembly master
commit that ships aa-sdk-client, and add aa-sdk-client itself at the
same SHA (single workspace checkout per ADR 0002). The next commit
delegates the runtime client to it.

Handle the new aa-proto ActionType::ToolResult variant in the
audit-event translation helpers, which the new SHA requires.

Make the pyo3 binding a thin shim over the shared aa-sdk-client crate
(ADR 0002). RuntimeClient.connect/send_event/close now delegate to
aa_sdk_client::AssemblyClient — the UDS transport, IPC wire codec, and
background lifecycle live once in aa-sdk-client instead of being
reimplemented here. Remove the local tokio worker loop, frame codec, and
the synchronous query_policy round-trip (PolicyResult / PolicyTimeoutError):
policy and approval are server-side per the trust model, and the advisory,
non-authoritative credential preflight is provided transitively by
aa-sdk-client — the shim holds no security authority.

Type translation (GovernanceEvent, audit_event_to/from_wire_bytes) is
retained. Drop the now-unused tokio and once_cell dependencies.

aa-sdk-client ships events over a bounded channel with a blocking send,
so under backpressure report_event can park the calling thread. Holding
the GIL there stalls every other Python thread — and deadlocks outright
when the runtime peer is an in-process Python thread (the native test's
mock runtime). Wrap the delegation in py.detach so the GIL is released
for the duration of the send.

The thin shim no longer exposes PolicyResult, so the native-core-build
workflow's import verification (`from agent_assembly._core import ...,
PolicyResult`) failed after maturin built the module. Import only the
symbols the shim exposes (RuntimeClient, GovernanceEvent).
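
One of the commit messages above notes that a blocking send on a bounded channel can park the caller under backpressure. The sketch below reproduces that parking behavior with Python's standard `queue.Queue` as an analogy for the Rust channel. The function name, capacities, and delays are illustrative assumptions. In pure Python, `Queue.put` already releases the GIL while it waits, so this shows only why the sender stalls, not the `py.detach` fix itself.

```python
import queue
import threading
import time


def send_with_backpressure(capacity: int, n_events: int, drain_delay: float):
    """Send n_events through a bounded queue drained by a slow peer.

    Returns (number of events the peer received, seconds the sender spent).
    """
    channel = queue.Queue(maxsize=capacity)
    received = []

    def runtime_peer():
        # Slow consumer: simulates a runtime that drains events with a delay.
        for _ in range(n_events):
            time.sleep(drain_delay)
            received.append(channel.get())

    peer = threading.Thread(target=runtime_peer)
    peer.start()

    start = time.monotonic()
    for i in range(n_events):
        # Blocks once the queue holds `capacity` items, until the peer drains.
        channel.put(i)
    elapsed = time.monotonic() - start

    peer.join()
    return len(received), elapsed
```

With a capacity smaller than the number of events, the sender's elapsed time grows with the peer's drain delay; with a capacity that fits every event, the sender returns almost immediately.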
@Chisanan232
Contributor Author

🤖 Claude Code — review result

CI: ✅ green. build-native-core initially failed because the workflow's import smoke-check still referenced PolicyResult (which this PR removes from _core); fixed in 0104291 (import only RuntimeClient, GovernanceEvent). All required checks now pass. BLOCKED is solely REVIEW_REQUIRED (awaiting Pioneer approval), not CI.
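
The import smoke-check mentioned above can be approximated by a small helper that reports which expected names a module fails to expose. This is a hedged sketch, not the workflow's actual step: `missing_symbols` is a hypothetical helper name, and it treats a module that cannot be imported as missing every requested symbol.

```python
import importlib


def missing_symbols(module_name: str, names: list[str]) -> list[str]:
    """Return the names from `names` that `module_name` does not expose."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module not built or not installed: every symbol counts as missing.
        return list(names)
    return [name for name in names if not hasattr(module, name)]
```

Against a built extension, a check in this style would be called as `missing_symbols("agent_assembly._core", ["RuntimeClient", "GovernanceEvent"])` and expected to return an empty list. Adding `PolicyResult` to the list would surface it as missing after this PR.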

Scope vs AAASM-2640 / Story AAASM-2561 / ADR 0002: ✅ complete.

  • RuntimeClient.connect/send_event/close delegate to aa_sdk_client::AssemblyClient; the local tokio worker loop, frame codec, varint, and the synchronous query_policy round-trip are deleted — runtime-client logic now lives once in aa-sdk-client.
  • Shared crates pinned to one SHA 9cf8a033; new aa-proto ActionType::ToolResult handled; tokio/once_cell dropped; extension-module feature preserved.
  • No security authority in the shim (advisory preflight is transitive via aa-sdk-client's preflight feature).

Notes (non-blocking, intentional):

  1. Backpressure change: aa-sdk-client uses a bounded (256) channel with a blocking send, whereas the old binding used an unbounded one. Under a stalled runtime the caller can block once the buffer fills; the GIL is released (py.detach, 12fe563) so the interpreter is not frozen and an in-process runtime peer can drain. This is the shared client's designed behavior.
  2. API surface: connect now returns PyResult (it can raise if the IPC thread fails to spawn); the raw RuntimeClient(socket_path) constructor is removed (only connect remains; no Python caller used the constructor). query_policy/PolicyResult/PolicyTimeoutError are removed (ADR: policy/approval are server-side; no pure-Python caller used them).
  3. Enhancement: events now pass aa-sdk-client's advisory credential preflight, which the old binding never did.
  4. Event shape: labels now follow the canonical report_event form {event_type, details=<AuditEntry JSON>}; the old Python-specific agent_id_hex/session_id_hex top-level labels are gone, but the identity remains inside the details payload and the runtime re-derives/normalizes it.
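
The event shape in note 4 can be illustrated as follows. This is a sketch under stated assumptions: the AuditEntry field names used in the usage example (`agent_id`, `session_id`, `action`) are hypothetical, and only the two-label layout `{event_type, details}` comes from the note above.

```python
import json


def to_report_labels(event_type: str, audit_entry: dict) -> dict[str, str]:
    """Build labels in the {event_type, details=<AuditEntry JSON>} form.

    Identity fields stay inside the JSON `details` payload instead of being
    promoted to top-level labels.
    """
    return {
        "event_type": event_type,
        "details": json.dumps(audit_entry, sort_keys=True),
    }
```

For example, `to_report_labels("tool_result", {"agent_id": "0a1b", "session_id": "ff00", "action": "tool_result"})` yields exactly two top-level labels, and a consumer recovers the identity by parsing `details`.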

Verdict: ✅ Ready to approve & merge. This is the base of the stack: merge it first (#79 → #80 → #81), rebasing each child onto master after its parent merges.

@Chisanan232 Chisanan232 merged commit 154be3c into master Jun 6, 2026
1 check passed
@Chisanan232 Chisanan232 deleted the v0.0.1/AAASM-2640/refactor/thin_shim branch June 6, 2026 00:55