Add reflector instrumentation, pluggable payload mapper, v2 templates #7
jnzs1836 wants to merge 1 commit into
- BaseAgenticOptimizer: add `_invoke_agent` helper that times the agent
invocation and captures Strands metrics (tokens, cycles, tool calls,
latency) onto `last_wall_clock_s` / `last_metrics`. ContrastiveReflectionOptimizer
routes through it so callers can read step cost without subclassing.
- AgentCoreRolloutEngine: the engine still builds the canonical
`{"data_sample": ..., "params": ...}` payload. New `payload_mapper:
Callable[[dict], dict]` init kwarg is a transform applied to that
canonical payload before the HTTP call, so deployments expecting a
different shape (flat fields, renamed keys, nested envelopes) can remap
without subclassing the engine. Default is None (no transform), fully
backward-compatible.
- New built-in templates `contrastive_reflection_v2/{system_prompt,task_message_system_prompt}.jinja`:
forbid the "Learned Behaviors" appendix, require holistic integration
of insights, and add a safety invariant (require user confirmation before
consequential actions) and a length guardrail (<= max(1.1x original,
original + 500 chars)). Submission flow still uses the existing
`submit_optimized_params` tool contract.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
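To make the length guardrail concrete, here is a minimal sketch of the arithmetic behind `<= max(1.1x original, original + 500 chars)`. The function names are hypothetical, and rounding the 1.1x term down is an assumption; the templates state the limit in prompt text, so this only illustrates the bound and is not code from the PR.

```python
def max_allowed_length(original: str) -> int:
    """Upper bound on the rewritten prompt: max(1.1x original, original + 500 chars)."""
    n = len(original)
    # Integer arithmetic avoids float rounding; flooring 1.1 * n is an assumption.
    return max((11 * n) // 10, n + 500)


def within_guardrail(original: str, candidate: str) -> bool:
    """True if the candidate prompt respects the length guardrail."""
    return len(candidate) <= max_allowed_length(original)


# Short prompts are governed by the +500 term, long prompts by the 1.1x term.
short_limit = max_allowed_length("x" * 1_000)    # max(1100, 1500)
long_limit = max_allowed_length("x" * 10_000)    # max(11000, 10500)
```

The two terms cross at 5,000 characters: below that the absolute +500 allowance dominates, above it the 10% allowance does.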
Pull request overview
This PR rebases and reintroduces prior work to add instrumentation around agent invocation, make AgentCore payloads customizable without subclassing, and ship updated “contrastive_reflection_v2” built-in templates.
Changes:
- Add `BaseAgenticOptimizer._invoke_agent()` to time agent calls and capture Strands usage/latency/tool metrics into `last_wall_clock_s` and `last_metrics`; route `ContrastiveReflectionOptimizer` through it.
- Add optional `payload_mapper: Callable[[dict], dict]` to `AgentCoreRolloutEngine` to transform the canonical `{"data_sample", "params"}` payload before invoking the runtime.
- Add new built-in templates under `templates/contrastive_reflection_v2/`.
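As an illustration of the `payload_mapper` hook, the sketch below shows a mapper that flattens the canonical payload for a runtime expecting flat fields. `flatten_payload`, the `config` key, and `build_request_body` are hypothetical; `build_request_body` only stands in for the engine's payload step and is not the actual `AgentCoreRolloutEngine` code.

```python
from typing import Any, Callable, Dict, Optional

Payload = Dict[str, Any]


def flatten_payload(payload: Payload) -> Payload:
    """Hypothetical mapper: lift data_sample fields to the top level and rename params."""
    flat = dict(payload["data_sample"])  # copy so the canonical payload is not mutated
    flat["config"] = payload["params"]   # "config" is an assumed target key
    return flat


def build_request_body(
    data_sample: Payload,
    params: Payload,
    payload_mapper: Optional[Callable[[Payload], Payload]] = None,
) -> Payload:
    """Stand-in for the engine: build the canonical payload, then apply the mapper if set."""
    canonical = {"data_sample": data_sample, "params": params}
    return canonical if payload_mapper is None else payload_mapper(canonical)


sample = {"question": "What is 2 + 2?"}
params = {"system_prompt": "Be concise."}

default_body = build_request_body(sample, params)
mapped_body = build_request_body(sample, params, payload_mapper=flatten_payload)
```

With the default `None`, the body is the unchanged canonical payload, which is what keeps existing deployments working.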
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| strands_harness_optimizer/optimizers/system_prompt/base_agentic_optimizer.py | Adds _invoke_agent() plus last_wall_clock_s / last_metrics fields to expose per-invocation cost/latency. |
| strands_harness_optimizer/optimizers/system_prompt/contrastive_reflection.py | Routes reflection invocation through _invoke_agent() to enable instrumentation. |
| strands_harness_optimizer/rollout_engines/agentcore_engine.py | Adds payload_mapper hook to reshape outgoing runtime payloads without subclassing. |
| strands_harness_optimizer/templates/contrastive_reflection_v2/system_prompt.jinja | New v2 system prompt template focusing on integrated edits vs appendices. |
| strands_harness_optimizer/templates/contrastive_reflection_v2/task_message_system_prompt.jinja | New v2 task message template enforcing structural preservation, safety invariant, and length guardrail. |
jnzs1836
left a comment
I think adding a try/except may silently swallow agent failures and cause further problems later in the agent execution. Since we are running LLM agent experiments, failures in agent execution should be visible to users so they can investigate them themselves; otherwise the experiment results may be biased.
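One possible way to keep instrumentation without hiding failures is `try`/`finally` rather than `try`/`except`. The sketch below is an assumption-laden illustration, not the PR's `_invoke_agent`: `InstrumentedInvoker` is a hypothetical stand-in class, and reading a `metrics` attribute from the result is assumed rather than taken from the Strands API.

```python
import time
from typing import Any, Callable, Optional


class InstrumentedInvoker:
    """Hypothetical stand-in that records timing but never swallows agent errors."""

    def __init__(self) -> None:
        self.last_wall_clock_s: Optional[float] = None
        self.last_metrics: Optional[Any] = None

    def invoke(self, agent: Callable[[str], Any], prompt: str) -> Any:
        self.last_metrics = None  # clear stale metrics from a previous call
        start = time.perf_counter()
        try:
            result = agent(prompt)
        finally:
            # Runs on success and on failure; the exception still propagates.
            self.last_wall_clock_s = time.perf_counter() - start
        # Reached only on success. The "metrics" attribute is an assumption.
        self.last_metrics = getattr(result, "metrics", None)
        return result
```

Because there is no `except` clause, a failing agent call surfaces to the caller unchanged, while the elapsed time for the failed attempt is still recorded.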
Supersedes #2.
PR #2 was branched before the package was renamed to `strands_harness_optimizer` (and before CI was added), so its changes landed on now-nonexistent paths and it couldn't be updated in place due to branch-protection rules. This branch rebases that work onto current `main`:
- `_invoke_agent` helper that times the agent invocation and captures Strands metrics (tokens, cycles, tool calls, latency) onto `last_wall_clock_s` / `last_metrics`. `ContrastiveReflectionOptimizer` routes through it.
- `payload_mapper: Callable[[dict], dict]` init kwarg that transforms the canonical `{"data_sample", "params"}` payload before the HTTP call. Default `None` (no transform), fully backward-compatible.
- `contrastive_reflection_v2/{system_prompt,task_message_system_prompt}.jinja`.

Original work by @hanDing (commit authorship preserved). Verified locally:
`pytest -m "not integration"` → 39 passed.