Skip to content

Conversation

@josh-hhai
Copy link
Contributor

@josh-hhai josh-hhai commented Aug 28, 2025

Completely replaces the old sdk with datamodel-codegen for the openapi spec, and removes traceloop and implemente opentlemetry. Greatly increases testability. Delivers no sdk code change support for bring your own instrumnetor.

TODO: Need to rework github workflows to suit the new model.
TODO: Look at adding specific environment testing, i.e. aws lambda


Note

Introduces extensive specification packages (architecture, SRD, tasks, implementation) for a workflow engine, a production-hardened HoneyHive SDK Docs MCP server (v2, concurrency-safe), and prioritized documentation fixes, including validation and testing plans.

  • Specifications:
    • Workflow Engine Design: Adds phase-gated workflow architecture, checkpointing, state/error handling, RAG integration, and testing strategy.
    • HoneyHive SDK Docs MCP:
      • v1 specs and a comprehensive v2 modular redesign (config via JSON/dataclasses, ServerFactory DI, selective tool loading, concurrency safety, version pinning, failure-mode testing).
      • Includes SRD, specs, tasks, implementation guides, validation/improvements analyses, and supporting docs.
    • Documentation P0 Fixes:
      • Adds SRD/specs/tasks/implementation for prioritized doc improvements (Getting Started restructure, compatibility matrices, span enrichment guide, testing/deployment guides), with validation gates and execution plan.

Written by Cursor Bugbot for commit 923d420. This will update automatically on new commits. Configure here.

@dhruv-hhai
Copy link
Contributor

Check for backwards compatibility of new API client against the old. Refer to docs scripts for reference + the speakeasy SDK ref docs in main.

@dhruv-hhai
Copy link
Contributor

Check for backwards compatibility on environment variable references + add support for experiment harness related env vars

HoneyHiveClient does not read standard environment variables

@dhruv-hhai
Copy link
Contributor

Enable verbose flag on the HoneyHiveClient init for customer to debug API errors.

@dhruv-hhai
Copy link
Contributor

Apply pydantic models on SDK caller params directly instead of inside the function.

@dhruv-hhai
Copy link
Contributor

Add a SSL cert override option on the HoneyHiveClient init with the env var for httpx.

Add a SSL no verify flag on HoneyHiveClient

@dhruv-hhai
Copy link
Contributor

Standardize python error handling middleware (context handler of some kind) and add that in all client wrapper classes.

@dhruv-hhai
Copy link
Contributor

Add docstrings on SDK functions

@dhruv-hhai
Copy link
Contributor

Investigate pydantic alternative

@dhruv-hhai
Copy link
Contributor

Ensure there's an async method for each API call wrapper.

@dhruv-hhai
Copy link
Contributor

Nit: Add argument builders for async callers

@dhruv-hhai
Copy link
Contributor

dhruv-hhai commented Aug 28, 2025

Investigate an alternative to data model codegen to also include client codegen

@dhruv-hhai
Copy link
Contributor

dhruv-hhai commented Aug 28, 2025

Drop the HoneyHiveLogger class / evaluate moving the logger repo into this repo

@dhruv-hhai
Copy link
Contributor

Drop project from tracer init

@dhruv-hhai
Copy link
Contributor

Drop unused imports

@dhruv-hhai
Copy link
Contributor

Move tracer away from singleton. We want to support multiple sessions within the same runtime.

@dhruv-hhai
Copy link
Contributor

Default session name should be the file name where the tracer is initialized.

@dhruv-hhai
Copy link
Contributor

dhruv-hhai commented Aug 28, 2025

Check if TracerProvider is initialized before initializing. We should support not being the main provider if someone already has a tracer provider set.

@dhruv-hhai
Copy link
Contributor

OTLP export is enabled by default

@dhruv-hhai
Copy link
Contributor

Provide a flag to disable batching on span exporter + support simple span processor

Lambda mode flag to auto-set these configs

@dhruv-hhai
Copy link
Contributor

Provide ability to set custom session_id via an argument on the tracer init.

@dhruv-hhai
Copy link
Contributor

Auto-generate UUIDv4 for session_id even if session start fails

@dhruv-hhai
Copy link
Contributor

Pick up session_id from baggage context if available by default

@dhruv-hhai
Copy link
Contributor

Setup baggage context should also check if pre-existing baggage has the main association properties set

@dhruv-hhai
Copy link
Contributor

The context manager sets span attributes as honeyhive.* it should also support traceloop.association.properties.*

@dhruv-hhai
Copy link
Contributor

dhruv-hhai commented Aug 28, 2025

Centralize the enrich_session implementation to the tracer class. Don't do create_event do update_event.

@dhruv-hhai
Copy link
Contributor

dhruv-hhai commented Aug 28, 2025

Enrich session should use the baggage to fetch the session_id, not the tracer (since we aren't on a singleton model)

@dhruv-hhai
Copy link
Contributor

Drop configure_otlp_exporter

@dhruv-hhai
Copy link
Contributor

Migrate evaluate from the old SDK. Look at the multi-threading patterns + tracer initializations to see how we want to support multiple sessions in one runtime.

@github-actions
Copy link
Contributor

📚 Documentation Preview Built

Documentation preview is ready!

📦 Download Preview

Download documentation artifact

🔍 How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

✅ Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Preview generated for PR #154

Updated all pre-commit hooks and scripts to use .praxis-os/ paths instead of .agent-os/.

Problem:
- feature-list-sync and documentation-compliance-check hooks were failing
- All pre-commit infrastructure still referenced old .agent-os/ directory
- Prevented commits from completing after praxis OS migration

Solution:
- Updated .pre-commit-config.yaml: Changed 3 file pattern references
- Updated scripts/check-feature-sync.py: Changed 4 references (.agent-os/product/features.md → .praxis-os/workspace/product/features.md)
- Updated scripts/check-documentation-compliance.py: Changed 6 references
- Updated scripts/validate-docs-navigation.sh: Changed comment and echo message
- Updated scripts/validate-no-mocks-integration.py: Changed spec path reference
- Updated 5 additional scripts: test-generation-*.py, setup-dev.sh, generate-test-from-framework.py, benchmark/README.md

Total Changes:
- Files modified: 10 (pre-commit config + 9 scripts)
- References updated: 43 (.agent-os/ → .praxis-os/)
- Hook paths now correctly reference .praxis-os/workspace/product/features.md and .praxis-os/standards/universal/best-practices.md

Files:
- .pre-commit-config.yaml
- scripts/check-feature-sync.py
- scripts/check-documentation-compliance.py
- scripts/validate-docs-navigation.sh
- scripts/validate-no-mocks-integration.py
- scripts/test-generation-metrics.py
- scripts/test-generation-framework-check.py
- scripts/setup-dev.sh
- scripts/generate-test-from-framework.py
- scripts/benchmark/README.md
- CHANGELOG.md, docs/changelog.rst (release notes)
@github-actions
Copy link
Contributor

📚 Documentation Preview Built

Documentation preview is ready!

📦 Download Preview

Download documentation artifact

🔍 How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

✅ Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Preview generated for PR #154

path: tests/lambda/benchmark-results.json

- name: Comment benchmark results on PR
if: github.event_name == 'pull_request'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Dead Code: Workflow Step Never Executes

The lambda-performance-benchmark job has a condition if: github.event_name == 'schedule' at line 170, meaning it only runs on scheduled events. However, the "Comment benchmark results on PR" step at line 211 has a condition if: github.event_name == 'pull_request', which will never be true since the job never runs on pull request events. This step will never execute, making the PR commenting functionality dead code.

Fix in Cursor Fix in Web

Add formatted table display for experiment results matching main branch behavior:
- Add rich library dependency for beautiful terminal table formatting
- Implement print_table() method on ExperimentResultSummary
  - Displays run summary (ID, status, pass/fail counts)
  - Shows aggregated metrics in formatted table
  - Lists per-datapoint results (up to 20)
  - Uses emojis and color for visual clarity
- Add print_results parameter to evaluate() (default: True)
- Add 7 comprehensive unit tests for print_table() functionality
  - Tests strip ANSI codes for clean assertions (production code stays clean)

All tests pass (2897/2897 unit tests)
Recovered from .agent-os migration (before bb881b4):
- .praxis-os/workspace/product/features.md (734 lines)
- .praxis-os/standards/universal/best-practices.md (390 lines)

These files were accidentally omitted during the praxis OS migration
and are required by the feature-list-sync pre-commit hook.
@github-actions
Copy link
Contributor

📚 Documentation Preview Built

Documentation preview is ready!

📦 Download Preview

Download documentation artifact

🔍 How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

✅ Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Preview generated for PR #154

]
}
}
} No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Temporary Artifacts Don't Belong in Version Control

A backup configuration file .cursor/mcp.json.backup-20251112-085756 was accidentally committed to the repository. Backup files are temporary artifacts that shouldn't be version controlled. The file contains hardcoded absolute paths specific to a developer's local machine (/Users/josh/src/github.com/honeyhiveai/python-sdk/), which won't work for other developers and clutters the repository.

Fix in Cursor Fix in Web

level: "INFO" # Options: DEBUG, INFO, WARNING, ERROR, CRITICAL
format: "text" # Options: "text" (human-readable) or "json" (structured)
log_dir: ".cache/logs/" # Log file location (usually fine as-is)
behavioral_metrics_enabled: true # Track query diversity, trends, prepend effectiveness
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Backup Files Pollute Version Control

Another backup configuration file .praxis-os/config/mcp.yaml.backup2 was accidentally committed. Multiple backup files suggest iterative configuration changes that should have been cleaned up before committing. These backup files are temporary artifacts that don't belong in version control.

Fix in Cursor Fix in Web

on:
# Run after docs are deployed - MANDATORY on every deploy
workflow_run:
workflows: ["Deploy Documentation"]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Workflow Trigger Mismatch Breaks Validation Automation

The workflow_run trigger references a workflow named "Deploy Documentation", but the actual workflow in docs-deploy.yml is named "Deploy Documentation to GitHub Pages". This mismatch prevents the validation workflow from triggering after documentation deployment completes. The workflow will never run automatically after deployments, breaking the intended validation pipeline.

Fix in Cursor Fix in Web

…ements

- Add with_distributed_trace_context() helper for simplified server-side distributed tracing
- Fix @trace decorator to preserve distributed baggage (session_id, project, source)
- Fix span processor to prioritize baggage over tracer instance attributes
- Enhance enrich_span_context() with inputs/outputs/metadata parameters
- Update HoneyHiveTracerBase.init() return type to Self for better type inference
- Add unit tests for distributed tracing context and decorator baggage preservation
- Fix DatasetsAPI to handle 204 No Content responses
- Update distributed tracing tutorial and API reference documentation
- Add comprehensive summary document for all improvements

Breaking Changes: None
@github-actions
Copy link
Contributor

📚 Documentation Preview Built

Documentation preview is ready!

📦 Download Preview

Download documentation artifact

🔍 How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

✅ Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Preview generated for PR #154

…examples

- Replace generic Flask examples with real Google ADK agent code
- Feature with_distributed_trace_context() helper as primary approach
- Demonstrate mixed invocation pattern (remote + local agents)
- Update architecture diagrams for AI agent use case
- Add practical troubleshooting for Google ADK setup
- Simplify from 3 services to 2 processes for clarity
- Include actual working code from examples/integrations/
…tion

- Remove Pattern 1-5 (common enrichment patterns) from span enrichment guide
- Add comprehensive section on enrich_span_context() for inline span creation
- Emphasize use case: when it's hard to split code into separate functions
- Include comparison table: @trace decorator vs enrich_span_context()
- Add real-world RAG pipeline example with inline spans
- Provide clear guidance on when to use each approach
- Reduce doc from 757 to 544 lines (more focused on key patterns)
…orial

- Change :doc:`custom-spans` to :doc:`../how-to/advanced-tracing/custom-spans`
- Fixes Sphinx build warning: unknown document 'custom-spans'
@github-actions
Copy link
Contributor

📚 Documentation Preview Built

Documentation preview is ready!

📦 Download Preview

Download documentation artifact

🔍 How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

✅ Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Preview generated for PR #154

level: "INFO" # Options: DEBUG, INFO, WARNING, ERROR, CRITICAL
format: "text" # Options: "text" (human-readable) or "json" (structured)
log_dir: ".cache/logs/" # Log file location (usually fine as-is)
behavioral_metrics_enabled: true # Track query diversity, trends, prepend effectiveness
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Accidental Backup Configuration Committed

Backup configuration file committed to version control. The .gitignore includes *.bak* to exclude backup files, but this pattern doesn't match .backup-* files. This appears to be a temporary backup that was accidentally committed and shouldn't be in the repository.

Fix in Cursor Fix in Web

…rable span limits

- Add configurable span limits (max_attributes=1024, max_events=1024, max_links=64)
- Add preserve_core_attributes config (default=True) with lazy activation at 95%
- Implement inline preservation in _finalize_span_dynamically (no processor)
- Add priorities.py module defining critical attributes for preservation
- Remove verbose debug logging from hot path for performance
- Update logging default to WARNING when verbose=False
- Add comprehensive unit tests for preservation logic and config
- Add pylint disables for 10.00/10, fix mypy by adding config to TYPE_CHECKING

Resolves span attribute FIFO eviction causing data loss on large payloads.
Performance: <1ms overhead only on spans approaching attribute limit.
- Add comprehensive spec for configurable span limits feature
- Include SRD, implementation guide, and testing strategy
- Add addendum for lazy-activated core attribute preservation
- Include workflow completion summary and supporting documentation
- Add pre-commit gauntlet survival guide to standards
- Include pessimistic review and resolution documentation

This spec documents the span attribute limit configuration feature that
was just implemented, including the design rationale, implementation
approach, testing strategy, and all supporting documentation from the
workflow execution process.
@github-actions
Copy link
Contributor

📚 Documentation Preview Built

Documentation preview is ready!

📦 Download Preview

Download documentation artifact

🔍 How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

✅ Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Preview generated for PR #154

if: always()
run: |
if [ ! -z "$SERVER_PID" ]; then
kill $SERVER_PID || true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: HTTP server started before documentation exists

The validate-local-build job starts an HTTP server pointing to _build/html directory before that directory is created. The server is launched on line 144, but tox -e docs (which creates _build/html) doesn't run until line 156. This causes a race condition where the server may fail to start or serve incorrect content. The server startup should be moved after the documentation build step.

Fix in Cursor Fix in Web

@github-actions
Copy link
Contributor

📚 Documentation Preview Built

Documentation preview is ready!

📦 Download Preview

Download documentation artifact

🔍 How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

✅ Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Preview generated for PR #154

- Document max_attributes, max_events, max_links, max_span_size settings
- Document preserve_core_attributes feature and lazy activation
- Emphasize that SDK defaults (1024 attrs, 10MB) are optimized for 95% of use cases
- Clarify backend maximums (10,000 attrs, 100MB) are for edge cases only
- Add environment variable names for all new settings (HH_MAX_*)
- Include configuration examples showing conservative increases (not maxing out)
- Document when and why to adjust limits (only when hitting actual errors)
- Add performance, memory, and cost implications
- Explain OpenTelemetry FIFO eviction and core attribute preservation

Documentation philosophy: Defaults are intentionally conservative and well-chosen.
Users should not preemptively max out limits - only increase when needed.
@github-actions
Copy link
Contributor

📚 Documentation Preview Built

Documentation preview is ready!

📦 Download Preview

Download documentation artifact

🔍 How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

✅ Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Preview generated for PR #154

)

# Use decorators
@trace(event_type="llm_call")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Documentation uses string literal instead of EventType enum (Bugbot Rules)

The example code uses a string literal "llm_call" for the event_type parameter, but project rules mandate using EventType enums in all documentation. This violates the critical rule stated in .cursor/rules/execute-tasks.mdc: "EventType enums only - Never string literals in documentation". The correct pattern should import EventType from honeyhive.models and use EventType.model or appropriate enum value instead.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants