honeyhiveai · josh-hhai · Sep 2, 2025
diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
@@ -0,0 +1,68 @@
+# Claude Code Configuration - HoneyHive Python SDK
+
+## Project Context
+This is the HoneyHive Python SDK (complete-refactor branch) - the client library for HoneyHive, an observability and evaluation platform for LLM applications.
+
+## Agent OS Integration
+The project uses Agent OS for structured development. Key directories:
+- Standards: `.agent-os/standards/` - Global coding standards
+- Product: `.agent-os/product/` - Product documentation
+- Specs: `.agent-os/specs/` - Feature specifications
+
+## Critical Project Rules
+
+### 🔴 MUST FOLLOW
+1. **ALWAYS use tox for testing** - Never run pytest directly
+   ```bash
+   tox -e py311  # Python 3.11 tests
+   tox -e unit   # Unit tests only
+   ```
+
+2. **Type hints are MANDATORY** - All functions must have type hints
+3. **No code in `__init__.py`** - Only imports allowed
+4. **Use Black formatting** - Line length 88
+5. **Multi-instance tracers** - No singleton pattern
+
+### Key Patterns
+- Unified `@trace` decorator works for both sync/async
+- HTTP tracing disabled by default for performance
+- Graceful degradation - never crash host application
+- Environment variables: `HH_*`, `HTTP_*`, `EXPERIMENT_*`
+
+## Quick Commands
+
+### Testing
+```bash
+tox -e py311        # Test on Python 3.11
+tox -e unit         # Run unit tests
+tox -e integration  # Run integration tests
+tox -e lint         # Run linting
+```
+
+### Common Patterns
+```python
+# Initialize tracer
+from honeyhive import HoneyHiveTracer, trace
+
+tracer = HoneyHiveTracer.init(
+    api_key="hh_api_...",
+    project="my-project"
+)
+
+# Use decorators
+@trace(event_type="llm_call")
+async def my_function():
+    return await process()
+```
+
+## Development Workflow
+1. Check `.agent-os/product/roadmap.md` for current priorities
+2. Create specs in `.agent-os/specs/` for new features
+3. Follow standards in `.agent-os/standards/`
+4. Update `.agent-os/product/decisions.md` for architectural choices
+
+## References
+- Product Overview: `.agent-os/product/overview.md`
+- Code Style: `.agent-os/standards/code-style.md`
+- Best Practices: `.agent-os/standards/best-practices.md`
+- Technical Decisions: `.agent-os/product/decisions.md`
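The "unified `@trace` decorator" and "graceful degradation" patterns listed in `.claude/CLAUDE.md` can be illustrated with a minimal sketch. This is not the SDK's implementation: `trace_sketch`, `_record`, and `RECORDED_EVENTS` are hypothetical names, and the in-memory list stands in for whatever exporter the real tracer uses. The sketch only shows one common way a single decorator can serve both sync and async callables while keeping its own failures away from the host application.

```python
import asyncio
import functools
import inspect
import logging
from typing import Any, Callable

logger = logging.getLogger("trace_sketch")

Func = Callable[..., Any]

# Hypothetical in-memory sink standing in for a real span exporter.
RECORDED_EVENTS: list[dict[str, Any]] = []


def _record(event_type: str, name: str) -> None:
    """Record an event; swallow any failure so the host app keeps running."""
    try:
        RECORDED_EVENTS.append({"event_type": event_type, "name": name})
    except Exception:  # graceful degradation: log, never raise
        logger.debug("failed to record event", exc_info=True)


def trace_sketch(event_type: str) -> Callable[[Func], Func]:
    """Illustrative decorator that wraps sync and async callables alike."""

    def decorator(func: Func) -> Func:
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
                _record(event_type, func.__name__)
                return await func(*args, **kwargs)

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
            _record(event_type, func.__name__)
            return func(*args, **kwargs)

        return sync_wrapper

    return decorator


@trace_sketch(event_type="tool")
def add(a: int, b: int) -> int:
    return a + b


@trace_sketch(event_type="tool")
async def add_async(a: int, b: int) -> int:
    return a + b
```

Calling `add(1, 2)` runs synchronously, while `asyncio.run(add_async(2, 3))` awaits the wrapped coroutine; both go through the same decorator. Exceptions raised by the wrapped function itself still propagate, since only the tracing side is meant to degrade silently.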
diff --git a/.cursor/commands/trace.md b/.cursor/commands/trace.md
diff --git a/.cursor/mcp.json b/.cursor/mcp.json
@@ -0,0 +1,28 @@
+{
+  "mcpServers": {
+    "praxis-os": {
+      "command": "${workspaceFolder}/.praxis-os/venv/bin/python",
+      "args": [
+        "-m",
+        "ouroboros",
+        "--transport",
+        "dual",
+        "--log-level",
+        "INFO"
+      ],
+      "env": {
+        "PROJECT_ROOT": "${workspaceFolder}",
+        "PYTHONPATH": "${workspaceFolder}/.praxis-os",
+        "PYTHONUNBUFFERED": "1"
+      },
+      "autoApprove": [
+        "pos_search_project",
+        "pos_workflow",
+        "pos_browser",
+        "pos_filesystem",
+        "current_date",
+        "get_server_info"
+      ]
+    }
+  }
+}
diff --git a/.cursor/mcp.json.backup-20251112-085756 b/.cursor/mcp.json.backup-20251112-085756
@@ -0,0 +1,26 @@
+{
+  "mcpServers": {
+    "python-sdk": {
+      "command": "/Users/josh/src/github.com/honeyhiveai/python-sdk/.praxis-os/venv/bin/python",
+      "args": [
+        "-m",
+        "ouroboros",
+        "--transport",
+        "dual",
+        "--log-level",
+        "DEBUG"
+      ],
+      "env": {
+        "PYTHONPATH": "/Users/josh/src/github.com/honeyhiveai/python-sdk/.praxis-os"
+      },
+      "autoApprove": [
+        "pos_search_project",
+        "pos_workflow",
+        "pos_browser",
+        "pos_filesystem",
+        "get_server_info",
+        "current_date"
+      ]
+    }
+  }
+}
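The two MCP configs above differ mainly in portability: `.cursor/mcp.json` uses `${workspaceFolder}` placeholders, while the backup file hard-codes paths under `/Users/josh/`. A small check along the following lines could flag machine-specific paths before such a file is committed. It is only a sketch: `find_absolute_paths` is a hypothetical helper, it inspects just the `command` and `env` fields, and it treats any value starting with `/` as a POSIX absolute path.

```python
from typing import Any


def find_absolute_paths(config: dict[str, Any]) -> list[str]:
    """Return 'server.field' labels whose value is a hard-coded absolute path."""
    findings: list[str] = []
    for name, server in config.get("mcpServers", {}).items():
        # Collect the string fields that typically hold filesystem paths.
        candidates = {"command": server.get("command", "")}
        for key, value in server.get("env", {}).items():
            candidates[f"env.{key}"] = value
        for field, value in candidates.items():
            if isinstance(value, str) and value.startswith("/"):
                findings.append(f"{name}.{field}")
    return findings


# Trimmed copies of the two configs from this diff.
portable = {
    "mcpServers": {
        "praxis-os": {
            "command": "${workspaceFolder}/.praxis-os/venv/bin/python",
            "env": {
                "PROJECT_ROOT": "${workspaceFolder}",
                "PYTHONPATH": "${workspaceFolder}/.praxis-os",
            },
        }
    }
}

repo = "/Users/josh/src/github.com/honeyhiveai/python-sdk"
machine_specific = {
    "mcpServers": {
        "python-sdk": {
            "command": f"{repo}/.praxis-os/venv/bin/python",
            "env": {"PYTHONPATH": f"{repo}/.praxis-os"},
        }
    }
}
```

Here `find_absolute_paths(portable)` returns an empty list, and `find_absolute_paths(machine_specific)` reports both the `command` and the `PYTHONPATH` entry of the `python-sdk` server.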
diff --git a/.cursor/rules/analyze-product.mdc b/.cursor/rules/analyze-product.mdc
@@ -0,0 +1,119 @@
+# Analyze Product - HoneyHive Python SDK
+
+When analyzing the existing codebase or adding Agent OS to existing code:
+
+## Analysis Process
+
+### 1. Understand Current Architecture
+```text
+# Key directories to analyze
+src/honeyhive/
+├── api/           # API client layer
+├── tracer/        # OpenTelemetry integration
+├── evaluation/    # Evaluation framework
+├── models/        # Data models
+└── utils/         # Shared utilities
+```
+
+### 2. Key Architectural Patterns
+
+#### Multi-Instance Support
+- Each tracer instance is independent
+- No global singleton pattern
+- Thread-safe operations
+
+#### Unified Decorators
+```python
+from honeyhive import trace  # single @trace works for both sync and async
+from honeyhive.models import EventType
+
+@trace(event_type=EventType.tool)
+def sync_func(): pass
+
+@trace(event_type=EventType.tool)
+async def async_func(): pass
+```
+
+#### Graceful Degradation
+- SDK never crashes host application
+- Errors logged but handled gracefully
+- Optional returns for non-critical operations
+
+### 3. Current Implementation Details
+
+#### Testing Framework
+- **950 tests** currently passing (831 unit + 119 integration)
+- **81.14% coverage** achieved (exceeds 80% requirement)
+- **Two-tier testing**: Unit (mocked, fast) vs Integration (real APIs, no mocks)
+- **tox** for test orchestration
+- Python 3.11, 3.12, 3.13 support
+- **NO MOCKS IN INTEGRATION TESTS** - Critical rule established
+
+#### Configuration
+- Environment variables: `HH_*`, `HTTP_*`, `EXPERIMENT_*`
+- Configuration precedence: Constructor > Env > Defaults
+- HTTP tracing disabled by default
+
+#### Key Dependencies
+- OpenTelemetry >=1.20.0
+- httpx >=0.24.0
+- pydantic >=2.0.0
+- Python 3.11+ required
+
+### 4. Integration Points
+
+#### Provider Integrations
+- OpenAI / Azure OpenAI
+- Anthropic Claude
+- Google Gemini
+- AWS Bedrock
+- 15+ more providers
+
+#### Framework Support
+- LangChain / LangGraph
+- LlamaIndex
+- CrewAI
+- LiteLLM
+
+### 5. When Analyzing Existing Code
+
+#### Check for:
+- Existing test patterns
+- Configuration mechanisms
+- Error handling approaches
+- Performance optimizations
+- Security practices
+
+#### Document in Agent OS:
+- Update `.agent-os/product/features.md` with discovered features
+- Add to `.agent-os/product/decisions.md` for architectural choices
+- Create specs in `.agent-os/specs/` for improvements
+
+## Critical Patterns to Maintain
+
+1. **NO MOCKS IN INTEGRATION TESTS** - Integration tests must use real systems
+2. **Always use tox** for testing - Never run pytest directly
+3. **Type hints and docstrings mandatory** on all functions
+4. **No code in `__init__.py`** files - Only imports
+5. **Multi-instance support** required - No singleton pattern
+6. **Graceful degradation** essential - Never crash host app
+7. **EventType enums only** - Never string literals in documentation
+8. **80% test coverage** minimum (project-wide)
+9. **Test count reporting** - Always report total tests correctly (unit + integration)
+
+## Standards to Follow
+Always reference:
+- **Best Practices**: `.agent-os/standards/best-practices.md` (includes Agent OS spec standards)
+- **Technology Stack**: `.agent-os/standards/tech-stack.md` for technology choices
+- **Code Style**: `.agent-os/standards/code-style.md` for coding standards
+
+## References
+- **Product Documentation**:
+  - Overview: `.agent-os/product/overview.md`
+  - Features: `.agent-os/product/features.md`
+  - Roadmap: `.agent-os/product/roadmap.md`
+  - Decisions: `.agent-os/product/decisions.md`
+- **Standards Documentation**:
+  - Best Practices: `.agent-os/standards/best-practices.md`
+  - Tech Stack: `.agent-os/standards/tech-stack.md`
+  - Code Style: `.agent-os/standards/code-style.md`
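The configuration precedence stated in `analyze-product.mdc` (Constructor > Env > Defaults) can be sketched as a small resolver. Everything named here is an assumption for illustration: `resolve_setting`, the `DEFAULTS` table, and the mapping from a setting name to an `HH_`-prefixed variable are not taken from the SDK.

```python
import os
from typing import Mapping, Optional

# Hypothetical defaults for illustration; the SDK's real defaults may differ.
DEFAULTS: dict[str, str] = {"project": "default-project"}


def resolve_setting(
    name: str,
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve a setting using Constructor > Env > Defaults precedence."""
    # 1. A value passed explicitly (e.g. to a constructor) always wins.
    if explicit is not None:
        return explicit
    # 2. Otherwise look for an HH_-prefixed environment variable (assumed naming).
    source = os.environ if env is None else env
    env_value = source.get(f"HH_{name.upper()}")
    if env_value is not None:
        return env_value
    # 3. Fall back to the built-in default.
    return DEFAULTS[name]
```

Passing `env` as a plain mapping keeps the function easy to unit-test without touching the real process environment, which fits the mocked, fast unit tier described above.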
diff --git a/.cursor/rules/create-spec.mdc b/.cursor/rules/create-spec.mdc
@@ -0,0 +1,96 @@
+# Create Spec - HoneyHive Python SDK
+
+When creating specifications for new features, follow the Agent OS specification standards.
+
+## 🚨 CRITICAL: Follow Agent OS Standards
+
+**All specification creation MUST follow the standards defined in:**
+- **`.agent-os/standards/best-practices.md`** - Complete Agent OS specification standards (starting at "📋 Agent OS Specification Standards")
+
+**Key Requirements**:
+- **File Structure**: Follow the 5-file structure: srd.md, specs.md, and tasks.md are mandatory; README.md is recommended; implementation.md is optional
+- **Content Standards**: Each file has specific required sections and format requirements
+- **Task Format**: Follow checkbox specifications defined in `.cursor/rules/execute-tasks.mdc`
+- **Date Standards**: Use current system date for all spec creation
+
+## Spec Creation Protocol
+
+**MANDATORY**: When creating new Agent OS specs, AI assistants MUST:
+
+### 1. Get Current Date
+```bash
+CURRENT_DATE=$(date +"%Y-%m-%d")
+echo "Today is: $CURRENT_DATE"
+```
+
+### 2. Create Directory with Proper Naming
+```bash
+SPEC_NAME="your-spec-name"
+SPEC_DIR=".agent-os/specs/${CURRENT_DATE}-${SPEC_NAME}"
+mkdir -p "$SPEC_DIR"
+```
+
+### 3. Create ALL Required Files
+```bash
+# Create mandatory files
+touch "$SPEC_DIR/srd.md"
+touch "$SPEC_DIR/specs.md"
+touch "$SPEC_DIR/tasks.md"
+
+# Create recommended files
+touch "$SPEC_DIR/README.md"
+
+# Create optional files (if needed)
+touch "$SPEC_DIR/implementation.md"
+```
+
+### 4. Use Proper Headers in Each File
+```markdown
+# Spec Name - File Type
+
+**Date**: YYYY-MM-DD (current system date)
+**Status**: Draft/Active/Completed
+**Priority**: High/Medium/Low
+```
+
+## Validation Commands
+
+**Use the validation commands defined in `.agent-os/standards/best-practices.md`**
+
+**Quick Validation**:
+```bash
+# Get current date for spec creation
+CURRENT_DATE=$(date +"%Y-%m-%d")
+echo "Today is: $CURRENT_DATE"
+
+# Verify spec follows Agent OS standards
+# (Complete validation commands are in .agent-os/standards/best-practices.md)
+```
+
+## Standards to Follow
+- **Agent OS Standards**: `.agent-os/standards/best-practices.md`
+- **Technology Stack**: `.agent-os/standards/tech-stack.md`
+- **Code Style**: `.agent-os/standards/code-style.md`
+
+## Critical Rules for HoneyHive SDK
+1. **NO MOCKS IN INTEGRATION TESTS** - Integration tests must use real systems
+2. **All functions must have type hints** and docstrings
+3. **Minimum 80% test coverage** (project-wide)
+4. **Use tox for ALL testing** - Never run pytest directly
+5. **Graceful degradation required** - Never crash host app
+6. **Use EventType enums** - Never string literals in documentation
+7. **Test count reporting** - Always report total tests correctly (unit + integration)
+
+## Common Violations to Prevent
+
+**❌ WRONG**:
+- Not consulting `.agent-os/standards/best-practices.md` before creating specs
+- Duplicating standards content instead of referencing it
+- Ignoring existing Agent OS specification structure
+- **Task format errors**: Using checkboxes on section headers or wrong checkbox format
+
+**✅ CORRECT**:
+- **Always reference Agent OS standards first**: Read `.agent-os/standards/best-practices.md`
+- **Follow established patterns**: Use existing specs as templates
+- **Proper task format**: Follow checkbox specifications in `.cursor/rules/execute-tasks.mdc`
+- **Leverage standards system**: Reference, don't duplicate
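The spec creation protocol in `create-spec.mdc` (get the current date, create a dated directory, create the files, add headers) could also be scripted in Python. The sketch below is an assumption-laden illustration, not part of Agent OS: `create_spec` and `SPEC_FILES` are hypothetical names, all five files are created regardless of whether they are mandatory or optional, and the header uses fixed `Draft` and `Medium` values as starting points.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional

SPEC_FILES = ("srd.md", "specs.md", "tasks.md", "README.md", "implementation.md")


def create_spec(root: Path, spec_name: str, today: Optional[date] = None) -> Path:
    """Create .agent-os/specs/<YYYY-MM-DD>-<spec_name>/ with header stubs."""
    # Step 1: take the date from the system unless one is injected for testing.
    today = today or date.today()
    # Step 2: directory name is the ISO date followed by the spec name.
    spec_dir = root / ".agent-os" / "specs" / f"{today.isoformat()}-{spec_name}"
    spec_dir.mkdir(parents=True, exist_ok=True)
    # Steps 3 and 4: create each file with the header block, never overwriting.
    for filename in SPEC_FILES:
        path = spec_dir / filename
        if path.exists():
            continue
        file_type = filename.removesuffix(".md")
        header = (
            f"# {spec_name} - {file_type}\n\n"
            f"**Date**: {today.isoformat()}\n"
            "**Status**: Draft\n"
            "**Priority**: Medium\n"
        )
        path.write_text(header, encoding="utf-8")
    return spec_dir
```

A call such as `create_spec(Path("."), "your-spec-name")` would scaffold the directory under the current repository root. Existing files are left untouched, so rerunning the function is safe.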