-
Notifications
You must be signed in to change notification settings - Fork 4
fix(search-service): Pass structured_answer through SearchOutput #632
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(search-service): Pass structured_answer through SearchOutput #632
Conversation
Add structured_answer field to SearchOutput creation to ensure citations
and structured output data flows through the search pipeline.
**Changes:**
1. **SearchOutput Creation** (search_service.py:586):
- Add `structured_answer=result_context.structured_answer` to SearchOutput
- Ensures structured output (with citations) is included in search results
- Previously: Field existed in schema but not populated from result_context
2. **Debug Logging**:
- Log document_metadata count before SearchOutput creation
- Log first document name for debugging
- Helps track data flow through search pipeline
**Why This Matters:**
- SearchOutput schema has `structured_answer: StructuredAnswer | None` field
- generation_stage.py creates structured output and adds to result_context
- But SearchService wasn't passing it through to SearchOutput
- Result: Structured output generated but lost before returning to caller
**Data Flow:**
```
generation_stage.py
↓
result_context.structured_answer = StructuredAnswer(...)
↓
SearchService._search_with_executor()
↓
SearchOutput(
answer=...,
documents=...,
structured_answer=result_context.structured_answer ← ADDED
)
↓
MessageProcessingOrchestrator
↓
Frontend (citations display)
```
**Testing:**
- Structured output now included in SearchOutput
- Citations data flows through to conversation API response
- No breaking changes (field is optional, None if not generated)
**Dependencies:**
- Requires PR #626 (Structured Output schema) for StructuredAnswer field definition
- Works with PR #631 (Conversation API config) to enable user-controlled structured output
**Related:**
- Part of Issue #629 fix (citations not displaying)
- Small but critical piece of the structured output pipeline
🚀 Development Environment OptionsThis repository supports Dev Containers for a consistent development environment. Option 1: GitHub Codespaces (Recommended)Create a cloud-based development environment:
Option 2: VS Code Dev Containers (Local)Use Dev Containers on your local machine:
Option 3: Traditional Local SetupSet up the development environment manually: # Clone the repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo
git checkout fix/search-service-structured-answer-passthrough
# Initialize development environment
make dev-init
make dev-build
make dev-up
make dev-validateAvailable CommandsOnce in your development environment: make help # Show all available commands
make dev-validate # Validate environment setup
make test-atomic # Run atomic tests
make test-unit # Run unit tests
make lint # Run lintingServices AvailableWhen running
This automated message helps reviewers quickly set up the development environment. |
Code Review - PR #632: Pass structured_answer through SearchOutput
|
…nd SearchOutput - Add structured_answer field to SearchOutput schema with StructuredAnswer import - Add structured_answer field to SearchContext dataclass for pipeline data flow - Fix quote style in search_service.py debug logging (double quotes) - Apply Ruff formatting to search_service.py This ensures structured output with citations generated by generation_stage.py flows through to the SearchOutput response and reaches the frontend. Related to PR #626 (Structured Output schema) Enables PR #630 (Frontend Citations UI) Signed-off-by: Claude <[email protected]> Signed-off-by: manavgup <[email protected]>
# Conflicts: # backend/rag_solution/services/pipeline/search_context.py
Code Review: PR #632 - Pass structured_answer through SearchOutputSummaryThis PR fixes a critical data flow issue where structured output with citations was being generated in the pipeline but lost before returning to the caller. The fix is minimal, focused, and backward compatible. ✅ Strengths1. Excellent Problem Identification
2. Minimal, Surgical Fix
3. Comprehensive Documentation
4. Proper Schema Updates
5. Code Quality
🔍 Areas for Improvement1. Missing Test Coverage
|
Problem
SearchOutputschema hasstructured_answerfield (from PR #626), butSearchServicewasn't populating it when creatingSearchOutputinstances.Impact: Structured output with citations generated by
generation_stage.pywas lost before returning to caller.Solution
Add
structured_answer=result_context.structured_answerwhen creatingSearchOutput.Changes
search_service.py (backend/rag_solution/services/search_service.py:586):
Debug Logging:
document_metadatacountData Flow
Why This Was Needed
SearchOutputhasstructured_answer: StructuredAnswer | Nonefieldgeneration_stage.pycreates structured output correctlySearchServicewasn't passing it throughTesting
✅ Structured output now included in
SearchOutput✅ Citations data flows through to conversation API response
✅ No breaking changes (field is optional,
Noneif not generated)✅ Backward compatible with existing code
Related
Size
Tiny PR: 6 lines added (1 line critical, 5 lines debug logging)
Breaking Changes
None - field is optional and defaults to
None