Description
Problem
The citations feature is not displaying in the frontend despite all infrastructure being in place and the structured_output_enabled flag being correctly sent from frontend to backend.
Expected Behavior
When the user checks the "Enable Citations (Structured Output)" checkbox and sends a query:
- Frontend sends metadata.config_metadata.structured_output_enabled: true
- Backend receives the config and uses structured output generation
- Backend returns metadata.search_metadata.structured_answer.citations
- Frontend displays an "X citations" button in MessageMetadataFooter
- User clicks the button to see CitationsAccordion with citation details
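The response path named in the expected flow can be checked with a small helper when inspecting backend responses by hand. This is a minimal sketch: the function name `extract_citations` and the sample citation fields are hypothetical, and only the key path metadata.search_metadata.structured_answer.citations comes from this issue.

```python
from typing import Any


def extract_citations(message: dict[str, Any]) -> list[Any]:
    """Return the citations list from a response message, or [] if any key is missing."""
    node: Any = message
    for key in ("metadata", "search_metadata", "structured_answer", "citations"):
        if not isinstance(node, dict):
            return []
        node = node.get(key)
    return node if isinstance(node, list) else []


# Hypothetical response shapes for illustration; the citation fields are made up.
with_citations = {
    "metadata": {
        "search_metadata": {
            "structured_answer": {"citations": [{"document_id": "doc-1"}]}
        }
    }
}
without_citations = {"metadata": {"search_metadata": {}}}

print(len(extract_citations(with_citations)))
print(len(extract_citations(without_citations)))
```

An empty result for a request that had the flag set would match the behavior reported in this issue.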
Actual Behavior
- ✅ Frontend correctly sends structured_output_enabled: true in payload
- ✅ Backend MessageProcessingOrchestrator extracts and merges config_metadata
- ❌ Backend continues using Chain of Thought (CoT) instead of structured output
- ❌ No citations generated or returned
- ❌ Frontend only shows "5 sources" button, no "X citations" button
Evidence
Frontend payload (confirmed correct):
{
  "session_id": "3fb379e1-66e9-4c48-93fd-c46e8d9df6a5",
  "content": "what was IBM revenue?",
  "role": "user",
  "message_type": "question",
  "metadata": {
    "config_metadata": {
      "structured_output_enabled": true,
      "cot_enabled": false,
      "show_cot_steps": false
    }
  }
}
Backend logs:
- Shows "Using CoT-generated answer" (should use structured output)
- Debug logs from MessageProcessingOrchestrator not appearing
- Config may not be reaching generation_stage.py
Root Cause Analysis
Suspected Issue: Config metadata not propagating from MessageProcessingOrchestrator → SearchService → generation_stage.py
Files Involved:
- backend/rag_solution/services/message_processing_orchestrator.py - Extracts config ✅
- backend/rag_solution/services/search_service.py - Should pass config to generation
- backend/rag_solution/generation/generation_stage.py - Needs to check structured_output_enabled flag
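One way to confirm where the config is lost is to log it at each hop. The sketch below is not the project's code: `SearchRequest`, `orchestrate`, `search`, and `generate` are hypothetical stand-ins for the three files listed above, and it only illustrates that the middle hop has to forward config_metadata explicitly for the flag to reach generation.

```python
import logging
from dataclasses import dataclass, field
from typing import Any

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("config_trace")


@dataclass
class SearchRequest:
    """Hypothetical object handed from the orchestrator to the search layer."""
    question: str
    config_metadata: dict[str, Any] = field(default_factory=dict)


def orchestrate(payload: dict[str, Any]) -> SearchRequest:
    # Stand-in for the orchestrator: extract the user's config from the payload.
    config = payload.get("metadata", {}).get("config_metadata", {})
    logger.debug("orchestrator config_metadata=%s", config)
    return SearchRequest(question=payload["content"], config_metadata=dict(config))


def search(request: SearchRequest) -> dict[str, Any]:
    # Stand-in for the search layer, the hop suspected of dropping the config.
    logger.debug("search config_metadata=%s", request.config_metadata)
    return generate(request.question, request.config_metadata)


def generate(question: str, config_metadata: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for the generation stage: report whether the flag arrived.
    logger.debug("generation config_metadata=%s", config_metadata)
    enabled = bool(config_metadata.get("structured_output_enabled", False))
    return {"question": question, "structured_output_enabled": enabled}


payload = {
    "content": "what was IBM revenue?",
    "metadata": {"config_metadata": {"structured_output_enabled": True}},
}
result = search(orchestrate(payload))
print(result["structured_output_enabled"])
```

If the equivalent log line in the real generation stage shows an empty config, the drop happens before that stage, and a flag check there alone would not be enough.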
Current Logic in generation_stage.py (suspected issue):
# Line ~120-140: CoT takes precedence
if search_output.cot_output and search_output.cot_output.final_answer:
    final_answer = search_output.cot_output.final_answer  # ← Always used if CoT exists
    logger.info("Using CoT-generated answer")
elif search_output.structured_answer:
    final_answer = search_output.structured_answer.answer  # ← Never reached when CoT output exists
    logger.info("Using structured answer")
Problem: The selection logic needs to respect the structured_output_enabled flag and skip CoT when structured output is requested.
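The precedence problem can be reproduced in isolation. In this minimal sketch the dataclasses `CotOutput`, `StructuredAnswer`, and `SearchOutput` are hypothetical stand-ins with only the attributes used in the snippet above, and `current_selection` mirrors the suspected branch order rather than the actual contents of generation_stage.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CotOutput:
    final_answer: str


@dataclass
class StructuredAnswer:
    answer: str


@dataclass
class SearchOutput:
    cot_output: Optional[CotOutput] = None
    structured_answer: Optional[StructuredAnswer] = None


def current_selection(search_output: SearchOutput) -> Optional[str]:
    """Mirror the suspected branch order: CoT first, structured answer second."""
    if search_output.cot_output and search_output.cot_output.final_answer:
        return search_output.cot_output.final_answer
    elif search_output.structured_answer:
        return search_output.structured_answer.answer
    return None


# When both outputs exist, the structured answer is shadowed by the CoT branch.
both = SearchOutput(CotOutput("cot text"), StructuredAnswer("structured text"))
print(current_selection(both))
```

With both outputs populated the function returns the CoT text, which matches the "Using CoT-generated answer" log line reported above.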
Proposed Solution
Option 1: Check Flag in generation_stage.py (Recommended)
# In generation_stage.py, check config_metadata first.
# config_metadata is assumed to be available at this point; per the
# root cause analysis it may not be passed through yet.
structured_output_enabled = config_metadata.get("structured_output_enabled", False)
if structured_output_enabled and search_output.structured_answer:
    # User explicitly requested structured output - use it
    final_answer = search_output.structured_answer.answer
    logger.info("Using structured answer (user preference)")
elif search_output.cot_output and search_output.cot_output.final_answer:
    # Fall back to CoT if available
    final_answer = search_output.cot_output.final_answer
    logger.info("Using CoT-generated answer")
elif search_output.structured_answer:
    # Fall back to structured answer
    final_answer = search_output.structured_answer.answer
    logger.info("Using structured answer (fallback)")
Components Already Working ✅
- Frontend UI: CitationsAccordion component, toggle checkbox, MessageMetadataFooter
- Frontend API: apiClient.sendConversationMessage() sends correct payload structure
- Backend Orchestrator: Extracts and merges user config_metadata
- Backend Schemas: ConversationMessageInput accepts metadata with config_metadata
- WatsonX Provider: Structured output generation with JSON schema validation
Related Issues
- PR #626, "Implement Structured Output with JSON Schema Validation #604" - Structured output with JSON schema validation (parent PR)
- Issue #628, "Backend leaking internal processing data to frontend (10-20x payload bloat)" - Backend data leak (separate issue for payload bloat)
Priority
MEDIUM - Feature incomplete but not blocking other work. Citations infrastructure complete, just needs config propagation fix.
Estimated Effort
2-4 hours - Small change to generation_stage.py logic, testing, verification
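
The testing step could start from a unit check of the selection order in Option 1. This is a sketch under stated assumptions: `select_final_answer` and `make_output` are hypothetical helpers, and `SimpleNamespace` objects stand in for the real search output, carrying only the attributes used in the snippets above.

```python
from types import SimpleNamespace
from typing import Any, Optional


def select_final_answer(search_output: Any, config_metadata: dict[str, Any]) -> Optional[str]:
    """Pick the answer using the Option 1 order: requested structured, then CoT, then structured."""
    structured = getattr(search_output, "structured_answer", None)
    cot = getattr(search_output, "cot_output", None)
    if config_metadata.get("structured_output_enabled", False) and structured:
        return structured.answer
    if cot and cot.final_answer:
        return cot.final_answer
    if structured:
        return structured.answer
    return None


def make_output(cot: Optional[str] = None, structured: Optional[str] = None) -> SimpleNamespace:
    """Build a hypothetical search output holding only the fields the selector reads."""
    return SimpleNamespace(
        cot_output=SimpleNamespace(final_answer=cot) if cot else None,
        structured_answer=SimpleNamespace(answer=structured) if structured else None,
    )


def run_checks() -> None:
    both = make_output(cot="cot", structured="structured")
    # Flag set: the structured answer wins even though CoT output exists.
    assert select_final_answer(both, {"structured_output_enabled": True}) == "structured"
    # Flag absent: existing behavior is preserved and CoT wins.
    assert select_final_answer(both, {}) == "cot"
    # Flag set but no structured answer: fall back to CoT.
    assert select_final_answer(make_output(cot="cot"), {"structured_output_enabled": True}) == "cot"
    # No CoT output: the structured answer is used as the fallback.
    assert select_final_answer(make_output(structured="structured"), {}) == "structured"


run_checks()
```

The four cases cover the flag set and unset with each combination of available outputs, so a regression in the default CoT path would show up alongside the citations fix.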