Evaluate reranking: keep, optimize, or remove for production #667

@BjornMelin

Summary

Deep research and a codebase review to decide whether to disable or remove reranking for production, or to keep and refine it for quality. The outcome should finalize the reranking and embeddings strategy for launch.

Linked issue

Goals

  • Evaluate real-world quality impact vs cost/latency of reranking.
  • Decide keep, optimize, or remove reranking.
  • Lock final production settings and document the decision.

Scope of investigation

1) Codebase review (current reranking implementation)

  • src/lib/rag/reranker.ts (Together reranker + NoOp fallback)
  • src/lib/rag/retriever.ts (rerank flow + telemetry + fallback)
  • src/ai/tools/server/rag.ts and /api/rag/search handlers
  • Config + envs: TOGETHER_AI_API_KEY, reranker config schema (@schemas/rag)
  • Test coverage: src/lib/rag/__tests__/reranker.test.ts, src/lib/rag/__tests__/retriever.test.ts
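For orientation during the review, the "reranker + NoOp fallback" arrangement listed above could look roughly like the sketch below. Every name here (`Reranker`, `NoOpReranker`, `FallbackReranker`, `RerankCandidate`, `timeoutMs`, `onFallback`) is hypothetical and is not taken from src/lib/rag/reranker.ts; the real interfaces may differ.

```typescript
// Hypothetical shapes for illustration; the real types in src/lib/rag may differ.
interface RerankCandidate {
  id: string;
  text: string;
  score: number; // first-stage retrieval score
}

interface Reranker {
  rerank(query: string, candidates: RerankCandidate[], topN: number): Promise<RerankCandidate[]>;
}

// NoOp fallback: keeps the retriever's original order and only truncates to topN.
class NoOpReranker implements Reranker {
  async rerank(_query: string, candidates: RerankCandidate[], topN: number): Promise<RerankCandidate[]> {
    return candidates.slice(0, topN);
  }
}

// Wraps a primary reranker and switches to the fallback on error or timeout.
class FallbackReranker implements Reranker {
  private readonly primary: Reranker;
  private readonly fallback: Reranker;
  private readonly timeoutMs: number;
  private readonly onFallback: (reason: string) => void;

  constructor(
    primary: Reranker,
    fallback: Reranker,
    timeoutMs: number,
    onFallback: (reason: string) => void = () => {},
  ) {
    this.primary = primary;
    this.fallback = fallback;
    this.timeoutMs = timeoutMs;
    this.onFallback = onFallback;
  }

  async rerank(query: string, candidates: RerankCandidate[], topN: number): Promise<RerankCandidate[]> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error("rerank timeout")), this.timeoutMs);
    });
    try {
      // Whichever settles first wins: the primary result or the timeout rejection.
      return await Promise.race([this.primary.rerank(query, candidates, topN), timeout]);
    } catch (err) {
      // Report the reason (for example to telemetry), then degrade gracefully.
      this.onFallback(err instanceof Error ? err.message : String(err));
      return this.fallback.rerank(query, candidates, topN);
    } finally {
      clearTimeout(timer);
    }
  }
}
```

A wrapper of this kind keeps the search path working when the rerank provider is slow or unavailable, and the `onFallback` hook is one place where a fallback rate could be recorded for the analysis in section 2.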

2) Data + perf analysis

  • Identify current rerank usage patterns and latency (telemetry spans/events).
  • Measure impact on retrieval quality (offline eval or A/B) and end-user relevance.
  • Cost analysis: Together rerank usage per request vs value.
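One way to make the offline comparison concrete is to score the same labeled queries with and without reranking and to summarize rerank latency from the telemetry samples. The sketch below uses nDCG@k and a nearest-rank percentile as example metrics; the metric choice, the function names, and the shape of the relevance labels are all assumptions, not something the codebase is known to provide.

```typescript
// Discounted cumulative gain over the top k results.
// `relevance` maps doc id -> graded judgment (0 = irrelevant); the labels are hypothetical.
function dcgAtK(rankedIds: string[], relevance: Map<string, number>, k: number): number {
  return rankedIds.slice(0, k).reduce((sum, id, i) => {
    const gain = relevance.get(id) ?? 0;
    return sum + (Math.pow(2, gain) - 1) / Math.log2(i + 2);
  }, 0);
}

// nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
function ndcgAtK(rankedIds: string[], relevance: Map<string, number>, k: number): number {
  const idealIds = Array.from(relevance.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
  const ideal = dcgAtK(idealIds, relevance, k);
  return ideal === 0 ? 0 : dcgAtK(rankedIds, relevance, k) / ideal;
}

// Nearest-rank percentile, e.g. p = 95 for p95 latency in milliseconds.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}
```

Averaging `ndcgAtK` over the labeled queries for the baseline ranking and for the reranked ranking gives a quality delta that can be set against the latency percentiles and the per-request cost.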

3) Research

  • Review AI SDK rerank() guidance and current Together rerank model options.
  • Confirm any current best practices for batching, topN, or score thresholds.
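To make the topN and score-threshold knobs concrete, here is a minimal sketch of how they might be applied to reranker output. The `ScoredDoc` shape, the `selectReranked` name, and the `minKeep` guard are assumptions for illustration. Rerank scores are model-specific, so any threshold would need to be calibrated against the evaluation data rather than copied from here.

```typescript
// Hypothetical result shape; higher relevanceScore means more relevant.
interface ScoredDoc {
  id: string;
  relevanceScore: number;
}

interface SelectOptions {
  topN: number;     // maximum number of documents to return
  minScore: number; // drop documents scoring below this value
  minKeep?: number; // optional floor so a strict threshold never returns too few
}

function selectReranked(results: ScoredDoc[], opts: SelectOptions): ScoredDoc[] {
  // Sort a copy so the caller's array is not mutated.
  const sorted = [...results].sort((a, b) => b.relevanceScore - a.relevanceScore);
  const passing = sorted.filter((d) => d.relevanceScore >= opts.minScore);
  const floor = opts.minKeep ?? 0;
  // If the threshold removed too much, fall back to the best `floor` documents.
  const kept = passing.length >= floor ? passing : sorted.slice(0, floor);
  return kept.slice(0, opts.topN);
}
```

For example, `selectReranked(results, { topN: 5, minScore: 0.3, minKeep: 1 })` returns at most five documents and at least one whenever `results` is non-empty; the numbers are placeholders.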

Decision options

A) Remove reranking

  • Pros: lower cost/latency; simpler system.
  • Cons: potential drop in relevance/precision.

B) Keep reranking (as-is)

  • Pros: quality boost where needed; minimal change.
  • Cons: cost and latency persist.

C) Refine/optimize reranking

  • Examples: reduce topN, apply to specific queries only, threshold-based fallback, cache results, or model swap.
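Two of the option C ideas, applying reranking to specific queries only and caching results, can be sketched as below. This is one possible reading of those ideas, not an existing implementation: `shouldRerank`, `RerankCache`, `decisiveGap`, `minCandidates`, and the cache-key scheme are all hypothetical, and the gating heuristic (skip reranking when the first-stage top result is already well ahead) is an assumption that would itself need validation against the eval set.

```typescript
// First-stage retrieval candidate; higher score means more similar.
interface Candidate {
  id: string;
  score: number;
}

// Gate: rerank only when there are enough candidates and the top result is not clearly ahead.
function shouldRerank(
  candidates: Candidate[],
  opts: { minCandidates: number; decisiveGap: number },
): boolean {
  if (candidates.length < Math.max(2, opts.minCandidates)) return false;
  const sorted = [...candidates].sort((a, b) => b.score - a.score);
  const gap = sorted[0].score - sorted[1].score;
  return gap < opts.decisiveGap;
}

// Small TTL cache with oldest-first eviction, keyed by query plus candidate ids.
class RerankCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(ttlMs: number, maxEntries: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  // Normalizes the query and sorts ids so equivalent requests share a key.
  static key(query: string, ids: string[]): string {
    return JSON.stringify([query.trim().toLowerCase(), [...ids].sort()]);
  }

  get(key: string): T | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: T): void {
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      // Map preserves insertion order, so the first key is the oldest entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Both pieces reduce how often the paid rerank call is made, so their effect shows up directly in the cost and latency figures, while the quality impact of skipped reranks has to be checked separately.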

Expected deliverables

  • Written decision with evidence and reasoning.
  • Updated architecture/ADR note capturing final call.
  • If changes are needed, a concrete implementation plan and checklist.

Acceptance criteria

  • Decision documented with rationale and supporting data.
  • Clear production configuration for embeddings + reranking for release.
  • If reranking is modified or removed, the tests and docs that need updating are identified.

Notes

This issue focuses on decision-making and deep review; execution should follow in a separate implementation issue or PR.
