Commit b791605

mongodben, Ben Perlmutter, and mmeigs authored
[feature branch] Search Content API (Epic EAI-956) (#786)
* stub searchContent route
* EAI-973: Persist search queries and results in mongodb (#804)
* Create MongoDbSearchResultsStore, add limit to DefaultFindContent and add test for limit
* Implement saveSearchResult, create MongoDbSearchResultsStore.test.ts
* lint, format
* Check entire returned document in MongoDbSearchResultsStore.test.ts
* Create ResultChunk type and zod check
* Correct usage of limit in makeDefaultFindContent
* PR feedback: cast badSearchResultRecord as any
* Use unknown instead of any for ResultChunk additional metadata
* PR feedback: Combine describe blocks in MongoDbSearchResultsStore.test.ts, remove zod checks where unnecessary in MongoDbSearchResultsStore
* Combine describes
* EAI-970: Expose search endpoint (#810)
* Create MongoDbSearchResultsStore, add limit to DefaultFindContent and add test for limit
* Implement saveSearchResult, create MongoDbSearchResultsStore.test.ts
* lint, format
* Check entire returned document in MongoDbSearchResultsStore.test.ts
* Create ResultChunk type and zod check
* Correct usage of limit in makeDefaultFindContent
* PR feedback: cast badSearchResultRecord as any
* Starting structure of searchContent
* Use unknown instead of any for ResultChunk additional metadata
* Add searchContent test file, broaden QueryFilter & MongoDbAtlasVectorSearchFilter types
* Work on clarity of comments in contentRouter
* PR feedback: Combine describe blocks in MongoDbSearchResultsStore.test.ts, remove zod checks where unnecessary in MongoDbSearchResultsStore
* Use generics on middleware: requireRequestOrigin, requireValidIpAddress
* Structure out contentRouter test file
* Combine describes
* Clean
* makeFindContentWithMongoDbMetadata
* config.test.ts
* Clean
* PR feedback
* Correct types
* Correct test
* Fix test return type
* lint
* Revert move of classifyMongoDbProgrammingLanguageAndProduct, jest needs function outside of file to mock
* Remove unnecessary tests and comments
* (EAI-971) Add `searchContent` to openapi spec documentation (#823)
* Add searchContent pieces to openapi spec
* PR feedback
* (EAI-1159) Implement middleware in new `contentRouter` (#816)
* Create MongoDbSearchResultsStore, add limit to DefaultFindContent and add test for limit
* Implement saveSearchResult, create MongoDbSearchResultsStore.test.ts
* lint, format
* Check entire returned document in MongoDbSearchResultsStore.test.ts
* Create ResultChunk type and zod check
* Correct usage of limit in makeDefaultFindContent
* PR feedback: cast badSearchResultRecord as any
* Starting structure of searchContent
* Use unknown instead of any for ResultChunk additional metadata
* Add searchContent test file, broaden QueryFilter & MongoDbAtlasVectorSearchFilter types
* Work on clarity of comments in contentRouter
* PR feedback: Combine describe blocks in MongoDbSearchResultsStore.test.ts, remove zod checks where unnecessary in MongoDbSearchResultsStore
* Use generics on middleware: requireRequestOrigin, requireValidIpAddress
* Structure out contentRouter test file
* Combine describes
* Clean
* makeFindContentWithMongoDbMetadata
* config.test.ts
* Clean
* PR feedback
* Correct types
* Correct test
* Fix test return type
* lint
* Revert move of classifyMongoDbProgrammingLanguageAndProduct, jest needs function outside of file to mock
* Created addCustomData.ts, generics, use in both contentRouter and conversationRouter
* Clean
* Remove unnecessary tests and comments
* Added custom middleware to contentRouter, used in searchContent route, added to tests
* Add customData to db...
* Clean: allow undefined customData value
* Alter types for createConversationsMiddlewareReq
* Rerun tests
* PR feedback
* Add Locals types to middleware invocations
* Lint, fix trace name, remove unnecessary import
* (EAI-972) Add extra braintrust tracing to searchContent route (#822)
  Add extra braintrust tracing to searchContent route
* Use safely parsed req.body, handle possibly undefined dataSources
* (EAI-1173) Content sources endpoint (#833)
* Create listDataSources route, plug into searchContent, blank file created for listDatasources test file
* Clean sources endpoint and data types
* Add openapi spec documentation for new listDataSources endpoint
* listDataSources test file
* PR feedback and add isCurrent
* Clean yaml
* Rename version -> versions
* PR feedback
* ci
* Change embedding search index
* Use plain memory server
* Fix tests
* Make unused parameter obvious
* Use cache in listDataSources
* Add tests for cache
* Strong typing on atlas search filters (#860)
  Co-authored-by: Ben Perlmutter <[email protected]>
---------
Co-authored-by: Ben Perlmutter <[email protected]>
Co-authored-by: mmeigs <[email protected]>
1 parent 7dc67d0 commit b791605

40 files changed: +1715 -388 lines changed

docs/docs/server/openapi.yaml

Lines changed: 181 additions & 0 deletions
@@ -17,9 +17,104 @@ servers:
 security:
   - CustomHeaderAuth: []
 paths:
+  /content/search:
+    post:
+      operationId: searchContent
+      tags:
+        - Content
+      summary: Search content
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              type: object
+              properties:
+                query:
+                  type: string
+                  description: The search query string.
+                dataSources:
+                  type: array
+                  items:
+                    type: object
+                    properties:
+                      name:
+                        type: string
+                      type:
+                        type: string
+                      versionLabel:
+                        type: string
+                    required: [name]
+                  description: An array of data sources to search. If not provided, latest version of all data sources will be searched.
+                limit:
+                  type: integer
+                  minimum: 1
+                  maximum: 100
+                  default: 5
+                  description: The maximum number of results to return.
+              required:
+                - query
+      responses:
+        200:
+          description: OK
+          headers:
+            Content-Type:
+              schema:
+                type: string
+                example: application/json
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/SearchResponse"
+        400:
+          description: Bad Request
+          headers:
+            Content-Type:
+              schema:
+                type: string
+                example: application/json
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/responses/BadRequest"
+        500:
+          description: Internal Server Error
+          headers:
+            Content-Type:
+              schema:
+                type: string
+                example: application/json
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/responses/InternalServerError"
+
+  /content/sources:
+    get:
+      operationId: listDataSources
+      tags:
+        - Content
+      summary: List available data sources
+      description: Returns metadata about all available data sources.
+      responses:
+        200:
+          description: OK
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/ListDataSourcesResponse"
+        500:
+          description: Internal Server Error
+          content:
+            application/json:
+              schema:
+                $ref: "#/components/schemas/ErrorResponse"
+
   /conversations:
     post:
       operationId: createConversation
+      tags:
+        - Conversations
       summary: Start new conversation
       description: |
         Start a new conversation.
@@ -43,6 +138,8 @@ paths:
   /conversations/{conversationId}/messages:
     post:
       operationId: addMessage
+      tags:
+        - Conversations
       summary: Add message to the conversation
       tags:
         - Conversations
@@ -133,6 +230,8 @@ paths:
   # /conversations/{conversationId}:
   #   get:
   #     operationId: getConversation
+  #     tags:
+  #       - Conversations
   #     summary: Get a conversation
   #     parameters:
   #       - $ref: "#/components/parameters/conversationId"
@@ -152,6 +251,8 @@ paths:
   /conversations/{conversationId}/messages/{messageId}/rating:
     post:
       operationId: rateMessage
+      tags:
+        - Conversations
       summary: Rate message
       tags:
         - Conversations
@@ -176,6 +277,8 @@ paths:
   /conversations/{conversationId}/messages/{messageId}/comment:
     post:
       operationId: commentMessage
+      tags:
+        - Conversations
       summary: Add comment to assistant message
       tags:
         - Conversations
@@ -972,6 +1075,68 @@ components:
       required: [type, call_id, output, status]
     ErrorResponse:
       type: object
+    SearchResponse:
+      type: object
+      properties:
+        results:
+          type: array
+          items:
+            $ref: "#/components/schemas/Chunk"
+    Chunk:
+      type: object
+      properties:
+        url:
+          type: string
+          description: The URL of the search result.
+        title:
+          type: string
+          description: Title of the search result.
+        text:
+          type: string
+          description: Chunk text
+        metadata:
+          type: object
+          properties:
+            sourceName:
+              type: string
+              description: The name of the source.
+            sourceType:
+              type: string
+            tags:
+              type: array
+              items:
+                type: string
+          additionalProperties: true
+    ListDataSourcesResponse:
+      type: object
+      properties:
+        dataSources:
+          type: array
+          items:
+            $ref: "#/components/schemas/DataSourceMetadata"
+    DataSourceMetadata:
+      type: object
+      required:
+        - id
+      properties:
+        id:
+          type: string
+          description: The name of the data source.
+        versions:
+          type: array
+          items:
+            type: object
+            properties:
+              label:
+                type: string
+                description: Version label
+              isCurrent:
+                type: boolean
+                description: Whether this version is current active version.
+          description: List of versions for this data source.
+        type:
+          type: string
+          description: The type of the data source.
   parameters:
     conversationId:
       name: conversationId
@@ -987,3 +1152,19 @@ components:
       schema:
         type: string
       description: The unique identifier for a message.
+
+tags:
+  - name: Content
+    x-displayName: Search Content
+    description: Search MongoDB content
+  - name: Conversations
+    x-displayName: Conversations
+    description: Interact with MongoDB Chatbot
+
+x-tagGroups:
+  - name: Content
+    tags:
+      - Content
+  - name: Conversations
+    tags:
+      - Conversations
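
Reviewer note: the request schema added for `/content/search` above constrains `limit` to an integer between 1 and 100 with a default of 5, and requires only `name` on each `dataSources` entry. A minimal client-side sketch of those constraints follows. The helper `buildSearchContentRequest`, the interface names, and the example data source name are hypothetical and not part of this commit; the field names and bounds are taken from the spec.

```typescript
// Shapes mirroring the /content/search request body in openapi.yaml.
interface DataSourceFilter {
  name: string; // the only required field on a data source entry
  type?: string;
  versionLabel?: string;
}

interface SearchContentRequest {
  query: string;
  dataSources?: DataSourceFilter[];
  limit: number;
}

// Hypothetical helper (an assumption, not code from this commit): builds a
// request body and enforces the documented bounds on `limit`.
function buildSearchContentRequest(
  query: string,
  options: { dataSources?: DataSourceFilter[]; limit?: number } = {}
): SearchContentRequest {
  const limit = options.limit ?? 5; // spec default
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new RangeError(
      `limit must be an integer between 1 and 100, got ${limit}`
    );
  }
  const requestBody: SearchContentRequest = { query, limit };
  // Leave dataSources out entirely when the caller did not provide it; per the
  // spec description, the server then searches all data sources.
  if (options.dataSources !== undefined) {
    requestBody.dataSources = options.dataSources;
  }
  return requestBody;
}

// Usage: "docs" is a made-up data source name for illustration.
const exampleRequest = buildSearchContentRequest("aggregation pipeline", {
  dataSources: [{ name: "docs" }],
  limit: 10,
});
console.log(JSON.stringify(exampleRequest));
```

Validating on the client is optional, since the server is expected to reject out-of-range values with a 400; the sketch only makes the documented contract concrete.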

packages/chatbot-server-mongodb-public/src/config.ts

Lines changed: 34 additions & 16 deletions
@@ -17,9 +17,10 @@ import {
   AddCustomDataFunc,
   FilterPreviousMessages,
   makeDefaultFindVerifiedAnswer,
-  defaultCreateConversationCustomData,
-  defaultAddMessageToConversationCustomData,
   makeVerifiedAnswerGenerateResponse,
+  addDefaultCustomData,
+  ConversationsRouterLocals,
+  ContentRouterLocals,
   addMessageToConversationVerifiedAnswerStream,
   responsesVerifiedAnswerStream,
   type MakeVerifiedAnswerGenerateResponseParams,
@@ -34,8 +35,18 @@ import {
 import { redactConnectionUri } from "./middleware/redactConnectionUri";
 import path from "path";
 import express from "express";
-import { makeMongoDbPageStore, logger } from "mongodb-rag-core";
-import { wrapOpenAI, wrapTraced } from "mongodb-rag-core/braintrust";
+import {
+  makeMongoDbPageStore,
+  makeMongoDbSearchResultsStore,
+  logger,
+} from "mongodb-rag-core";
+import { createAzure, wrapLanguageModel } from "mongodb-rag-core/aiSdk";
+import {
+  makeBraintrustLogger,
+  BraintrustMiddleware,
+  wrapOpenAI,
+  wrapTraced,
+} from "mongodb-rag-core/braintrust";
 import { AzureOpenAI } from "mongodb-rag-core/openai";
 import { MongoClient } from "mongodb-rag-core/mongodb";
 import {
@@ -57,13 +68,9 @@ import {
   responsesApiStream,
   addMessageToConversationStream,
 } from "./processors/generateResponseWithTools";
-import {
-  makeBraintrustLogger,
-  BraintrustMiddleware,
-} from "mongodb-rag-core/braintrust";
 import { makeMongoDbScrubbedMessageStore } from "./tracing/scrubbedMessages/MongoDbScrubbedMessageStore";
 import { MessageAnalysis } from "./tracing/scrubbedMessages/analyzeMessage";
-import { createAzure, wrapLanguageModel } from "mongodb-rag-core/aiSdk";
+import { makeFindContentWithMongoDbMetadata } from "./processors/findContentWithMongoDbMetadata";
 import { makeMongoDbAssistantSystemPrompt } from "./systemPrompt";
 import { makeFetchPageTool } from "./tools/fetchPage";
 import { makeCorsOptions } from "./corsOptions";
@@ -129,6 +136,11 @@ export const embeddedContentStore = makeMongoDbEmbeddedContentStore({
   },
 });
 
+export const searchResultsStore = makeMongoDbSearchResultsStore({
+  connectionUri: MONGODB_CONNECTION_URI,
+  databaseName: MONGODB_DATABASE_NAME,
+});
+
 export const verifiedAnswerConfig = {
   embeddingModel: OPENAI_VERIFIED_ANSWER_EMBEDDING_DEPLOYMENT,
   findNearestNeighborsOptions: {
@@ -306,7 +318,7 @@ export const makeGenerateResponse = (args?: MakeGenerateResponseParams) =>
 
 export const createConversationCustomDataWithAuthUser: AddCustomDataFunc =
   async (req, res) => {
-    const customData = await defaultCreateConversationCustomData(req, res);
+    const customData = await addDefaultCustomData(req, res);
     if (req.cookies.auth_user) {
       customData.authUser = req.cookies.auth_user;
     }
@@ -350,11 +362,20 @@ export async function closeDbConnections() {
 logger.info(`Segment logging is ${segmentConfig ? "enabled" : "disabled"}`);
 
 export const config: AppConfig = {
+  contentRouterConfig: {
+    findContent: makeFindContentWithMongoDbMetadata({
+      findContent,
+      classifierModel: languageModel,
+    }),
+    searchResultsStore,
+    embeddedContentStore,
+    middleware: [requireValidIpAddress<ContentRouterLocals>(), requireRequestOrigin<ContentRouterLocals>()],
+  },
   conversationsRouterConfig: {
     middleware: [
       blockGetRequests,
-      requireValidIpAddress(),
-      requireRequestOrigin(),
+      requireValidIpAddress<ConversationsRouterLocals>(),
+      requireRequestOrigin<ConversationsRouterLocals>(),
       useSegmentIds(),
       redactConnectionUri(),
       cookieParser(),
@@ -363,10 +384,7 @@ export const config: AppConfig = {
       ? createConversationCustomDataWithAuthUser
       : undefined,
     addMessageToConversationCustomData: async (req, res) => {
-      const defaultCustomData = await defaultAddMessageToConversationCustomData(
-        req,
-        res
-      );
+      const defaultCustomData = await addDefaultCustomData(req, res);
       const customData = {
         ...defaultCustomData,
       };
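
Reviewer note: the config change above passes a locals type to the shared middleware (`requireValidIpAddress<ContentRouterLocals>()` and `requireValidIpAddress<ConversationsRouterLocals>()`), so one factory can serve both routers. The sketch below illustrates that pattern with dependency-free stand-ins for the Express types. Everything in it is an assumption for illustration: `requireIp`, `MinimalRequest`, `MinimalResponse`, `FakeResponse`, and the two locals interfaces are hypothetical, and the real middleware and `ContentRouterLocals` / `ConversationsRouterLocals` definitions are not shown in this diff.

```typescript
// Minimal stand-ins for Express request/response types (assumptions).
interface MinimalRequest {
  ip?: string;
}

interface MinimalResponse<Locals> {
  locals: Locals;
  status(code: number): MinimalResponse<Locals>;
  json(body: unknown): void;
}

type Middleware<Locals> = (
  req: MinimalRequest,
  res: MinimalResponse<Locals>,
  next: () => void
) => void;

// Hypothetical locals shapes, one per router.
interface ContentLocals {
  customData: Record<string, unknown>;
}
interface ConversationsLocals {
  customData: Record<string, unknown>;
  conversationId?: string;
}

// Generic factory: the logic ignores Locals, but the returned middleware is
// typed for whichever router it is installed on.
function requireIp<Locals>(): Middleware<Locals> {
  return (req, res, next) => {
    // Simplified check for the sketch: only requires a non-empty string.
    if (typeof req.ip !== "string" || req.ip.length === 0) {
      res.status(400).json({ error: "Invalid IP address" });
      return;
    }
    next();
  };
}

// Test double that records what the middleware did.
class FakeResponse<Locals> implements MinimalResponse<Locals> {
  locals: Locals;
  statusCode = 200;
  body: unknown = undefined;
  constructor(locals: Locals) {
    this.locals = locals;
  }
  status(code: number): MinimalResponse<Locals> {
    this.statusCode = code;
    return this;
  }
  json(body: unknown): void {
    this.body = body;
  }
}

// The same factory typed for each router's locals.
const contentMiddleware: Middleware<ContentLocals>[] = [requireIp<ContentLocals>()];
const conversationsMiddleware: Middleware<ConversationsLocals>[] = [
  requireIp<ConversationsLocals>(),
];

// Usage: a request without an IP is rejected before next() runs.
const res = new FakeResponse<ContentLocals>({ customData: {} });
let nextCalls = 0;
contentMiddleware[0]({ ip: "" }, res, () => {
  nextCalls += 1;
});
console.log(res.statusCode, nextCalls);
```

The type parameter carries no runtime cost; its purpose in a setup like this is to let the compiler check that each router's middleware array and handlers agree on the shape of `res.locals`.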
