mikechao/company-researcher

Company Researcher Agent

An AI Agent that will search the web for information based on a company name and report schema provided by the user.

This is partially a port of the Python project from LangChain AI.

Currently live at https://company-researcher-orpin.vercel.app/

How it works

The Company Researcher Agent leverages LangGraph in the backend to execute the research task while using NuxtJS on the frontend to convey results and progress to the user.

The graph constructed with LangGraph

[Figure: the LangGraph graph]

  1. We start with the generateQueries node, where we ask the LLM to generate search queries related to the company name and report schema provided by the user.
  2. We then move on to the researchCompany node, where the search queries are executed with the Tavily API. We then ask the LLM to take notes on the parts of the search results that are relevant to the report schema.
  3. The gatherNotesExtractSchema node is the next step, where we ask the LLM to fill in the report schema provided by the user using the notes taken in the previous step.
  4. In the reflection step we ask the LLM to review the filled-in report schema and evaluate whether it is satisfactory. This is determined by three questions: Are any required fields missing? Are any fields incomplete or containing uncertain information? Are there fields with placeholder values or "unknown" markers? If the reflection is satisfactory, a reason is provided and the results are shown to the user. If it is not satisfactory, 1 to 3 search queries are provided to fill in the missing fields and we go back to the researchCompany node to start the process again.
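The reflection decision and the loop back to researchCompany can be sketched without any LLM call. This is only an illustration: in the project the evaluation is done by the LLM, and the types `ReportSchema`, `Report`, and `Reflection`, the placeholder list, and the functions `reflect` and `route` are all hypothetical names made up for this example.

```typescript
// Hypothetical shapes; the real project defines its own state and schema types.
type ReportSchema = { required: string[] };
type Report = Record<string, unknown>;

interface Reflection {
  isSatisfactory: boolean;
  reason: string;
  missingFields: string[];
}

// Strings treated as placeholders (an assumption; the project asks the LLM to judge this).
const PLACEHOLDERS = new Set(["", "unknown", "n/a", "tbd"]);

function isPlaceholder(value: unknown): boolean {
  if (value === null || value === undefined) return true;
  if (typeof value === "string") return PLACEHOLDERS.has(value.trim().toLowerCase());
  if (Array.isArray(value)) return value.length === 0;
  return false;
}

// A required field counts as missing when it is absent or holds a placeholder value.
function reflect(schema: ReportSchema, report: Report): Reflection {
  const missingFields = schema.required.filter((field) => isPlaceholder(report[field]));
  if (missingFields.length === 0) {
    return { isSatisfactory: true, reason: "All required fields are filled in.", missingFields };
  }
  return {
    isSatisfactory: false,
    reason: `Missing or placeholder fields: ${missingFields.join(", ")}`,
    missingFields,
  };
}

type Route = { next: "end" } | { next: "researchCompany"; queries: string[] };

// Finish when satisfactory; otherwise return to researchCompany with 1 to 3 follow-up queries.
function route(company: string, reflection: Reflection): Route {
  if (reflection.isSatisfactory) return { next: "end" };
  const queries = reflection.missingFields.slice(0, 3).map((field) => `${company} ${field}`);
  return { next: "researchCompany", queries };
}
```

Capping the follow-up queries at three mirrors the "1 to 3 search queries" described in the reflection step; how the real queries are worded is up to the LLM.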

Making it work in a serverless environment

When deploying to Vercel we need to take into account the time limit placed on the execution of HTTP endpoint calls. Vercel imposes these limits to prevent runaway consumption of resources. The default limit is 10 seconds, configurable up to 60 seconds, so there is a good chance that executing the LangGraph graph as constructed above would exceed it.

Because of these limits on execution time, I introduced a new node in the graph that uses interrupt (see LangGraph Interrupt) to wait for feedback from the "user". The "user" in this case is just the frontend, which programmatically sends a message to continue execution of the graph. Each node in the previous graph now routes to the waitForResponse node, where interrupt is used.
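The idea of running one node per request and pausing in between can be sketched as a small state machine. This sketch does not use LangGraph's actual interrupt API; it only models the pattern. The `Checkpoint` type, the in-memory `checkpoints` map, and the functions `handleRequest` and `runToCompletion` are hypothetical, and a real deployment would need durable storage rather than a `Map`.

```typescript
// Node names follow the README; pausing at waitForResponse is modelled as "save and return".
type NodeName =
  | "generateQueries"
  | "researchCompany"
  | "gatherNotesExtractSchema"
  | "reflection"
  | "end";

interface Checkpoint {
  next: NodeName; // node to run on the next request
  log: string[];  // nodes executed so far
}

// Hypothetical in-memory store keyed by thread id (an assumption for this sketch).
const checkpoints = new Map<string, Checkpoint>();

function nextNode(current: NodeName, satisfied: boolean): NodeName {
  switch (current) {
    case "generateQueries": return "researchCompany";
    case "researchCompany": return "gatherNotesExtractSchema";
    case "gatherNotesExtractSchema": return "reflection";
    case "reflection": return satisfied ? "end" : "researchCompany";
    case "end": return "end";
  }
}

// Handles one short-lived request: run exactly one node, save the checkpoint, and return.
// `satisfied` stands in for the outcome of the reflection step.
function handleRequest(threadId: string, satisfied: boolean = true): Checkpoint {
  const prev: Checkpoint = checkpoints.get(threadId) ?? { next: "generateQueries", log: [] };
  if (prev.next === "end") return prev;
  const ran = prev.next;
  const updated: Checkpoint = { next: nextNode(ran, satisfied), log: [...prev.log, ran] };
  checkpoints.set(threadId, updated);
  return updated;
}

// The frontend's role: keep sending "continue" until the graph reports that it is done.
function runToCompletion(threadId: string): string[] {
  let checkpoint = handleRequest(threadId);
  while (checkpoint.next !== "end") {
    checkpoint = handleRequest(threadId);
  }
  return checkpoint.log;
}
```

Because each call to `handleRequest` does the work of a single node, no individual request has to outlive the platform's time limit, at the cost of one round trip per node.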

The resulting graph

[Figure: the resulting graph]

Sending progress and intermediate results

I used a combination of LangChain/LangGraph's stream events API (see LangGraph Stream Events) and custom events to display progress to the user. When a custom event is surfaced in the ReadableStream returned from the backend, I use the Vercel AI SDK stream protocol to send the data to the frontend, where it is processed.
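As a rough sketch of what sending a custom event to the frontend could look like, the helpers below encode and decode a single progress event. They assume the data-part framing of the AI SDK data stream protocol as documented for AI SDK 4.x (the prefix `2:`, a JSON array, then a newline); check this against the SDK version you use. The `ProgressEvent` shape and both function names are hypothetical and not taken from the project.

```typescript
// Hypothetical event shape; the project defines its own custom events.
interface ProgressEvent {
  node: string;
  message: string;
}

// Encodes one custom event as a data part: "2:" + JSON array + newline (assumed framing).
function encodeDataPart(event: ProgressEvent): string {
  return `2:${JSON.stringify([event])}\n`;
}

// Frontend-side counterpart: picks the data parts out of a received chunk.
// JSON.stringify escapes newlines inside strings, so splitting on "\n" is safe here.
function decodeDataParts(chunk: string): ProgressEvent[] {
  const events: ProgressEvent[] = [];
  for (const line of chunk.split("\n")) {
    if (line.startsWith("2:")) {
      const parsed = JSON.parse(line.slice(2)) as ProgressEvent[];
      events.push(...parsed);
    }
  }
  return events;
}
```

In practice the SDK's own helpers would normally handle this framing; the sketch only shows why a line-oriented format lets progress updates be interleaved with other parts of the same stream.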

🛠️ Installation Steps

  1. Get an Anthropic API Key
  2. Get a Tavily API Key
  3. Get a Postgres URL
  4. Create a .env file by following the env example
  5. Install project dependencies
     pnpm install
  6. Start the development server on http://localhost:3000
     pnpm dev

👷 Built with

| Name | Usage |
| --- | --- |
| NuxtJS | Building pages, interactions and server APIs |
| TypeScript | Static typing, better autocompletion |
| Pinia | Management of intermediate research results |
| LangChain | Integration with LLMs, managing memory and prompts |
| LangGraph | Orchestrating the research workflow |
| pnpm | Managing JavaScript packages |
| Postgres | Persistence of chat messages |
| Tailwind CSS | CSS styling |
| Vite | Build tool |
| Visual Studio Code | Code editor |
| Vercel | App hosting and useChat composable |
| Zod | Defining structured output for LLMs, form validation, REST endpoint input validation |

🚧 Issues

  • In the Python version from LangChain, they are able to specify an Input, Overall and Output State for the graph. However, in the JavaScript/TypeScript version of LangGraph this causes a type mismatch error. The same type mismatch error exists in their multi-schema example and is also shown in multi.post.ts. A GitHub issue is tracking this, but there doesn't seem to be much activity.
