Ensure you have `uv` and Python 3.11 installed. The `.python-version` file tells `uv` which Python version to install and use when running this project. Later Python versions cause `unstructured-inference` to throw errors (tested on 3.13).
- Create a virtual environment and activate it:

  ```bash
  uv venv
  source .venv/bin/activate
  ```

- Copy `.env.example` into `.env` and fill in the required environment variables:

  ```bash
  cp .env.example .env
  nano .env
  ```

  - `OPENAI_API_KEY`: Your OpenAI API key
  - `MONGO_URI`: MongoDB connection string (e.g., `mongodb://localhost:27017`)
  - `MONGO_DB_NAME`: Database name (default: `agent_workflow`)
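For reference, here is a minimal sketch of how these variables might be loaded with `pydantic-settings`; the class and module layout are illustrative, not necessarily what this project uses:

```python
# Illustrative settings loader; the project's actual config module may differ.
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Reads OPENAI_API_KEY, MONGO_URI, and MONGO_DB_NAME from .env."""

    model_config = SettingsConfigDict(env_file=".env")

    OPENAI_API_KEY: str
    MONGO_URI: str = "mongodb://localhost:27017"
    MONGO_DB_NAME: str = "agent_workflow"


settings = Settings()  # raises a validation error if OPENAI_API_KEY is missing
```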
- Unstructured has many required dependencies to get OCR and document extraction working.
  - `poppler` (for PDFs):

    ```bash
    choco install poppler           # Windows
    brew install poppler            # macOS
    apt install -y poppler-utils    # Debian/Ubuntu
    ```

  - `libmagic`, `libreoffice`, `pandoc`, and `tesseract` are the other required libraries, and installing the Python wrappers may or may not suffice. If all else fails, use the Docker installation.
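To verify that the system dependencies are wired up, a quick smoke test with `unstructured` can help (the file name is a placeholder):

```python
# Smoke test: PDF partitioning requires poppler; OCR paths require tesseract.
from unstructured.partition.pdf import partition_pdf

# "sample.pdf" is a placeholder; point this at any local PDF.
elements = partition_pdf(filename="sample.pdf")
for element in elements[:5]:
    print(type(element).__name__, "->", str(element)[:80])
```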
- Run the server for development:

  ```bash
  uv run fastapi dev
  ```

You can run the test suite using pytest with the `--dev` dependencies installed:

```bash
pytest -v
```

Swagger docs can be found at the `/docs` endpoint, e.g. http://localhost:8000/docs.
You can also use the Postman collection.
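As an illustration, a minimal test of the kind that could live under `tests/`, using FastAPI's `TestClient` (this assumes the app object is exposed as `app` in `app/main.py` and that MongoDB is reachable):

```python
# Illustrative test; the real suite may be structured differently.
from fastapi.testclient import TestClient

from app.main import app

client = TestClient(app)


def test_list_agents():
    # GET /agents should return a list (possibly empty on a fresh database).
    response = client.get("/agents")
    assert response.status_code == 200
    assert isinstance(response.json(), list)
```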
With Docker Compose installed, you can start the project with:

```bash
docker compose up --build
```

- `GET /agents` - Get all agents
- `POST /agents` - Create a new research agent
- `GET /agents/{agent_id}` - Retrieve agent details
- `DELETE /agents/{agent_id}` - Delete an agent
- `POST /agents/{agent_id}/queries` - Send research queries to agent
- `PUT /agents/{agent_id}/websites` - Add website content to agent's knowledge base
- `PUT /agents/{agent_id}/files` - Add file content to agent's knowledge base
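For a quick end-to-end check of these endpoints, here is a hedged example with `httpx`; the query body shape and the create-agent response shape are assumptions based on the notes below:

```python
# Illustrative client session against a local instance; paths come from the
# endpoint list above, but payload/response shapes are assumptions.
import httpx

with httpx.Client(base_url="http://localhost:8000") as client:
    # Agent creation takes form fields (see the notes on POST /agents below).
    created = client.post(
        "/agents",
        data={"name": "demo", "prompt": "You are a research assistant."},
    )
    agent_id = created.json()  # assumed: the endpoint returns the document ID

    # Send a research query to the new agent (body shape is an assumption).
    answer = client.post(f"/agents/{agent_id}/queries", json={"query": "What is uv?"})
    print(answer.json())
```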
- Added endpoint to view all agents (not in spec)
- Agent creation (`POST /agents`)
  - Instead of supplying an object/string in `agent_post`, the request form can be supplied with `name`, `prompt` (both `str`), `files` (`list[UploadFile]`), and `websites` (`list[str]`). None of them are required. See the Swagger docs for more info.
  - Returns the ID of the agent document
- Knowledge base
  - Uses SHA-256 hashing to prevent duplicate file processing. The files are stored as ID references in `agent.files`. If a file has been uploaded before, the previous copy's ID will be appended to the agent's file list (see the sketch after this list).
  - Similarly, websites are cleaned with `courlan` and the URLs stored in `File.name` to prevent saving duplicate links.
  - Adds a `created_at` field to track when a file was added
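The deduplication described above might look roughly like this; the `File` fields follow the notes, but the helper itself is hypothetical and not the project's actual code:

```python
# Hypothetical sketch of SHA-256 file deduplication; field names follow the
# notes above, but this is not the project's actual implementation.
import hashlib
from datetime import datetime, timezone

from beanie import Document


class File(Document):
    name: str
    hash: str
    created_at: datetime


async def add_file_to_agent(agent, filename: str, content: bytes) -> File:
    digest = hashlib.sha256(content).hexdigest()

    # Reuse the previously stored copy if this exact content was seen before.
    existing = await File.find_one(File.hash == digest)
    if existing is None:
        existing = File(name=filename, hash=digest, created_at=datetime.now(timezone.utc))
        await existing.insert()

    # Agents hold only ID references to files, never the raw content.
    if existing.id not in agent.files:
        agent.files.append(existing.id)
        await agent.save()
    return existing
```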
```
.
├── app/              # Main application directory
│   ├── agents/       # Agent implementations and behaviors
│   ├── api/          # FastAPI route definitions
│   ├── models/       # Database models (Beanie/MongoDB)
│   ├── schemas/      # Pydantic schemas for request/response
│   ├── services/     # Business logic and service layer
│   └── main.py       # FastAPI application entry point
├── docs/             # Project documentation
├── tests/            # Test suite
└── README.md         # Project documentation
```
- `File` and `Website` are defined within the same model, but they should be stored separately. The website scraping logic is currently rudimentary and should be further expanded on in the future, such as by only using `NarrativeText`-type elements or removing large whitespace.
- OCR and text extraction are supported by `unstructured` but can be tweaked further with individual logic per file type.
- While uploading files/websites, if the total number of tokens in an agent's knowledge base exceeds 120k, an error is raised. GPT-4o-mini's max input context window is 128k, so the 8k buffer is reasonable. Since agents can be supplied custom prompts that may exceed 8k tokens, the custom prompt length should be taken into account as well (a sketch of the check follows this list).
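A minimal sketch of that token-budget check, assuming `tiktoken` with the `o200k_base` encoding used by GPT-4o-mini (the function name and error handling are illustrative):

```python
# Hypothetical token-budget guard; 120_000 is the limit described above.
import tiktoken

MAX_KNOWLEDGE_TOKENS = 120_000
# GPT-4o-mini uses the o200k_base encoding.
ENCODING = tiktoken.get_encoding("o200k_base")


def check_token_budget(knowledge_texts: list[str], custom_prompt: str = "") -> int:
    """Raise if the knowledge base (plus the custom prompt) exceeds the budget."""
    total = sum(len(ENCODING.encode(text)) for text in knowledge_texts)
    # Counting the custom prompt too, as suggested above.
    total += len(ENCODING.encode(custom_prompt))
    if total > MAX_KNOWLEDGE_TOKENS:
        raise ValueError(
            f"Knowledge base too large: {total} tokens > {MAX_KNOWLEDGE_TOKENS}"
        )
    return total
```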