InsightGraph is a local-first AI career and market intelligence workspace for technical job seekers, researchers, and builders. It ingests web pages, job posts, company pages, PDFs, GitHub-style sources, and pasted documents, then turns them into cited RAG answers, a knowledge graph, source trust controls, company dossiers, skill velocity charts, watchlists, and an application shortlist.
The current product wedge is practical and narrow: help a user answer which companies to track, which skills are rising, which roles are showing up, what changed recently, and whether the sources behind an answer are trustworthy.
- Crawl and ingest real sources from topic search, URLs, and pasted text.
- Extract clean chunks, entities, relationships, skills, roles, companies, tools, locations, and evidence snippets.
- Store chunks in Qdrant for semantic retrieval.
- Store structured data in Postgres and graph relationships for exploration.
- Provide cited chat answers with retrieved evidence and confidence metadata.
- Show an interactive graph with edge evidence.
- Score source reliability and allow trust, neutral, or untrusted source overrides.
- Build AI career market maps from indexed evidence.
- Show company dossiers with hiring signals, stack, skills, sources, and evidence.
- Compare pasted resume text against market demand.
- Track skill velocity from recent ingestions.
- Save watchlists and generate change briefs.
- Recommend and shortlist jobs from indexed evidence.
- Support local OSS LLM defaults and optional Anthropic Claude generation with a user-provided key.
These examples were captured from a local run after ingesting AI/RAG hiring and market sources. They are sample outputs, not static mock data.
InsightGraph is not a magic job search engine and does not guarantee employment outcomes. Its quality depends on source quality. Targeted company career pages, real job posts, GitHub repos, and trusted articles produce much better results than broad noisy job-board pages.
Scheduled watchlists are not fully automated yet; manual watchlist refresh works now. Claude is optional and only used for chat/generation, not embeddings.
| Layer | Technology |
|---|---|
| Web UI | Next.js App Router, TypeScript, Tailwind CSS, Cytoscape.js |
| API | FastAPI, Pydantic, SQLAlchemy, Alembic |
| Worker | Celery, Redis |
| Crawling | Crawl4AI, Playwright-compatible crawler stack |
| Search discovery | SearxNG by default, optional Brave, Tavily, SerpAPI |
| Database | Postgres |
| Vector search | Qdrant |
| Graph store | Memgraph |
| Object storage | MinIO |
| Observability | Phoenix / OpenTelemetry |
| Local AI | Ollama default, vLLM optional |
| Paid generation | Anthropic Claude Messages API, optional |
.
+-- apps/web # Single local Next.js TypeScript web app
+-- services/api # FastAPI backend, schemas, providers, RAG, migrations
+-- services/worker # Celery ingestion worker
+-- infra # Docker Compose, Dockerfiles, service config
+-- pipelines # Future Airflow pipeline scaffold
+-- docs/assets # README and project visual assets
+-- package.json # Root scripts
+-- .env.example # Local environment template
- macOS, Linux, or Windows with WSL2.
- Node.js 20 or newer.
- npm.
- Docker Desktop or Docker Engine.
- Optional: Ollama if you want local model generation outside Docker.
- Optional: Anthropic API key if you want Claude responses.
- Optional: Brave, Tavily, or SerpAPI key if you want hosted search in addition to SearxNG.
Check basics:
node --version
npm --version
docker --version
docker compose version
From the project root:
cd "/Users/karanchandradey/Downloads/AI Portfolio InsightGraph"
npm install
cp .env.example .env
Start the backend stack:
npm run dev:stack
This starts Postgres, Redis, Qdrant, Memgraph, MinIO, Phoenix, SearxNG, FastAPI, and the Celery worker.
In a second terminal, start the single local Next.js web app:
npm run dev
Open:
http://localhost:3001
| Service | URL |
|---|---|
| Web UI | http://localhost:3001 |
| API | http://localhost:8000 |
| API docs | http://localhost:8000/docs |
| Qdrant | http://localhost:6333 |
| Memgraph Bolt | localhost:7687 |
| MinIO console | http://localhost:9001 |
| Phoenix | http://localhost:6006 |
| SearxNG | http://localhost:8080 |
MinIO default local credentials:
user: insightgraph
password: insightgraph-secret
Copy the template:
cp .env.example .env
Important defaults:
WEB_ORIGIN=http://localhost:3001
SEARCH_PROVIDER=searxng,brave,tavily,serpapi
SEARXNG_URL=http://searxng:8080
DEFAULT_LLM_PROVIDER=ollama
OLLAMA_MODEL=qwen2.5:7b-instruct
OLLAMA_EMBEDDING_MODEL=embeddinggemma
For local development, the API and worker run inside Docker, so service URLs point to Docker service names such as postgres, qdrant, and searxng.
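The SEARCH_PROVIDER default is a comma-separated list, which suggests providers are tried in order. The sketch below shows one way such a list might be resolved. It is an illustration, not the project's code: the function name `usable_providers`, the key-variable mapping, and the rule "skip a hosted provider that has no key" are all assumptions.

```python
# Hypothetical mapping from provider name to the env var holding its API key.
# SearxNG runs locally, so this sketch assumes it needs no key.
PROVIDER_KEY_VARS = {
    "searxng": None,
    "brave": "BRAVE_SEARCH_API_KEY",
    "tavily": "TAVILY_API_KEY",
    "serpapi": "SERPAPI_API_KEY",
}


def usable_providers(env: dict[str, str]) -> list[str]:
    """Return providers named in SEARCH_PROVIDER, in order, that look configured."""
    raw = env.get("SEARCH_PROVIDER", "searxng")
    ordered = [name.strip().lower() for name in raw.split(",") if name.strip()]
    usable = []
    for name in ordered:
        if name not in PROVIDER_KEY_VARS:
            continue  # unknown names are skipped in this sketch
        key_var = PROVIDER_KEY_VARS[name]
        # Keep key-less providers, and hosted providers whose key is set.
        if key_var is None or env.get(key_var):
            usable.append(name)
    return usable
```

Calling `usable_providers(dict(os.environ))` after `import os` would apply the same logic to the real environment.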
Terminal 1:
npm run dev:stack
Terminal 2:
npm run dev
The web app uses the same-origin /v1/* gateway and proxies to FastAPI when the backend is running.
The Docker web image exists, but it is not used by default because the project should have one local Next.js app during development.
Run the containerized web profile only when intentionally testing the web image:
npm run dev:stack:web
To stop the stack:
docker compose -f infra/docker-compose.yml down
To stop and remove local Docker volumes, which deletes indexed data:
docker compose -f infra/docker-compose.yml down -v
Use the left sidebar.
You can ingest:
- A topic, such as "AI infra startups hiring RAG engineers".
- Specific URLs.
- Pasted source text.
Click Queue ingestion, then open the Jobs tab to watch progress.
What happens:
- Search discovery finds sources when a topic is provided.
- The worker crawls and extracts text.
- Content is converted to markdown-like clean text.
- Documents are chunked.
- Chunks are embedded and stored in Qdrant.
- Entities and relationships are extracted.
- Graph evidence and trust metadata are stored.
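The chunking step above can be pictured with a small sketch. This is not the worker's actual chunker: the function name `chunk_text`, the character-based windows, and the default sizes are assumptions chosen only to illustrate overlapping chunks.

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows (illustrative only)."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    cleaned = " ".join(text.split())  # collapse runs of whitespace
    chunks = []
    start = 0
    step = max_chars - overlap  # how far each new window advances
    while start < len(cleaned):
        chunks.append(cleaned[start:start + max_chars])
        if start + max_chars >= len(cleaned):
            break  # the last window already reached the end of the text
        start += step
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.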
Open the Career tab.
You can:
- Save a target profile.
- Set target role, keywords, preferred stack, locations, and resume text.
- Inspect skill velocity.
- Open company dossiers.
- See hiring roles, stack, skills, locations, sources, and evidence.
- Run resume-to-market gap scoring.
- Get evidence-backed project recommendations.
Good target profile examples:
Target role: AI/RAG engineer
Keywords: RAG, agent, retrieval, evaluation, LLM infra
Preferred stack: Qdrant, LangGraph, FastAPI, Docker, OpenTelemetry
Locations: Remote, Bangalore, San Francisco
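As a rough picture of resume-to-market gap scoring, the sketch below splits a list of demanded skills into matched and missing. It is a simplification under stated assumptions: the function name `skill_gap` is hypothetical, matching is exact and single-token, and the real scoring presumably draws on indexed evidence rather than a fixed list.

```python
import re


def skill_gap(resume_text: str, demanded_skills: list[str]) -> dict[str, list[str]]:
    """Split demanded skills into those found in the resume and those missing."""
    tokens = set()
    # Split on anything that is not a letter, digit, '+', '#', or '.'.
    for raw in re.split(r"[^A-Za-z0-9+#.]+", resume_text):
        token = raw.strip(".").lower()  # drop trailing dots such as "Docker..."
        if token:
            tokens.add(token)
    matched, missing = [], []
    for skill in demanded_skills:
        if skill.lower() in tokens:
            matched.append(skill)
        else:
            missing.append(skill)
    return {"matched": matched, "missing": missing}
```

Multi-word skills such as "LLM infra" would never match under this token approach. A fuller version would need phrase matching or embeddings.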
Open the Applications tab.
You can:
- Review job recommendations scored from indexed evidence.
- See matched and missing skills.
- Add jobs to a persistent shortlist.
- Track status: tracking, applied, interviewing, rejected, archived.
This is intentionally not an auto-apply tool. It is a decision workspace.
Open the Chat tab and ask questions like:
Show me companies hiring for agentic RAG roles, compare their stack, and map skills I need.
The answer includes:
- Provider and model used.
- Retrieved evidence.
- Citation links.
- Source reliability.
- Trust status.
- Retrieval strategy and confidence metadata.
Open the Graph tab.
Use it to inspect:
- Companies.
- Roles.
- Tools.
- Skills.
- Locations.
- Relationships.
- Evidence snippets behind graph edges.
Open the Sources tab.
You can mark each source as:
- trusted
- neutral
- untrusted
Trust status affects source reliability scoring and retrieval ranking.
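One way trust status could feed into ranking is a multiplier on the retrieval score. The sketch below is an assumption, not the project's formula: the weights, the `Hit` class, and the `rerank` function are made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical multipliers; the real weighting is not documented in this README.
TRUST_WEIGHTS = {"trusted": 1.2, "neutral": 1.0, "untrusted": 0.5}


@dataclass
class Hit:
    source_url: str
    similarity: float  # vector similarity from retrieval, assumed to lie in [0, 1]
    trust_status: str = "neutral"


def rerank(hits: list[Hit]) -> list[Hit]:
    """Order hits by similarity scaled by the trust weight of their source."""
    return sorted(
        hits,
        key=lambda h: h.similarity * TRUST_WEIGHTS.get(h.trust_status, 1.0),
        reverse=True,
    )
```

With these weights, an untrusted source at similarity 0.9 ranks below a trusted one at 0.7, which is the kind of effect a trust override is meant to have.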
Open the Watchlists tab.
You can:
- Save a market topic.
- Run a manual refresh.
- Generate a change brief.
- See new companies, roles, skills, documents, and weekly skill velocity.
Example watchlist:
Name: Agentic RAG hiring
Topic: AI infra startups hiring RAG agent engineers
Pages: 20
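Conceptually, a change brief reports what appears in the latest refresh that was absent from the previous one. The sketch below assumes each refresh can be summarized as sets of names per category. The function name `change_brief` and the snapshot shape are hypothetical.

```python
def change_brief(
    previous: dict[str, set[str]], current: dict[str, set[str]]
) -> dict[str, list[str]]:
    """List items present in the current snapshot but not in the previous one."""
    brief = {}
    for category, items in current.items():
        seen_before = previous.get(category, set())
        brief[category] = sorted(items - seen_before)  # sorted for stable output
    return brief


# Placeholder names, not real indexed data.
before = {"companies": {"ExampleCo"}, "skills": {"RAG"}}
after = {"companies": {"ExampleCo", "SampleLabs"}, "skills": {"RAG", "LangGraph"}}
new_items = change_brief(before, after)
```

Here `new_items` holds only the additions: one new company and one new skill.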
The app is designed to run without paid APIs. If no generation provider works, chat falls back to extractive answers from indexed evidence.
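The fallback behavior can be sketched as a chain of generators with an extractive last resort. This is illustrative only: the function names, the `(name, callable)` pairing, and the snippet formatting are assumptions, not the API's actual implementation.

```python
from typing import Callable

# A generator takes (question, evidence snippets) and returns an answer string.
Generator = Callable[[str, list[str]], str]


def extractive_answer(question: str, evidence: list[str], max_snippets: int = 3) -> str:
    """Fallback: quote the top retrieved snippets instead of generating text."""
    if not evidence:
        return "No indexed evidence found."
    top = evidence[:max_snippets]
    return "\n".join(f"[{i}] {snippet}" for i, snippet in enumerate(top, start=1))


def answer(
    question: str, evidence: list[str], generators: list[tuple[str, Generator]]
) -> tuple[str, str]:
    """Try each generator in order; fall back to extractive output if all fail."""
    for name, generate in generators:
        try:
            return name, generate(question, evidence)
        except Exception:
            continue  # provider unavailable or errored; try the next one
    return "extractive", extractive_answer(question, evidence)
```

Returning the provider name alongside the text lets the caller show which path produced the answer.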
Ollama settings:
DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_MODEL=qwen2.5:7b-instruct
OLLAMA_EMBEDDING_MODEL=embeddinggemma
Start the optional Ollama container profile:
docker compose -f infra/docker-compose.yml --profile local-models up ollama
Then pull models inside the Ollama environment if needed:
ollama pull qwen2.5:7b-instruct
ollama pull embeddinggemma
Claude support is optional and paid. It is used for generation/chat, not embeddings.
Option 1: environment variables in .env:
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
ANTHROPIC_BASE_URL=https://api.anthropic.com
ANTHROPIC_VERSION=2023-06-01
Option 2: store a workspace key from the UI:
- Open the left sidebar.
- Paste the Claude API key.
- Confirm the model.
- Click Store key.
Keys are encrypted before being stored in Postgres and are never exposed to the browser bundle.
Run the optional vLLM profile:
VLLM_MODEL=Qwen/Qwen2.5-7B-Instruct docker compose -f infra/docker-compose.yml --profile vllm up vllm
Default search discovery uses local SearxNG:
SEARCH_PROVIDER=searxng,brave,tavily,serpapi
SEARXNG_URL=http://searxng:8080
Optional providers:
BRAVE_SEARCH_API_KEY=...
TAVILY_API_KEY=...
SERPAPI_API_KEY=...
Provider order is controlled by SEARCH_PROVIDER:
SEARCH_PROVIDER=searxng
SEARCH_PROVIDER=brave,searxng
SEARCH_PROVIDER=tavily,serpapi,searxng
Core:
GET /v1/health
GET /v1/models
POST /v1/llm/test-key
POST /v1/provider-credentials
Ingestion and research:
POST /v1/ingestions
GET /v1/jobs
GET /v1/documents
POST /v1/search
POST /v1/chat
GET /v1/graph
GET /v1/trends
GET /v1/entities
GET /v1/evidence
Trust and cleanup:
PATCH /v1/documents/{id}/trust
PATCH /v1/entities/{id}
POST /v1/entities/{id}/merge
POST /v1/pins
GET /v1/pins
Career intelligence:
GET /v1/career/profile
POST /v1/career/profile
GET /v1/career/market-map
GET /v1/career/company-dossiers
GET /v1/career/company-dossiers/{company_id}
GET /v1/career/skill-velocity
POST /v1/career/skill-gap
GET /v1/career/job-recommendations
GET /v1/career/shortlist
POST /v1/career/shortlist
PATCH /v1/career/shortlist/{item_id}
Watchlists:
POST /v1/watchlists
GET /v1/watchlists
POST /v1/watchlists/{id}/run
GET /v1/watchlists/{id}/brief
GET /v1/watchlists/{id}/weekly-brief
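For scripting against these endpoints from Python, a minimal standard-library sketch follows. It uses the same ingestion fields as the curl ingestion example in this README. The helper names `build_ingestion_request` and `send` are hypothetical, and the response is assumed to be JSON, which is FastAPI's default but is not confirmed here.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # API URL from the local services table


def build_ingestion_request(
    topic: str,
    max_pages: int = 20,
    crawl_depth: int = 1,
    workspace_id: str = "default",
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/ingestions request."""
    payload = {
        "workspace_id": workspace_id,
        "topic": topic,
        "max_pages": max_pages,
        "crawl_depth": crawl_depth,
        "urls": [],
        "pasted_sources": [],
    }
    return urllib.request.Request(
        f"{API_BASE}/v1/ingestions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and parse the body as JSON (assumed response format)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect before the backend stack is running.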
curl http://localhost:8000/v1/health
curl -X POST http://localhost:8000/v1/ingestions \
-H "content-type: application/json" \
-d '{
"workspace_id": "default",
"topic": "AI infra startups hiring RAG agent engineers",
"max_pages": 20,
"crawl_depth": 1,
"urls": [],
"pasted_sources": []
}'
curl -X POST http://localhost:8000/v1/chat \
-H "content-type: application/json" \
-d '{
"workspace_id": "default",
"question": "Which companies are hiring for agentic RAG roles and what stack do they use?",
"provider": "extractive",
"model": "extractive-local",
"retrieval_limit": 12
}'
curl -X POST http://localhost:8000/v1/career/profile \
-H "content-type: application/json" \
-d '{
"workspace_id": "default",
"name": "Primary target",
"target_role": "AI/RAG engineer",
"target_keywords": ["RAG", "agent", "retrieval", "evaluation"],
"preferred_stack": ["Qdrant", "LangGraph", "FastAPI", "Docker"],
"preferred_locations": ["Remote"],
"seniority": "mid",
"resume_text": "Python, FastAPI, LangChain, Docker..."
}'
curl "http://localhost:8000/v1/career/company-dossiers?workspace_id=default&limit=10"
curl "http://localhost:8000/v1/career/job-recommendations?workspace_id=default&limit=20"
Run backend unit tests plus web typecheck:
npm test
Run only web typecheck:
npm run typecheck
Run production web build:
npm run build
If Turbopack fails with an internal local port permission error in a sandboxed environment, rerun the build in a normal terminal.
Error:
failed to connect to the docker API
Fix:
- Start Docker Desktop.
- Wait until Docker says it is running.
- Run:
npm run dev:stack
Error:
EADDRINUSE: address already in use :::3001
Find the process:
lsof -i :3001
Stop it, or run the web app on another port manually:
npm --workspace apps/web run dev -- -p 3002
Check:
curl http://localhost:8000/v1/health
If it fails, start the backend:
npm run dev:stack
Check the Jobs tab first. Then inspect the worker and API logs:
docker compose -f infra/docker-compose.yml logs --tail=160 worker
docker compose -f infra/docker-compose.yml logs --tail=160 api
Common causes:
- Docker stack is not running.
- Worker is not connected to Redis.
- Search provider has no results.
- Source pages block crawling.
- Ingestion is still queued or running.
This deletes Postgres, Qdrant, Memgraph, MinIO, Redis, and Phoenix volumes:
docker compose -f infra/docker-compose.yml down -v
Then restart:
npm run dev:stack
- The single local web app is apps/web.
- The backend applies Alembic migrations on container startup.
- The API creates the default workspace automatically.
- The Next.js app calls /v1/* through the same-origin gateway.
- The Docker web profile is optional and should not be run alongside the local dev app unless you intentionally want a containerized web check.
InsightGraph is most useful as a personal research terminal for AI career strategy:
- Track a niche market.
- Identify companies and roles.
- Understand repeated skill demand.
- Compare your resume against real evidence.
- Build a focused application shortlist.
- Inspect source trust before acting.
It is not a replacement for LinkedIn, an auto-apply bot, or a polished commercial job board. It is an evidence-backed decision layer you control locally.
Built by Karan Chandra Dey [K28], Founder and CEO @ K28.
Website: k28art.space
MIT
