oss-data-analyst is an intelligent AI agent that converts natural language questions into SQL queries and provides data analysis. Built with the Vercel AI SDK, it features multi-phase reasoning (planning, building, execution, reporting) and streams results in real-time.
Note: This is a reference architecture. The semantic catalog and schemas included are simplified examples for demonstration purposes. Production implementations should use your own data models and schemas.
- Multi-Phase AI Agent: Planning → Building → Execution → Reporting workflow
- Real-time Streaming: Live updates during query processing
- Smart Data Analysis: Automated insights and visualizations
- SQL Validation: Syntax checking and security policy enforcement
- Natural Language: Ask questions in plain English
- Modern UI: Built with Next.js, React, and TailwindCSS
- Extensible Tools: Easy to add custom tools and capabilitiets
- Node.js 20.19.3+
- pnpm 8.15.0+
- AI Gateway API key
-
Clone the repository
git clone https://github.com/vercel/oss-data-analyst.git cd oss-data-analyst -
Install dependencies
pnpm install
-
Set up environment variables
cp env.local.example .env.local
Edit
.env.localand add your Vercel AI Gateway key -
Initialize the database
pnpm initDatabase
This creates a SQLite database with sample data (Companies, People, Accounts)
-
Run the development server
pnpm dev
-
Open your browser Navigate to
http://localhost:3000
pnpm build
pnpm startThis repository includes a sample database schema with three main entities to demonstrate oss-data-analyst's capabilities:
Represents organizations in your database. Each company has:
- Basic information (name, industry, employee count)
- Business metrics (founded date, status)
- Example: Technology companies, Healthcare organizations, etc.
Represents customer accounts or subscriptions tied to companies. Each account includes:
- Account identification (account number, status)
- Financial metrics (monthly recurring value, contract details)
- Relationship to parent company
- Example: Active subscriptions with monthly values ranging from $10k-$50k
Represents individual employees or contacts within companies. Each person has:
- Personal information (name, email)
- Employment details (department, title, salary)
- Relationship to their company
- Example: Engineers, Sales representatives, Managers across different departments
oss-data-analyst uses a multi-phase agentic workflow:
-
Planning Phase
- Analyzes natural language query
- Searches semantic catalog for relevant entities
- Identifies required data and relationships
- Generates execution plan
-
Building Phase
- Constructs SQL query from plan
- Validates syntax and security policies
- Optimizes query structure
- Finds join paths between tables
-
Execution Phase
- Estimates query cost
- Executes SQL against database
- Handles errors with automatic repair
- Streams results
-
Reporting Phase
- Formats query results
- Generates visualizations (charts, tables)
- Provides natural language explanations
- Performs sanity checks on data
Modify system prompts in src/lib/prompts/:
planning.ts- Planning phase behaviorbuilding.ts- SQL generation logicexecution.ts- Query execution handlingreporting.ts- Results interpretation
Try asking oss-data-analyst (using the sample database):
- "How many companies are in the Technology industry?"
- "What is the average salary by department?"
- "Show me the top 5 accounts by monthly value"
- "Which companies have the most employees?"
- "What is the total revenue for Active accounts?"
- "How many people work in Engineering?"
The default setup uses SQLite for demonstration. To use with Snowflake or other databases:
- Update
src/lib/oss-data-analyst-agent-advanced.tsto import from./tools/executeinstead of./tools/execute-sqlite - Configure your database credentials in
.env.local - Update the semantic catalog in
src/lib/semantic/with your schema definitions
Database Not Found
- Run
pnpm initDatabaseto create and seed the database - Check that
data/oss-data-analyst.dbexists
AI Gateway API Errors
- Verify your API key is valid in
.env.local - Check API rate limits and credits
Build Errors
- Run
pnpm installto update dependencies - Check TypeScript errors with
pnpm run type-check - Clear
.nextfolder and rebuild