Terraform modules for deploying a production-ready RAG (Retrieval-Augmented Generation) chatbot using AWS S3 Vectors for semantic search and Amazon Bedrock for LLM inference.
- S3 Vectors Integration - Native AWS vector search (no external vector DB needed)
- Amazon Bedrock - Serverless LLM inference (Claude 3.5 Sonnet)
- Serverless Architecture - Lambda + API Gateway + DynamoDB
- ZIP Deployment - Simple deployment without Docker complexity
- Modular Design - Reusable Terraform modules
- Production Ready - Monitoring, logging, rate limiting included
S3 Vectors is AWS's managed vector search capability (launched late 2025) that enables semantic search directly on S3 buckets. Key benefits:
- No separate vector database infrastructure
- Integrated with S3 storage
- Pay-per-query pricing
- Automatic index management
- Supports multiple embedding models
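As a rough illustration of what working with an S3 Vectors index looks like from Python, the sketch below wraps the `put_vectors` and `query_vectors` calls of the boto3 `s3vectors` client. The helper names and the `source_text` metadata field are assumptions made for this example and are not taken from this repository's code.

```python
from typing import Any, Dict, List, Sequence


def build_vector_record(key: str, embedding: Sequence[float], source_text: str) -> Dict[str, Any]:
    """Shape one record the way put_vectors expects it (float32 data plus metadata)."""
    return {
        "key": key,
        "data": {"float32": [float(x) for x in embedding]},
        # "source_text" is an illustrative metadata field, not a required name.
        "metadata": {"source_text": source_text},
    }


def store_vectors(client: Any, bucket: str, index: str, records: List[Dict[str, Any]]) -> None:
    """Write records into an index that already exists in the vector bucket."""
    client.put_vectors(vectorBucketName=bucket, indexName=index, vectors=records)


def search(
    client: Any,
    bucket: str,
    index: str,
    query_embedding: Sequence[float],
    top_k: int = 3,
) -> List[Dict[str, Any]]:
    """Return the nearest vectors (with metadata and distance) for a query embedding."""
    response = client.query_vectors(
        vectorBucketName=bucket,
        indexName=index,
        queryVector={"float32": [float(x) for x in query_embedding]},
        topK=top_k,
        returnMetadata=True,
        returnDistance=True,
    )
    return response.get("vectors", [])


# Example wiring (requires AWS credentials and a boto3 release with S3 Vectors support):
#   import boto3
#   client = boto3.client("s3vectors")
#   hits = search(client, "my-chatbot-vector-bucket", "kb-index", query_embedding)
```

Passing the client in as an argument keeps the helpers easy to exercise with a stub before pointing them at a real vector bucket.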
┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ API Gateway │  (REST API + Rate Limiting)
└──────┬──────┘
       │
       ▼
┌─────────────┐       ┌──────────────┐
│   Lambda    │──────▶│  S3 Vectors  │  (Semantic Search)
│  (Python)   │       │    Index     │
└──────┬──────┘       └──────────────┘
       │                     │
       │                     ▼
       ▼              ┌──────────────┐
┌─────────────┐       │  S3 Bucket   │  (Knowledge Base Data)
│   Bedrock   │       │  (Documents) │
│  (Claude)   │       └──────────────┘
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  DynamoDB   │  (Sessions, Messages, Analytics)
└─────────────┘
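The request path in the diagram can be sketched as a small orchestration function. This is a minimal illustration, not the repository's handler: the four callables stand in for Bedrock embeddings, the S3 Vectors query, Bedrock text generation, and the DynamoDB write, and all names and the prompt wording are assumptions.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical seams for the four backing services shown in the diagram.
Embedder = Callable[[str], Sequence[float]]             # Bedrock embedding model
Retriever = Callable[[Sequence[float], int], List[str]]  # S3 Vectors query
Generator = Callable[[str], str]                         # Bedrock LLM
TurnStore = Callable[[str, str, str], None]              # DynamoDB messages table


def build_prompt(question: str, passages: List[str]) -> str:
    """Combine retrieved passages and the user question into one prompt."""
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(
    question: str,
    session_id: str,
    embed: Embedder,
    retrieve: Retriever,
    generate: Generator,
    save_turn: TurnStore,
    top_k: int = 3,
) -> Dict[str, str]:
    """Run one RAG turn: embed, retrieve, generate, then persist the exchange."""
    query_vector = embed(question)
    passages = retrieve(query_vector, top_k)
    reply = generate(build_prompt(question, passages))
    save_turn(session_id, question, reply)
    return {"session_id": session_id, "answer": reply}
```

Keeping each service behind a plain callable makes the flow testable without AWS access; real implementations would wrap the corresponding boto3 clients.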
- AWS Account with appropriate permissions
- Terraform >= 1.0
- AWS CLI configured
- Python 3.12+ (for S3 Vectors index building)
- boto3 >= 1.39.9 (for S3 Vectors support)
# Clone the repository
git clone https://github.com/slauger/bedrock-s3-vectors-rag.git
cd bedrock-s3-vectors-rag/examples/basic
# Configure your deployment
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your project name
# Deploy infrastructure
terraform init
terraform plan
terraform apply
# Get your API endpoint
terraform output api_endpoint
# Upload your documents to S3
aws s3 cp ./knowledge-base/ s3://my-chatbot-kb-data/website/ --recursive
# Install dependencies (quote the requirement so the shell does not treat ">" as a redirect)
cd ../../modules/lambda
pip3 install "boto3>=1.40.4"
# Build the vector index
python3 build_s3_vectors_index.py \
--bucket my-chatbot-vector-bucket \
--kb-bucket my-chatbot-kb-data \
--kb-prefix website/ \
--index-name kb-index
# This will:
# 1. Read all documents from S3
# 2. Generate embeddings using Bedrock
# 3. Create S3 Vectors index
# 4. Upload vectors to index
# Test via curl
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/prod/chat" \
-H "Content-Type: application/json" \
-d '{
"message": "What can you help me with?",
"session_id": "test-123"
}'
Serverless chatbot handler with S3 Vectors integration.
Features:
- ZIP-based deployment (no Docker required)
- S3 Vectors retrieval
- Bedrock LLM inference
- DynamoDB session management
- CloudWatch logging
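To make the handler's job concrete, here is a hedged sketch of how an API Gateway proxy event carrying `message` and `session_id` could be parsed and turned into a proxy-style response. The function name `handle_chat_event`, the injected `answer_fn`, the error messages, and the `anonymous` fallback are assumptions for illustration; the module's actual handler may differ.

```python
import json
from typing import Any, Callable, Dict


def _response(status: int, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Build a response in the shape API Gateway proxy integrations expect."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handle_chat_event(
    event: Dict[str, Any],
    answer_fn: Callable[[str, str], str],
) -> Dict[str, Any]:
    """Validate the request body, delegate to answer_fn, and wrap the reply."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "Request body must be valid JSON"})

    if not isinstance(body, dict):
        return _response(400, {"error": "Request body must be a JSON object"})

    message = body.get("message")
    if not isinstance(message, str) or not message.strip():
        return _response(400, {"error": "'message' is required"})

    # Falling back to a shared session id is an assumption made for this sketch.
    session_id = str(body.get("session_id") or "anonymous")
    reply = answer_fn(message.strip(), session_id)
    return _response(200, {"session_id": session_id, "answer": reply})
```

In a deployed function, `answer_fn` would perform the retrieval and Bedrock call; injecting it keeps request validation separate from the RAG logic.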
Usage:
module "lambda" {
source = "./modules/lambda"
project_name = "my-chatbot"
s3_vectors_bucket_name = "my-vectors-bucket"
s3_vectors_index_name = "kb-index"
bedrock_embed_model = "amazon.titan-embed-text-v2:0"
model_id = "anthropic.claude-3-5-sonnet-20241022-v2:0"
# ... other config
}
REST API with rate limiting and CORS support.
Features:
- API Gateway REST API
- Usage plans and API keys
- Rate limiting (burst + sustained)
- CORS configuration
- CloudWatch access logs
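API Gateway throttling follows a token bucket model, where the burst limit acts as the bucket capacity and the rate limit as the steady refill rate in requests per second. The simulation below is only a local illustration of how those two settings interact; the class and field names are hypothetical and it does not call any AWS API.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Toy token bucket: burst_limit is capacity, rate_limit is tokens added per second."""

    rate_limit: float
    burst_limit: int
    tokens: float = 0.0
    last_time: float = 0.0

    def __post_init__(self) -> None:
        # Start with a full bucket so an initial burst is allowed.
        self.tokens = float(self.burst_limit)

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        elapsed = max(0.0, now - self.last_time)
        self.tokens = min(float(self.burst_limit), self.tokens + elapsed * self.rate_limit)
        self.last_time = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_admitted(bucket: TokenBucket, now: float, requests: int) -> int:
    """Send `requests` simultaneous requests at time `now` and count the admitted ones."""
    return sum(1 for _ in range(requests) if bucket.allow(now))
```

With `burst_limit = 20` and `rate_limit = 10`, an idle bucket admits 20 of 25 simultaneous requests; one second later it has refilled by 10 tokens, so 10 more get through.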
Usage:
module "api_gateway" {
source = "./modules/api_gateway"
project_name = "my-chatbot"
lambda_invoke_arn = module.lambda.invoke_arn
burst_limit = 20
rate_limit = 10
cors_allow_origin = "'https://example.com'"
}
S3 buckets for knowledge base data and vectors.
Features:
- KB data bucket with versioning
- Vector storage bucket
- Encryption at rest (AES256)
- Lifecycle policies
- Optional website hosting bucket
Usage:
module "s3" {
source = "./modules/s3"
project_name = "my-chatbot"
vector_bucket_name = "my-vectors"
kb_data_bucket_name = "my-kb-data"
create_website_bucket = false
}
Session and message storage.
Features:
- Sessions table (conversation state)
- Messages table (chat history)
- Analytics table (usage metrics)
- TTL for automatic cleanup
- Optional Point-in-Time Recovery
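DynamoDB TTL works off a numeric attribute holding an expiry time in Unix epoch seconds; items past that time become eligible for automatic deletion. The sketch below shows how a session item with such an attribute might be built in the low-level client format. The attribute names `session_id`, `created_at`, and `expires_at` and the 30-day default are assumptions, not values read from this module.

```python
import time
from typing import Any, Dict, Optional

SECONDS_PER_DAY = 86_400


def build_session_item(
    session_id: str,
    retention_days: int = 30,
    now: Optional[float] = None,
) -> Dict[str, Dict[str, Any]]:
    """Return a DynamoDB item (low-level attribute format) with a TTL timestamp."""
    created_at = int(time.time() if now is None else now)
    expires_at = created_at + retention_days * SECONDS_PER_DAY
    return {
        "session_id": {"S": session_id},
        "created_at": {"N": str(created_at)},
        # The table's TTL setting must point at this attribute name.
        "expires_at": {"N": str(expires_at)},
    }


# Example write (requires AWS credentials; the table name is hypothetical):
#   import boto3
#   boto3.client("dynamodb").put_item(
#       TableName="my-chatbot-sessions",
#       Item=build_session_item("test-123"),
#   )
```

Accepting `now` as a parameter keeps the expiry calculation deterministic in tests.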
Usage:
module "dynamodb" {
source = "./modules/dynamodb"
project_name = "my-chatbot"
enable_pitr = true
}
CloudWatch alarms and dashboards.
Features:
- Lambda error rate alarms
- API Gateway 4xx/5xx alarms
- DynamoDB throttle alarms
- SNS email notifications
- Custom CloudWatch dashboard
Usage:
module "monitoring" {
source = "./modules/monitoring"
project_name = "my-chatbot"
lambda_function_name = module.lambda.function_name
alarm_email = "ops@example.com"
}
# terraform.tfvars
project_name = "my-chatbot"
aws_region = "us-east-1"
# terraform.tfvars
project_name = "prod-chatbot"
aws_region = "us-east-1"
bedrock_model_id = "anthropic.claude-3-5-sonnet-20241022-v2:0"
lambda_timeout = 60
lambda_memory_size = 1024
enable_pitr = true
enable_monitoring = true
alarm_email = "ops@example.com"
api_burst_limit = 50
api_rate_limit = 25
log_level = "INFO"
Approximate monthly costs for moderate usage (us-east-1):
| Service | Usage | Cost |
|---|---|---|
| Lambda | 10K requests (512MB, 30s avg) | ~$0.20 |
| API Gateway | 10K requests | ~$0.035 |
| DynamoDB | 100K read/write units | ~$1.25 |
| S3 Storage | 10 GB | ~$0.23 |
| S3 Vectors Queries | 10K queries | ~$0.50 |
| Bedrock (Claude 3.5) | 1M input + 500K output tokens | ~$6.00 |
| Total | | ~$8-10/month |
For production workloads (100K requests/month): ~$50-100/month
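As a sanity check on these figures, the sketch below sums the per-service estimates from the table and scales them to a higher request volume. It assumes, purely for illustration, that every line item except S3 storage grows linearly with request count; real bills depend on token counts, document volume, and current AWS pricing.

```python
from typing import Dict

# Per-service estimates copied from the table above (10K requests/month).
MONTHLY_COST_AT_10K: Dict[str, float] = {
    "Lambda": 0.20,
    "API Gateway": 0.035,
    "DynamoDB": 1.25,
    "S3 Storage": 0.23,
    "S3 Vectors Queries": 0.50,
    "Bedrock (Claude 3.5)": 6.00,
}

# Assumption: storage cost does not change with request volume.
FIXED_SERVICES = {"S3 Storage"}


def estimate_monthly_cost(requests: int, baseline_requests: int = 10_000) -> float:
    """Scale the table's per-service costs linearly with request volume."""
    scale = requests / baseline_requests
    total = 0.0
    for service, cost in MONTHLY_COST_AT_10K.items():
        total += cost if service in FIXED_SERVICES else cost * scale
    return total


baseline = estimate_monthly_cost(10_000)
production = estimate_monthly_cost(100_000)
print(f"10K requests:  ${baseline:.2f}")
print(f"100K requests: ${production:.2f}")
```

Under these assumptions the table rows add up to a little over $8 at 10K requests and to roughly $80 at 100K requests, which is consistent with the ranges quoted above.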
| Feature | S3 Vectors | Pinecone | Weaviate | OpenSearch |
|---|---|---|---|---|
| Setup Complexity | ⭐⭐⭐⭐⭐ Low | ⭐⭐⭐ Medium | ⭐⭐ High | ⭐ Very High |
| AWS Integration | ✅ Native | ❌ External | ❌ External | |
| Pricing Model | Pay-per-query | Monthly subscription | Self-hosted | Instance-based |
| Serverless | ✅ Yes | ❌ No | ❌ No | |
| Maintenance | ⭐⭐⭐⭐⭐ None | ⭐⭐⭐ Low | ⭐⭐ Medium | ⭐ High |
| Latency | ~100-200ms | ~50-100ms | ~50-100ms | ~50-100ms |
S3 Vectors is ideal when:
- You want minimal infrastructure complexity
- Your data is already in S3
- You need serverless scaling
- Cost predictability is important
- You don't need sub-50ms latency
Simple deployment with all defaults.
Production deployment with:
- Custom VPC
- Enhanced monitoring
- Multi-region failover
- WAF integration
The project includes automated validation:
Terraform Validation Pipeline (.github/workflows/terraform-validation.yml)
- `tofu fmt -check` - Format validation
- `tflint` - Terraform linting (all modules + examples)
- `tofu validate` - Syntax validation
Runs on:
- Push to `main` or `develop` branches
- Pull requests to `main` or `develop`
- Manual workflow dispatch
Before committing, run:
# Format all Terraform files
tofu fmt -recursive
# Run TFLint on all modules
for dir in modules/*/ examples/*/; do
echo "Linting $dir..."
(cd "$dir" && tflint --init && tflint)
done
cd modules/lambda
python3 build_s3_vectors_index.py \
--bucket my-vectors \
--kb-bucket my-kb-data \
--kb-prefix website/ \
--index-name kb-index \
--batch-size 100
cd modules/lambda/src
python3 -c "
from lambda_function import lambda_handler
event = {
'body': '{\"message\": \"Hello\", \"session_id\": \"test\"}'
}
result = lambda_handler(event, None)
print(result)
"
# Check if index exists (S3 Vectors uses the dedicated "s3vectors" client, not "s3")
import boto3
s3vectors = boto3.client('s3vectors')
response = s3vectors.get_index(
    vectorBucketName='my-vectors',
    indexName='kb-index'
)
print(response)
Increase timeout and memory:
lambda_timeout = 60
lambda_memory_size = 1024
Adjust rate limits:
api_burst_limit = 50
api_rate_limit = 25
- Multi-region deployment example
- Streaming response support
- Web UI example (React/Next.js)
- Advanced RAG techniques (HyDE, multi-query)
- Custom embedding model support
- Automated KB sync from GitHub/S3
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
MIT License - see LICENSE for details.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Wiki
Built with:
- LangChain - LLM application framework
- AWS Samples - Official AWS examples
- Bedrock Claude Chat - Another Bedrock chatbot example
Made with ❤️ for the AWS community