Database Optimization
This guide covers database optimization strategies for Grimnir Radio, including indexing, query optimization, and maintenance procedures.
- Index Strategy
- Query Optimization
- PostgreSQL Configuration
- Monitoring and Analysis
- Maintenance Procedures
- Troubleshooting
The migration 001_add_performance_indexes.sql creates 40+ indexes based on actual query patterns:
| Table | Index | Purpose | Impact |
|---|---|---|---|
| schedule_entries | idx_schedule_station_time | Scheduler queries | Critical - Used every 30s |
| priority_sources | idx_priority_sources_station_active | Priority resolution | Critical - Real-time priority |
| media_items | idx_media_station_analysis | Smart block queries | High - Schedule generation |
| live_sessions | idx_live_sessions_station_active | Active session lookups | High - Live handover |

| Table | Index | Purpose |
|---|---|---|
| history_entries | idx_history_station_time | Play history queries |
| smart_blocks | idx_smart_blocks_station_active | Active block filtering |
| webstreams | idx_webstreams_health_check | Health checker queries |

| Table | Index | Purpose |
|---|---|---|
| media_items | idx_media_title_trgm | Full-text search |
| webstreams | idx_webstreams_name | Alphabetical sorting |

- B-tree Indexes (default) - Most indexes
  - Fast lookups on equality and range queries
  - Supports sorting
- Partial Indexes (with WHERE clause)
  - Smaller index size
  - Faster for filtered queries
  - Example: `WHERE active = true`
- Composite Indexes (multiple columns)
  - Optimizes queries filtering on multiple columns
  - Column order matters: columns compared with equality first, then range or sort columns
  - Example: `(station_id, starts_at)`
- GIN Indexes (trigram text search)
  - Requires the `pg_trgm` extension
  - Enables fast substring and similarity search
  - Example: `title gin_trgm_ops`

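The index types above might be created roughly as follows. This is a sketch only: the index names carry an `_example` suffix because the real definitions live in 001_add_performance_indexes.sql and may differ, and the column lists are assumptions inferred from the queries in this guide.

```sql
-- Needed once per database before gin_trgm_ops can be used.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Composite B-tree: equality column (station_id) first, range/sort column second.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_schedule_station_time_example
    ON schedule_entries (station_id, starts_at);

-- Partial index: only rows with active = true are indexed, which keeps it small.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_priority_sources_active_example
    ON priority_sources (station_id, priority)
    WHERE active = true;

-- GIN trigram index for substring and similarity matching on titles.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_media_title_trgm_example
    ON media_items USING gin (title gin_trgm_ops);
```

`CREATE INDEX CONCURRENTLY` avoids blocking writes while the index builds, but it cannot run inside a transaction block, so a migration tool that wraps each file in a transaction would need the plain form instead.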
-- Check index size and usage (large or rarely scanned indexes are candidates for a bloat check)
SELECT
schemaname,
relname AS tablename,
indexrelname AS indexname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
idx_scan AS scans
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
-- Rebuild bloated indexes
REINDEX INDEX CONCURRENTLY idx_schedule_station_time;
-- Or rebuild all indexes for a table
REINDEX TABLE CONCURRENTLY schedule_entries;

Before (slow):
SELECT * FROM schedule_entries
WHERE station_id = ?
AND starts_at >= NOW()
AND starts_at < NOW() + INTERVAL '48 hours'
ORDER BY starts_at;

After (optimized):
-- Uses idx_schedule_station_time
SELECT * FROM schedule_entries
WHERE station_id = ?
AND starts_at >= NOW()
AND starts_at < NOW() + INTERVAL '48 hours'
ORDER BY starts_at
LIMIT 1000; -- Add limit to prevent unbounded results

Performance: 500ms → 5ms

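The `?` placeholders are bound by the application driver, so the statement cannot be pasted into psql unchanged. One way to try it by hand is a prepared statement with a positional parameter, sketched below. The statement name `upcoming_entries` is made up for this example, and '...' stands for a real station id, as elsewhere in this guide.

```sql
-- The parameter type is left unspecified so PostgreSQL infers it from station_id.
PREPARE upcoming_entries AS
SELECT * FROM schedule_entries
WHERE station_id = $1
  AND starts_at >= NOW()
  AND starts_at < NOW() + INTERVAL '48 hours'
ORDER BY starts_at
LIMIT 1000;

-- Run it with timing and buffer statistics to confirm the index is used.
EXPLAIN (ANALYZE, BUFFERS) EXECUTE upcoming_entries('...');

-- Prepared statements are per session; drop it when done.
DEALLOCATE upcoming_entries;
```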
Before (slow):
SELECT * FROM media_items
WHERE station_id = ?
AND active = true
AND analysis_state = 'complete'
-- Multiple OR conditions

After (optimized):
-- Uses idx_media_station_analysis
SELECT * FROM media_items
WHERE station_id = ?
AND analysis_state = 'complete'
AND active = true
ORDER BY RANDOM() -- Use with LIMIT for better performance
LIMIT 100;

Performance: 2000ms → 50ms

Before (slow):
SELECT * FROM priority_sources
WHERE station_id = ?
AND active = true
ORDER BY priority ASC;

After (optimized):
-- Uses idx_priority_sources_station_active (covering index)
SELECT * FROM priority_sources
WHERE station_id = ?
AND active = true
ORDER BY priority ASC
LIMIT 1; -- Only need the highest priority

Performance: 100ms → 2ms

-- Analyze a slow query
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM schedule_entries
WHERE station_id = '...'
AND starts_at >= NOW();

Look for:
- Seq Scan - Full table scan; on a large table this usually points to a missing index
- Index Scan - Good
- Bitmap Heap Scan - Acceptable for large result sets
- Cost - Planner estimate in arbitrary units; lower is better
- Actual Time - Measured execution time in milliseconds
Use explain.depesz.com or explain.dalibo.com to visualize EXPLAIN output.
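Because `EXPLAIN ANALYZE` really executes the statement, a write is best analyzed inside a transaction that is rolled back. The sketch below uses a deliberate no-op update on `schedule_entries` purely as an illustration; it is not a query taken from Grimnir Radio. JSON output is requested on the assumption that the chosen visualizer accepts it; the default text format can be pasted instead.

```sql
BEGIN;

-- The UPDATE runs for real, so its plan and timings are measured ...
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
UPDATE schedule_entries
SET starts_at = starts_at          -- no-op assignment, illustration only
WHERE station_id = '...';

-- ... and the rollback discards whatever it changed.
ROLLBACK;
```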
# ============================================================================
# MEMORY SETTINGS
# ============================================================================
# Shared buffers: 25% of available RAM
shared_buffers = 2GB
# Effective cache size: 75% of available RAM
# (includes OS cache + shared_buffers)
effective_cache_size = 6GB
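# Work mem: RAM / max_connections / 2
# For the 200 connections configured below: 8GB / 200 / 2 ≈ 20MB
# (allocated per sort or hash operation, so one query may use several multiples)
work_mem = 20MB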
# Maintenance work mem: For VACUUM, CREATE INDEX
maintenance_work_mem = 512MB
# ============================================================================
# QUERY PLANNING
# ============================================================================
# Lower for SSD (default 4.0 for HDD)
random_page_cost = 1.1
# Higher for SSD
effective_io_concurrency = 200
# Planner cost of a sequentially fetched page (default 1.0)
seq_page_cost = 1.0
# ============================================================================
# WRITE-AHEAD LOG (WAL)
# ============================================================================
# WAL buffers: 16MB recommended
wal_buffers = 16MB
# Minimum size to keep between checkpoints
min_wal_size = 1GB
# Soft limit on WAL growth between automatic checkpoints
max_wal_size = 4GB
# Checkpoint target
checkpoint_completion_target = 0.9
# ============================================================================
# CONNECTIONS
# ============================================================================
# Max connections (adjust based on application)
max_connections = 200
# ============================================================================
# AUTOVACUUM
# ============================================================================
# Enable autovacuum
autovacuum = on
# Autovacuum naptime (check interval)
autovacuum_naptime = 1min
# Scale factor for table size
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.05
# Max autovacuum workers
autovacuum_max_workers = 4
# ============================================================================
# STATISTICS
# ============================================================================
# Track query statistics
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
# Statistics target (higher = better query plans)
default_statistics_target = 100
# ============================================================================
# LOGGING (for performance debugging)
# ============================================================================
# Log slow queries (> 1 second)
log_min_duration_statement = 1000
# Log checkpoints
log_checkpoints = on
# Log connections/disconnections
log_connections = on
log_disconnections = on
# Log long-running locks
log_lock_waits = on

# Edit postgresql.conf
sudo vi /etc/postgresql/15/main/postgresql.conf
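# Test configuration (prints the value parsed from the config file;
# binary and config paths assume the Debian/Ubuntu layout used above)
sudo -u postgres /usr/lib/postgresql/15/bin/postgres -C shared_buffers -D /etc/postgresql/15/main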
# Restart PostgreSQL
sudo systemctl restart postgresql
# Verify settings
psql -U grimnir -d grimnir -c "SHOW shared_buffers;"

-- In postgresql.conf:
shared_preload_libraries = 'pg_stat_statements'
-- After restart:
CREATE EXTENSION pg_stat_statements;

# Run the analysis script
psql -U grimnir -d grimnir -f migrations/analyze_query_performance.sql > performance_report.txt
# Review the report
less performance_report.txt

Target: > 99%
SELECT
round(100.0 * sum(heap_blks_hit) / NULLIF(sum(heap_blks_hit) + sum(heap_blks_read), 0), 2) AS cache_hit_percent
FROM pg_statio_user_tables;

If < 99%:
- Increase `shared_buffers`
- Increase `effective_cache_size`
- Add more RAM

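Before adding memory, it can help to see which tables account for the cache misses. A minimal sketch using the same statistics view, broken down per table:

```sql
-- Tables with the most blocks read from disk, with their individual hit rates.
SELECT
    relname,
    heap_blks_hit,
    heap_blks_read,
    round(100.0 * heap_blks_hit
          / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS hit_percent
FROM pg_statio_user_tables
WHERE heap_blks_hit + heap_blks_read > 0
ORDER BY heap_blks_read DESC
LIMIT 10;
```

A single large table with a low hit rate often points to a query scanning more rows than it needs, which is cheaper to fix with an index than with RAM.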
SELECT
schemaname,
relname AS tablename,
indexrelname AS indexname,
idx_scan,
idx_tup_read
FROM pg_stat_user_indexes
WHERE idx_scan < 100 -- Low usage
ORDER BY pg_relation_size(indexrelid) DESC;

Action: Consider dropping unused indexes

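A low scan count alone does not make an index safe to drop: unique and primary key indexes enforce constraints even when no query reads them. The sketch below narrows the list to indexes that have never been scanned and back no constraint. It assumes the statistics have been accumulating long enough to cover all regular workloads, including weekly and monthly jobs.

```sql
SELECT
    s.schemaname,
    s.relname      AS tablename,
    s.indexrelname AS indexname,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes AS s
JOIN pg_index AS i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisunique                 -- skips unique and primary key indexes
  AND NOT EXISTS (                      -- skips indexes backing any constraint
        SELECT 1 FROM pg_constraint AS c WHERE c.conindid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid) DESC;
```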
SELECT
schemaname,
relname AS tablename,
n_dead_tup,
round(n_dead_tup * 100.0 / NULLIF(n_live_tup, 0), 2) AS bloat_percent
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

If bloat > 20%: Run VACUUM ANALYZE

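If one table keeps accumulating dead tuples, it is worth checking when autovacuum last reached it and, where needed, vacuuming that table alone or tightening its threshold. In this sketch `schedule_entries` is only an example target and the 0.02 scale factor is an illustrative value, not a tested recommendation.

```sql
-- When did manual or automatic vacuum last run on the worst tables?
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Vacuum a single table and refresh its statistics.
VACUUM (VERBOSE, ANALYZE) schedule_entries;

-- Make autovacuum trigger earlier for this table only (2% dead rows instead of 10%).
ALTER TABLE schedule_entries SET (autovacuum_vacuum_scale_factor = 0.02);
```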
SELECT
pid,
now() - query_start AS duration,
state,
query
FROM pg_stat_activity
WHERE state != 'idle'
AND query_start < now() - interval '5 minutes'
ORDER BY duration DESC;

Action: Optimize or terminate long queries

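When a long query has to be stopped, cancelling is gentler than terminating. In the sketch below 12345 is a placeholder for a pid taken from the query above, and the 30 second timeout is an illustrative value. The `grimnir` role is the one used in the psql commands in this guide.

```sql
-- Cancel only the running statement; the session stays connected.
SELECT pg_cancel_backend(12345);

-- If the statement does not stop, close the whole session.
SELECT pg_terminate_backend(12345);

-- Optional guardrail: abort any statement from this role after 30 seconds.
-- This also applies to maintenance run as the same role, such as VACUUM FULL.
ALTER ROLE grimnir SET statement_timeout = '30s';
```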
#!/bin/bash
# /usr/local/bin/grimnir-db-daily.sh
# Analyze tables (update statistics)
psql -U grimnir -d grimnir -c "ANALYZE;"
# Check for bloat
psql -U grimnir -d grimnir -c "
SELECT relname AS tablename, n_dead_tup
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000;"

#!/bin/bash
# /usr/local/bin/grimnir-db-weekly.sh
# Vacuum and analyze all tables
psql -U grimnir -d grimnir -c "VACUUM ANALYZE;"
# Check index usage
psql -U grimnir -d grimnir -f migrations/analyze_query_performance.sql > /var/log/grimnir/weekly_perf_report.txt
# Email report to admin
mail -s "Grimnir DB Weekly Report" admin@example.com < /var/log/grimnir/weekly_perf_report.txt

#!/bin/bash
# /usr/local/bin/grimnir-db-monthly.sh
# Full vacuum (maintenance window only: VACUUM FULL rewrites each table under an exclusive lock)
psql -U grimnir -d grimnir -c "VACUUM FULL ANALYZE;"
# Reindex all tables
psql -U grimnir -d grimnir -c "REINDEX DATABASE grimnir;"
# Reset statistics
psql -U grimnir -d grimnir -c "SELECT pg_stat_statements_reset();"

# Daily at 2 AM
0 2 * * * /usr/local/bin/grimnir-db-daily.sh
# Weekly on Sunday at 3 AM
0 3 * * 0 /usr/local/bin/grimnir-db-weekly.sh
# Monthly on 1st at 4 AM
0 4 1 * * /usr/local/bin/grimnir-db-monthly.sh

Symptoms: High API latency, timeouts
Diagnosis:
-- Check slow queries
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

Solutions:
- Run `EXPLAIN ANALYZE` on the slow query
- Add missing indexes
- Rewrite the query to use indexes
- Increase `work_mem` for complex queries

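Raising `work_mem` globally multiplies across every connection, so for a single heavy query it may be safer to raise it for one transaction only. A minimal sketch, where the 128MB value and the aggregate query are illustrative stand-ins:

```sql
BEGIN;

-- Applies until COMMIT or ROLLBACK, then reverts to the configured value.
SET LOCAL work_mem = '128MB';

-- Stand-in for the sort- or hash-heavy query being tuned.
SELECT station_id, count(*) AS entries
FROM schedule_entries
GROUP BY station_id
ORDER BY entries DESC;

COMMIT;
```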
Symptoms: 100% CPU on database server
Diagnosis:
-- Find CPU-intensive queries
SELECT pid, query, state
FROM pg_stat_activity
WHERE state = 'active';

Solutions:
- Terminate rogue queries: `SELECT pg_terminate_backend(pid);`
- Optimize frequently run queries
- Add connection pooling (PgBouncer)
- Scale database vertically

Symptoms: "too many connections" errors
Diagnosis:
SELECT count(*) FROM pg_stat_activity;
SELECT setting AS max_connections FROM pg_settings WHERE name = 'max_connections';

Solutions:
- Increase `max_connections` in postgresql.conf
- Implement connection pooling
- Fix connection leaks in application code
- Reduce `GRIMNIR_DB_MAX_OPEN_CONNS`

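To tell a connection leak from genuine load, it helps to group sessions by who holds them and what state they are in. A sketch, assuming PostgreSQL 10 or later for the `backend_type` column:

```sql
-- Many long-lived 'idle in transaction' sessions usually indicate a leak
-- or a missing commit/rollback in application code.
SELECT
    usename,
    application_name,
    state,
    count(*)                  AS connections,
    max(now() - state_change) AS longest_in_state
FROM pg_stat_activity
WHERE backend_type = 'client backend'   -- ignore background workers
GROUP BY usename, application_name, state
ORDER BY connections DESC;
```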
Symptoms: Queries waiting, high lock waits
Diagnosis:
SELECT
pg_stat_activity.pid,
pg_stat_activity.query,
pg_locks.mode,
pg_locks.locktype
FROM pg_stat_activity
JOIN pg_locks ON pg_stat_activity.pid = pg_locks.pid
WHERE NOT pg_locks.granted;

Solutions:
- Use `SELECT FOR UPDATE SKIP LOCKED` for queue tables
- Reduce transaction size
- Use `CONCURRENTLY` for index creation
- Terminate blocking queries

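The diagnosis query lists the sessions that are waiting, but not which session is holding them up. One way to pair each waiter with its blocker, sketched here with the built-in `pg_blocking_pids` function (available since PostgreSQL 9.6):

```sql
SELECT
    waiting.pid                 AS waiting_pid,
    left(waiting.query, 60)     AS waiting_query,
    blocking.pid                AS blocking_pid,
    left(blocking.query, 60)    AS blocking_query,
    now() - blocking.xact_start AS blocking_xact_age
FROM pg_stat_activity AS waiting
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(waiting.pid))
WHERE waiting.wait_event_type = 'Lock'
ORDER BY blocking_xact_age DESC;
```

The `blocking_pid` values are the candidates for the "Terminate blocking queries" step.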
| Query Type | Target Latency | Notes |
|---|---|---|
| Schedule lookup | < 10ms | 95th percentile |
| Smart block materialization | < 100ms | Includes rule evaluation |
| Priority resolution | < 5ms | Critical path |
| API health check | < 5ms | 99th percentile |
| Media search | < 50ms | With full-text search |
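A rough way to compare live numbers against these targets is `pg_stat_statements`, filtered by table name. This is a sketch with one caveat: the view reports means and maxima, not percentiles, so it only approximates the 95th and 99th percentile targets. The `ILIKE` pattern is an assumption about how the queries are written.

```sql
SELECT
    left(query, 80)                     AS query_start,
    calls,
    round(mean_exec_time::numeric, 2)   AS mean_ms,
    round(max_exec_time::numeric, 2)    AS max_ms
FROM pg_stat_statements
WHERE query ILIKE '%schedule_entries%'  -- swap in another table for other rows
ORDER BY mean_exec_time DESC
LIMIT 5;
```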
# pgbench - built-in PostgreSQL benchmark
pgbench -c 10 -j 2 -t 10000 grimnir
# Custom benchmark (psql -c cannot mix meta-commands such as \timing with SQL,
# so the commands are passed on standard input instead)
psql -U grimnir -d grimnir <<'SQL'
\timing on
SELECT * FROM schedule_entries
WHERE station_id = '...'
AND starts_at >= NOW()
LIMIT 100;
SQL

- PostgreSQL Performance Tuning
- PGTune Configuration Generator
- Explain Plan Visualizer
- pg_stat_statements Documentation

Version: 1.0 Last Updated: 2026-01-22