Query parser OOM on large row-value IN queries #854

@alexnick-glow

Description

Environment

  • PgDog: v0.1.32 – v0.1.34
  • Deployment: Kubernetes, 2 pods, 8Gi memory limit per pod
  • Config: pooler_mode = "transaction", query_parser = "auto", query_cache_limit = 500
  • Backend: 1 primary + 1 replica (so auto enables the parser)
  • Clients: ~15 Go/Python services, ~50-100 concurrent connections

Problem

PgDog pods get OOM-killed under sustained load when clients send row-value IN queries with hundreds of tuples. Memory grows continuously and is not reclaimed, eventually exceeding the 8Gi container limit.
Example query pattern (generated by SQLAlchemy):

SELECT raw_vendor, app_hint, ... (35 columns)
FROM desktop_app_hint
WHERE (raw_vendor, app_hint) IN (
  ('vendor1', 'hint1'), ('vendor2', 'hint2'), ... -- 300-600 tuples
)

Each unique query produces a large AST via pg_query. With query_cache_limit = 500, up to 500 of these large ASTs are cached. Under sustained load with varying parameters, the cache fills with large entries and memory grows until OOM.
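To illustrate why these queries defeat a count-based query cache, here is a minimal sketch of how an ORM can render a fresh SQL string per call. The builder function and the two-column projection are hypothetical, not SQLAlchemy or PgDog internals; the point is that every distinct value set (or tuple count) yields a distinct query text, so each one is a cache miss that must be fully parsed:

```python
def build_query(pairs):
    """Render a row-value IN query the way an ORM might, with inlined literals."""
    placeholders = ", ".join(f"('{v}', '{h}')" for v, h in pairs)
    return (
        "SELECT raw_vendor, app_hint FROM desktop_app_hint "
        f"WHERE (raw_vendor, app_hint) IN ({placeholders})"
    )

# Two batches with different values produce two distinct query texts,
# so a text-keyed cache parses (and caches) each one separately.
q1 = build_query([(f"vendor{i}", f"hint{i}") for i in range(500)])
q2 = build_query([(f"vendor{i}", f"hint{i}") for i in range(500, 1000)])
assert q1 != q2  # distinct texts -> distinct cache keys -> re-parse every time
```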

Reproduction

We wrote a test that reproduces the issue locally with Docker, running PgDog under a constrained container memory limit:

# 100 concurrent workers, 500 tuples per query, unique queries each time
# PgDog config: query_parser = "on", query_cache_limit = 50, 1 primary + 1 replica
# Results:
# parser=on,  128MB limit → 86.5% error rate, OOM crash
# parser=off, 128MB limit → 0% errors, +4MB growth
# parser=on,  1GB limit   → 0% errors, +268MB growth

Key findings from our testing:

  Config                           Memory growth   Errors
  query_parser = "on",  128MB      OOM crash       86.5%
  query_parser = "off", 128MB      +4MB            0%
  query_parser = "on",  1GB        +268MB          0%
  query_cache_limit = 0 vs 1000    No difference   Same

query_cache_limit does not affect peak memory: the memory is consumed during parsing, not caching. Even with query_cache_limit = 0, 100 concurrent large queries allocate hundreds of MB simultaneously.
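The +268MB growth is consistent with a simple back-of-envelope estimate. The per-query AST footprint below is an assumption for illustration, not a measured PgDog number:

```python
# Back-of-envelope check (assumed numbers, not measured internals):
# if each 500-tuple row-value IN query expands to an AST of roughly
# 2-3 MB via pg_query, then 100 workers parsing concurrently hold
# a few hundred MB at peak, regardless of the cache limit.

concurrent_workers = 100
ast_size_mb = 2.7  # assumed per-query AST footprint
peak_mb = concurrent_workers * ast_size_mb
print(round(peak_mb))  # ~270, the same order as the observed +268MB
```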

Observations

  1. Memory is not returned to the OS after parsing. Even after queries complete and ASTs are dropped, RSS stays high (Rust allocator retains pages). This means memory only grows over the pod's lifetime.
  2. The parser runs on every query, even when all queries are wrapped in explicit BEGIN transactions and read/write classification adds no value (with the conservative strategy, every BEGIN transaction goes to the primary regardless).
  3. Disabling the parser (query_parser = "off") completely eliminates the memory growth but also disables BEGIN READ ONLY → replica routing and SET command tracking.

Impact

  • PgDog pods restart every ~10 minutes under production-like load at 8Gi
  • Each restart causes AdminShutdown / driver: bad connection errors across all connected services
  • We had to disable the query parser entirely as a workaround, losing read/write routing

Questions

  1. Is there a way to limit per-query parser memory, or skip parsing for queries above a certain size?
  2. Could the query cache use a size-based eviction (total bytes) rather than count-based (query_cache_limit)?
  3. Is there a way to enable BEGIN READ ONLY → replica routing without the full query parser? The transaction type (READ ONLY vs read-write) is available at the protocol level without parsing the SQL body.
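For question 3, this is the kind of lightweight check we mean: classifying the transaction opener by inspecting its first few tokens, without parsing the SQL body. This is an illustrative Python sketch, not PgDog code:

```python
def is_read_only_begin(sql: str) -> bool:
    """True if the statement opens an explicitly read-only transaction.

    Only tokenizes the statement text; no SQL parser or AST involved.
    """
    tokens = sql.strip().rstrip(";").upper().split()
    if not tokens or tokens[0] not in ("BEGIN", "START"):
        return False
    # READ ONLY must appear among the transaction modes.
    return "READ" in tokens and "ONLY" in tokens

assert is_read_only_begin("BEGIN READ ONLY;")
assert is_read_only_begin("start transaction isolation level repeatable read read only")
assert not is_read_only_begin("BEGIN;")
assert not is_read_only_begin("BEGIN READ WRITE")
```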

Workarounds we've tried

  • Reducing query_cache_limit → no effect on peak memory
  • Increasing pod memory (8Gi) → delays the OOM but doesn't prevent it
  • Chunking queries into smaller IN lists → reduces per-query AST size but doesn't stop overall memory growth
  • Using unnest(ARRAY[...]) instead of tuple IN lists → reduces query text size, but parser still allocates ASTs
  • query_parser = "off" → only effective workaround, but disables routing features
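For reference, the chunking workaround can be sketched as follows. Batch size and names are illustrative; as noted above, this shrinks each individual AST but does not stop overall memory growth:

```python
def chunked(pairs, size):
    """Split one large tuple list into fixed-size batches."""
    for i in range(0, len(pairs), size):
        yield pairs[i:i + size]

# A 600-tuple lookup becomes six 100-tuple queries, each with a much
# smaller AST, at the cost of extra round trips.
pairs = [(f"vendor{i}", f"hint{i}") for i in range(600)]
batches = list(chunked(pairs, 100))
assert len(batches) == 6 and all(len(b) <= 100 for b in batches)

# Each batch would then be issued as its own query, e.g. in SQLAlchemy:
#   session.query(DesktopAppHint).filter(
#       tuple_(DesktopAppHint.raw_vendor, DesktopAppHint.app_hint).in_(batch)
#   )
```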
