Query parser OOM on large row-value IN queries #854

@alexnick-glow

Description

Environment

  • PgDog: v0.1.32 – v0.1.34
  • Deployment: Kubernetes, 2 pods, 8Gi memory limit per pod
  • Config: pooler_mode = "transaction", query_parser = "auto", query_cache_limit = 500
  • Backend: 1 primary + 1 replica (so auto enables the parser)
  • Clients: ~15 Go/Python services, ~50-100 concurrent connections

Problem

PgDog pods get OOM-killed under sustained load when clients send row-value IN queries with hundreds of tuples. Memory grows continuously and is not reclaimed, eventually exceeding the 8Gi container limit.
Example query pattern (generated by SQLAlchemy):

SELECT raw_vendor, app_hint, ... (35 columns)
FROM desktop_app_hint
WHERE (raw_vendor, app_hint) IN (
  ('vendor1', 'hint1'), ('vendor2', 'hint2'), ... -- 300-600 tuples
)

Each unique query produces a large AST via pg_query. With query_cache_limit = 500, up to 500 of these large ASTs are cached. Under sustained load with varying parameters, the cache fills with large entries and memory grows until OOM.
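To illustrate why these queries defeat a count-based query cache, here is a minimal sketch of how an ORM can render a fresh SQL string per call. The builder function and the two-column projection are hypothetical, not SQLAlchemy or PgDog internals; the point is that every distinct value set (or tuple count) yields a distinct query text, so each one is a cache miss that must be fully parsed:

```python
def build_query(pairs):
    """Render a row-value IN query the way an ORM might, with inlined literals."""
    placeholders = ", ".join(f"('{v}', '{h}')" for v, h in pairs)
    return (
        "SELECT raw_vendor, app_hint FROM desktop_app_hint "
        f"WHERE (raw_vendor, app_hint) IN ({placeholders})"
    )

# Two batches with different values produce two distinct query texts,
# so a text-keyed cache parses (and caches) each one separately.
q1 = build_query([(f"vendor{i}", f"hint{i}") for i in range(500)])
q2 = build_query([(f"vendor{i}", f"hint{i}") for i in range(500, 1000)])
assert q1 != q2  # distinct texts -> distinct cache keys -> re-parse every time
```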

Reproduction

We wrote a test that reproduces the issue locally with Docker, running PgDog under a constrained container memory limit:

# 100 concurrent workers, 500 tuples per query, unique queries each time
# PgDog config: query_parser = "on", query_cache_limit = 50, 1 primary + 1 replica
# Results:
# parser=on,  128MB limit → 86.5% error rate, OOM crash
# parser=off, 128MB limit → 0% errors, +4MB growth
# parser=on,  1GB limit   → 0% errors, +268MB growth

Key findings from our testing:

  Config                           Memory growth   Errors
  query_parser = "on",  128MB      OOM crash       86.5%
  query_parser = "off", 128MB      +4MB            0%
  query_parser = "on",  1GB        +268MB          0%
  query_cache_limit = 0 vs 1000    No difference   Same

query_cache_limit does not affect peak memory: the memory is consumed during parsing, not caching. Even with query_cache_limit = 0, 100 concurrent large queries allocate hundreds of MB simultaneously.
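The +268MB growth is consistent with a simple back-of-envelope estimate. The per-query AST footprint below is an assumption for illustration, not a measured PgDog number:

```python
# Back-of-envelope check (assumed numbers, not measured internals):
# if each 500-tuple row-value IN query expands to an AST of roughly
# 2-3 MB via pg_query, then 100 workers parsing concurrently hold
# a few hundred MB at peak, regardless of the cache limit.

concurrent_workers = 100
ast_size_mb = 2.7  # assumed per-query AST footprint
peak_mb = concurrent_workers * ast_size_mb
print(round(peak_mb))  # ~270, the same order as the observed +268MB
```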

Observations

  1. Memory is not returned to the OS after parsing. Even after queries complete and ASTs are dropped, RSS stays high (Rust allocator retains pages). This means memory only grows over the pod's lifetime.
  2. The parser runs on every query, even when all queries are wrapped in explicit BEGIN transactions and read/write classification adds no value (with the conservative strategy, every BEGIN transaction goes to the primary regardless).
  3. Disabling the parser (query_parser = "off") completely eliminates the memory growth but also disables BEGIN READ ONLY → replica routing and SET command tracking.

Impact

  • PgDog pods restart every ~10 minutes under production-like load at 8Gi
  • Each restart causes AdminShutdown / driver: bad connection errors across all connected services
  • We had to disable the query parser entirely as a workaround, losing read/write routing

Questions

  1. Is there a way to limit per-query parser memory, or skip parsing for queries above a certain size?
  2. Could the query cache use a size-based eviction (total bytes) rather than count-based (query_cache_limit)?
  3. Is there a way to enable BEGIN READ ONLY → replica routing without the full query parser? The transaction type (READ ONLY vs read-write) is available at the protocol level without parsing the SQL body.
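For question 3, this is the kind of lightweight check we mean: classifying the transaction opener by inspecting its first few tokens, without parsing the SQL body. This is an illustrative Python sketch, not PgDog code:

```python
def is_read_only_begin(sql: str) -> bool:
    """True if the statement opens an explicitly read-only transaction.

    Only tokenizes the statement text; no SQL parser or AST involved.
    """
    tokens = sql.strip().rstrip(";").upper().split()
    if not tokens or tokens[0] not in ("BEGIN", "START"):
        return False
    # READ ONLY must appear among the transaction modes.
    return "READ" in tokens and "ONLY" in tokens

assert is_read_only_begin("BEGIN READ ONLY;")
assert is_read_only_begin("start transaction isolation level repeatable read read only")
assert not is_read_only_begin("BEGIN;")
assert not is_read_only_begin("BEGIN READ WRITE")
```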

Workarounds we've tried

  • Reducing query_cache_limit → no effect on peak memory
  • Increasing pod memory (8Gi) → delays the OOM but doesn't prevent it
  • Chunking queries into smaller IN lists → reduces per-query AST size but doesn't stop overall memory growth
  • Using unnest(ARRAY[...]) instead of tuple IN lists → reduces query text size, but parser still allocates ASTs
  • query_parser = "off" → only effective workaround, but disables routing features
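For reference, the chunking workaround can be sketched as follows. Batch size and names are illustrative; as noted above, this shrinks each individual AST but does not stop overall memory growth:

```python
def chunked(pairs, size):
    """Split one large tuple list into fixed-size batches."""
    for i in range(0, len(pairs), size):
        yield pairs[i:i + size]

# A 600-tuple lookup becomes six 100-tuple queries, each with a much
# smaller AST, at the cost of extra round trips.
pairs = [(f"vendor{i}", f"hint{i}") for i in range(600)]
batches = list(chunked(pairs, 100))
assert len(batches) == 6 and all(len(b) <= 100 for b in batches)

# Each batch would then be issued as its own query, e.g. in SQLAlchemy:
#   session.query(DesktopAppHint).filter(
#       tuple_(DesktopAppHint.raw_vendor, DesktopAppHint.app_hint).in_(batch)
#   )
```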
