Skip to content

Possible native-memory leak with PostgreSQL scanner (testes via Node.js) RSS grows per iteration while JS heap stays flat #361

@renatocron

Description

@renatocron

When running long-lived iterative queries from Node.js against three backends (DuckDB only, DuckDB < SQLite, DuckDB < Postgres), RSS grows linearly only in the Postgres path (into multiple GB), while V8 heapUsed stays ~≤100 MB and looks similar across all modes. This suggests growth in native memory outside the JS heap, likely in the Postgres scanner path.

Image

I few month's back I theorized that the issue was on the postgres scanner over https://www.reddit.com/r/DuckDB/comments/1itnxxv/duckdb_memory_growth_patterns_in_longrunning/ where I saw many allocated pages over a period of time, but only today I tested with different drivers.

Environment

  • OS: Ubuntu/Debian (x86_64)
  • Node.js: node -v = v24.3.0
  • duckdb-node package: npm ls duckdb = 1.2.0-alpha.14
  • PostgreSQL server: PostgreSQL 17.6 (Ubuntu 17.6-1.pgdg22.04+1)

Repro repo / script

https://github.com/renatocron/duckdb-memory-leak-test

  • Repo contains memory_test.js (runs the same analytical query across modes) and produces large JSON logs with memory stats.
  • Also included: mem_compare.py to align/downsample and plot/emit a reduced JSON for web.

How to reproduce

# 1) Install deps
npm install

# 2) Create Postgres DB (tool populates tables automatically)
createdb duckdb_memory_test   # or use psql as in the README

# 3) Run tests (examples)
#   a) Postgres path (suspected leak):
TEST_MODE=postgres QUERY_INTERVAL=1000 MAX_ITERATIONS=0 \
NEW_CONNECTION=false FORCE_GC=false \
node --expose-gc memory_test.js

#   b) DuckDB-only baseline:
TEST_MODE=duckdb node --expose-gc memory_test.js

#   c) SQLite scanner:
TEST_MODE=sqlite node --expose-gc memory_test.js

# 4) Compare results
./mem_compare.py memory_stats_*.json --output-plot plot.png --target-points 400
# (Optional) Interactive web view:
./mem_compare.py memory_stats_*.json --output-json reduced.json
python3 -m http.server 8080

Observed behavior

  • RSS (MB):

    • Postgres (orange): steady linear growth up to ~2.4 GB across ~450k iterations.
    • DuckDB (blue): plateaus ~250–350 MB with occasional steps.
    • SQLite (green): sawtooth between ~300–1000 MB, but not linear unbounded.
  • V8 heapUsed (MB): slow, similar growth across all modes (~7 → 90 MB).

Expected behavior

  • RSS should plateau or oscillate within a bounded range once steady-state is reached, similar to DuckDB-only runs. Large, iteration-proportional native memory growth in the Postgres path is unexpected.

Artifacts

  • PNG plots (aligned to common max iteration):

    • RSS vs iteration: (attached as plot.png)
    • heapUsed vs iteration: (attached as heap_used.png)
  • Raw logs: memory_stats_postgres_*.json, memory_stats_sqlite_*.json, memory_stats_duckdb_*.json (each ~96 MB) if iter =~ 500k

    • If desired, maintainers can use mem_compare.py ... --output-json reduced.json for a tiny repro dataset.

Notes / hypotheses

  • Growth seems native (outside V8).
  • Candidate areas: Postgres scanner result materialization, Arrow/Vector buffers, lifetimes of DataChunk/ColumnDataCollection, libpq result handling, prepared statement cache, or per-iteration allocations not released.
  • DuckDB memory limit is set (e.g., 500MB), but RSS for the Postgres path exceeds it over time, suggesting allocations not accounted in DuckDB’s limit or outside its allocator.

Things I can try if helpful

  • Repeat using DuckDB CLI (no Node) with INSTALL/LOAD postgres; and a tight loop of the same query to isolate Node.

Any tool to help this? As running ~ 500k manually seens not feasible

  • Swap allocator to jemalloc and dump stats; try MALLOC_CONF=prof:true,prof_active:true.

Tried on the production worker, did not changed the behavior

Thank you!

Happy to run any diagnostic builds or patches.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions