Skip to content

Modules: fixed double-free in shared dict update with eviction.#1041

Merged
xeioex merged 3 commits intonginx:masterfrom
xeioex:shared_dict_evict_crash
Mar 30, 2026
Merged

Modules: fixed double-free in shared dict update with eviction.#1041
xeioex merged 3 commits intonginx:masterfrom
xeioex:shared_dict_evict_crash

Conversation

@xeioex
Copy link
Copy Markdown
Contributor

@xeioex xeioex commented Mar 26, 2026

When updating an existing key's string value in a shared dictionary with timeout and evict enabled, ngx_js_dict_alloc() could trigger ngx_js_dict_evict() if the zone was full. Since the node being updated was still in the expire tree, eviction could free it. The subsequent ngx_slab_free_locked() call in the update path then freed the already-freed string data, causing the "chunk is already free" alert followed by a segfault.

The issue was more likely to occur when the zone fills to capacity and eviction is triggered during the update path.

This closes #1036 issue on Github.

xeioex added 3 commits March 26, 2026 17:32
Previously, when updating an existing key's string value in a shared
dictionary with timeout and evict enabled, ngx_js_dict_alloc() could
trigger ngx_js_dict_evict() if the zone was full.  Since the node being
updated was still in the expire tree, eviction could free it.  The
subsequent ngx_slab_free_locked() call in the update path then freed the
already-freed string data, causing the "chunk is already free" alert
followed by a segfault.

The fix removes the node from the expire tree before allocating
memory for the new value, preventing eviction from reaching it.
On allocation failure the node is re-inserted with its original
expiry time.
Previously, when a slab allocation failed in evict mode, only 16
entries were evicted with a single retry.  This could still result
in SharedMemoryError when the freed slab slots did not match the
requested allocation size class, even though the zone had plenty
of evictable entries.

In practice, it might happen when the following conditions are met:
    - The shared zone is full
    - evict flag is enabled
    - key/value entries differ in size

The allocation now retries in a loop, evicting 16 entries at a time,
until the allocation succeeds or no more entries remain in the expire
tree.

After this change, allocation with evict enabled can only fail when:
    - the value is larger than the zone's usable space
    - the expire tree has no entries left to evict
    - zone metadata overhead leaves insufficient room
Previously, keys(), items(), and size() called ngx_js_dict_expire()
under a read lock.  Since ngx_js_dict_expire() deletes nodes from
both rbtrees and frees slab memory, concurrent readers on different
worker processes could corrupt shared memory by freeing the same
expired nodes simultaneously.

The fix removes ngx_js_dict_expire() calls from all read-locked
paths and instead skips expired entries during iteration, consistent
with how get() and has() already handle expiry.  Actual cleanup of
expired entries is deferred to write-side operations (set, add,
delete, clear).
@xeioex xeioex force-pushed the shared_dict_evict_crash branch from 8f4f0e3 to ccf787e Compare March 27, 2026 01:13
@xeioex xeioex marked this pull request as ready for review March 27, 2026 02:21
@xeioex xeioex requested a review from VadimZhestikov March 27, 2026 02:21
@VadimZhestikov
Copy link
Copy Markdown
Contributor

May be it would be nice add extra regression test for the third change, covering keys()/items()/size() with expired-but-not-yet-reclaimed entries, ideally for both njs and qjs paths. Otherwise looks good.

Copy link
Copy Markdown
Contributor

@VadimZhestikov VadimZhestikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@xeioex
Copy link
Copy Markdown
Contributor Author

xeioex commented Mar 30, 2026

@VadimZhestikov

May be it would be nice add extra regression test for the third change, covering keys()/items()/size() with expired-but-not-yet-reclaimed entries,

Yes, I considered it and decided not to add more tests as existing tests

 like(http_get('/keys?dict=foo'), qr/empty/, 'foo keys after expire');
  like(http_get('/keys?dict=bar'), qr/FOO\,FOO2/, 'bar keys after a delay');
  like(http_get('/size?dict=foo'), qr/size: 0/, 'no of items in foo after expire');
  like(http_get('/items?dict=bar'), qr/FOO,zzz|FOO2,aaa/, 'bar items');

already cover what we need the most. expired-but-not-yet-reclaimed - is hard to observe the internal state of whether nodes were reclaimed or just skipped.

@xeioex xeioex merged commit c3a4725 into nginx:master Mar 30, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Shared Memory "Already Free" Bug?

2 participants