Refactor dict restoring abstraction#3561

Open
JimB123 wants to merge 3 commits into valkey-io:unstable from JimB123:refactor-dict

@JimB123 JimB123 commented Apr 24, 2026

The recent update to dict (#3366) improves performance by making dict a thin wrapper on top of the hashtable implementation.

As part of that refactoring, we lost some of our dict abstraction. The dictEntry became public (again). The defrag code was diving into the entry directly.

This update:

  • Hardens the dict abstraction by making dictEntry opaque (again)
  • Moves some defrag capability back into dict (out of defrag)
  • Uses a .c file rather than the .h file (allowing for opaqueness in the data structure and code)
  • Eliminates the requirement to configure dictEntryGetKey on every dict (it's essentially a required constant)

Link-time optimization (LTO) will result in the same inlining of functions; however, it can now use configurable options to tune the level of inlining. This potentially reduces L1 cache bloat.
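
For readers less familiar with the opaque-type idiom that the bullets above refer to, here is a minimal sketch. It is not the code from this PR: `demoEntry` and its accessors are hypothetical names, and the "header" and ".c" halves are shown in one listing only so that it compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- What a header could expose (all names hypothetical) ---- */
typedef struct demoEntry demoEntry; /* incomplete type: fields are hidden */
demoEntry *demoEntryCreate(const char *key, int value);
const char *demoEntryGetKey(const demoEntry *e);
int demoEntryGetValue(const demoEntry *e);
void demoEntryRelease(demoEntry *e);

/* ---- What the .c file would contain ---- */
struct demoEntry {
    char *key; /* owned copy of the key */
    int value;
};

demoEntry *demoEntryCreate(const char *key, int value) {
    assert(key != NULL);
    demoEntry *e = malloc(sizeof(*e));
    if (e == NULL) return NULL;
    size_t len = strlen(key) + 1;
    e->key = malloc(len);
    if (e->key == NULL) {
        free(e);
        return NULL;
    }
    memcpy(e->key, key, len);
    e->value = value;
    return e;
}

const char *demoEntryGetKey(const demoEntry *e) {
    return e->key;
}

int demoEntryGetValue(const demoEntry *e) {
    return e->value;
}

void demoEntryRelease(demoEntry *e) {
    if (e == NULL) return;
    free(e->key);
    free(e);
}
```

Code that only sees the declarations can hold a `demoEntry *` and call the accessors, but it cannot read `e->key` directly or take `sizeof(demoEntry)`. The cost is that the accessors can no longer be inlined from the header alone, which is where LTO comes in.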

Comment thread src/dict.h
* the "htdict" prefix is used to avoid colliding with the "dict" in libvalkey */
#define dictCreate(type) htdictCreate(type)
#define dictExpand(d, size) htdictExpand(d, size)
#define dictSetKey(d, de, key) htdictSetKey(d, de, key)

I want to eliminate this function. "set key" is an anti-pattern for a dict structure. The only place this is being used is for one use-case in expire.c. We could have fixed that with the key dup callback, but that callback was eliminated in the transition to hashtable.

Might just need a better API which addresses the use case in expire.
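
One common reason an in-place "set key" is risky for a hash-based container is sketched below. This may or may not be the exact concern here, and it says nothing about how expire.c uses the call. The toy table, its one-byte hash, and every `demo*` name are assumptions made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUCKETS 8

typedef struct {
    const char *key; /* NULL means the slot is empty */
    int value;
} demoSlot;

typedef struct {
    demoSlot slots[DEMO_BUCKETS];
} demoTable;

/* Toy hash: first byte of the key modulo the bucket count. */
static size_t demoHash(const char *key) {
    return (unsigned char)key[0] % DEMO_BUCKETS;
}

/* Insert without collision handling; enough for the illustration. */
static demoSlot *demoAdd(demoTable *t, const char *key, int value) {
    demoSlot *s = &t->slots[demoHash(key)];
    assert(s->key == NULL);
    s->key = key;
    s->value = value;
    return s;
}

/* Look up a key in the bucket its hash selects. */
static demoSlot *demoFind(demoTable *t, const char *key) {
    demoSlot *s = &t->slots[demoHash(key)];
    return (s->key != NULL && strcmp(s->key, key) == 0) ? s : NULL;
}

/* The risky operation: overwrite the key without moving the slot. */
static void demoSetKey(demoSlot *s, const char *key) {
    s->key = key;
}
```

If "apple" is added and the slot's key is then overwritten with "banana", neither key can be found afterwards: the slot still sits in the bucket chosen for "apple" but no longer compares equal to it, and a lookup for "banana" probes a different bucket. A set-key call is only safe when the caller guarantees the new key hashes and compares the same as the old one.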

Or would it be feasible to refactor that use case to use hashtable directly?

I got slightly distracted - turns out it was easy. I had a PR ready (#3566) before I remembered I was reviewing here 😓

@zuiderkwast zuiderkwast Apr 29, 2026

@rainsupreme I see you solved it by creating a dict clone based on hashtable where the dictEntry clone is not opaque to your code.

I'd call that a workaround for something you can't do in dict anymore.

IMHO we could just keep dictEntry non-opaque, and then it's straightforward to replace individual calls to functions like dictAddOrFind with the lower-level hashtable functions plus manual dictEntry manipulation whenever the dict API is limiting you.

codecov Bot commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 83.76068% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.63%. Comparing base (d2db0c2) to head (f11a02c).
⚠️ Report is 10 commits behind head on unstable.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/dict_ht.c | 83.62% | 19 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3561      +/-   ##
============================================
+ Coverage     76.35%   76.63%   +0.27%     
============================================
  Files           159      159              
  Lines         80054    80135      +81     
============================================
+ Hits          61125    61408     +283     
+ Misses        18929    18727     -202     

| Files with missing lines | Coverage Δ |
|---|---|
| src/cluster_legacy.c | 88.08% <ø> (-0.16%) ⬇️ |
| src/config.c | 78.09% <ø> (ø) |
| src/defrag.c | 81.96% <100.00%> (-0.30%) ⬇️ |
| src/eval.c | 91.78% <ø> (+0.27%) ⬆️ |
| src/expire.c | 97.31% <ø> (-0.81%) ⬇️ |
| src/functions.c | 96.61% <ø> (-0.03%) ⬇️ |
| src/fuzzer_command_generator.c | 76.72% <ø> (-0.11%) ⬇️ |
| src/latency.c | 83.33% <ø> (ø) |
| src/module.c | 25.31% <ø> (ø) |
| src/rdb.c | 77.32% <ø> (+0.13%) ⬆️ |

... and 6 more

... and 22 files with indirect coverage changes

Signed-off-by: Jim Brunner <brunnerj@amazon.com>
@rainsupreme rainsupreme left a comment

Looks good overall, though I think there are a couple of things to fix. I do think keeping the opaqueness of dict is the right move, and we don't have to give up the performance gain either!

I'm a little disappointed with the dict naming conflict from libvalkey, and all the #define boilerplate in dict.h. We didn't have to do this before this change (apparently?), so my (unresearched) hunch is that there should be some way to avoid the boilerplate. Could we remove dict.h from server.h and only include it where dict is used, or something like that? 🤔
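
For context, the boilerplate in question follows the macro-alias pattern sketched below: the linked symbol carries a prefix, and a function-like macro lets call sites keep the short name. `htdemoSum` and `demoSum` are hypothetical names used only to show the mechanism; they are not part of dict.h.

```c
/* Hypothetical prefixed symbol: this is the name the linker sees. */
int htdemoSum(int a, int b) {
    return a + b;
}

/* Call sites keep the short name. The preprocessor rewrites each call to the
 * prefixed symbol, so no object file defines or references `demoSum`. */
#define demoSum(a, b) htdemoSum(a, b)
```

A call such as `demoSum(2, 3)` compiles to a call to `htdemoSum`. One limitation of this approach: a function-like macro only expands when the name is followed by `(`, so using `demoSum` as a function pointer would not be rewritten and would fail to compile.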

Signed-off-by: Jim Brunner <brunnerj@amazon.com>
Signed-off-by: Jim Brunner <brunnerj@amazon.com>
@JimB123 JimB123 marked this pull request as ready for review April 28, 2026 01:12
@rainsupreme rainsupreme left a comment

LGTM!

@zuiderkwast zuiderkwast left a comment

Much effort for very little gain IMHO, but I won't refuse it. :)

Comment thread src/Makefile
db.o \
debug.o \
defrag.o \
dict_ht.o \
Don't mix spaces and tabs.

Comment thread src/dict_ht.c


dict *htdictCreate(dictType *type) {
type->entryGetKey = htdictEntryGetKey;
This manipulates the dictType, which isn't owned by this dict instance. It's shared among multiple dicts. It isn't marked as const, but still...

Is this ugliness the cost this PR is paying for removing some other uglinesses?

I guess it can be OK, but it's worth mentioning. :)
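
To make the concern concrete, here is a small sketch of the same shape: a create function that writes into a caller-supplied type struct. `demoType`, `demoDict`, and the functions are hypothetical stand-ins, not the real dictType or htdictCreate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a type descriptor shared by many containers. */
typedef struct demoType {
    const void *(*entryGetKey)(const void *entry);
} demoType;

typedef struct demoDict {
    demoType *type; /* borrowed pointer, not owned by the dict */
} demoDict;

/* Toy callback: entries are their own key. */
static const void *demoEntryKey(const void *entry) {
    return entry;
}

/* Mirrors the reviewed line: create writes into the caller's type struct. */
demoDict *demoCreate(demoType *type) {
    assert(type != NULL);
    type->entryGetKey = demoEntryKey; /* side effect visible to every sharer */
    demoDict *d = malloc(sizeof(*d));
    if (d != NULL) d->type = type;
    return d;
}
```

In this sketch every call stores the same pointer, so the write is idempotent and harmless in practice. The design cost is that the type descriptor can never be declared `const`, and that a function named "create" has a side effect on an object it does not own.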

Comment thread src/dict_ht.c
#define UNUSED(V) ((void)V)

/* Callback for dictType.entryGetKey, which expects void pointers. */
const void *htdictEntryGetKey(const void *entry) {
It can be marked as static, right?
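
As a general C point, a function that is only ever reached through a function pointer can have internal linkage, provided the pointer is installed in the same translation unit. A minimal sketch with hypothetical names (`demoCallbacks`, `demoPair`, `demoInstallCallbacks`), assuming nothing outside the file calls the callback by name:

```c
#include <assert.h>
#include <string.h>

typedef struct demoCallbacks {
    const void *(*getKey)(const void *entry);
} demoCallbacks;

typedef struct demoPair {
    const char *key;
    int value;
} demoPair;

/* Internal linkage: not visible by name outside this translation unit. */
static const void *demoPairGetKey(const void *entry) {
    return ((const demoPair *)entry)->key;
}

/* Only the installer needs external linkage. */
void demoInstallCallbacks(demoCallbacks *cb) {
    assert(cb != NULL);
    cb->getKey = demoPairGetKey;
}
```

Callers in other files can still invoke `cb->getKey(...)`, because `static` restricts the name, not the address.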

Comment thread src/dict_ht.c
Wouldn't it be nicer if the filename and the function prefix (htdict) matched? E.g.

  • htdictCreate/htdict.c
  • dictHtCreate/dict_ht.c
  • dicthtCreate/dictht.c.
