Skip to content

feat: add Sieve algorithm#220

Draft
kurtextrem wants to merge 1 commit intoYomguithereal:masterfrom
kurtextrem:master
Draft

feat: add Sieve algorithm#220
kurtextrem wants to merge 1 commit intoYomguithereal:masterfrom
kurtextrem:master

Conversation

@kurtextrem
Copy link
Copy Markdown

As I've prepared the code based on the LRU data structure from mnemonist already for my own purposes, I figured I'd open a draft PR so anyone can take a look or do further improvements/iterations on the code (or tests with it).

As written in #212, in my benchmark, the Map() implementation comes out on top (with strings as keys).

@Yomguithereal
Copy link
Copy Markdown
Owner

Thanks @kurtextrem. I will merge your PR as soon as I start working on the SIEVE cache. It's just that I am swamped currently.

@kurtextrem
Copy link
Copy Markdown
Author

no worries, take your time.

@NeoPhi
Copy link
Copy Markdown

NeoPhi commented Feb 17, 2024

@kurtextrem I'm probably doing something wrong in my benchmark. Running this implementation of SieveMap I'm seeing much worse hit ratio compared to your js-sieve implementation, my own simple Sieve implementation, or mnemonist/lru-map. Output of hit % on a sample K/V trace.

Sieve 94.69188501697717%
js-sieve 94.69199169100739%
SieveMap 86.2063926088917%
LRUMap 94.2229459801187%

Sample code for the above can be found here: https://github.com/NeoPhi/cache-playground

@kurtextrem
Copy link
Copy Markdown
Author

@NeoPhi Cool stuff, thank you for doing this!
From looking at your implementation, it looks like you're not using a doubly-linked list. From reading the Sieve paper I had the impression it's a must, but apparently it isn't? This will definitely also be much faster in execution than my js-sieve implementation (at least as long as hash collisions in the map stay unexpensive to resolve).

The differences in hit ratio are surprising to me, but I can't really tell you what's off. I took the example code referenced by the sieve paper and fixed some JS errors (#212 (comment)).

@NeoPhi
Copy link
Copy Markdown

NeoPhi commented Feb 18, 2024

I'm relying heavily on the iterator semantics of the JS Map. which has edge cases compared to a proper doubly linked list implementation, hence the small discrepancy in hit rates and why your implementation is slightly higher. (There was a bug in my hand implementation, hit count is identical).

I think the proposed code changes linked to from #212 (comment) are buggy. I tried a simple use case and it looks like the internal data structures are getting corrupted:

import SieveMap from "../mnemonist/sieve-map.js";
const test = new SieveMap(3);
test.set('A', 'A');
test.set('B', 'B');
test.set('C', 'C');
console.log(Array.from(test.entries()));
// Output
// [ [ 'C', 'C' ], [ undefined, undefined ], [ undefined, undefined ] ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants