Skip to content

Encapsulate database reading/writing in a lazy container#188

Open
RobinMcCorkell wants to merge 3 commits intoHolzhaus:mainfrom
RobinMcCorkell:database
Open

Encapsulate database reading/writing in a lazy container#188
RobinMcCorkell wants to merge 3 commits intoHolzhaus:mainfrom
RobinMcCorkell:database

Conversation

@RobinMcCorkell
Copy link
Copy Markdown
Contributor

@RobinMcCorkell RobinMcCorkell commented Oct 30, 2025

See the tests for example usage. This replaces header.read_pages() with a lazily-loaded Database that encapsulates the header and a vector of pages, some of which may be loaded into memory. A page can be loaded with load_page() which takes a page index, and returns a mutable reference to the page.

I added an iterator API to iterate over pages and rows, though this requires some unsafe code due to the lifetime requirements of the iterator trait (the standard library also uses unsafe for iterators for the same reason btw). It's not ideal but better than the other alternatives I tried, which either required higher-kinded types which proved difficult to use or severely limited the ergonomics.

Other changes of note:

  • Alignment/padding of rows has been removed, as it turns out this was writing zeros over the top of actual data and not just seeking to the padded byte. I left comments describing the alignment requirements for the future.
  • Unknown row and page types have been removed, so that parsing an unknown page is a hard error. This aids debugging, and we don't need to load pages that are unknown with this API anyway.
  • The History plain page type enum has been commented out since we don't have a History row type so having this is pointless and complicates table loading (code might assume we understand the table because of the page type then find we can't actually parse it)

@RobinMcCorkell RobinMcCorkell force-pushed the database branch 2 times, most recently from 03c8f30 to f28824c Compare October 31, 2025 11:01
@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

@Swiftb0y @Holzhaus thoughts on this? I now have database writing working correctly based on this design too, ready for a followup PR.

@Swiftb0y
Copy link
Copy Markdown
Collaborator

I have no objections to the concept, though since rekordcrate is already quite fast and memory is cheap, this primarily seems like premature optimization. Did you encounter a concrete problem that this solves?

@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

The current API with read_pages loads just the pages for a given table, so I assumed there was value in selectively loading the DB. If we don't care about that then we should eagerly load and persist everything, I agree.

@Holzhaus
Copy link
Copy Markdown
Owner

Holzhaus commented Nov 17, 2025

Hey @RobinMcCorkell, thank you for the working on this and sorry for the delay. I will give this a try this weekend.

Comment thread src/pdb/mod.rs Outdated
Comment thread src/pdb/mod.rs Outdated
Comment thread tests/tests_pdb.rs.in Outdated
@RobinMcCorkell RobinMcCorkell force-pushed the database branch 2 times, most recently from 5270bd9 to ffc3b08 Compare December 24, 2025 04:12
@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

@acrilique could you take a look at this proposed change? This idea predates your high level API in #210, and should capture database writes as well as reads. I don't know the best way to integrate this idea with the high level API though, WDYT?

@acrilique
Copy link
Copy Markdown
Contributor

acrilique commented Dec 24, 2025

I like this change and I think the high level API could benefit from it. This Database (although I don't like the name as it kinda semantically overlaps with Pdb) is more low-level, managing i/o with lazy loading which is cool. The Pdb stuff is more high level and intended for common use cases of users of the library. I really think both can be integrated, and can give it a try myself if you want (e.g. I could PR to your branch).

The only thing that bugs me is the naming. Could we call this lower-level one something like PdbFile to be clearer about which one is responsible for i/o? Also if someone has a better naming scheme in mind I'm all ears

@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

I have integrated this with the high-level API by adding methods to iterate over row variants and keeping a free function to post-process the playlists into the high-level structure. I chose to leave the naming (removing Pdb which is no longer needed) since the object really does represent a handle to the database, one that can be opened and closed.

Comment thread src/device.rs Outdated
@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

@acrilique would appreciate a fresh review on this with the latest commits

@acrilique
Copy link
Copy Markdown
Contributor

Sorry Robin, I've been a bit busy during the holidays and now I'm back to college w exams and stuff. I'll try to take a look these days, if @Holzhaus or @Swiftb0y don't do it before me.
Anyway I feel like we shouldn't be in a hurry to merge this specific one (given the fact that it's an optimization+refactor), and just having it here as a proof of concept is already nice.

IMO the next natural step would be the db-from-scratch generation, as reexports (both with and without the lazy) were already working for me in pioneer cdj players when I tested in mid-december.
Would love to hear what you think

@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

The generation code I'm working on relies on this database API, hence why I'm keen to get this in first. There's a lot of extra complication when generating a database like generating offsets and choosing how many rows can fit

Copy link
Copy Markdown
Contributor

@acrilique acrilique left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments/questions but it looks very good already, and the unsafe bit makes sense and is well documented.

Comment thread src/main.rs
Comment thread src/pdb/mod.rs Outdated
Copy link
Copy Markdown
Contributor

@acrilique acrilique left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found the time to do a more thorough review. Left a few comments. ptal

Comment thread src/lib.rs Outdated
Comment thread src/main.rs
Comment thread src/pdb/io.rs Outdated
Comment thread src/pdb/mod.rs
Comment thread src/pdb/io.rs Outdated
@RobinMcCorkell RobinMcCorkell force-pushed the database branch 3 times, most recently from 85fa5b6 to e658221 Compare March 5, 2026 09:57
@acrilique
Copy link
Copy Markdown
Contributor

A bit more wrangling and I've removed all the unsafe code!

Sorry @RobinMcCorkell I still see the unsafe code blocks, I assume it's bc u haven't pushed yet?

@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

@acrilique indeed I missed the push, check now

Copy link
Copy Markdown
Contributor

@acrilique acrilique left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this LGTM. Hope the maintainers can take a look and merge at some point.

I'll start working on implementing a SerializedSize trait for row types to keep pushing on the serialization effort.

@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

I already have something in the works for that too: #220

Comment thread src/pdb/mod.rs
Comment thread src/main.rs
@acrilique
Copy link
Copy Markdown
Contributor

Hi @RobinMcCorkell @Holzhaus @Swiftb0y can we rebase and merge this one? Or at least get some fresh criticism from the maintainers as it was last reviewed a while ago.

I have already written a program (which depends on rekordcrate with this PR and a few extra changes by me) that generates a fresh database from scratch that can be then read by a CDJ. So we've already come a long way

@RobinMcCorkell RobinMcCorkell force-pushed the database branch 3 times, most recently from e2ea919 to 422964d Compare April 1, 2026 21:21
@RobinMcCorkell
Copy link
Copy Markdown
Contributor Author

@Holzhaus @Swiftb0y this is ready for review/merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants