-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix IDBFS autoPersist on mkdir #24799
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix IDBFS autoPersist on mkdir #24799
Conversation
src/lib/libidbfs.js
Outdated
@@ -85,10 +85,14 @@ addToLibrary({ | |||
if (n.memfs_stream_ops.close) return n.memfs_stream_ops.close(stream); | |||
}; | |||
|
|||
// Persist IndexedDB right away if the node we just created is a directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why this is needed? (Either here or in the PR description)
Also, is there some kind of test we could write for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the PR description and the commit message mention, FS.mkdir
uses mknod
under the hood:
Lines 739 to 748 in 1d05ff8
mkdir(path, mode = 0o777) { | |
mode &= {{{ cDefs.S_IRWXUGO }}} | {{{ cDefs.S_ISVTX }}}; | |
mode |= {{{ cDefs.S_IFDIR }}}; | |
#if FS_DEBUG | |
if (FS.trackingDelegate['onMakeDirectory']) { | |
FS.trackingDelegate['onMakeDirectory'](path, mode); | |
} | |
#endif | |
return FS.mknod(path, mode, 0); | |
}, |
The node
that we return would otherwise only call IDBFS.queuePersist
when an open handle/stream would get closed, which does not happen for creating a directory as evidenced by the implementation of FS.mkdir
. It is clear that the intent here was to persist IDBFS on directory creation, so I added code that calls IDBFS.queuePersist
if mknod
was called with a mode of directory.
I will try to write a test for this behavior a bit later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that makes sense.
How about updating the comment/description to say something like. "Force a persist now since there is not corresponding close
call for newly creating directories".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the same problem exist for folks that directly call mknod
to make files? (i.e. they never get closed, so the persist is never triggered?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it looks like the persistence should probably only be enabled if autoPersist
is set, so maybe we just need to add mknod
to list of audoPersist
things by the // Also kick off persisting the filesystem on other operations that modify the filesystem
comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the same problem exist for folks that directly call
mknod
to make files? (i.e. they never get closed, so the persist is never triggered?)
Actually yes, that is a good point. I removed the check for file mode and made it so that an mknod
invocation will queue a persist operation immediately. I believe that without this, files created via open()
would not get persisted without a close()
call either.
Actually it looks like the persistence should probably only be enabled if
autoPersist
is set, so maybe we just need to addmknod
to list ofaudoPersist
things by the// Also kick off persisting the filesystem on other operations that modify the filesystem
comment.
mknod
patching is already inside an if
statement that checks whether autoPersist
is set. I can remove the IDBFS.queuePersist
call from the patched and add a line that patches mknod
again if you like.
Also, is there some kind of test we could write for this?
I fixed the fact that the test wasn't checking whether all write operations (in the sense that they write to the filesystem, not literally write()
) call IDBFS.queuePersist
(I tested that the test now fails with my fix to mknod
reverted). However, I only fixed it for mkdir
, and the fix is somewhat hacky. Ideally we want to be sure that every write operation is tested, but splitting the test
function at every write operation would make an unmaintainable mess. JSPI would help a lot (since it uses stack switching under the hood without requiring any changes to the code), but I'm not sure how to integrate it and it's not very well supported by browsers at the moment.
d034d12
to
0d7f63b
Compare
test/fs/test_idbfs_sync.c
Outdated
struct stat st; | ||
|
||
// a directory using which we will check if auto-persistence is working | ||
if ((stat("/working1/dir2", &st) != -1) || (errno != ENOENT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this stat
fail if queuePersist
does not happen?
I would have thought it would work in any cases since the in-memory FS contains this directory, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the first part of the test, so it checks for non-existence of the directory before creating it. It's still the "filesystem seeding" part of code as before, now it's just split before mkdir so we can make sure mkdir calls IDBFS.queuePersist
. See the part after #else
for the code that checks whether /working1/dir2
actually got persisted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI I also updated the commit message and the PR description with a more thorough explanation of why this is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it a little hard to understand with the #ifdef FIRST
straddling the function boundary here. I wonder if we could structure this to avoid that.
Also I opened #24820 to make this test more readable in general. The second one to land will need to deal with some conflicts I guess. I'm happy to wait if you prefer.
test/fs/test_idbfs_sync.c
Outdated
// Run the remainder of the test after all already scheduled JavaScript | ||
// tasks have executed (such as the one scheduled by IDBFS.queuePersist). | ||
// This is to ensure that mkdir also triggers a call to queuePersist and | ||
// to isolate mkdir from queuePersist invocations made by other syscalls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I still don't understand why this is needed. Why can't we run the remainder of this test directly? How can we tell that the mkdir triggers a call to queuePersist
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a write operation is invoked, it calls IDBFS.queuePersist()
, which schedules a sync. The important thing is that the sync operation doesn't happen synchronously, partly because of the intention to buffer up the writes and sync them all in one go via the usage of setTimeout
to schedule the sync operation, partly because the IndexedDB APIs are asynchronous themselves so even if setTimeout
wasn't used this would still be the case.
setTimeout
:
emscripten/src/lib/libidbfs.js
Lines 36 to 41 in 9886169
// Programs typically write/copy/move multiple files in the in-memory | |
// filesystem within a single app frame, so when a filesystem sync | |
// command is triggered, do not start it immediately, but only after | |
// the current frame is finished. This way all the modified files | |
// inside the main loop tick will be batched up to the same sync. | |
mount.idbPersistState = setTimeout(startPersist, 0); |
Asynchronous code (which isn't limited to this bit of code, it's just how the IndexedDB API works):
emscripten/src/lib/libidbfs.js
Lines 100 to 113 in 9886169
syncfs: (mount, populate, callback) => { | |
IDBFS.getLocalSet(mount, (err, local) => { | |
if (err) return callback(err); | |
IDBFS.getRemoteSet(mount, (err, remote) => { | |
if (err) return callback(err); | |
var src = populate ? remote : local; | |
var dst = populate ? local : remote; | |
IDBFS.reconcile(src, dst, callback); | |
}); | |
}); | |
}, |
That means--and this might be an incorrect assumption on my part--that while the synchronous C code is running (or rather while the JS that invokes the Wasm stuff is running), the JavaScript event loop, the "main thread", is blocked. No setTimeout
or promise callbacks can run, the page is frozen, everything is waiting for the current task to stop executing and yield so that execution can switch elsewhere (e.g. back into userland code like setTimeout
, promise, IndexedDB, or other async API callbacks, or yield into the browser).
The implication of that is, if we don't yield between write operations, they will be executed within a single task. If any of those write operations call IDBFS.queuePersist()
, it will schedule a task to run after ours, which will sync the whole memory FS (in the state after we yielded from C) to IDB. That means that if just one of the write operations call IDBFS.queuePersist()
, all of the write calls that happened synchronously during the same task will get persisted.
This is bad for testing because we want to ensure that every write operation calls IDBFS.queuePersist()
. I'm not entirely sure what is the best way to check for that. The way I did it here is specific to the mkdir
bug--I made sure that no IDBFS.queuePersist
was scheduled at the time of the mkdir
call by letting the JS engine run pending tasks (one of which is the queuePersist
one). Ideally we want to test every write operation. The end-to-end approach could be to make a write call, then inside setTimeout
call syncfs(true)
, then call stat
. Right now I assume the Emscripten runtime is teared down and restarted after the first half of the test is done and before the second half starts.
Does this make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think that makes sense.
What you are trying to do isolute the mkdir so that it doesn't run alongside any other operations.
How about we add another phase to the test. So you have FIRST and SECOND. In the SECOND phase you could call only mkdir. After the second phase, without your fix, the mkdir would be lost I assume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about making a separate test with two phases each for each write operation that should auto-persist? I think it makes sense since we are trying to verify each write operation's correctness, and it might be nicer than arbitrarily splitting an existing test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the second phase, without your fix, the mkdir would be lost I assume?
Yup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about making a separate test with two phases each for each write operation that should auto-persist? I think it makes sense since we are trying to verify each write operation's correctness, and it might be nicer than arbitrarily splitting an existing test.
I'm not sure it worth creating that many new tests.. unless we can do it in way that its expressed clearly and concisely. I'd rather not have N different tests here if we can avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I've pushed a commit that adds a separate test. I tried to keep it clean. It seems worth it to me to have a test that actually checks if every operation persists or not. What do you think?
9a22905
to
85c3250
Compare
FS.mkdir does not use mkdir under the hood but rather mknod. The patched version of mknod installed by IDBFS only invokes IDBFS.queuePersist() when the node's stream is closed, which never happens for mkdir as the handle is immediately discarded and no stream is ever created. Furthermore, the bare fact of file/directory creation should be persisted even if no writes are made to the node. The test doesn't catch this because reconcilation happens in a JavaScript task, and no tasks or microtasks can execute while code is executing. IDBFS.syncfs only has a chance to run after the entirety of the test function finishes executing, at which point all of file/directory operations have been executed and persisted in memory. As long as one of those operations calls IDBFS.queuePersist, it ensures that the whole memory filesystem will get persisted to IDB and therefore shadows missing calls to IDBFS.queuePersist in other operations. Remove the mkdir node op patching (which has never been a real node op) and make mknod queue a persist operation as soon as the node is created without waiting for the caller to interact with a stream.
85c3250
to
8e5e9f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the complexity of the new test, but I think we can reduce some of it.
test/fs/test_idbfs_autopersist.c
Outdated
#ifdef TEST_PREPARE | ||
prepare(); | ||
goto end; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of goto end
here can you just use #else
and have #endif
instead of the end label?
test/test_browser.py
Outdated
@@ -1353,6 +1353,32 @@ def test_fs_idbfs_sync(self, args): | |||
print('done first half') | |||
self.btest_exit('fs/test_idbfs_sync.c', cflags=['-lidbfs.js', f'-DSECRET="{secret}"', '-sEXPORTED_FUNCTIONS=_main,_test,_finish', '-lidbfs.js'] + args) | |||
|
|||
def test_fs_idbfs_autopersist(self): | |||
common_cflags = ['-lidbfs.js', '-sEXPORTED_FUNCTIONS=_main,_test,_finish'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just do self.cflags +=
here instead of having common_cflags
.
test/test_browser.py
Outdated
] | ||
|
||
print('preparing') | ||
self.btest_exit('fs/test_idbfs_autopersist.c', cflags=common_cflags + get_defines(None, 0) + ['-DTEST_PREPARE']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use if/else/endif
as mentioned about you can just remove get_defines(None, 0)
completely here.
Then you can do self.cflags += f'-DTEST_CASE={test_cases.index(current_case_name)}
here on this line and remove the get_defines
function completely.
Then you can just do cflags=[f'-DTEST_PHASE=1]'
and cflags=[f'-DTEST_PHASE=2']
below.
test/test_browser.py
Outdated
'unlink', | ||
'rename', | ||
'mkdir', | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this list here you can use parameterize
to have the case
case passed in as an argument.
Then each case gets its own test that can be run and debugged on its own.
@@ -54,7 +54,7 @@ addToLibrary({ | |||
// If the automatic IDBFS persistence option has been selected, then automatically persist | |||
// all modifications to the filesystem as they occur. | |||
if (mount?.opts?.autoPersist) { | |||
mnt.idbPersistState = 0; // IndexedDB sync starts in idle state | |||
mount.idbPersistState = 0; // IndexedDB sync starts in idle state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a separate bug? What was the effect of this bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The effect wasn't observable to users because the code doesn't strictly check for equality with 0
and undefined
is considered falsy:
emscripten/src/lib/libidbfs.js
Line 35 in 052cd1a
if (!mount.idbPersistState) { |
This fix is needed for the runOnceIDBFSIdle
test helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the complexity of the new test, but I think we can reduce some of it.
8e5e9f8
to
de579b9
Compare
test/test_browser.py
Outdated
def test_fs_idbfs_autopersist(self, test_case, phase_count): | ||
self.cflags += ['-lidbfs.js', '-sEXPORTED_FUNCTIONS=_main,_test,_finish', f'-DTEST_CASE=TEST_CASE_{test_case}'] | ||
for phase in range(phase_count): | ||
self.btest_exit('fs/test_idbfs_autopersist.c', cflags=[f'-DTEST_PHASE={phase+1}']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better!
Pushed new code with your suggestions, except now that the test is parameterized there isn't a clean way to run some code ( |
de579b9
to
564874d
Compare
564874d
to
ab29008
Compare
FS.mkdir does not use mkdir under the hood but rather mknod. The patched version of mknod installed by IDBFS only invokes IDBFS.queuePersist() when the node's stream is closed, which never happens for mkdir as the handle is immediately discarded and no stream is ever created.
Furthermore, the bare fact of file/directory creation should be persisted even if no writes are made to the node.
The test doesn't catch this because reconcilation happens in a JavaScript task, and no tasks or microtasks can execute while code is executing. IDBFS.syncfs only has a chance to run after the entirety of the test function finishes executing, at which point all of file/directory operations have been executed and persisted in memory. As long as one of those operations calls IDBFS.queuePersist, it ensures that the whole memory filesystem will get persisted to IDB and therefore shadows missing calls to IDBFS.queuePersist in other operations.
Remove the mkdir node op patching (which has never been a real node op) and make mknod queue a persist operation as soon as the node is created without waiting for the caller to interact with a stream.
Add a separate test that runs every operation that modifies the filesystem in isolation and ensures the said operation calls
IDBFS.queuePersist()
.