-
Notifications
You must be signed in to change notification settings - Fork 3.5k
WasmFS JS API: Implement truncate #19543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WasmFS JS API: Implement truncate #19543
Conversation
| assert(s.st_size == 8); | ||
|
|
||
| EM_ASM( | ||
| FS.truncate('truncatetest', 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with all these changes it would be good to look for other tests that already use this API, if any? If we were previously disabling, or not running, those tests under WASMFS it would be good to enable them now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see anything explicitly calling FS.truncate() in the test dir, so I added them to test_fs_js_api for now. Are there any places besides the test directory that could potentially contain disabled tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it would only be in the test dir. I guess we had not tests for that API. Thanks for looking into it. Hopefully after we are done with this we will have much better test coverage for the JS API!
src/library_wasmfs.js
Outdated
| truncate: (path, len) => { | ||
| if (len < 0) { | ||
| throw new FS.ErrnoError({{{ cDefs.EINVAL }}}); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If one calls truncate("foo", -1) I would probably expect it to try to resize of MAX_INT or something like that.. so maybe we don't need this check and we can just let the underlying call fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I was thinking, but I had one test that tried calling truncate with -10. I tried printing out the value the syscall receives, which was 4294867286. But instead of setting the return value to -1 and setting errno accordingly, I think the syscall aborts.
Aborted()
RuntimeError: Aborted(). Build with -sASSERTIONS for more info.
at abort (/home/ubuntu/Documents/emsdk/emscripten/out/test/test_fs_js_api.js:1:5674)
at _abort (/home/ubuntu/Documents/emsdk/emscripten/out/test/test_fs_js_api.js:1:14749)
at <anonymous>:wasm-function[187]:0xd2a9
at <anonymous>:wasm-function[197]:0xe13e
at <anonymous>:wasm-function[172]:0xba9a
at <anonymous>:wasm-function[178]:0xce81
at _wasmfs_truncate (<anonymous>:wasm-function[261]:0x13101)
This is the error. It doesn't seem to match the pattern of how we've been returning JS API errors before (Normally the syscall returns -1 and I can return errno to JS, then throw an error from JS with an error name and code). Is this close to what you were saying by letting the call fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we don't want to syscall aborting like that. If you build with -g or --profiling-funcs you should get functions names for those wasm functions in the backtrace BTW.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is strange because doTruncate does return -EINVAL when its input is negative. It would be good to get to the bottom of this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that doTruncate takes in off_t size, but truncate64 and ftruncate64 take in size as a uint64_t, so is it possible that the size is always positive in doTruncate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is needed for the Node backend, as it silently accepts negative offsets rather than failing with EINVAL, see: nodejs/node#35632.
Though, it looks like setSize() is currently not implemented for the Node backend:
emscripten/system/lib/wasmfs/backends/node_backend.cpp
Lines 120 to 122 in 48b0645
| int setSize(off_t size) override { | |
| WASMFS_UNREACHABLE("TODO: implement NodeFile::setSize"); | |
| } |
(i.e. this PR won't work for
-sNODERAWFS + -sWASMFS)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's really interesting. I don't think I've been testing with NODEFS or NODERAWFS. Should I start adding those flags to test_fs_js_api?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If setSize() (and insertMove() due to FS.rename()) is supported in WasmFS' Node backend (which is probably better as a follow-up), you could do:
--- a/test/test_core.py
+++ b/test/test_core.py
@@ -6017,6 +6017,7 @@ Module = {
def test_fs_writeFile(self):
self.do_run_in_out_file_test('fs/test_writeFile.cpp')
+ @also_with_noderawfs
def test_fs_js_api(self):
self.set_setting("FORCE_FILESYSTEM")
self.do_runf(test_file('fs/test_fs_js_api.c'), 'success')After that, you can verify with:
$ ./test/runner wasmfs.test_fs_js_api* core2.test_fs_js_api*Whether the test passes on both memory and Node backend, either with or without WasmFS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After going back to the problem this morning, I realized I never checked that the right value was getting passed to js_api.cpp. I originally had _wamsfs_truncate in js_api.cpp take in size as type off_t, and for some reason negative values would not be passed in correctly. After changing off_t to long, the tests pass without making any changes to syscalls.cpp. I also tried passing in an int64_t to js_api.cpp, but that didn't seem to work. I initially wanted to use off_t to match the POSIX API, but could there be a potential problem with using that type in js_api.cpp?
Out of curiosity, I tried printing the values in syscalls.cpp, and I could print uint64_t as a signed integer using %lld. When the uint64_t then gets passed to doTruncate (Which takes in size as type off_t), does C automatically force the uint64_t to become signed? What if the unsigned value is larger than the signed max?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
off_t is defined here:
emscripten/system/lib/libc/musl/arch/emscripten/bits/alltypes.h
Lines 240 to 243 in 48b0645
| #if defined(__NEED_off_t) && !defined(__DEFINED_off_t) | |
| typedef _Int64 off_t; | |
| #define __DEFINED_off_t | |
| #endif |
So, it corresponds to int64_t. I just opened PR #19559 for this.
|
Regarding the syscall interface, I think for consistency all the native syscalls take primitive types.. see system/lib/libc/musl/arch/emscripten/syscall_arch.h. If changing the types of those functions to |
| if (err == -1) { | ||
| return errno; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Syscalls don't set errno, so this doesn't look right. Syscalls return negative error codes and libc handles those by setting errno appropriately. I would expect this code to be checking whether err is a negative error code (not necessarily -1) instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that errno is being set by libc behind the scenes? I've been doing this for the other js_api.cpp functions and the error codes match with the legacy API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no, all those places should be fixed. For example, in _wasmfs_mknod, it checks whether __syscall_mknodat returns -1, and if so returns errno. But __syscall_mknodat never sets errno and doesn't return -1 on errors; instead, it either returns -EINVAL or -EPERM or other negative error codes that doOpen can return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what went wrong now. The whole time I thought the syscalls were returning -1, so the errno values were returning correctly - turns out the if statement is always false, so the methods were just returning the syscall return value and I never checked the actual return values of the syscalls before, which made all the tests pass. Those issues have all been addressed in this PR.
tlively
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % separable comment.
system/lib/wasmfs/js_api.cpp
Outdated
| int _wasmfs_truncate(char *path, long size) { | ||
| return __syscall_truncate64((intptr_t)path, size); | ||
| } | ||
|
|
||
| int _wasmfs_ftruncate(int fd, long size) { | ||
| return __syscall_ftruncate64(fd, size); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there's no extra logic necessary here, is it possible to skip these wrappers and just call the underlying syscall directly from the JS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, although I'm having trouble figuring out how to import the syscalls to library_wasmfs.js. I've tried using __syscall_truncate64 and ___syscall_truncate64 (Three underscores) and neither worked. Normally, I would add _wasmfs_truncate to emcc.py, but I don't think it worked. How should I go about making the syscall available to call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that could be a followup.
When calling __syscall_truncate64 from JS you would use an extra underscore... but when adding it to the settings.REQUIRED_EXPORTS list in emcc.py you would not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this as a follow-up sounds good to me.
|
As a side note, should we worry about the issue with using |
|
I'm hoping we can land #19559 today BTW.. since that tests already passed. |
emcc.py
Outdated
| '_wasmfs_pread', | ||
| '_wasmfs_symlink', | ||
| '__syscall_truncate64', | ||
| '__syscall_ftruncate64', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason to use the raw syscalls and not a _wasmfs_* wrapper? Wrappers have the benefit that changes to the syscalls don't affect them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thomas suggested a little earlier that since truncate doesn't have any additional processing, it might make sense to directly use the raw syscall. However, I do see your point that a wrapper would be beneficial. Would it be preferable to go back to calling the wrapper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now you should continue to use wrapper functions and we can separately/later consider weather they are actually needed since (IIUC) these are not the only once that are simple pass-through functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't block landing anything on this conversation, but I still think pass-through wrappers aren't that useful. If we update the syscalls (which should be extremely rare), then we would still have to update their callers. Whether those callers are in C or JS doesn't make much of a difference as far as I can tell.
system/lib/wasmfs/js_api.cpp
Outdated
| return -errno; | ||
| } | ||
| return err; | ||
| return __syscall_openat(AT_FDCWD, (intptr_t)path, flags, mode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these bugfixes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We realized that this check isn't needed, since the syscalls do not actually set errno. The tests were passing before because err was never equal to -1, so the js_api.cpp function directly returned the error without using errno, giving the desired effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make a separate PR for this cleanup stuff?
This PR implements
truncateandftruncate. Tests were also added totest_fs_js_api.One thing I noticed was that both the man pages and
doTruncatesyscall note thattruncateandftruncatetake in size as typeoff_t, but__syscall_truncate64and__syscall_ftruncate64take in size as auint64_t. This means that passing in negative numbers directly to__syscall_truncate64and__syscall_ftruncate64gives a weird number and no error, so I had to add a manual check inlibrary_wasmfs.js, which I didn't do for the other methods. Was there a specific reason that the two syscalls take inuint64_tinstead ofoff_t?