WARNING: this crate is experimental and even careful use is likely undefined behavior.
This crate exposes four C standard library functions to Rust:
pub fn setjmp(env: *mut jmp_buf) -> c_int;
pub fn sigsetjmp(env: *mut sigjmp_buf, savesigs: c_int) -> c_int;
pub fn longjmp(env: *mut jmp_buf, val: c_int) -> c_void;
pub fn siglongjmp(env: *mut sigjmp_buf, val: c_int) -> c_void;
as well as the jmp_buf and sigjmp_buf types needed to use them.
See setjmp(3) for details and caveats. Also see RFC #2625.
To interact better with C code that may use setjmp()/longjmp():
- If C code calls rust code, and the rust code calls C code, and a longjmp() happens, you may want the rust code to catch the longjmp(), transform it into a panic (to safely unwind), then catch_unwind(), then turn it back into a longjmp() to return to someplace in the C code (the last place that called setjmp()).
- If rust code calls C code, the rust code might want to catch a longjmp() from the C code and handle it somehow.
- Rust code might want to longjmp() to return control to C code.
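
The middle of the first scenario (turn the jump into a panic, unwind, catch it, recover the jump value) can be sketched in safe rust without calling setjmp() or longjmp() at all. This is a minimal sketch under assumptions: JumpRequest, raise_as_panic, and catch_jump are hypothetical names that are not part of this crate, and the real setjmp()/longjmp() calls would sit where the comments indicate.

```rust
use std::os::raw::c_int;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical panic payload that remembers the value passed to longjmp().
struct JumpRequest(c_int);

/// Stand-in for the point where rust code has caught a longjmp() from C
/// (not shown) and re-raises it as a panic so rust frames unwind normally.
fn raise_as_panic(val: c_int) -> ! {
    // resume_unwind starts unwinding without invoking the panic hook.
    panic::resume_unwind(Box::new(JumpRequest(val)))
}

/// Runs `body`. Returns Some(val) if it unwound with a JumpRequest, in which
/// case the caller would call longjmp(env, val) to re-enter C (not shown).
fn catch_jump<F: FnOnce()>(body: F) -> Option<c_int> {
    match panic::catch_unwind(AssertUnwindSafe(body)) {
        Ok(()) => None,
        Err(payload) => match payload.downcast::<JumpRequest>() {
            Ok(request) => Some(request.0),
            // Not a jump request: keep unwinding with the original payload.
            Err(other) => panic::resume_unwind(other),
        },
    }
}
```

With these definitions, catch_jump(|| raise_as_panic(7)) yields Some(7), and a body that returns normally yields None. Carrying the value in a dedicated payload type keeps ordinary panics distinguishable from jump requests.
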
It is possible to use setjmp()/longjmp() just for managing
control flow in rust (without interacting with C), but that would be
quite dangerous and has no clear use case.
Ordinarily, using a C function from rust is easy: you just declare it. Why go to the trouble of making a special crate?
- Document the numerous problems and caveats, as done in this document.
- Explore the problem space enough that the rust language team might feel comfortable defining the behavior (in at least some narrow circumstances).
- Provide tests to see if something breaks in an obvious way.
- Handle some platform issues:
  - The jmp_buf and sigjmp_buf types are not trivial and are best defined using bindgen on the system's <setjmp.h> header.
  - libc implementations often use macros to change the symbols actually referenced, and this is done differently on different platforms. For instance, instead of sigsetjmp the actual libc symbol might be __sigsetjmp, and there may be a macro to rewrite the sigsetjmp() call into __sigsetjmp().
The invocation of setjmp can appear only in the following contexts (see this comment):
- the entire controlling expression of match, e.g. match setjmp(env) { ... }.
- if setjmp(env) $integer_relational_operator $integer_constant_expression { ... }
- the entire expression of an expression statement: setjmp(env);
See tests for examples.
Beyond the many challenges using setjmp/longjmp in C, there are
additional challenges using them from rust.
- The behavior of these functions is defined in terms of C, and therefore any application to rust is by analogy (until rust defines the behavior).
- Rust has destructors, and C does not. Any longjmp() must be careful not to jump over any stack frame that owns values with destructors.
- Rust doesn't have a concept of functions that return multiple times, like fork() or setjmp(), so it's easy to imagine that rust might generate incorrect code around such a function.
- Rust uses LLVM during compilation, which needs to be made aware of functions that return multiple times by using the returns_twice attribute; but rust has no way to propagate that attribute to LLVM. Without this attribute, it's possible that LLVM itself will generate incorrect code (see this comment).
- Jumping can interrupt well-bracketed control flow, circumventing guarantees about what code has run.
- Jumping can return control to a point before a value was moved, thereby allowing use-after-drop bugs.
- Jumping deallocates variables without destructing them (it doesn't merely leak them).
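
To make the destructor problem concrete, the sketch below shows what a frame normally gets when it is left by unwinding: the destructor of every live value runs. A longjmp() across the same frame would skip that step. The sketch uses only the standard library and never calls longjmp(); Guard and leave_frame_by_panic are hypothetical names chosen for illustration.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical guard type that counts how many times its destructor runs.
struct Guard<'a>(&'a AtomicUsize);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Enters a frame that owns a Guard and leaves it by unwinding.
/// Returns true if the frame was exited by a panic.
fn leave_frame_by_panic(drops: &AtomicUsize) -> bool {
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        let _guard = Guard(drops);
        // Unwinding runs _guard's destructor on the way out.
        // A longjmp() over this frame would not.
        panic::resume_unwind(Box::new("leaving early"));
    }));
    result.is_err()
}
```

After one call to leave_frame_by_panic with a fresh counter, the counter reads 1: the guard was dropped exactly once even though the closure never reached its end.
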
Given these problems, you should seriously consider alternatives.
One alternative is to use C wrappers when entering a rust stack frame
from C or a C stack frame from rust. The wrappers could turn special
return values from rust into a C longjmp() if necessary, or catch
a longjmp() from C and turn it into a rust panic!(),
respectively. This is not always practical, however, so sometimes
calling setjmp()/longjmp() from rust is still the best
solution.
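
The rust half of the wrapper approach might look like the sketch below: the entry point catches any panic at the FFI boundary and reports it as a special return value, which a C wrapper could then turn into a longjmp(). This is a sketch under assumptions, not this crate's API: STATUS_JUMP, do_work, and rust_entry are hypothetical names, and the C wrapper appears only as a comment. A real export would also need an unmangled symbol name.

```rust
use std::os::raw::c_int;
use std::panic;

/// Hypothetical status code that the C wrapper translates into longjmp().
const STATUS_JUMP: c_int = -1;

/// Hypothetical rust work that may fail by panicking.
fn do_work(input: c_int) -> c_int {
    if input < 0 {
        panic!("negative input");
    }
    input * 2
}

/// Entry point for the C wrapper. On success it stores the result in `out`
/// and returns 0; on panic it leaves `out` untouched and returns STATUS_JUMP.
pub extern "C" fn rust_entry(input: c_int, out: &mut c_int) -> c_int {
    // The panic is caught here, so no unwinding crosses the FFI boundary.
    match panic::catch_unwind(|| do_work(input)) {
        Ok(value) => {
            *out = value;
            0
        }
        Err(_) => STATUS_JUMP,
    }
}

// Assumed shape of the C wrapper (not compiled here):
//   int out;
//   if (rust_entry(x, &out) != 0) longjmp(env, 1);
```

Here rust_entry(21, &mut out) returns 0 and stores 42 in out, while rust_entry(-1, &mut out) returns STATUS_JUMP and leaves out unchanged. All rust destructors have already run by the time the C wrapper decides to jump.
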
- Mark any function calling setjmp() with #[inline(never)] to reduce the chances for misoptimizations.
- Code between a setjmp() that returns 0 and a possible longjmp() should be as minimal as possible. Typically, this might just be saving/setting global variables and calling a C FFI function (which might longjmp()). This code should avoid allocating memory on the heap, using types that implement the Drop trait, or code that is complex enough that it might trigger misoptimizations.
- Code before a longjmp(), or in any parent stack frames, should also be minimal. Typically, this would be just enough code to retrieve a return value from a callee, or catch a panic with catch_unwind(). This code should avoid allocating memory on the heap, using types that implement the Drop trait, or code that is complex enough that it might trigger misoptimizations.