Plan: Replace pony_error with Error-Flag Returns (Issue #4443) #4997
Replies: 6 comments 1 reply
-
|
Phase 1 and 2 changes are ugly but I think at this point, I am going to roll with them to get to the end and see if we can get a working build. |
Beta Was this translation helpful? Give feedback.
-
|
Are you going to benchmark the changes? (sorry for unsolicited comments I've just been watching (and inspired by the productivity)) Since exceptions can be a lot faster than bit returns, especially when they don't need data. Though that's mostly predicated on handling them rarely |
Beta Was this translation helpful? Give feedback.
-
Phase 3 implementation notes: issues not accounted for in the planDuring Phase 3 implementation, four issues surfaced that the original plan didn't anticipate. All relate to the same root cause: 1.
|
Beta Was this translation helpful? Give feedback.
-
Smoke Test for Performance ImpactUsing Why message-ubench
Build (one-time)
Run (~100 min)Alternate 10 runs (5 per version): Each run:
Analyze
Decision criteria
Limitations
|
Beta Was this translation helpful? Give feedback.
-
Phase 4 implementation note: bare partial lambdasDuring Phase 4 implementation, we discovered a fundamental issue with removing exception infrastructure that the plan didn't anticipate. The problemBare functions ( Who's affectedThe only user of bare partial lambdas in the stdlib is the let throw_fn = @{() ? => error }
@pony_serialise(@pony_ctx(), data, Pointer[None], r, alloc_fn, throw_fn) ?This pattern passes a bare lambda as a callback to C runtime code. When the C code detects an error condition (e.g., an actor in the object graph), it calls No other stdlib package, example, or internal code uses this pattern. The compiler's internal use of What needs to happen if we move forward
These are prerequisites for Phase 4, not follow-up work — without them, Phase 4 produces a compiler that silently breaks working code at runtime. |
Beta Was this translation helpful? Give feedback.
-
Phase 4 implementation — deviations from planPhase 4 has been implemented. Here are the deviations from the original plan: 1. Removed Pony-level serialise package (not in original plan)The Removed:
The C-level serialise runtime ( No other stdlib package or example uses the serialise package. 2. Replaced
|
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Context
Pony's
errorkeyword currently uses C++ exception unwinding (libunwind on POSIX, SEH on Windows) to propagate errors. This requires LLVMinvokeinstructions instead ofcallat every partial function call site, which inhibits LLVM optimizations and adds platform-specific complexity. Issue #4443 proposes replacing this with return-value error flags: partial functions return a{value, i1}tuple, and callers check the boolean error flag.This plan produces an experimental build that can be tested before submitting an RFC. The RFC's core proposal is "remove partial FFI and use error-flag returns instead of exceptions."
Each phase is designed as an independent Claude session with low context requirements.
Phase 1: Remove pony_error from time.c
Goal: Eliminate the only
pony_error()call in time.c and the only partial FFI declaration outside the net package.Changes:
src/libponyrt/lang/time.c: Changeformat_invalid_parameter_handlerto an empty body (remove thepony_error()call but keep the callback). The handler must remain registered — without it, the Windows CRT terminates the process whenstrftimegets invalid parameters. With an empty handler,strftimereturns 0 instead. Add a check after thestrftimecall: ifr == 0andlen > some_threshold(e.g., the buffer has already been doubled past a reasonable size), return an empty string instead of retrying infinitely. The currentwhile(r == 0)retry loop assumesr == 0means "buffer too small," but with the handler silenced, it also covers invalid format strings — which would loop forever without an escape hatch.packages/time/posix_date.pony: Remove?from the@ponyint_formattimeFFI declaration. UpdatePosixDate.format()to check for empty string return anderrorexplicitly (the method stays partial at the Pony level).Files:
src/libponyrt/lang/time.c,packages/time/posix_date.ponyTest:
make test-stdlib-release(specifically the time package tests)Size: ~30 LOC
Phase 2: Remove pony_error from socket.c
Goal: Eliminate all 13
pony_error()calls in socket.c. After this phase, no C runtime code callspony_error().Approach: Use
SIZE_MAXas an error sentinel return value. The socket functions (pony_os_writev,pony_os_send,pony_os_recv,pony_os_sendto,pony_os_recvfrom) currently returnsize_ton success and callpony_error()on failure. Change them to returnSIZE_MAXon failure instead.Changes:
src/libponyrt/lang/socket.c: Replace all 13pony_error()calls withreturn SIZE_MAX(include<stdint.h>or equivalent forSIZE_MAX). Functions affected:pony_os_writev(lines 933, 950)pony_os_send(lines 978, 992) — note: no Pony callers found, may be dead codepony_os_recv(lines 1003, 1014, 1016)pony_os_sendto(lines 1028, 1035, 1045)pony_os_recvfrom(lines 1057, 1071, 1073)packages/net/tcp_connection.pony: Remove?from@pony_os_recvand@pony_os_writevFFI declarations. At each call site, addif len == USize.max_value() then error endafter the FFI call. The surroundingtry...elseblocks (which callhard_close()) stay unchanged.packages/net/udp_socket.pony: Same treatment for@pony_os_sendtoand@pony_os_recvfrom.Files:
src/libponyrt/lang/socket.c,packages/net/tcp_connection.pony,packages/net/udp_socket.ponyTest:
make test-stdlib-release(specifically the net package tests)Size: ~150 LOC
Future improvement (not this plan): Replace sentinel values with struct returns containing both the value and an error code, per Sean's preference in the issue comment.
Phase 3: Codegen — Error-Flag Returns
Goal: Replace exception-based error handling in generated code with error-flag tuple returns. This is the core change that enables the RFC.
After this phase: Pony's
errorkeyword produces a return value (not an exception),try/elseblocks use branching (not landing pads), and partial FFI is disallowed.Return type scheme
T{T, i1}None{None, i1}voidi1The
i1isfalsefor normal return,truefor error. On error, the value portion isundef.Risk note
All sub-steps (3a–3r) must be correct simultaneously — a bug in any one breaks the entire build. The highest-risk sub-steps are 3f (
gen_error, most branching logic for error propagation vs. local handling), 3h (gen_call, most complex interaction between error flag extraction, branching, and class constructor override), and 3l (genfun_forward, easy to miss the tuple unwrap/rewrap). Budget debugging time accordingly.Sub-steps (all within one session — they cannot be tested independently)
3a. Add helper functions (
src/libponyc/codegen/gencontrol.cor a new header):error_flag_type(c, real_type)— returnsLLVMStructType({real_type, i1}), or justi1ifreal_typeis voidwrap_result(c, value, is_error)— creates the{value, error_flag}tuple (or just thei1for void)unwrap_result(c, tuple)— extracts the real value from the tupleunwrap_error(c, tuple)— extracts the error flag3b. Change
make_signature(src/libponyc/codegen/genfun.c:85-156):ast_childidx(m->fun->ast, 5)— the 6th child of the method AST is thecan_errornode; checkast_id(can_error) == TK_QUESTION. This is the same pattern used ingenheader.c:132.error_flag_type()for partial functionscompile_method_t(add abool is_partialfield) for use inmake_signatureand body generators (genfun_fun,genfun_new,genfun_forward)bool is_partialtocompile_frame_t(codegen.h:78). This is needed becausegen_returnandgen_erroraccess the frame (c->frame), not the method. Setc->frame->is_partialaftercodegen_startfuningenfun_fun,genfun_new,genfun_forward, andstart_function(ingenprim.cfornullable_pointer_apply). Propagate it throughcodegen_pushscope,codegen_pushloop, andcodegen_pushtry— all three propagatebare_functionfrom prev and must also propagateis_partial. Missingcodegen_pushtrywould causegen_returninside atrybody to seeis_partial == falseand produce a plain return instead of an error-flag-wrapped return.3c. Change
genfun_fun(src/libponyc/codegen/genfun.c:481-540):gen_assign_castat line 520 usesr_type = LLVMGetReturnType(f_type), which is now{T, i1}. This must be changed to cast against the unwrapped element type (extract the first element type from the struct), then wrap the cast value in{cast_value, false}beforegenfun_build_ret. Casting aTvalue to{T, i1}directly would be a type error.false(i1) instead ofgenfun_build_ret_void3d. Change
genfun_new(src/libponyc/codegen/genfun.c:594-626):false(i1, no error)thisin{this, false}tuple3e. Change
gen_return(src/libponyc/codegen/gencontrol.c:585-611):c->frame->is_partial{T, i1}(value-returning) ori1(class constructor that was originally void). Distinguish these by checking whether the LLVM return type is a struct:{T, i1}): generate the expression value, cast it, wrap in{value, false}, returni1return (partial class constructor): thereturnexpression in a constructor body evaluates toNonewhich gets discarded — returnfalse(no error)returnstatement in a constructor is rare but legal in Pony. The key insight is thatreturnfrom a constructor means "finish normally," so the error flag is alwaysfalse.3f. Change
gen_error(src/libponyc/codegen/gencontrol.c:895-930):gencall_error(c)call at line 926 with new logic:c->frame->invoke_target != NULL(inside try):LLVMBuildBrtoinvoke_targetinvoke_target(propagating): return error tuple ({undef, true}or justtruefor void)3g. Change
gencall_error(src/libponyc/codegen/gencall.c:1397-1408):nullable_pointer_applyin genprim.c{undef, true}ortrue), then unreachable-free termination3h. Change
gen_call(src/libponyc/codegen/gencall.c:729-935):codegen_callpony.newcallmetadata (lines 908-912) must be set on the raw call instruction (before any unwrapping), sincegenopt.cc:282uses it to identify constructor calls for heap-to-stack optimizationast_canerror(ast):unwrap_error(c, result)error_blockandcontinue_blockLLVMBuildCondBr(c->builder, error_flag, error_block, continue_block)error_block: ifc->frame->invoke_target != NULL, branch to it; else do an error returncontinue_block: extract the real value withunwrap_result(c, result)and use it as the expression resultr = args[0]unconditionally overrides the call result with the receiver for class constructor calls. After the change, a partial class constructor returnsi1(error flag). The error flag must be checked BEFORE overriding withargs[0]. The override should only happen on the non-error path. This is already handled naturally if the error-flag check (above) happens before line 924 — the error path branches away, and only the continue path reaches line 927.3i. Change
gen_ffi(src/libponyc/codegen/gencall.c:1155-1308):LLVMBuildCall2errvariable tracking?on FFI)use @some_ffi[...](...) ?declarations will get a compiler error after this change. Phases 1 and 2 already removed all partial FFI declarations in the standard library, so the stdlib is safe. This is intentional — the RFC proposes eliminating partial FFI entirely.3j. Change
gen_try(src/libponyc/codegen/gencontrol.c:613-725):else_blockis now a regular basic blockcodegen_pushtrystill sets theinvoke_target(now conceptually "error handler target")else_blockno longer starts with a landing pad — it's just where error-flag branches land3k. Change
gen_disposing_block_can_error(src/libponyc/codegen/gencontrol.c:727-820):3l. Change
genfun_forward(src/libponyc/codegen/genfun.c:784-822):codegen_calland return the result after agen_assign_cast. For partial methods, the target now returns{T, i1}instead ofT.make_signaturein 3b).gen_assign_cast(using the unwrapped type, not{T, i1}), re-wrap as{cast_value, error_flag}, and return the wrapped tuple. This preserves the error flag transparently without branching.i1, not a tuple): thei1passes through directly — no destructuring needed. Just return the target'si1result.3m. Change
nullable_pointer_apply(src/libponyc/codegen/genprim.c:493-516):start_functioncall (line 500) to useerror_flag_type(c, t_elem->use_type)instead oft_elem->use_typeas the return type. The current code bypassesmake_signatureentirely, so the return type must be wrapped explicitly here.{undef, true}instead of callinggencall_error. In the non-null case (is_false block): wrap result in{result, false}beforegenfun_build_ret.c->frame->is_partial = trueaftercodegen_startfun.3n. Verify
make_unbox_function(src/libponyc/codegen/gendesc.c:61-128):codegen_call, and forwards the result viagenfun_build_ret. This works naturally for value-returning partial functions (tuple passes through). For partial class constructors (void→i1), the unbox function also changes from void to i1 return, and the forwardedi1result passes through correctly. Verify during implementation.3o. Verify
genopt.ccheap-to-stack optimization (src/libponyc/codegen/genopt.cc:258-283):pony.newcallmetadata. It usesdyn_cast<CallBase>(inst)(line 258) which covers bothCallInstandInvokeInst, so switching from invoke to call requires no changes here. Verify during implementation that the metadata survives on the newcallinstruction and that the optimization still triggers.3p. Verify
genheader.c(src/libponyc/codegen/genheader.c:125-147):ast_id(can_error) == TK_QUESTIONreturns false). After Phase 3, partial functions return{T, i1}struct types, but this code path is never reached for them. No changes needed; verify during implementation.3q. Verify
genfun_be(src/libponyc/codegen/genfun.c:542-591):can_errorfrom the AST but never use it — behaviours can't be partial. The handler always returns void. No changes needed; verify during implementation.3r. Update test program (
test/full-program-tests/issue-4475-case-1/main.pony):errorin a partial constructor and verifies the compiler produces valid IRcall @pony_error) but the program behavior is the sameFiles:
src/libponyc/codegen/gencall.c,src/libponyc/codegen/gencontrol.c,src/libponyc/codegen/genfun.c,src/libponyc/codegen/genfun.h,src/libponyc/codegen/genprim.c,src/libponyc/codegen/gendesc.c,src/libponyc/codegen/codegen.h(possibly),src/libponyc/codegen/genopt.cc(verify only),src/libponyc/codegen/genheader.c(verify only),test/full-program-tests/issue-4475-case-1/main.ponyTest:
make test-ci-core(full test suite: libponyrt, libponyc, full-program, stdlib, examples)Size: ~400-500 LOC (estimate is approximate given the number of interacting sub-steps)
Phase 4: Cleanup — Remove Exception Infrastructure
Goal: Remove dead code left after the codegen change.
Changes:
src/libponyrt/lang/posix_except.c: Removepony_error()implementation and all exception/personality function codesrc/libponyrt/lang/win_except.c: Samesrc/libponyrt/lang/except_try_catch.ll: Removepony_try()LLVM IR implementationsrc/libponyrt/pony.h: Removepony_error()andpony_try()declarations, removepony_partial_fntypedefsrc/libponyc/codegen/codegen.c: Removepony_errorandponyint_personality_v0function declarations (lines 608-616). Removec->personalityfield if no longer needed.src/libponyc/codegen/gencall.c: Removeinvoke_funfunction (lines 88-108). Removegencall_errorif not already removed.src/libponyc/codegen/codegen.h: Renameinvoke_targettoerror_targetincompile_frame_tand update all references:codegen.c(4 refs incodegen_pushscope,codegen_pushloop,codegen_pushtry),gencall.c(remaining refs after Phase 3 removals ingen_callandgen_ffi)src/libponyrt/lang/lsda.c,src/libponyrt/lang/lsda.h: Remove LSDA scanning code (used exclusively by the personality functions, now dead)test/libponyrt/lang/error.cc: RemovePonyTryandPonyTryCantCatchCppExcepttests (they test the removed functions)src/libponyrt/CMakeLists.txt: Removeposix_except.c(line 30),win_except.c(line 37), andlsda.c(line 28) from_c_src. Remove the_ll_except_src/_ll_except_objvariables (lines 55-56), thellccommand lookup (lines 58-61), the custom command that compiles the.llfile (lines 96-101), and the${_ll_except_obj}reference inadd_library(line 69). Also update thePONY_RUNTIME_BITCODEsection (line 165) which appends${_ll_except_src}to the bitcode link list.Note:
pony_try()andpony_error()arePONY_API(public C API). Removing them is a breaking change for any C code that embeds the Pony runtime and uses these functions. This is intentional for the RFC prototype — the RFC proposes eliminating the exception-based error mechanism entirely.Files:
src/libponyrt/lang/posix_except.c,src/libponyrt/lang/win_except.c,src/libponyrt/lang/except_try_catch.ll,src/libponyrt/lang/lsda.c,src/libponyrt/lang/lsda.h,src/libponyrt/pony.h,src/libponyrt/CMakeLists.txt,src/libponyc/codegen/codegen.c,src/libponyc/codegen/codegen.h,src/libponyc/codegen/gencall.c,test/libponyrt/lang/error.ccTest:
make test-ci-coreSize: ~200 LOC (mostly deletions)
Verification
After each phase, run the specified test command. The full suite (
make test-ci-core) covers:test-libponyrt— runtime unit teststest-libponyc— compiler unit tests (codegen, type checking, etc.)test-full-programs-{debug,release}— compile-and-run integration teststest-stdlib-{debug,release}— standard library teststest-examples— compile all examplesFuture improvements (not in this plan)
SIZE_MAXsentinel with struct returns containing both value and error code (Sean's preferred option 2 from the issue)nounwindattribute: With all invokes replaced by calls and landing pads removed, generated functions could be markednounwind, enabling additional LLVM optimizationspony_tryAPI: If C embedding use cases need to call partial Pony functions, design a newpony_trythat understands error-flag tuple returns instead of exceptionspony_os_send: No Pony callers exist for this C function — it may be dead code worth removingBuild commands
Beta Was this translation helpful? Give feedback.
All reactions