Embedded LLD Phase 1: Build Infrastructure + CRT Objects #4942
Closed
SeanTAllen
started this conversation in
ponyc
Replies: 1 comment
Implementation PR: #4945
Scope
This phase establishes build infrastructure for embedded LLD linking in ponyc. It does NOT change how ponyc links user programs: the legacy system() path remains the only active code path.
What this phase delivers:
- genexe.c and program.c converted to C++ for future LLD API access
What this phase does NOT do:
- (... --sysroot is Phase 2)
Prerequisites
All already in place:
- lib/llvm/src/ (hash 2078da43e25a4623)
- lib/llvm/src/lld/ (present in monorepo checkout)
- lib/llvm/src/compiler-rt/lib/builtins/crtbegin.c and crtend.c
Steps
Step 1: Add LLD to the LLVM build
File: lib/CMakeLists.txt
Set LLVM_ENABLE_PROJECTS to "lld" before add_subdirectory(llvm/src/llvm) (before line 147).
LLVM's cmake processes LLVM_ENABLE_PROJECTS by iterating the list and calling add_subdirectory("${CMAKE_CURRENT_SOURCE_DIR}/../${proj}" "${proj}") for each entry. From the LLVM source root at lib/llvm/src/llvm/, ../lld resolves to lib/llvm/src/lld/, which exists and contains a CMakeLists.txt.
LLD respects the existing LLVM_ENABLE_ZLIB OFF and LLVM_ENABLE_ZSTD OFF settings: it builds without compression support, which is fine (compressed debug sections are uncommon and not needed for correctness).
The build produces these static libraries in build/libs/lib/:
- liblldCommon.a: shared utilities
- liblldELF.a: ELF linker (Linux, BSD)
- liblldMachO.a: Mach-O linker (macOS)
- liblldCOFF.a: COFF linker (Windows)
- liblldWasm.a: WebAssembly linker
- liblldMinGW.a: MinGW-style linker
LLD headers install to build/libs/include/lld/. LLD cmake config installs to build/libs/lib/cmake/lld/.
CI impact: Changes to lib/CMakeLists.txt invalidate the libs cache (keyed on hashFiles('lib/CMakeLists.txt', ...)). The first CI run after this change will rebuild libs from scratch, including LLD. Subsequent runs cache normally.
Step 2: Link LLD libraries into ponyc
File: CMakeLists.txt (top-level)
After the existing find_package(LLVM ...) on line 27, add a matching find_package(LLD ...) call. Its lib64 path mirrors the existing pattern used by find_package(LLVM ...), find_package(GTest ...), and find_package(benchmark ...) on the same lines.
After the llvm_map_components_to_libnames call (line 216), define LLD libraries:
set(PONYC_LLD_LIBS lldELF lldMachO lldCOFF lldWasm lldMinGW lldCommon)
lldCommon must be last because the other LLD libraries depend on it. Link order matters for static archives.
File: src/ponyc/CMakeLists.txt
Line 77 is outside the MSVC/non-MSVC conditional (which ends at line 74) and applies to all platforms. Change it to include ${PONYC_LLD_LIBS}.
No changes needed to the MSVC block (lines 18-27): line 77 already covers MSVC.
File: test/libponyc/CMakeLists.txt
Change line 135 to include ${PONYC_LLD_LIBS}.
File: benchmark/libponyc/CMakeLists.txt
Change line 51 to include ${PONYC_LLD_LIBS}.
This is the only benchmark target that links ${PONYC_LLVM_LIBS} (benchmark/libponyrt does not).
Binary size note: Since no code references LLD symbols yet, the static linker won't pull in any LLD object files from the archives. The LLD_HAS_DRIVER macro (Step 6) expands to a forward declaration only, not a symbol reference. Binary size increase in Phase 1 should be zero. When Phase 2 actually calls LLD driver functions, only the referenced driver's code gets pulled in.
Step 3: Update libponyc-standalone to include LLD
The standalone library bundles all LLVM libraries into a single archive. It needs to also include LLD.
File: src/libponyc/CMakeLists.txt
Linux (around line 217): The find command currently globs libLLVM*.a. Add a second find command for liblld*.a right after it.
macOS (around line 150): Add a glob for LLD archives, then include ${LLD_OBJS} in the libtool -static -o command alongside ${LLVM_OBJS}.
FreeBSD (around line 169): Add the same find ... liblld*.a line after the libLLVM*.a line.
MSVC (around line 132): Add LLD .lib files and include ${LLD_OBJS} in the lib.exe command alongside ${LLVM_OBJS}.
Note: OpenBSD and DragonFly have # TODO stubs (lines 188-191) and don't currently build standalone libraries. This plan doesn't change that; LLD additions for those platforms would come when their standalone builds are implemented.
Step 4: Build compiler-rt CRT objects
Rather than adding compiler-rt as an LLVM sub-project (which drags in its complex cmake infrastructure: CompilerRTUtils, BuiltinTests, filter_available_targets, etc.), compile the two CRT source files directly. They are small, self-contained C files that only need a standard C compiler.
New file: src/crt/CMakeLists.txt
Compile flags explained:
- CRT_HAS_INITFINI_ARRAY: Use .init_array/.fini_array sections (standard on all modern Linux). Without this, crtbegin.c falls back to legacy .ctors/.dtors sections.
- EH_USE_FRAME_REGISTRY: Include .eh_frame section and frame registration code for exception handling support.
- PONY_PIC_FLAG (for S variants): Position-independent code, needed for PIE executables and shared libraries. ponyc defaults to PIE.
CRT variant naming follows GCC conventions:
- crtbeginS.o
- crtendS.o
- crtbeginT.o (... --static)
- crtend.o (... --static)
File: CMakeLists.txt (top-level)
Add the CRT subdirectory, guarded to only build on Linux (ELF targets that need CRT objects).
macOS doesn't use CRT objects (uses -lSystem instead). Windows uses MSVC CRT. BSD CRT objects are handled in Phase 6.
For cross-compilation, CMAKE_SYSTEM_NAME is set to Linux by the cross-libponyrt Makefile target (-DCMAKE_SYSTEM_NAME=Linux), so the CRT objects are compiled with the cross-compiler automatically.
File: Makefile
Update the cross-libponyrt target (line 220) to also build CRT objects. The change is adding crt_objects to the --target list. Since crt_objects only exists when CMAKE_SYSTEM_NAME is Linux (which cross-libponyrt always sets), this is safe.
Step 5: Ship CRT objects in the install layout
File: Makefile
In the install target (after line 327, following the existing libponyrt-pic copy), add a copy of the CRT objects wrapped in an if [ -f ... ] guard. The guard matches the existing pattern for optional files (libponyrt-pic, libponyc, etc.) and ensures non-Linux installs don't fail.
After this, the installed directory layout on Linux is:
Step 6: Convert genexe.c to C++
Rename: src/libponyc/codegen/genexe.c → src/libponyc/codegen/genexe.cc
The existing PONY_EXTERN_C_BEGIN/PONY_EXTERN_C_END guards in genexe.h ensure proper C linkage for genexe() and gen_main(). Callers in C files (e.g., codegen.c) continue to work: they see C-linkage declarations via the header, and the C++ implementation emits C-linkage symbols.
This is the same pattern already used by genopt.cc, gendebug.cc, and host.cc in the same directory.
On MSVC, all .c files are already compiled as C++ (lines 116-117 of src/libponyc/CMakeLists.txt), so the code is already C++-compatible.
Add LLD header includes at the top of genexe.cc: #include <lld/Common/Driver.h> and the LLD_HAS_DRIVER macros.
The LLD_HAS_DRIVER macro expands to a namespace lld { namespace elf { bool link(...); } } forward declaration. No symbols are referenced, so no LLD object files are pulled into the binary.
Open question for Sean: A reviewer suggested deferring the #include <lld/Common/Driver.h> and LLD_HAS_DRIVER macros to Phase 2, arguing they're unused code. My reasoning for including them now: they validate that LLD headers are correctly installed and includable as part of the build. Without them, a header installation problem wouldn't surface until Phase 2. The tradeoff is unused includes in production code for one phase.
File: src/libponyc/CMakeLists.txt
Change line 28 from codegen/genexe.c to codegen/genexe.cc.
Step 7: Convert program.c to C++
Rename: src/libponyc/pkg/program.c → src/libponyc/pkg/program.cc
Same approach as Step 6. program.h already has PONY_EXTERN_C_BEGIN/PONY_EXTERN_C_END.
File: src/libponyc/CMakeLists.txt
Change line 75 from pkg/program.c to pkg/program.cc.
Step 8: Add embedded library args API
The existing program_lib_build_args() builds a flat string with shell-specific formatting (-Wl, prefixes, "..." quoting, -l/.lib affixes). Embedded LLD needs structured data: separate path and library lists without formatting.
The raw data already exists in program_t (data->libpaths and data->libs), but paths from opt->package_search_paths (CLI -p and PONYPATH) are only merged during program_lib_build_args. Also, stored values are wrapped in "..." by quoted_locator().
File: src/libponyc/pkg/program.h
Add declarations for program_lib_build_args_embedded() and the four accessor functions inside the PONY_EXTERN_C_BEGIN/PONY_EXTERN_C_END block.
File: src/libponyc/pkg/program.cc
Add fields to program_t for the embedded_paths and embedded_libs arrays.
Initialize the new fields in program_create() (all to NULL/0).
Implement program_lib_build_args_embedded():
- Assert data->lib_args == NULL AND data->embedded_paths == NULL (ensures neither build function has been called yet)
- data->libpaths + opt->package_search_paths
- embedded_paths array, populate with unquoted path strings
- data->libs
- embedded_libs array, populate with unquoted library name strings
Mutual exclusion between the two build functions relies on embedded_paths != NULL (set by the embedded path) and lib_args != NULL (set by the legacy path). Each function asserts both are NULL on entry:
- program_lib_build_args_embedded(): assert data->lib_args == NULL && data->embedded_paths == NULL
- program_lib_build_args(): add pony_assert(data->embedded_paths == NULL) at line 191, alongside the existing pony_assert(data->lib_args == NULL)
No sentinel value is needed: the arrays themselves serve as the indicator.
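A minimal sketch of the entry checks described above, in plain C with assert() standing in for pony_assert(). The demo_program_t struct and the demo_* function names are hypothetical stand-ins rather than ponyc's actual definitions, and calloc/free replace the pool allocator. One assumption goes beyond the plan: the sketch allocates at least one slot so that embedded_paths stays non-NULL even for an empty path list, since a NULL array could not act as the "already built" indicator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the relevant program_t fields. */
typedef struct demo_program_t
{
  char* lib_args;              /* set only by the legacy build path */
  const char** embedded_paths; /* set only by the embedded build path */
  size_t embedded_path_count;
} demo_program_t;

/* Legacy path: refuses to run if either builder has already run. */
static void demo_build_args_legacy(demo_program_t* p)
{
  assert(p->lib_args == NULL);
  assert(p->embedded_paths == NULL);
  p->lib_args = (char*)calloc(1, 1); /* empty flat string */
}

/* Embedded path: same entry checks, different output field. */
static void demo_build_args_embedded(demo_program_t* p, size_t count)
{
  assert(p->lib_args == NULL);
  assert(p->embedded_paths == NULL);
  /* Assumption: over-allocate by one slot so the pointer is non-NULL
   * even when count is 0 and can still serve as the indicator. */
  p->embedded_paths = (const char**)calloc(count + 1, sizeof(const char*));
  p->embedded_path_count = count;
}

/* Each output gets its own NULL-guarded free. */
static void demo_program_free(demo_program_t* p)
{
  if(p->lib_args != NULL)
    free(p->lib_args);
  if(p->embedded_paths != NULL)
    free((void*)p->embedded_paths);
}
```

Calling either builder a second time, or calling one after the other, trips an assertion in a debug build, which is the mutual exclusion the plan relies on.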
The existing lib_args handling in program_free() is unaffected because it is already guarded by a NULL check; the new arrays get their own NULL-guarded free calls.
"Unquoting" strips the "..." wrapper added by quoted_locator(): skip the first byte, copy up to (but not including) the last byte, null-terminate. Use stringtab for the result so callers don't need to manage memory.
Implement the four accessor functions as simple index-into-array operations with bounds assertions.
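The unquoting rule and the accessor shape can be sketched as follows. This is an illustration under stated assumptions: demo_unquote, demo_embedded_args_t, demo_path_count, and demo_path_at are hypothetical names, and the result is a malloc'd copy rather than a string interned through stringtab as the plan specifies. Only the path pair of accessors is shown; the library pair would look the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Strip the surrounding double quotes from a string of the form "\"...\"".
 * Returns a newly malloc'd copy (the plan interns via stringtab instead). */
static char* demo_unquote(const char* quoted)
{
  size_t len = strlen(quoted);
  assert(len >= 2 && quoted[0] == '"' && quoted[len - 1] == '"');

  size_t inner = len - 2;            /* bytes between the two quotes */
  char* out = (char*)malloc(inner + 1);
  if(out == NULL)
    return NULL;

  memcpy(out, quoted + 1, inner);    /* skip first byte, stop before last */
  out[inner] = '\0';
  return out;
}

/* Hypothetical holder for the structured path list. */
typedef struct demo_embedded_args_t
{
  char** paths;
  size_t path_count;
} demo_embedded_args_t;

/* Count accessor. */
static size_t demo_path_count(const demo_embedded_args_t* a)
{
  return a->path_count;
}

/* Index accessor with a bounds assertion. */
static const char* demo_path_at(const demo_embedded_args_t* a, size_t i)
{
  assert(i < a->path_count);
  return a->paths[i];
}
```

A caller would loop from 0 to demo_path_count() and pass each demo_path_at() result to the linker as a separate argument, with no shell quoting involved.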
Update program_free() to free the embedded_paths and embedded_libs arrays (allocated via ponyint_pool_alloc_size).
Serialization: program_t has program_serialise_trace, program_serialise, and program_deserialise functions. The new embedded fields are transient: they're only populated during the linking phase, never during compilation when serialization occurs. They will always be NULL/0 at serialization time. Nevertheless, update the serialization functions:
- program_serialise: write the new fields as NULL/0 (they're already zero, but be explicit)
- program_deserialise: initialize the new fields to NULL/0
- program_serialise_trace: no action needed for NULL pointers
Design Decisions
CRT build: direct compilation vs. compiler-rt cmake
Decision: Compile crtbegin.c and crtend.c directly, bypassing compiler-rt's cmake.
Rationale: compiler-rt's cmake infrastructure is complex, requiring CompilerRTUtils, BuiltinTests, filter_available_targets, and extensive platform detection. We need exactly two small C files. Direct compilation with explicit flags is simpler, more maintainable, and produces identical output. The source files are stable LLVM project code from the vendored submodule.
LLD drivers to link
Decision: List all LLD drivers in PONYC_LLD_LIBS on all platforms.
Rationale: Since no LLD symbols are referenced in Phase 1, the static linker won't pull in any LLD code, so the binary size increase is zero. When later phases reference specific drivers, only those drivers' code is pulled in. Listing all drivers now avoids revisiting the CMakeLists in each subsequent phase.
CRT platforms
Decision: Only build CRT objects on Linux (guarded by CMAKE_SYSTEM_NAME STREQUAL "Linux").
Rationale: macOS doesn't use CRT objects. Windows uses MSVC CRT. BSD CRT is deferred to Phase 6. Linux is the primary target for Phases 2-3.
compiler-rt licensing
Non-issue: compiler-rt is under the Apache 2.0 license with LLVM Exceptions — the same license as the already-vendored LLVM. No new license concerns.
Testing
Build verification
Regression testing
All tests must pass with zero changes to test code — this phase doesn't alter behavior.
CRT object verification
Cross-compilation verification
Install verification
Binary size measurement
Files Modified
- lib/CMakeLists.txt: LLVM_ENABLE_PROJECTS "lld"
- CMakeLists.txt: find_package(LLD), define PONYC_LLD_LIBS, add src/crt subdirectory
- src/ponyc/CMakeLists.txt: ${PONYC_LLD_LIBS}
- test/libponyc/CMakeLists.txt: ${PONYC_LLD_LIBS}
- benchmark/libponyc/CMakeLists.txt: ${PONYC_LLD_LIBS}
- src/libponyc/CMakeLists.txt
- src/crt/CMakeLists.txt
- src/libponyc/codegen/genexe.c → .cc
- src/libponyc/pkg/program.c → .cc
- src/libponyc/pkg/program.h
- Makefile: cross-libponyrt to build CRT