windows: fix native build hook (cl/link arg conflict + gizmo.obj name collision) #164

Open
mushogenshin wants to merge 10 commits into nmfisher:develop from mushogenshin:fix/windows-cl-link-conflict

@mushogenshin
Contributor

Problem

thermion_dart's native build hook fails on Windows in two distinct, sequenced ways. CI on a downstream project (Flutter stable, native-assets enabled, MSVC 2026 toolchain on windows-latest) was failing silently inside flutter pub get's build hook with cl.exe exit code 2 — the actual diagnostic was buried in the per-package build.log inside the pub cache, never surfacing to stderr.

After mirroring Level.SEVERE records to stderr (the prior debug/surface-build-stderr patch on our fork), the real errors became visible. There are two of them.

Bug #1: LNK1561: entry point must be defined

build.dart's Windows branch manually appends '/link' and '/LIBPATH:$libDir' to the user flags list:

if (platform == "windows") ...[
  ...includeDirs.map((d) => "/I${path.join(pkgRootFilePath, d)}"),
  "@${srcs.uri.toFilePath(windows: true)}",
  '/link',
  "/LIBPATH:$libDir",
],

But native_toolchain_c.runCl (run_cbuilder.dart:375-398) already injects /Fe:, /LD, its own /link separator, /MACHINE, and /LIBPATH (from libraryDirectories, which the hook already passes via libraryDirectories: [libDir] on the CBuilder). The order it produces is:

cl.exe ...userFlags... ...defines... ...includes... /LD /Fe:thermion_dart.dll /link /MACHINE:X64 /LIBPATH:libDir

When userFlags ends with our /link /LIBPATH:libDir, the resulting cl command line has two /link separators:

cl.exe /O2 /TP /std:c++20 /MD ...defines... /I... @sources.rsp
       /link /LIBPATH:libDir       ← ours, comes early
       /LD /Fe:thermion_dart.dll
       /link /MACHINE:X64 /LIBPATH:libDir   ← native_toolchain_c's

cl.exe stops compile-flag parsing at the first /link, so /LD and /Fe: (which would normally trigger cl.exe to forward /DLL and /OUT: to LINK and predefine _DLL) end up in LINK's argument vector. LINK rejects them as LNK4044: unrecognized option. With no /DLL reaching LINK, it tries to produce an EXE, looks for an entry point, and fails fatally:

LINK : warning LNK4044: unrecognized option '/DRELEASE'; ignored
LINK : warning LNK4044: unrecognized option '/DNDEBUG'; ignored
LINK : warning LNK4044: unrecognized option '/LD'; ignored
LINK : warning LNK4044: unrecognized option '/link'; ignored
LINK : fatal error LNK1561: entry point must be defined
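The two-`/link` failure can be reproduced with a toy model. The sketch below is a simplification in Python, not cl.exe's actual parser: it assumes only what is described above, namely that everything before the first `/link` is handled by the compiler driver and everything after it is handed to LINK. The function name `split_cl_args` is made up for illustration.

```python
def split_cl_args(argv):
    """Split an argument list at the FIRST /link separator.

    Returns (compiler_args, linker_args). This models the behaviour
    described in the PR text; it is not a reimplementation of cl.exe.
    """
    if "/link" not in argv:
        return list(argv), []
    i = argv.index("/link")
    return list(argv[:i]), list(argv[i + 1:])


# Flags the hook used to pass, ending with the manual /link tail.
user_flags = ["/O2", "/TP", "/std:c++20", "/MD", "@sources.rsp",
              "/link", "/LIBPATH:libDir"]
# Flags native_toolchain_c appends afterwards (taken from the PR text).
toolchain_flags = ["/LD", "/Fe:thermion_dart.dll",
                   "/link", "/MACHINE:X64", "/LIBPATH:libDir"]

# Broken: /LD and /Fe: fall on the linker side of the first /link.
broken_compiler, broken_linker = split_cl_args(user_flags + toolchain_flags)

# Fixed: drop the manual tail so the toolchain's /link is the only one.
fixed_compiler, fixed_linker = split_cl_args(user_flags[:-2] + toolchain_flags)

print("broken, linker sees:", broken_linker)
print("fixed, compiler sees:", fixed_compiler)
```

In the broken case `/LD` never reaches the compiler side, which matches the LNK4044 warnings for `/LD` and the second `/link` in the log above.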

While we're touching the Windows compile flags, '/VERBOSE' is also wrong there — it's a linker option, not a compiler one, and cl.exe parses it as the deprecated /V<string>, emitting D9035: option 'V' has been deprecated. If the verbose link map is ever needed, it belongs after native_toolchain_c's own /link separator (e.g. via linkerOptions), not in the compile flag list.

Fix

Remove the manual /link /LIBPATH:libDir and let libraryDirectories: [libDir] (which the hook is already setting) do its job. Drop /VERBOSE from the compile flags. Comment in build.dart documents the two-/link failure mode so this doesn't regress.

Bug #2: LNK2019: unresolved external symbol thermion::Gizmo::* (×4)

After Bug #1 is fixed, the link reveals a separate problem:

gizmo.obj : warning LNK4042: object specified more than once; extras ignored
TGizmo.obj : error LNK2019: unresolved external symbol "public: __cdecl thermion::Gizmo::Gizmo(class thermion::SceneAsset *, ...)"
TGizmo.obj : error LNK2019: unresolved external symbol "public: void __cdecl thermion::Gizmo::pick(...)"
TGizmo.obj : error LNK2019: unresolved external symbol "public: void __cdecl thermion::Gizmo::highlight(...)"
TGizmo.obj : error LNK2019: unresolved external symbol "public: void __cdecl thermion::Gizmo::unhighlight(...)"
thermion_dart.dll : fatal error LNK1120: 4 unresolved externals

Two source files compile to the same gizmo.obj basename:

  • native/src/scene/Gizmo.cpp — the C++ class implementing thermion::Gizmo
  • native/include/material/gizmo.c — a const uint8_t GIZMO_PACKAGE[] data blob

On macOS / Linux / Android the filesystem is case-sensitive and Gizmo.obj ≠ gizmo.obj, so both coexist. Windows is case-insensitive, and _processMaterials adds the material file to the source list after the class is gathered, so the material's gizmo.obj write-clobbers the class's. thermion::Gizmo's constructor, pick, highlight, and unhighlight symbols disappear from the link, and TGizmo.cpp's c_api wrappers fail to resolve them.
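A collision like this can be caught before the compiler runs. The following Python sketch groups a source list by lower-cased basename, which is how the object names collide on a case-insensitive filesystem. It is an illustration only: the helper name `find_obj_collisions` is hypothetical, the directory shown for TGizmo.cpp is a guess, and the real hook is written in Dart.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_obj_collisions(sources):
    """Map each case-folded .obj name to the sources that would produce it.

    Only names produced by more than one source are returned.
    """
    by_obj = defaultdict(list)
    for src in sources:
        obj_name = PurePosixPath(src).stem.lower() + ".obj"
        by_obj[obj_name].append(src)
    return {obj: paths for obj, paths in by_obj.items() if len(paths) > 1}


sources = [
    "native/src/scene/Gizmo.cpp",
    "native/src/c_api/TGizmo.cpp",       # directory is an assumption
    "native/include/material/gizmo.c",   # added later by _processMaterials
]
collisions = find_obj_collisions(sources)
print(collisions)

# After the rename proposed in this PR the collision disappears.
renamed = [s.replace("material/gizmo.c", "material/gizmo_material.c")
           for s in sources]
print(find_obj_collisions(renamed))
```

A check of this shape could run in the hook on every platform, so that a future case-only clash fails fast instead of surfacing as LNK2019 on Windows.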

Fix

Rename the material .c file to gizmo_material.c (gizmo.h is unchanged — only the .c basename collided; nothing else in the tree references the .c filename). The 'gizmo' key in materialSources is left intact, so _includeMaterial still produces the GIZMO_ENABLED=1 define and existing call sites are unaffected.

What's affected

| Platform | Before | After | Reason |
|---|---|---|---|
| Windows | broken (LNK1561 then LNK1120) | builds | both fixes apply |
| iOS | works | works (unchanged) | doesn't take the Windows code path; case-sensitive FS for the rename |
| macOS | works | works (unchanged) | doesn't take the Windows code path; case-sensitive FS |
| Android | works | works (unchanged) | same as macOS/iOS |
| Linux | works | works (unchanged) | same as macOS/iOS |

The two commits address independent root causes and bisect cleanly, which is why they are kept separate. Happy to split into two PRs if that's preferred.

Verification

Tested on a Windows 11 dev box, MSVC 2026 (cl.exe 19.50.35728), Flutter stable 3.41.2 with flutter config --enable-native-assets:

$ flutter test
... (build hook compiles thermion_dart sources, links thermion_dart.dll)
00:18 +500: All tests passed!

Artifacts produced as expected:

  • <app>/.dart_tool/hooks_runner/shared/thermion_dart/build/<hash>/thermion_dart.dll
  • <thermion_flutter>/.dart_tool/generated_headers.cmake

Without these patches (with only the older fix/windows-link-libpath heuristic applied), the same flutter test on the same machine fails inside the build hook before any test runs. The reproducer is straightforward — any downstream Flutter project pinning thermion_flutter and thermion_dart to upstream develop and running flutter test (or flutter build windows) on Windows.

Related

  • #160 — iOS vulkan-source exclusion. Independent; could merge in any order.
  • #163 — mobile eager-scale gesture recognizer. Independent.

Together with #160 and #163, this PR closes out the Windows leg of a cross-platform Thermion spike (iOS, Android, macOS, Windows) for an existing Flutter project. The downstream reason for needing all three to land: the build has to be green on every target before binary-size and runtime-stability comparisons against the existing WebView+Three.js renderer can run.

The build hook in thermion_dart/hook/build.dart enumerates every
.cpp file under native/src/ and applies platform-specific exclusions
for `windows`, `d3d`, and `linux` paths. There is no analogous
exclusion for `vulkan/`, so on iOS the hook compiles
`native/src/vulkan/VulkanUtils.cpp` and
`native/src/vulkan/BaseVulkanTexture.cpp`. Both reference symbols in
the `bluevk::` namespace, but iOS is Metal-only and the platform
branch never adds `bluevk` to its `libs` list.

Result on iOS device builds (Xcode 15, Flutter master):

    Undefined symbols for architecture arm64:
      "bluevk::vkMapMemory", referenced from:
          thermion::vulkan::readVkImageToBitmap(...) in VulkanUtils-*.o
      "bluevk::vkFreeMemory", referenced from: ...
      ... (~50 more bluevk:: symbols)
    ld: symbol(s) not found for architecture arm64

The whole `flutter build ios` fails inside `dart_build` because the
shared lib never links.

Fix: skip vulkan-prefixed source paths when targetOS is iOS, mirroring
the structure of the existing windows / linux exclusions a few lines
above. macOS still gets the vulkan sources (it links `bluevk`),
Android still gets them (also links `bluevk`), Linux still gets them.
Only iOS changes.
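The exclusion rule can be sketched as a simple path filter. This is a Python model, not the hook's Dart code: the function name `select_sources` is hypothetical, and matching on a `vulkan` directory component is an assumption about what "vulkan-prefixed" means in the hook.

```python
from pathlib import PurePosixPath


def select_sources(all_sources, target_os):
    """Drop sources under a vulkan/ directory when building for iOS.

    Other targets keep the vulkan sources, as described in the
    commit message above.
    """
    selected = []
    for src in all_sources:
        parts = PurePosixPath(src).parts
        if target_os == "ios" and "vulkan" in parts:
            continue  # iOS is Metal-only and never links bluevk
        selected.append(src)
    return selected


all_sources = [
    "native/src/scene/Gizmo.cpp",
    "native/src/vulkan/VulkanUtils.cpp",
    "native/src/vulkan/BaseVulkanTexture.cpp",
]
print(select_sources(all_sources, "ios"))
print(select_sources(all_sources, "macos"))
```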

Verified: `flutter build ios --debug --no-codesign` now produces a
working `Runner.app` on top of upstream develop.

The mobile gesture handler in `_MobileListenerWidget` used a regular
`GestureDetector` whose internal `ScaleGestureRecognizer` waits for
movement to cross `kPanSlop` before claiming the gesture arena. When
a Thermion view sits inside an ancestor `Scrollable` (e.g. ListView,
PageView, CustomScrollView), the ancestor's
`VerticalDragGestureRecognizer` reaches its acceptance threshold first
and wins the
arena — touches starting on the viewer get interpreted as page
scrolls, not viewport gestures. The result on iOS is "sometimes
tumbles, sometimes scrolls" depending on drag direction and speed.

Fix: switch to `RawGestureDetector` with a
`_EagerScaleGestureRecognizer` subclass that calls
`resolve(GestureDisposition.accepted)`
inside `addAllowedPointer`. The arena is claimed on PointerDown, so
ancestor scrollables never get the chance to win. Single-finger orbit
and pinch-zoom both resolve to the viewport regardless of how the
viewer is composed.

Tap and double-tap can no longer be detected via separate
`TapGestureRecognizer` / `DoubleTapGestureRecognizer` entries (the
eager scale wins arena before they ever accept), so they're
synthesized inside the scale callbacks: a "tap" is a scale gesture
whose total focal-point movement stayed below 8 px and whose duration
stayed below 250 ms; a "double-tap" is two such taps within 300 ms
of each other and within 8 px of each other. Both still feed
`InputHandler` via the existing `TouchEvent(tap, ...)` /
`TouchEvent(doubleTap, ...)` events, so callers see no API change.
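The tap synthesis rules above can be written down as a small classifier. This is a Python sketch of the thresholds only (8 px, 250 ms, 300 ms), not the Dart code in the commit. The class and field names are hypothetical, and two details are assumptions: the 300 ms window is measured between the end times of the two taps, and the first tap of a pair is still reported as a plain tap.

```python
import math
from dataclasses import dataclass

TAP_SLOP_PX = 8.0           # max focal-point movement / max tap distance
TAP_MAX_MS = 250            # max duration of a single tap
DOUBLE_TAP_WINDOW_MS = 300  # max gap between two taps


@dataclass
class Gesture:
    start_ms: int
    end_ms: int
    x: float          # focal point at gesture end
    y: float
    moved_px: float   # total focal-point movement during the gesture


class TapSynthesizer:
    """Classify finished scale gestures as 'tap', 'doubleTap' or 'scale'."""

    def __init__(self):
        self._last_tap = None

    def classify(self, g):
        duration = g.end_ms - g.start_ms
        if not (g.moved_px < TAP_SLOP_PX and duration < TAP_MAX_MS):
            self._last_tap = None
            return "scale"
        prev = self._last_tap
        if (prev is not None
                and g.end_ms - prev.end_ms <= DOUBLE_TAP_WINDOW_MS
                and math.hypot(g.x - prev.x, g.y - prev.y) <= TAP_SLOP_PX):
            self._last_tap = None  # a third tap starts a new sequence
            return "doubleTap"
        self._last_tap = g
        return "tap"


synth = TapSynthesizer()
results = [
    synth.classify(Gesture(0, 100, 50.0, 50.0, 2.0)),
    synth.classify(Gesture(250, 320, 52.0, 51.0, 1.0)),
    synth.classify(Gesture(1000, 1400, 50.0, 50.0, 40.0)),
]
print(results)
```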

Verified: `flutter build ios --debug --no-codesign` succeeds; running
on iPhone the multi-viewer stress test (8 viewers in a vertical
ListView) now lets the parent scroll only when drags start in the
gutter and routes every touch starting on a viewer to the viewport
deterministically.

Filing alongside the iOS vulkan-source exclusion (PR nmfisher#160).

The Windows branch of the build hook gathers cl.exe options
(includes, defines, response file with sources) but stops short of
adding the /link separator and /LIBPATH:$libDir. Without those, the
linker has no idea where the Filament .lib artifacts that pub get
downloaded actually live, so it fails with LNK1104 / LNK2019 on
every Filament symbol reference and cl.exe exits 2.

The library *inputs* themselves don't need to be added on the
command line because native/include/ThermionWin32.h declares all of
them via #pragma comment(lib, "filament.lib") and friends, and that
header is transitively included by the Windows vulkan/d3d sources
plus the generic c_api files (TCamera.h, TRenderer.cpp,
TRenderManager.cpp). The pragma directives emit linker directives
that the linker honours automatically — but it still has to find
the .lib files via a search path, hence /LIBPATH.

Uncomment /link and /LIBPATH:$libDir; leave /DLL out (CBuilder.library
already passes /LD which makes /DLL redundant) and leave the
individual sources out (they're already in the response file).

Verification needed on Windows: I don't have a Windows dev box.
Filing this as a separate fork branch so it can be rebased / dropped
independently of the iOS fixes (nmfisher#160 vulkan-source exclusion, nmfisher#163
eager scale).

Thermion's build hook routes all log output through a Logger that
writes to .dart_tool/thermion_dart/log/build.log. native_toolchain_c
captures subprocess (cl.exe, clang, ld) stdout/stderr and routes
the stderr through `logger.severe(...)`. On a build failure
runProcess throws a ProcessException whose message is just the
command line and exit code — the actual compiler / linker output
stays inside that build.log file in the pub cache, where it is
invisible to CI logs and to anyone running `flutter build` who
isn't already poking around in `.dart_tool/`.

Result: compile / link failures look completely opaque downstream.
The exception output shows the cl.exe command, exit code 2, and
nothing else. This has been blocking diagnosis of the Windows native
build for several CI iterations.

Mirror SEVERE-level records to stderr inside the existing log
handler so the real error reaches whoever's watching. Successful
builds aren't noisier — compilers don't emit much stderr on
success — and the build.log file behavior is unchanged.
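The mirroring idea can be illustrated with Python's standard `logging` module, used here purely as an analogy for the Dart `Logger` in the hook. Python's `ERROR` level stands in for Dart's `SEVERE`, and an in-memory stream stands in for build.log. The names `SevereToStderr` and `make_build_logger` and the sample log lines are made up for this sketch.

```python
import io
import logging
import sys


class SevereToStderr(logging.Handler):
    """Copy ERROR-and-above records to stderr (or a supplied stream)."""

    def __init__(self, stream=None):
        super().__init__(level=logging.ERROR)
        self._stream = stream

    def emit(self, record):
        stream = self._stream or sys.stderr
        stream.write(self.format(record) + "\n")


def make_build_logger(log_stream, err_stream=None):
    logger = logging.getLogger("thermion_build_sketch")
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    # Everything goes to the log file stand-in, as before.
    logger.addHandler(logging.StreamHandler(log_stream))
    # Only severe records are additionally mirrored.
    logger.addHandler(SevereToStderr(err_stream))
    return logger


build_log = io.StringIO()
mirrored = io.StringIO()
log = make_build_logger(build_log, mirrored)
log.info("compiling sources")  # illustrative message
log.error("LINK : fatal error LNK1561: entry point must be defined")
print(mirrored.getvalue(), end="")
```

The file log still receives every record, so existing tooling that reads build.log is unaffected; only the failure output gains a second destination.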
native_toolchain_c.runCl already injects /Fe:, /LD, its own /link
separator, /MACHINE, and /LIBPATH (from libraryDirectories) when
dynamicLibrary is set. By manually adding our own '/link
/LIBPATH:$libDir' to the user flags list, cl.exe's parser stopped
reading compile-time flags at our separator — pushing the toolchain's
auto-injected /LD and /Fe: into LINK's argument vector, where LINK
ignored them as LNK4044. With /LD never reaching cl.exe, no /DLL was
forwarded to the linker, which then tried to produce an EXE and bailed
out with LNK1561 ("entry point must be defined").

Drop the redundant /link /LIBPATH and let `libraryDirectories: [libDir]`
do its job. Also drop /VERBOSE from the compile flag list — it's a
linker option that cl.exe parses as deprecated /V<string> (warning
D9035); if the verbose link map is ever needed, it belongs after
native_toolchain_c's own /link separator.

Diagnosed via the stderr-tee added in the prior commit; the smoking
gun was the cl.exe command dump in build.log showing two /link
separators and unrecognized cl flags trailing the first one.

Windows filesystems are case-insensitive, so the basename collision
between native/src/scene/Gizmo.cpp and native/include/material/gizmo.c
caused both source files to compile to the same gizmo.obj path. The
material was added to the source list AFTER the scene class, so its
.obj write-clobbered the class's, leaving thermion::Gizmo's
constructor, pick, highlight, and unhighlight symbols out of the link.
TGizmo.cpp's c_api wrappers then failed with four LNK2019s and a
final LNK1120 ("4 unresolved externals"). LINK's earlier LNK4042
warning ("object specified more than once; extras ignored") was the
breadcrumb.

Rename the material .c file to gizmo_material.c (gizmo.h stays put,
since only the .c basename collided). materialSources entry in
build.dart updated to match. The "gizmo" key still drives the
GIZMO_ENABLED=1 define so existing call sites and tests are
unaffected.

Verified with `flutter test` on Windows after both this and the
preceding cl/link fix: 500/500 tests pass, thermion_dart.dll
produced.

The link hook called CLinker.library(... LinkerOptions.manual(...))
with no `sources`. native_toolchain_c.runCl translates that into:

    cl.exe /O2 /LD /Fe:thermion_dart.dll /link /OPT:REF /MACHINE:X64
           /LIBPATH:<empty link/ output dir>

i.e. cl.exe with no inputs at all, which exits immediately with
`cl : Command line error D8003 : missing source filename`. Under
`flutter test` the link hook is not invoked, so this only surfaces
during a real `flutter build windows --release` — where it shows up
as the terse "Linking native assets failed" through MSBuild, with
the actual D8003 stranded inside the per-package build.log in the
pub cache.

The link phase here is optional: the build hook already produced
thermion_dart.dll and added a CodeAsset for it. Pass those
build-hook assets through unchanged on Windows. Keep the CLinker
call on platforms where it currently works.

…call

`thermion_flutter_plugin`'s own translation units include Filament
headers (transitively, via `WindowsVulkanContext.h` → `Platform.h`),
but `target_include_directories(... INTERFACE ...)` only exposes the
DART_PKG_HEADERS paths to consumers, not to the plugin itself. cl.exe
then could not find `<utils/...>` headers when compiling the plugin,
producing C3083 / C2039 / C4430 on filament headers.

Switch the scope to PUBLIC so the plugin's own compile and any
consumer both see the headers.

Also drop the immediately-following

    include_directories(${PLUGIN_NAME} INTERFACE ...)

call. `include_directories()` accepts only path arguments — the
target name and `INTERFACE` keyword are silently treated as bogus
"directory" entries, so the call doesn't do what its author
intended; it just adds two non-existent paths to the directory-level
include set.

Filament's `backend/Platform.h` friend-declares
`utils::io::ostream& operator<<(...)` without first declaring the
nested namespace. Sibling Filament headers like
`backend/DriverEnums.h` carry the forward declaration

    namespace utils::io { class ostream; }

so the rest of Filament compiles cleanly. The plugin's translation
unit reaches `Platform.h` (via `WindowsVulkanContext.h`) before any
of those siblings, leaving the namespace undeclared and cl.exe
rejecting the friend with C3083 ("the symbol to the left of '::'
must be a type"), C2039 ("'ostream': is not a member of 'utils'"),
and C4430.

Mirror Filament's own forward declaration in the plugin header,
just before the WindowsVulkanContext include. Pulling the full
`<utils/ostream.h>` would also work but drags in more than the
friend needs; the namespace + class fwd-decl matches what
DriverEnums.h does.

@mushogenshin
Contributor Author

Update: validating this branch end-to-end with flutter build windows --release (the path the Windows CI workflow takes) surfaced three more issues beyond what the original PR body covers. They're all in commits already on the branch so this PR now carries all five fixes; I left the original two as the headline since they're the load-bearing ones, but happy to split or rewrite the description if you'd prefer the PR scoped tighter.

The original commits (8495428, 7fc87f6) are sufficient for flutter test to compile, link, and load thermion_dart.dll — that's why the spike's other targets validated cleanly via tests. They're not sufficient for flutter build windows --release, which exercises the Flutter plugin's CMake build and the link hook in addition to the build hook. The new commits cover those:

bef84ae windows: pass build-hook assets through link hook unchanged

thermion_dart/hook/link.dart calls CLinker.library(... LinkerOptions.manual(...)) with no sources. On Windows, native_toolchain_c.runCl translates that into:

cl.exe /O2 /LD /Fe:thermion_dart.dll /link /OPT:REF /MACHINE:X64
       /LIBPATH:<empty link/ output dir>

i.e. cl.exe with no input files at all, which exits with cl : Command line error D8003: missing source filename. flutter test doesn't invoke the link hook so this stays hidden in test runs; flutter build windows --release does, and the failure surfaces as a terse "Linking native assets failed" through MSBuild with the actual D8003 stranded inside the per-package build.log.

The link phase here isn't doing tree-shaking with manual mode — it's a no-op. Pass the build-hook's code assets through unchanged on Windows; keep the CLinker call on platforms where it currently works. (If you'd prefer a deeper fix that wires real link-time sources into CLinker so it can actually do something useful on Windows, happy to take that on as a follow-up — this commit just unblocks the build.)
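Both halves of this commit, the D8003 condition and the pass-through, can be modelled in a few lines of Python. This is a sketch under stated assumptions: `cl_has_inputs` treats any argument before `/link` that does not start with `/` or `-` as an input (source file or `@response` file), and `link_assets` with its `relink` callback is a hypothetical stand-in for the Dart link hook, not its real API.

```python
def cl_has_inputs(args):
    """Return True if the compile part of a cl.exe argument list names
    at least one input file or response file."""
    compile_part = args[:args.index("/link")] if "/link" in args else args
    return any(not a.startswith(("/", "-")) for a in compile_part)


def link_assets(target_os, build_assets, relink):
    """Pass build-hook assets through unchanged on Windows; otherwise
    delegate to the existing link step (modelled by `relink`)."""
    if target_os == "windows":
        return list(build_assets)
    return relink(build_assets)


# The command line quoted above: options only, no inputs.
link_hook_args = ["/O2", "/LD", "/Fe:thermion_dart.dll",
                  "/link", "/OPT:REF", "/MACHINE:X64",
                  "/LIBPATH:<empty link/ output dir>"]
print(cl_has_inputs(link_hook_args))                     # no sources
print(cl_has_inputs(["/O2", "@sources.rsp", "/link"]))   # response file

assets = ["thermion_dart.dll"]
print(link_assets("windows", assets, relink=lambda a: ["relinked"]))
```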

341fc8c windows: PUBLIC plugin include scope, drop bogus include_directories call

thermion_flutter/thermion_flutter/windows/CMakeLists.txt had:

target_include_directories(${PLUGIN_NAME} INTERFACE
  "${CMAKE_CURRENT_SOURCE_DIR}/include"
  "${CMAKE_CURRENT_SOURCE_DIR}"
  "${DART_PKG_HEADERS}"
)

include_directories(${PLUGIN_NAME} INTERFACE
  "${CMAKE_CURRENT_SOURCE_DIR}/include"
  "${CMAKE_CURRENT_SOURCE_DIR}"
  "${DART_PKG_HEADERS}"
)

INTERFACE only exposes the include dirs to consumers, not to the plugin's own translation units. So thermion_flutter_plugin.cpp couldn't see the Filament headers it transitively needs (via WindowsVulkanContext.h → Platform.h). Switched to PUBLIC so both the plugin's own compile and any consumer get the headers.
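The three `target_include_directories` scopes differ only in who sees the directories. The Python sketch below tabulates that rule; it is a model of CMake's documented semantics for illustration, and the function name `visible_include_dirs` is hypothetical.

```python
def visible_include_dirs(scope, dirs):
    """Return (own_compile_dirs, consumer_dirs) for a given scope.

    PRIVATE   -> the target's own compile only
    INTERFACE -> consumers that link the target only
    PUBLIC    -> both
    """
    if scope not in ("PRIVATE", "INTERFACE", "PUBLIC"):
        raise ValueError(f"unknown scope: {scope}")
    own = list(dirs) if scope in ("PRIVATE", "PUBLIC") else []
    consumers = list(dirs) if scope in ("INTERFACE", "PUBLIC") else []
    return own, consumers


dirs = ["${CMAKE_CURRENT_SOURCE_DIR}/include", "${DART_PKG_HEADERS}"]
print(visible_include_dirs("INTERFACE", dirs))  # plugin itself sees nothing
print(visible_include_dirs("PUBLIC", dirs))     # plugin and consumers both see them
```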

The second call, include_directories(${PLUGIN_NAME} INTERFACE ...), is also not valid CMake usage: include_directories() accepts only path arguments, so the target name and the INTERFACE keyword are silently treated as bogus "directory" entries and the call never did what its author intended. Removed.

1c0864d windows: forward-declare utils::io::ostream in plugin header

After fixing the include scope, the plugin compile still failed with:

Platform.h(99,23): error C3083: 'io': the symbol to the left of a '::' must be a type
Platform.h(99,27): error C2039: 'ostream': is not a member of 'utils'
Platform.h(99,34): error C2143: syntax error: missing ';' before '&'
Platform.h(99,27): error C2433: 'filament::backend::ostream': 'friend' not permitted on data declarations
Platform.h(99,27): error C4430: missing type specifier - int assumed.

Filament's backend/Platform.h friend-declares utils::io::ostream& operator<<(...) but doesn't declare the nested namespace itself — it relies on a sibling header (e.g. backend/DriverEnums.h) being included first to provide:

namespace utils::io { class ostream; }

thermion_dart.dll's own translation units transitively include DriverEnums.h before Platform.h so the .dll build is fine. The plugin's translation unit reaches Platform.h via WindowsVulkanContext.h first, with no prior utils::io declaration in scope, and cl.exe rejects the friend.

Mirror Filament's own forward declaration in thermion_flutter_plugin.h, just before the WindowsVulkanContext.h include. Pulling the full <utils/ostream.h> would also work but drags in more than the friend needs.

Verification

flutter build windows --release against the branch HEAD now produces both musculature.exe and Kineograph.exe in build/windows/x64/runner/Release/, with thermion_dart.dll correctly placed at build/native_assets/windows/. Same machine (Windows 11, MSVC 14.50, Flutter stable 3.41.2, native-assets enabled) where the original PR body was validated via flutter test. The downstream CI workflow should now go green; will report back once the next tag push completes.

…missing

`thermion_flutter`'s build hook produces `generated_headers.cmake`
from `thermion_dart`'s metadata during `flutter assemble dart_build`,
which runs as a custom build step inside MSBuild AFTER cmake config
completes. On a fresh checkout the file doesn't exist when cmake
first reads `windows/CMakeLists.txt`, so the unconditional `include()`
errored with "include could not find requested file" and aborted
cmake — leaving the build stuck because the dart_build step never
got to run, so the file was never produced. The only way out was to
fire the hook out of band first (e.g. `flutter test`) and re-run
`flutter run`.

Include conditionally and register the file as a configure
dependency. On a clean checkout this means a two-pass build:

  1. First pass: file missing, cmake succeeds with empty
     DART_PKG_HEADERS, dart_build runs and writes the file. Plugin
     compile fails because the Filament headers are not on the
     include path yet.
  2. Second pass: cmake re-runs (CMAKE_CONFIGURE_DEPENDS noticed
     the new file), DART_PKG_HEADERS is populated, plugin compile
     succeeds.

Subsequent builds are no-ops once the file is settled. Status
message points the user at the recovery path so the first-pass
plugin failure is self-explanatory.
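The two-pass behaviour on a clean checkout can be traced with a toy simulation. This Python sketch only models the ordering described above (configure first, then dart_build, then plugin compile); the function names are hypothetical, a dict stands in for the filesystem, and the header path is a placeholder.

```python
GENERATED = "generated_headers.cmake"


def configure(fs):
    """Conditional include: empty header list when the file is missing."""
    return list(fs.get(GENERATED, []))


def build_pass(fs):
    """Run one configure + build cycle; return True if the plugin compiles."""
    headers = configure(fs)  # cmake config runs first
    # dart_build runs as a build step afterwards and writes the file.
    fs.setdefault(GENERATED, ["<filament include dir>"])
    # The plugin compiles only if the headers were known at configure time.
    return bool(headers)


fs = {}  # clean checkout: generated file does not exist yet
outcomes = [build_pass(fs), build_pass(fs), build_pass(fs)]
print(outcomes)
```

The first pass fails at plugin compile but leaves the generated file behind, which is what lets the second pass succeed.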