[WebAssembly] GC constructor functions in otherwise unused archive objects

This allows __wasilibc_populate_libpreopen to be GC'd in more cases
where it isn't needed, including when linked from Rust's libstd.

To be clear, this change only relates to object files that are part of ar archives and are not part of the link? Perhaps mention that in the PR title/description.


My understanding is that any file that is created as an ObjFile is by definition live, and that all files in symtab->objectFiles are also by definition live.

Archive files don't create any ObjFiles until they are pulled into the link for some reason (i.e. they are live).

What am I missing here?

To be clear, this change only relates to object files that are part of ar archives and are not part of the link? Perhaps mention that in the PR title/description.

Yes, this relates to functions in objects which don't, after GC, contribute any functions to the link. I've now added "archive" to the title.


There are effectively two GC algorithms in wasm-ld today. The first selects the objects that aren't in archives, plus the objects in archives they (transitively) reference. The second one is MarkLive.cpp, which selects exported functions, plus functions they (transitively) reference.

What this patch is saying is, if a constructor function gets pulled in because its object is selected in the first phase, but MarkLive.cpp's GC determines that no functions in that object are transitively called from an export in the second phase, the constructor doesn't need to be called.
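As a rough illustration of that rule, here is a minimal sketch in C++. It is a toy model, not wasm-ld's code: the `Function` struct and `constructorsToCall` are hypothetical names, `functions` stands for everything pulled in by the first phase, and `live` stands for the set of function names marked by the second phase.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model; these names are illustrative, not wasm-ld's.
struct Function {
  std::string name;
  std::string object;  // object file that defines this function
  bool isConstructor;
};

// `functions`: everything selected by phase one (object/archive selection).
// `live`: names of functions marked by phase two (reachable from exports).
// Returns the constructors whose object contributes at least one live
// function; constructors of objects with no live functions are dropped.
std::vector<std::string> constructorsToCall(
    const std::vector<Function>& functions,
    const std::set<std::string>& live) {
  // Collect the objects that contribute at least one live function.
  std::set<std::string> contributingObjects;
  for (const Function& f : functions)
    if (live.count(f.name))
      contributingObjects.insert(f.object);

  // Keep only the constructors defined in those objects.
  std::vector<std::string> result;
  for (const Function& f : functions)
    if (f.isConstructor && contributingObjects.count(f.object))
      result.push_back(f.name);
  return result;
}
```

In this model, an object that was pulled out of an archive but ends up with no live functions has its constructor omitted, while an object with even one live function keeps it. The sketch deliberately ignores calls made by the constructors themselves.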

I see .. so something like this:

  1. Transitive dependency pulls object out of archive.
  2. Source of dependency turns out to not be live in the final link due to --gc-sections.
  3. Object file should no longer be considered part of the link after all, reversing the decision made in (1).

If I'm understanding correctly, one possible problem with this is that it makes --gc-sections observable. I could get a static ctor that runs with --no-gc-sections but then is not run with --gc-sections... maybe this is so subtle as not to matter? But I would normally expect those two builds to be identical in their behaviour.

That's correct.

You can get similar observable behavior changes from any optimization that can delete code, which can lead to fewer object files being pulled in. Consider C code that does this:

int x = 0;
if (x) {
  foo();
}

Plain -O2 will remove the call to foo here. If that's the only call to foo and foo is defined in an object which has a static constructor, the constructor will run at -O0 and won't run at -O2. This patch has a similar effect.

This patch upholds the rule that if the constructor is defined in a .o file which contributes to the final link, it'll run. It's just doing more optimization before making that determination.

  • Reorganize the code a little so that we don't have to call mark multiple times.
  • Fix a bug where we weren't considering calls from constructors as keeping other constructors live.
  • Add a few more tests.

These were tabs, because that's what LLVM emits. I've now changed them to single plain spaces.


Yes, it's iterative in the same way that the broader mark-and-sweep is iterative. When it sees an edge to something that's already live, it doesn't traverse that edge.
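To make the iteration concrete, here is a hedged worklist sketch in C++. It is a toy model and not the patch's implementation: `Function`, `Program`, and `markLive` are hypothetical names, and the rule that a live function enqueues its object's constructors is a simplification of the behavior described in this thread.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model; these names are illustrative, not wasm-ld's.
struct Function {
  std::string object;              // object file that defines this function
  std::vector<std::string> calls;  // names of direct callees
  bool isConstructor;
};

using Program = std::map<std::string, Function>;

// Marks functions reachable from `exports`. When a function becomes live,
// the constructors of its object are enqueued as well, so a call made by
// one constructor can keep another object's constructor live.
std::set<std::string> markLive(const Program& program,
                               const std::vector<std::string>& exports) {
  // Index constructors by the object that defines them.
  std::map<std::string, std::vector<std::string>> ctorsByObject;
  for (const auto& entry : program)
    if (entry.second.isConstructor)
      ctorsByObject[entry.second.object].push_back(entry.first);

  std::set<std::string> live;
  std::vector<std::string> worklist;
  auto enqueue = [&](const std::string& name) {
    // Skip undefined names, and do not traverse an edge to something
    // that is already live.
    if (program.count(name) && live.insert(name).second)
      worklist.push_back(name);
  };

  for (const std::string& name : exports)
    enqueue(name);

  while (!worklist.empty()) {
    std::string name = worklist.back();
    worklist.pop_back();
    assert(live.count(name));  // everything on the worklist is already marked
    const Function& fn = program.at(name);
    for (const std::string& callee : fn.calls)
      enqueue(callee);
    auto it = ctorsByObject.find(fn.object);
    if (it != ctorsByObject.end())
      for (const std::string& ctor : it->second)
        enqueue(ctor);
  }
  return live;
}
```

Because each function is pushed onto the worklist at most once, the loop terminates even when the call graph has cycles, and a single pass handles both ordinary calls and constructor-to-constructor chains.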

I've finished addressing the review comments, so this is now ready for review again.

lgtm % test nits


I've been putting these external declarations in column 0.


No need for a .section directive for functions, since each one implicitly gets its own section anyway.

Looks like we have a specific emscripten test for this since it started failing on our llvm roller:

Was this change supposed to affect archive objects included via --whole-archive too?

Fix (I believe) is in

Oops, I accidentally raced here and posted Looks like our fixes are very similar, so I'll review yours and you can land whichever you prefer.