This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Treat undefined symbols uniformly
ClosedPublic

Authored by int3 on May 9 2021, 5:28 PM.

Details

Reviewers
thakis
gkm
Group Reviewers
Restricted Project
Commits
rG2516b0b5261d: [lld-macho] Treat undefined symbols uniformly
Summary

In particular, we should apply the -undefined behavior to all
such symbols, include those that are specified via the command line
(i.e. -e, -u, and -exported_symbol). ld64 supports this too.

Diff Detail

Event Timeline

int3 created this revision.May 9 2021, 5:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2021, 5:28 PM
int3 requested review of this revision.May 9 2021, 5:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2021, 5:28 PM
thakis accepted this revision.May 9 2021, 6:14 PM
thakis added a subscriber: thakis.

Interesting! Do you know of anything that actually does this? Why? Every use case I can think of would better use -U main instead of -undefined dynamic_lookup.

(Change lgtm in any case.)

This revision is now accepted and ready to land.May 9 2021, 6:14 PM
thakis added inline comments.May 9 2021, 6:17 PM
lld/MachO/Driver.cpp
1143

I think this needs basically the same change. -u _foo -undefined dynamic_lookup forces a dynamically-looked-up symbol foo to exist at load time with ld64. (Can't really think of a use case for that either, but since it's next to the entry point right now and behaves the same way right now, this should probably receive the same treatment in this change.)

int3 updated this revision to Diff 343964.May 9 2021, 9:40 PM
int3 retitled this revision from [lld-macho] Treat an undefined entry symbol like any other undefined to [lld-macho] Treat undefined symbols uniformly.
int3 edited the summary of this revision. (Show Details)
int3 removed a reviewer: thakis.

handle -u and -exported_symbol similarly

This revision now requires review to proceed.May 9 2021, 9:40 PM
int3 added a reviewer: thakis.May 9 2021, 9:40 PM
int3 marked an inline comment as done.May 9 2021, 9:45 PM

Bah, I accidentally removed you as the reviewer when doing arc diff --verbatim. I've made a bunch of changes since you stamped it though, might be worth to have another quick look...

Interesting! Do you know of anything that actually does this?

Nope, it was just something that occurred to me to test as I was reviewing our entry symbol handling...

lld/MachO/Driver.cpp
1143

good point... looks like -exported_symbol needs the same treatment too. I've verified that ld64 does apply the same behavior to them, and this diff has been expanded accordingly.

Thanks! Still lg, but more suggestions:

lld/MachO/Driver.cpp
1140–1141

Could we keep this here so all the treatUndefinedSymbol()s are in one place? Or is there something that forces this to move around?

if (config->outputType == MH_EXECUTE && isa<Undefined>(config->entry)) {
   treatUndefinedSymbol(cast<Undefined(config->entry), "the entry point");
   if (isa<Undefined>(config->entry))
     return false;
 }
lld/MachO/SymbolTable.cpp
190 ↗(On Diff #343964)

nit: s/&/source/

And if we're doing capturing, do [source, sym] and remove the arg, to make things self-consistent.

(& works here, but with & it's easy to accidentally capture a stack variable, and if the closure escapes that's a recipe for subtle bugs, and compilers don't yet warn on default-capturing locals in escaping closures. So I try to not use default capture.)

198 ↗(On Diff #343964)

nit: this reads imho easier like so:

std::string message = "undefined symbol: " + toString(sym);
if (!source.empty())
  message +=  "\n>>> referenced by " + source.str();
else
  message +=  "\n>>> referenced by " + toString(sym.getFile());
return message;

That was source doesn't point to different things based on control flow, and it's fairly linear and straight-forward. (The "referenced by" string is duplicated, but right on top of each other, so it's obvious it's the same)

(It allocates more, but only in a code path that emits diags, so it doesn't normally run anyways.)

int3 updated this revision to Diff 344104.May 10 2021, 10:11 AM
int3 marked 4 inline comments as done.
  • move undefined check for entry back into Driver.cpp
  • return all undefined symbol errors from -u, not just the first one, similar to what we do for -exported_symbol
lld/MachO/SymbolTable.cpp
190 ↗(On Diff #343964)

fair enough. while at it I renamed message to msg so it doesn't shadow the message defined in lld/Common.

thakis accepted this revision.May 10 2021, 10:27 AM

lg with writer.cpp change undone

lld/MachO/SymbolTable.cpp
194 ↗(On Diff #344104)

Since you're doing returns in the conditional now: style guide says no else after return :P

190 ↗(On Diff #343964)

meh, the local message string shadows too. i'd keep the old name (but up to you)

lld/MachO/Writer.cpp
1055

...this change is no longer needed now that this is done in driver, rigth?

This revision is now accepted and ready to land.May 10 2021, 10:27 AM
int3 marked 3 inline comments as done.May 10 2021, 12:43 PM
int3 added inline comments.
lld/MachO/SymbolTable.cpp
194 ↗(On Diff #344104)

I don't like this part of the style guide :p Guess I'll go with your original suggestion to preserve symmetry

190 ↗(On Diff #343964)

fair point...

lld/MachO/Writer.cpp
1055

I swore I deleted it... oops

int3 updated this revision to Diff 344157.May 10 2021, 12:43 PM
int3 marked 3 inline comments as done.

update

This revision was landed with ongoing or failed builds.May 10 2021, 12:46 PM
This revision was automatically updated to reflect the committed changes.