This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Add warnings for various symbols that cannot be ordered
ClosedPublic

Authored by jhenderson on Jan 24 2018, 6:08 AM.

Details

Summary

There are a number of different situations when symbols are requested to be ordered in the --symbol-ordering-file that cannot be ordered for some reason. To assist with identifying these symbols, and either tidying up the order file, or the inputs, a number of warnings have been added. As some users may find these warnings unhelpful, due to how they use the symbol ordering file, a switch has also been added to disable these warnings.

The cases where we now warn are:

  1. Entries in the order file that don't correspond to any symbol in the input
  2. Undefined symbols
  3. Absolute symbols
  4. Symbols imported from shared objects
  5. Symbols that are discarded, due to e.g. --gc-sections or /DISCARD/ linker script sections
  6. Multiple of the same entry in the order file

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

jhenderson created this revision.Jan 24 2018, 6:08 AM
jhenderson edited the summary of this revision. (Show Details)

Added a new warning to warn if the same name is specified multiple times in the order file.

grimar added a subscriber: grimar.Jan 29 2018, 7:23 AM
jhenderson edited the summary of this revision. (Show Details)

Rebased following r323779. Thanks to that change, buildSectionOrder() is now only called in one place, and only after processSectionCommands(). This means that we don't need the additional boolean to prevent warnings being emitted twice in some situations, and can also now warn for symbols in /DISCARD/ sections that would otherwise be ordered.

grimar added inline comments.Jan 31 2018, 7:56 AM
ELF/Writer.cpp
1045

I would suggest to use bool here:

bool Inserted = SymbolOrder.insert({S, {Priority++, false}}).second;
if (Config->WarnSymbolOrdering && !Inserted)

as we usually trying to avoid using auto and also here type of Inserted would be pair,
though its name seems assuming it is a bool and probably looks better.

1048

Looks you can use early continue:

if (OrderEntry == SymbolOrder.end())
  continue;
OrderEntry->second.second = true;

With that code below seems simplifies to:

int &Priority = SectionOrder[D->Section];
Priority = std::min(Priority, OrderEntry->second.first);
1061

Problem of that line is that it is described as "present in inputs" though set only if
Config->WarnSymbolOrdering is set. Though that should probably be independent.

So I suggest to move it out, like in one of above comments.

1076

We usually inline such short conditions I believe:

if (Config->WarnSymbolOrdering)
  for (auto V : SymbolOrder)
    if (!V.second.second)
      warn("symbol ordering file: no symbol '" + V.first +
           "' found in input");

Also V name usually stands for Vector, though it is not the case here. I would rename.

Did you consider to use MapVector for SymbolOrder to iterate in stable insertion order here ?

jhenderson marked 4 inline comments as done.

Addressed review comments.

ELF/Writer.cpp
1048

Thanks, you're right. I think the problem I thought was that there would not be entries in SectionOrder for all sections in the end, if we stopped early, but actually, that doesn't matter, because we use lookup() on this map eventually, so a missing value is the same as 0.

1076

No, I hadn't considered MapVector, but I think it is a good idea. Thanks for suggesting it.

ruiu added inline comments.Jan 31 2018, 7:09 PM
ELF/Writer.cpp
1042–1044

Actually I'd do this in the driver so that we can make the writer focus on writing.

1052

For a multi-line if, please add {}.

1075

I'd move this to the driver as well.

ruiu added a comment.Jan 31 2018, 7:49 PM

I noticed that all these error checks can be moved to the driver (which is a better place to show a warning message for bad user inputs) if you do like this: https://github.com/rui314/llvm-project/commit/301cf4915cf99b5270905ba823b3b56be327e3f8

I just hacked it up, so I didn't test it, but I think this should work to describe the idea.

In D42475#994397, @ruiu wrote:

I noticed that all these error checks can be moved to the driver (which is a better place to show a warning message for bad user inputs) if you do like this: https://github.com/rui314/llvm-project/commit/301cf4915cf99b5270905ba823b3b56be327e3f8

I just hacked it up, so I didn't test it, but I think this should work to describe the idea.

I like the principle of your idea, but unfortunately, there are two issues with it that I discovered (plus one easily fixed bug in the implementation for absolute symbols). Firstly, it is relying on the symbols in the final symbol table, which means, for example, that where there are local symbols requested to be ordered, a "symbol not found" warning will be emitted, whereas in the old behaviour, local symbols could be ordered. Secondly, since this location in the driver happens before processSectionCommands, there will be no warning for /DISCARD/-ed, which with the current proposal there is. It might be possible to work around the former, by changing the implementation to loop over symbols in the input files, as done currently, but it is not possible to solve the latter using this approach. I'd therefore prefer to keep the change in the current location.

jhenderson updated this revision to Diff 132410.Feb 1 2018, 9:02 AM

Add missing braces. Also remove a .global marker on the absolute symbol, so that it shows up if we stop warning for local symbols, or warn differently for local symbols.

ruiu added inline comments.Feb 1 2018, 12:10 PM
ELF/Writer.cpp
1042–1044

At least you can move this to the driver, no?

1050

I don't like second.second because it's not that meaningful. Can you define a struct for this?

1071

I wouldn't repeat OrderEntry->second. Instead, I'd assign it to a variable with no auto type.

jhenderson updated this revision to Diff 132584.Feb 2 2018, 7:28 AM
jhenderson marked 3 inline comments as done.

Move warning of duplicate entries to earlier, and renamed a few fields and variables.

grimar added a comment.Feb 2 2018, 8:17 AM

My thoughts below. Though probably part about changes in Driver.cpp
may conflict with previous Rui's comment.

Rui, did you mean you want to compute priority in driver ? I realize why you did
that in your version, but since it was decided to keep parts of logic
in writer, it seems to be more appropriate place for doing that.
I assume you meant to report "specified multiple times" there, but
leave the rest to buildSectionOrder ?

ELF/Config.h
77

I do not think this struct should be here.
I believe in Driver.cpp you want to report "specified multiple times" error,
but Priority and Present are things that you need only once in
buildSectionOrder and so them probably should be computed, used
and then forgotten right there.

80

So I think it can be reverted to

std::vector<llvm::StringRef> SymbolOrderingFile;

And then you could for example use local DenseSet<llvm::StringRef>
in getSymbolOrderingFile to check and report duplicates.

ELF/Driver.cpp
597

This can be:

if (!MB) 
  return {};

MapVector<StringRef, SymbolOrderEntry> Ret;
...

To me, it seems inefficient to check for duplicates in one place and then loop through the same list later to set priorities.

jhenderson updated this revision to Diff 133212.Feb 7 2018, 6:58 AM
jhenderson marked 4 inline comments as done.

Address review comments:

  • Move struct back into buildSectionOrder()
  • Move priority and presence initialisation back to buildSectionOrder()
  • Simplify initial check in getSymbolOrderingFile()
  • Use "B" macro for new options

I haven't gone as far as @rafael suggested on the mailing list and moved everything back into the Writer (except for the reading in), since that seemed to be against what @ruiu and @grimar prefer. I'm happy with it either way.

grimar added a comment.Feb 7 2018, 7:42 AM

I haven't gone as far as @rafael suggested on the mailing list and moved everything back into the Writer (except for the reading in), since that seemed to be against what @ruiu and @grimar prefer. I'm happy with it either way.

Thanks for update, Reporting duplicates during reading (with suggestion below) in Driver.cpp looks nice for me, though I have no very strong preference FWIW.

ELF/Driver.cpp
597

I guess this can be SetVector<StringRef> and then something like below should work:

SetVector<StringRef> Names;
for (StringRef S : args::getLines(*MB)) 
  if (!Names.insert(S).second && Config->WarnSymbolOrdering)
    warn("symbol ordering file: symbol '" + S + "' specified multiple times");
return Names.takeVector();
ruiu added inline comments.Feb 7 2018, 6:21 PM
ELF/Driver.cpp
597

This is nice.

jhenderson updated this revision to Diff 133407.Feb 8 2018, 5:51 AM
jhenderson marked an inline comment as done.

Address review comments:

  • Use SetVector instead of two different variables, and simplify related code.
  • Fix comment in test.
jhenderson updated this revision to Diff 133410.Feb 8 2018, 6:19 AM

Pull creation of MemoryBuffer back out of getSymbolOrderingFile().

@ruiu, would you mind confirming that you are happy with this? Rafael gave a LGTM with this and the changes in the previous update, but since you had comments, I'd appreciate it if you confirm before I commit.

ruiu added inline comments.Feb 8 2018, 5:39 PM
ELF/Driver.cpp
593

Remove {}

595

I prefer the following style

warn(MB.getBufferIdentifier() + ": duplicate symbol: " + S);

because (1) it is easier to parse by script, and (2) easier to read if symbol name is extremely long (which is actually common for C++ programs).

ELF/Writer.cpp
1047

It is a conventional name for this.

1050

Ent is perhaps more conventional.

1076

Remove {}

jhenderson updated this revision to Diff 133633.Feb 9 2018, 9:29 AM
jhenderson marked 5 inline comments as done.

Addressed Rui's review comments:

  • Renamed some variables
  • Removed some braces
  • Reformatted messages to put symbols at end and source/object file at front (but see inline comments)
jhenderson added inline comments.Feb 9 2018, 9:29 AM
ELF/Driver.cpp
595

I have a slight concern that "duplicate symbol" could cause confusion with a multiply-defined symbol error. How about "duplicate ordered symbol"? Also, unless we store the order file name somewhere (e.g. in the Config variable), we can't use it in the other warnings, so I've used the owning file of the symbol instead. That still leaves the "no such symbol warning" however without an easily identified file.

jhenderson updated this revision to Diff 133634.Feb 9 2018, 9:30 AM

Clang-format

grimar added inline comments.Feb 12 2018, 1:43 AM
ELF/Driver.cpp
595

I think we often include command line option name to error message, for example:

error("max-page-size: value isn't a power of 2");
...
error("-image-base: number expected, but got " + S);

So to avoid confusion and make message shorter, we could probably use something like:
warn("--warn-symbol-ordering: " + MB.getBufferIdentifier() + ": duplicate symbol: " + S);

jhenderson added inline comments.Feb 12 2018, 2:44 AM
ELF/Driver.cpp
595

This seems like a bit of a new case to me, so it's unclear what should be done. At the moment, there broadly appear to be four kinds of error/warning messages that are somewhat related to this, based on a quick reading of the LLD code:

  1. Messages for problems with object files/archives. These are usually of the form "<object or source file name>: message"
  2. Messages for problems with reading/opening input or output files. These are usually of the form "cannot open <file name> ..."
  3. Messages for problems with input switches. These are of the form "<switch name>: message"
  4. Messages for problems with linker script parsing/evaluation. These are of the "<script file name>: message"

There are also quite a few that don't really fit any pattern, mostly either missing any file name or with the file name in the middle or end of the message, which should probably be one of the other formats. Unfortunately, the closest equivalent to our situation (i.e. a message regarding the contents of a non-script or object input) is the version script warning for duplicate symbol, which does not mention the version script's filename at all, although the message is clear that the duplicate symbol is in the version script, not elsewhere. The next closest is probably the linker script situation (since it is a non-object/archive input), where "-T/--script" is not mentioned in the message, which is the form I've gone for (and explicitly mentioned "ordered"), but I'm happy to change the format to include both switch name and file name if that's the consensus.

grimar added inline comments.Feb 12 2018, 8:30 AM
ELF/Driver.cpp
595

I'm happy to change the format to include both switch name and file name if that's the consensus.

I am fine with either way :) We can tweak message text after anyways.

ruiu accepted this revision.Feb 13 2018, 5:39 PM

LGTM

This revision is now accepted and ready to land.Feb 13 2018, 5:39 PM

Rebased following recent changes by @rafael. I'm going to commit this shortly.

This revision was automatically updated to reflect the committed changes.