This is an archive of the discontinued LLVM Phabricator instance.

Differential D116787

[llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO only, for now).
ClosedPublic

Authored by oontvoo on Jan 6 2022, 7:41 PM.

Details

Reviewers

MaskRay
jhenderson

Group Reviewers

Restricted Project

Commits

rGe6e5e3e025ec: [llvm-readobj] Fix forward build breakages caused by https://reviews.llvm.
rGea9cf2dc96c7: [llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO…

Summary

This would help make tests less brittle, as the order will be fixed.

(see also PR/53026)

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jhenderson added inline comments. Jan 10 2022, 12:22 AM

llvm/test/tools/llvm-readobj/MachO/stabs.yaml
56 ↗	(On Diff #398213)	It's probably worth pulling this into a new test file entirely, so that you can have additional cases that trigger different tie-breaking behaviours.
llvm/tools/llvm-readobj/MachODumper.cpp
633–644
661–662	These should be `const &`, right? Otherwise, you end up copying the tuple contents.
669	This isn't a bug, merely a possible improvement.
673	Note that if you move this `ListScope` earlier, outside the `if`, you a) avoid the duplication with the `else`, and b), shouldn't need the `+ 1` I mentioned in my earlier comment about indentation.
llvm/tools/llvm-readobj/Opts.td
47	"displaying" is the term used for other options. This option is (currently) Mach-O specific. Unless you plan on implementing it for other formats too, please move it to the Mach-O specific options block. These options are listed alphabetically within each block. Please maintain that order. Please remember to update the documentation for llvm-readobj (and llvm-readelf, if you are planning on this being a generic option), located at llvm/docs/CommandGuide. If this is going to be Mach-O specific (for now), I'd name the variable accordingly (i.e. something like `macho_sort_symbols`). Also, it will need to have the `grp_macho` Group, like the other Mach-O specific options.
llvm/tools/llvm-readobj/llvm-readobj.cpp
111	Noting that this change isn't needed if you drop the reference to JSON output style from elsewhere in this patch.
280	This option is currently Mach-O specific, so move it accordingly, unless you plan on implementing other formats. Also, these lists are in alphabetical order. Please maintain that order.
llvm/tools/llvm-readobj/llvm-readobj.h
45	Noting that this change isn't needed if you drop the reference to JSON output style from elsewhere in this patch.

oontvoo marked 12 inline comments as done. Jan 10 2022, 10:02 AM

oontvoo added inline comments.

llvm/tools/llvm-readobj/MachODumper.cpp
645–649	Thanks! That does look better!
673	I initially moved it here (closer to where it is used) for clarity. Anyway, I can move it back.
llvm/tools/llvm-readobj/llvm-readobj.cpp
280	Yes, I plan to add that to ELF at least, but don't want to do it all in one patch (esp. since the ELF dumper is a bit more complex). Fixed the ordering, though.

addressed review comments

Harbormaster completed remote builds in B142475: Diff 398685. Jan 10 2022, 10:29 AM

undo changes in stabs.yaml

Harbormaster completed remote builds in B142478: Diff 398691. Jan 10 2022, 10:51 AM

jhenderson added inline comments. Jan 11 2022, 1:27 AM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
2	I'd name this file `sort-symbols.yaml` to match the option name. Also, is it really relevant that "stabs" are mentioned in the comment? Are all symbols stabs? If not, do non-stab symbols get sorted too (note: not a Mach-O developer, so I may be talking nonsense!).
5	As you've now only got one CHECK pattern in this test, you can delete the `--check-prefix` option and just use `CHECK:` and `CHECK-NEXT` below. However, as you're also playing around with whitespace, I'd recommend adding `--strict-whitespace` and `--match-full-lines` to the FileCheck command - this will ensure that all whitespace on all lines after the `CHECK:` must exactly match that in the output. Take a look at some other examples in other tests.
55	As you've now got a separate YAML, I'd change your symbol names to emphasise the differences, rather than being basically unrelated cruft copied over from the old test.
llvm/tools/llvm-readobj/MachODumper.cpp
638	I wouldn't bother with the `assert` here: none of Mach-O output expects JSON format, so adding an assert in one place makes it look like it needs sorting here, but not everywhere else.
670	I'm not clear on whether this unindent needs undoing or not after this. Probably you should manually inspect the output, both for multiple symbols, and for multiple operations, where symbol dumping happens before other output.
llvm/tools/llvm-readobj/Opts.td
47	Pinging the points in this comment (specifically the inline edit, and points 3 and 4).

Addressed review comments:

expect strict white-spaces in test
add docs
keep the option in alphabetical order (also fix the text s/outputting/displaying)
removed unneeded assert and restore indent

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
2	No, technically, not all symbols are STABS symbols (but non-STABS symbols wouldn't be printed here, so...). I was just trying to keep the names consistent with the other file (since both of these test that STABS symbols can be dumped correctly).
55	Actually the names are quite realistic and are representative enough for what I wanted to test. I'm not really seeing why they need to change.

Harbormaster completed remote builds in B142988: Diff 399423. Jan 12 2022, 1:12 PM

jhenderson added inline comments. Jan 13 2022, 12:29 AM

llvm/docs/CommandGuide/llvm-readobj.rst
111
llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
55	It's more about clarity of test. By using "realistic" symbol names, you're actually making it a little harder to see what is important in the testing, as people may just assume they are cruft leftover from how the test input was generated. On the other hand, if you used names like "a", "b", "c" etc, it would be very obvious if they are/are not sorted.
llvm/tools/llvm-readobj/Opts.td
40

addressed review comments

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
55	done - renamed the symbols and added a few more

Harbormaster completed remote builds in B143434: Diff 400047. Jan 14 2022, 10:58 AM

@jhenderson : Do you have any other thought on this? Thanks! :)

In D116787#3251362, @oontvoo wrote:

@jhenderson : Do you have any other thought on this? Thanks! :)

Apologies for the delay, I was on paternity leave for 2 weeks, and am only now getting back to reviews.

Sorry I didn't spot this earlier: it's not obvious to me that --sort-symbols sorts by type. I'd expect it to sort by name, probably. I'm okay with a sort option, but I think we need to reconsider the UI, especially if you are planning on rolling out this option to other formats. There may come a point in the future that people want to sort by other fields. I don't think we need to support this right away, but we could name the switch something flexible enough. A couple of ideas:

Simplest: --sort-symbols-by-type. Just a rename of the switch you've implemented here.
More complicated, but perhaps better UI? --sort-symbols=type, allowing future extensibility i.e. the ability to add e.g. --sort-symbols=name or --sort-symbols=size at some point.

Thoughts?

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
55	I missed the bit about the sorting being done by n_type (I assumed it was based on name). Sorry for the noise, but I suggest you change the names again, to name them after the n_type field they are for, e.g. "_section1", "_section2", "_symDebugTable1" etc.

In D116787#3286763, @jhenderson wrote:

In D116787#3251362, @oontvoo wrote:

@jhenderson : Do you have any other thought on this? Thanks! :)

Apologies for the delay, I was on paternity leave for 2 weeks, and am only now getting back to reviews.

no worries!

Sorry I didn't spot this earlier: it's not obvious to me that --sort-symbols sorts by type. I'd expect it to sort by name, probably. I'm okay with a sort option, but I think we need to reconsider the UI, especially if you are planning on rolling out this option to other formats.

Ah, right. This sorts by both name AND type. (type first, then name amongst the group of symbols of the same type).

There may come a point in the future that people want to sort by other fields. I don't think we need to support this right away, but we could name the switch something flexible enough. A couple of ideas:

Simplest: --sort-symbols-by-type. Just a rename of the switch you've implemented here.

More complicated, but perhaps better UI? --sort-symbols=type, allowing future extensibility i.e. the ability to add e.g. --sort-symbols=name or --sort-symbols=size at some point.

Thoughts?

Regarding (1): it's not entirely true that this only sorts by type (as mentioned, it sorts by both type and name). The end goal here (for me) is to have a way to deterministically sort all the symbols. The reason I didn't go with sorting them simply by name was because maskray@ raised concerns earlier that it didn't make sense semantically (with which I agreed).

Regarding (2), how would multiple sorting types interact? eg., people specifying both --sort-symbols=name --sort-symbols=type. Does the order of the flag determine which one is the first sorting priority?

In D116787#3287486, @oontvoo wrote:

Regarding (1): it's not entirely true that this only sorts by type (as mentioned, it sorts by both type and name). The end goal here (for me) is to have a way to deterministically sort all the symbols. The reason I didn't go with sorting them simply by name was because maskray@ raised concerns earlier that it didn't make sense semantically (with which I agreed).

Regarding (2), how would multiple sorting types interact? eg., people specifying both --sort-symbols=name --sort-symbols=type. Does the order of the flag determine which one is the first sorting priority?

Ah I missed that it sorted by name and type.

Regarding 2, I think flag order determining sort priority would be wonderful, although I'd say --sort-symbols=name,type does it like that. Maybe --sort-symbols=name --sort-symbols=type does too, or maybe it sorts by just one. I'm not sure honestly.

Thoughts @MaskRay?

I agree that extending --sort-symbols to --sort-symbols=<value> is useful, since people may want to support different ways.
For example GNU nm has --numeric-sort, --no-sort, --size-sort. This cannot be changed but retrospectively maybe --sort={numeric,size} is a better UI.

--sort-symbols=name --sort-symbols=type specifying multi sort keys may not be obvious. The most common UI is that the last option overrides previous ones.
--sort-symbols=name,type looks good to me to specify multi sort keys.

(I will add a note that stabs-sorted.yaml looks quite long. It'd be wonderful if creating a test has less boilerplate.
Hope someone in #lld-macho may consider this as yet another motivation to improve Mach-O yaml2obj...)

Note: it may be worth adding --sort-symbols= to llvm-objdump as well. It's good to spend some time on the design.
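
A minimal sketch of the `--sort-symbols=name,type` interface discussed above, written in standard C++ so that it stands alone. `SortKey`, `parseSortKeys` and `parseLastSortOption` are hypothetical names for illustration, not the names used in the patch, and the real tool would go through LLVM's option-parsing machinery instead.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical key names mirroring the "name" and "type" values discussed above.
enum class SortKey { Name, Type };

// Splits a value such as "type,name" into keys, highest priority first.
// Returns std::nullopt if any key is unrecognised (including an empty key).
std::optional<std::vector<SortKey>> parseSortKeys(std::string_view Value) {
  std::vector<SortKey> Keys;
  while (true) {
    const std::size_t Comma = Value.find(',');
    const std::string_view Key = Value.substr(0, Comma);
    if (Key == "name")
      Keys.push_back(SortKey::Name);
    else if (Key == "type")
      Keys.push_back(SortKey::Type);
    else
      return std::nullopt;
    if (Comma == std::string_view::npos)
      return Keys;
    Value.remove_prefix(Comma + 1);
  }
}

// "Last option wins": if the flag is repeated, only the final value is used.
std::optional<std::vector<SortKey>>
parseLastSortOption(const std::vector<std::string> &Values) {
  assert(!Values.empty() && "caller should check that the option was given");
  return parseSortKeys(Values.back());
}
```

Under these assumptions, `--sort-symbols=name --sort-symbols=type` sorts by type only, while `--sort-symbols=name,type` sorts by name and breaks ties by type.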

In D116787#3288864, @MaskRay wrote:

I agree that extending --sort-symbols to --sort-symbols=<value> is useful, since people may want to support different ways.
For example GNU nm has --numeric-sort, --no-sort, --size-sort. This cannot be changed but retrospectively maybe --sort={numeric,size} is a better UI.

--sort-symbols=name --sort-symbols=type specifying multi sort keys may not be obvious. The most common UI is that the last option overrides previous ones.
--sort-symbols=name,type looks good to me to specify multi sort keys.

Ok, SG!

Note: it may be worth adding --sort-symbols= to llvm-objdump as well. It's good to spend some time on the design.

Yep, planning on doing that too. :)

Thanks!

oontvoo planned changes to this revision. Feb 3 2022, 6:56 AM

rework the patch so that --sort-symbols takes one or more keys to sort by + updated tests and docs accordingly

PTAL! Thanks!

Herald added a subscriber: fedor.sergeev. Feb 24 2022, 8:55 AM

minor cleanup

Harbormaster completed remote builds in B151295: Diff 411152. Feb 24 2022, 9:42 AM

oontvoo added inline comments. Feb 24 2022, 9:58 AM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
55	Just to be clear, when we sort by sections, we actually sort by their encoded values (numeric values, e.g. 0x2E for debug) and not the section names. So naming them "_section1", "_section2", etc. doesn't help clarify that point.

I feel like there's an awful lot more code than is required for this patch. I stopped commenting on individual things partway through. I think the following outline should be sufficient. It probably isn't 100% correct, but you should be able to get the general idea from it.

// This code could all live in generic area, since this is generic behaviour.
bool compareSymName(SymbolRef LHS, SymbolRef RHS) {
  // Implementation left as an exercise for the reader. In essence:
  // return LHS.Name < RHS.Name
}

bool compareSymType(SymbolRef LHS, SymbolRef RHS) {
  // Implementation left as an exercise for the reader. In essence:
  // return LHS.Type < RHS.Type
}

class SymbolComparer {
public:
  using ComparePred = function_ref<bool(SymbolRef, SymbolRef)>;
  void add(ComparePred Pred) { Predicates.push_back(Pred); }

  bool operator()(SymbolRef LHS, SymbolRef RHS) {
    for(ComparePred Pred : Predicates) {
      if (Pred(LHS, RHS))
        return true;
      if (Pred(RHS, LHS))
        return false;
    }
    // All considered parameters are equal. This means that a SymbolComparer
    // taking an empty vector in the constructor will treat all symbols as equal.
    return false;
  }

private:
  SmallVector<ComparePred, 2> Predicates;
};

// Code in MachODumper.cpp, possibly even other files too.
void MachODumper::printSymbols(const SymbolComparer &SymCmp) {
  ListScope Group(W, "Symbols");
  auto SymbolRange = Obj->symbols();
  std::vector<SymbolRef> SortedSymbols(SymbolRange.begin(), SymbolRange.end());
  stable_sort(SortedSymbols, SymCmp);
  for (SymbolRef Symbol : SortedSymbols)
    printSymbol(Symbol);
}

// In main
SymbolComparer SymCmp;
if (Arg *A = Args.getLastArg(OPT_sort_symbols_EQ)) {
  const StringMap <SymbolComparer::ComparePred> KeyToPredMap =
    {{"name", compareSymName}, {"type", compareSymType}};
  for (StringRef KeyStr : llvm::split(A->getValue(), ",")) {
    auto Found = KeyToPredMap.find(KeyStr);
    if(Found == KeyToPredMap.end())
      error("--sort-symbols value should be 'name' or 'type', but was '" +
            Twine(KeyStr) + "'");
    else
      SymCmp.add(Found->getValue());
  }
}
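
For readers without an LLVM checkout, the outline above can be approximated in standard C++. This is only a sketch under stated assumptions: `Symbol` is a hypothetical stand-in for `SymbolRef` carrying just a name and a numeric type, `std::function` and `std::vector` replace `function_ref` and `SmallVector`, and `sortedSymbols` is an illustrative helper, not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for SymbolRef: only the two fields being sorted on.
struct Symbol {
  std::string Name;
  unsigned Type;
};

bool compareSymName(const Symbol &LHS, const Symbol &RHS) {
  return LHS.Name < RHS.Name;
}

bool compareSymType(const Symbol &LHS, const Symbol &RHS) {
  return LHS.Type < RHS.Type;
}

class SymbolComparer {
public:
  using ComparePred = std::function<bool(const Symbol &, const Symbol &)>;
  void add(ComparePred Pred) { Predicates.push_back(std::move(Pred)); }

  // Applies the predicates in the order they were added; a later predicate
  // is only consulted when every earlier one considers the symbols equal.
  bool operator()(const Symbol &LHS, const Symbol &RHS) const {
    for (const ComparePred &Pred : Predicates) {
      if (Pred(LHS, RHS))
        return true;
      if (Pred(RHS, LHS))
        return false;
    }
    return false; // Equal under every predicate (or no predicates at all).
  }

private:
  std::vector<ComparePred> Predicates;
};

// Returns a sorted copy; std::stable_sort keeps the original order of
// symbols that compare equal.
std::vector<Symbol> sortedSymbols(std::vector<Symbol> Symbols,
                                  const SymbolComparer &Cmp) {
  std::stable_sort(Symbols.begin(), Symbols.end(), Cmp);
  return Symbols;
}
```

Because each predicate behaves like `std::less`, equality never needs its own function: two symbols are treated as equal for a key when neither orders before the other.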

llvm/docs/CommandGuide/llvm-readobj.rst
112
llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
10	Might be worth another case where just one of these values is specified, showing what happens in this case (e.g. just type means identical type symbols are left in their original order, or something like that). As this test case is about whether or not the symbols are sorted, and not the formatting of the output, I'd remove the `--strict-whitespace` and `--match-full-lines` options.
19–22	I don't think you need this block here.
24–32	Related to my above comment re. formatting, we don't need to show that the entire symbol printing is correct - that's the responsibility of a test that is testing the `--symbols` option rather than the `--sort-symbols` option. Instead, I'd just check the name and type fields. Something like this:
TYPE-NAME: Name: _a
TYPE-NAME-NEXT: Type: Section
TYPE-NAME: Name: _d
TYPE-NAME-NEXT: Type: Section
llvm/tools/llvm-readobj/MachODumper.cpp
95	Please only reformat the parts of the file that you've changed.
622–625	Just use a `struct` rather than a tuple. It'll be easier to reason with and you won't need these constants (which should be `size_t` anyway, since they're indexes).
634	Are you sure about this? They don't look all that temporary (as long as the object file is still in memory)...

In D116787#3345140, @jhenderson wrote:

// This code could all live in generic area, since this is generic behaviour.
bool compareSymName(SymbolRef LHS, SymbolRef RHS) {
  // Implementation left as an exercise for the reader. In essence:
  // return LHS.Name < RHS.Name
}

bool compareSymType(SymbolRef LHS, SymbolRef RHS) {
  // Implementation left as an exercise for the reader. In essence:
  // return LHS.Type < RHS.Type
}

class SymbolComparer {
public:
  using ComparePred = function_ref<bool(SymbolRef, SymbolRef)>;
  void add(ComparePred Pred) { Predicates.push_back(Pred); }

  bool operator()(SymbolRef LHS, SymbolRef RHS) {
    for(ComparePred Pred : Predicates) {
      if (Pred(LHS, RHS))
        return true;
      if (Pred(RHS, LHS))
        return false;
    }
    // All considered parameters are equal. This means that a SymbolComparer
    // taking an empty vector in the constructor will treat all symbols as equal.
    return false;
  }

private:
  SmallVector<ComparePred, 2> Predicates;
};

// Code in MachODumper.cpp, possibly even other files too.
void MachODumper::printSymbols(const SymbolComparer &SymCmp) {
  ListScope Group(W, "Symbols");
  auto SymbolRange = Obj->symbols();
  std::vector<SymbolRef> SortedSymbols(SymbolRange.begin(), SymbolRange.end());
  stable_sort(SortedSymbols, SymCmp);
  for (SymbolRef Symbol : SortedSymbols)
    printSymbol(Symbol);
}

// In main
SymbolComparer SymCmp;
if (Arg *A = Args.getLastArg(OPT_sort_symbols_EQ)) {
  const StringMap <SymbolComparer::ComparePred> KeyToPredMap =
    {{"name", compareSymName}, {"type", compareSymType}};
  for (StringRef KeyStr : llvm::split(A->getValue(), ",")) {
    auto Found = KeyToPredMap.find(KeyStr);
    if(Found == KeyToPredMap.end())
      error("--sort-symbols value should be 'name' or 'type', but was '" +
            Twine(KeyStr) + "'");
    else
      SymCmp.add(Found->getValue());
  }
}

I'm not sure that's much less code - the only substantial chunk of code right now is in MachODumper.cpp, lines ~602 to 722.
The rest is just one or two lines of plumbing the args through.

(Also there were some unrelated formatting changes when I ran clang-format ... will revert those)

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
24–32	i can drop this in the other test (the NAME-TYPE one) but this was needed here to ensure the whitespace padding was done correctly.
llvm/tools/llvm-readobj/MachODumper.cpp
634	Are you sure about this? They don't look all that temporary (as long as the object file is still in memory)...
Yes - I've run the code under ASan (a previous version, which simply stored the values returned from `->symbols()` in a vector and then sorted it) and got errors.

In D116787#3345634, @oontvoo wrote:

I'm not sure that's much less code - the only substantial chunk of code right now is in MachODumper.cpp, lines ~602 to 722.
The rest is just one or two lines of plumbing the args through.

My suggested code has about 55 LOC in total, compared to over 70 in this patch just for the comparators as things stand, without even considering the other code. It's also conceptually simpler: simply store a list of functions to sort by upfront, which are akin to std::less, and therefore don't need "equals" comparison functions, and then use them directly in the sorter. This is before you consider other complications, such as the use of std::tuple and constants to look up in said tuple, or the need for enums in the argument parsing.

llvm/tools/llvm-readobj/MachODumper.cpp
634	Are you sure it was that bit of code causing the problem? I've inspected the implementation of `symbols()` and `SymbolRef` for Mach-O, and the `SymbolRef` in this case is just a pointer into the buffer stored by MachOObjectFile (see `MachOObjectFile::getSymbolByIndex`). As such, there should be no lifetime issues, as long as the object file is in existence. Please try again and analyse where the issue actually is, as it could be a bug elsewhere that this just exposed. I'm going to need more convincing than just "asan said so" that there's an issue storing a SymbolRef in a vector that has a lifetime less than that of the object file.

Please hold off on reviewing... I'll do some more digging and post the findings soon.

Updated diff:

Rework patch per review request
Got rid of unrelated formatting changes

Note: the impl of the predicates (eg., compareSymType/compareSymName) cannot be "shared" because different formats seem to have different ways to get these.

Herald added a project: Restricted Project. Mar 22 2022, 3:57 PM

Herald added a subscriber: StephenFan.

@jhenderson sorry this has taken a while - I was busy with other stuff. I've double-checked the crash I was talking about: it turned out to be caused by getSymbolName() (completely unrelated). So yeah, you're right - storing the SymbolRef is safe. Reworked the patch per your suggestion (with some modifications).
PTAL. thanks!

Harbormaster completed remote builds in B155717: Diff 417420. Mar 22 2022, 4:36 PM

What happened to the CommandGuide update?

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
10	Looks like you haven't addressed my test comments?
llvm/tools/llvm-readobj/MachODumper.cpp
627	Nit: don't start functions with blank lines.
I don't think this is the place to be adding the predicates, because much of this code will end up being duplicated when other format support is added. Instead, I recommend moving it to just after the code that creates the ObjDumper class in llvm-readobj.cpp, and use virtual functions for the predicates, using std::bind to handle the fact that they're member functions. If you do that, there's no need for the sort key enum. Instead, you could delay processing the command-line option until you need it to identify which predicates to add.
llvm/tools/llvm-readobj/ObjDumper.h
41
44	Don't use `final`. It doesn't add any value and just makes later updates more complex.
64	We're inside the `llvm` namespace already.
llvm/tools/llvm-readobj/llvm-readobj.cpp
196–202	If you're sure about the warning, I recommend refactoring this to use the new `warn` function.
269
282–283	I think it would be simpler if this were an error. What's the motivation for a warning?
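
The suggestion above (virtual predicates on the dumper, bound with `std::bind` where the dumper is created) might look roughly like the following standalone sketch. Everything here is an assumption for illustration: `Symbol` stands in for `SymbolRef`, the classes are simplified stand-ins for the real `ObjDumper` and `MachODumper`, and the member-function names and `makePredicates` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for SymbolRef.
struct Symbol {
  std::string Name;
  unsigned Type;
};

// Simplified stand-in for the generic dumper interface.
class ObjDumper {
public:
  virtual ~ObjDumper() = default;
  // Formats without sorting support keep these defaults.
  virtual bool canCompareSymbols() const { return false; }
  virtual bool compareSymbolsByName(const Symbol &, const Symbol &) const {
    return false;
  }
  virtual bool compareSymbolsByType(const Symbol &, const Symbol &) const {
    return false;
  }
};

// Simplified stand-in for a format that knows how to compare its symbols.
class MachODumper : public ObjDumper {
public:
  bool canCompareSymbols() const override { return true; }
  bool compareSymbolsByName(const Symbol &LHS,
                            const Symbol &RHS) const override {
    return LHS.Name < RHS.Name;
  }
  bool compareSymbolsByType(const Symbol &LHS,
                            const Symbol &RHS) const override {
    return LHS.Type < RHS.Type;
  }
};

using ComparePred = std::function<bool(const Symbol &, const Symbol &)>;

// Maps key strings straight to bound member functions, so no separate
// sort-key enum is needed. Dumper must outlive the returned predicates.
std::vector<ComparePred> makePredicates(const ObjDumper &Dumper,
                                        const std::vector<std::string> &Keys) {
  using namespace std::placeholders;
  std::vector<ComparePred> Preds;
  if (!Dumper.canCompareSymbols())
    return Preds;
  for (const std::string &Key : Keys) {
    if (Key == "name")
      Preds.push_back(
          std::bind(&ObjDumper::compareSymbolsByName, &Dumper, _1, _2));
    else if (Key == "type")
      Preds.push_back(
          std::bind(&ObjDumper::compareSymbolsByType, &Dumper, _1, _2));
  }
  return Preds;
}
```

The bound calls still dispatch virtually, so the generic code never needs to know which format it is dealing with.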

addressed review comments + added additional tests

oontvoo added inline comments. Mar 25 2022, 12:04 PM

llvm/tools/llvm-readobj/MachODumper.cpp
627	If you do that, there's no need for the sort key enum. Instead, you could delay processing the command-line option until you need it to identify which predicates to add.
Implemented the virtual function parts as suggested, but I'm still keeping the enum (albeit local/static now) because it's weird to move the option processing outside of the `parseOptions` function.
llvm/tools/llvm-readobj/llvm-readobj.cpp
196–202	(removed)
282–283	(made it an error too)

Harbormaster completed remote builds in B156331: Diff 418289. Mar 25 2022, 1:30 PM

rebase

Harbormaster completed remote builds in B156353: Diff 418318. Mar 25 2022, 7:51 PM

jhenderson added inline comments. Mar 28 2022, 12:17 AM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
24–32	Whitespace padding isn't relevant to this test: the formatting should be tested in a basic symbol printing test, not in a test that is about the `--sort-symbols` option. Besides: without the `--strict-whitespace` and `--match-full-lines` FileCheck options, this doesn't actually verify the whitespace properly.
llvm/tools/llvm-readobj/MachODumper.cpp
638	We usually explicitly use `llvm::` prefix for `std::` algorithm variations like this, to show it's an LLVM extension we're making use of. It also facilitates find and replace, should a `std::` version ever be added in the future.

jhenderson added inline comments. Mar 28 2022, 12:17 AM

llvm/tools/llvm-readobj/MachODumper.cpp
627	Yeah, I understand the concern. The downside with keeping it separate is that you have to effectively repeat the switch/case involved in option parsing in two places, which leads to ugly warts like the need for the `assert(false)` in the second one. Up to you though.
648	I might be mistaken, but I believe LLVM usually doesn't bother commenting out unused parameter names.
llvm/tools/llvm-readobj/ObjDumper.h
100
llvm/tools/llvm-readobj/llvm-readobj.cpp
91	`UNSUPPORTED` or `UNKNOWN` instead of `UNSPEC`.
360	`llvm::Optional` would be the more expressive form here. You could then pass it directly, rather than via the pointer, and have a `None` check instead of a `nullptr` check.
372	Here and below, I don't think you need the trailing return types.
381	Use `case UNSPEC` (renamed according to my earlier comment) here, instead of `default`, to take advantage of compiler warnings about not all cases being filled. See also https://llvm.org/docs/CodingStandards.html#don-t-use-default-labels-in-fully-covered-switches-over-enumerations.
382	Use `llvm_unreachable` rather than `assert(false)`.
387	Error and warning messages shouldn't end with a "."
I have concerns here: this will stop dumping as soon as an unsupported file type is encountered (even in an archive), yet that is unlikely to be what the user wants to happen. They more likely want to continue dumping and just not sort in this case (imagine if they had multiple different file format objects in their input). This should be at most a warning (including the input file name), although I could even see an argument for not having it at all. It also needs testing.
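
The two switch-related comments above (a named case instead of `default`, and `llvm_unreachable` instead of `assert(false)`) can be illustrated with a small standalone sketch. The enum and function names are hypothetical, and `std::abort()` is only a stand-in for `llvm_unreachable`, which needs the LLVM headers.

```cpp
#include <cassert>
#include <cstdlib>
#include <string_view>

// Hypothetical enum; UNKNOWN marks a key the option parser did not recognise.
enum SortSymbolKeyTy { NAME, TYPE, UNKNOWN };

SortSymbolKeyTy parseKey(std::string_view Key) {
  if (Key == "name")
    return NAME;
  if (Key == "type")
    return TYPE;
  return UNKNOWN;
}

// Every enumerator has its own case and there is no default label, so a
// compiler warning such as -Wswitch can flag this switch if a new key is
// added to the enum later.
const char *keyDescription(SortSymbolKeyTy Key) {
  assert(Key != UNKNOWN && "unknown keys should be rejected during parsing");
  switch (Key) {
  case NAME:
    return "sort by symbol name";
  case TYPE:
    return "sort by symbol type";
  case UNKNOWN:
    break;
  }
  // Stand-in for llvm_unreachable("unknown key was not rejected earlier").
  std::abort();
}
```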

oontvoo marked 8 inline comments as done. Mar 28 2022, 8:09 AM

oontvoo added inline comments.

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
24–32	The comment about whitespace was out of date (no longer needed).
llvm/tools/llvm-readobj/MachODumper.cpp
627	(ack'ed - keeping this as is for now until other format started implementing the option)
llvm/tools/llvm-readobj/llvm-readobj.cpp
387	Fixed.
If the format doesn't support --sort-symbols, then users shouldn't specify --sort-symbols. (Keeping it as an error for consistency with how an unknown sort key is handled above: if the tool doesn't understand the request, it's an error rather than a guess. This file also has no precedent for a "warning".)

addressed review comment

jhenderson added inline comments. Mar 28 2022, 8:21 AM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
24–32	You can still simplify the checks as described in my original comment.
llvm/tools/llvm-readobj/MachODumper.cpp
648	Unaddressed but marked as done?
llvm/tools/llvm-readobj/llvm-readobj.cpp
360	I was mistaken to put the `llvm::` qualifier before `Optional`. It looks like many (most?) instances don't use the qualifier for it or `None`, so please remove it.
372	Not addressed?
387	You're making the potentially incorrect assumption that all inputs are of the same format. If a user has two objects of different formats, they might want the symbols sorted for the ones that can be:
$ llvm-readobj elf.o macho.o --sort-symbols --symbols
would result in an error, and nothing printed (not even macho.o's symbols).
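
A rough standalone model of the warn-and-continue behaviour argued for above, as opposed to a hard error. `InputFile`, `DumpResult` and `dumpAll` are hypothetical names, and the message text is borrowed from the test discussion later in this thread; none of this is the tool's actual code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one input file.
struct InputFile {
  std::string Name;
  bool CanSortSymbols; // e.g. true for Mach-O in this patch, false otherwise
};

struct DumpResult {
  std::vector<std::string> Dumped;   // files whose symbols were printed
  std::vector<std::string> Warnings; // one entry per file that was not sorted
};

// An input that cannot be sorted produces a warning naming the file, but
// its symbols are still dumped (unsorted) and later inputs are still
// processed.
DumpResult dumpAll(const std::vector<InputFile> &Inputs, bool SortRequested) {
  DumpResult Result;
  for (const InputFile &File : Inputs) {
    if (SortRequested && !File.CanSortSymbols)
      Result.Warnings.push_back(
          "warning: '" + File.Name +
          "': --sort-symbols is not supported yet for this format");
    Result.Dumped.push_back(File.Name);
  }
  return Result;
}
```

With an error instead of a warning, the loop would stop at the first unsupported input and nothing after it would be printed.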

updated diff

oontvoo marked an inline comment as done. Mar 28 2022, 8:49 AM

oontvoo marked 2 inline comments as done.

Looks good aside from the test comment.

llvm/tools/llvm-readobj/llvm-readobj.cpp
387–390	This warning is untested. Unless you have patches lined up for all other formats, I'd add a test that shows what happens for a mixture of sortable and unsortable formats, e.g. `llvm-readobj wasm.o macho.o elf.o`

added test for warning case

oontvoo marked an inline comment as done. Mar 29 2022, 8:07 AM

jhenderson added inline comments. Mar 29 2022, 8:19 AM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
13–15	As this logic is testing generic logic in the tool, rather than format-specific logic, I think you'd be better off pulling it into a separate test directly in the llvm-readobj directory. Additionally, I wouldn't use ELF, as ELF is a good candidate for the next format to support. I'd use one of the other input formats supported by llvm-readobj.

oontvoo added inline comments. Mar 29 2022, 8:28 AM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
13–15	As you've said, this test is temporary until the other formats are implemented. As such, I don't see why it needs to go in a separate test file.

jhenderson added inline comments. Mar 29 2022, 8:30 AM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
13–15	As an ELF developer, I wouldn't necessarily expect me adding support to the ELF layer to break a Mach-O test, which is currently what would happen.

oontvoo marked an inline comment as done. Mar 29 2022, 8:45 AM

oontvoo added inline comments.

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
13–15	k

added more tests

Herald added a subscriber: aheejin. Mar 29 2022, 8:45 AM

rebase

jhenderson added inline comments. Mar 29 2022, 11:45 PM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
231	Nit: missing newline at EOF.
llvm/test/tools/llvm-readobj/sort-symbols.test
1	Double # for comments in these tests, and remove the double space.
6	Any particular reason you've not included xcoff?
16	I'd consider pruning this back to just `warning '{{.+}}_macho':` to reduce the risk of false negatives due to a slight change in the warning message.

oontvoo marked 4 inline comments as done. Mar 30 2022, 6:24 AM

oontvoo added inline comments.

llvm/test/tools/llvm-readobj/sort-symbols.test
16	Wouldn't that make the test more prone to false positives? That is, if some new warning pops up somewhere else, this would trip. So I'm going to keep this.

updated diff

jhenderson added inline comments. Mar 30 2022, 6:42 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	Yes, it would, but we probably shouldn't be invoking behaviour that causes those warnings anyway, so they're harmless (it would be easy to fix the test if it did trigger one in the future). The test as it was before the last edit demonstrates why overly strict CHECK-NOTs are not much use, because typos can cause them to pass spuriously.
69	Too many blank lines at EOF (should be exactly one \n at the end).

removed blank line

oontvoo marked an inline comment as done. Mar 30 2022, 6:44 AM

oontvoo added inline comments.

llvm/test/tools/llvm-readobj/sort-symbols.test
16	But that's not this test's job to guard against OTHER kinds of warnings.

oontvoo marked 2 inline comments as done. Mar 30 2022, 6:44 AM

oontvoo added inline comments.

llvm/test/tools/llvm-readobj/sort-symbols.test
16	Furthermore, false negatives/brittle tests are just as frustrating.

oontvoo added inline comments. Mar 30 2022, 6:51 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	Yes, it would, but we probably shouldn't be invoking behaviour that causes those warnings anyway, so they're harmless (it would be easy to fix the test if it did trigger one in the future).
I disagree with the assertion that it's easy to fix these. Imagine there were a dozen tests similar to this one which were not expecting some warnings, then someone added a new warning and they would have to go update all these tests, even though it's not their fault. (it is the test's fault that it casts too wide a nest on the warning).

oontvoo added inline comments. Mar 30 2022, 6:52 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	s/nest/net
Do you have any other comments on this patch? It seems we've been going back and forth for a very long time without making any more progress...

jhenderson added inline comments. Mar 30 2022, 7:10 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	@MaskRay, any thoughts on this or other aspects of this patch?
As someone who has been stung way too many times by rotten tests caused by negative matches like this not actually catching the thing I'm expecting to be caught, I'm incredibly wary of strict -NOT patterns like this. This isn't some arbitrary concern: I've seen bugs in released products because of this exact kind of overly precise check pattern.
That being said, there is an alternative approach I think you could consider: stick the message after the colon in a FileCheck define and then use it in both the positive and negative matches. That way, if the message is changed, the positive matches will start failing, prompting the developer to update that check too, which in turn will ensure the CHECK-NOT doesn't rot (since it's guaranteed to be testing the exact same string).
# RUN: ... | FileCheck %s -DMSG="--sort-symbols is not supported yet for this format"
# CHECK: warning: '{{.+}}_coff': [[MSG]]
...
# CHECK-NOT: warning: '{{.+}}': [[MSG]]
(NB: I've left the file path loose in the negative match so that if somebody changes the input file name, the check pattern is still valid).

oontvoo added inline comments.Mar 30 2022, 7:26 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	> As someone who has been stung way too many times by rotten tests caused by negative matches like this not actually catching the thing I'm expecting to be caught, I'm incredibly wary of strict -NOT patterns like this. This isn't some arbitrary concern: I've seen bugs in released products because of this exact kind of overly precise check pattern.
Generally, maybe there's a point, but in this context specifically, what would be the bug if the warning were emitted for macho and not caught? The macho-specific tests should have caught it (i.e., changes in code logic).
And as stated before, you were arguing for making this CHECK-NOT catch all the warnings, which I disagree with (reasons stated a few comments back). So while it's true that the current setup isn't ideal, given a choice of false negative vs. false positive, I think most would agree that a test should lean toward false positive because false negatives tend to waste a lot of other people's time in debugging test failures.
> That being said, there is an alternative approach I think you could consider: stick the message after the colon in a FileCheck define and then use it in both the positive and negative matches. That way, if the message is changed, the positive matches will start failing, prompting the developer to update that check too, which in turn will ensure the CHECK-NOT doesn't rot (since it's guaranteed to be testing the exact same string).
> # RUN: ... | FileCheck %s -DMSG="--sort-symbols is not supported yet for this format"
> # CHECK: warning: '{{.+}}_coff': [[MSG]]
> ...
> # CHECK-NOT: warning: '{{.+}}': [[MSG]]
> (NB: I've left the file path loose in the negative match so that if somebody changes the input file name, the check pattern is still valid).

oontvoo added inline comments.Mar 30 2022, 7:34 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	> (NB: I've left the file path loose in the negative match so that if somebody changes the input file name, the check pattern is still valid).
What would be the legitimate reason to change the macho filename and not update the test? Those names were chosen specifically to differentiate the different formats.

pass MSG param to FileCheck

updated diff

MaskRay added inline comments.Mar 30 2022, 10:59 AM

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml
2
llvm/test/tools/llvm-readobj/sort-symbols.test
16	The current `# CHECK-NOT: warning '{{.+}}_macho': [[MSG]]` usage looks quite nice, though you may omit `:` after `warning`. By saving `MSG` to a -D variable, you don't have to repeat the message. I also wonder whether we can just use `# CHECK-NOT: warning:` to assert no extra warnings. To make the test more specific, we want to make the file as valid as possible and rule out possibilities for other warnings, then `# CHECK-NOT: warning:` suffices.

updated diff

oontvoo added inline comments.Mar 30 2022, 11:11 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	> though you may omit : after warning. By saving MSG to a -D variable, you don't have to repeat the message.
I'm not sure factoring the `:` to `MSG` helps with readability - the CHECK statements would look like:
# CHECK: warning '{{.+}}_coff'[[MSG]]
which is not as good as the current form

oontvoo added inline comments.Mar 30 2022, 11:25 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	> I also wonder whether we can just use `# CHECK-NOT: warning:` to assert no extra warnings. To make the test more specific, we want to make the file as valid as possible and rule out possibilities for other warnings, then `# CHECK-NOT: warning:` suffices.
Why does only this test need to assert that there are no other warnings/errors? I've checked all other tests in this project, and none of them has a check that no additional warnings were emitted.

oontvoo added inline comments.Mar 30 2022, 11:38 AM

llvm/test/tools/llvm-readobj/sort-symbols.test
16	In other words, I don't see why the burden has to be on this test to check that there are no other warnings. It should be as precise as possible in terms of what warning it does not expect and in this case, it does not expect this specific warning for the macho format. Any other warning is another test's problem. (This test is named "sort-symbols.test" and not "all-warnings.test" for a reason.) If we want to have warning check tests, that's another discussion.

Harbormaster completed remote builds in B157022: Diff 419220.Mar 30 2022, 10:33 PM

LGTM now, thanks!

This revision is now accepted and ready to land.Mar 30 2022, 11:58 PM

rebase

This revision was landed with ongoing or failed builds.Mar 31 2022, 6:16 AM

Closed by commit rGea9cf2dc96c7: [llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO… (authored by oontvoo).

This revision was automatically updated to reflect the committed changes.

oontvoo added a commit: rGea9cf2dc96c7: [llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO….

oontvoo added a reverting change: rG33b3c86afab0: Revert "[llvm-readobj][MachO] Add option to sort the symbol table before….Mar 31 2022, 6:33 AM

Harbormaster completed remote builds in B157158: Diff 419418.Mar 31 2022, 12:05 PM

oontvoo mentioned this in rGe6e5e3e025ec: [llvm-readobj] Fix forward build breakages caused by https://reviews.llvm..Mar 31 2022, 12:23 PM

oontvoo added a commit: rGe6e5e3e025ec: [llvm-readobj] Fix forward build breakages caused by https://reviews.llvm..Mar 31 2022, 12:24 PM

oontvoo mentioned this in rG33e197112a21: [llvm-readobj] Support non 64bit platforms too.Mar 31 2022, 12:40 PM

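Revision Contents

Path	Size
llvm/docs/CommandGuide/llvm-readobj.rst	4 lines
llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml	230 lines
llvm/test/tools/llvm-readobj/sort-symbols.test	58 lines
llvm/tools/llvm-readobj/…	49 lines
llvm/tools/llvm-readobj/…	58 lines
llvm/tools/llvm-readobj/…	1 line
llvm/tools/llvm-readobj/…	5 lines
llvm/tools/llvm-readobj/…	53 lines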

Diff 418931

llvm/docs/CommandGuide/llvm-readobj.rst

When used with :option:`--sections`, display relocations for each section

shown. This option has no effect for GNU style output.

.. option:: --section-symbols, --st

When used with :option:`--sections`, display symbols for each section shown.

This option has no effect for GNU style output.

.. option:: --sort-symbols=<sort_key[,sort_key]>

Specify the keys to sort symbols before displaying symtab.

jhendersonUnsubmitted

Done

.. option:: --sort-symbols

- Sort symbols before displaying symtab

+ Sort symbols before displaying symtab.

.. option:: --stackmap

Valid values for sort_key are ``name`` and ``type``.

jhendersonUnsubmitted

Done

Specify the keys to sort symbols before displaying symtab.

- Valid values for sort_key are ``name`` and ``type``

+ Valid values for sort_key are ``name`` and ``type``.

.. option:: --stackmap

.. option:: --stackmap

Display contents of the stackmap section.

.. option:: --string-dump=<section[,section,...]>, -p

Display the specified section(s) as a list of strings. ``section`` may be a

section index or section name.

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml

This file was added.

## Verify that llvm-readobj can dump files with stabs symbols in a sorted order,

jhendersonUnsubmitted

Done

I'd name this file sort-symbols.yaml to match the option name. Also, is it really relevant that "stabs" are mentioned in the comment? Are all symbols stabs? If not, do non-stab symbols get sorted too (note: not a Mach-O developer, so I may be talking nonsense!).

oontvooAuthorUnsubmitted

Done

No, technically, not all symbols are STABS symbols (but non-STABS symbols wouldn't be printed here, so ...). I was just trying to keep the names consistent with the other file (since both of these test that STABS symbols can be dumped correctly).

MaskRayUnsubmitted

Done

- ## Verify that llvm-readobj can dump files with stabs symbols in a sorted order,

+ ## Verify that llvm-readobj can dump files with stabs symbols in a sorted order.

# RUN: yaml2obj --docnum=1 %s -o %t

# RUN: yaml2obj --docnum=1 %s -o %t

# RUN: not llvm-readobj --syms --sort-symbols=foo %t 2>&1 | FileCheck %s --check-prefix ERR-KEY

jhendersonUnsubmitted

Done

As you've now only got one CHECK pattern in this test, you can delete the --check-prefix option and just use CHECK: and CHECK-NEXT below.

However, as you're also playing around with whitespace, I'd recommend adding --strict-whitespace and --match-full-lines to the FileCheck command - this will ensure that all whitespace on all lines after the CHECK: must exactly match that in the output. Take a look at some other examples in other tests.

# RUN: not llvm-readobj --syms --sort-symbols=,, %t 2>&1 | FileCheck %s --check-prefix ERR-KEY-EMPT

# RUN: llvm-readobj --syms --sort-symbols=type,name %t | FileCheck %s --check-prefix TYPE-NAME

# RUN: llvm-readobj --syms --sort-symbols=name,type %t | FileCheck %s --check-prefix NAME-TYPE

# RUN: llvm-readobj --syms --sort-symbols=type %t | FileCheck %s --check-prefix TYPE-ONLY

jhendersonUnsubmitted

Done

Might be worth another case where just one of these values is specified, showing what happens in this case (e.g. just type means identical type symbols are left in their original order, or something like that).

As this test case is about whether or not the symbols are sorted, and not the formatting of the output, I'd remove the --strict-whitespace and --match-full-lines options.
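To make the tie-breaking case concrete, here is a minimal C++ sketch of a type-only sort. It assumes the sort is stable (the patch's own sorting code is not shown in this excerpt), and `Sym`, `sortByTypeOnly`, and `testSymbols` are hypothetical names used only for illustration. The symbol data is copied from the NameList and StringTable of this test file.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a Mach-O symbol table entry; only the two
// fields that matter for sorting are modelled.
struct Sym {
  std::string Name;
  uint8_t Type; // raw n_type value, e.g. 0x0E or 0x64
};

// Sort by type only. std::stable_sort keeps symbols with equal types in
// their original relative order (assumed tie-breaking behaviour).
inline std::vector<Sym> sortByTypeOnly(std::vector<Sym> Syms) {
  std::stable_sort(Syms.begin(), Syms.end(),
                   [](const Sym &L, const Sym &R) { return L.Type < R.Type; });
  return Syms;
}

// The eight symbols from the NameList in this test, in file order.
inline std::vector<Sym> testSymbols() {
  return {{"_g", 0x64}, {"_d", 0x0E}, {"_d2", 0x66}, {"_b", 0x66},
          {"_a", 0x0E}, {"_c", 0x64}, {"_f", 0x2E}, {"_z", 0x2E}};
}
```

Under that assumption, `sortByTypeOnly(testSymbols())` yields `_d, _a, _f, _z, _g, _c, _d2, _b`, which is the order the TYPE-ONLY checks in this file expect.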

jhendersonUnsubmitted

Done

Looks like you haven't addressed my test comments?

# TYPE-NAME: Name: _a (19)

# TYPE-NAME-NEXT: Type: Section (0xE)

# TYPE-NAME: Name: _d (10)

# TYPE-NAME-NEXT: Type: Section (0xE)

jhendersonUnsubmitted

Not Done

As this logic is testing generic logic in the tool, rather than format-specific logic, I think you'd be better off pulling it into a separate test directly in the llvm-readobj directory. Additionally, I wouldn't use ELF, as ELF is a good candidate for the next format to support. I'd use one of the other input formats supported by llvm-readobj.

oontvooAuthorUnsubmitted

Done

As you've said, this test is temporary until the other formats are implemented. As such, I don't see why it needs to go in a separate test file.

jhendersonUnsubmitted

Done

As an ELF developer, I wouldn't necessarily expect me adding support to the ELF layer to break a Mach-O test, which is currently what would happen.

oontvooAuthorUnsubmitted

Done

oontvoo: k

# TYPE-NAME: Name: _f (7)

# TYPE-NAME-NEXT: Type: SymDebugTable (0x2E)

# TYPE-NAME: Name: _z (1)

# TYPE-NAME-NEXT: Type: SymDebugTable (0x2E)

# TYPE-NAME: Name: _c (13)

# TYPE-NAME-NEXT: Type: SymDebugTable (0x64)

# TYPE-NAME: Name: _g (4)

jhendersonUnsubmitted

Done

I don't think you need this block here.

# TYPE-NAME-NEXT: Type: SymDebugTable (0x64)

# TYPE-NAME: Name: _b (16)

# TYPE-NAME-NEXT: Type: SymDebugTable (0x66)

# TYPE-NAME: Name: _d2 (22)

# TYPE-NAME-NEXT: Type: SymDebugTable (0x66)

# NAME-TYPE: Name: _a (19)

# NAME-TYPE-NEXT: Type: Section (0xE)

# NAME-TYPE: Name: _b (16)

# NAME-TYPE-NEXT: Type: SymDebugTable (0x66)

jhendersonUnsubmitted

Done

Related to my above comment re. formatting, we don't need to show that the entire symbol printing is correct - that's the responsibility of a test that is testing the --symbols option rather than the --sort-symbols option. Instead, I'd just check the name and type fields. Something like this:

TYPE-NAME:      Name: _a
TYPE-NAME-NEXT: Type: Section
TYPE-NAME:      Name: _d
TYPE-NAME-NEXT: Type: Section

oontvooAuthorUnsubmitted

Done

I can drop this in the other test (the NAME-TYPE one), but this was needed here to ensure the whitespace padding was done correctly.

jhendersonUnsubmitted

Done

Whitespace padding isn't relevant to this test: the formatting should be tested in a basic symbol printing test, not in a test that is about the --sort-symbols option. Besides: without the --strict-whitespace and --match-full-lines FileCheck options, this doesn't actually verify the whitespace properly.

oontvooAuthorUnsubmitted

Done

The comment about whitespace was out of date (no longer needed).

jhendersonUnsubmitted

Done

You can still simplify the checks as described in my original comment.

# NAME-TYPE: Name: _c (13)

# NAME-TYPE-NEXT: Type: SymDebugTable (0x64)

# NAME-TYPE: Name: _d (10)

# NAME-TYPE-NEXT: Type: Section (0xE)

# NAME-TYPE: Name: _d2 (22)

# NAME-TYPE-NEXT: Type: SymDebugTable (0x66)

# NAME-TYPE: Name: _f (7)

# NAME-TYPE-NEXT: Type: SymDebugTable (0x2E)

# NAME-TYPE: Name: _g (4)

# NAME-TYPE-NEXT: Type: SymDebugTable (0x64)

# NAME-TYPE: Name: _z (1)

# NAME-TYPE-NEXT: Type: SymDebugTable (0x2E)

# TYPE-ONLY: Name: _d (10)

# TYPE-ONLY-NEXT: Type: Section (0xE)

# TYPE-ONLY: Name: _a (19)

# TYPE-ONLY-NEXT: Type: Section (0xE)

# TYPE-ONLY: Name: _f (7)

# TYPE-ONLY-NEXT: Type: SymDebugTable (0x2E)

# TYPE-ONLY: Name: _z (1)

# TYPE-ONLY-NEXT: Type: SymDebugTable (0x2E)

# TYPE-ONLY: Name: _g (4)

# TYPE-ONLY-NEXT: Type: SymDebugTable (0x64)

jhendersonUnsubmitted

Not Done

As you've now got a separate YAML, I'd change your symbol names to emphasise the differences, rather than being basically unrelated cruft copied over from the old test.

oontvooAuthorUnsubmitted

Done

Actually the names are quite realistic and are representative enough for what I wanted to test. I'm not really seeing why they need to change.

jhendersonUnsubmitted

Done

It's more about clarity of test. By using "realistic" symbol names, you're actually making it a little harder to see what is important in the testing, as people may just assume they are cruft leftover from how the test input was generated. On the other hand, if you used names like "a", "b", "c" etc, it would be very obvious if they are/are not sorted.

oontvooAuthorUnsubmitted

Done

done - renamed the symbols and added a few more

jhendersonUnsubmitted

Done

I missed the bit about the sorting being done by n_type (I assumed it was based on name). Sorry for the noise, but I suggest you change the names again, to name them after the n_type field they are for, e.g. "_section1", "_section2", "_symDebugTable1" etc.

oontvooAuthorUnsubmitted

Done

Just to be clear, when we sort by sections, we actually sort by their encoded values (numeric values, e.g., 0x2E for debug) and not the section names. So naming them "_section1", "_section2", etc., doesn't help clarify that point.
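The point about numeric type values can be illustrated with a small multi-key comparator. This is a sketch, not the patch's implementation: `Sym`, `SortKey`, `symbolLess`, and `sortedNames` are hypothetical names, and the type is modelled as the raw `n_type` byte. Note that 0x2E, 0x64, and 0x66 all print as `SymDebugTable` in the checks above, so only the numeric value can order them.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a symbol: printed name plus raw n_type byte.
struct Sym {
  std::string Name;
  uint8_t Type;
};

enum class SortKey { Name, Type };

// Returns true if L should come before R. Keys are consulted in order;
// a later key is only used when all earlier keys compare equal.
inline bool symbolLess(const Sym &L, const Sym &R,
                       const std::vector<SortKey> &Keys) {
  for (SortKey K : Keys) {
    if (K == SortKey::Type && L.Type != R.Type)
      return L.Type < R.Type; // numeric comparison, not the printed name
    if (K == SortKey::Name && L.Name != R.Name)
      return L.Name < R.Name;
  }
  return false; // equal under every key
}

// Sorts a copy of Syms by the given keys and returns the names in order.
inline std::vector<std::string> sortedNames(std::vector<Sym> Syms,
                                            const std::vector<SortKey> &Keys) {
  std::stable_sort(Syms.begin(), Syms.end(),
                   [&](const Sym &L, const Sym &R) {
                     return symbolLess(L, R, Keys);
                   });
  std::vector<std::string> Names;
  for (const Sym &S : Syms)
    Names.push_back(S.Name);
  return Names;
}
```

With the eight symbols from this test, the key order `{Type, Name}` gives `_a, _d, _f, _z, _c, _g, _b, _d2`, and `{Name, Type}` gives `_a, _b, _c, _d, _d2, _f, _g, _z`, matching the TYPE-NAME and NAME-TYPE checks.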

# TYPE-ONLY: Name: _c (13)

# TYPE-ONLY-NEXT: Type: SymDebugTable (0x64)

# TYPE-ONLY: Name: _d2 (22)

# TYPE-ONLY-NEXT: Type: SymDebugTable (0x66)

# TYPE-ONLY: Name: _b (16)

# TYPE-ONLY-NEXT: Type: SymDebugTable (0x66)

--- !mach-o
FileHeader:
  magic: 0xFEEDFACF
  cputype: 0x1000007
  cpusubtype: 0x3
  filetype: 0x1
  ncmds: 3
  sizeofcmds: 496
  flags: 0x2000
  reserved: 0x0
LoadCommands:
  - cmd: LC_SEGMENT_64
    cmdsize: 392
    segname: ''
    vmaddr: 0
    vmsize: 32
    fileoff: 528
    filesize: 28
    maxprot: 7
    initprot: 7
    nsects: 4
    flags: 0
    Sections:
      - sectname: __text
        segname: __TEXT
        addr: 0x0
        size: 9
        offset: 0x210
        align: 0
        reloff: 0x230
        nreloc: 1
        flags: 0x80000000
        reserved1: 0x0
        reserved2: 0x0
        reserved3: 0x0
        content: '000000000000000000'
        relocations:
          - address: 0x0
            symbolnum: 7
            pcrel: false
            length: 3
            extern: true
            type: 0
            scattered: false
            value: 0
      - sectname: more_data
        segname: __DATA
        addr: 0x9
        size: 8
        offset: 0x219
        align: 0
        reloff: 0x0
        nreloc: 0
        flags: 0x0
        reserved1: 0x0
        reserved2: 0x0
        reserved3: 0x0
        content: 7B00000000000000
      - sectname: __data
        segname: __DATA
        addr: 0x11
        size: 11
        offset: 0x221
        align: 0
        reloff: 0x0
        nreloc: 0
        flags: 0x0
        reserved1: 0x0
        reserved2: 0x0
        reserved3: 0x0
        content: 7B00000000000000000000
      - sectname: __common
        segname: __DATA
        addr: 0x1C
        size: 4
        offset: 0x0
        align: 2
        reloff: 0x0
        nreloc: 0
        flags: 0x1
        reserved1: 0x0
        reserved2: 0x0
        reserved3: 0x0
  - cmd: LC_SYMTAB
    cmdsize: 24
    symoff: 568
    nsyms: 8
    stroff: 696
    strsize: 32
  - cmd: LC_DYSYMTAB
    cmdsize: 80
    ilocalsym: 0
    nlocalsym: 7
    iextdefsym: 7
    nextdefsym: 0
    iundefsym: 7
    nundefsym: 1
    tocoff: 0
    ntoc: 0
    modtaboff: 0
    nmodtab: 0
    extrefsymoff: 0
    nextrefsyms: 0
    indirectsymoff: 0
    nindirectsyms: 0
    extreloff: 0
    nextrel: 0
    locreloff: 0
    nlocrel: 0
LinkEditData:
  NameList:
    - n_strx: 4
      n_type: 0x64
      n_sect: 1
      n_desc: 0
      n_value: 0
    - n_strx: 10
      n_type: 0xE
      n_sect: 1
      n_desc: 0
      n_value: 8
    - n_strx: 22
      n_type: 0x66
      n_sect: 1
      n_desc: 0
      n_value: 8
    - n_strx: 16
      n_type: 0x66
      n_sect: 2
      n_desc: 0
      n_value: 9
    - n_strx: 19
      n_type: 0xE
      n_sect: 3
      n_desc: 0
      n_value: 17
    - n_strx: 13
      n_type: 0x64
      n_sect: 4
      n_desc: 0
      n_value: 28
    - n_strx: 7
      n_type: 0x2E
      n_sect: 3
      n_desc: 0
      n_value: 25
    - n_strx: 1
      n_type: 0x2E
      n_sect: 0
      n_desc: 0
      n_value: 0
  StringTable:
    - ''
    - _z
    - _g
    - _f
    - _d
    - _c
    - _b
    - _a
    - _d2
    - ''
...

No newline at end of file

jhendersonUnsubmitted

Done

Nit: missing newline at EOF.

llvm/test/tools/llvm-readobj/sort-symbols.test

This file was added.

# Test that we print a warning for ELF, WASM, and COFF but still dump the contents for all.

jhendersonUnsubmitted

Done

- # Test that we print a warning for ELF, WASM, and COFF but still dump the contents for all.

+ ## Test that we print a warning for ELF, WASM, and COFF but still dump the contents for all.

# RUN: yaml2obj --docnum=1 %s -o %t_macho

Double # for comments in these tests, and remove the double space.

# RUN: yaml2obj --docnum=1 %s -o %t_macho

# RUN: yaml2obj --docnum=2 %s -o %t_elf

# RUN: yaml2obj --docnum=3 %s -o %t_wasm

# RUN: yaml2obj --docnum=4 %s -o %t_coff

jhendersonUnsubmitted

Done

Any particular reason you've not included xcoff?

# RUN: llvm-readobj --syms --sort-symbols=type,name %t_elf %t_wasm %t_coff %t_macho 2>&1 | FileCheck %s

# CHECK: warning: '{{.+}}_elf': --sort-symbols is not supported yet for this format

# CHECK: Format: elf64-unknown

# CHECK: warning: '{{.+}}_wasm': --sort-symbols is not supported yet for this format

# CHECK: Format: WASM

# CHECK: warning: '{{.+}}_coff': --sort-symbols is not supported yet for this format

# CHECK: Format: aixcoff-rs6000

# CHECK-NOT: warning '{{.+}}_macho': --sort-symbols is not supported yet for this format.

jhendersonUnsubmitted

Done

# CHECK: Format: aixcoff-rs6000

- # CHECK-NOT: warning '{{.+}}_macho': --sort-symbols is not supported yet for this format.

+ # CHECK-NOT: warning '{{.+}}_macho': --sort-symbols is not supported yet for this format

# CHECK: Format: Mach-O 64-bit x86-64

I'd consider pruning this back to just warning '{{.+}}_macho': to reduce the risk of false negatives due to a slight change in the warning message.

oontvooAuthorUnsubmitted

Done

Wouldn't that make the test more prone to false positives? that is, if some new warning pops up somewhere else, this would trip. So i'm going to keep this.

jhendersonUnsubmitted

Done

Yes, it would, but we probably shouldn't be invoking behaviour that causes those warnings anyway, so they're harmless (it would be easy to fix the test if it did trigger one in the future).

The test as it was before the last edit demonstrates why overly strict CHECK-NOTs are not much use, because typos can cause them to pass spuriously.

oontvooAuthorUnsubmitted

Done

But that's not this test's job to guard against *OTHER* kinds of warnings.

oontvooAuthorUnsubmitted

Done

Furthermore, false negatives/brittle tests are just as frustrating.

oontvooAuthorUnsubmitted

Done

> Yes, it would, but we probably shouldn't be invoking behaviour that causes those warnings anyway, so they're harmless (it would be easy to fix the test if it did trigger one in the future).

I disagree with the assertion that it's easy to fix these. Imagine there were a dozen tests similar to this one which were not expecting some warnings, then someone added a new warning and they would have to go update all these tests, even though it's not their fault. (it is the test's fault that it casts too wide a nest on the warning).

oontvooAuthorUnsubmitted

Done

s/nest/net

Do you have any other comments on this patch? It seems we've been going back and forth for a very long time without making any more progress ...

jhendersonUnsubmitted

Not Done

@MaskRay, any thoughts on this or other aspects of this patch?

As someone who has been stung way too many times by rotten tests caused by negative matches like this not actually catching the thing I'm expecting to be caught, I'm incredibly wary of strict -NOT patterns like this. This isn't some arbitrary concern: I've seen bugs in released products because of this exact kind of overly precise check pattern.

That being said, there is an alternative approach I think you could consider: stick the message after the colon in a FileCheck define and then use it in both the positive and negative matches. That way, if the message is changed, the positive matches will start failing, prompting the developer to update that check too, which in turn will ensure the CHECK-NOT doesn't rot (since it's guaranteed to be testing the exact same string).

# RUN: ... | FileCheck %s -DMSG="--sort-symbols is not supported yet for this format"

# CHECK: warning: '{{.+}}_coff': [[MSG]]
...
# CHECK-NOT: warning: '{{.+}}': [[MSG]]

(NB: I've left the file path loose in the negative match so that if somebody changes the input file name, the check pattern is still valid).
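The failure mode described here can be sketched with a toy checker in C++. This is not FileCheck: `checkPasses` and `checkNotPasses` are hypothetical helpers that use plain substring search, and the tool output fed to them is made up for illustration. The sketch only shows why a negative check with its own copy of the message can pass spuriously, and why sharing one message string with a positive check exposes the drift.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a CHECK-NOT directive: passes when no output line
// contains the pattern. Real FileCheck also handles regexes, variables,
// and ordering; plain substring search is a deliberate simplification.
inline bool checkNotPasses(const std::vector<std::string> &Output,
                           const std::string &Pattern) {
  for (const std::string &Line : Output)
    if (Line.find(Pattern) != std::string::npos)
      return false; // the forbidden text was found
  return true;
}

// Toy stand-in for a CHECK directive: passes when some output line
// contains the pattern.
inline bool checkPasses(const std::vector<std::string> &Output,
                        const std::string &Pattern) {
  return !checkNotPasses(Output, Pattern);
}
```

For example, if the tool wrongly printed `warning: 'a_macho': <message>`, a negative pattern written as `warning 'a_macho': <message>` (missing the colon) would still pass, because the pattern can never match. Likewise, if the message were reworded, a negative check using the old text would keep passing silently, whereas a positive check sharing the same message string would start failing and prompt an update.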

oontvooAuthorUnsubmitted

Done

> As someone who has been stung way too many times by rotten tests caused by negative matches like this not actually catching the thing I'm expecting to be caught, I'm incredibly wary of strict -NOT patterns like this. This isn't some arbitrary concern: I've seen bugs in released products because of this exact kind of overly precise check pattern.

Generally, maybe there's a point, but in this context specifically, what would be the bug if the warning were emitted for macho and not caught? The macho-specific tests should have caught it (i.e., changes in code logic).

And as stated before, you were arguing for making this CHECK-NOT catch all the warnings, which I disagree with (reasons stated a few comments back).
So while it's true that the current setup isn't ideal, given a choice of false negative vs. false positive, I think most would agree that a test should lean toward false positive because false negatives tend to waste a lot of other people's time in debugging test failures.

> That being said, there is an alternative approach I think you could consider: stick the message after the colon in a FileCheck define and then use it in both the positive and negative matches. That way, if the message is changed, the positive matches will start failing, prompting the developer to update that check too, which in turn will ensure the CHECK-NOT doesn't rot (since it's guaranteed to be testing the exact same string).
> # RUN: ... | FileCheck %s -DMSG="--sort-symbols is not supported yet for this format"
>
> # CHECK: warning: '{{.+}}_coff': [[MSG]]
> ...
> # CHECK-NOT: warning: '{{.+}}': [[MSG]]
> (NB: I've left the file path loose in the negative match so that if somebody changes the input file name, the check pattern is still valid).

oontvooAuthorUnsubmitted

Done

> (NB: I've left the file path loose in the negative match so that if somebody changes the input file name, the check pattern is still valid).

What would be the legitimate reason to change the macho filename and not update the test? Those names were chosen specifically to differentiate the different formats.

MaskRayUnsubmitted

Done

The current # CHECK-NOT: warning '{{.+}}_macho': [[MSG]] usage looks quite nice, though you may omit : after warning. By saving MSG to a -D variable, you don't have to repeat the message.

I also wonder whether we can just use # CHECK-NOT: warning: to assert no extra warnings. To make the test more specific, we want to make the file as valid as possible and rule out possibilities for other warnings, then # CHECK-NOT: warning: suffices.

oontvooAuthorUnsubmitted

Done

> though you may omit : after warning. By saving MSG to a -D variable, you don't have to repeat the message.

I'm not sure factoring the : to MSG helps with readability - the CHECK statements would look like:

# CHECK: warning '{{.+}}_coff'[[MSG]]

which is not as good as the current form

oontvooAuthorUnsubmitted

Done

> I also wonder whether we can just use # CHECK-NOT: warning: to assert no extra warnings. To make the test more specific, we want to make the file as valid as possible and rule out possibilities for other warnings, then # CHECK-NOT: warning: suffices.

Why does only this test need to assert that there are no other warnings/errors? I've checked all other tests in this project, and none of them has a check that no additional warnings were emitted.

oontvooAuthorUnsubmitted

Done

In other words, I don't see why the burden has to be on this test to check that there are no other warnings. It should be as precise as possible in terms of what warning it does not expect and in this case, it does not expect this specific warning for the macho format. Any other warning is another test's problem. (This test is named "sort-symbols.test" and not "all-warnings.test" for a reason.)

If we want to have warning check tests, that's another discussion.

# CHECK: Format: Mach-O 64-bit x86-64

--- !mach-o
FileHeader:
  magic: 0xFEEDFACF
  cputype: 0x1000007
  cpusubtype: 0x3
  filetype: 0x1
  ncmds: 0
  sizeofcmds: 0
  flags: 0x2000
  reserved: 0x0
...
--- !ELF
FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  Type: ET_EXEC
Sections:
  - Name: .gnu.version
    Type: SHT_GNU_versym
...
--- !WASM
FileHeader:
  Version: 0x00000001
Sections:
  - Type: DATA
    Segments:
      - SectionOffset: 6
        InitFlags: 0
        Offset:
          Opcode: GLOBAL_GET
          Index: 1
        Content: '64'
...
--- !XCOFF
FileHeader:
  MagicNumber: 0x01DF
  CreationTime: 1
  EntriesInSymbolTable: 1
...

jhendersonUnsubmitted

Done

Too many blank lines at EOF (should be exactly one \n at the end).

llvm/tools/llvm-readobj/MachODumper.cpp

//===- MachODumper.cpp - Object file dumping utility for llvm -------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the MachO-specific dumper for llvm-readobj.
//
//===----------------------------------------------------------------------===//
#include "ObjDumper.h"
#include "StackMapPrinter.h"
#include "llvm-readobj.h"
+ #include "llvm/ADT/Optional.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/Object/MachO.h"
#include "llvm/Support/BinaryStreamReader.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/ScopedPrinter.h"
using namespace llvm;
...
public:
void printSectionHeaders() override;
void printRelocations() override;
void printUnwindInfo() override;
void printStackMap() const override;
void printCGProfile() override;
void printNeededLibraries() override;
+ bool canCompareSymbols() const override { return true; }
+ bool compareSymbolsByName(object::SymbolRef LHS,
+                           object::SymbolRef RHS) const override;
+ bool compareSymbolsByType(object::SymbolRef LHS,
+                           object::SymbolRef RHS) const override;
// MachO-specific.
void printMachODataInCode() override;
void printMachOVersionMin() override;
void printMachODysymtab() override;
void printMachOSegment() override;
void printMachOIndirectSymbols() override;
void printMachOLinkerOptions () override;
private:
template<class MachHeader>
void printFileHeaders(const MachHeader &Header);
- StringRef getSymbolName(const SymbolRef &Symbol);
+ StringRef getSymbolName(const SymbolRef &Symbol) const;

uint8_t getSymbolType(const SymbolRef &Symbol) const;

void printSymbols() override; void printSymbols(Optional<SymbolComparator> SymComp) override;

void printDynamicSymbols() override; void printDynamicSymbols(Optional<SymbolComparator> SymComp) override;

void printSymbol(const SymbolRef &Symbol, ScopedPrinter &W);

void printSymbol(const SymbolRef &Symbol); void printSymbol(const SymbolRef &Symbol);

void printRelocation(const RelocationRef &Reloc); void printRelocation(const RelocationRef &Reloc);

void printRelocation(const MachOObjectFile *Obj, const RelocationRef &Reloc); void printRelocation(const MachOObjectFile *Obj, const RelocationRef &Reloc);

void printSectionHeaders(const MachOObjectFile *Obj); void printSectionHeaders(const MachOObjectFile *Obj);

Show All 13 Lines

} // namespace llvm } // namespace llvm

const EnumEntry<uint32_t> MachOMagics[] = { const EnumEntry<uint32_t> MachOMagics[] = {

{ "Magic", MachO::MH_MAGIC }, { "Magic", MachO::MH_MAGIC },

{ "Cigam", MachO::MH_CIGAM }, { "Cigam", MachO::MH_CIGAM },

{ "Magic64", MachO::MH_MAGIC_64 }, { "Magic64", MachO::MH_MAGIC_64 },

{ "Cigam64", MachO::MH_CIGAM_64 }, { "Cigam64", MachO::MH_CIGAM_64 },

{ "FatMagic", MachO::FAT_MAGIC }, { "FatMagic", MachO::FAT_MAGIC },

{ "FatCigam", MachO::FAT_CIGAM }, { "FatCigam", MachO::FAT_CIGAM },

jhendersonUnsubmitted

Not Done

Please only reformat the parts of the file that you've changed.

jhenderson: Please only reformat the parts of the file that you've changed.

}; };

const EnumEntry<uint32_t> MachOHeaderFileTypes[] = { const EnumEntry<uint32_t> MachOHeaderFileTypes[] = {

{ "Relocatable", MachO::MH_OBJECT }, { "Relocatable", MachO::MH_OBJECT },

{ "Executable", MachO::MH_EXECUTE }, { "Executable", MachO::MH_EXECUTE },

{ "FixedVMLibrary", MachO::MH_FVMLIB }, { "FixedVMLibrary", MachO::MH_FVMLIB },

{ "Core", MachO::MH_CORE }, { "Core", MachO::MH_CORE },

{ "PreloadedExecutable", MachO::MH_PRELOAD }, { "PreloadedExecutable", MachO::MH_PRELOAD },

▲ Show 20 Lines • Show All 501 Lines • ▼ Show 20 Lines else

OS << " " << Obj->getPlainRelocationExternal(RE); OS << " " << Obj->getPlainRelocationExternal(RE);

OS << " " << RelocName OS << " " << RelocName

<< " " << IsScattered << " " << IsScattered

<< " " << SymbolNameOrOffset << " " << SymbolNameOrOffset

<< "\n"; << "\n";

} }

StringRef MachODumper::getSymbolName(const SymbolRef &Symbol) { StringRef MachODumper::getSymbolName(const SymbolRef &Symbol) const {

Expected<StringRef> SymbolNameOrErr = Symbol.getName(); Expected<StringRef> SymbolNameOrErr = Symbol.getName();

if (!SymbolNameOrErr) { if (!SymbolNameOrErr) {

reportError(SymbolNameOrErr.takeError(), Obj->getFileName()); reportError(SymbolNameOrErr.takeError(), Obj->getFileName());

} }

return *SymbolNameOrErr; return *SymbolNameOrErr;

} }

void MachODumper::printSymbols() { uint8_t MachODumper::getSymbolType(const SymbolRef &Symbol) const {

ListScope Group(W, "Symbols"); return Obj->getSymbol64TableEntry(Symbol.getRawDataRefImpl()).n_type;

jhendersonUnsubmitted

Done

Don't add a blank line at the start of a function.

jhenderson: Don't add a blank line at the start of a function.

}

bool MachODumper::compareSymbolsByName(SymbolRef LHS, SymbolRef RHS) const {

jhendersonUnsubmitted

Not Done

Just use a struct rather than a tuple. It'll be easier to reason with and you won't need these constants (which should be size_t anyway, since they're indexes).

jhenderson: Just use a `struct` rather than a tuple. It'll be easier to reason with and you won't need…
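
As an illustration of the struct suggestion above, here is a minimal sketch, not the patch's code. The type `SymbolEntry`, its field names, and the helpers `lessByTypeThenName` and `sortedNames` are all hypothetical. Named fields replace the index constants a tuple would need, and `std::tie` keeps the lexicographic comparison short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical sort key. The field names are illustrative only and are not
// taken from the patch under review.
struct SymbolEntry {
  uint8_t Type;
  std::string Name;
  std::string Representation;
};

// Orders by type first, then by name. Named fields make the intent readable
// without std::get<Index>(...) and the index constants that go with it.
inline bool lessByTypeThenName(const SymbolEntry &LHS, const SymbolEntry &RHS) {
  return std::tie(LHS.Type, LHS.Name) < std::tie(RHS.Type, RHS.Name);
}

// Sorts a copy of the entries and returns just the names, in sorted order.
inline std::vector<std::string> sortedNames(std::vector<SymbolEntry> Entries) {
  std::stable_sort(Entries.begin(), Entries.end(), lessByTypeThenName);
  std::vector<std::string> Names;
  for (const SymbolEntry &E : Entries)
    Names.push_back(E.Name);
  return Names;
}
```

The `Representation` field is carried along but never compared, which is the role the third tuple element played in the earlier revision.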

return getSymbolName(LHS).str().compare(getSymbolName(RHS).str()) < 0;

}

jhendersonUnsubmitted

Done

if (SortSymbols) {

- // The references returned by calling Obj->symbols() is temporary

- // and we can't hold on to them after the loop.

- // So in order to sort the symbols, we have to print their representation

- // to string and sort them later.

+ // The references returned by calling Obj->symbols() are temporary

+ // and we can't hold onto them after the loop. In order to sort the symbols,

+ // we have to print their representation to string and sort them later.

// Tuple of <type, name, representation>

jhenderson:

jhendersonUnsubmitted

Done

Nit: don't start functions with blank lines.

I don't think this is the place to be adding the predicates, because much of this code will end up being duplicated when other format support is added. Instead, I recommend moving it to just after the code that creates the ObjDumper class in llvm-readobj.cpp, and using virtual functions for the predicates, with std::bind to handle the fact that they're member functions.

If you do that, there's no need for the sort key enum. Instead, you could delay processing the command-line option until you need it to identify which predicates to add.

jhenderson: Nit: don't start functions with blank lines. I don't think this is the place to be adding the…

oontvooAuthorUnsubmitted

Done

> If you do that, there's no need for the sort key enum. Instead, you could delay processing the command-line option until you need it to identify which predicates to add.

Implemented the virtual function parts as suggested, but still keeping the enum (albeit local/static now) because it's weird to move the option-processing part outside of the parseOptions function.

oontvoo: > If you do that, there's no need for the sort key enum. Instead, you could delay processing…

jhendersonUnsubmitted

Not Done

Yeah, I understand the concern. The downside with keeping it separate is that you have to effectively repeat the switch/case involved in option parsing in two places, which leads to ugly warts like the need for the assert(false) in the second one. Up to you though.

jhenderson: Yeah, I understand the concern. The downside with keeping it separate is that you have to…

oontvooAuthorUnsubmitted

Done

(ack'ed - keeping this as is for now until other formats start implementing the option)

oontvoo: (ack'ed - keeping this as is for now until other formats start implementing the option)
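
The std::bind approach discussed in this thread could look roughly like the sketch below. It is standalone C++ and does not use the real llvm-readobj classes: `FakeSymbol`, `FakeDumper`, `BoundPredicate`, `makePredicates`, and the key strings `"name"` and `"type"` are assumptions made for illustration. The point is that the option keys are translated into predicates in one place, so no separate sort-key enum is needed.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for object::SymbolRef.
struct FakeSymbol {
  std::string Name;
  int Type;
};

// Hypothetical stand-in for ObjDumper: each format would override these.
class FakeDumper {
public:
  virtual ~FakeDumper() = default;
  virtual bool compareByName(const FakeSymbol &L, const FakeSymbol &R) const {
    return L.Name < R.Name;
  }
  virtual bool compareByType(const FakeSymbol &L, const FakeSymbol &R) const {
    return L.Type < R.Type;
  }
};

using BoundPredicate = std::function<bool(const FakeSymbol &, const FakeSymbol &)>;

// Maps option keys to member-function predicates bound to the dumper.
// Unknown keys are skipped here; a real tool would report an error instead.
inline std::vector<BoundPredicate>
makePredicates(const FakeDumper &D, const std::vector<std::string> &Keys) {
  using namespace std::placeholders;
  std::vector<BoundPredicate> Preds;
  for (const std::string &Key : Keys) {
    if (Key == "name")
      Preds.push_back(std::bind(&FakeDumper::compareByName, &D, _1, _2));
    else if (Key == "type")
      Preds.push_back(std::bind(&FakeDumper::compareByType, &D, _1, _2));
  }
  return Preds;
}
```

The bound predicates hold a pointer to the dumper, so they must not outlive it. A lambda capturing the dumper by reference would work the same way.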

bool MachODumper::compareSymbolsByType(SymbolRef LHS, SymbolRef RHS) const {

return getSymbolType(LHS) < getSymbolType(RHS);

}

jhendersonUnsubmitted

Done

Entirely up to you, as it's existing code, but you could probably just drop the const &, since SymbolRef is designed to be easily copyable.

jhenderson: Entirely up to you, as it's existing code, but you could probably just drop the `const &`…

void MachODumper::printSymbols(Optional<SymbolComparator> SymComp) {

if (SymComp) {

jhendersonUnsubmitted

Done

Are you sure about this? They don't look all that temporary (as long as the object file is still in memory)...

jhenderson: Are you sure about this? They don't look all that temporary (as long as the object file is…

oontvooAuthorUnsubmitted

Done

> Are you sure about this? They don't look all that temporary (as long as the object file is still in memory)...

Yes - I've run the code with ASan (a previous version, which simply stored the values returned by ->symbols() in a vector and then sorted it) and got errors.

oontvoo: > Are you sure about this? They don't look all that temporary (as long as the object file is…

jhendersonUnsubmitted

Done

Are you sure it was that bit of code causing the problem? I've inspected the implementation of symbols() and SymbolRef for Mach-O, and the SymbolRef in this case is just a pointer into the buffer stored by MachOObjectFile (see MachOObjectFile::getSymbolByIndex). As such, there should be no lifetime issues, as long as the object file is in existence. Please try again and analyse where the issue actually is, as it could be a bug elsewhere that this just exposed. I'm going to need more convincing than just "asan said so" that there's an issue storing a SymbolRef in a vector that has a lifetime less than that of the object file.

jhenderson: Are you sure it was that bit of code causing the problem? I've inspected the implementation of…

auto SymbolRange = Obj->symbols();

jhendersonUnsubmitted

Done

raw_string_ostream StringOs(SymRep);

- llvm::formatted_raw_ostream FormattedOs(StringOs);

+ formatted_raw_ostream FormattedOs(StringOs);

std::unique_ptr<ScopedPrinter> Printer;

jhenderson:

std::vector<SymbolRef> SortedSymbols(SymbolRange.begin(),

SymbolRange.end());

llvm::stable_sort(SortedSymbols, *SymComp);

jhendersonUnsubmitted

Done

I wouldn't bother with the assert here: none of Mach-O output expects JSON format, so adding an assert in one place makes it look like it needs sorting here, but not everywhere else.

jhenderson: I wouldn't bother with the `assert` here: none of Mach-O output expects JSON format, so adding…

jhendersonUnsubmitted

Done

We usually explicitly use llvm:: prefix for std:: algorithm variations like this, to show it's an LLVM extension we're making use of. It also facilitates find and replace, should a std:: version ever be added in the future.

jhenderson: We usually explicitly use `llvm::` prefix for `std::` algorithm variations like this, to show…
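
To illustrate the point about namespace-qualified range algorithms, here is a small standalone sketch. The namespace `ext` is a hypothetical stand-in for `llvm`, and `ext::stable_sort` is a hand-written wrapper for this example, not LLVM's implementation. The qualified call makes it obvious that the range-based variant is an extension, and a stable sort keeps entries with equal keys in their original order.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

namespace ext { // hypothetical stand-in for the llvm namespace
// Range-based wrapper around std::stable_sort, written for this sketch.
template <typename Range, typename Compare>
void stable_sort(Range &&R, Compare C) {
  std::stable_sort(std::begin(R), std::end(R), C);
}
} // namespace ext

// (type, name) pairs standing in for symbols.
using Entry = std::pair<int, std::string>;

// Sorts by type only. Entries with the same type keep their input order
// because the sort is stable.
inline std::vector<Entry> sortByType(std::vector<Entry> Entries) {
  ext::stable_sort(Entries, [](const Entry &L, const Entry &R) {
    return L.first < R.first;
  });
  return Entries;
}
```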

for (SymbolRef Symbol : SortedSymbols)

printSymbol(Symbol);

} else {

jhendersonUnsubmitted

Done

I'm surprised you need this here, as JSON output format isn't supported yet for non-ELF formats. Probably remove it for now (especially as this code path is currently untested).

jhenderson: I'm surprised you need this here, as JSON output format isn't supported yet for non-ELF formats.

for (const SymbolRef &Symbol : Obj->symbols()) { for (const SymbolRef &Symbol : Obj->symbols()) {

printSymbol(Symbol); printSymbol(Symbol);

} }

jhendersonUnsubmitted

Done

ListScope Group(W, "Symbols");

- for (const SymbolRef &Symbol : Obj->symbols())

+ for (SymbolRef Symbol : Obj->symbols())

printSymbol(Symbol);

jhenderson:

} }

}

void MachODumper::printDynamicSymbols() { void MachODumper::printDynamicSymbols(Optional<SymbolComparator> SymComp) {

jhendersonUnsubmitted

Done

I might be mistaken, but I believe LLVM usually doesn't bother commenting out unused parameter names.

jhenderson: I might be mistaken, but I believe LLVM usually doesn't bother commenting out unused parameter…

jhendersonUnsubmitted

Not Done

Unaddressed but marked as done?

jhenderson: Unaddressed but marked as done?

ListScope Group(W, "DynamicSymbols"); ListScope Group(W, "DynamicSymbols");

jhendersonUnsubmitted

Done

I think there might be a cleaner way of doing this: ScopedPrinter has a getIndentLevel method, which you could use to set the indentation of the new printer you have here, based on that of the existing one. This will make this code more reusable too, since in different contexts, the indentation will be different.

It would look something like this: Printer.indent(W.getIndentLevel());, although you might need a "+ 1" or similar, I'm not sure.

jhenderson: I think there might be a cleaner way of doing this: `ScopedPrinter` has a `getIndentLevel`…

oontvooAuthorUnsubmitted

Done

Thanks! That does look better!

oontvoo: Thanks! That does look better!

} }

void MachODumper::printSymbol(const SymbolRef &Symbol) { void MachODumper::printSymbol(const SymbolRef &Symbol) {

printSymbol(Symbol, W);

}

void MachODumper::printSymbol(const SymbolRef &Symbol, ScopedPrinter &W) {

StringRef SymbolName = getSymbolName(Symbol); StringRef SymbolName = getSymbolName(Symbol);

MachOSymbol MOSymbol; MachOSymbol MOSymbol;

getSymbol(Obj, Symbol.getRawDataRefImpl(), MOSymbol); getSymbol(Obj, Symbol.getRawDataRefImpl(), MOSymbol);

StringRef SectionName = ""; StringRef SectionName = "";

jhendersonUnsubmitted

Done

These should be const &, right? Otherwise, you end up copying the tuple contents.

jhenderson: These should be `const &`, right? Otherwise, you end up copying the tuple contents.
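
The copying concern can be demonstrated with a small standalone sketch. `Tracked` and the two helper functions are hypothetical and exist only to count copies. Iterating by value copies every element, while iterating by `const &` copies none.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
  static int Copies;
  std::string Payload;
  explicit Tracked(std::string P) : Payload(std::move(P)) {}
  Tracked(const Tracked &Other) : Payload(Other.Payload) { ++Copies; }
  Tracked &operator=(const Tracked &) = default;
};
int Tracked::Copies = 0;

// Returns the number of copies made while iterating by value.
inline int copiesWhenIteratingByValue(const std::vector<Tracked> &Items) {
  int Before = Tracked::Copies;
  std::size_t Total = 0;
  for (Tracked Item : Items) // each iteration copy-constructs Item
    Total += Item.Payload.size();
  (void)Total;
  return Tracked::Copies - Before;
}

// Returns the number of copies made while iterating by const reference.
inline int copiesWhenIteratingByConstRef(const std::vector<Tracked> &Items) {
  int Before = Tracked::Copies;
  std::size_t Total = 0;
  for (const Tracked &Item : Items) // binds to the element, no copy
    Total += Item.Payload.size();
  (void)Total;
  return Tracked::Copies - Before;
}
```

For a cheap handle type the copy is harmless, which is why the same reviewer suggests dropping `const &` for `SymbolRef` elsewhere in this review. For a tuple holding strings, the copy duplicates the string contents.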

// Don't ask a Mach-O STABS symbol for its section unless we know that // Don't ask a Mach-O STABS symbol for its section unless we know that

// STAB symbol's section field refers to a valid section index. Otherwise // STAB symbol's section field refers to a valid section index. Otherwise

// the symbol may error trying to load a section that does not exist. // the symbol may error trying to load a section that does not exist.

// TODO: Add a whitelist of STABS symbol types that contain valid section // TODO: Add a whitelist of STABS symbol types that contain valid section

// indices. // indices.

if (!(MOSymbol.Type & MachO::N_STAB)) { if (!(MOSymbol.Type & MachO::N_STAB)) {

Expected<section_iterator> SecIOrErr = Symbol.getSection(); Expected<section_iterator> SecIOrErr = Symbol.getSection();

jhendersonUnsubmitted

Done

return LType < RType;

- // FIXME: Maybe use Symbol's addresses as tie-breaker if

+ // TODO: Maybe use Symbol's addresses as tie-breaker if

// names are the same.

This isn't a bug, merely a possible improvement.

jhenderson: This isn't a bug, merely a possible improvement.

if (!SecIOrErr) if (!SecIOrErr)

jhendersonUnsubmitted

Done

I'm not clear on whether this unindent needs undoing or not after this. Probably you should manually inspect the output, both for multiple symbols, and for multiple operations, where symbol dumping happens before other output.

jhenderson: I'm not clear on whether this unindent needs undoing or not after this. Probably you should…

reportError(SecIOrErr.takeError(), Obj->getFileName()); reportError(SecIOrErr.takeError(), Obj->getFileName());

section_iterator SecI = *SecIOrErr; section_iterator SecI = *SecIOrErr;

jhendersonUnsubmitted

Done

Note that if you move this ListScope earlier, outside the if, you a) avoid the duplication with the else, and b), shouldn't need the + 1 I mentioned in my earlier comment about indentation.

jhenderson: Note that if you move this `ListScope` earlier, outside the `if`, you a) avoid the duplication…

oontvooAuthorUnsubmitted

Done

I initially moved it here (closer to where it is used) for clarity. Anyway, I can move it back.

oontvoo: I initially moved it here (closer to where it is used) for clarity. Anyway, I can move it back.

if (SecI != Obj->section_end()) if (SecI != Obj->section_end())

SectionName = unwrapOrError(Obj->getFileName(), SecI->getName()); SectionName = unwrapOrError(Obj->getFileName(), SecI->getName());

} }

DictScope D(W, "Symbol"); DictScope D(W, "Symbol");

W.printNumber("Name", SymbolName, MOSymbol.StringIndex); W.printNumber("Name", SymbolName, MOSymbol.StringIndex);

if (MOSymbol.Type & MachO::N_STAB) { if (MOSymbol.Type & MachO::N_STAB) {

W.printHex("Type", "SymDebugTable", MOSymbol.Type); W.printHex("Type", "SymDebugTable", MOSymbol.Type);

▲ Show 20 Lines • Show All 288 Lines • Show Last 20 Lines

llvm/tools/llvm-readobj/ObjDumper.h

//===-- ObjDumper.h ---------------------------------------------*- C++ -*-===// //===-- ObjDumper.h ---------------------------------------------*- C++ -*-===//

// //

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information. // See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#ifndef LLVM_TOOLS_LLVM_READOBJ_OBJDUMPER_H #ifndef LLVM_TOOLS_LLVM_READOBJ_OBJDUMPER_H

#define LLVM_TOOLS_LLVM_READOBJ_OBJDUMPER_H #define LLVM_TOOLS_LLVM_READOBJ_OBJDUMPER_H

#include <memory> #include <memory>

#include <system_error> #include <system_error>

#include "llvm/ADT/Optional.h"

#include "llvm/ADT/STLFunctionalExtras.h"

#include "llvm/ADT/SmallVector.h"

#include "llvm/ADT/StringMap.h"

#include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringRef.h"

#include "llvm/Object/ObjectFile.h" #include "llvm/Object/ObjectFile.h"

#include "llvm/Support/CommandLine.h" #include "llvm/Support/CommandLine.h"

#include <unordered_set> #include <unordered_set>

namespace llvm { namespace llvm {

namespace object { namespace object {

class Archive; class Archive;

class COFFImportFile; class COFFImportFile;

class ObjectFile; class ObjectFile;

class XCOFFObjectFile; class XCOFFObjectFile;

class ELFObjectFileBase; class ELFObjectFileBase;

} } // namespace object

namespace codeview { namespace codeview {

class GlobalTypeTableBuilder; class GlobalTypeTableBuilder;

class MergingTypeTableBuilder; class MergingTypeTableBuilder;

} // namespace codeview } // namespace codeview

class ScopedPrinter; class ScopedPrinter;

// Comparator to compare symbols.

// Usage: the caller registers predicates (i.e., how to compare the symbols) by

jhendersonUnsubmitted

Not Done

// Comparator to compare symbols.

- // Usage: caller register predicates (ie., how to compare the symbols) by

+ // Usage: the caller registers predicates (i.e., how to compare the symbols) by

// calling addPredicate(). The order in which predicates are registered is also

jhenderson:

// calling addPredicate(). The order in which predicates are registered is also

// their priority.

class SymbolComparator {

jhendersonUnsubmitted

Done

// their priority.

- final class SymbolComparator {

+ class SymbolComparator {

public:

Don't use final. It doesn't add any value and just makes later updates more complex.

jhenderson: Don't use `final`. It doesn't add any value and just makes later updates more complex.

public:

using CompPredicate =

function_ref<bool(object::SymbolRef, object::SymbolRef)>;

// Each Obj format has a slightly different way of retrieving a symbol's info

// So we defer the predicate's impl to each format.

void addPredicate(CompPredicate Pred) { Predicates.push_back(Pred); }

bool operator()(object::SymbolRef LHS, object::SymbolRef RHS) {

for (CompPredicate Pred : Predicates) {

if (Pred(LHS, RHS))

return true;

if (Pred(RHS, LHS))

return false;

}

return false;

}

private:

SmallVector<CompPredicate, 2> Predicates;

jhendersonUnsubmitted

Not Done

private:

- llvm::SmallVector<CompPredicate, 2> Predicates;

+ SmallVector<CompPredicate, 2> Predicates;

};

class ObjDumper {

We're inside the llvm namespace already.

jhenderson: We're inside the `llvm` namespace already.
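
For readers following the SymbolComparator logic in this file, the priority-ordered comparison can be sketched in standalone C++ as below. This is not the patch's code: it swaps `function_ref` and `SmallVector` for `std::function` and `std::vector` so that it compiles without LLVM headers, and `Sym`, `ChainedComparator`, and `sortTypeThenName` are hypothetical names. Each predicate is tried in registration order, and the first one that distinguishes the two operands decides the result.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for object::SymbolRef.
struct Sym {
  std::string Name;
  int Type;
};

class ChainedComparator {
public:
  using Predicate = std::function<bool(const Sym &, const Sym &)>;

  // Predicates registered earlier have higher priority.
  void addPredicate(Predicate Pred) { Predicates.push_back(std::move(Pred)); }

  bool operator()(const Sym &LHS, const Sym &RHS) const {
    for (const Predicate &Pred : Predicates) {
      if (Pred(LHS, RHS))
        return true; // LHS orders first under this predicate
      if (Pred(RHS, LHS))
        return false; // RHS orders first under this predicate
      // Tie under this predicate: fall through to the next one.
    }
    return false; // equivalent under every predicate
  }

private:
  std::vector<Predicate> Predicates;
};

// Sorts by type, breaking ties by name, and returns the names in order.
inline std::vector<std::string> sortTypeThenName(std::vector<Sym> Syms) {
  ChainedComparator Comp;
  Comp.addPredicate([](const Sym &L, const Sym &R) { return L.Type < R.Type; });
  Comp.addPredicate([](const Sym &L, const Sym &R) { return L.Name < R.Name; });
  std::stable_sort(Syms.begin(), Syms.end(), Comp);
  std::vector<std::string> Names;
  for (const Sym &S : Syms)
    Names.push_back(S.Name);
  return Names;
}
```

Each predicate has to be a strict ordering (never true in both directions for the same pair) for the chained result to be a valid sort comparator.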

};

class ObjDumper { class ObjDumper {

public: public:

ObjDumper(ScopedPrinter &Writer, StringRef ObjName); ObjDumper(ScopedPrinter &Writer, StringRef ObjName);

virtual ~ObjDumper(); virtual ~ObjDumper();

virtual bool canDumpContent() { return true; } virtual bool canDumpContent() { return true; }

virtual void printFileSummary(StringRef FileStr, object::ObjectFile &Obj, virtual void printFileSummary(StringRef FileStr, object::ObjectFile &Obj,

ArrayRef<std::string> InputFilenames, ArrayRef<std::string> InputFilenames,

const object::Archive *A); const object::Archive *A);

virtual void printFileHeaders() = 0; virtual void printFileHeaders() = 0;

virtual void printSectionHeaders() = 0; virtual void printSectionHeaders() = 0;

virtual void printRelocations() = 0; virtual void printRelocations() = 0;

virtual void printSymbols(bool PrintSymbols, bool PrintDynamicSymbols) { virtual void printSymbols(bool PrintSymbols, bool PrintDynamicSymbols) {

printSymbols(PrintSymbols, PrintDynamicSymbols, llvm::None);

}

virtual void printSymbols(bool PrintSymbols, bool PrintDynamicSymbols,

llvm::Optional<SymbolComparator> SymComp) {

if (PrintSymbols) if (PrintSymbols)

printSymbols(); printSymbols(SymComp);

if (PrintDynamicSymbols) if (PrintDynamicSymbols)

printDynamicSymbols(); printDynamicSymbols(SymComp);

} }

virtual void printProgramHeaders(bool PrintProgramHeaders, virtual void printProgramHeaders(bool PrintProgramHeaders,

cl::boolOrDefault PrintSectionMapping) { cl::boolOrDefault PrintSectionMapping) {

if (PrintProgramHeaders) if (PrintProgramHeaders)

printProgramHeaders(); printProgramHeaders();

if (PrintSectionMapping == cl::BOU_TRUE) if (PrintSectionMapping == cl::BOU_TRUE)

printSectionMapping(); printSectionMapping();

} }

virtual void printUnwindInfo() = 0; virtual void printUnwindInfo() = 0;

// Symbol comparison functions.

jhendersonUnsubmitted

Done

virtual void printUnwindInfo() = 0;

- // Symbols comparisons.

+ // Symbol comparison functions.

virtual bool canCompareSymbols() const { return false; }

jhenderson:

virtual bool canCompareSymbols() const { return false; }

virtual bool compareSymbolsByName(object::SymbolRef LHS,

object::SymbolRef RHS) const {

return true;

}

virtual bool compareSymbolsByType(object::SymbolRef LHS,

object::SymbolRef RHS) const {

return true;

}

// Only implemented for ELF at this time. // Only implemented for ELF at this time.

virtual void printDependentLibs() {} virtual void printDependentLibs() {}

virtual void printDynamicRelocations() { } virtual void printDynamicRelocations() { }

virtual void printDynamicTable() { } virtual void printDynamicTable() { }

virtual void printNeededLibraries() { } virtual void printNeededLibraries() { }

virtual void printSectionAsHex(StringRef SectionName) {} virtual void printSectionAsHex(StringRef SectionName) {}

virtual void printHashTable() { } virtual void printHashTable() { }

virtual void printGnuHashTable() {} virtual void printGnuHashTable() {}

▲ Show 20 Lines • Show All 54 Lines • ▼ Show 20 Lines public:

std::function<Error(const Twine &Msg)> WarningHandler; std::function<Error(const Twine &Msg)> WarningHandler;

void reportUniqueWarning(Error Err) const; void reportUniqueWarning(Error Err) const;

void reportUniqueWarning(const Twine &Msg) const; void reportUniqueWarning(const Twine &Msg) const;

protected: protected:

ScopedPrinter &W; ScopedPrinter &W;

private: private:

virtual void printSymbols() {} virtual void printSymbols() { printSymbols(llvm::None); }

virtual void printDynamicSymbols() {} virtual void printSymbols(llvm::Optional<SymbolComparator> Comp) {}

virtual void printDynamicSymbols() { printDynamicSymbols(llvm::None); }

virtual void printDynamicSymbols(llvm::Optional<SymbolComparator> Comp) {}

virtual void printProgramHeaders() {} virtual void printProgramHeaders() {}

virtual void printSectionMapping() {} virtual void printSectionMapping() {}

std::unordered_set<std::string> Warnings; std::unordered_set<std::string> Warnings;

}; };

std::unique_ptr<ObjDumper> createCOFFDumper(const object::COFFObjectFile &Obj, std::unique_ptr<ObjDumper> createCOFFDumper(const object::COFFObjectFile &Obj,

ScopedPrinter &Writer); ScopedPrinter &Writer);

Show All 23 Lines

llvm/tools/llvm-readobj/Opts.td

Show All 31 Lines

def relocs : FF<"relocs", "Display the relocation entries in the file">; def relocs : FF<"relocs", "Display the relocation entries in the file">;

def section_data : FF<"section-data", "Display section data for each section shown. This option has no effect for GNU style output">; def section_data : FF<"section-data", "Display section data for each section shown. This option has no effect for GNU style output">;

def section_details : FF<"section-details", "Display the section details">; def section_details : FF<"section-details", "Display the section details">;

def section_headers : FF<"section-headers", "Display section headers">; def section_headers : FF<"section-headers", "Display section headers">;

def section_mapping : FF<"section-mapping", "Display the section to segment mapping">; def section_mapping : FF<"section-mapping", "Display the section to segment mapping">;

def section_mapping_EQ_false : FF<"section-mapping=false", "Don't display the section to segment mapping">, Flags<[HelpHidden]>; def section_mapping_EQ_false : FF<"section-mapping=false", "Don't display the section to segment mapping">, Flags<[HelpHidden]>;

def section_relocations : FF<"section-relocations", "Display relocations for each section shown. This option has no effect for GNU style output">; def section_relocations : FF<"section-relocations", "Display relocations for each section shown. This option has no effect for GNU style output">;

def section_symbols : FF<"section-symbols", "Display symbols for each section shown. This option has no effect for GNU style output">; def section_symbols : FF<"section-symbols", "Display symbols for each section shown. This option has no effect for GNU style output">;

defm sort_symbols : Eq<"sort-symbols", "Specify the keys to sort the symbols before displaying symtab">;

jhendersonUnsubmitted

Done

def section_symbols : FF<"section-symbols", "Display symbols for each section shown. This option has no effect for GNU style output">;

- def sort_symbols : FF<"sort-symbols", "Sort symbol before displaying symtab">;

+ def sort_symbols : FF<"sort-symbols", "Sort symbols before displaying symtab">;

def stack_sizes : FF<"stack-sizes", "Display contents of all stack sizes sections. This option has no effect for GNU style output">;

jhenderson:

def stack_sizes : FF<"stack-sizes", "Display contents of all stack sizes sections. This option has no effect for GNU style output">; def stack_sizes : FF<"stack-sizes", "Display contents of all stack sizes sections. This option has no effect for GNU style output">;

def stackmap : FF<"stackmap", "Display contents of stackmap section">; def stackmap : FF<"stackmap", "Display contents of stackmap section">;

defm string_dump : Eq<"string-dump", "Display the specified section(s) as a list of strings">, MetaVarName<"<name or index>">; defm string_dump : Eq<"string-dump", "Display the specified section(s) as a list of strings">, MetaVarName<"<name or index>">;

def string_table : FF<"string-table", "Display the string table (only for XCOFF now)">; def string_table : FF<"string-table", "Display the string table (only for XCOFF now)">;

def symbols : FF<"symbols", "Display the symbol table. Also display the dynamic symbol table when using GNU output style for ELF">; def symbols : FF<"symbols", "Display the symbol table. Also display the dynamic symbol table when using GNU output style for ELF">;

def unwind : FF<"unwind", "Display unwind information">; def unwind : FF<"unwind", "Display unwind information">;

jhendersonUnsubmitted

Done

def unwind : FF<"unwind", "Display unwind information">;

- def sort_symbols : FF<"sort-symbols", "Sort symbol before outputing symtab">;

+ def sort_symbols : FF<"sort-symbols", "Sort symbols before displaying symtab">;

// ELF specific options.

1) "displaying" is the term used for other options.
2) This option is (currently) Mach-O specific. Unless you plan on implementing it for other formats too, please move it to the Mach-O specific options block.
3) These options are listed alphabetically within each block. Please maintain that order.
4) Please remember to update the documentation for llvm-readobj (and llvm-readelf, if you are planning on this being a generic option), located at llvm/docs/CommandGuide.
5) If this is going to be Mach-O specific (for now), I'd name the variable accordingly (i.e. something like macho_sort_symbols). Also, it will need to have the grp_macho Group, like the other Mach-O specific options.

jhenderson: 1) "displaying" is the term used for other options. 2) This option is (currently) Mach-O…

jhendersonUnsubmitted

Done

Pinging the points in this comment (specifically the inline edit, and points 3 and 4).

jhenderson: Pinging the points in this comment (specifically the inline edit, and points 3 and 4).

// ELF specific options. // ELF specific options.

def grp_elf : OptionGroup<"kind">, HelpText<"OPTIONS (ELF specific)">; def grp_elf : OptionGroup<"kind">, HelpText<"OPTIONS (ELF specific)">;

def dynamic_table : FF<"dynamic-table", "Display the dynamic section table">, Group<grp_elf>; def dynamic_table : FF<"dynamic-table", "Display the dynamic section table">, Group<grp_elf>;

def elf_linker_options : FF<"elf-linker-options", "Display the .linker-options section">, Group<grp_elf>; def elf_linker_options : FF<"elf-linker-options", "Display the .linker-options section">, Group<grp_elf>;

defm elf_output_style : Eq<"elf-output-style", "Specify ELF dump style: LLVM, GNU, JSON">, Group<grp_elf>; defm elf_output_style : Eq<"elf-output-style", "Specify ELF dump style: LLVM, GNU, JSON">, Group<grp_elf>;

def histogram : FF<"histogram", "Display bucket list histogram for hash sections">, Group<grp_elf>; def histogram : FF<"histogram", "Display bucket list histogram for hash sections">, Group<grp_elf>;

def section_groups : FF<"section-groups", "Display section groups">, Group<grp_elf>; def section_groups : FF<"section-groups", "Display section groups">, Group<grp_elf>;

def gnu_hash_table : FF<"gnu-hash-table", "Display the GNU hash table for dynamic symbols">, Group<grp_elf>; def gnu_hash_table : FF<"gnu-hash-table", "Display the GNU hash table for dynamic symbols">, Group<grp_elf>;

▲ Show 20 Lines • Show All 79 Lines • Show Last 20 Lines

llvm/tools/llvm-readobj/llvm-readobj.h

//===-- llvm-readobj.h ----------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_TOOLS_LLVM_READOBJ_LLVM_READOBJ_H
#define LLVM_TOOLS_LLVM_READOBJ_LLVM_READOBJ_H

+ #include "ObjDumper.h"

+ #include "llvm/ADT/SmallVector.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Compiler.h"
- #include "llvm/Support/ErrorOr.h"
#include "llvm/Support/Error.h"
+ #include "llvm/Support/ErrorOr.h"
#include <string>

namespace llvm {
namespace object {
class RelocationRef;
}

// Various helper functions.
[...]
namespace opts {
extern bool SectionRelocations;
extern bool SectionSymbols;
extern bool SectionData;
extern bool ExpandRelocs;
extern bool RawRelr;
extern bool CodeViewSubsectionBytes;
extern bool Demangle;
enum OutputStyleTy { LLVM, GNU, JSON, UNKNOWN };
jhenderson (Not Done): Noting that this change isn't needed if you drop the reference to JSON output style from elsewhere in this patch.
extern OutputStyleTy Output;
} // namespace opts

#define LLVM_READOBJ_ENUM_ENT(ns, enum) \
  { #enum, ns::enum }

#define LLVM_READOBJ_ENUM_CLASS_ENT(enum_class, enum) \
  { #enum, std::underlying_type<enum_class>::type(enum_class::enum) }

#endif

llvm/tools/llvm-readobj/llvm-readobj.cpp

[...]
//
// Output should be specialized for each format where appropriate.
//
//===----------------------------------------------------------------------===//

#include "llvm-readobj.h"
#include "ObjDumper.h"
#include "WindowsResourceDumper.h"
+ #include "llvm/ADT/Optional.h"
#include "llvm/DebugInfo/CodeView/GlobalTypeTableBuilder.h"
#include "llvm/DebugInfo/CodeView/MergingTypeTableBuilder.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Object/Archive.h"
#include "llvm/Object/COFFImportFile.h"
#include "llvm/Object/ELFObjectFile.h"
#include "llvm/Object/MachOUniversal.h"
#include "llvm/Object/ObjectFile.h"
[...]

};

class ReadobjOptTable : public opt::OptTable {
public:
  ReadobjOptTable() : OptTable(InfoTable) { setGroupedShortOptions(true); }
};

enum OutputFormatTy { bsd, sysv, posix, darwin, just_symbols };

+ enum SortSymbolKeyTy {
+   NAME = 0,
+   TYPE = 1,
+   UNKNOWN = 100,

jhenderson (Done): `UNSUPPORTED` or `UNKNOWN` instead of `UNSPEC`.

+   // TODO: add ADDRESS, SIZE as needed.
+ };

} // namespace

namespace opts {
static bool Addrsig;
static bool All;
static bool ArchSpecificInfo;
static bool BBAddrMap;
bool ExpandRelocs;
static bool CGProfile;
bool Demangle;
static bool DependentLibraries;
static bool DynRelocs;
static bool DynamicSymbols;
static bool FileHeaders;
static bool Headers;
static std::vector<std::string> HexDump;
static bool PrettyPrint;

jhenderson (Done): Noting that this change isn't needed if you drop the reference to JSON output style from elsewhere in this patch.

static bool PrintStackMap;
static bool PrintStackSizes;
static bool Relocations;
bool SectionData;
static bool SectionDetails;
static bool SectionHeaders;
bool SectionRelocations;
bool SectionSymbols;
static std::vector<std::string> StringDump;
static bool StringTable;
static bool Symbols;
static bool UnwindInfo;
static cl::boolOrDefault SectionMapping;
+ static SmallVector<SortSymbolKeyTy> SortKeys;

// ELF specific options.
static bool DynamicTable;
static bool ELFLinkerOptions;
static bool GnuHashTable;
static bool HashSymbols;
static bool HashTable;
static bool HashHistogram;
[...] [[noreturn]] void reportError(Error Err, StringRef Input) {
  llvm_unreachable("error() call should never return");
}

void reportWarning(Error Err, StringRef Input) {
  assert(Err);
  if (Input == "-")
    Input = "<stdin>";
  // Flush the standard output to print the warning at a
  // proper place.
  fouts().flush();
  handleAllErrors(
      createFileError(Input, std::move(Err)), [&](const ErrorInfoBase &EI) {
        WithColor::warning(errs(), ToolName) << EI.message() << "\n";
      });

jhenderson (Done): If sure about the warning, I recommend refactoring this to use the new `warn` function.

oontvoo (Done): (removed)

}
} // namespace llvm

static void parseOptions(const opt::InputArgList &Args) {
  opts::Addrsig = Args.hasArg(OPT_addrsig);
  opts::All = Args.hasArg(OPT_all);
  opts::ArchSpecificInfo = Args.hasArg(OPT_arch_specific);
[...] static void parseOptions(const opt::InputArgList &Args) {
  opts::HashTable = Args.hasArg(OPT_hash_table);
  opts::HashHistogram = Args.hasArg(OPT_histogram);
  opts::NeededLibraries = Args.hasArg(OPT_needed_libs);
  opts::Notes = Args.hasArg(OPT_notes);
  opts::PrettyPrint = Args.hasArg(OPT_pretty_print);
  opts::ProgramHeaders = Args.hasArg(OPT_program_headers);
  opts::RawRelr = Args.hasArg(OPT_raw_relr);
  opts::SectionGroups = Args.hasArg(OPT_section_groups);
+   if (Arg *A = Args.getLastArg(OPT_sort_symbols_EQ)) {
+     std::string SortKeysString = A->getValue();
+     for (StringRef KeyStr : llvm::split(A->getValue(), ",")) {
+       SortSymbolKeyTy KeyType = StringSwitch<SortSymbolKeyTy>(KeyStr)

jhenderson (Not Done):
int pos = 0;
- for (llvm::StringRef KeyStr : llvm::split(A->getValue(), ",")) {
+ for (StringRef KeyStr : llvm::split(A->getValue(), ",")) {
if (!KeyStr.empty()) {

+                                     .Case("name", SortSymbolKeyTy::NAME)
+                                     .Case("type", SortSymbolKeyTy::TYPE)
+                                     .Default(SortSymbolKeyTy::UNKNOWN);
+       if (KeyType == SortSymbolKeyTy::UNKNOWN)
+         error("--sort-symbols value should be 'name' or 'type', but was '" +
+               Twine(KeyStr) + "'");
+       opts::SortKeys.push_back(KeyType);
+     }
+   }
  opts::VersionInfo = Args.hasArg(OPT_version_info);

jhenderson (Not Done): This option is currently Mach-O specific, so move it accordingly, unless you plan on implementing other formats. Also, these lists are in alphabetical order. Please maintain that order.

oontvoo (Done): Yes, I plan to add it to at least ELF, but I don't want to do it all in one patch (especially since the ELF dumper is a bit more complex). Fixed the ordering, though.
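For readers without an LLVM checkout, the key-parsing loop in `parseOptions` above can be approximated in standard C++. This is only a sketch of the same idea (split on commas, map each piece to an enumerator, reject anything unrecognised). `SortKey`, `parseSortKey` and `parseSortKeys` are hypothetical names for illustration, and the patch itself uses `llvm::split` and `StringSwitch` rather than the hand-written loop shown here.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the patch's SortSymbolKeyTy.
enum class SortKey { Name, Type };

// Maps one key string to a SortKey; std::nullopt for anything unrecognised.
inline std::optional<SortKey> parseSortKey(std::string_view Key) {
  if (Key == "name")
    return SortKey::Name;
  if (Key == "type")
    return SortKey::Type;
  return std::nullopt;
}

// Splits Value on ',' and appends one SortKey per piece to Keys.
// On the first unrecognised piece, copies it into BadKey and returns false
// (keys parsed before that point remain in Keys).
inline bool parseSortKeys(std::string_view Value, std::vector<SortKey> &Keys,
                          std::string &BadKey) {
  std::size_t Start = 0;
  while (true) {
    std::size_t Comma = Value.find(',', Start);
    bool Last = Comma == std::string_view::npos;
    std::string_view Piece =
        Value.substr(Start, Last ? std::string_view::npos : Comma - Start);
    std::optional<SortKey> Key = parseSortKey(Piece);
    if (!Key) {
      BadKey = std::string(Piece);
      return false;
    }
    Keys.push_back(*Key);
    if (Last)
      return true;
    Start = Comma + 1;
  }
}
```

As in the patch, an unrecognised key is reported rather than silently skipped. The caller decides whether that becomes a hard error.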

  // Mach-O specific options.
  opts::MachODataInCode = Args.hasArg(OPT_macho_data_in_code);
  opts::MachODysymtab = Args.hasArg(OPT_macho_dysymtab);

jhenderson (Done): I think it would be simpler if this were an error. What's the motivation for a warning?

oontvoo (Done): (made it an error too)

  opts::MachOIndirectSymbols = Args.hasArg(OPT_macho_indirect_symbols);
  opts::MachOLinkerOptions = Args.hasArg(OPT_macho_linker_options);
  opts::MachOSegment = Args.hasArg(OPT_macho_segment);
  opts::MachOVersionMin = Args.hasArg(OPT_macho_version_min);

  // PE/COFF specific options.
  opts::CodeView = Args.hasArg(OPT_codeview);
  opts::CodeViewEnableGHash = Args.hasArg(OPT_codeview_ghash);
[...] std::string FileStr =
          : Obj.getFileName().str();
  std::string ContentErrString;
  if (Error ContentErr = Obj.initContent())
    ContentErrString = "unable to continue dumping, the file is corrupt: " +
                       toString(std::move(ContentErr));
  ObjDumper *Dumper;
+   Optional<SymbolComparator> SymComp;

jhenderson (Done):
ObjDumper *Dumper;
- std::unique_ptr<SymbolComparator> SymComp;
+ llvm::Optional<SymbolComparator> SymComp;
Expected<std::unique_ptr<ObjDumper>> DumperOrErr = createDumper(Obj, Writer);
`llvm::Optional` would be the more expressive form here. You could then pass it directly, rather than via the pointer, and have a `None` check instead of a `nullptr` check.

jhenderson (Done): I was mistaken to put the `llvm::` qualifier before `Optional`. It looks like many (most?) instances don't use the qualifier for it or `None`, so please remove it.
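The suggestion above, an optional comparator passed by value instead of a nullable pointer, can be illustrated with `std::optional` in place of `llvm::Optional`. The sketch below is an assumption-laden stand-in, not the patch's code: `Sym` and `orderedNames` are hypothetical, and this `SymbolComparator` only mirrors the `addPredicate` usage visible in the diff, with earlier predicates taking priority and later ones breaking ties.

```cpp
#include <algorithm>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical symbol record; the real code compares object::SymbolRef values.
struct Sym {
  std::string Name;
  int Type;
};

// Sketch of a multi-key comparator. Each predicate is a strict "less than";
// a later predicate is consulted only when the earlier ones see a tie.
class SymbolComparator {
public:
  using Pred = std::function<bool(const Sym &, const Sym &)>;

  void addPredicate(Pred P) { Preds.push_back(std::move(P)); }

  bool operator()(const Sym &LHS, const Sym &RHS) const {
    for (const Pred &P : Preds) {
      if (P(LHS, RHS))
        return true;
      if (P(RHS, LHS))
        return false;
    }
    return false; // Equal under every key.
  }

private:
  std::vector<Pred> Preds;
};

// An empty optional means "keep the original symbol order"; no null check on
// a pointer is needed.
inline std::vector<std::string>
orderedNames(std::vector<Sym> Syms,
             const std::optional<SymbolComparator> &Comp) {
  if (Comp)
    std::stable_sort(Syms.begin(), Syms.end(), *Comp);
  std::vector<std::string> Names;
  for (const Sym &S : Syms)
    Names.push_back(S.Name);
  return Names;
}
```

The lambdas passed to `addPredicate` need no trailing return type, since a single `return` of a `bool` expression is deduced automatically.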

  Expected<std::unique_ptr<ObjDumper>> DumperOrErr = createDumper(Obj, Writer);
  if (!DumperOrErr)
    reportError(DumperOrErr.takeError(), FileStr);
  Dumper = (*DumperOrErr).get();
+   if (!opts::SortKeys.empty()) {
+     if (Dumper->canCompareSymbols()) {
+       SymComp = SymbolComparator();
+       for (SortSymbolKeyTy Key : opts::SortKeys) {
+         switch (Key) {
+         case NAME:
+           SymComp->addPredicate([Dumper](SymbolRef LHS, SymbolRef RHS) {

jhenderson (Done): Here and below, I don't think you need the trailing return types.

jhenderson (Done): Not addressed?

+             return Dumper->compareSymbolsByName(LHS, RHS);
+           });
+           break;
+         case TYPE:
+           SymComp->addPredicate([Dumper](SymbolRef LHS, SymbolRef RHS) {
+             return Dumper->compareSymbolsByType(LHS, RHS);
+           });
+           break;
+         case UNKNOWN:

jhenderson (Done): Use `case UNSPEC` (renamed according to my earlier comment) here, instead of `default`, to take advantage of compiler warnings about not all cases being filled. See also https://llvm.org/docs/CodingStandards.html#don-t-use-default-labels-in-fully-covered-switches-over-enumerations.

+           llvm_unreachable("Unsupported sort key");

jhenderson (Done): Use `llvm_unreachable` rather than `assert(false)`.

+         }
+       }
+     } else {
+       reportWarning(createStringError(

jhenderson (Done):
1) Error and warning messages shouldn't end with a "."
2) I have concerns here: this will stop dumping as soon as an unsupported file type is encountered (even in an archive), yet that is unlikely to be what the user wants. They more likely want to continue dumping and just not sort in this case (imagine if they had multiple different file format objects in their input). This should be at most a warning (including the input file name), although I could even see an argument for not having it at all. It also needs testing.

oontvoo (Done):
1. Fixed
2. If the format doesn't support --sort-symbols then the users shouldn't specify --sort-symbols. (Keeping it as an error for consistency with how it handles an unknown sort key above: if it doesn't understand it, it's an error, rather than trying to guess. This file also doesn't have a precedent for "warning".)

jhenderson (Done): You're making the potentially incorrect assumption that all inputs are of the same format. If a user has two objects of different formats, they might want the symbols sorted for the ones that can be:

$ llvm-readobj elf.o macho.o --sort-symbols --symbols

Would result in an error, and nothing printed (not even macho.o's symbols).
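The behaviour argued for in this thread, warn for inputs that cannot be sorted and keep dumping the rest, can be modelled in a few lines. This is a hypothetical simulation only: `Input`, `Report` and `dumpAll` are invented names, `CanSort` stands in for `Dumper->canCompareSymbols()`, and no real object files are read.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one input file.
struct Input {
  std::string Name;
  bool CanSort; // Stands in for Dumper->canCompareSymbols().
};

// What happened to each input, recorded instead of printed.
struct Report {
  std::vector<std::string> Sorted;   // Dumped with sorted symbols.
  std::vector<std::string> Unsorted; // Dumped in original symbol order.
  std::vector<std::string> Warnings; // One message per unsortable input.
};

// Warn-and-continue: an unsupported format produces a warning that names the
// file, and the file is still dumped, just without sorting.
inline Report dumpAll(const std::vector<Input> &Inputs) {
  Report R;
  for (const Input &In : Inputs) {
    if (In.CanSort) {
      R.Sorted.push_back(In.Name);
      continue;
    }
    R.Warnings.push_back(
        In.Name + ": --sort-symbols is not supported yet for this format");
    R.Unsorted.push_back(In.Name);
  }
  return R;
}
```

With inputs `elf.o` (not sortable) and `macho.o` (sortable), both files are dumped and only `elf.o` produces a warning. Under the error-based approach, nothing would be printed for `macho.o`.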

+                         errc::invalid_argument,
+                         "--sort-symbols is not supported yet for this format"),
+                     FileStr);

jhenderson (Done): This warning is untested. Unless you have patches lined up for all other formats, I'd add a test that shows what happens for a mixture of sortable and unsortable formats, e.g. `llvm-readobj wasm.o macho.o elf.o`

+     }
+   }
  Dumper->printFileSummary(FileStr, Obj, opts::InputFilenames, A);
  if (opts::FileHeaders)
    Dumper->printFileHeaders();
  if (Obj.isXCOFF() && opts::XCOFFAuxiliaryHeader)
    Dumper->printAuxiliaryHeader();
[...] if (opts::NeededLibraries)
    Dumper->printNeededLibraries();
  if (opts::Relocations)
    Dumper->printRelocations();
  if (opts::DynRelocs)
    Dumper->printDynamicRelocations();
  if (opts::UnwindInfo)
    Dumper->printUnwindInfo();
  if (opts::Symbols || opts::DynamicSymbols)
-     Dumper->printSymbols(opts::Symbols, opts::DynamicSymbols);
+     Dumper->printSymbols(opts::Symbols, opts::DynamicSymbols, SymComp);
  if (!opts::StringDump.empty())
    Dumper->printSectionsAsString(Obj, opts::StringDump);
  if (!opts::HexDump.empty())
    Dumper->printSectionsAsHex(Obj, opts::HexDump);
  if (opts::HashTable)
    Dumper->printHashTable();
  if (opts::GnuHashTable)
    Dumper->printGnuHashTable();
[...]


Revision Contents

Diff 418931

llvm/docs/CommandGuide/llvm-readobj.rst

llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml

llvm/test/tools/llvm-readobj/sort-symbols.test

llvm/tools/llvm-readobj/MachODumper.cpp

llvm/tools/llvm-readobj/ObjDumper.h

llvm/tools/llvm-readobj/Opts.td

llvm/tools/llvm-readobj/llvm-readobj.h

llvm/tools/llvm-readobj/llvm-readobj.cpp
