This is an archive of the discontinued LLVM Phabricator instance.


Differential D131589

[llvm-objdump] Handle multiple syms at same addr in disassembly.
Closed, Public

Authored by simon_tatham on Aug 10 2022, 9:28 AM.

Details

Reviewers

scott.linder
jhenderson
aardappel
rochauha
sbc100
dschuff
rafauler
MaskRay

Commits

rG8e29f3f1c35a: [llvm-objdump] Handle multiple syms at same addr in disassembly.

Summary

The main disassembly loop in llvm-objdump works by iterating through
the symbols in a code section, and for each one, dumping the range of
the section from that symbol to the next. If there's another symbol
defined at the same location, then that range will have length 0, and
llvm-objdump will skip over the symbol entirely.

As a result, llvm-objdump will only show the last of the symbols
defined at that address. Not only that, but the other symbols won't
even be checked against the --disassemble-symbol list. So if you
have two symbols foo and bar defined in the same place, then one
of --disassemble-symbol=foo and --disassemble-symbol=bar will
generate an error message and no disassembly.
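The failure mode described above can be reproduced with a small model of the loop. This is only a sketch: `Sym` and `namesShownPerSymbol` are hypothetical stand-ins, not the real llvm-objdump types or functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm-objdump's per-symbol record.
struct Sym {
  uint64_t Addr;
  std::string Name;
};

// Models the old walk: one range per symbol, running from that symbol's
// address to the next symbol's address. A zero-length range is skipped,
// so a symbol that shares its address with a later one is never shown.
std::vector<std::string> namesShownPerSymbol(const std::vector<Sym> &Symbols,
                                             uint64_t SectionEnd) {
  std::vector<std::string> Shown;
  for (size_t SI = 0; SI < Symbols.size(); ++SI) {
    uint64_t Start = Symbols[SI].Addr;
    uint64_t End = SI + 1 < Symbols.size() ? Symbols[SI + 1].Addr : SectionEnd;
    if (Start >= End)
      continue; // empty range: the symbol is dropped without being checked
    Shown.push_back(Symbols[SI].Name);
  }
  return Shown;
}
```

With `bar` and `foo` both at address 0 and `qux` at address 8, only `foo` and `qux` come out, and `bar` is never compared against any symbol filter.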

I think a better approach in that situation is to prioritise display
of the symbol the user actually asked for. Also, if the user
specifically asks for disassembly of both of two symbols defined
at the same address, the best response I can think of is to
disassemble the code once, preceded by both symbol names.

This involves teaching llvm-objdump to be able to display more than
one symbol name at the head of a disassembled section, which also
makes it possible to implement a --show-all-symbols option to
display every symbol defined in the code, not just the most
preferred one at each address.
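One way to read the display policy described above is sketched here. The helper `namesToDisplay` and its parameters are hypothetical, and treating the last name in sorted order as the "most preferred" one is an assumption taken from the old behaviour described earlier in this summary.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: choose which of the names defined at one address
// are printed above the disassembly.
//   NamesHere - every name at this address, in sorted order
//   Requested - names given via --disassemble-symbols (empty = no filter)
//   ShowAll   - whether --show-all-symbols was given
std::vector<std::string>
namesToDisplay(const std::vector<std::string> &NamesHere,
               const std::set<std::string> &Requested, bool ShowAll) {
  std::vector<std::string> Wanted;
  for (const std::string &Name : NamesHere)
    if (Requested.count(Name))
      Wanted.push_back(Name);

  if (NamesHere.empty() || (!Requested.empty() && Wanted.empty()))
    return {}; // nothing at this address was asked for
  if (ShowAll)
    return NamesHere; // every symbol defined here
  if (Requested.empty())
    return {NamesHere.back()}; // assumption: last in sort order is preferred
  return Wanted; // exactly the names the user asked for
}
```

Under this reading, asking for both `aaaa` and `bbbb` at the same address yields a single disassembly headed by both names.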

This change also turns out to fix a bug in which --disassemble-all
on a mixed Arm/Thumb ELF file would fail to switch disassembly states
between Arm and Thumb functions, because the mapping symbols were
accidentally ignored.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
610 ms	x64 debian > BOLT.AArch64::r_aarch64_prelxx.s

Event Timeline

simon_tatham created this revision. Aug 10 2022, 9:28 AM

Herald added a reviewer: MaskRay. Aug 10 2022, 9:28 AM

Herald added a project: Restricted Project.

Herald added a subscriber: rupprecht.

simon_tatham requested review of this revision. Aug 10 2022, 9:28 AM

Herald added a project: Restricted Project. Aug 10 2022, 9:28 AM

Herald added subscribers: llvm-commits, StephenFan, aheejin.

Harbormaster completed remote builds in B180436: Diff 451520. Aug 10 2022, 11:45 AM

Thanks Simon, I think this patch makes a nice improvement to the behavior of objdump. I'm not the code owner of these disassembly tools, but the patch looks good to me. I would rather wait for the input of whoever is more familiar with objdump, though.

llvm/tools/llvm-objdump/llvm-objdump.cpp
1494–1497	Curious about the style here where we define a scope to isolate a piece of computation. I don't think I've seen this in other LLVM files or in the coding style, so I'm curious what other people think about it. I do think it makes the code more clear to understand here, because it's only 3 lines. But on the other hand, it increases nesting on line 1509, which can actually make code harder to read (arguably).

When adding new tool options, please make sure to update the CommandGuide at llvm/docs/CommandGuide for the tool.

llvm/tools/llvm-objdump/llvm-objdump.cpp
1494–1497	I'm not sure this style is used in LLVM, or at least not in the areas I'm familiar with, so I'd drop it.
1496	Nit: I think we prefer preincrement to post.
1501	Can this be a vector of `StringRef` to save the copying?
1511	"(that we can find at all)" - I'm not sure what this is saying. Do you even need it?
1533
1534
1537–1540	Not sure you need the parentheses, or the "//" emphasis marks. Also, "Arm" should be "ARM".
1542	`size_t` would be the more natural type for loops over `SymbolsHere`, since that's the value returned by `size()` and used by the index operator. Ditto below.
1563–1565	It's not immediately obvious to me why this line has changed. Could you explain please, as it likely means I've missed something.
1585	`StringRef`?
1683–1684	Is there a subtle behaviour change here if you have multiple symbols at the same address but different types (i.e. one is STT_OBJECT and one isn't, e.g. STT_FUNC)?

jhenderson added inline comments. Aug 11 2022, 1:30 AM

llvm/tools/llvm-objdump/llvm-objdump.cpp
1687–1688	Same comment as above.

In D131589#3714216, @rafauler wrote:

I'm not the code owner of these disassembly tools

Several of the reviewers I picked for this patch were authors of the specific parts of the code that I was doing something complicated to. According to git, you were involved in setting up the --disassemble-symbols system in the first place, so I particularly value your input on whether there's any intentional feature of its semantics that I've broken without noticing :-)

(The onSymbolStart mechanism is the other one that I'm concerned about, because neither of its use cases is familiar to me. So I tried to pick reviewers who know something about that, as well.)

llvm/tools/llvm-objdump/llvm-objdump.cpp
1494–1497	Oops, good catch. Those braces were originally there to isolate some variables so that they weren't in scope for the rest of the enormous for-loop body. But by the time I finished developing the patch I'd removed all the variables local to the block, and somehow didn't notice that even in three last-minute pre-upload reviews of my own :-) Now I look closely, the same thing happened to the other scope that decides which symbols to print. I'll fix that one too.
1501	Not trivially, because in the case where we have to demangle names, `demangle()` returns a newly made string and we have to have somewhere to store it. I've changed it so that we have the vector of `StringRef` you wanted, and also a vector of `std::string` which is left empty when not in demangling mode, and the StringRefs point at the local vector or the original name elsewhere, as appropriate.
1511	I just meant that if a symbol specified by the user doesn't appear in the object file at all, then we're exempt from the need to display it. But I agree that's not 100% clear, or particularly important to highlight in this context. I'll remove the parenthesis.
1542	I agree! But I guessed that the widespread existing use of `unsigned` in similar cases in the LLVM code base (for example, this very loop where `SI` ranges up to `Symbols.size()`) was a local idiom that I'd be criticised for going against. I'm glad to see that the opposite is true ;-)
1563–1565	In the existing version of the loop, `SI` is incremented after the loop body runs, by the `++SI` in the `for` statement itself. So throughout the loop body, `SI` points at the symbol we're currently disassembling, and `SI+1` here indicates the next symbol, whose address marks the point where we're planning to finish this iteration of the disassembly loop. In the new version, I've removed the `++SI` in the `for` statement, and replaced it with code at the beginning of the loop body that advances `SI` past all the symbols defined at the same address. So after that code runs, the rest of the loop body sees `SI` already pointing at the first symbol defined at a later address.
1683–1684	Potentially, yes. Previously, `llvm-objdump` would pick just one of the symbols defined at the address, and base its decision on that symbol alone. With this patch, it will go through all of them, and spot any STT_OBJECT even if it's not the symbol last in the sorted list. This is just the sort of thing I hoped to have a useful discussion about in order to decide what the behaviour should be, to avoid the risk of writing oodles of code to implement a complicated policy that we had no consensus on :-) so thanks for flagging it up. What do you think we should do if an STT_FUNC and an STT_OBJECT occur at the same address? llvm-objdump's existing policy doesn't look particularly deliberate to me – it's an artefact of the code's previous lack of attention to collocated symbols. Perhaps it's nonetheless best to stick with the existing policy just for stability's sake, but if so, I'd prefer that we'd discussed other options before deciding that. Other possibilities that spring to mind are to deliberately make STT_OBJECT highest priority (which is what's happening in this version of the code), or to make it lowest priority, or to choose based on some criterion like symbol index in the ELF file (go with whichever symbol was first/last in the actual object file's symtab). And maybe, whichever of those we do, emit a warning that flags up that we had to make an arbitrary decision that could have gone the other way. What do you think? (PS I hope you're not going to like the symtab index idea, because that information isn't preserved at all in `SymbolInfoTy` so it would take a load more plumbing :-)
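The loop restructuring explained in the reply on lines 1563–1565 above can be sketched as follows. This is an illustration under simplifying assumptions: `groupByAddress` is a hypothetical function, and the symbols are reduced to a sorted list of addresses.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns half-open index ranges [First, Last) of symbols that share an
// address. The input is assumed to be sorted by address.
std::vector<std::pair<size_t, size_t>>
groupByAddress(const std::vector<uint64_t> &SortedAddrs) {
  std::vector<std::pair<size_t, size_t>> Groups;
  // No increment in the for statement: the body advances SI itself.
  for (size_t SI = 0, SE = SortedAddrs.size(); SI != SE;) {
    size_t FirstSI = SI;
    uint64_t Start = SortedAddrs[SI];
    while (SI != SE && SortedAddrs[SI] == Start)
      ++SI;
    // From here on SI already names the first symbol at a later address,
    // so the end of the range is found at index SI, not SI + 1.
    Groups.emplace_back(FirstSI, SI);
  }
  return Groups;
}
```

For addresses `0, 0, 8, 12, 12, 12` this produces the groups `[0, 2)`, `[2, 3)` and `[3, 6)`.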

Addressed all review comments (I think) other than the question of STT_OBJECT priority versus other kinds of symbol.

Harbormaster completed remote builds in B180658: Diff 451830. Aug 11 2022, 8:07 AM

jhenderson added inline comments. Aug 12 2022, 12:40 AM

llvm/tools/llvm-objdump/llvm-objdump.cpp
1501	I'd forgotten about the `demangle` aspect of this. As such, I'm not fussed whichever way you prefer to do it.
1505–1508	You can do this, right?
1683–1684	Looking at the comment block, my instinct says we should treat STT_FUNC as higher priority (possibly assuming it hasn't got size 0) and do regular disassembly. Having an STT_OBJECT/STT_COMMON symbol at the same address as an STT_FUNC symbol sounds like it's unlikely to ever occur in practice ("this code represents both a function and some data??"). If it does, I think it's reasonable to pick one style somewhat arbitrarily. The user can use `--disassemble-symbol` of the STT_OBJECT symbol if they want to disassemble it as data in this case, I think. I'm less clear on the second block below about ARM mapping symbols, because I'm not familiar with how ARM mapping symbols are used, and therefore don't think I can make an informed decision on the right approach there.

simon_tatham added inline comments. Aug 12 2022, 3:00 AM

llvm/tools/llvm-objdump/llvm-objdump.cpp
1505–1508	I nearly did, but I wasn't confident that a `StringRef` to a `std::string` stored in a `std::vector` is guaranteed to stay valid if the `std::vector` has to resize itself. What if the `std::string` implementation stores short enough strings without a separate allocation, and the vector resize involves a realloc? So instead I did something I'm sure is safe, which is to set up the entire vector of strings, commit to never modifying it again, and then start making `StringRef`s pointing into it. (Another option I'd have been completely confident of would be to make a vector of `unique_ptr<string>`, so that even if the vector resizes, each pointed-to string stays put. That forces even more allocations, though.)

Note to self: I still need to review the tests, but will do that once the approach re. multiple symbols has been settled on and the tests updated accordingly.

llvm/tools/llvm-objdump/llvm-objdump.cpp
1505–1508	Good point - keep the loop as-is. It turns out that the "Small String Optimization" that std::string can use under the hood means that, perhaps unintuitively, the string's data can be moved when the string itself is moved, rather than just the pointer to the data (see https://stackoverflow.com/questions/57723963/is-it-safe-to-store-the-pointer-to-the-data-of-a-stdstring).
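The two-phase approach discussed in this exchange can be sketched as below. It is not the patch's code: `std::string_view` (C++17) stands in for `StringRef`, and `fakeDemangle` and `headerFor` are hypothetical names.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a demangler: returns a newly made string.
std::string fakeDemangle(const std::string &Raw) {
  return "demangled(" + Raw + ")";
}

// Builds the "<name>:" header lines for a group of symbols.
std::string headerFor(const std::vector<std::string> &RawNames) {
  // Phase 1: fill the owning vector completely. It may reallocate while
  // growing, which can move short strings stored inline.
  std::vector<std::string> Demangled;
  for (const std::string &Raw : RawNames)
    Demangled.push_back(fakeDemangle(Raw));

  // Phase 2: Demangled is never modified again, so views taken now stay
  // valid for as long as it lives.
  std::vector<std::string_view> Views;
  for (const std::string &S : Demangled)
    Views.push_back(S);

  std::string Header;
  for (std::string_view V : Views) {
    Header += '<';
    Header.append(V.data(), V.size());
    Header += ">:\n";
  }
  return Header;
}
```

Taking each view inside the first loop instead would risk dangling references as soon as `Demangled` reallocates.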

simon_tatham marked 6 inline comments as done. Aug 15 2022, 3:03 AM

simon_tatham added inline comments.

llvm/tools/llvm-objdump/llvm-objdump.cpp
1683–1684	OK, I'll change it to treat code symbols as higher priority than data. As for mapping symbols, that's a use case I do know something about, and honestly, I think the simplest approach there is to stop checking for `STT_OBJECT` symbols at all, and just say that if there are mapping symbols in this section, we should use them, and not try to second-guess whether we think they're useful. So I think the best thing is to remove that loop completely, and replace it with a test of `MappingSymbols.empty()`.
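A minimal sketch of the rule proposed in this reply, where a code symbol outranks a data symbol at the same address. The types and the function `disassembleAsData` are hypothetical, and `SymKind` is a simplified stand-in for the ELF `STT_*` symbol types.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for ELF symbol types (assumption, not the real enum).
enum class SymKind { Func, Object, NoType };

struct SymHere {
  std::string Name;
  SymKind Kind;
};

// Dump the bytes as data only if every symbol at this address is a data
// object. Any non-object symbol wins and forces code disassembly.
bool disassembleAsData(const std::vector<SymHere> &SymbolsHere) {
  if (SymbolsHere.empty())
    return false;
  for (const SymHere &S : SymbolsHere)
    if (S.Kind != SymKind::Object)
      return false;
  return true;
}
```

The mapping-symbol decision in the same reply then reduces to checking whether the section has any mapping symbols at all, independent of this predicate.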

Updated handling of STT_OBJECT symbols as discussed. Also added a comment about the confusing double loop setting up the demangled symbol names, since the next person might also wonder why it's not being done in one step.

Harbormaster completed remote builds in B181241: Diff 452618. Aug 15 2022, 3:39 AM

Code changes basically look good, but I'm out of time for today to review the testing. Please make sure the testing covers all the new code paths, and that the changed behaviour we were discussing is also covered.

llvm/tools/llvm-objdump/llvm-objdump.cpp
1510–1511	I feel like this should be simplifiable to a single line, possibly using something like `SymNamesHere.insert(SymNamesHere.begin(), DemangledSymNamesHere.begin(), DemangledSymNamesHere.end());` although maybe it's unnecessarily complex. (Same goes for the loop immediately below in the `else`).

Added another test to check that the new data vs code priorities work.

I'd misunderstood the previous sorting criterion, it turned out. I
thought symbols at the same address were sorted by type. In fact
they're sorted by name, then by type. So out of a data and code
symbol, the alphabetically later one was previously winning!

Herald added a subscriber: emaste. Aug 16 2022, 7:54 AM

Harbormaster completed remote builds in B181532: Diff 453011. Aug 16 2022, 8:42 AM

In addition to my inline comments, I think one bit of testing you still need is showing whether the demangled or raw name is used for the priority ordering of symbol names.

llvm/test/tools/llvm-objdump/ELF/data-vs-code-priority.test
1 ↗	(On Diff #453011)	This is going to need some kind of `REQUIRES` directive, since it's testing disassembly.
4 ↗	(On Diff #453011)	"Prior to D131589" is going to be pretty meaningless in the future. I'd just get rid of this whole sentence (and delete the "now" from the next sentence), or at least replace this bit with the more generic "Previously".
17–20 ↗	(On Diff #453011)	Do you think it would be worth also showing which symbols are printed?
29–33 ↗	(On Diff #453011)	I prefer to line up the values in a block so that they all start at the same column, like in the suggestion. It marginally helps with readability, I find.
llvm/test/tools/llvm-objdump/multiple-symbols.test
2	I'd move your comment about what the test does to the start of the file here. Also, this test will need a `REQUIRES` directive too, as it won't work if the build doesn't have the relevant target configured.
5	I'd put blank lines between the closely-related groups here. For example, you could have the first two runs in one group, the next 6 in another and the remainder in a third. I'd then label each group with a comment explaining what's special about that set of test cases.
22	Nit: here and elsewhere, I think the canonical spelling is ARM.
30	I'd suggest this formatting for the groups, because it looks initially like you've just omitted a space after the comma!
34	Rather than half a dozen different CHECK patterns, which are mostly duplicates, I'd consider using multiple check prefixes in each test case to enable/disable the relevant parts. For example, you'd have one prefix for each of the symbols, and then another prefix for each of the disassembly blocks. You could use an `--implicit-check-not` to the FileCheck commands to ensure other stuff isn't printed incorrectly, instead of the `-NOT` style too, but up to you. Rough example:
	# AMAP: 00000000 <$a.0>:
	# AAAA: 00000000 <aaaa>:
	# BBBB: 00000000 <bbbb>:
	# CODE1: <first functions code>
	# CODE1-NEXT: ...
	...
	# TMAP: ....
102–104	Bearing in mind that --disassemble-symbol should already have testing elsewhere, what does this second block of code + function symbols give us that the first block alone doesn't?
137	Delete this comment - it's obvious that it's the input file by virtue of it being YAML.
llvm/tools/llvm-objdump/llvm-objdump.cpp
1583	The change of this logic should correspond to some sort of test case, I think, but I don't think I see anything?

I've left most of your comments un-replied-to so far, because I need to think harder about the choice of symbols to display, as mentioned in one of my inline comments below.

llvm/test/tools/llvm-objdump/ELF/data-vs-code-priority.test
17–20 ↗	(On Diff #453011)	Hmmm. Now I look more closely, there's still something a bit odd here. The symbols that are printed are the ones beginning with `B`, in every case. (Exactly as the unmodified llvm-objdump would have done.) The presence of a non-data symbol at each location has caused the section contents to be disassembled as code, but the non-data symbol isn't winning the contest in every case to be the one printed. I wonder if that inconsistency might be confusing? If we think the code symbol is more important from a disassembly perspective, perhaps we should make it the one displayed, as well, for consistency? Otherwise you end up printing a data symbol name followed by code, which looks confusing. I feel as if we ought to print a data symbol with data, or a code symbol with code, but not a confusing mixture. I'll have a rethink.
llvm/test/tools/llvm-objdump/multiple-symbols.test
22	I do have to keep remembering that LLVM's canonical spelling isn't the same as Arm's canonical spelling. My fingers have a very strong habit of typing it the way we spell it, for obvious reasons!
llvm/tools/llvm-objdump/llvm-objdump.cpp
1583	It turned out that I had trouble thinking of something that would have changed as a result of removing this section! The intention of the old code here is to avoid checking mapping symbols if we're starting disassembly at an `STT_OBJECT` symbol. But `STT_OBJECT` symbols are handled by the previous if statement by going to `dumpELFData` and then terminating this loop iteration, so it's difficult for one to get as far as here in the first place. If there is any case that could have got here at all without being eaten by the previous test, it must be a confusing edge case of some kind and I haven't put my finger on it yet.

jhenderson added inline comments. Aug 18 2022, 12:27 AM

llvm/test/tools/llvm-objdump/ELF/data-vs-code-priority.test
17–20 ↗	(On Diff #453011)	I agree - if we are disassembling as code, we should be printing code symbols. If there are multiple symbols to print (due to --disassemble-symbol etc), then if any of them are code, I think we should still print as code. However, if none of the code symbols are "selected" for printing, we should print as data, in my opinion. In case there's any ambiguity, I do think we should pick the code symbols above the data symbols, both in choosing which symbol to pick and therefore how to disassemble a block of bytes. It is probably worth taking a look at what GNU objdump does, and see if you can identify any behaviour that makes sense and we can conform to. Disassembly is one area where we diverge somewhat, but I think it might still be a useful reference point.
llvm/test/tools/llvm-objdump/multiple-symbols.test
22	I mean, I'm going off what wikipedia (and several other websites) tells me is the spelling :-) Strange that Arm's official spelling according to the company is different! I'd be happy to go back to what you had before then!

In D131589#3728128, @jhenderson wrote:

I think one bit of testing you still need is showing whether the demangled or raw name is used for the priority ordering of symbol names.

I'm not sure what you mean by that. Priority order of symbol names? The sorting order in Symbols is set up before anything gets demangled, so it will be based on the raw name, but that isn't changed by this patch.

llvm/test/tools/llvm-objdump/multiple-symbols.test
22	There was a change of preference at some point in the past, and it's entirely possible that not everyone has caught up. But in recent years Arm's preferred spelling of its own name is "Arm". (Obviously identifiers in source code have to match the existing spelling and all be consistent, but comments can be up to date!)
34	I had a try at this, but I'm afraid I couldn't see how to make it test the things I want tested. The problem is that the `-NEXT` suffix doesn't apply between different FileCheck prefixes. If I write, for example,
	COMMON: some header
	FOO-NEXT: line involving foo
	BAR-NEXT: line involving bar
	then I'd like `--check-prefixes=COMMON,BAR` to enforce that the bar line shows up immediately after the header line, and there isn't an intervening line of any kind. But in fact the `BAR-NEXT` check provokes an error message from FileCheck that there should have been a previous `BAR` check for it to be next to. It apparently means "must be on the next line from the previous check with the same prefix", not "... with any currently enabled prefix". So I think if I converted this test into your suggested style, I'd lose the ability to have `NEXT` checks at all, so I'd have to have a pair of prefixes for each piece of output, denoting "this line is / is not expected to appear in the file" ...
	# AMAP: 00000000 <$a.0>:
	# AMAP-NOT: 00000000 <$a.0>:
	# AAAA: 00000000 <aaaa>:
	# AAAA-NOT: 00000000 <aaaa>:
	and then each RUN line would have to have an absolutely enormous collection of check-prefixes specifying every single line it both did and didn't want. I can give that a try if you really want me to, but are you sure it's clearer? The effect from my point of view is that all the details of what makes one test run different from another are now way off to the right and smushed into a long undistinguished list of keywords, and you have to cross-refer to the checks anyway to make sense of them, instead of laid out in a table of "here is what this test expects to see".
102–104	It took me a day to remember what I'd been thinking here myself, so I agree it's unclear! The intention of having both an Arm and a Thumb function was to ensure that each one is disassembled in the right one of those states, because the mapping symbols that indicate the changeover are still reliably recognised, regardless of which subset of symbols is being displayed.
llvm/tools/llvm-objdump/llvm-objdump.cpp
1583	Aha! There is an edge case affected by this change. If you set `--disassemble-all` to force disassembly of data sections, then the previous code would have had the side effect of ignoring mapping symbols in code sections, so you'd get Thumb code mistakenly disassembled as Arm. The new criterion of "use mapping symbols if they're there" stops that failure from happening. I'll add a regression test for it.
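To illustrate why the mapping symbols matter regardless of which symbol names are displayed, here is a hedged sketch of looking up the instruction-set state at an address. `MappingSymbol` and `mappingKindAt` are hypothetical; only the `$a` (Arm) and `$t` (Thumb) symbols named in this review are modelled.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

// (address, kind): kind is 'a' for a $a symbol, 't' for a $t symbol.
// The list is assumed to be sorted by address.
using MappingSymbol = std::pair<uint64_t, char>;

// Returns the kind of the last mapping symbol at or before Addr, or
// Default if no mapping symbol precedes Addr.
char mappingKindAt(const std::vector<MappingSymbol> &Sorted, uint64_t Addr,
                   char Default) {
  auto It = std::upper_bound(
      Sorted.begin(), Sorted.end(), Addr,
      [](uint64_t A, const MappingSymbol &M) { return A < M.first; });
  if (It == Sorted.begin())
    return Default;
  return std::prev(It)->second;
}
```

If the mapping list is ignored, every lookup falls back to `Default`, which is the kind of failure described above: Thumb code decoded as Arm.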

Moved the check for data symbols to before we choose symbols to display, so that the same check can control which symbol is printed and how the data after it is disassembled.

Added a test for the changed behaviour of --disassemble-all, and tweaked comments and layout in existing tests for review comments.

Herald added a subscriber: kristof.beyls. Aug 19 2022, 2:35 AM

Harbormaster completed remote builds in B182183: Diff 453935. Aug 19 2022, 3:17 AM

In D131589#3734616, @simon_tatham wrote:

In D131589#3728128, @jhenderson wrote:

I think one bit of testing you still need is showing whether the demangled or raw name is used for the priority ordering of symbol names.

I'm not sure what you mean by that. Priority order of symbol names? The sorting order in Symbols is set up before anything gets demangled, so it will be based on the raw name, but that isn't changed by this patch.

Your new test is about multiple symbols at the same location, and you specifically call out the alphabetical sorting in a comment. That then immediately raises the question about whether demangled or mangled names are used. It's not the end of the world, since you rightly point out that this aspect hasn't changed, but I think it would still be useful to check (assuming of course it isn't already tested, anyway).

llvm/test/tools/llvm-objdump/ELF/ARM/disassemble-all.s
1 ↗	(On Diff #453935)	Perhaps worth adding "mapping-symbols" to the test name, e.g. `disassemble-all-mapping-symbols.s`, since it's specifically the interaction of the two that's interesting.
llvm/test/tools/llvm-objdump/ELF/data-vs-code-priority.test
18 ↗	(On Diff #453935)	"is displayed before" sounds incomplete. Before what? Do you mean "is displayed first"? Also, missing full stop at end of sentence.
25–32 ↗	(On Diff #453935)	Nit: the whitespace for indentation of these lines is inconsistent with the first block above. Please fix.
llvm/test/tools/llvm-objdump/multiple-symbols.test
34	"The problem is that the -NEXT suffix doesn't apply between different FileCheck prefixes." I'm 95% certain that this is incorrect, as I just tested it out locally with the following test passing fine for me:
	# RUN: echo foo > %t.txt
	# RUN: echo baz >> %t.txt
	# RUN: FileCheck %s --input-file=%t.txt --check-prefixes=FOO,BAZ
	# FOO: foo
	# BAR-NEXT: bar
	# BAZ-NEXT: baz
	Did you perhaps accidentally omit the `COMMON` from one of your FileCheck prefix sets? The only rule for -NEXT/-EMPTY commands is that there has to be one regular check (across all prefix sets) before the first -NEXT/-EMPTY.
102–104	Perhaps worth additional comments then to explain this.
llvm/tools/llvm-objdump/llvm-objdump.cpp
1518

simon_tatham marked 10 inline comments as done. Aug 19 2022, 7:07 AM

simon_tatham added inline comments.

llvm/test/tools/llvm-objdump/multiple-symbols.test
34	You're right, it does work the way you say. I had indeed missed having an initial regular check, because I started off with a `-NOT` check, which doesn't count. But I was misled by FileCheck's error message: if I adjust your demo so that its first check is `FOO-NOT`, then I see
	z.test:3:3: error: found 'BAZ-NEXT' without previous 'BAZ: line
	which is what led me to think it worked the way I said!
llvm/tools/llvm-objdump/llvm-objdump.cpp
1510–1511	I'm afraid I don't know enough about that kind of STL iterator idiom to see how you'd do it in the `else` loop, where you not only have to iterate over `SymbolsHere` but also extract the `Name` field of each one. You'd need some kind of lambda, or templated field extraction gadget, or something, surely?

I think I've now addressed all your review comments, including adding a demonstration of alphabetical order vs demangling.

(Ah, that's where the one last unticked Done box was hiding.)

Harbormaster completed remote builds in B182212: Diff 453985. Aug 19 2022, 7:52 AM

jhenderson added inline comments. Aug 22 2022, 12:16 AM

llvm/test/tools/llvm-objdump/multiple-symbols-mangling.test
7 ↗	(On Diff #453985)	I'm wondering if this is a case where a generated-from-assembly-using-llvm-mc input might be more appropriate. Given we already need the Arm target for the disassembly, and we don't need to control any fine details of the object really, I don't think you lose any coverage, and the input file would be simpler. It might be worth looking to do the same at some of the other tests, though I haven't tried to figure out whether the assembly equivalent would be simpler.
14 ↗	(On Diff #453985)	FWIW, I find `_` characters in check prefixes weird. I'm also slightly hesitant, because it is easy enough to mistype an `_` as `-` (but less likely the other way around). I'm not sure you lose much by switching to `-`.
llvm/tools/llvm-objdump/llvm-objdump.cpp
1510–1511	You're looking for `std::transform` I believe in that case, though it's debatable whether it's easier to read, so what you've got is fine.

Adjusted check prefixes, and translated all yaml2obj inputs into llvm-mc inputs (which in all cases fails with the old llvm-objdump, i.e. still produces an object file that successfully tests the changed behaviour).

llvm/test/tools/llvm-objdump/multiple-symbols-mangling.test
14 ↗	(On Diff #453985)	I wasn't sure whether the use of `-` in the middle of the check prefix might conflict with the use of `-` to separate the semantic `-NOT:` and so forth at the end. But apparently that works fine.
llvm/tools/llvm-objdump/llvm-objdump.cpp
1510–1511	Yes, I see, with a lambda to extract the field of each object. I agree it's nicer to leave it as it is :-)
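For comparison, the two forms discussed in this thread look roughly like this. `SymbolRecord` is a hypothetical stand-in for `SymbolInfoTy`, reduced to the two fields the example needs.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical, simplified symbol record.
struct SymbolRecord {
  uint64_t Addr;
  std::string Name;
};

// Plain loop, in the style the patch kept.
std::vector<std::string>
namesWithLoop(const std::vector<SymbolRecord> &SymbolsHere) {
  std::vector<std::string> Names;
  for (const SymbolRecord &S : SymbolsHere)
    Names.push_back(S.Name);
  return Names;
}

// std::transform with a lambda that extracts the Name field.
std::vector<std::string>
namesWithTransform(const std::vector<SymbolRecord> &SymbolsHere) {
  std::vector<std::string> Names;
  Names.reserve(SymbolsHere.size());
  std::transform(SymbolsHere.begin(), SymbolsHere.end(),
                 std::back_inserter(Names),
                 [](const SymbolRecord &S) { return S.Name; });
  return Names;
}
```

Both produce the same list; the `std::transform` version trades the explicit loop for an iterator pair, an inserter and a lambda, which is why the reviewers agreed the loop reads better here.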

Harbormaster completed remote builds in B182536: Diff 454428. Aug 22 2022, 3:16 AM

LGTM, but before pushing, probably worth giving others (@rafauler, @MaskRay in particular) a day or two to have another look.

This revision is now accepted and ready to land. Aug 22 2022, 3:43 AM

This revision was landed with ongoing or failed builds. Aug 24 2022, 7:08 AM

Closed by commit rG8e29f3f1c35a: [llvm-objdump] Handle multiple syms at same addr in disassembly. (authored by simon_tatham).

This revision was automatically updated to reflect the committed changes.

simon_tatham added a commit: rG8e29f3f1c35a: [llvm-objdump] Handle multiple syms at same addr in disassembly.

simon_tatham mentioned this in rG79f99bf6220e: [bolt] Fix a test affected by D131589. Aug 24 2022, 7:52 AM

Thanks! This is a useful option. FWIW I created a feature request for GNU objdump https://sourceware.org/bugzilla/show_bug.cgi?id=29847

MaskRay mentioned this in rGd3b7c84a0bf6: [llvm-objdump][docs] Mention --show-all-symbols. Dec 5 2022, 12:01 PM

Revision Contents

Path	Size
llvm/test/tools/llvm-objdump/multiple-symbols.test	177 lines
llvm/tools/llvm-objdump/ObjdumpOpts.td	4 lines
llvm/tools/llvm-objdump/llvm-objdump.cpp	203 lines

Diff 452618

llvm/test/tools/llvm-objdump/multiple-symbols.test

This file was added.

# RUN: yaml2obj %s -o %t.o

# RUN: llvm-objdump --triple armv8a -d %t.o | FileCheck --check-prefix=DEFAULT %s

# RUN: llvm-objdump --triple armv8a --show-all-symbols -d %t.o | FileCheck --check-prefix=ALLSYMS %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=aaaa -d %t.o | FileCheck --check-prefix=AAAA %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=bbbb -d %t.o | FileCheck --check-prefix=BBBB %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=aaaa,bbbb -d %t.o | FileCheck --check-prefix=AABB %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=aaaa --show-all-symbols -d %t.o | FileCheck --check-prefix=AABB-ALL %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=bbbb --show-all-symbols -d %t.o | FileCheck --check-prefix=AABB-ALL %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=aaaa,bbbb --show-all-symbols -d %t.o | FileCheck --check-prefix=AABB-ALL %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=cccc -d %t.o | FileCheck --check-prefix=CCCC %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=dddd -d %t.o | FileCheck --check-prefix=DDDD %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=cccc,dddd -d %t.o | FileCheck --check-prefix=CCDD %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=cccc --show-all-symbols -d %t.o | FileCheck --check-prefix=CCDD-ALL %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=dddd --show-all-symbols -d %t.o | FileCheck --check-prefix=CCDD-ALL %s

# RUN: llvm-objdump --triple armv8a --disassemble-symbols=cccc,dddd --show-all-symbols -d %t.o | FileCheck --check-prefix=CCDD-ALL %s

## This test checks the behavior of llvm-objdump's --disassemble-symbols and

## --show-all-symbols options, in the presence of multiple symbols defined at

## the same address in an object file.

## The test input file contains an Arm and a Thumb function, each with two

## function-type symbols defined at its entry point. Also, because it's Arm,

jhenderson:

Nit: here and elsewhere, I think the canonical spelling is ARM.

simon_tatham (Author):

I do have to keep remembering that LLVM's canonical spelling isn't the same as Arm's canonical spelling. My fingers have a very strong habit of typing it the way we spell it, for obvious reasons!

jhenderson:

I mean, I'm going off what wikipedia (and several other websites) tells me is the spelling :-) Strange that Arm's official spelling according to the company is different! I'd be happy to go back to what you had before then!

simon_tatham (Author):

There was a change of preference at some point in the past, and it's entirely possible that not everyone has caught up. But in recent years Arm's preferred spelling of its own name is "Arm".

(Obviously identifiers in source code have to match the existing spelling and all be consistent, but comments can be up to date!)

## there's a $a mapping symbol defined at the start of the section, and a $t

## mapping symbol at the point where Arm code stops and Thumb code begins.

## By default, llvm-objdump will pick one of the symbols to disassemble at each

## point where any are defined at all. The tie-break sorting criterion is

## alphabetic, so it will be the alphabetically later symbol in each case: of

## the names aaaa,bbbb for the Arm function it picks bbbb, and of cccc,dddd for

## the Thumb function it picks dddd.

jhenderson:

## alphabetic, so it will be the alphabetically later symbol in each case: of

- ## the names aaaa,bbbb for the Arm function it picks bbbb, and of cccc,dddd for

+ ## the names aaaa and bbbb for the Arm function it picks bbbb, and of cccc and dddd for

## the Thumb function it picks dddd.

I'd suggest this formatting for the groups, because at first glance it looks as though you've just omitted a space after the comma!

# DEFAULT-NOT: >:

# DEFAULT: 00000000 <bbbb>:

# DEFAULT-NEXT: 0: e0800080 add r0, r0, r0, lsl #1

jhenderson:

Rather than half a dozen different CHECK patterns, which are mostly duplicates, I'd consider using multiple check prefixes in each test case to enable/disable the relevant parts. For example, you'd have one prefix for each of the symbols, and then another prefix for each of the disassembly blocks. You could use an --implicit-check-not to the FileCheck commands to ensure other stuff isn't printed incorrectly, instead of the -NOT style too, but up to you.

Rough example:

# AMAP: 00000000 <$a.0>:
# AAAA: 00000000 <aaaa>:
# BBBB: 00000000 <bbbb>:
# CODE1: <first functions code>
# CODE1-NEXT: ...
...
# TMAP: ....

simon_tatham (Author):

I had a try at this, but I'm afraid I couldn't see how to make it test the things I want tested.

The problem is that the -NEXT suffix doesn't apply between different FileCheck prefixes. If I write, for example,

COMMON: some header
FOO-NEXT: line involving foo
BAR-NEXT: line involving bar

then I'd like --check-prefixes=COMMON,BAR to enforce that the bar line shows up immediately after the header line, and there isn't an intervening line of any kind. But in fact the BAR-NEXT check provokes an error message from FileCheck that there should have been a previous BAR check for it to be next to. It apparently means "must be on the next line from the previous check with the same prefix", not "... with any currently enabled prefix".

So I think if I converted this test into your suggested style, I'd lose the ability to have NEXT checks at all, so I'd have to have a pair of prefixes for each piece of output, denoting "this line is / is not expected to appear in the file" ...

# AMAP:     00000000 <$a.0>:
# AMAP-NOT: 00000000 <$a.0>:
# AAAA:     00000000 <aaaa>:
# AAAA-NOT: 00000000 <aaaa>:

and then each RUN line would have to have an absolutely enormous collection of check-prefixes specifying every single line it both did and didn't want.

I can give that a try if you really want me to, but are you sure it's clearer? The effect from my point of view is that all the details of what makes one test run different from another are now way off to the right and smushed into a long undistinguished list of keywords, and you have to cross-refer to the checks anyway to make sense of them, instead of laid out in a table of "here is what this test expects to see".

jhenderson:

> The problem is that the -NEXT suffix doesn't apply between different FileCheck prefixes.

I'm 95% certain that this is incorrect, as I just tested it out locally with the following test passing fine for me:

# RUN: echo foo > %t.txt
# RUN: echo baz >> %t.txt
# RUN: FileCheck %s --input-file=%t.txt --check-prefixes=FOO,BAZ

# FOO: foo
# BAR-NEXT: bar
# BAZ-NEXT: baz

Did you perhaps accidentally omit the COMMON from one of your FileCheck prefix sets? The only rule for -NEXT/-EMPTY commands is that there has to be one regular check (across all prefix sets) before the first -NEXT/-EMPTY.

simon_tatham (Author):

You're right, it does work the way you say. I had indeed missed having an initial regular check, because I started off with a -NOT check, which doesn't count. But I was misled by FileCheck's error message: if I adjust your demo so that its first check is FOO-NOT, then I see

z.test:3:3: error: found 'BAZ-NEXT' without previous 'BAZ: line

which is what led me to think it worked the way I said!

# DEFAULT-NEXT: 4: e12fff1e bx lr

# DEFAULT-EMPTY:

# DEFAULT-NEXT: 00000008 <dddd>:

# DEFAULT-NEXT: 8: eb00 0080 add.w r0, r0, r0, lsl #2

# DEFAULT-NEXT: c: 4770 bx lr

## With the --show-all-symbols option, all the symbols are shown, including the

## administrative mapping symbols.

# ALLSYMS-NOT: >:

# ALLSYMS: 00000000 <$a.0>:

# ALLSYMS-NEXT: 00000000 <aaaa>:

# ALLSYMS-NEXT: 00000000 <bbbb>:

# ALLSYMS-NEXT: 0: e0800080 add r0, r0, r0, lsl #1

# ALLSYMS-NEXT: 4: e12fff1e bx lr

# ALLSYMS-EMPTY:

# ALLSYMS-NEXT: 00000008 <$t.1>:

# ALLSYMS-NEXT: 00000008 <cccc>:

# ALLSYMS-NEXT: 00000008 <dddd>:

# ALLSYMS-NEXT: 8: eb00 0080 add.w r0, r0, r0, lsl #2

# ALLSYMS-NEXT: c: 4770 bx lr

## If you ask for '--disassemble-symbols=aaaa', then the symbol aaaa is singled

## out for display even though it wouldn't be shown by default, because it's

## the one you actually asked for. And display stops after that: we don't move

## on to disassemble the second block of code at all.

# AAAA-NOT: >:

# AAAA: 00000000 <aaaa>:

# AAAA-NEXT: 0: e0800080 add r0, r0, r0, lsl #1

# AAAA-NEXT: 4: e12fff1e bx lr

# AAAA-NOT: 8

## Similarly, if you ask for '--disassemble-symbols=bbbb', then you see just

## bbbb. (This is the symbol that _would_ have been shown before, of course.)

# BBBB-NOT: >:

# BBBB: 00000000 <bbbb>:

# BBBB-NEXT: 0: e0800080 add r0, r0, r0, lsl #1

# BBBB-NEXT: 4: e12fff1e bx lr

# BBBB-NOT: 8

## If you ask for both, via '--disassemble-symbols=aaaa,bbbb', then the code is

## only dumped once, but both symbols are shown at its entry point, because

## they're both symbols you expressed an interest in.

# AABB-NOT: >:

# AABB: 00000000 <aaaa>:

# AABB-NEXT: 00000000 <bbbb>:

# AABB-NEXT: 0: e0800080 add r0, r0, r0, lsl #1

# AABB-NEXT: 4: e12fff1e bx lr

# AABB-NOT: 8

## With _any_ of those three options and also --show-all-symbols, the

## disassembled code is still limited to just the symbol(s) you asked about,

## but all symbols defined at the same address are mentioned, whether you asked

## about them or not.

# AABB-ALL-NOT: >:

# AABB-ALL: 00000000 <$a.0>:

# AABB-ALL-NEXT: 00000000 <aaaa>:

# AABB-ALL-NEXT: 00000000 <bbbb>:

# AABB-ALL-NEXT: 0: e0800080 add r0, r0, r0, lsl #1

# AABB-ALL-NEXT: 4: e12fff1e bx lr

# AABB-ALL-NOT: 8

## Similarly for the other two functions. This time we must check that the

## aaaa/bbbb block of code was not disassembled _before_ the output we're

## expecting.

jhenderson:

Bearing in mind that --disassemble-symbol should already have testing elsewhere, what does this second block of code + function symbols give us that the first block alone doesn't?

simon_tatham (Author):

It took me a day to remember what I'd been thinking here myself, so I agree it's unclear! The intention of having both an Arm and a Thumb function was to ensure that each one is disassembled in the right one of those states, because the mapping symbols that indicate the changeover are still reliably recognised, regardless of which subset of symbols is being displayed.

jhenderson:

Perhaps worth additional comments then to explain this.

## Asking for just cccc:

# CCCC-NOT: 0:

# CCCC: 00000008 <cccc>:

# CCCC-NEXT: 8: eb00 0080 add.w r0, r0, r0, lsl #2

# CCCC-NEXT: c: 4770 bx lr

## Asking for just dddd:

# DDDD-NOT: 0:

# DDDD: 00000008 <dddd>:

# DDDD-NEXT: 8: eb00 0080 add.w r0, r0, r0, lsl #2

# DDDD-NEXT: c: 4770 bx lr

## Asking for both:

# CCDD-NOT: 0:

# CCDD: 00000008 <cccc>:

# CCDD-NEXT: 00000008 <dddd>:

# CCDD-NEXT: 8: eb00 0080 add.w r0, r0, r0, lsl #2

# CCDD-NEXT: c: 4770 bx lr

## Any of those together with --show-all-symbols:

# CCDD-ALL-NOT: 0:

# CCDD-ALL: 00000008 <$t.1>:

# CCDD-ALL-NEXT: 00000008 <cccc>:

# CCDD-ALL-NEXT: 00000008 <dddd>:

# CCDD-ALL-NEXT: 8: eb00 0080 add.w r0, r0, r0, lsl #2

# CCDD-ALL-NEXT: c: 4770 bx lr

## And here's the input object file.

jhenderson:

Delete this comment - it's obvious that it's the input file by virtue of it being YAML.

--- !ELF

FileHeader:

Class: ELFCLASS32

Data: ELFDATA2LSB

Type: ET_REL

Machine: EM_ARM

Flags: [ EF_ARM_EABI_VER5 ]

Sections:

- Name: .text

Type: SHT_PROGBITS

Flags: [ SHF_ALLOC, SHF_EXECINSTR ]

AddressAlign: 0x4

Content: 800080E01EFF2FE100EB80007047

Symbols:

- Name: '$a.0'

Section: .text

Value: 0x0

- Name: aaaa

Section: .text

Type: STT_FUNC

Binding: STB_GLOBAL

Value: 0x0

- Name: bbbb

Section: .text

Type: STT_FUNC

Binding: STB_GLOBAL

Value: 0x0

- Name: '$t.1'

Section: .text

Value: 0x8

- Name: cccc

Section: .text

Type: STT_FUNC

Binding: STB_GLOBAL

Value: 0x8

- Name: dddd

Section: .text

Type: STT_FUNC

Binding: STB_GLOBAL

Value: 0x8

llvm/tools/llvm-objdump/ObjdumpOpts.td

def section_headers : Flag<["--"], "section-headers">,
  HelpText<"Display summaries of the headers for each section.">;
def : Flag<["--"], "headers">, Alias<section_headers>,
  HelpText<"Alias for --section-headers">;
def : Flag<["-"], "h">, Alias<section_headers>,
  HelpText<"Alias for --section-headers">;

+ def show_all_symbols : Flag<["--"], "show-all-symbols">,
+   HelpText<"Show all symbols during disassembly, even if multiple "
+            "symbols are defined at the same location">;
+
def show_lma : Flag<["--"], "show-lma">,
  HelpText<"Display LMA column when dumping ELF section headers">;

def source : Flag<["--"], "source">,
  HelpText<"When disassembling, display source interleaved with the "
           "disassembly. Implies --disassemble">;
def : Flag<["-"], "S">, Alias<source>, HelpText<"Alias for --source">;

llvm/tools/llvm-objdump/llvm-objdump.cpp

bool objdump::LeadingAddr;
static bool Offloading;
static bool RawClangAST;
bool objdump::Relocations;
bool objdump::PrintImmHex;
bool objdump::PrivateHeaders;
std::vector<std::string> objdump::FilterSections;
bool objdump::SectionHeaders;
+ static bool ShowAllSymbols;
static bool ShowLMA;
bool objdump::PrintSource;
static uint64_t StartAddress;
static bool HasStartAddressFlag;
static uint64_t StopAddress = UINT64_MAX;
static bool HasStopAddressFlag;

for (const SectionRef &Section : ToolSectionFilter(Obj)) {

  // the section offset.
  uint64_t RelAdjustment = Obj.isRelocatableObject() ? 0 : SectionAddr;
  uint64_t Size;
  uint64_t Index;
  bool PrintedSection = false;
  std::vector<RelocationRef> Rels = RelocMap[Section];
  std::vector<RelocationRef>::const_iterator RelCur = Rels.begin();
  std::vector<RelocationRef>::const_iterator RelEnd = Rels.end();

// Disassemble symbol by symbol.

for (unsigned SI = 0, SE = Symbols.size(); SI != SE; ++SI) {

std::string SymbolName = Symbols[SI].Name.str();

if (Demangle)

SymbolName = demangle(SymbolName);

// Skip if --disassemble-symbols is not empty and the symbol is not in // Loop over each chunk of code between two points where at least

// the list. // one symbol is defined.

if (!DisasmSymbolSet.empty() && !DisasmSymbolSet.count(SymbolName)) for (size_t SI = 0, SE = Symbols.size(); SI != SE;) {

// Advance SI past all the symbols starting at the same address,

// and make an ArrayRef of them.

unsigned FirstSI = SI;

uint64_t Start = Symbols[SI].Addr;

ArrayRef<SymbolInfoTy> SymbolsHere;

while (SI != SE && Symbols[SI].Addr == Start)

++SI;

SymbolsHere = ArrayRef<SymbolInfoTy>(&Symbols[FirstSI], SI - FirstSI);

jhenderson:

Nit: I think we prefer preincrement to post.

rafauler:

Curious about the style here where we define a scope to isolate a piece of computation. I don't think I've seen this in other LLVM files or in the coding style, so I'm curious what other people think about it.

I do think it makes the code more clear to understand here, because it's only 3 lines. But on the other hand, it increases nesting on line 1509, which can actually make code harder to read (arguably).

jhenderson:

I'm not sure this style is used in LLVM, or at least not in the areas I'm familiar with, so I'd drop it.

simon_tatham (Author):

Oops, good catch. Those braces were originally there to isolate some variables so that they weren't in scope for the rest of the enormous for-loop body. But by the time I finished developing the patch I'd removed all the variables local to the block, and somehow didn't notice that even in three last-minute pre-upload reviews of my own :-)

Now I look closely, the same thing happened to the other scope that decides which symbols to print. I'll fix that one too.

// Get the demangled names of all those symbols. We end up with a vector

// of StringRef that holds the names we're going to use, and a vector of

// std::string that stores the new strings returned by demangle(), if

// any. If we don't call demangle() then that vector can stay empty.

jhenderson:

Can this be a vector of StringRef to save the copying?

simon_tatham (Author):

Not trivially, because in the case where we have to demangle names, demangle() returns a newly made string and we have to have somewhere to store it.

I've changed it so that we have the vector of StringRef you wanted, and also a vector of std::string which is left empty when not in demangling mode, and the StringRefs point at the local vector or the original name elsewhere, as appropriate.

jhenderson:

I'd forgotten about the demangle aspect of this. As such, I'm not fussed whichever way you prefer to do it.

std::vector<StringRef> SymNamesHere;

std::vector<std::string> DemangledSymNamesHere;

if (Demangle) {

// Fetch the demangled names and store them locally.

for (const SymbolInfoTy &Symbol : SymbolsHere)

DemangledSymNamesHere.push_back(demangle(Symbol.Name.str()));

// Now we've finished modifying that vector, it's safe to make

jhenderson:

if (Demangle) {

- for (const SymbolInfoTy &Symbol : SymbolsHere)

+ for (const SymbolInfoTy &Symbol : SymbolsHere) {

DemangledSymNamesHere.push_back(demangle(Symbol.Name.str()));

- for (const std::string &DemangledName : DemangledSymNamesHere)

- SymNamesHere.push_back(DemangledName);

+ SymNamesHere.push_back(DemangledSymNamesHere.back());

+ }

} else {

You can do this, right?

simon_tatham (Author):

I nearly did, but I wasn't confident that a StringRef to a std::string stored in a std::vector is guaranteed to stay valid if the std::vector has to resize itself. What if the std::string implementation stores short enough strings without a separate allocation, and the vector resize involves a realloc?

So instead I did something I'm sure is safe, which is to set up the entire vector of strings, commit to never modifying it again, and then start making StringRefs pointing into it.

(Another option I'd have been completely confident of would be to make a vector of unique_ptr<string>, so that even if the vector resizes, each pointed-to string stays put. That forces even more allocations, though.)

jhenderson:

Good point - keep the loop as-is. It turns out that the "Small String Optimization" that std::string can use under-the-hood, means that perhaps unintuitively, the string's data can be moved when the string itself is moved, rather than just the pointer to the data (see https://stackoverflow.com/questions/57723963/is-it-safe-to-store-the-pointer-to-the-data-of-a-stdstring).
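The two-phase pattern settled on in this thread can be sketched in standard C++. This is only an illustration, not the patch's code: `std::string_view` stands in for LLVM's `StringRef`, and `fakeDemangle`, `demangleAll` and `viewsInto` are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical placeholder for demangle(): all that matters here is that it
// returns a newly made string that somebody has to own.
std::string fakeDemangle(const std::string &Mangled) {
  return "demangled(" + Mangled + ")";
}

// Phase 1: build the owning vector completely. No views exist yet, so it
// does not matter how often the vector reallocates while it grows.
std::vector<std::string> demangleAll(const std::vector<std::string> &Mangled) {
  std::vector<std::string> Owned;
  for (const std::string &Name : Mangled)
    Owned.push_back(fakeDemangle(Name));
  return Owned;
}

// Phase 2: take views only once Owned will not be modified again. The views
// stay valid for as long as the caller keeps Owned alive and unmodified.
std::vector<std::string_view> viewsInto(const std::vector<std::string> &Owned) {
  std::vector<std::string_view> Views;
  for (const std::string &Name : Owned)
    Views.push_back(Name);
  assert(Views.size() == Owned.size());
  return Views;
}
```

Interleaving the two phases, by pushing a view straight after each owned string, risks dangling views: a later `push_back` on the owning vector may reallocate and move the strings, and a short string's characters may live inside the string object itself.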

// StringRefs pointing into it.

for (const std::string &DemangledName : DemangledSymNamesHere)

SymNamesHere.push_back(DemangledName);

jhenderson:

"(that we can find at all)" - I'm not sure what this is saying. Do you even need it?

simon_tatham (Author):

I just meant that if a symbol specified by the user doesn't appear in the object file at all, then we're exempt from the need to display it. But I agree that's not 100% clear, or particularly important to highlight in this context. I'll remove the parenthesis.

jhenderson:

I feel like this should be simplifiable to a single line, possibly using something like SymNamesHere.insert(SymNamesHere.begin(), DemangledSymNamesHere.begin(), DemangledSymNamesHere.end()); although maybe it's unnecessarily complex. (Same goes for the loop immediately below in the else).

simon_tatham (Author):

I'm afraid I don't know enough about that kind of STL iterator idiom to see how you'd do it in the else loop, where you not only have to iterate over SymbolsHere but also extract the Name field of each one. You'd need some kind of lambda, or templated field extraction gadget, or something, surely?

jhenderson:

You're looking for std::transform I believe in that case, though it's debatable whether it's easier to read, so what you've got is fine.

simon_tatham (Author):

Yes, I see, with a lambda to extract the field of each object. I agree it's nicer to leave it as it is :-)
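For reference, the `std::transform` form mentioned in this thread might look roughly like the sketch below. It uses standard C++ only: `NamedSym` is a hypothetical cut-down stand-in for `SymbolInfoTy`, `std::string_view` stands in for `StringRef`, and `namesOf` is an invented helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical cut-down stand-in for SymbolInfoTy: only the Name field
// matters for this example.
struct NamedSym {
  std::string Name;
};

// Extract a view of each symbol's name. The lambda plays the role of the
// "field extraction gadget"; the views refer to the strings stored in
// SymbolsHere, so they are only valid while SymbolsHere is alive.
std::vector<std::string_view> namesOf(const std::vector<NamedSym> &SymbolsHere) {
  std::vector<std::string_view> Names;
  Names.reserve(SymbolsHere.size());
  std::transform(SymbolsHere.begin(), SymbolsHere.end(),
                 std::back_inserter(Names),
                 [](const NamedSym &S) { return std::string_view(S.Name); });
  assert(Names.size() == SymbolsHere.size());
  return Names;
}
```

Compared with the two-line range-based for loop in the patch, this is arguably no easier to read, which matches the conclusion reached in the thread.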

} else {

for (const SymbolInfoTy &Symbol : SymbolsHere)

SymNamesHere.push_back(Symbol.Name);

}

// Decide which symbol(s) from this collection we're going to print.

std::vector<bool> SymsToPrint(SymbolsHere.size(), false);

jhenderson:

// Distinguish ELF data from code symbols, which will be used later on to

- // decide whether to 'disassemble' this chunk as at data declaration via

+ // decide whether to 'disassemble' this chunk as a data declaration via

// dumpELFData(), or whether to treat it as code.

// If the user has given the --disassemble-symbols option, then we must

// display every symbol in that set, and no others.

if (!DisasmSymbolSet.empty()) {

bool FoundAny = false;

for (size_t i = 0; i < SymbolsHere.size(); ++i) {

if (DisasmSymbolSet.count(SymNamesHere[i])) {

SymsToPrint[i] = true;

FoundAny = true;

}

// And if none of the symbols here is one that the user asked for, skip

// disassembling this entire chunk of code.

if (!FoundAny)

continue; continue;

jhenderson:

SymsToPrint[SymbolsHere.size() - 1] = true;

}

- // Now that we know we're disassembling this section at all, override

+ // Now that we know we're disassembling this section, override

// the choice of which symbols to display by printing _all_ of them a

} else {

jhenderson:

// Now that we know we're disassembling this section at all, override

- // the choice of which symbols to display by printing _all_ of them a

+ // the choice of which symbols to display by printing _all_ of them at

// this address if the user asked for all symbols.

// Otherwise, print whichever symbol at this location is last in the

// Symbols array, because that array is pre-sorted in a way intended to

// correlate with priority of which symbol to display.

SymsToPrint[SymbolsHere.size() - 1] = true;

}

jhenderson:

Not sure you need the parentheses, or the "//" emphasis marks. Also, "Arm" should be "ARM".

// Now that we know we're disassembling this section, override the choice

// of which symbols to display by printing _all_ of them at this address

jhenderson:

size_t would be the more natural type for loops over SymbolsHere, since that's the value returned by size() and used by the index operator.

Ditto below.

simon_tatham (Author):

I agree! But I guessed that the widespread existing use of unsigned in similar cases in the LLVM code base (for example, this very loop where SI ranges up to Symbols.size()) was a local idiom that I'd be criticised for going against. I'm glad to see that the opposite is true ;-)

// if the user asked for all symbols.

// That way, '--show-all-symbols --disassemble-symbol=foo' will print

// only the chunk of code headed by 'foo', but also show any other

// symbols defined at that address, such as aliases for 'foo', or the ARM

// mapping symbol preceding its code.

if (ShowAllSymbols) {

for (size_t i = 0; i < SymbolsHere.size(); ++i)

SymsToPrint[i] = true;

}

- uint64_t Start = Symbols[SI].Addr;
if (Start < SectionAddr || StopAddress <= Start)
  continue;
- else
-   FoundDisasmSymbolSet.insert(SymbolName);
+ for (size_t i = 0; i < SymbolsHere.size(); ++i)
+   FoundDisasmSymbolSet.insert(SymNamesHere[i]);

// The end is the section end, the beginning of the next symbol, or
// --stop-address.
uint64_t End = std::min<uint64_t>(SectionAddr + SectSize, StopAddress);
- if (SI + 1 < SE)
-   End = std::min(End, Symbols[SI + 1].Addr);
+ if (SI < SE)
+   End = std::min(End, Symbols[SI].Addr);
if (Start >= End || End <= StartAddress)

jhenderson:

It's not immediately obvious to me why this line has changed. Could you explain please, as it likely means I've missed something.

simon_tatham (Author):

In the existing version of the loop, SI is incremented after the loop body runs, by the ++SI in the for statement itself. So throughout the loop body, SI points at the symbol we're currently disassembling, and SI+1 here indicates the next symbol, whose address marks the point where we're planning to finish this iteration of the disassembly loop.

In the new version, I've removed the ++SI in the for statement, and replaced it with code at the beginning of the loop body that advances SI past all the symbols defined at the same address. So after that code runs, the rest of the loop body sees SI already pointing at the first symbol defined at a later address.
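The index bookkeeping described here can be sketched in isolation. The following is a simplified illustration under stated assumptions, not the code from llvm-objdump.cpp: `AddrSym`, `Chunk` and `groupByAddress` are hypothetical names, the symbols are assumed to be pre-sorted by address, and the --start-address/--stop-address handling is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical cut-down stand-in for SymbolInfoTy.
struct AddrSym {
  uint64_t Addr;
  std::string Name;
};

// One chunk of code: the symbols [First, First + Count) all share the
// address Start, and the chunk runs up to (but not including) End.
struct Chunk {
  size_t First;
  size_t Count;
  uint64_t Start;
  uint64_t End;
};

std::vector<Chunk> groupByAddress(const std::vector<AddrSym> &Symbols,
                                  uint64_t SectionEnd) {
  // Assumption of this sketch: the caller has sorted Symbols by address.
  assert(std::is_sorted(Symbols.begin(), Symbols.end(),
                        [](const AddrSym &A, const AddrSym &B) {
                          return A.Addr < B.Addr;
                        }));
  std::vector<Chunk> Chunks;
  // Note: no ++SI in the for statement; the body advances SI itself.
  for (size_t SI = 0, SE = Symbols.size(); SI != SE;) {
    size_t FirstSI = SI;
    uint64_t Start = Symbols[SI].Addr;
    while (SI != SE && Symbols[SI].Addr == Start)
      ++SI;
    // SI now indexes the first symbol at a later address (or equals SE),
    // so the end of this chunk is Symbols[SI].Addr, not Symbols[SI + 1].Addr.
    uint64_t End = SectionEnd;
    if (SI < SE)
      End = std::min(End, Symbols[SI].Addr);
    Chunks.push_back({FirstSI, SI - FirstSI, Start, End});
  }
  return Chunks;
}
```

With the six symbols from the test input ($a.0, aaaa, bbbb at 0x0 and $t.1, cccc, dddd at 0x8) and a 14-byte section, this yields two chunks of three symbols each, covering [0x0, 0x8) and [0x8, 0xe).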

  continue;
Start -= SectionAddr;
End -= SectionAddr;

if (!PrintedSection) {
  PrintedSection = true;
  outs() << "\nDisassembly of section ";
  if (!SegmentName.empty())
    outs() << SegmentName << ",";
  outs() << SectionName << ":\n";
}

outs() << '\n';

for (size_t i = 0; i < SymbolsHere.size(); ++i) {

if (!SymsToPrint[i])

continue;

const SymbolInfoTy &Symbol = SymbolsHere[i];

const StringRef SymbolName = SymNamesHere[i];

jhenderson:

StringRef?

if (LeadingAddr)
  outs() << format(Is64Bits ? "%016" PRIx64 " " : "%08" PRIx64 " ",
                   SectionAddr + Start + VMAAdjustment);
if (Obj.isXCOFF() && SymbolDescription) {
-   outs() << getXCOFFSymbolDescription(Symbols[SI], SymbolName) << ":\n";
+   outs() << getXCOFFSymbolDescription(Symbol, SymbolName) << ":\n";
} else
  outs() << '<' << SymbolName << ">:\n";

}

// Don't print raw contents of a virtual section. A virtual section
// doesn't have any contents in the file.
if (Section.isVirtual()) {
  outs() << "...\n";
  continue;
}

auto Status = DisAsm->onSymbolStart(Symbols[SI], Size, // See if any of the symbols defined at this location triggers target-

Bytes.slice(Start, End - Start), // specific disassembly behavior, e.g. of special descriptors or function

SectionAddr + Start, CommentStream); // prelude information.

// To have round trippable disassembly, we fall back to decoding the

// remaining bytes as instructions.

// //

// If there is a failure, we disassemble the failed region as bytes before // We stop this loop at the first symbol that triggers some kind of

// falling back. The target is expected to print nothing in this case. // interesting behavior (if any), on the assumption that if two symbols

// // defined at the same address trigger two conflicting symbol handlers,

// If there is Success or SoftFail i.e no 'real' failure, we go ahead by // the object file is probably confused anyway, and it would make even

// Size bytes before falling back. // less sense to present the output of _both_ handlers, because that

// So if the entire symbol is 'eaten' by the target: // would describe the same data twice.

// Start += Size // Now Start = End and we will never decode as for (size_t SHI = 0; SHI < SymbolsHere.size(); ++SHI) {

// // instructions SymbolInfoTy Symbol = SymbolsHere[SHI];

auto Status =

DisAsm->onSymbolStart(Symbol, Size, Bytes.slice(Start, End - Start),

SectionAddr + Start, CommentStream);

if (!Status) {

// If onSymbolStart returns None, that means it didn't trigger any

// interesting handling for this symbol. Try the other symbols

// defined at this address.

continue;

}

if (Status.value() == MCDisassembler::Fail) {

// If onSymbolStart returns Fail, that means it identified some kind

// of special data at this address, but wasn't able to disassemble it

// meaningfully. So we fall back to disassembling the failed region

// as bytes, assuming that the target detected the failure before

// printing anything.

// //

// Right now, most targets return None i.e ignore to treat a symbol // Return values Success or SoftFail (i.e no 'real' failure) are

// separately. But WebAssembly decodes preludes for some symbols. // expected to mean that the target has emitted its own output.

// //

if (Status) { // Either way, 'Size' will have been set to the amount of data

if (Status.value() == MCDisassembler::Fail) { // covered by whatever prologue the target identified. So we advance

outs() << "// Error in decoding " << SymbolName // our own position to beyond that. Sometimes that will be the entire

// distance to the next symbol, and sometimes it will be just a

// prologue and we should start disassembling instructions from where

// it left off.

outs() << "// Error in decoding " << SymNamesHere[SHI]

<< " : Decoding failed region as bytes.\n"; << " : Decoding failed region as bytes.\n";

for (uint64_t I = 0; I < Size; ++I) { for (uint64_t I = 0; I < Size; ++I) {

outs() << "\t.byte\t " << format_hex(Bytes[I], 1, /*Upper=*/true) outs() << "\t.byte\t " << format_hex(Bytes[I], 1, /*Upper=*/true)

<< "\n"; << "\n";

} }

} else {

Size = 0;

}

Start += Size; Start += Size;

break;

}

Index = Start;
if (SectionAddr < StartAddress)
  Index = std::max<uint64_t>(Index, StartAddress - SectionAddr);

// If there is a data/common symbol inside an ELF text section and we are
// only disassembling text (applicable all architectures), we are in a
// situation where we must print the data and not disassemble it.

// If data _and_ code symbols are defined at the same address, the code

// takes priority, on the grounds that disassembling code is our main

// purpose here, and it would be a worse failure to _not_ interpret

// something that _was_ meaningful as code than vice versa.

// Any ELF symbol type that is not clearly data will be regarded as code.

// In particular, one of the uses of STT_NOTYPE is for branch targets

// inside functions, for which STT_FUNC would be inaccurate.

if (Obj.isELF() && !DisassembleAll && Section.isText()) { if (Obj.isELF() && !DisassembleAll && Section.isText()) {

uint8_t SymTy = Symbols[SI].Type; bool FoundNonDataSym = false;

if (SymTy == ELF::STT_OBJECT || SymTy == ELF::STT_COMMON) {

for (const SymbolInfoTy &Symbol : SymbolsHere) {

uint8_t SymTy = Symbol.Type;

if (SymTy != ELF::STT_OBJECT && SymTy != ELF::STT_COMMON) {

FoundNonDataSym = true;

break;

}

if (!FoundNonDataSym) {

dumpELFData(SectionAddr, Index, End, Bytes); dumpELFData(SectionAddr, Index, End, Bytes);

Index = End; Index = End;

continue;

jhenderson:

Is there a subtle behaviour change here if you have multiple symbols at the same address but different types (i.e. one is STT_OBJECT and one isn't, e.g. STT_FUNC)?


simon_tatham (author):

Potentially, yes. Previously, llvm-objdump would pick just one of the symbols defined at the address, and base its decision on that symbol alone. With this patch, it will go through all of them, and spot any STT_OBJECT even if it's not the last symbol in the sorted list.

This is just the sort of thing I hoped to have a useful discussion about in order to decide what the behaviour should be, to avoid the risk of writing oodles of code to implement a complicated policy that we had no consensus on :-) so thanks for flagging it up.

What do you think we should do if an STT_FUNC and an STT_OBJECT occur at the same address? llvm-objdump's existing policy doesn't look particularly deliberate to me – it's an artefact of the code's previous lack of attention to collocated symbols. Perhaps it's nonetheless best to stick with the existing policy just for stability's sake, but if so, I'd prefer that we'd discussed other options before deciding that.

Other possibilities that spring to mind are to deliberately make STT_OBJECT highest priority (which is what's happening in this version of the code), or to make it lowest priority, or to choose based on some criterion like symbol index in the ELF file (go with whichever symbol was first/last in the actual object file's symtab). And maybe, whichever of those we do, emit a warning that flags up that we had to make an arbitrary decision that could have gone the other way.

What do you think?

(PS I hope you're not going to like the symtab index idea, because that information isn't preserved at all in SymbolInfoTy so it would take a load more plumbing :-)
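To make the candidate policies concrete, here is a minimal standalone sketch of three of them: consult one symbol only (the previous behaviour, modelled here as the last symbol in the sorted list), data has highest priority, and data has lowest priority. It is an illustration, not llvm-objdump code: the `SymType` enum and all three function names are hypothetical, and the enum values simply mirror the ELF `STT_NOTYPE`, `STT_OBJECT`, `STT_FUNC` and `STT_COMMON` constants.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the ELF symbol types (values match
// STT_NOTYPE, STT_OBJECT, STT_FUNC and STT_COMMON).
enum SymType : uint8_t { NoType = 0, Object = 1, Func = 2, Common = 5 };

static bool isDataType(uint8_t Ty) { return Ty == Object || Ty == Common; }

// Previous behaviour (as modelled here): only the last symbol in the
// sorted list at this address is consulted.
bool dumpAsDataLastSymbolOnly(const std::vector<uint8_t> &TypesHere) {
  return !TypesHere.empty() && isDataType(TypesHere.back());
}

// Data has highest priority: any data symbol at the address wins.
bool dumpAsDataIfAnyData(const std::vector<uint8_t> &TypesHere) {
  for (uint8_t Ty : TypesHere)
    if (isDataType(Ty))
      return true;
  return false;
}

// Data has lowest priority: dump as data only if no symbol at the
// address could be code. An empty list is treated as "not data" here,
// which is an assumption of this sketch.
bool dumpAsDataIfAllData(const std::vector<uint8_t> &TypesHere) {
  if (TypesHere.empty())
    return false;
  for (uint8_t Ty : TypesHere)
    if (!isDataType(Ty))
      return false;
  return true;
}
```

For an address carrying both an `Object` and a `Func` symbol, the three functions can disagree, which is exactly the case under discussion.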


jhenderson:

Looking at the comment block, my instinct says we should treat STT_FUNC as higher priority (possibly assuming it hasn't got size 0) and do regular disassembly. Having an STT_OBJECT/STT_COMMON symbol at the same address as an STT_FUNC symbol sounds like it's unlikely to ever occur in practice ("this code represents both a function and some data??"). If it does, I think it's reasonable to pick one style somewhat arbitrarily. The user can use --disassemble-symbol of the STT_OBJECT symbol if they want to disassemble it as data in this case, I think.

I'm less clear on the second block below about ARM mapping symbols, because I'm not familiar with how ARM mapping symbols are used, and therefore don't think I can make an informed decision on the right approach there.


simon_tatham (author):

OK, I'll change it to treat code symbols as higher priority than data.

As for mapping symbols, that's a use case I do know something about, and honestly, I think the simplest approach there is to stop checking for STT_OBJECT symbols at all, and just say that if there are mapping symbols in this section, we should use them, and not try to second-guess whether we think they're useful. So I think the best thing is to remove that loop completely, and replace it with a test of MappingSymbols.empty().
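For readers less familiar with mapping symbols, the following is a rough sketch of what "if there are mapping symbols, use them" could look like. It is not the llvm-objdump implementation: `MappingSymbol`, `kindAt`, `Mode` and `modeAt` are hypothetical names, and the list is assumed to be sorted by offset. The kinds `'a'`, `'t'` and `'d'` are the ones the disassembly loop in this diff tests for.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical representation: (section offset, kind character).
using MappingSymbol = std::pair<uint64_t, char>;

// Returns the kind of the nearest mapping symbol at or before Offset,
// or '\0' if no mapping symbol precedes it. Assumes Symbols is sorted
// by offset.
char kindAt(const std::vector<MappingSymbol> &Symbols, uint64_t Offset) {
  auto It = std::partition_point(
      Symbols.begin(), Symbols.end(),
      [Offset](const MappingSymbol &S) { return S.first <= Offset; });
  if (It == Symbols.begin())
    return '\0';
  return std::prev(It)->second;
}

enum class Mode { Arm, Thumb, Data, Default };

// The proposed criterion: consult mapping symbols whenever there are
// any, without looking at the type of the symbol being disassembled.
Mode modeAt(const std::vector<MappingSymbol> &Symbols, uint64_t Offset) {
  if (Symbols.empty())
    return Mode::Default;
  switch (kindAt(Symbols, Offset)) {
  case 'a':
    return Mode::Arm;
  case 't':
    return Mode::Thumb;
  case 'd':
    return Mode::Data;
  default:
    return Mode::Default;
  }
}
```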


} }

bool CheckARMELFData = hasMappingSymbols(Obj) &&

jhenderson:

The change of this logic should correspond to some sort of test case, I think, but I don't think I see anything?


simon_tatham (author):

It turned out that I had trouble thinking of something that would have changed as a result of removing this section!

The intention of the old code here is to avoid checking mapping symbols if we're starting disassembly at an STT_OBJECT symbol. But STT_OBJECT symbols are handled by the previous if statement by going to dumpELFData and then terminating this loop iteration, so it's difficult for one to get as far as here in the first place.

If there is any case that could have got here at all without being eaten by the previous test, it must be a confusing edge case of some kind and I haven't put my finger on it yet.


simon_tatham (author):

Aha! There is an edge case affected by this change. If you set --disassemble-all to force disassembly of data sections, then the previous code would have had the side effect of ignoring mapping symbols in code sections, so you'd get Thumb code mistakenly disassembled as Arm.

The new criterion of "use mapping symbols if they're there" stops that failure from happening. I'll add a regression test for it.
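The edge case can be seen by comparing the two conditions side by side. This is a simplified model rather than the real code: the `Inputs` struct and both function names are hypothetical, and a single boolean stands in for both `hasMappingSymbols(Obj)` in the old condition and `!MappingSymbols.empty()` in the new one.

```cpp
// Hypothetical bundle of the facts the two conditions depend on.
struct Inputs {
  bool HasMappingSymbols; // mapping symbols are available
  bool SymbolIsObject;    // the current symbol has type STT_OBJECT
  bool DisassembleAll;    // --disassemble-all was given
};

// Old condition, following the removed CheckARMELFData expression:
// mapping symbols are ignored for STT_OBJECT symbols and whenever
// --disassemble-all is set.
bool oldUsesMappingSymbols(const Inputs &I) {
  return I.HasMappingSymbols && !I.SymbolIsObject && !I.DisassembleAll;
}

// New criterion: use mapping symbols if they're there.
bool newUsesMappingSymbols(const Inputs &I) { return I.HasMappingSymbols; }
```

With `DisassembleAll` set and a non-object symbol, the old condition is false and the new one is true, which corresponds to the Thumb-disassembled-as-Arm failure described above.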


Symbols[SI].Type != ELF::STT_OBJECT &&

!DisassembleAll;

bool DumpARMELFData = false;

jhenderson:

Same comment as above.


formatted_raw_ostream FOS(outs());

std::unordered_map<uint64_t, std::string> AllLabels;
std::unordered_map<uint64_t, std::vector<std::string>> BBAddrMapLabels;
if (SymbolizeOperands) {
  collectLocalBranchTargets(Bytes, MIA, DisAsm, IP, PrimarySTI,
                            SectionAddr, Index, End, AllLabels);
  collectBBAddrMapLabels(AddrToBBAddrMap, SectionAddr, Index, End,
                         BBAddrMapLabels);
}

while (Index < End) {
  // ARM and AArch64 ELF binaries can interleave data and text in the
  // same section. We rely on the markers introduced to understand what
  // we need to dump. If the data marker is within a function, it is
  // denoted as a word/short etc.

if (CheckARMELFData) { if (!MappingSymbols.empty()) {

    char Kind = getMappingSymbolKind(MappingSymbols, Index);
    DumpARMELFData = Kind == 'd';
    if (SecondarySTI) {
      if (Kind == 'a') {
        STI = PrimaryIsThumb ? SecondarySTI : PrimarySTI;
        DisAsm = PrimaryIsThumb ? SecondaryDisAsm : PrimaryDisAsm;
      } else if (Kind == 't') {
        STI = PrimaryIsThumb ? PrimarySTI : SecondarySTI;

(1,221 lines not shown)

static void parseObjdumpOptions(const llvm::opt::InputArgList &InputArgs) {

  LeadingAddr = !InputArgs.hasArg(OBJDUMP_no_leading_addr);
  RawClangAST = InputArgs.hasArg(OBJDUMP_raw_clang_ast);
  Relocations = InputArgs.hasArg(OBJDUMP_reloc);
  PrintImmHex =
      InputArgs.hasFlag(OBJDUMP_print_imm_hex, OBJDUMP_no_print_imm_hex, false);
  PrivateHeaders = InputArgs.hasArg(OBJDUMP_private_headers);
  FilterSections = InputArgs.getAllArgValues(OBJDUMP_section_EQ);
  SectionHeaders = InputArgs.hasArg(OBJDUMP_section_headers);

ShowAllSymbols = InputArgs.hasArg(OBJDUMP_show_all_symbols);

  ShowLMA = InputArgs.hasArg(OBJDUMP_show_lma);
  PrintSource = InputArgs.hasArg(OBJDUMP_source);
  parseIntArg(InputArgs, OBJDUMP_start_address_EQ, StartAddress);
  HasStartAddressFlag = InputArgs.hasArg(OBJDUMP_start_address_EQ);
  parseIntArg(InputArgs, OBJDUMP_stop_address_EQ, StopAddress);
  HasStopAddressFlag = InputArgs.hasArg(OBJDUMP_stop_address_EQ);
  SymbolTable = InputArgs.hasArg(OBJDUMP_syms);
  SymbolizeOperands = InputArgs.hasArg(OBJDUMP_symbolize_operands);

(161 lines not shown)


Details

Diff Detail

Unit TestsFailed

Event Timeline

Revision Contents

Diff 452618

llvm/test/tools/llvm-objdump/multiple-symbols.test

llvm/tools/llvm-objdump/ObjdumpOpts.td

llvm/tools/llvm-objdump/llvm-objdump.cpp
