This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Add option --print-raw-value to llvm-nm to dump raw symbol values in case of ARM
Needs ReviewPublic

Authored by quic-subhkedi on Dec 1 2022, 4:26 AM.

Details

Summary

The address value printed for the symbol "foo" in the added test case using llvm-nm is 1002 which is one byte short of the expected value of 1003. In instances where the user wants to see the raw values via llvm-nm the option --print-raw-values can be used.

The aim is to get those raw values by adding a new option to llvm-nm i.e. --print-raw-values.

Diff Detail

Event Timeline

quic-subhkedi created this revision.Dec 1 2022, 4:26 AM
Herald added a project: Restricted Project. · View Herald Transcript
quic-subhkedi requested review of this revision.Dec 1 2022, 4:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 1 2022, 4:26 AM
quic-subhkedi edited the summary of this revision. (Show Details)Dec 1 2022, 4:34 AM

It is my understanding that the final bit in ARM code is to indicate whether it is from the THUMB or ARM instruction set for an address and that actual address calculations ignore the final bit of the address value. Under what circumstances would you therefore ever have a function at an odd address? Could it be that GNU nm is actually incorrect here to not remove the odd bit?

(Please note, I am not an ARM developer and therefore don't know much about this area).

llvm/include/llvm/Object/ELFObjectFile.h
554–555

With this change, the comment becomes incorrect.

llvm/test/tools/llvm-nm/ARM/address-check.test
2–4

Various nits. Also, please reflow the comment to 80 character width.

27

Nit: add new line at EOF.

shankare added inline comments.Dec 1 2022, 5:25 AM
llvm/include/llvm/Object/ELFObjectFile.h
555–558

Should this be removed even for MIPS ?

Its better that the caller handle it appropriately.

quic-subhkedi marked 3 inline comments as done.Dec 1 2022, 5:25 AM
quic-subhkedi marked an inline comment as done.
shankare added inline comments.Dec 1 2022, 6:51 AM
llvm/include/llvm/Object/ELFObjectFile.h
556–559

You removed this code, and pls look at the failing builder.

Which version of nm are you using? The arm-none-eabi-nm from the Arm GNU toolchain (at least in my GCC toolchain) strips off the bottom bit of Thumb STT_FUNC like llvm-nm. The x86_64 version does not and prints the value with the bottom bit set. As such I think that llvm-nm for Arm targets should match arm-non-eabi-nm. I guess if there is a way to pass a triple to llvm-nm that is non-Arm then it shouldn't strip the bottom bit.

The ABI has a section on symbol values https://github.com/ARM-software/abi-aa/blob/main/aaelf32/aaelf32.rst#553symbol-values

In addition to the normal rules for symbol values the following rules shall also apply to symbols of type STT_FUNC:

* If the symbol addresses an Arm instruction, its value is the address of the instruction (in a relocatable object, the offset of the instruction from the start of the section containing it).
* If the symbol addresses a Thumb instruction, its value is the address of the instruction with bit zero set (in a relocatable object, the section offset with bit zero set). For the purposes of relocation the value used shall be the address of the instruction (st_value & ~1).

Note

This allows a linker to distinguish Arm and Thumb code symbols without having to refer to the map. An Arm symbol will always have an even value, while a Thumb symbol will always have an odd value. However, a linker should strip the discriminating bit from the value before using it for relocation.

The raw value of a Thumb address of type STT_FUNC will always have the bottom bit set, although its address will be (st_value & ~1). I don't think the ABI gives a definitive answer to what nm should print. I guess it depends if llvm-nm should display the address or the raw value. In any case I think we should follow arm-none-eabi-nm.

Shouldn't nm just print the raw value stored in the symbol table like what readelf does ?

$ llvm-readelf -s a.o

Symbol table '.symtab' contains 2 entries:
   Num:    Value  Size Type    Bind   Vis       Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT   UND
     1: 00001003     0 FUNC    GLOBAL DEFAULT     1 foo
$ llvm-nm a.o
00001002 T foo
quic-subhkedi retitled this revision from [ARM] Fix Address value difference for symbol emitted by llvm-nm to [ARM] Add option --print-raw-value to llvm-nm to dump raw symbol values in case of ARM.
quic-subhkedi edited the summary of this revision. (Show Details)

Shouldn't nm just print the raw value stored in the symbol table like what readelf does ?

$ llvm-readelf -s a.o

Symbol table '.symtab' contains 2 entries:
   Num:    Value  Size Type    Bind   Vis       Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT   UND
     1: 00001003     0 FUNC    GLOBAL DEFAULT     1 foo
$ llvm-nm a.o
00001002 T foo

I guess it depends on what nm is supposed to output, is it the address of the symbol, or its raw value? Both interpretations seem reasonable. It all depends on what the authors of aeabi-none-elf-nm or aeabi-linux-gnueabihf-nm intended. It may be that this was not intended behaviour and it drops out of GNU nm using BFD (could be worth a post to the binutils mailing list). However if the intent of this patch is to make llvm-nm more like GNU nm, then I think we shouldn't remove the Arm and Mips specific behaviour as this moves it away from arm-none-elf-nm (and other arm-* targets).

According to https://pubs.opengroup.org/onlinepubs/009696899/utilities/nm.html, nm should print the symbol value, which I'd actually interpret as being the value encoded directly in the symbol, which may or may not be the address the symbol is referring to:

If symbolic information is present in the input files, then for each file or for each member of an archive, the nm utility shall write the following information to standard output. ...

  • ...
  • Value of the symbol

I am not really arguing for or against changing the behaviour, but feel like it's important to know the POSIX standard at least.

Can we use print raw values option for now ?

Perhaps we should just print the raw st_value unconditionally? Since this is something strange with arm*nm, it may be worth reporting this difference on binutils@sourceware.org

it looks like with the change to the API to just return st_value, few bots broke, Subham ?

May be just printing the raw value is an option since callers to this API expect the value returned as per ARM/Mips ABI ?

I don't think adding state to the shared llvm/Object/ELFObjectFile.h is the right thing to do there. I think that bit of code is used in too many places that are relying on the behaviour being what it is without having it mutated by a stateful parameter. I think you could add an additional member function that retrieved the symbol without any additional processing, I think this would show the intent more clearly, as well as not adding state. llvm-nm could call that function when wanting raw symbol addresses.

Alternatively you could use the mapping symbols $t and $a along with the symbol type to identify the Thumb code symbols and add the Thumb bit back in llvm-nm. That wouldn't work for MicroMips, but I think you care about Arm.

I expect that llvm-nm and arm-none-eabi-nm use the address and not the raw value so that --numeric-sort gives the right answer. Otherwise mapping symbols, and any other STT_NOTYPE symbol aliases will be at different addresses to the Thumb function symbols themselves.

I personally don't have any objections to adding functionality to llvm-nm. I do think it needs to avoid altering code-paths that depend on the function the way that it is without introducing state though. I can't comment on whether this change is worth the additional complexity, will need to run this by a maintainer to get the final decision on that.

llvm/include/llvm/Object/ELFObjectFile.h
63

I don't think that this is a good idea to introduce state in such a general place. This header is used for all sorts of tools, they will be expecting that the Thumb bit and the Micromips indicator bit are stripped. If this is ever set by another user then all sorts of things could break.

555

I'm not a fan of altering the behaviour of this function with state. I would prefer a new function that gets the raw symbol value. I think that shows the intent of the change rather than mutating an existing function that is in widespread use.

llvm/tools/llvm-nm/Opts.td
85 ↗(On Diff #480138)

Also for Mips, see comment below.

llvm/tools/llvm-nm/llvm-nm.cpp
133 ↗(On Diff #480138)

Also for Mips. The microMIPS ISA uses the bottom bit for the same purpose as Thumb. See ISA Mode Switch in the MIPS32 Architecture Reference Manual.

it looks like with the change to the API to just return st_value, few bots broke, Subham ?

May be just printing the raw value is an option since callers to this API expect the value returned as per ARM/Mips ABI ?

Yes Shankar, an early revision of this patch tried to simply just emit st_value and it broke a bunch of tests across the board including clang and lld tests, adding an option is the safer way in my opinion

quic-subhkedi marked 4 inline comments as done.

Updated the patch to use newly added APIs and make the implementation stateless

I recommend a test with --numeric-sort when --print-raw-values is used. As I understand it the numeric sort will use the values with thumb/mips bit cleared, which will mean that when printed some symbols will look out of order. This will look a bit weird but is probably better than having different sorting orders.

I don't have any more comments. I think you'll need a LLVM binutils maintainer to approve.

llvm/include/llvm/Object/ELFObjectFile.h
80

Worth a comment to mention what Raw means. Something like
// As getSymbolAddress, but do not mask the Thumb bit or microMips indicator.

612

The code base has many // TODO: Test this error. comments, is it worth replicating them here as this is the same code pattern?

Not a strong opinion.

A few nits, but unfortunately, as today is my last day working for 6 weeks, I won't be able to review further at this time. If @MaskRay or another developer hasn't had a chance to look at this, I'll look when I get back end of January (please ping me at that point, as I may forget otherwise).

llvm/include/llvm/Object/ELFObjectFile.h
589

Nit: delete this blank line.

612

Some context: these were added by @grimar some time ago. Ideally, all our code paths should have at least one test that exercises that code path, so in this case, we'd want a test case that shows what happens when there's an error fetching the address. The TODOs were added because at the time, it either was tricky or completely impossible to craft a test case, given the tools we have. That might not be the case here (but if it is, a TODO may be appropriate).

620

Nit: delete this blank line.

llvm/tools/llvm-nm/Opts.td
86 ↗(On Diff #481267)

What's the motivation for hiding this option?

quic-subhkedi marked 4 inline comments as done.
quic-subhkedi added inline comments.
llvm/include/llvm/Object/ELFObjectFile.h
612

yes it had to replicated

llvm/tools/llvm-nm/Opts.td
86 ↗(On Diff #481267)

doesn't need to be hidden so removed

Added a test with --numeric-sort to record the expected behavior

shankare accepted this revision.Dec 12 2022, 9:25 AM
This revision is now accepted and ready to land.Dec 12 2022, 9:25 AM

@MaskRay Can you have a look at this patch?
All the comments from @peter.smith and @jhenderson are addressed

(I've been out for a while and just came back.)

Perhaps we should just print the raw st_value unconditionally? Since this is something strange with arm*nm, it may be worth reporting this difference on binutils@sourceware.org

This comment is still relevant. It looks like some folks don't like arm-*-nm's st_value &~1 behavior. In this patch, an llvm-nm specific option --print-raw-value is being introduced.
So, can some Arm folks bring up this on binutils@sourceware.org? Is it really useful for arm-*-nm's to keep the st_value &~1 behavior?

If not, llvm-nm can behave as always --print-raw-value, saving us the trouble to introduce a new option.

@peter.smith Can you address @MaskRay comments to start a discussion on binutils@sourceware.org to discuss the expected nm behavior?

@peter.smith Can you address @MaskRay comments to start a discussion on binutils@sourceware.org to discuss the expected nm behavior?

I'm on vacation for the next 3 weeks, can raise it with our own GNU team internally when I get back to see if they have any memory of why GNU nm works the way it does. Feel free to raise it one the binutils list yourself if you would like an answer beforehand, my colleagues from the GNU team will be reading that mailing list.

apazos added a subscriber: apazos.Jan 10 2023, 10:56 AM
apazos added inline comments.
llvm/include/llvm/Object/ELFObjectFile.h
601

Result setting here overwrites the value set above under the flags checks. This code does not look correct.

quic-subhkedi marked an inline comment as done.
quic-subhkedi added inline comments.
llvm/include/llvm/Object/ELFObjectFile.h
601

added the else blocks to preserve the result values

quic-subhkedi marked an inline comment as done.Jan 18 2023, 12:10 AM

@peter.smith Were you able to discuss the issue with the GNU team?
Can you start a discussion on binutils@sourceware.org regarding this?

@peter.smith Were you able to discuss the issue with the GNU team? Can you start a discussion on binutils@sourceware.org regarding this?

My apologies, my day job is very busy at the moment and I've not had chance. I should be able to ask our own internal team later this week, most likely Friday.

@peter.smith Were you able to discuss the issue with the GNU team? Can you start a discussion on binutils@sourceware.org regarding this?

My apologies, my day job is very busy at the moment and I've not had chance. I should be able to ask our own internal team later this week, most likely Friday.

I've asked internally. I'll report back as soon as I get an answer, or next week if I don't.

@peter.smith Were you able to discuss the issue with the GNU team? Can you start a discussion on binutils@sourceware.org regarding this?

My apologies, my day job is very busy at the moment and I've not had chance. I should be able to ask our own internal team later this week, most likely Friday.

I've asked internally. I'll report back as soon as I get an answer, or next week if I don't.

I have an answer, but it is unfortunately I don't think it is going to help very much. To summarise:

  • The intent of the ABI is that bit 0 of an STT_FUNC symbol is not part of the address. So printing a symbols value includes bit 0, printing a symbols address does not.
  • Arm GNU tools have a "soft" preference for "higher-level, user facing" tools like objdump --syms and nm to print the address and not the value and "lower-level, developer facing" tools like readelf --symbols to print the value and not the address. You can indeed see the discrepancy arm-none-eabi-objdump --syms will force bit 0 to 0 in the output but arm-none-eabi-readelf will not. This behaviour is replicated in llvm-objdump and llvm-readelf.
  • It is difficult to argue from first principles on what nm should print, GNU considers it a user-facing tool so it prints addresses.
  • One shouldn't read too much into what POSIX says as it uses address and value interchangeably, essentially an assumption that the two are the same, and was likely written before Arm and Mips stashed away non address information in bit 0.
  • There is no enthusiasm for changing the GNU behaviour.

So I think this still leaves us with the choice of whether llvm-nm should differ from arm-none-eabi-nm. I personally would err on the side of caution and consistency between GNU and LLVM and keep the default behaviour the same. However I don't think it will make a huge difference.

So I think this still leaves us with the choice of whether llvm-nm should differ from arm-none-eabi-nm. I personally would err on the side of caution and consistency between GNU and LLVM and keep the default behaviour the same. However I don't think it will make a huge difference.

I'm inclined to agree with @peter.smith re. compatibility. I don't think there's a strong enough motivation to diverge here, and I've seen llvm-nm used in scripts before to gather symbol data. This means we can't safely change the behaviour like we can with some other tools that are likely only for direct human consumption.

I think this means the option has a place here. @MaskRay, any further thoughts?

On the note of this new option, it should be in the llvm-nm CommandGuide documentation. Also, given you specifically call out MIPS in the comments etc, it probably makes sense to have testing for MIPS too?

llvm/include/llvm/Object/ELFObjectFile.h
580–581

As this is a new code path, rather than a pre-existing one, it should have a test that covers it.

598–599

This code flow looks a bit clunky and easy to misread. Could this become an else if with the above, and then the bit following this if can become the else?

600

As above: the new code path should be tested.

612

Sorry for the confusion: new code should be tested. The existing TODOs are for code paths that weren't properly tested previously and which @grimar didn't have time to identify a way to test.

You should be able to write a test using yaml2obj has many ways to craft an object with malformed input. If there's literally no way to hit this code path (even if you were to hand-edit a binary), then a return of the error is inappropriate and it should be cantFail.

llvm/test/tools/llvm-nm/ARM/address-check.test
2–3

I think this comment should be reworded. Also, as this test is in the ARM subdirectory, there's no need to specifically say that this is for ARM here.

4

Nit: We don't usually have these extra lines between comments and test logic for simple tests with a single test case, like this test.

6–7

Rather than two prefixes, I think it would be better to use a single test prefix, but using FileCheck's -D to specify the expected value. That way, there's no risk of another output change causing one test case to pass and the other to no longer be covered for some reason. I'd also move the thing being checked for close to the RUN line, as it's short.

Something like the following:

# RUN: llvm-nm %t | FileCheck %s -DVALUE=00001002
# RUN: llvm-nm %t --print-raw-values | FileCheck %s -DVALUE=00001003

# CHECK: [[VALUE]] T foo
llvm/test/tools/llvm-nm/ARM/address-sort.test
1–2 ↗(On Diff #488230)

It's not clear from this comment why --numeric-sort + --print-raw-values is an interesting combination to test. Could you rephrase the comment (also including my suggestions from the other test case) to explain why this combination is important. Looking at the code, I think it's because the address used in the sort the printed address (i.e. the "raw" one when --print-raw-values is used), so you should say something about that.

32–35 ↗(On Diff #488230)

CHECK-DAG means that the output can be in either order between the two check lines. Is that what you actually want?

llvm/tools/llvm-nm/Opts.td
59 ↗(On Diff #488230)

The help text probably should mention the ARM/MIPS specificness of this option.

jhenderson requested changes to this revision.Jan 31 2023, 12:57 AM
This revision now requires changes to proceed.Jan 31 2023, 12:57 AM

So I think this still leaves us with the choice of whether llvm-nm should differ from arm-none-eabi-nm. I personally would err on the side of caution and consistency between GNU and LLVM and keep the default behaviour the same. However I don't think it will make a huge difference.

I'm inclined to agree with @peter.smith re. compatibility. I don't think there's a strong enough motivation to diverge here, and I've seen llvm-nm used in scripts before to gather symbol data. This means we can't safely change the behaviour like we can with some other tools that are likely only for direct human consumption.

I think this means the option has a place here. @MaskRay, any further thoughts?

No further thought. Adding an option seems the way to go.

quic-subhkedi marked 9 inline comments as done.
quic-subhkedi added inline comments.
llvm/include/llvm/Object/ELFObjectFile.h
600

this function is only used via the llvm-nm.cpp and is unreachable from there, so is it ok if we remove this?

quic-subhkedi marked 3 inline comments as done.Feb 15 2023, 12:05 AM
quic-subhkedi added inline comments.
llvm/tools/llvm-nm/Opts.td
86 ↗(On Diff #481267)

not hidden any more

Please add the new option to the llvm-nm CommandGuide documentation.

I was under the impression this patch was to print the raw symbol value, which would be the raw st_value field of the Elf_Sym object. However, this function isn't doing that. Could you explain why, please?

llvm/include/llvm/Object/ELFObjectFile.h
578

Nit: I'm not sure auto is appropriate here. LLVM doesn't have an always-auto policy, and I can't tell from here or the immediate context what the underlying type is.

580–581

If this is now tested, please remove the TODO.

588

Some of the code in this function is duplicated with getSymbolAddress. Indeed, I'd expect the two functions (based on their names) to do almost the same thing, with getSymbolAddress simply doing the additional masking. That doesn't seem to be the case though.

llvm/test/tools/llvm-nm/ARM/address-check.test
2
2

This test name makes no sense. What does "address-check" have to do with "print-raw-values".

I recommend changing the name to something like "print-raw-values.test".

15

Is the flag actually needed for this test? If not, please delete this line to minimise noise.

20

This is unnecessary for the test case, I believe.

26

You can omit the Binding field. It'll cause the letter code llvm-nm prints to change, but this is irrelevant for the test.

llvm/test/tools/llvm-nm/ARM/address-sort.test
1–2 ↗(On Diff #497574)

Some wording tweak suggestions.

18 ↗(On Diff #497574)

Same comments about minmising the YAML as in the other test case.

31 ↗(On Diff #497574)

Nit: delete extra blank line.

32–35 ↗(On Diff #497574)

As with the first test, I find it easier to follow tests if the check lines are before the YAML.

llvm/tools/llvm-nm/llvm-nm.cpp
1840–1841 ↗(On Diff #497574)

This asymmetry makes me think that getRawSymbolAddress should be getRawAddress and be a metho of SymbolRef.

Also, is this behaviour ELF-specific or does e.g. COFF have the same behaviour, and therefore the option should apply there too?