This is an archive of the discontinued LLVM Phabricator instance.

[docs][llvm-nm] Improve symbol code documentation
ClosedPublic

Authored by jhenderson on Jun 14 2019, 4:46 AM.

Details

Summary

The existing symbol code documentation was very incomplete. This patch adds the missing codes, and defines them based on the current code behaviour.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42231.

Diff Detail

Event Timeline

jhenderson created this revision.Jun 14 2019, 4:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2019, 4:46 AM

All this looks reasonable, but my grasp of Mach-O is not as complete as I would like, and I have some questions for you.

docs/CommandGuide/llvm-nm.rst
65 ↗(On Diff #204742)

I am not sure about the "Absolute Symbol" note. Can you point me at why you think that's what it means?

The historical documentation for 's' is "A symbol in a section that is not in TEXT, DATA (non-BSS), or in DATA (BSS). I'd hate to call that "unknown" because "undefined" is such a strong concept and they start with the same letter and are easy to confuse. Pragmatically, you can use directives to tell clang to write symbols into any arbitrary segment/section; and symbols defined in such a way would be listed as "s". So it's not that they're "unknown" as they are "uncommon" or perhaps "uncanonical." For the sake of documentation, it might be best to say "Mach-O: symbol not in TEXT or __DATA.

69 ↗(On Diff #204742)

Hmm ... I don't *think* Mach-O will print this value. But, please let me know if you know otherwise, and thanks for bringing this to my attention.

97 ↗(On Diff #204742)

I believe we do not print w/W for Mach-O files. If you know otherwise, please let me know.

101 ↗(On Diff #204742)

Yes, that's what the documentation says ... I've been looking for an example ...

MaskRay added inline comments.Jun 15 2019, 2:21 AM
docs/CommandGuide/llvm-nm.rst
51 ↗(On Diff #204742)

I think the semantics (GNU nm behavior) of n are similar to what the reverted rL359312 did, but unfortunately I cannot find a reasonable use case to write a test, nor can I find useful tests in the binutils-gdb repository.

I don't know if relocations into non-SHF_ALLOC sections other than .debug* are reasonable.

rupprecht added inline comments.Jun 17 2019, 10:43 AM
docs/CommandGuide/llvm-nm.rst
31 ↗(On Diff #204742)

Do we print lowercase a? man nm doesn't include it.

jhenderson marked 6 inline comments as done.Jun 19 2019, 8:30 AM
jhenderson added inline comments.
docs/CommandGuide/llvm-nm.rst
31 ↗(On Diff #204742)

Yes, and indeed GNU nm prints it too:

// bar.s
foo = 1234

PS C:\Work\TempWork> clang -c bar.s
PS C:\Work\TempWork> nm bar.o
00000000000004d2 a foo
PS C:\Work\TempWork> \llvm\build\Debug\bin\llvm-nm bar.o
00000000000004d2 a foo
51 ↗(On Diff #204742)

I don't know if relocations into non-SHF_ALLOC sections other than .debug* are reasonable.

We use them in Sony-specific sections at least. I think they are reasonable for additional metadata that needs to refer to the actual code in some way.

I'm okay with the semantics of 'n' changing in the future (this is just one example where the code doesn't necessarily make sense). I'd just like to document what we currently do now - we can document what the behaviour is when we change the code.

65 ↗(On Diff #204742)

I am not sure about the "Absolute Symbol" note. Can you point me at why you think that's what it means?

See: https://github.com/llvm/llvm-project/blob/master/llvm/tools/llvm-nm/llvm-nm.cpp#L1005. If the N_Type is N_ABS, the code returns 's'.

I'll update the text slightly to explicitly refer to the section names.

69 ↗(On Diff #204742)

It doesn't. As far as I can see, it's COFF and ELF-specific. I'm not sure whether we want to indicate where something isn't supported by Mach-O only. Happy for feedback on this.

97 ↗(On Diff #204742)

llvm-nm doesn't use 'w' for Mach-O. See above for my comments about whether we should explicitly state this.

101 ↗(On Diff #204742)

It's also what the code says. I have no idea what an N_STAB symbol actually is though. Happy for something better here (or even to delete it entirely).

jhenderson marked 2 inline comments as done.Jun 19 2019, 8:49 AM
jhenderson added inline comments.
docs/CommandGuide/llvm-nm.rst
51 ↗(On Diff #204742)

We use them in Sony-specific sections at least. I think they are reasonable for additional metadata that needs to refer to the actual code in some way.

I misread your comment. I don't know about relocations into non-alloc sections, but it's certainly possible to produce local symbols in them at least.

Update text of 's' to specify which sections it applies to more precisely.

MaskRay added inline comments.Jun 20 2019, 12:16 AM
docs/CommandGuide/llvm-nm.rst
51 ↗(On Diff #204742)

Sent out D63588. The new behavior should match what I observed from GNU nm.

ELF: .note or .comment section symbol, or local symbol from non-alloc section.

This should just be: local symbol from non-alloc section if D63588 is accepted. (You can add non-writable if that matters.. though I can't find a use case of a writable non-alloc section)

Note sections are usually SHF_ALLOC. .comment sections do not have relocations. I don't know why the original code was written that way...

81 ↗(On Diff #205613)

It doesn't have to be "referenced".

as /dev/null -o a.o
ld.bfd a.o -u foo -o a  # or gold; but ld.lld doesn't have this behavior
nm a => U foo

though foo is not referenced...

89 ↗(On Diff #205613)

Multiple definitions link together into zero or one definitions.

This sentence makes me puzzled. You may delete it if no better description can be found.

jhenderson marked 2 inline comments as done.Jun 20 2019, 3:40 AM
jhenderson added inline comments.
docs/CommandGuide/llvm-nm.rst
81 ↗(On Diff #205613)

You're right. This one is just moved from earlier, but we can update it now to be better.

89 ↗(On Diff #205613)

How about "This definition will only be used if no regular definitions exist in a link. If multiple weak definitions and no regular definitons exist, one of the weak definitions will be used."?

Reword undefined and weak symbol code text. Also update 'n' text following behaviour change.

MaskRay accepted this revision.Jun 20 2019, 4:27 AM

Thank you for improving the documentation! The ELF part looks good to me, but it'd good to have someone else to take a look.

This revision is now accepted and ready to land.Jun 20 2019, 4:27 AM
jhenderson added a subscriber: majnemer.

Thanks @MaskRay

@majnemer, I've added you as it looks like you've worked on the COFF llvm-nm side a fair bit. How do the COFF aspects look?

@mtrent, are you happy with the Mach-O parts?

mtrent accepted this revision.Jun 20 2019, 10:25 AM
rupprecht accepted this revision.Jun 21 2019, 5:50 PM

Thanks all. I'm going to land this shortly. I'm happy to make further changes if there's any remaining issues.

This revision was automatically updated to reflect the committed changes.