Page MenuHomePhabricator

Fix Bug 41353 - unique symbols printed as D instead of u
ClosedPublic

Authored by mmpozulp on Apr 25 2019, 12:10 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

mmpozulp created this revision.Apr 25 2019, 12:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2019, 12:10 AM
mmpozulp updated this revision to Diff 196713.Apr 25 2019, 1:10 PM

Protect dyn_cast with isELF() check

Adding zbrid since she has context already. I'll take a look in a bit too. Thanks for the patch!

(Also sorry for the delay; I was out last week and am doing a very slow job of un-burying myself from what's piled up)

rupprecht added a subscriber: MaskRay.

Looks like support was recently added in rL359380, though without any tests, and was accidentally reverted.

The approach there (checking SymI->getOther() >> 4 == ELF::STB_GNU_UNIQUE) is a bit opaque, but I think with a comment might look better than the dyn_cast chain. Can you see if that works?

Also +grimar who has been adding GNU_UNIQUE support elsewhere

llvm/test/tools/llvm-nm/X86/Inputs/unique.s
1 ↗(On Diff #196713)

For simple cases like this, you can embed the assembly directly into the test file. See llvm-nm/X86/radix.s for an example test that does this.

llvm/test/tools/llvm-nm/X86/unique.test
1 ↗(On Diff #196713)

Instead of clang, use llvm-mc -filetype=obj -triple=x86_64-pc-linux to directly handle the assembly

Actually, yaml2obj might be even better, now that it can handle unique symbols.

grimar added a comment.May 6 2019, 4:35 AM

Looks like support was recently added in rL359380, though without any tests, and was accidentally reverted.

I think it just had a bug. Binding is st_info >> 4, not st_other >> 4:
https://github.com/llvm-mirror/llvm/blob/master/include/llvm/BinaryFormat/ELF.h#L995

mmpozulp updated this revision to Diff 198350.May 6 2019, 3:11 PM

Incorporate feedback from @rupprecht

mmpozulp updated this revision to Diff 198353.May 6 2019, 3:19 PM

Delete unnecessary modification to ELFTypes.h

grimar added inline comments.May 7 2019, 2:35 AM
llvm/test/tools/llvm-nm/X86/unique.test
1 ↗(On Diff #198353)

I would write in plain English what this test checks (instead of adding a link).
This is s more common way to describe what is going on in test and much more convenient for readers.

1 ↗(On Diff #196713)

+1 to use yaml2obj.

llvm/tools/llvm-nm/llvm-nm.cpp
932 ↗(On Diff #198353)

Since getSymbolNMTypeChar starts from comment "// OK, this is ELF",
and has ELFObjectFileBase argument (i.e. it is called only for ELF)
I suggest to simplify and inline this method instead.
Something like:

case (ELF::SHF_TLS | ELF::SHF_ALLOC | ELF::SHF_WRITE):
case (ELF::SHF_ALLOC | ELF::SHF_WRITE):
  return ELFSymbolRef(Sym).getBinding() == ELF::STB_GNU_UNIQUE ? 'u' : 'd';
...
if ((Symflags & object::SymbolRef::SF_Global) && 
  ELFSymbolRef(Sym).getBinding() != ELF::STB_GNU_UNIQUE)
    Ret = toupper(Ret);
mmpozulp updated this revision to Diff 198526.May 7 2019, 2:04 PM

Inline the check for gnu unique symbol as per suggestion by @grimar. Use yaml2obj instead of llvm-mc in unique.test.

grimar added inline comments.May 13 2019, 2:14 AM
llvm/test/tools/llvm-nm/X86/unique.test
13 ↗(On Diff #198526)

The header you wrote looks nice for me personally, but the problem that it is inconsistent
with other tests (I do not think we usually write how the llvm-mc was called for the very trivial cases like this).

Also, the problem is that your YAML has more sections than needed. (And removing the sections would need to
update your comment about how the YAML was produced, what is probably does not make value here since it
is a very simple case which just does not need llvm-mc I think).

So I think you can just use:

## Check that we print 'u' for unique symbols
# RUN: yaml2obj < %s | llvm-nm - | FileCheck %s

--- !ELF
FileHeader:      
  Class:           ELFCLASS64
  Data:            ELFDATA2LSB
  Type:            ET_REL
  Machine:      EM_X86_64
Sections:        
  - Name:            .data
    Type:            SHT_PROGBITS
    Flags:           [ SHF_WRITE, SHF_ALLOC ]
    AddressAlign:    0x0000000000000001
    Content:         '00'
Symbols:         
  - Name:            foo
    Type:            STT_OBJECT
    Section:         .data
    Binding:         STB_GNU_UNIQUE
...

# CHECK: 0000000000000000 u foo

Or even may be:

## Check that we print 'u' for unique symbols
# RUN: yaml2obj < %s | llvm-nm - | FileCheck %s

--- !ELF
FileHeader:      
  Class:           ELFCLASS64
  Data:            ELFDATA2LSB
  Type:            ET_REL
  Machine:      EM_X86_64
Symbols:         
  - Name:          foo
    Binding:        STB_GNU_UNIQUE
...

# CHECK: 0000000000000000 u foo
grimar added inline comments.May 13 2019, 2:19 AM
llvm/tools/llvm-nm/llvm-nm.cpp
1177 ↗(On Diff #198526)

A bit hard to read these conditions.

What do you think about the following?

if (!(Symflags & object::SymbolRef::SF_Global))
    return Ret;

if (Obj.isELF() && ELFSymbolRef(*I).getBinding() == ELF::STB_GNU_UNIQUE)
    return Ret;

return toupper(Ret);
mmpozulp updated this revision to Diff 199317.May 13 2019, 1:11 PM

Incorporate feedback by @grimar

mmpozulp added inline comments.May 13 2019, 1:28 PM
llvm/test/tools/llvm-nm/X86/unique.test
13 ↗(On Diff #198526)

Good suggestion! The smallest yaml file that I could achieve is

--- !ELF
FileHeader:      
  Class:           ELFCLASS64
  Data:            ELFDATA2LSB
  Type:            ET_REL
  Machine:         EM_X86_64
Sections:        
  - Name:            .data
    Type:            SHT_PROGBITS
    Flags:           [ SHF_WRITE, SHF_ALLOC ]
Symbols:         
  - Name:            foo
    Section:         .data
    Binding:         STB_GNU_UNIQUE

If I delete the .data section I get 'U' since the symbol is now undefined. I also noticed that if I delete

Flags:           [ SHF_WRITE, SHF_ALLOC ]

nm reports 'u' but llvm-nm reports 'n'. If I also delete

Binding:         STB_GNU_UNIQUE

nm reports 'd' but llvm-nm reports 'n'. I assume that these situations don't arise in reality so we don't care.

MaskRay added inline comments.May 13 2019, 10:28 PM
llvm/tools/llvm-nm/llvm-nm.cpp
946 ↗(On Diff #199317)

You have to rebase. This switch has been deleted

jhenderson added inline comments.May 14 2019, 1:39 AM
llvm/test/tools/llvm-nm/X86/unique.test
6 ↗(On Diff #199317)

Nit: there's no need to add large numbers of spaces between the tag and value in the YAML. Just add enough so that all values in a block line up:

Class:   ELFCLASS64
Data:    ELFDATA2LSB
Type:    ET_REL
Machine: EM_X86_64
mmpozulp updated this revision to Diff 199527.May 14 2019, 3:53 PM

Rebase and fix yaml spaces. Thanks @MaskRay and @jhenderson :)

jhenderson added inline comments.May 15 2019, 1:45 AM
llvm/test/tools/llvm-nm/X86/unique.test
1 ↗(On Diff #199527)

Nit: missing trailing full stop.

13 ↗(On Diff #199527)

If the flags aren't relevant to the resolution of the symbol, I suggest getting rid of them entirely.

llvm/tools/llvm-nm/llvm-nm.cpp
911–912 ↗(On Diff #199527)

I don't know how realistic this is, but is this the right order of precedence for unique symbols, compared with GNU nm? In other words, do executable and NOBITS sections containing these symbols result in 't'/'b'?

mmpozulp updated this revision to Diff 199974.May 16 2019, 9:37 PM

Incorporate feedback by @jhenderson.

mmpozulp marked an inline comment as done.May 16 2019, 9:43 PM
mmpozulp added inline comments.
llvm/tools/llvm-nm/llvm-nm.cpp
911–912 ↗(On Diff #199527)

STB_GNU_UNIQUE precedence is high. I tested 5 different cases with GNU nm version 2.27-34.base.el7 to make sure llvm-nm agrees (see below), and added those tests to this patch. What do you think, @jhenderson?

--- !ELF
FileHeader:    
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_X86_64
Sections:    
  - Name:  .data
    Type:  XXX
Symbols:    
  - Name:    foo 
    Section: .data
    Binding: STB_GNU_UNIQUE
XXX             nm  llvm-nm
SHT_PROGBITS    u   u
SHT_NOBITS      u   u
--- !ELF
FileHeader:    
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_X86_64
Sections:    
  - Name:  .data
    Type:  SHT_PROGBITS
    Flags: [YYY]
Symbols:    
  - Name:    foo 
    Section: .data
    Binding: STB_GNU_UNIQUE
YYY                     nm  llvm-nm
SHF_EXECINSTR           u   u
SHF_ALLOC               u   u
SHF_ALLOC, SHF_WRITE    u   u

I think you can afford to have all the test cases for unique symbols in one test file. I'd also name the sections and symbols to illustrate the cases, rather than calling them all .data/foo.

mmpozulp updated this revision to Diff 200161.May 18 2019, 5:17 PM

Squish tests together into one file and use more descriptive section/symbol names. Thanks @jhenderson :)

MaskRay added inline comments.May 18 2019, 5:26 PM
llvm/tools/llvm-nm/llvm-nm.cpp
911–912 ↗(On Diff #199527)

You may move the 'u' check before if (SecI != Obj.section_end()) {

See
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;hb=9d6d4be89d12747a92629ed1bde1d423e2831de1;f=bfd/syms.c#l699

The if (SecI != Obj.section_end()) { blocks roughly corresponds to decode_section_type (symbol->section);.

BTW, I think STB_GNU_UNIQUE is a misfeature (no clear specification, only (complex) implementation is in glibc, interaction with STB_WEAK/STB_GLOBAL is unclear..).. this feature will highlight the problem :)

BTW, I think STB_GNU_UNIQUE is a misfeature (no clear specification, only (complex) implementation is in glibc, interaction with STB_WEAK/STB_GLOBAL is unclear..).. this feature will highlight the problem :)

@MaskRay, it sounds like you know a lot more than I do about this feature :). Do you know what purpose is served by "unique" global symbols? man nm shows

"u" The symbol is a unique global symbol.  This is a GNU extension to the
    standard set of ELF symbol bindings.  For such a symbol the dynamic
    linker will make sure that in the entire process there is just one
    symbol with this name and type in use.

but without an example I don't understand why this feature is useful. If you, @MaskRay, don't know of an example, then @rupprecht, who asked for this feature, or other knowledgeable people like @grimar and @jhenderson may know of one. Thanks :)

I'd like @MaskRay's comment to be addressed, but otherwise, this looks good to me.

mmpozulp updated this revision to Diff 200402.May 20 2019, 10:17 PM

Move the 'u' check before if (SecI != Obj.section_end()), thanks @MaskRay.

jhenderson added inline comments.May 21 2019, 2:20 AM
llvm/test/tools/llvm-nm/X86/unique.test
39 ↗(On Diff #200402)

Could you add a unique symbol without a section to show it gets 'u', please?

mmpozulp marked an inline comment as done.May 21 2019, 12:26 PM
mmpozulp added inline comments.
llvm/test/tools/llvm-nm/X86/unique.test
39 ↗(On Diff #200402)

Is that the behavior that we want for llvm-nm? If I run GNU nm (version 2.27-34.base.el7) it reports 'U' instead of 'u' for a unique symbol without a section

Symbols:
  - Name:    nosection
    Binding: STB_GNU_UNIQUE
U nosection
grimar added inline comments.May 22 2019, 12:38 AM
llvm/test/tools/llvm-nm/X86/unique.test
39 ↗(On Diff #200402)

I can confirm. Symbol is undefined, but unique and GNU nm shows 'U' in this case:

Symbol table '.symtab' contains 2 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 NOTYPE  UNIQUE DEFAULT  UND foo
jhenderson added inline comments.May 22 2019, 3:38 AM
llvm/test/tools/llvm-nm/X86/unique.test
39 ↗(On Diff #200402)

Right, sorry. I based my comment on reading the change below (and I might have misread it).

I think we should have a test case for whatever the behaviour is.

LGTM, pending the additional test case James requested

mmpozulp updated this revision to Diff 200823.May 22 2019, 2:47 PM

Update test to make sure that we print 'U' for a unique symbol without a section. Thanks, @jhenderson, for suggesting this.

rupprecht accepted this revision.May 22 2019, 4:03 PM
rupprecht added inline comments.
llvm/test/tools/llvm-nm/X86/unique.test
3 ↗(On Diff #200823)

nit: < is not necessary for yaml2obj. Just yaml2obj %s is fine (and more common).

This revision is now accepted and ready to land.May 22 2019, 4:03 PM
mmpozulp updated this revision to Diff 200835.May 22 2019, 4:42 PM

Use yaml2obj %s instead of yaml2obj < %s in test, thanks @rupprecht.

jhenderson accepted this revision.May 23 2019, 1:54 AM

LGTM. Do you need help committing this?

LGTM. Do you need help committing this?

Yes, please. Thanks!

This revision was automatically updated to reflect the committed changes.

Committed as r361595. Thanks!