This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Use <first-symbol>-<offset> as the section start symbol
Abandoned · Public

Authored by ychen on Jun 13 2019, 9:43 AM.

Details

Summary

provided that a later symbol exists. If not, use the section name.

https://bugs.llvm.org/show_bug.cgi?id=41946

Event Timeline

ychen created this revision. · Jun 13 2019, 9:43 AM
Herald added a project: Restricted Project. · Jun 13 2019, 9:43 AM
ychen updated this revision to Diff 204573. · Jun 13 2019, 10:16 AM
  • update
MaskRay added a comment. · Edited · Jun 13 2019, 7:32 PM

In the context of PR41946:

Disassembly of section .text:

0000000000001000 .text:
    1000: 90                            nop

I agree that 0000000000001000 .text: is not useful as it repeats what Disassembly of section .text: says. However, in the bar@plt-0x10 case (when the PLT resolver stub doesn't get a symbol name), I'm not sure this improves readability.

ychen added a comment. · Jun 13 2019, 7:37 PM

I agree it is not very readable but it matches GNU output.

MaskRay added a comment. · Edited · Jun 13 2019, 8:36 PM

I know this is what GNU objdump does :) While working on lld's ppc32/riscv .plt support recently, I constantly saw the annoying __libc_start_main@plt-0x20 as a branch target. Compatibility is important for people who use llvm-objdump as a drop-in replacement for GNU objdump. However, in places where there are several ways to do a thing (e.g. how to render .plt), I believe we do not necessarily have to copy its behavior if we can find better ways.

Having said this, .plt looking better than blah@plt-0x20 is my opinion. Other reviewers may have different ideas. I don't mind changing it to blah@plt-0x20 if the majority of reviewers agree so.

ychen added a comment. · Jun 13 2019, 8:56 PM

.plt definitely looks better in this case. I don't have a strong opinion either way. It looks like a trade-off between consistency and compatibility with GNU on one side and readability on the other. I'm slightly in favor of consistency, as a light user of binary tools.

grimar added inline comments. · Jun 14 2019, 1:17 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
1196–1197

Does the following look simpler?

if (Symbols.empty()) {
  Symbols.push_back({SectionAddr, SectionName,
                     Section.isText() ? ELF::STT_FUNC : ELF::STT_OBJECT});
} else {
  uint64_t Addr = std::get<0>(Symbols[0]);
  if (Addr != SectionAddr) {
    std::string Sym = Demangle ? demangle(std::get<1>(Symbols[0]))
                               : std::get<1>(Symbols[0]);
    Symbols.insert(Symbols.begin(),
                   {SectionAddr,
                    Sym + "-0x" + utohexstr(Addr - SectionAddr),
                    Section.isText() ? ELF::STT_FUNC : ELF::STT_OBJECT});
  }
}
1196–1197

The comment needs an update.
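
For readers without the LLVM tree at hand, the suggestion above can be approximated in plain C++. This is only a sketch under stated assumptions: `Sym`, `toHex` and `addStartSymbol` are hypothetical names, the real code stores a tuple that also carries a symbol type, and demangling is left out.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical simplified symbol record (the real tuple also has a type).
struct Sym {
  uint64_t Addr;
  std::string Name;
};

// Format a value as lowercase hex without a "0x" prefix.
static std::string toHex(uint64_t V) {
  std::ostringstream OS;
  OS << std::hex << V;
  return OS.str();
}

// Make sure the address-sorted list has an entry at SectionAddr.
// With no symbols at all, fall back to the section name; otherwise
// synthesize "<first-symbol>-0x<offset>" from the first real symbol.
void addStartSymbol(std::vector<Sym> &Symbols, uint64_t SectionAddr,
                    const std::string &SectionName) {
  if (Symbols.empty()) {
    Symbols.push_back(Sym{SectionAddr, SectionName});
    return;
  }
  // Copy the first entry: insert() would invalidate a reference to it.
  const Sym First = Symbols.front();
  if (First.Addr != SectionAddr)
    Symbols.insert(Symbols.begin(),
                   Sym{SectionAddr,
                       First.Name + "-0x" + toHex(First.Addr - SectionAddr)});
}
```

With a single symbol bar@plt at 0x1020 and a section starting at 0x1010, the list gains a leading entry named bar@plt-0x10; an empty list gets the section name instead.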

ychen updated this revision to Diff 204791. · Edited · Jun 14 2019, 9:27 AM
  • address comments
  • add a few missing tests that need fixup
ychen marked 2 inline comments as done. · Jun 14 2019, 9:30 AM
MaskRay added inline comments. · Jun 16 2019, 7:17 PM
lld/test/ELF/ppc-relocs.s
45 ↗(On Diff #204791)

This is another example that demonstrates <section-offset> may not be aesthetically appealing.

I absolutely agree with @MaskRay that the GNU format is less than ideal, but the overwhelming feedback at last year's BoF on the topic was that we should match the output of GNU unless the output was wrong. If it were up to me, I'd just omit the reference entirely in these instances, since they don't really add any value!

Maybe this is another indication that we need separate GNU and LLVM styles for llvm-objdump, similar to llvm-readobj/llvm-readelf?

llvm/test/MC/AMDGPU/branch-comment.s
13

This one is particularly concerning. Does this case match GNU? I'd expect it to fold the two sums together (i.e. keep_symbol-0x4)

llvm/test/tools/llvm-objdump/X86/elf-disassemble-no-start-symbol.test
3

Nit: missing trailing full stop. It might be wise to quote the bit after "with the name".

llvm/tools/llvm-objdump/llvm-objdump.cpp
1197

if later -> if a later

ychen updated this revision to Diff 205189. · Jun 17 2019, 2:57 PM
  • Add the code and test to show that the dummy section start symbol should not be used to resolve branch targets in disassembly.
  • Update test and some comments.
ychen marked 4 inline comments as done. · Jun 17 2019, 3:05 PM
ychen added inline comments.
llvm/test/MC/AMDGPU/branch-comment.s
13

Thanks for pointing this out. GNU objdump does not use the dummy start symbol when resolving (folding) branch targets. Code updated. Added a test case for this: llvm/test/MC/X86/branch-comment.s.
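
To illustrate the behaviour described here, a minimal sketch of branch-target resolution that ignores the dummy start symbol. `LabelSym`, its `IsProxy` flag and `resolveTarget` are hypothetical names, not what the patch uses, and returning an empty string when no real symbol precedes the target is an assumption of this sketch.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: IsProxy marks a synthesized section-start entry.
struct LabelSym {
  uint64_t Addr;
  std::string Name;
  bool IsProxy;
};

// Render a branch target as "name" or "name+0x<off>" using the nearest
// real symbol at or below Target. Proxy entries are skipped. Returns an
// empty string if no real symbol qualifies (an assumption of this sketch).
std::string resolveTarget(const std::vector<LabelSym> &Sorted,
                          uint64_t Target) {
  const LabelSym *Best = nullptr;
  for (const LabelSym &S : Sorted) {
    if (S.Addr > Target)
      break; // the list is sorted by address
    if (!S.IsProxy)
      Best = &S;
  }
  if (!Best)
    return "";
  if (Best->Addr == Target)
    return Best->Name;
  std::ostringstream OS;
  OS << Best->Name << "+0x" << std::hex << (Target - Best->Addr);
  return OS.str();
}
```

With a proxy keep_symbol-0x4 at 0x0 and the real keep_symbol at 0x4, a branch to 0x8 is rendered as keep_symbol+0x4 and never as an offset from the proxy.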

ychen updated this revision to Diff 205190. · Jun 17 2019, 3:05 PM
ychen marked an inline comment as done.
  • add comment
jhenderson added inline comments. · Jun 18 2019, 3:46 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
1404

This check is rather horrid, if I'm honest, for all sorts of reasons (apart from anything else, ELF places no restrictions on symbol names so a valid symbol name could actually contain "-0x").

I think you need a different way of identifying these inserted symbols (assuming that it's not possible to just change when they are inserted). You might need a list of sections that have had symbols inserted, for example.
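
A small sketch of the two approaches, for comparison. The names `looksLikeProxyByName` and `ProxyTracker` are hypothetical, and keying the set by section index is an assumption; any stable section identifier would do.

```cpp
#include <cstdint>
#include <set>
#include <string>

// Fragile: ELF places no restrictions on symbol names, so a real symbol
// may legitimately contain "-0x" and be misclassified.
bool looksLikeProxyByName(const std::string &Name) {
  return Name.find("-0x") != std::string::npos;
}

// Hypothetical bookkeeping: remember which sections received an inserted
// start symbol instead of inspecting names afterwards.
class ProxyTracker {
  std::set<uint64_t> SectionsWithProxy; // keyed by section index (assumed)

public:
  void noteInserted(uint64_t SectionIndex) {
    SectionsWithProxy.insert(SectionIndex);
  }

  // The first symbol of a section is a proxy only if one was recorded.
  bool firstSymbolIsProxy(uint64_t SectionIndex) const {
    return SectionsWithProxy.count(SectionIndex) != 0;
  }
};
```

The name-based check reports a false positive for a real symbol called weird-0xname, while the tracker only answers for sections where an insertion was actually recorded.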

ychen updated this revision to Diff 205447. · Jun 18 2019, 3:15 PM
ychen marked an inline comment as done.
  • perform start symbol insertion before disassembling
ychen added inline comments. · Jun 18 2019, 3:16 PM
llvm/tools/llvm-objdump/llvm-objdump.cpp
1404

This check is rather horrid, if I'm honest, for all sorts of reasons (apart from anything else, ELF places no restrictions on symbol names so a valid symbol name could actually contain "-0x").

It is horrible. I was assuming there was a low chance of an ELF symbol name containing "-0x" before this start symbol insertion, but such a name is still legitimate.

I think you need a different way of identifying these inserted symbols (assuming that it's not possible to just change when they are inserted).

Yes, the insertion is just before disassembling so it could happen earlier but not later.

You might need a list of sections that have had symbols inserted, for example.

Did exactly this. I think it does the right thing.

ychen updated this revision to Diff 205472. · Jun 18 2019, 4:52 PM
  • update

riscv-objdump appears to use STT_SECTION symbols to symbolize addresses, e.g.:

Disassembly of section .plt:

000027d0 <.plt>:
    27d0:       00011397                auipc   t2,0x11

x86-64 and some others may not. I need to dig into the details. I also filed a binutils-gdb bug about this: https://sourceware.org/bugzilla/show_bug.cgi?id=24702

ychen added a comment. · Jun 18 2019, 9:18 PM

Thanks. That's great to know! Once it is confirmed from GNU side, I will update the patch to reflect that.

jhenderson added inline comments. · Jun 19 2019, 4:32 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
349

It's not clear from the function name what this function is supposed to be doing. Please at least add a comment, and also consider renaming the function. Perhaps shouldKeepForDisassembly?

1127

Perhaps worth calling this StartProxySymbols (or just ProxySymbols), since they aren't real symbols.

ychen updated this revision to Diff 205664. · Jun 19 2019, 1:45 PM
  • update
ychen marked 3 inline comments as done. · Jun 19 2019, 1:46 PM
jhenderson accepted this revision. · Edited · Jun 20 2019, 2:09 AM

Okay, LGTM, but @MaskRay probably should confirm he's happy enough for now (we can improve the STT_SECTION bit later, if we want).

llvm/tools/llvm-objdump/llvm-objdump.cpp
349

disassemble -> disassembly

This revision is now accepted and ready to land. · Jun 20 2019, 2:09 AM
MaskRay requested changes to this revision. · Jun 20 2019, 8:52 AM

Sorry for doing this. It is very late in my timezone, but I think I have to mark it as "Request Changes" as I saw it accepted and I really don't want it to be committed...

If you all agree bar2@plt-0x20 is less ideal and our .plt is better, I'm not sure why you want to copy the behavior of GNU objdump. I believe this belongs to the aesthetic area where 100% compatibility is not necessary. Programs/shell scripts parsing <bar@plt-0x20> would be extremely brittle and should be avoided. If the programs need addresses, they should just parse the address part, not the stuff in angle brackets.

I have found someone who agrees with me that firstsym-x is awful and the behavior should not be replicated. Let me check if I can get more opinions tomorrow.
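
As a rough illustration of the "parse the address part" advice, here is a sketch that extracts only the leading address from a GNU-style label line such as `0000000000006dd0 <memcpy@plt-0x20>:` and ignores whatever sits between the angle brackets. `parseLabelAddress` is a hypothetical helper, and the accepted line shape is an assumption based on the objdump output quoted in this review.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Parse the leading hex address of a label line of the assumed form
// "<hex-address> <label>:". The label text itself is never interpreted.
// Returns false for any other kind of line.
bool parseLabelAddress(const std::string &Line, uint64_t &Addr) {
  const char *Begin = Line.c_str();
  char *End = nullptr;
  errno = 0;
  unsigned long long V = std::strtoull(Begin, &End, 16);
  if (End == Begin || errno == ERANGE)
    return false;
  // Require " <" right after the address and ">:" at the end of the line,
  // which rejects instruction lines and "Disassembly of section" headers.
  const std::string Rest(End);
  if (Rest.size() < 4 || Rest.compare(0, 2, " <") != 0 ||
      Rest.compare(Rest.size() - 2, 2, ">:") != 0)
    return false;
  Addr = V;
  return true;
}
```

Both `<memcpy@plt-0x20>` and `<.plt>` label lines yield the same address, so a consumer written this way does not depend on which label style the tool prints.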

This revision now requires changes to proceed. · Jun 20 2019, 8:52 AM
ychen added a comment. · Jun 20 2019, 9:04 AM

I think the basis of this patch is to have output consistent with GNU objdump, so that it is easier for existing GNU users to migrate to the LLVM binutils. Your proposal is definitely an improvement in the functionality of the tool, but probably less ideal for the migration?

Like @jhenderson suggested, I assume it does not sacrifice anything if we land this first, then revisit the issue later when GNU makes the change as well? This makes sure we are always consistent with GNU.

Like @jhenderson suggested, I assume it does not sacrifice anything if we land this first, then revisit the issue later when GNU makes the change as well? This makes sure we are always consistent with GNU.

As I said, I don't know in what scenarios existing GNU objdump users cannot migrate to llvm-objdump just because of this incompatibility.

I am against the very idea of symbolizing an address with <firstsym - offset>. We all agree it is less ideal than <.section>, so I am not sure how this issue can be improved in the future. This patch would just create unnecessary churn if we later decide to change them back to .plt:

- // DISASM-NEXT: 0000000000201020 .plt:
+ // DISASM-NEXT: 0000000000201020 bar2@plt-0x10:

May I ask you to start a thread on https://lists.llvm.org/pipermail/llvm-dev/2019-June to discuss this?

May I ask you to start a thread on https://lists.llvm.org/pipermail/llvm-dev/2019-June to discuss this?

I agree that this needs a wider audience. @ychen, could you go ahead and start that email, please. Feel free to run a draft by me first if you want.

May I ask you to start a thread on https://lists.llvm.org/pipermail/llvm-dev/2019-June to discuss this?

I agree that this needs a wider audience. @ychen, could you go ahead and start that email, please. Feel free to run a draft by me first if you want.

I'll make sure I comment on the list when the email thread starts. Here are some more data points from some GCC 8 toolchains: Arm and AArch64 use <.plt> in the same way as RISC-V does. I would prefer this change not to be made for Arm and AArch64. Binutils often makes a lot of target-specific decisions, so we may find that there is little consistency in disassembly across targets.

aarch64-linux-gnu-objdump -d libgomp.so

Disassembly of section .plt:

0000000000007700 <.plt>:
    7700:       a9bf7bf0        stp     x16, x30, [sp, #-16]!
    7704:       f0000190        adrp    x16, 3a000 <__FRAME_END__+0xf3a4>
    7708:       f947fe11        ldr     x17, [x16, #4088]
    770c:       913fe210        add     x16, x16, #0xff8
    7710:       d61f0220        br      x17
    7714:       d503201f        nop
    7718:       d503201f        nop
    771c:       d503201f        nop

0000000000007720 <memcpy@plt>:
    7720:       900001b0        adrp    x16, 3b000 <memcpy@GLIBC_2.17>
    7724:       f9400211        ldr     x17, [x16]
    7728:       91000210        add     x16, x16, #0x0
    772c:       d61f0220        br      x17

arm-linux-gnueabihf-objdump -d libgomp.so

Disassembly of section .plt:

0000602c <.plt>:
    602c:       e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
    6030:       e59fe004        ldr     lr, [pc, #4]    ; 603c <.plt+0x10>
    6034:       e08fe00e        add     lr, pc, lr
    6038:       e5bef008        ldr     pc, [lr, #8]!
    603c:       00030fc4        .word   0x00030fc4

00006040 <calloc@plt>:
    6040:       e28fc600        add     ip, pc, #0, 12
    6044:       e28cca30        add     ip, ip, #48, 20 ; 0x30000
    6048:       e5bcffc4        ldr     pc, [ip, #4036]!        ; 0xfc4

@MaskRay, @peter.smith, is it just the PLT that is specially handled? Perhaps we could just special-case that in llvm-objdump? I think there's clear precedent for this in other versions of objdump, so I'm okay with that difference in behaviour.

It seems to me like other cases of this syntax appearing are going to be very rare overall.

@MaskRay, @peter.smith, is it just the PLT that is specially handled? Perhaps we could just special-case that in llvm-objdump? I think there's clear precedent for this in other versions of objdump, so I'm okay with that difference in behaviour.

It seems to me like other cases of this syntax appearing are going to be very rare overall.

From a cursory scan of https://github.com/bminor/binutils-gdb/blob/master/binutils/objdump.c#L749, it looks like .plt and .got are treated specially by objdump and can remain candidates for disassembly. I agree that it is mostly linker-generated content, such as .plt sections, that is likely not to have a code symbol at address 0.

ychen added a comment. · Jun 21 2019, 9:34 AM

@MaskRay @jhenderson @peter.smith Thank you for the comments. I'll start a thread to collect more opinions.

ychen added a comment. · Jun 21 2019, 4:01 PM

To help write up the discussion thread, I'm trying to gather some data on how GNU objdump handles this case for each target. Weirdly, I could not reproduce the results that @MaskRay and @peter.smith were able to obtain. This is not to show favor for either choice but to understand the current situation on the GNU side so we can make a sensible decision.

➜  Bld lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.2 LTS
Release:        18.04
Codename:       bionic

risc-v

➜  Bld /usr/riscv64-linux-gnu/bin/objdump --version
GNU objdump (GNU Binutils for Ubuntu) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
➜  Bld file /usr/riscv64-linux-gnu/lib/libgomp.so.1.0.0
/usr/riscv64-linux-gnu/lib/libgomp.so.1.0.0: ELF 64-bit LSB shared object, UCB RISC-V, version 1 (SYSV), dynamically linked, BuildID[sha1]=5af4a2eff63e94f36b40ad145b1c281908cfe8f9, stripped
➜  Bld /usr/riscv64-linux-gnu/bin/objdump -d /usr/riscv64-linux-gnu/lib/libgomp.so.1.0.0 | head -n 20

/usr/riscv64-linux-gnu/lib/libgomp.so.1.0.0:     file format elf64-littleriscv


Disassembly of section .plt:

0000000000006dc0 <pthread_attr_setdetachstate@plt-0x20>:
    6dc0:       00016397                auipc   t2,0x16
    6dc4:       41c30333                sub     t1,t1,t3
    6dc8:       4203be03                ld      t3,1056(t2) # 1d1e0 <acc_is_present_array_h_@@OACC_2.0+0x3046>
    6dcc:       fd430313                addi    t1,t1,-44
    6dd0:       42038293                addi    t0,t2,1056
    6dd4:       00135313                srli    t1,t1,0x1
    6dd8:       0082b283                ld      t0,8(t0)
    6ddc:       000e0067                jr      t3

0000000000006de0 <pthread_attr_setdetachstate@plt>:
    6de0:       00016e17                auipc   t3,0x16
    6de4:       410e3e03                ld      t3,1040(t3) # 1d1f0 <pthread_attr_setdetachstate@GLIBC_2.27>
    6de8:       000e0367                jalr    t1,t3

aarch64

➜  Bld /usr/aarch64-linux-gnu/bin/objdump --version
GNU objdump (GNU Binutils for Ubuntu) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
➜  Bld file /usr/aarch64-linux-gnu/lib/libgomp.so.1.0.0
/usr/aarch64-linux-gnu/lib/libgomp.so.1.0.0: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, BuildID[sha1]=1106de22dfcbfbef7445bbcbe38479bcec8b8780, stripped
➜  Bld /usr/aarch64-linux-gnu/bin/objdump -d /usr/aarch64-linux-gnu/lib/libgomp.so.1.0.0 | head -n 30

/usr/aarch64-linux-gnu/lib/libgomp.so.1.0.0:     file format elf64-littleaarch64


Disassembly of section .init:

0000000000006db8 <.init>:
    6db8:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
    6dbc:       910003fd        mov     x29, sp
    6dc0:       94000a77        bl      979c <acc_async_test_all@plt+0x246c>
    6dc4:       a8c17bfd        ldp     x29, x30, [sp], #16
    6dc8:       d65f03c0        ret

Disassembly of section .plt:

0000000000006dd0 <memcpy@plt-0x20>:
    6dd0:       a9bf7bf0        stp     x16, x30, [sp, #-16]!
    6dd4:       b00001b0        adrp    x16, 3b000 <acc_is_present_array_h_@@OACC_2.0+0x17a3c>
    6dd8:       f947fe11        ldr     x17, [x16, #4088]
    6ddc:       913fe210        add     x16, x16, #0xff8
    6de0:       d61f0220        br      x17
    6de4:       d503201f        nop
    6de8:       d503201f        nop
    6dec:       d503201f        nop

0000000000006df0 <memcpy@plt>:
    6df0:       d00001b0        adrp    x16, 3c000 <memcpy@GLIBC_2.17>
    6df4:       f9400211        ldr     x17, [x16]
    6df8:       91000210        add     x16, x16, #0x0
    6dfc:       d61f0220        br      x17

Test with GNU binutils master branch just in case:

➜  Bld ~/Src/binutils-bld/binutils/objdump -d /usr/riscv64-linux-gnu/lib/libgomp.so.1.0.0 | head -n 20

/usr/riscv64-linux-gnu/lib/libgomp.so.1.0.0:     file format elf64-littleriscv


Disassembly of section .plt:

0000000000006dc0 <pthread_attr_setdetachstate@plt-0x20>:
    6dc0:       00016397                auipc   t2,0x16
    6dc4:       41c30333                sub     t1,t1,t3
    6dc8:       4203be03                ld      t3,1056(t2) # 1d1e0 <acc_is_present_array_h_@@OACC_2.0+0x3046>
    6dcc:       fd430313                addi    t1,t1,-44
    6dd0:       42038293                addi    t0,t2,1056
    6dd4:       00135313                srli    t1,t1,0x1
    6dd8:       0082b283                ld      t0,8(t0)
    6ddc:       000e0067                jr      t3

0000000000006de0 <pthread_attr_setdetachstate@plt>:
    6de0:       00016e17                auipc   t3,0x16
    6de4:       410e3e03                ld      t3,1040(t3) # 1d1f0 <pthread_attr_setdetachstate@GLIBC_2.27>
    6de8:       000e0367                jalr    t1,t3


➜  Bld ~/Src/binutils-bld/binutils/objdump -d /usr/aarch64-linux-gnu/lib/libgomp.so.1.0.0 | head -n 30

/usr/aarch64-linux-gnu/lib/libgomp.so.1.0.0:     file format elf64-littleaarch64


Disassembly of section .init:

0000000000006db8 <.init>:
    6db8:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
    6dbc:       910003fd        mov     x29, sp
    6dc0:       94000a77        bl      979c <acc_async_test_all@plt+0x246c>
    6dc4:       a8c17bfd        ldp     x29, x30, [sp], #16
    6dc8:       d65f03c0        ret

Disassembly of section .plt:

0000000000006dd0 <memcpy@plt-0x20>:
    6dd0:       a9bf7bf0        stp     x16, x30, [sp, #-16]!
    6dd4:       b00001b0        adrp    x16, 3b000 <acc_is_present_array_h_@@OACC_2.0+0x17a3c>
    6dd8:       f947fe11        ldr     x17, [x16, #4088]
    6ddc:       913fe210        add     x16, x16, #0xff8
    6de0:       d61f0220        br      x17
    6de4:       d503201f        nop
    6de8:       d503201f        nop
    6dec:       d503201f        nop

0000000000006df0 <memcpy@plt>:
    6df0:       d00001b0        adrp    x16, 3c000 <memcpy@GLIBC_2.17>
    6df4:       f9400211        ldr     x17, [x16]
    6df8:       91000210        add     x16, x16, #0x0
    6dfc:       d61f0220        br      x17

To help write up the discussion thread, I'm trying to gather some data on how GNU objdump handles this case for each target. Weirdly, I could not reproduce the results that @MaskRay and @peter.smith were able to obtain. This is not to show favor for either choice but to understand the current situation on the GNU side so we can make a sensible decision.

It looks like there was a fairly recent change in binutils; my version was 2.32. An older version I happen to have lying around, 2.25, seems to match the behaviour you are seeing. If I had to guess, I'd say it was this change in binutils: https://sourceware.org/bugzilla/show_bug.cgi?id=22911

jimw added a subscriber: jimw. · Jun 27 2019, 5:39 PM

This one had me confused for a while, but after a bit of experimenting I figured out that the problem is strip.

hifiveu017:1075$ cat tmp.c
extern int sub2 (void);
int sub (void) { return sub2 (); }
hifiveu017:1076$ gcc --shared -fpic -O -o tmp.so tmp.c
hifiveu017:1077$ objdump -d tmp.so | head

tmp.so: file format elf64-littleriscv

Disassembly of section .plt:

00000000000003d0 <.plt>:
3d0: 00002397 auipc t2,0x2
3d4: 41c30333 sub t1,t1,t3
3d8: c303be03 ld t3,-976(t2) # 2000 <__TMC_END__>
hifiveu017:1078$ strip tmp.so
hifiveu017:1079$ objdump -d tmp.so | head

tmp.so: file format elf64-littleriscv

Disassembly of section .plt:

00000000000003d0 <sub2@plt-0x20>:
3d0: 00002397 auipc t2,0x2
3d4: 41c30333 sub t1,t1,t3
3d8: c303be03 ld t3,-976(t2) # 2000 <sub@@Base+0x1b6e>
hifiveu017:1080$

MaskRay is looking at system libraries which have all been stripped. I've been looking at libraries built as part of the toolchain build, which are not stripped. The difference between the stripped and unstripped file is that the unstripped file has a section symbol for each section. Readelf --syms shows it as

9: 00000000000003d0     0 SECTION LOCAL  DEFAULT    9

and nm shows it as
00000000000003d0 l d .plt 0000000000000000 .plt
The stripped file doesn't have the section symbols.

Objdump always uses the nearest symbol, so if you have the section symbols, then the output starts with that symbol name. This was mentioned near the top, when discussing
Disassembly of section .text:
0000000000001000 .text:
The first .text is the section, the second one is the section symbol. And that section symbol disappears when you strip the file, if it isn't needed by something else like a dynamic reloc.

So the issue here is that the objdump output is a little confusing when run on stripped files. I don't think that is a bug. If you want good objdump output, don't strip the files.

There is a secondary issue that the first plt in the plt section does not have a symbol of its own, and certainly not one that will survive strip, but I don't think that is a bug either.

Perhaps objdump could be extended to create synthetic section symbols if they don't exist, to get better output for stripped files, but that would be an enhancement not a bug fix.
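
The effect of strip can be modelled with a small nearest-symbol lookup. This is a simplified sketch of the behaviour described in this comment, not binutils' actual algorithm; `NamedAddr` and `labelFor` are hypothetical names, and the list is assumed to hold only the symbols of one section, sorted by address.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one symbol in a single section.
struct NamedAddr {
  uint64_t Addr;
  std::string Name;
};

// Label an address with the nearest symbol: the closest one at or below
// Addr if there is one ("name" or "name+0x<off>"), otherwise the first
// symbol above it ("name-0x<off>"). Returns "" for an empty list.
std::string labelFor(const std::vector<NamedAddr> &Sorted, uint64_t Addr) {
  if (Sorted.empty())
    return "";
  const NamedAddr *Prev = nullptr;
  const NamedAddr *Next = nullptr;
  for (const NamedAddr &S : Sorted) {
    if (S.Addr <= Addr) {
      Prev = &S;
    } else {
      Next = &S;
      break;
    }
  }
  std::ostringstream OS;
  if (Prev) {
    OS << Prev->Name;
    if (Prev->Addr != Addr)
      OS << "+0x" << std::hex << (Addr - Prev->Addr);
  } else {
    // Non-empty list with nothing at or below Addr: Next is the first entry.
    OS << Next->Name << "-0x" << std::hex << (Next->Addr - Addr);
  }
  return OS.str();
}
```

Using the addresses from the example above, a list holding the .plt section symbol at 0x3d0 and sub2@plt at 0x3f0 labels 0x3d0 as .plt, while the stripped list containing only sub2@plt labels it as sub2@plt-0x20.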

ychen abandoned this revision. · Jun 28 2022, 2:48 PM
Herald added a project: Restricted Project. · Jun 28 2022, 2:48 PM