Page MenuHomePhabricator

Handle absolute symbols as branch targets in disassembly.

Authored by saugustine on Jun 25 2018, 11:10 AM.



Certain tests and embedded systems use sectionless symbols
as the target of branches or calls. Handle them in the disassembly.

Diff Detail

Event Timeline

saugustine created this revision.Jun 25 2018, 11:10 AM

Add missing test file.

Add missing newline.

Restore missing files.

Small update to fix segfault in llvm test.

MaskRay added inline comments.Jun 25 2018, 3:41 PM

Should this test be synthesized from a .s file using llvm-mc?

MaskRay added inline comments.Jun 25 2018, 10:22 PM

I find that on AArch64 PowerPC x86_64 ... all these call/branch instructions print the immediate value without taking account of PC (what they print are relative addresses, instead of absolute addresses as printed by objdump)

The PowerPC one at least appends .+ before the immediate to tell you it is a relative address.
void PPCInstPrinter::printBranchOperand(const MCInst *MI, unsigned OpNo,

                                      raw_ostream &O) {
if (!MI->getOperand(OpNo).isImm())
  return printOperand(MI, OpNo, O);

// Branches can take an immediate operand.  This is used by the branch
// selection pass to print .+8, an eight byte displacement from the PC.
O << ".+";
printAbsBranchOperand(MI, OpNo, O);



Reasonable behavior. I have verified that objdump does something similar (foo + 4). It has one extra heuristic: if the target address is less than the absolute symbol of the smallest address, it can print something like foo - 4. But I think we don't need such heuristic.

saugustine marked an inline comment as done.Jun 26 2018, 10:09 AM
saugustine added inline comments.

This is a good question, and I debated. But only about five of the ~100 files in llbm-objdump/Inputs are normal ASCII source of one kind or another. The rest are binaries. So I went that way. Totally willing to change it if you think it is important.


This patch doesn't change the printed value, it just adds the symbol it targets. This is sufficient for my needs, and I'm not sure what the dependencies of the current format are, or if it is a principled decision.

Before this patch:

201000: e8 fb f0 df ff callq -2100997


201000: e8 fb f0 df ff callq -2100997 <foo>

MaskRay added inline comments.Jun 26 2018, 10:22 AM

Adding @enderby @ruiu who have touched test/Object/Inputs/* binaries.

There are also utilities obj2yaml and yaml2obj (introduced in 2012) to construct tests involving binary files. I wonder why the tests here are organized in such a way.


I should be clear that my comment was asking why pc-relative addresses are printed in the current way... This patch definitely improves the view so you can just ignore my comment..

echristo added inline comments.Jun 26 2018, 2:19 PM

I can elaborate here:

  • Some of the tests predate obj2yaml/yaml2obj - especially since it didn't see widespread use until recently.
  • For some tests you'll definitely want an existing object file because you've either got malformed output that you were trying to look at, or something that might change in the future.
  • Otherwise a .s file is fine too.

My preference here, I think, is going to be:

  1. yaml2obj if it's simple
  2. .s file if possible
  3. checked in object file

Almost assuredly no particular planning on the format of objdump output here.

Switch from binary file as input to assembly in line in the test.

saugustine marked an inline comment as done.Jun 27 2018, 2:54 PM
saugustine added inline comments.

It turns out that yaml2obj doesn't handle absolute symbols cleanly either. So I've just converted the very simple input to inline text in the test itself.

echristo accepted this revision.Jun 27 2018, 2:54 PM

I'm ok here. Ray?

This revision is now accepted and ready to land.Jun 27 2018, 2:54 PM
MaskRay accepted this revision.Jun 28 2018, 9:51 AM
MaskRay added inline comments.

Nit: trailing \n should be unnecessary.

saugustine closed this revision.Jun 28 2018, 1:04 PM
saugustine marked an inline comment as done.

This is committed as r335903, with a small update to fix targets that don't have ld.lld available in r335908.

grimar added a subscriber: grimar.Jun 28 2018, 11:20 PM
grimar added inline comments.

btw, I think \n can be avoided here with the use of ;.
Here is how we do such things in LLD tests:

# RUN: echo '.section .init_array, "aw"; .quad 0' | llvm-mc -filetype=obj -triple=x86_64-unknown-linux - -o %t.o

It perhaps looks a bit cleaner.

I see that you switched to use the binary file in rL335908. I think it is a better way because does not seem
that adding a dependency to ld.lld was good for llvm-objdump tests.
That is also consistent with another test we have which is test\tools\llvm-objdump\X86\phdrs.test.

A good practice is to add the description of how to rebuild the binary object used.
Because in some rare cases it might be needed, but without such descriptions, it can be a painful task.
phdrs.test, for example, contains such description. May I request you to add one too, please?