This is an archive of the discontinued LLVM Phabricator instance.

[objdump][ARM] Use correct offset when printing ARM/Thumb branch targets
ClosedPublic

Authored by ostannard on Mar 2 2021, 5:58 AM.

Details

Summary

llvm-objdump only uses one MCInstrAnalysis object, so if ARM and Thumb
code is mixed in one object, or if an object is disassembled without
explicitly setting the triple to match the ISA used, then branch and
call targets will be printed incorrectly.

This could be fixed by creating two MCInstrAnalysis objects in
llvm-objdump, like we currently do for SubtargetInfo. However, I don't
think there's any reason we need two separate sub-classes of
MCInstrAnalysis, so instead these can be merged into one, and the ISA
determined by checking the opcode of the instruction.

Diff Detail

Event Timeline

ostannard created this revision.Mar 2 2021, 5:58 AM
ostannard requested review of this revision.Mar 2 2021, 5:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2021, 5:58 AM
ostannard updated this revision to Diff 327463.Mar 2 2021, 8:17 AM
  • Test with both ARM and Thumb triples
  • Fix failing lld test, where this bug was causing the wrong symbols to be displayed
simon_tatham added inline comments.Mar 2 2021, 9:36 AM
llvm/lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp
423

Is it possible to find out from the MCInstrDesc whether something's an Arm or a Thumb instruction? If so, you could set Offset based on that, and then you wouldn't have to get it right in each individual switch case, including future updates.

It looks as if (Desc.TSFlags & FormMask) == ThumbFrm ought to do that, but that is based on quite a quick look, so I might be wrong about what it means.

MaskRay added inline comments.Mar 2 2021, 6:10 PM
llvm/test/tools/llvm-objdump/ELF/ARM/branch-symbols.s
2

Tip: --no-show-raw-insn to omit hex bytes. They are separately tested by test/MC/ARM encoding tests, so no need to duplicate in llvm-objdump tests.

11

Another pain is that arm prints relative offsets, instead of absolute addresses.
Is anyone going to claim the work?

ostannard marked 2 inline comments as done.Mar 4 2021, 1:53 AM
ostannard added inline comments.
llvm/test/tools/llvm-objdump/ELF/ARM/branch-symbols.s
11

I'm not planning on changing that, and I'm not sure that would be the right thing to do given that the assembler interprets immediates as relative offsets.

ostannard updated this revision to Diff 328076.Mar 4 2021, 1:54 AM
  • Use TSFlags to check for thumb instructions, and walk operands to find the immediate, to avoid needing to keep the switch statement up to date
  • Don't include instruction encodings in test
simon_tatham accepted this revision.Mar 4 2021, 2:22 AM

Very nice – it hadn't even occurred to me that you might be able to find the branch-target operand automatically as well as identifying Arm/Thumb.

This revision is now accepted and ready to land.Mar 4 2021, 2:22 AM
This revision was landed with ongoing or failed builds.Mar 4 2021, 3:17 AM
This revision was automatically updated to reflect the committed changes.