This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Avoid using mapping symbols as branch target labels
Closed · Public

Authored by krisb on Dec 1 2022, 11:17 AM.

Details

Summary

The main motivation for this change is to avoid ambiguity: mapping symbol
names may not be unique across a binary, so they cannot uniquely identify
a target address. Using mapping symbols as branch target labels therefore
makes llvm-objdump output less readable.

Another point is that mapping symbols sometimes appear in
non-allocatable sections, such as debug info sections, which makes objdump
output even more confusing.

For example, a small AArch64 executable would contain plenty of $d[.*]
symbols and none of them would be useful as a label (base) for resolving
a branch or a memory operand target address:

0000000000000254 l       .note.ABI-tag        0000000000000000 $d
00000000000008d4 l       .eh_frame            0000000000000000 $d
0000000000000868 l       .rodata              0000000000000000 $d
0000000000011028 l       .data                0000000000000000 $d
0000000000010db8 l       .fini_array          0000000000000000 $d
0000000000010db0 l       .init_array          0000000000000000 $d
00000000000008e8 l       .eh_frame            0000000000000000 $d
0000000000011034 l       .bss                 0000000000000000 $d
0000000000000165 l       .debug_abbrev        0000000000000000 $d.1
0000000000000553 l       .debug_info          0000000000000000 $d.2
0000000000000000 l       .debug_str_offsets   0000000000000000 $d.3
000000000000039c l       .debug_str           0000000000000000 $d.4
0000000000000000 l       .debug_addr          0000000000000000 $d.5
00000000000000c1 l       .comment             0000000000000000 $d.6
0000000000000948 l       .eh_frame            0000000000000000 $d.7
00000000000001d4 l       .debug_line          0000000000000000 $d.8
0000000000000000 l       .debug_line_str      0000000000000000 $d.9
0000000000011030 l       .data                0000000000000000 $d.1
00000000000001ac l       .debug_abbrev        0000000000000000 $d.2
00000000000005a1 l       .debug_info          0000000000000000 $d.3
0000000000000024 l       .debug_str_offsets   0000000000000000 $d.4
000000000000039c l       .debug_str           0000000000000000 $d.5
0000000000000010 l       .debug_addr          0000000000000000 $d.6
00000000000000c1 l       .comment             0000000000000000 $d.7
0000000000000948 l       .eh_frame            0000000000000000 $d.8
0000000000000258 l       .debug_line          0000000000000000 $d.9
00000000000009a0 l       .eh_frame            0000000000000000 $d

Note that GNU objdump does not use mapping symbols as branch target
labels on any of the targets that support such symbols (ARM, AArch64, CSKY).
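
To make the ambiguity concrete, here is a minimal Python sketch (not the code from this patch) that recognizes mapping symbols by name and reports names that occur more than once. The naming rule used here ($a, $d, $t or $x, optionally followed by a period and arbitrary text) and the helper names is_mapping_symbol and ambiguous_names are assumptions made for illustration; the sample rows are a few entries copied from the listing above.

```python
import re
from collections import Counter

# Assumed naming rule: "$a", "$d", "$t" or "$x", optionally followed by
# ".<anything>". This is an illustrative predicate, not llvm-objdump's own.
_MAPPING_NAME = re.compile(r"\$[adtx](\..*)?")


def is_mapping_symbol(name: str) -> bool:
    """Return True if the name looks like an ARM/AArch64-style mapping symbol."""
    return _MAPPING_NAME.fullmatch(name) is not None


# (address, section, name) rows taken from the symbol listing in the summary.
SYMBOLS = [
    (0x254, ".note.ABI-tag", "$d"),
    (0x8D4, ".eh_frame", "$d"),
    (0x11028, ".data", "$d"),
    (0x165, ".debug_abbrev", "$d.1"),
    (0x11030, ".data", "$d.1"),
]


def ambiguous_names(symbols):
    """Return the mapping symbol names that appear at more than one address."""
    counts = Counter(name for _, _, name in symbols if is_mapping_symbol(name))
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Both "$d" and "$d.1" are reported: neither identifies a single address.
    print(ambiguous_names(SYMBOLS))
```

Even in this small sample, neither name maps to a single address, so a label built from one of them would not tell the reader where the branch goes.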

Event Timeline

krisb created this revision. Dec 1 2022, 11:17 AM
krisb requested review of this revision. Dec 1 2022, 11:17 AM
Herald added a project: Restricted Project. Dec 1 2022, 11:17 AM
krisb updated this revision to Diff 479580. Dec 2 2022, 3:17 AM

Fix lld test

ikudrin accepted this revision. Dec 2 2022, 3:58 AM

LGTM. Please give some time for other reviewers to react.

This revision is now accepted and ready to land. Dec 2 2022, 3:58 AM
MaskRay accepted this revision. Dec 2 2022, 9:36 AM
simon_tatham accepted this revision. Dec 5 2022, 1:20 AM

Yes, I agree too – definitely a good change.

I especially like the part where a mapping symbol isn't used even if it's the only symbol at that address at all, as seen in one of the modified tests. functionname+0x34 really is an improvement on $x.12!
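
The effect described in that comment can be sketched roughly as follows. This is a simplified Python illustration, not llvm-objdump's actual lookup, which also has to account for sections and symbol types. The helper label_for, the assumed mapping symbol naming rule, and the example addresses are all hypothetical.

```python
import bisect
import re

# Assumed naming rule for mapping symbols: "$a", "$d", "$t" or "$x",
# optionally followed by ".<anything>".
_MAPPING_NAME = re.compile(r"\$[adtx](\..*)?")


def is_mapping_symbol(name: str) -> bool:
    return _MAPPING_NAME.fullmatch(name) is not None


def label_for(target, symbols):
    """Pick a label for a target address, ignoring mapping symbols.

    symbols is a list of (address, name) pairs. The result is "name" for an
    exact match, "name+0x<offset>" for the nearest preceding symbol, or None
    if no non-mapping symbol precedes the target.
    """
    candidates = sorted((a, n) for a, n in symbols if not is_mapping_symbol(n))
    addresses = [a for a, _ in candidates]
    # Index of the last candidate whose address is <= target.
    i = bisect.bisect_right(addresses, target) - 1
    if i < 0:
        return None
    address, name = candidates[i]
    offset = target - address
    return name if offset == 0 else f"{name}+{offset:#x}"


if __name__ == "__main__":
    # Hypothetical addresses: "$x.12" is the only symbol at 0x1034.
    symbols = [(0x1000, "functionname"), (0x1000, "$x.11"), (0x1034, "$x.12")]
    print(label_for(0x1034, symbols))  # functionname+0x34
```

Because the mapping symbol is filtered out before the lookup, the target falls back to the enclosing function plus an offset, even though $x.12 is the only symbol at that exact address.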