This is an archive of the discontinued LLVM Phabricator instance.

[SYCL]Fix bug: objdump find symbol error on adrp instruction when imm < 0 in arm64
Needs ReviewPublic

Authored by JuunChen on Sep 20 2020, 2:50 AM.

Details

Summary

Problem

I used the same llvm-objdump -S -m --section=<section name> <macho-file> command for two different macho files to see the disassembly code.
But the annotation for the assembly instruction can't show for the second file, as shown on the right.

Here is the two files which you can test:
the two macho files.zip

Reason

I found this to be a bug if the adrp's imm < 0, at the followed lines:

adrp_imm = ((info->adrp_inst & 0x00ffffe0) >> 3) | ((info->adrp_inst >> 29) & 0x3);
if (info->adrp_inst & 0x0200000)
     adrp_imm |= 0xfffffffffc000000LL;

The line adrp_imm = ((info->adrp_inst & 0x00ffffe0) >> 3) | ((info->adrp_inst >> 29) & 0x3) find the adrp_imm in adrp_inst, it's right.
And the next two lines are intended to:

  1. Determine if adrp_imm is a negative number
  2. if the adrp_imm is negative , adrp_imms's 64-bit complement is calculated

However, as shown in the picture:


adrp_imm is encoded as [23:5][31:29].

If you want to determine if adrp_imm is a negative number, you should determine the 23rd bit of adrp_inst, or the 20th bit of adrp_imm.
It will not be info->adrp_inst & 0x0200000. The 0x0200000 is 0b00000000001000000000000000000000,the code is to determine the 21st of adrp_inst,so it doesn't make any sense.
This code adrp_imm |= 0xfffffffffc000000LL is also wrong, it caculated the wrong bits.

Fix

Use the follow code to fix:

if (adrp_imm & (1 << (21 - 1)))
       adrp_imm |= ~((1LL << 21) - 1);

validation

After the correction, I got the result I wanted.

Diff Detail

Event Timeline

JuunChen created this revision.Sep 20 2020, 2:50 AM
Herald added a reviewer: MaskRay. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
JuunChen requested review of this revision.Sep 20 2020, 2:50 AM
JuunChen added a comment.EditedSep 20 2020, 3:56 AM

I created a new differential, how to close this one?

"Add Action..." -> "Close Revision". However, if you just want to update the existing Differential, you can click "Upload Diff" or use arc diff 'HEAD^' (if the last commit has the Differential Revision: string connecting to this differential.

MaskRay added subscribers: enderby, pete, lhames.EditedSep 20 2020, 9:32 AM

Thanks for identifying the bug. SymbolierSymbolLookup is used as a parameter of llvm::Target::createMCSymbolizer. The information is very low level and duplicates the normal operand decoding logic by lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp DecodeAdrInstruction. The latter is used ubiquitously while the former (SymbolierSymbolLookup) is used in very few places and less tested. There are only 3 tests failing if I change the argument of createMCSymbolizer to nullptr:

Failed Tests (3):                                                              
  LLVM :: CodeGen/ARM/local-call.ll                                            
  LLVM :: MC/ARM/aligned-blx.s                                                                                                            
  LLVM :: tools/llvm-objdump/MachO/ARM/symbolized-disassembly.test

The purpose of SymbolierSymbolLookup is to symbolize a branch address. However, for that purpose, we have another (more recent) symbolization in llvm-objdump.cpp: D77853.

I suggest we delete SymbolierSymbolLookup and add ARM symbolization like what I have done to x86/PPC/AArch64 (I don't know enough about ARM and don't have time to investigate it)
@enderby was the original author of MachODump.cpp but he has been inactive for a while. CC @ab (author of rL182625), @lhames and @pete who may know who can pick up the work for ARM and fix the symbolization issue.

MaskRay added a subscriber: ab.Sep 20 2020, 9:35 AM
This comment was removed by JuunChen.
JuunChen edited the summary of this revision. (Show Details)Sep 20 2020, 6:50 PM
JuunChen updated this revision to Diff 293048.Sep 20 2020, 6:56 PM

Reformat Code

JuunChen retitled this revision from fix bug when adrp_imm < 0 to [SYCL]Fix bug: objdump find symbol error on adrp instruction when imm < 0 in arm64.Sep 20 2020, 6:58 PM