This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Allow delayed decision for ADD/SUB relocations
ClosedPublic

Authored by MaskRay on Jul 14 2023, 11:45 PM.

Details

Summary

For a label difference A-B in assembly, if A and B are separated by a
linker-relaxable instruction, we should emit a pair of ADD/SUB
relocations (e.g. R_RISCV_ADD32/R_RISCV_SUB32,
R_RISCV_ADD64/R_RISCV_SUB64).

However, the decision is made upfront at parsing time with inadequate
heuristics (requiresFixup). As a result, LLVM integrated assembler
incorrectly suppresses R_RISCV_ADD32/R_RISCV_SUB32 for the following
code:

// Simplified from a workaround https://android-review.googlesource.com/c/platform/art/+/2619609
// Both end and begin are not defined yet. We decide ADD/SUB relocations upfront and don't know they will be needed.
.4byte end-begin

begin:
  call foo
end:

To fix the bug, make two primary changes:

  • Delete requiresFixups and the overridden emitValueImpl (from D103539). This deletion requires accurate evaluateAsAbolute (D153097).
  • In MCAssembler::evaluateFixup, call handleAddSubRelocations to emit ADD/SUB relocations.

However, there is a remaining issue in
MCExpr.cpp:AttemptToFoldSymbolOffsetDifference. With MCAsmLayout, we may
incorrectly fold A-B even when A and B are separated by a
linker-relaxable instruction. This deficiency is acknowledged (see
D153097), but was previously bypassed by eagerly emitting ADD/SUB using
requiresFixups. To address this, we partially reintroduce canFold (from
D61584, removed by D103539).

Some expressions (e.g. .size and .fill) need to take the MCAsmLayout
code path in AttemptToFoldSymbolOffsetDifference, avoiding relocations
(weird, but matching GNU assembler and needed to match user
expectation). Switch to evaluateKnownAbsolute to leverage the InSet
condition.

As a bonus, this change allows for the removal of some relocations for
the FDE address_range field in the .eh_frame section.

riscv64-64b-pcrel.s contains the main test.
Add a linker relaxable instruction to dwarf-riscv-relocs.ll to test what
it intends to test.
Merge fixups-relax-diff.ll into fixups-diff.ll.

Diff Detail

Event Timeline

MaskRay created this revision.Jul 14 2023, 11:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2023, 11:45 PM
MaskRay requested review of this revision.Jul 14 2023, 11:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2023, 11:45 PM
MaskRay updated this revision to Diff 542341.Jul 20 2023, 12:43 AM

rebase after llvm-readobj patch landed

kito-cheng accepted this revision.Jul 20 2023, 1:11 AM

LGTM, requiresFixups is the one of the hard to maintain and hard to understand function around there, and this patch removed that and handled it in better way :)

This revision is now accepted and ready to land.Jul 20 2023, 1:11 AM
asb added a comment.Jul 20 2023, 7:20 AM

One piece of this I'm struggling to follow a bit is evaluateKnownAbsolute and InSet. This InSet parameter doesn't have much in the way of documentation. I'm sure it's obvious to someone working in this code regularly, but could you please clarify the intended meaning of this parameter?

One piece of this I'm struggling to follow a bit is evaluateKnownAbsolute and InSet. This InSet parameter doesn't have much in the way of documentation. I'm sure it's obvious to someone working in this code regularly, but could you please clarify the intended meaning of this parameter?

Even if I have stared at the code for quite some time, InSet is not so obvious to me...

My current understanding is:

Some expressions cannot use relocations (e.g., .size and .fill directives). They set InSet as a parameter in various methods in llvm/lib/MC/.
This makes expression evaluation different based on the context and should be generally avoided if possible.
This issue is aggravated by the fact that LLVMMC seems to misuse InSet and make InSet affect the behavior in situations where it doesn't need to.
Unfortunately, the uses are a bit messy. Additionally, InSet appears to be used as a workaround for Mach-O in some places, specifically related to its .subsections_via_symbols mechanism.


We can probably add this explanation above MCExpr::evaluateAsAbsolute. In some situations InSet can be removed. I'll try to clean this up further when I get more time.

I'll land this tomorrow:)

This revision was landed with ongoing or failed builds.Jul 21 2023, 8:38 AM
This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/RISCV/fixups-diff.ll