Page MenuHomePhabricator

RISCV: adjust handling of relocation emission for RISCV

Authored by compnerd on Jun 2 2021, 10:49 AM.



This re-architects the RISCV relocation handling to bring the
implementation closer in line with the implementation in binutils. We
would previously aggressively resolve the relocation. With this
restructuring, we always will emit a paired relocation for any symbolic
difference of the type of S±T[±C] where S and T are labels and C is a

GAS has a special target hook controlled by RELOC_EXPANSION_POSSIBLE
which indicates that a fixup may be expanded into multiple relocations.
This is used by the RISCV backend to always emit a paired relocation -
either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] +
SUB[WIDTH] for a debug info relocation. Irrespective of whether linker
relaxation support is enabled, symbolic difference is always emitted as
a paired relocation.

This change also sinks the target specific behaviour down into the
target specific area rather than exposing it to the shared relocation
handling. In the process, we also sink the "special" handling for debug
information down into the RISCV target. Although this improves the path
for the other targets, this is not necessarily entirely ideal either.
The changes in the debug info emission could be done through another
type of hook as this functionality would be required by any other target
which wishes to do linker relaxation. However, as there are no other
targets in LLVM which currently do this, this is a reasonable thing to
do until such time as the code needs to be shared.

Improve the handling of the relocation (and add a reduced test case from
the Linux kernel) to ensure that we handle complex expressions for
symbolic difference. This ensures that we correct relocate symbols with
the adddends normalized and associated with the addition portion of the
paired relocation.

This change also addresses some review comments from Alex Bradbury about
the relocations meant for use in the DWARF CFA being named incorrectly
(using ADD6 instead of SET6) in the original change which introduced the
relocation type.

This resolves the issues with the symbolic difference emission
sufficiently to enable building the Linux kernel with clang+IAS+lld
(without linker relaxation).

Resolves PR50153, PR50156!
Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jrtc27 added inline comments.Jun 3 2021, 8:46 AM

Use std::pair or a local struct type


I know what the code is trying to do, what I'm failing to grasp is why it actually works, it looks buggy to me, as in that it doesn't do what I think it does and what you say it's meant to, so I must be missing something given the tests still pass (I assume they do given you haven't touched any?).


Still applies below.


Why has this suddenly appeared in version 2?

compnerd added inline comments.Jun 3 2021, 3:38 PM

This isn't meant for reviewing yet :) - its marked as a WIP.


This change isn't really ready for review yet. I wanted to share it with @MaskRay so I put it here as it is where it is going to end up. I'm still working through the changes, so expect a lot of major changes. Thats also why there is no one tagged on the diff.

compnerd updated this revision to Diff 349921.Jun 4 2021, 10:59 AM

Closer to passing tests (9 more failures to understand). It matches gas' behaviour better, is more uniformly behaved.

jrtc27 added inline comments.Jun 4 2021, 11:04 AM

There's no longer a RUN line for this... also this test makes no sense in the first place, it's trying to use 64-bit relocations on RV32. That should just be an error.

12 ↗(On Diff #349921)

Again, no longer used by a RUN line. And this looks like a bug, we're passing --mc-relax-all to the assembler, so this branch should have been relaxed in the branch relaxation sense (which is not the same as the linker relaxation sense), no? GNU as doesn't have --mc-relax-all, comparing against it is meaningless.

compnerd added inline comments.Jun 4 2021, 3:16 PM

I didn't even notice that the relocation types are just flat out wrong. That seems like a separate issue.

I wanted to keep the now dead check to do a last round of verification of the changes, at which point it should be removed.

12 ↗(On Diff #349921)

Similar to above, I wanted to keep it until I do the final sweep through the change.

Agreed, that matching the behaviour with GAS for this does not make sense. I will have to adjust the implementation to accommodate this once I figure out the remaining cases.

It does raise the question of does it make sense for us to have a -mrelax and a -mc-relax-all? Perhaps we should just treat -mrelax as -mc-relax-all. Is there a heavy penalty for this? Yes, linking may take a little bit more time and memory and the object file size will be slightly larger, but there shouldn't be any runtime cost. The question boils down to, is the slight link time savings worth the two separate paths.

simoncook added inline comments.Jun 6 2021, 1:54 PM

The relocations are 64-bit because of the size of .dword, I don't think this is an error?

jrtc27 added inline comments.Jun 6 2021, 1:55 PM

The relocations match the directive, but the directive itself makes no sense on RV32. Can you do .8byte foo on i386?...


And if you did want to support it, I'd argue it should just be a zero-padded 32-bit relocation (i.e. offset by 4 bytes on big-endian).

compnerd added inline comments.Jun 7 2021, 8:37 AM

Well, I don't think that we should change that behaviour in this change. Furthermore, it will require working with GCC and binutils as it currently _does_ generate a pair of R_RISCV_ADD64 and R_RISCV_SUB64 on RV32.

compnerd updated this revision to Diff 350403.Jun 7 2021, 1:26 PM

7 failures to look into; need to still replace the use of RelaxAll which would also improve the test pass rate.

compnerd updated this revision to Diff 350651.Jun 8 2021, 10:27 AM

Down to 6 failures to explore.

Various clean ups to the change. This almost is inline with what GAS does to handle the symbolic differences. This is missing a specific tweak to the condition where the difference is constrained immediately instead of being relaxed (that is, we are over-relaxing currently).

compnerd updated this revision to Diff 351202.Jun 10 2021, 9:37 AM

3 failures left now. KI:

  • 6B relocations are not implemented
  • debug_line references aren't correctly emitted

One thing to note is that this seems to improve the generation of the eh_frame, which is important for enabling relaxation without negatively impacting unwinding.

compnerd updated this revision to Diff 351477.Jun 11 2021, 9:35 AM

2 remaining failures KI:

  • debug_line encoding is not relaxable
compnerd updated this revision to Diff 351900.Jun 14 2021, 9:48 AM
compnerd edited the summary of this revision. (Show Details)
compnerd added reviewers: asb, MaskRay, jrtc27.
compnerd retitled this revision from WIP: try to repair RISCV handling of paired relocations to RISCV: adjust handling of relocation emission for RISCV.
compnerd added subscribers: echristo, dblaikie.

With Diff 351900 of this patch applied, a RISCV build of the mainline Linux kernel with LLVM=1 LLVM_IAS=1 builds and boots. This patch resolves for me.



compnerd updated this revision to Diff 352163.Jun 15 2021, 9:40 AM
compnerd edited the summary of this revision. (Show Details)
compnerd added inline comments.Jun 15 2021, 10:10 AM

Ugh, I missed this :-(; Yes, it is now unused, at one point I was using it for std::pair. Thanks for pointing that out.

With Diff 352163 of this patch applied, I can build+boot a mainline riscv Linux kernel. is resolved, and further now so is This is gets us back to being able to build entirely with LLVM (clang, clang's integrated assembler, LLD). That's huge! Thanks @compnerd !

I also built the kernel with debug info; llvm-dwarfdump didn't have any complaints about the input.


does this need to be static?


Should this call getSymB() rather than getSymA()?


Should this be:

MCStreamer::emitValueImpl(Value, Size, Loc);
if (!useLinkerRelaxation(getContext(), Value, A, B))

Some of the comments have an extra R in the relocation type identifier.


is it ok to drop the llvm-objdump -d test?

jrtc27 added inline comments.Jun 15 2021, 12:14 PM

It was there to ensure that the assembled output was correct, but if it no longer assembles then you can't exactly test it.

MaskRay added inline comments.Jun 15 2021, 2:14 PM

Add -NEXT:

Otherwise CHECK: } cannot ensure there is no additional ADD/SUB.




nit: [[@ is deprecated. use :[[#@LINE+1]]:[[#]]: error: ...




riscv64-unknown-none-elf can be simplified as riscv64


To ensure no additional add/sub, add -NEXT and check } wherever appropriate

MaskRay added inline comments.Jun 15 2021, 2:47 PM

Since you are switching the check-prefix styles...

The most common form is --check-prefix=


I think it is useful to track the column information ([[#]] -> an integer). This ensures that the reported diagnostic makes sense.

compnerd marked 13 inline comments as done.Jun 15 2021, 3:02 PM
compnerd added inline comments.

No, it doesn't need to be static. It is more at a conceptual level not something tied into the state of the streamer - it simply looks at an expression and tells you if you need a paired relocation for handling this. BTW, I welcome bike-shedding on this function's name - I dislike the current name of this function.


Move along, nothing to see here. That is rather embarrassing. Thank you for flagging this.


No, they are going down different functions - MCELFStreamer vs MCStreamer.

compnerd updated this revision to Diff 352264.Jun 15 2021, 3:03 PM
compnerd marked 2 inline comments as done.
compnerd edited the summary of this revision. (Show Details)
nickdesaulniers accepted this revision.Jun 15 2021, 3:09 PM

LGTM; big thanks for sorting all of this out!

This revision is now accepted and ready to land.Jun 15 2021, 3:09 PM
MaskRay added inline comments.Jun 15 2021, 3:09 PM

The old comment is unintentionally dropped: // According to DWARF spec., the DW_LNS_fixed_advance_pc opcode takes a ..

s/50000/60000/ to restore the previous threshold.


blank line seems redundant


s/WasRelaxed/Relaxed/ to be consistent with the impl


"Don’t duplicate function or class name at the beginning of the comment. "

Just ignore some other comments which violate this rule.

MaskRay accepted this revision.Jun 15 2021, 3:10 PM
compnerd marked 4 inline comments as done.Jun 16 2021, 8:08 AM
compnerd added inline comments.

Changed the implementation instead.


*shrug* sure, though I think that consistency is better than following the rules in this case (and is in the developer guidelines for LLVM IIRC). However, I don't feel strongly enough to bother arguing ... so changed. Also the comment leader doesn't seem like doxygen will heed it.

This revision was landed with ongoing or failed builds.Jun 17 2021, 8:53 AM
This revision was automatically updated to reflect the committed changes.

_sigh_ forgot to submit these review comments


Please line things up, and ideally also indent within { (matching the relative indentation of the output) to make it clearer


No need for a custom prefix



compnerd marked an inline comment as done.Jun 17 2021, 9:33 AM

I'll address the clean ups in a follow up.

compnerd marked 3 inline comments as done.Jun 17 2021, 9:37 AM
compnerd added inline comments.

Excellent point!


Good idea.

compnerd marked 2 inline comments as done.Jun 17 2021, 11:33 AM

e70d4994ea9eb44ba5a86c5073ac83cf500aabdd should have the suggestions incorporated.