This is an archive of the discontinued LLVM Phabricator instance.

RISCV: adjust handling of relocation emission for RISCV
ClosedPublic

Authored by compnerd on Jun 2 2021, 10:49 AM.

Details

Summary

This re-architects the RISCV relocation handling to bring the
implementation closer in line with the implementation in binutils. We
would previously aggressively resolve the relocation. With this
restructuring, we always will emit a paired relocation for any symbolic
difference of the type of S±T[±C] where S and T are labels and C is a
constant.

GAS has a special target hook controlled by RELOC_EXPANSION_POSSIBLE
which indicates that a fixup may be expanded into multiple relocations.
This is used by the RISCV backend to always emit a paired relocation -
either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] +
SUB[WIDTH] for a debug info relocation. Irrespective of whether linker
relaxation support is enabled, symbolic difference is always emitted as
a paired relocation.

This change also sinks the target specific behaviour down into the
target specific area rather than exposing it to the shared relocation
handling. In the process, we also sink the "special" handling for debug
information down into the RISCV target. Although this improves the path
for the other targets, this is not necessarily entirely ideal either.
The changes in the debug info emission could be done through another
type of hook as this functionality would be required by any other target
which wishes to do linker relaxation. However, as there are no other
targets in LLVM which currently do this, this is a reasonable thing to
do until such time as the code needs to be shared.

Improve the handling of the relocation (and add a reduced test case from
the Linux kernel) to ensure that we handle complex expressions for
symbolic difference. This ensures that we correct relocate symbols with
the adddends normalized and associated with the addition portion of the
paired relocation.

This change also addresses some review comments from Alex Bradbury about
the relocations meant for use in the DWARF CFA being named incorrectly
(using ADD6 instead of SET6) in the original change which introduced the
relocation type.

This resolves the issues with the symbolic difference emission
sufficiently to enable building the Linux kernel with clang+IAS+lld
(without linker relaxation).

Resolves PR50153, PR50156!
Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143

Diff Detail

Event Timeline

compnerd created this revision.Jun 2 2021, 10:49 AM
compnerd requested review of this revision.Jun 2 2021, 10:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2021, 10:49 AM
MaskRay added a comment.EditedJun 2 2021, 10:56 AM

Thanks for working on this. requiresDiffExpressionRelocations FK_Data_Add_1 canFold ... are all RISC-V specific layering violation concepts which add significant complexity to the MC component.
It is good to see to see them simplified and moved to the proper place (e.g. lib/Target/RISCV/MCTargetDesc as your diff shows).

Could you please re-upload the diff with full context?

jrtc27 added inline comments.Jun 2 2021, 12:07 PM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
396

Please don't "fix" unrelated whitespace

llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
56

Ditto

156–161

I'm confused why this doesn't overwrite RI again with an R_RISCV_SUB8?

160

This approach is also going to give quadratic complexity. Is there a reason we can't emplace_back? If so, can a stable sort post-processing step not fix that more efficiently (asymptotically)?

163

Please no comma operators outside of for

jrtc27 added inline comments.Jun 2 2021, 12:08 PM
llvm/lib/MC/ELFObjectWriter.cpp
1522–1526 ↗(On Diff #349326)

Why is this not a hook?

This undoes an ancient optimization in LLVM to support RISC-V's desire to over-emit relocations for link time relocation relaxation.

Do you mean that we currently emit more relocations than we need to, or is this snark directed towards RISC-V's linker relaxations? If the latter, please remove such comments from your patch description.

compnerd added inline comments.Jun 3 2021, 8:35 AM
llvm/lib/MC/ELFObjectWriter.cpp
1522–1526 ↗(On Diff #349326)

This might make sense to be a hook, however, Im not sure if we need a new hook, I think that there are hooks that give us the necessary information, just need to wire them up.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
396

Sure, from a final patch perhaps, but when working on something its very helpful.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
156–161

Because there are two relocations which need to be emitted. The problem here is that we need a second relocation on a different symbol. The ideal thing would be that we generate two fixups in the first place.

160

Yeah, I suppose we can do an emplace_back and sort at the end.

compnerd updated this revision to Diff 349560.Jun 3 2021, 8:35 AM
jrtc27 added a comment.Jun 3 2021, 8:46 AM

Could you please re-upload the diff with full context?

Still applies.

This undoes an ancient optimization in LLVM to support RISC-V's desire to over-emit relocations for link time relocation relaxation.

Do you mean that we currently emit more relocations than we need to, or is this snark directed towards RISC-V's linker relaxations? If the latter, please remove such comments from your patch description.

Still applies.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
396

Only locally. For reviewing it's a nuisance, so please remove them from the review.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
155

Use std::pair or a local struct type

156–161

I know what the code is trying to do, what I'm failing to grasp is why it actually works, it looks buggy to me, as in that it doesn't do what I think it does and what you say it's meant to, so I must be missing something given the tests still pass (I assume they do given you haven't touched any?).

163

Still applies below.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
61

Why has this suddenly appeared in version 2?

compnerd added inline comments.Jun 3 2021, 3:38 PM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
396

This isn't meant for reviewing yet :) - its marked as a WIP.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
61

This change isn't really ready for review yet. I wanted to share it with @MaskRay so I put it here as it is where it is going to end up. I'm still working through the changes, so expect a lot of major changes. Thats also why there is no one tagged on the diff.

compnerd updated this revision to Diff 349921.Jun 4 2021, 10:59 AM

Closer to passing tests (9 more failures to understand). It matches gas' behaviour better, is more uniformly behaved.

jrtc27 added inline comments.Jun 4 2021, 11:04 AM
llvm/test/CodeGen/RISCV/fixups-relax-diff.ll
19

There's no longer a RUN line for this... also this test makes no sense in the first place, it's trying to use 64-bit relocations on RV32. That should just be an error.

llvm/test/MC/RISCV/rv64-relax-all.s
12 ↗(On Diff #349921)

Again, no longer used by a RUN line. And this looks like a bug, we're passing --mc-relax-all to the assembler, so this branch should have been relaxed in the branch relaxation sense (which is not the same as the linker relaxation sense), no? GNU as doesn't have --mc-relax-all, comparing against it is meaningless.

compnerd added inline comments.Jun 4 2021, 3:16 PM
llvm/test/CodeGen/RISCV/fixups-relax-diff.ll
19

I didn't even notice that the relocation types are just flat out wrong. That seems like a separate issue.

I wanted to keep the now dead check to do a last round of verification of the changes, at which point it should be removed.

llvm/test/MC/RISCV/rv64-relax-all.s
12 ↗(On Diff #349921)

Similar to above, I wanted to keep it until I do the final sweep through the change.

Agreed, that matching the behaviour with GAS for this does not make sense. I will have to adjust the implementation to accommodate this once I figure out the remaining cases.

It does raise the question of does it make sense for us to have a -mrelax and a -mc-relax-all? Perhaps we should just treat -mrelax as -mc-relax-all. Is there a heavy penalty for this? Yes, linking may take a little bit more time and memory and the object file size will be slightly larger, but there shouldn't be any runtime cost. The question boils down to, is the slight link time savings worth the two separate paths.

simoncook added inline comments.Jun 6 2021, 1:54 PM
llvm/test/CodeGen/RISCV/fixups-relax-diff.ll
19

The relocations are 64-bit because of the size of .dword, I don't think this is an error?

jrtc27 added inline comments.Jun 6 2021, 1:55 PM
llvm/test/CodeGen/RISCV/fixups-relax-diff.ll
19

The relocations match the directive, but the directive itself makes no sense on RV32. Can you do .8byte foo on i386?...

19

And if you did want to support it, I'd argue it should just be a zero-padded 32-bit relocation (i.e. offset by 4 bytes on big-endian).

compnerd added inline comments.Jun 7 2021, 8:37 AM
llvm/test/CodeGen/RISCV/fixups-relax-diff.ll
19

Well, I don't think that we should change that behaviour in this change. Furthermore, it will require working with GCC and binutils as it currently _does_ generate a pair of R_RISCV_ADD64 and R_RISCV_SUB64 on RV32.

compnerd updated this revision to Diff 350403.Jun 7 2021, 1:26 PM

7 failures to look into; need to still replace the use of RelaxAll which would also improve the test pass rate.

compnerd updated this revision to Diff 350651.Jun 8 2021, 10:27 AM

Down to 6 failures to explore.

Various clean ups to the change. This almost is inline with what GAS does to handle the symbolic differences. This is missing a specific tweak to the condition where the difference is constrained immediately instead of being relaxed (that is, we are over-relaxing currently).

compnerd updated this revision to Diff 351202.Jun 10 2021, 9:37 AM

3 failures left now. KI:

  • 6B relocations are not implemented
  • debug_line references aren't correctly emitted

One thing to note is that this seems to improve the generation of the eh_frame, which is important for enabling relaxation without negatively impacting unwinding.

compnerd updated this revision to Diff 351477.Jun 11 2021, 9:35 AM

2 remaining failures KI:

  • debug_line encoding is not relaxable
compnerd updated this revision to Diff 351900.Jun 14 2021, 9:48 AM
compnerd edited the summary of this revision. (Show Details)
compnerd added reviewers: asb, MaskRay, jrtc27.
compnerd retitled this revision from WIP: try to repair RISCV handling of paired relocations to RISCV: adjust handling of relocation emission for RISCV.
compnerd added subscribers: echristo, dblaikie.

With Diff 351900 of this patch applied, a RISCV build of the mainline Linux kernel with LLVM=1 LLVM_IAS=1 builds and boots. This patch resolves https://github.com/ClangBuiltLinux/linux/issues/1023 for me.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
18

Unused?

compnerd updated this revision to Diff 352163.Jun 15 2021, 9:40 AM
compnerd edited the summary of this revision. (Show Details)
compnerd added inline comments.Jun 15 2021, 10:10 AM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
18

Ugh, I missed this :-(; Yes, it is now unused, at one point I was using it for std::pair. Thanks for pointing that out.

With Diff 352163 of this patch applied, I can build+boot a mainline riscv Linux kernel.
https://github.com/ClangBuiltLinux/linux/issues/1023 is resolved, and further now so is https://github.com/ClangBuiltLinux/linux/issues/1143. This is gets us back to being able to build entirely with LLVM (clang, clang's integrated assembler, LLD). That's huge! Thanks @compnerd !

I also built the kernel with debug info; llvm-dwarfdump didn't have any complaints about the input.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
195

does this need to be static?

208

Should this call getSymB() rather than getSymA()?

230–233

Should this be:

MCStreamer::emitValueImpl(Value, Size, Loc);
if (!useLinkerRelaxation(getContext(), Value, A, B))
  return;
llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h
105–120

Some of the comments have an extra R in the relocation type identifier.

llvm/test/MC/RISCV/hilo-constaddr-expr.s
4

is it ok to drop the llvm-objdump -d test?

jrtc27 added inline comments.Jun 15 2021, 12:14 PM
llvm/test/MC/RISCV/hilo-constaddr-expr.s
4

It was there to ensure that the assembled output was correct, but if it no longer assembles then you can't exactly test it.

MaskRay added inline comments.Jun 15 2021, 2:14 PM
llvm/test/CodeGen/RISCV/fixups-diff.ll
27

Add -NEXT:

Otherwise CHECK: } cannot ensure there is no additional ADD/SUB.

33

ditto

llvm/test/MC/RISCV/hilo-constaddr-expr.s
12

nit: [[@ is deprecated. use :[[#@LINE+1]]:[[#]]: error: ...

llvm/test/MC/RISCV/reloc-addend.s
10 ↗(On Diff #352163)

-NEXT

llvm/test/MC/RISCV/scoped-relaxation.s
2

riscv64-unknown-none-elf can be simplified as riscv64

31

To ensure no additional add/sub, add -NEXT and check } wherever appropriate

MaskRay added inline comments.Jun 15 2021, 2:47 PM
llvm/test/MC/RISCV/fixups-expr.s
2

Since you are switching the check-prefix styles...

The most common form is --check-prefix=

llvm/test/MC/RISCV/hilo-constaddr-expr.s
12

I think it is useful to track the column information ([[#]] -> an integer). This ensures that the reported diagnostic makes sense.

compnerd marked 13 inline comments as done.Jun 15 2021, 3:02 PM
compnerd added inline comments.
llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
195

No, it doesn't need to be static. It is more at a conceptual level not something tied into the state of the streamer - it simply looks at an expression and tells you if you need a paired relocation for handling this. BTW, I welcome bike-shedding on this function's name - I dislike the current name of this function.

208

Move along, nothing to see here. That is rather embarrassing. Thank you for flagging this.

230–233

No, they are going down different functions - MCELFStreamer vs MCStreamer.

compnerd updated this revision to Diff 352264.Jun 15 2021, 3:03 PM
compnerd marked 2 inline comments as done.
compnerd edited the summary of this revision. (Show Details)
nickdesaulniers accepted this revision.Jun 15 2021, 3:09 PM

LGTM; big thanks for sorting all of this out!

This revision is now accepted and ready to land.Jun 15 2021, 3:09 PM
MaskRay added inline comments.Jun 15 2021, 3:09 PM
llvm/lib/MC/MCAssembler.cpp
1086
1111
llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
236

The old comment is unintentionally dropped: // According to DWARF spec., the DW_LNS_fixed_advance_pc opcode takes a ..

s/50000/60000/ to restore the previous threshold.

277

blank line seems redundant

llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h
98

s/WasRelaxed/Relaxed/ to be consistent with the impl

llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h
84

https://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments

"Don’t duplicate function or class name at the beginning of the comment. "

Just ignore some other comments which violate this rule.

MaskRay accepted this revision.Jun 15 2021, 3:10 PM
compnerd marked 4 inline comments as done.Jun 16 2021, 8:08 AM
compnerd added inline comments.
llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h
98

Changed the implementation instead.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h
84

*shrug* sure, though I think that consistency is better than following the rules in this case (and is in the developer guidelines for LLVM IIRC). However, I don't feel strongly enough to bother arguing ... so changed. Also the comment leader doesn't seem like doxygen will heed it.

This revision was landed with ongoing or failed builds.Jun 17 2021, 8:53 AM
This revision was automatically updated to reflect the committed changes.

_sigh_ forgot to submit these review comments

llvm/test/CodeGen/RISCV/fixups-diff.ll
25–27

Please line things up, and ideally also indent within { (matching the relative indentation of the output) to make it clearer

llvm/test/CodeGen/RISCV/fixups-relax-diff.ll
1–4

No need for a custom prefix

15

-NEXT?

compnerd marked an inline comment as done.Jun 17 2021, 9:33 AM

I'll address the clean ups in a follow up.

compnerd marked 3 inline comments as done.Jun 17 2021, 9:37 AM
compnerd added inline comments.
llvm/test/CodeGen/RISCV/fixups-relax-diff.ll
1–4

Excellent point!

15

Good idea.

compnerd marked 2 inline comments as done.Jun 17 2021, 11:33 AM

e70d4994ea9eb44ba5a86c5073ac83cf500aabdd should have the suggestions incorporated.