[RISCV] Add diff relocation support for RISC-V

Authored by simoncook on Apr 2 2018, 1:24 PM.



For RISC-V it is desirable to have relaxation happen in the linker once addresses are known, and as such the size between two instructions/byte sequences in a section could change.

For most assembler expressions, this is fine, as the absolute address results in the expression being converted to a fixup, and finally relocations. However, for expressions such as .quad .L2-.L1, the assembler folds this down to a constant once fragments are laid out, under the assumption that the difference can no longer change, although in the case of linker relaxation the differences can change at link time, so the constant is incorrect. One place where this commonly appears is in debug information, where the size of a function expression is in a form similar to the above.

This patch extends the assembler to allow an AsmBackend to declare that it does not want the assembler to fold down this expression, and instead generate a pair of relocations that allow the linker to carry out the calculation. In this case, the expression is not folded, but when it comes to emitting a fixup, the generic FK_Data_* fixups are converted into a pair, one for the addition half, one for the subtraction, and this is passed to the relocation generating methods as usual. I have named these FK_Data_Add_* and FK_Data_Sub_* to indicate which half these are for.

For RISC-V, which supports this via e.g. the R_RISCV_ADD64, R_RISCV_SUB64 pair of relocations, these are also set to always emit relocations relative to local symbols rather than section offsets. This is to deal with the fact that if relocations were calculated on e.g. .text+8 and .text+4, the result 12 would be stored rather than 4 as both addends are added in the linker.

Diff Detail

simoncook created this revision.Apr 2 2018, 1:24 PM
mgrang added a subscriber: mgrang.Apr 4 2018, 5:13 PM
apazos added inline comments.Apr 4 2018, 6:01 PM
48 ↗(On Diff #140678)

Comments need to be updated:
FK_Data_Add_8 -> A eight byte..
FK_Data_Sub_1 -> A one-byte..

686 ↗(On Diff #140678)

Comment update:
one for the add, and one for the addition, -> one for the add, and one for the sub,

39 ↗(On Diff #140678)

Should this setting be conditional on the relax flag?
The relax flag is being introduced in https://reviews.llvm.org/D44888

asb added a comment.Apr 5 2018, 4:05 AM

Thanks for this Simon.

Perhaps requiresDiffExpressionRelocations or similar might be better than supportsEmittingDiffExpressions? As Ana says, you might want to stick with the current behaviour when -mno-relax is set.

Just to make sure I understand the scope of this patch properly:

  • As far as you currently understand, does this cover all cases where constants are emitted where relocations are needed in the presence of linker relaxation. i.e. are there other known cases where you have planned follow-ups?
  • Does anyone know of other backends that could be affected by this issue? Linker relaxation seems to used somewhat sparingly as far as I can see, with targets like ARM preferring the less invasive 'range thunks'.
684 ↗(On Diff #140678)

different -> difference

asb added a comment.Apr 5 2018, 4:12 AM

From a quick look in binutils, it seems micromips could be tripped up by similar issues.

mgrang added inline comments.Apr 5 2018, 8:13 AM
40 ↗(On Diff #140678)

Remove empty line.

simoncook updated this revision to Diff 141454.Apr 6 2018, 4:56 PM
simoncook edited the summary of this revision. (Show Details)

I've updated the patch to respond to comments, and deal with the couple of test failures I was seeing.

The only one I think may need some thoughts on is the changes I made to hilo-constaddr.s: Using %lo on an expression won't work in this case, gas doesn't accept it as input, and in the relaxation case I don't believe there are RISC-V relocations that could represent such an expression. I've split out the test case into two files, one for the working case, and one for the input I would consider invalid.

Does this seem reasonable, I think we might want to reject this all the time, and not just when linker relaxation is enabled. Otherwise I think it could cause confusion that some expressions are accepted by the assembler only when relaxation is on. (For reference, this is the change in RISCVMCExpr.cpp) Any thoughts on this?

Alex, as far as I'm aware, this covers all cases where constants are emitted from what I've seen so far.

Ana, we can tie this to linker relaxation, looking at D44886 which I think adds the MC part of conditional relaxation, my change in the AsmBackend can be easily swapped from return true to querying the SubTargetInfo for return enableLinkerRelax() when that lands.

simoncook marked 4 inline comments as done.Apr 6 2018, 4:57 PM
simoncook updated this revision to Diff 141458.Apr 6 2018, 5:09 PM

The previous diff had one change missing from my original patch, that is now restored.

simoncook updated this revision to Diff 141512.Apr 7 2018, 2:16 PM
simoncook retitled this revision from [RISCV WIP] Add diff relocation support for RISC-V to [RISCV] Add diff relocation support for RISC-V.

Take part of test I removed as it now produces and error, and turn it into an explicit test for an error message.

asb added a comment.Apr 12 2018, 6:26 AM

I think this patch and its tests would be best if based on D44886, so it can be clear which behaviour is conditional on linker relaxation being enabled.

I'd be happy for hilo-constaddr-invalid.s to always be rejected. It never worked in gas, and we can always support it for builds without linker relaxation in the future. Perhaps add a note to the test file to document that we're unconditionally rejecting for now for consistency, but that it would be easy in the future to support when linker relaxation is disabled.

685 ↗(On Diff #141512)

indiciated -> indicated

simoncook updated this revision to Diff 142607.Apr 16 2018, 3:29 AM

I've rebased the change on top of D44886 to indicate what is conditional based on linker relaxation.

I haven't made hilo-constaddr-invalid.s rejected unconditionally. I took a look at how to do this, and it seems it's more difficult to do that (considering fixup types on parent expressions when evaluating subexpressions) than it may be worth it, than to have this accepted based on the linker flag. If the latter is the behavior we want in the end, we already have that and I propose leaving it this way around.

simoncook marked 2 inline comments as done.Apr 16 2018, 3:29 AM
asb added inline comments.Apr 16 2018, 8:10 AM
3 ↗(On Diff #142607)

We shouldn't lose the CHECK-REL check from this file, and probably would want to keep the check for the difference of two labels.

asb added a subscriber: sdardis.Apr 17 2018, 5:59 AM

CC @sdardis - micromips seems to use aggressive linker relaxation in binutils ld meaning this may have some current or future relevance to the Mips LLVM backend.

simoncook updated this revision to Diff 143061.Apr 19 2018, 2:05 AM

I've readded the reloc check line to hilo-constaddr.s and renamed hilo-constaddr-invalid.s to hilo-constaddr-expr.s since it is no longer checking something that is always invalid.

simoncook marked an inline comment as done.Apr 19 2018, 2:05 AM
simoncook added inline comments.May 3 2018, 8:51 AM
42 ↗(On Diff #143061)

I've been doing some additional thinking on this, and I don't know if having this depend on the relax feature bit might introduce a problem going into the future.

Looking at D45864, the way .option rvc and .option norvc are implemented is they add and remove feature bits. If in the future we support .option norelax through the same approach we can end up with a situation where we end up with correct DWARF depending on which of these two is used last.

If we want to do this unconditionally, then we need to track whether feature relax was ever enabled, possibly via a second feature bit that is set by .option relax but not cleared by .option norelax and have return true if either is set. Does this seem like a good approach?

asb added inline comments.May 3 2018, 9:20 AM
42 ↗(On Diff #143061)

I could see that argument. I could certainly see a counter-argument that says that if the user uses .option relax and .option norelax in the same section then they're doing something wrong and if things break they get to keep both pieces.

Maybe an alternative is to warn if relax/norelax is switched within the same section?

shiva0217 added inline comments.May 10 2018, 11:44 PM
42 ↗(On Diff #143061)

The array value in the following case will always generate a fixup instead of absolute value after the commit.

commit 4d344626681a31cd8993f97eceb68419bfcc4c54
Author: Nirav Dave <niravd@google.com>
AuthorDate: Mon Apr 30 19:22:40 2018 +0000
Commit: Nirav Dave <niravd@google.com>
CommitDate: Mon Apr 30 19:22:40 2018 +0000
[MC] Change AsmParser to leverage Assembler during evaluation

int foo (int a)
  static const short ar[] = { &&l1 - &&l1, &&l2 - &&l1 };
  void *p = &&l1 + ar[a];
  goto *p;
  return 1;
  return 2;

We probably have two options.

  1. requiresDiffExpressionRelocations() always return true to emit ADD16 and SUB16 relocations.
  2. Fix up the above commit and then create a new patch to force this case to emit a relocation type when requiresDiffExpressionRelocations return true.

Any thoughts?

simoncook added inline comments.May 11 2018, 4:36 AM
42 ↗(On Diff #143061)

I'm not quite sure what you mean. Compiling this test case I see ADD16 and SUB16 relocations in the table when relaxation is enabled, which I would expect. I did notice clang crashes assembling this case with relaxation disabled, I've submitted a fix to this as D46746, was it this crash you were referring to/the fixup being applied there is fine?

shiva0217 added inline comments.May 13 2018, 10:18 PM
42 ↗(On Diff #143061)

Hi Simon. Yes, clang crashes with relaxation disabled in this case. I thought we could let requiresDiffExpressionRelocations always return true to fix this case. But I think https://reviews.llvm.org/D46746 would be a better fixed. Thanks. Another observation is that before the "[MC] Change AsmParser to leverage Assembler" commit,
The case will insert absolute value instead of generating FK_DATA_2 fixup. So the case didn't fail. Anyway, we could apply https://reviews.llvm.org/D46746 on current trunk to pass this case.

asb added a comment.May 16 2018, 9:28 PM

Thanks Simon, I think this is in pretty good shape. I've added a few comments on code style and expanding the tests. The only part of the patch I'm unsure about is the reference to InSet, so if you can clarify I'd really appreciate it.

101 ↗(On Diff #143061)

autobrief is enabled in the LLVM Doxygen config so \brief isn't needed in any of these newly added functions.

142–165 ↗(On Diff #143061)

LLVM coding style is not to indent case labels (see getKindForSize).

699–701 ↗(On Diff #143061)

This comment applies to both the if and else branches it should be brought up to just below if (!IsResolved) { and possibly given a minor rephrasing to mention the requiresDiffExpressionRelocation case.

702 ↗(On Diff #143061)

Given that the if() has a brace, the else should probably have one too. [Note: personal preference - can't see guidance one way or another in the LLVM coding guidelines).

574 ↗(On Diff #143061)

Why is InSet needed in this if statement? That's a genuine question rather than a suggestion it's wrong - the meaning of InSet is unclear.

5 ↗(On Diff #143061)

Could you please make this test more exhaustive:

  1. Add riscv64 check lines
  2. Check R_RISCV_{ADD,SUB}{8,16,32} as well
1 ↗(On Diff #143061)

Could you add a RUN line for when relax is disabled, and adapt the CHECK lines that were previously there when this was in hilo-constaddr.s for this example?

asb added inline comments.May 16 2018, 11:04 PM
5 ↗(On Diff #143061)

I committed rL332574 which adds support for the same data directive names used by binutils, so you should be able to use .word etc

simoncook updated this revision to Diff 148007.May 22 2018, 8:05 AM

Rebased on top of tree, and applied changes as per Alex's review.

Regarding InSet, I've tried to put a comment that explains its usage, but I'm not sure if it explains why it's there too well (or whether this is really should have a FIXME: The use of InSet = SymbolSize is a hack, similiar to evaluateAsAbsolute (line 461) ).

As far as I can tell, InSet is used as a flag to determine whether to allow expressions to evaluate as an absolute across sections (as this is otherwise an error), and the ELF object emitter re-uses this when emitting the size of a symbol in the symbol table (this seems the most reliable way of identifying an explicit "IsSymbolSize" through the expression evaluator).

The way our linker relaxtion works is that the size in the symbol table is a constant which represents the current symbol size, and the linker adjusts this before emitting its final binary based on how much relaxation occurs. Without this change we would emit a symbol size of zero, and a pair of relocations in the .symtab section, which I don't think the linker understands, or is able to support easily (if at all). Using this approach generates the binary we would expect.

asb accepted this revision.May 23 2018, 5:30 AM

Thanks Simon. A few minor nitpicks, but this looks good to me.

You forgot to include full context in the updated patch version.

112 ↗(On Diff #148007)

This \brief still needs removing

739–740 ↗(On Diff #148007)

Should be on a single line } else {

583 ↗(On Diff #148007)

reloations -> relocations

Nitpick: end with full stop

This revision is now accepted and ready to land.May 23 2018, 5:30 AM
This revision was automatically updated to reflect the committed changes.