This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Allow label arithmetic with add/sub/cmp
ClosedPublic

Authored by rovka on Aug 24 2016, 5:47 AM.

Details

Summary

Allow instructions such as 'cmp w0, #(end - start)' by folding the
expression into a constant. For ELF, we fold only if the symbols are in
the same section. For MachO, we fold if the expression contains only
symbols that are not linker visible.

Fixes https://llvm.org/bugs/show_bug.cgi?id=18920

Diff Detail

Repository
rL LLVM

Event Timeline

rovka updated this revision to Diff 69099.Aug 24 2016, 5:47 AM
rovka retitled this revision from to [AArch64] Allow label arithmetic with add/sub/cmp.
rovka updated this object.
rovka added subscribers: llvm-commits, rengolin, psmith.
rovka added a comment.Aug 24 2016, 5:50 AM

This is relaxing the error-checking in the asm parser quite significantly, so please look for any edge cases that I might have missed in the tests.
Also, I'm not sure this is doing the right thing for Darwin, any guidance there is more than welcome.
Thanks.

@t.p.northover should have a look at the Darwin relocation, but the rest looks fine for me.

Nice set of tests, thanks!

lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3606 ↗(On Diff #69099)

Maybe hacky, but can't you just count Loc[] and return 2 | 3 depending on the size?

rovka added inline comments.Aug 24 2016, 6:26 AM
lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3606 ↗(On Diff #69099)

Right, I'll try to fix that.

rovka updated this revision to Diff 69108.Aug 24 2016, 6:33 AM

Fixed the FIXME :)

t.p.northover edited edge metadata.Aug 24 2016, 7:51 AM

I think the MachO logic is pretty problematic:

  • Because of .subsections_via_symbols a relocation would usually be needed for 2 assembler-local symbols if there's a linker-visible one between them.
  • I think ARM64_RELOC_UNSIGNED is supposed to set all bits of the data it covers, not parse and interpret the instruction it's in the middle of.

I don't think there's an appropriate MachO relocation for this either, so we probably shouldn't try.

How does the ELF logic deal with attempts to perform subtractions across sections (the equivalent of the first MachO problem)?

How does the ELF logic deal with attempts to perform subtractions across sections (the equivalent of the first MachO problem)?

Ah, I see there's logic in ELFObjectWriter.cpp and it really is checking the section, so MachO probably can't benefit.

We probably want to diagnose attempts in MachO where MCAssembler::getAtom is different (or null) for each symbol. With that restriction, we can even cope if one of them is the defining (visible) symbol of the atom.

I agree with Renato that the ELF side looks good.

Some suggestions the tests:

  • All of the instructions are either add or cmp and all use the w registers. It may be worth varying these a bit more to cover the other cases within the switch(Inst.getOpcode())
  • The labels used in the instructions look to be backwards references to labels already defined. It would be good to have some tests that use forward references to labels that are defined later in the file.
  • All the Darwin tests are of the form (label1 - label2 + constant), are there other forms that can be expressed?

I note that the label expression must evaluate to a positive value. For example #(end - start) is ok, but #(start - end) or #-(end-start) is not. I don't think that there is much that you can do about this in the general case as IIRC the value of the expression is evaluated after the fragments are fixed and it is too late to easily change an ADD into a SUB or vice-versa.

To the best of my knowledge the Mach-O side looks ok, but I agree with Renato that someone more familar with Mach-O should take a look.

rovka added a comment.Aug 25 2016, 1:46 AM

We probably want to diagnose attempts in MachO where MCAssembler::getAtom is different (or null) for each symbol. With that restriction, we can even cope if one of them is the defining (visible) symbol of the atom.

Ok, I'll try to investigate this some more. Thanks!

rovka added a comment.Aug 25 2016, 1:55 AM

I agree with Renato that the ELF side looks good.

Some suggestions the tests:

  • All of the instructions are either add or cmp and all use the w registers. It may be worth varying these a bit more to cover the other cases within the switch(Inst.getOpcode())
  • The labels used in the instructions look to be backwards references to labels already defined. It would be good to have some tests that use forward references to labels that are defined later in the file.

Those are very good suggestions, thanks.

  • All the Darwin tests are of the form (label1 - label2 + constant), are there other forms that can be expressed?

I'm actually not proud of the Darwin tests, but I wanted to get an initial round of feedback before diving too deep. From what I've played with so far, we can't express (label1 - label2) >> constant (at least not out of the box; I don't know how we'd add support for that). Also, since we're generating relocations, we can't diagnose a lot of cases where we error out for ELF - I'm not sure what to do about these, at the moment I'm just hoping the linker will error out in a helpful way, or maybe magically handle them by moving atoms around.

I note that the label expression must evaluate to a positive value. For example #(end - start) is ok, but #(start - end) or #-(end-start) is not. I don't think that there is much that you can do about this in the general case as IIRC the value of the expression is evaluated after the fragments are fixed and it is too late to easily change an ADD into a SUB or vice-versa.

I should at least add a test-case diagnosing this though.

Thanks for the feedback.

rovka updated this revision to Diff 69242.Aug 25 2016, 7:24 AM
rovka updated this object.
rovka edited edge metadata.

Fix Darwin to allow label arithmetic only with assembler-private labels and update the Darwin tests accordingly. Also update the ELF tests as per the review comments.

rovka updated this revision to Diff 69248.Aug 25 2016, 8:06 AM

Fix something embarrassing in a Darwin test (my bad).

Thanks, the Darwin side looks less scary now.

compnerd added inline comments.Aug 25 2016, 8:46 PM
lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
733 ↗(On Diff #69248)

I think this is better written as:

if (auto *CE = dyn_cast<MCConstantExpr>(Expr))
  return CE->getValue() >= 0 && CE->getValue() <= 0xfff;
return true;
rovka updated this revision to Diff 69348.Aug 26 2016, 5:26 AM
  • Address Saleem's comment (thanks!)
  • Get Darwin to error out on invalid values obtained from label arithmetic, i.e. negative values or values that are too large to fit. I did this by overriding the processFixupValue method in DarwinAArch64AsmBackend. It now calls adjustFixupValue with a legitimate MCContext, so it can dump errors (just like the ELF backend does).
rovka updated this revision to Diff 71809.Sep 19 2016, 5:34 AM

Move some of the tests from D23930 here (label arithmetic + lsl).

rovka added a comment.Oct 4 2016, 6:13 PM

Ping again.

compnerd accepted this revision.Oct 10 2016, 8:20 PM
compnerd edited edge metadata.
This revision is now accepted and ready to land.Oct 10 2016, 8:20 PM
This revision was automatically updated to reflect the committed changes.