This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Allocatable Global Register Variables for ARM
AcceptedPublic

Authored by anwel on Oct 11 2019, 2:32 AM.

Details

Summary

This patch combines two earlier patches aiming at providing the same support (https://reviews.llvm.org/D56003 for clang, https://reviews.llvm.org/D56005 for LLVM). It enables reservation of allocatable registers via command line options, which in turn allows them to be used as global named register variables. They will then not be used by the register allocator nor spilled to the stack.

More information is available in the original RFC: http://lists.llvm.org/pipermail/llvm-dev/2018-December/128706.html

Changes from the previous patches include:

  • adding a constraint to specify -ffixed-rN if rN is used as named register variable.
  • upgrading the frame-pointer warning to an error and throwing an error in LLVM, as well as clang.*

Additionally this patch now only supports r6-r11. r4 and r5 are excluded from this patch as r4 is used as hard-coded scratch register in various parts of the ARM backend. r4 also appears to be used as an input register for a Windows asm routine (__chkstk). Similarly, the ABI of the segmented stack prologue for Android and Linux seems to use r4 and r5 as input registers. A separate patch could follow to add the support for r4 and/or r5, such that the whole range of allocatable registers (r4-r11) is available.

As before it should be noted that this also changes the behaviour of the old -ffixed-r9 option. This option will now prevent the register from being spilled to the stack.

*This was originally a warning, but we don't seem to have the necessary information to determine frame-pointer usage in the given context. Any insight here would be welcome.

Diff Detail

Event Timeline

anwel created this revision.Oct 11 2019, 2:32 AM

Bit of a drive-by comment, but I can't say I am big fan of all the string matching on the register names. Not sure if this is a fair comment, because I haven't looked closely at it yet, but could we use more the ARM::R[0-9] values more? Perhaps that's difficult from the Clang parts?

clang/lib/Basic/Targets/ARM.cpp
902

I was pointed at something similar myself recently, but if I am not mistaken then I think this is a use-after-free:

"+reserve-" + RegName.str()

this will allocate a temp std::string that SearchFeature points to, which then gets released, and SearchFeature is still pointing at it.

llvm/test/CodeGen/ARM/reg-alloc-with-fixed-reg-r6-modified.ll
16

nit: +m -> + m

llvm/test/CodeGen/ARM/reg-alloc-with-fixed-reg-r6.ll
14

same nit

llvm/test/CodeGen/ARM/reg-alloc-wout-fixed-regs.ll
4

As all these tests (this file and the ones above) are the same, the "equivalent C source code" is the same, perhaps move all these cases into 1 file.

14

same nit here

chill added a subscriber: chill.Oct 11 2019, 5:57 AM

TBH, I quite dislike the creeping abuse of SubtargetFeatures as code generation options.

cf. Target.td:1477

//===----------------------------------------------------------------------===//
// SubtargetFeature - A characteristic of the chip set.
//

IMHO, since reserved registes are per-function, this strongly suggests implementation as function attribute(s), rather than subtarget features (also for the pre-existing r9).
It also opens the path towards possible future __attribute__.

clang/include/clang/Basic/TargetInfo.h
944

Parameter name can be omitted if unused; that would remove a potential warning.

clang/lib/Basic/Targets/ARM.cpp
884

Perhaps you can use here RegName == "r6" or string switch ?

890

HasSizeMismatch is not set along all possible paths.

895

s/enable/reserve/ ?

899

Likewise ?

901–902

These variables can be const.

903–907

This explicit loop can be written like:

return llvm::any_of(getTargetOpts().Features(),
                   [&](auto &P) { return P == SearchFeature; });
llvm/lib/Target/ARM/ARMSubtarget.h
236

The usual designation for these registers is "GPR". Suggestion either ReservedGPRegisters or just ReservedRegisters,
here and elsewhere.

chill added inline comments.Oct 11 2019, 6:03 AM
clang/lib/Basic/Targets/ARM.cpp
902

Any temporaries would be destructed at the end of the full expression. By that time, the SearchString would be constructed and stand on its own.

SjoerdMeijer added inline comments.Oct 11 2019, 6:22 AM
clang/lib/Basic/Targets/ARM.cpp
902

Ah yes, true. This is a std::string, not stringref as in my case.

carwil added a comment.EditedOct 14 2019, 8:06 AM

IMHO, since reserved registes are per-function, this strongly suggests implementation as function attribute(s), rather than subtarget features (also for the pre-existing r9).

What do you mean reserved registers are per-function? That sounds like you're describing local register variables, which I don't believe Clang has any support for (and there aren't a great deal of use-cases for anyway).
We're specifically talking about global usage here.

anwel updated this revision to Diff 224862.Oct 14 2019, 8:06 AM

Applied some minor changes suggested in the comments, including renaming the array of reserved registers.

chill added a comment.Oct 14 2019, 9:08 AM

IMHO, since reserved registes are per-function, this strongly suggests implementation as function attribute(s), rather than subtarget features (also for the pre-existing r9).

What do you mean reserved registers are per-function? That sounds like you're describing local register variables, which I don't believe Clang has any support for (and there aren't a great deal of use-cases for anyway).
We're specifically talking about global usage here.

I mean that the set is dynamically computed and depends on the specific function: cf. https://github.com/llvm/llvm-project/blob/master/llvm/lib/Target/ARM/ARMBaseRegisterInfo.cpp#L184
as opposed to, say, depend only on target/subtarget/abi.

IMHO, since reserved registes are per-function, this strongly suggests implementation as function attribute(s), rather than subtarget features (also for the pre-existing r9).

What do you mean reserved registers are per-function? That sounds like you're describing local register variables, which I don't believe Clang has any support for (and there aren't a great deal of use-cases for anyway).
We're specifically talking about global usage here.

I mean that the set is dynamically computed and depends on the specific function: cf. https://github.com/llvm/llvm-project/blob/master/llvm/lib/Target/ARM/ARMBaseRegisterInfo.cpp#L184
as opposed to, say, depend only on target/subtarget/abi.

Ah, I see. That's a fair comment. AArch64 also achieves it's -reserve-xN options via subtarget features (despite function context), I think that's where the inspiration/suggestion for doing it this way in the ARM backend came from.

anwel marked 8 inline comments as done.Oct 29 2019, 8:57 AM
anwel updated this revision to Diff 227094.Oct 30 2019, 7:42 AM
anwel marked 9 inline comments as done.

Rebase and make some variables const

clang/lib/Basic/Targets/ARM.cpp
903–907

I see your point, but using a for loop seems to better match the style of the code.

llvm/test/CodeGen/ARM/reg-alloc-wout-fixed-regs.ll
4

I see your point, but I'd still prefer to leave them as they are because in my opinion having the test cases separated into individual files it's much easier to grasp what they are doing.

Just want to double check with @efriedma, before we except this. I believe this patch now catches all the points you made on https://reviews.llvm.org/D56005. Anything we've missed?

anwel updated this revision to Diff 229269.Nov 14 2019, 4:03 AM

Rebase on current llvm-project master

anwel updated this revision to Diff 229296.Nov 14 2019, 6:03 AM

Change clang's error message when trying to use the target's frame pointer as GRV to sound more like an error then a warning.

anwel updated this revision to Diff 229531.Nov 15 2019, 6:35 AM
carwil accepted this revision.Nov 15 2019, 8:20 AM

I think there's been plenty of time for comments here. LGTM.

This revision is now accepted and ready to land.Nov 15 2019, 8:20 AM

Approval is not ever a time-based process; someone appropriate actually has to do the work of reviewing the patch. If a patch doesn't get reviewed, you "ping" it a couple times, to note that you're waiting for a review. If it still isn't reviewed at that point, and you're not sure what to do, send an email to cfe-dev.

In this case, it was on me to review; I apologize for not looking at this earlier.

llvm/lib/Target/ARM/ARMBaseRegisterInfo.cpp
435

I'm a little concerned about this... what happens if r6 is reserved, and the code uses a construct which LLVM implements using a base pointer? For example:

void f(void a(char*, char*), int n) {
  char r[n];
  char r2[16] __attribute((aligned(16)));
  a(r, r2);
}

(There are potentially other ways to implement this construct. But I'm pretty sure using a base pointer is the only method which is currently implemented.)

Hi @efriedma, thanks your comments. You're right, that was hasty of me. Apologies for that, it won't happen again.

RE: R6
Case in point: you're right. We're definitely not handling this correctly. I'll revert the patch, or ask @anwel too. Once we have a fix, we'll raise another review.