This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Allow shrink wrapping for RISC-V
ClosedPublic

Authored by lewis-revill on May 21 2019, 5:39 AM.

Details

Summary

Enabling shrink wrapping requires ensuring the insertion point of the epilogue is correct for MBBs without a terminator, in which case the instruction to adjust the stack pointer is the last instruction in the block.

Diff Detail

Repository
rL LLVM

Event Timeline

lewis-revill created this revision.May 21 2019, 5:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 21 2019, 5:39 AM

I've reworked this patch to safely work around the implementation of save/restore via libcalls. These place constraints on the save/restore points that can be chosen by the shrink wrapping algorithm.

lewis-revill retitled this revision from [WIP][RISCV] Allow shrink wrapping for RISC-V to [RISCV] Allow shrink wrapping for RISC-V.
lewis-revill edited the summary of this revision. (Show Details)
luismarques requested changes to this revision.Aug 16 2019, 3:46 PM
luismarques added inline comments.
lib/Target/RISCV/RISCVFrameLowering.cpp
146

Document why we advance exactly one instruction.

304

This patch is using this function but it seems like D62686 no longer introduces it.

This revision now requires changes to proceed.Aug 16 2019, 3:46 PM
shiva0217 added inline comments.Aug 19 2019, 11:23 PM
lib/Target/RISCV/RISCVFrameLowering.cpp
137

We might need to consider that there're two branches at the end of the basic block. Last non-debug instruction is PseudoBr which is a terminator. So the epilogue will be inserted between the branches.

BEQ killed renamable $x10, $x0, %bb.3
PseudoBR %bb.2

Replacing by

MachineBasicBlock::iterator MBBI = MBB.getFirstTerminator();

could handle the case. and

if (MBBI == MBB.end())
  MBBI = MBB.getLastNonDebugInstr();

to handle that there is no terminator in the basic block.

Rebased, and modified the logic to find the insertion point for the epilogue code.

Fix git error whereby logic from D62686 was mistakenly included in this patch

Rebased and updated to incorporate changes to parent revision

Rebased due to updated dependency.

asb added a comment.Oct 3 2019, 6:56 AM

I think this patch is orthogonal to save/restore via libcalls, and we can probably land it sooner. Could you please rebase so it's standalone?

Sure, that's sensible. I'll keep a separate patch for ensuring compatibility with save/restore for if/when it's required.

apazos added a comment.Oct 3 2019, 3:16 PM

There are some additional perennial test suite failures when applying this patch and enabling -mllvm -enable-shrink-wrap.
When Lewis updates the patch to be standalone, we can verify it again.

Make this patch independent of save/restore compatibility

apazos added inline comments.
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
230 ↗(On Diff #223849)

I have been verifying the pending patches at Oz, Os, and also O2. This is helping with uncovering issues.
You can run it with 'llc test.ll -enable-shrink-wrap

With the latest standalone patch, it seems we have a few crashes left. Below is a bugpoint reduced test:

define dso_local void @test() local_unnamed_addr {
entry:

br i1 undef, label %T.exit, label %for.body.i

for.body.i: ; preds = %for.body.i, %entry

store i32 0, i32* undef
%incdec.ptr.i.i = getelementptr inbounds i32, i32* null, i32 1
%cmp.i.i = icmp eq i32* undef, undef
br i1 %cmp.i.i, label %T.exit.loopexit, label %for.body.i

T.exit.loopexit: ; preds = %for.body.i

%0 = ptrtoint i32* %incdec.ptr.i.i to i32
br label %T.exit

T.exit: ; preds = %T.exit.loopexit, %entry

ret void

}

shiva0217 added inline comments.Oct 12 2019, 11:29 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
230 ↗(On Diff #223849)

It seems that shrink wrapping may choose an empty basic block to insert epilogue. We might need to add empty block detection for DL initialization and MBBI advance. Something like:

DebugLoc DL = !MBB.empty() ? MBBI->getDebugLoc() : DebugLoc();
if (!MBB.empty() && !MBBI->isTerminator())
   MBBI = std::next(MBBI);
lewis-revill added inline comments.Oct 13 2019, 8:44 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
230 ↗(On Diff #223849)

Thanks Ana, it looks like this is a case of the shrink wrapping pass choosing a basic block which is empty as the place to insert the prologue. I didn't realise it was possible, but I'll push an updated patch that takes this into account.

Use MBB.end() and a blank DebugLoc for epilogue insertion when a basic block is empty.

Lewis, this patch is not applying clean on top of https://reviews.llvm.org/D62686. Can you please rebase?

Lewis, this patch is not applying clean on top of https://reviews.llvm.org/D62686. Can you please rebase?

Hi Ana, I've made this patch independent of the save/restore patch as per Alex's request. Unfortunately that means that neither will rebase cleanly on top of the other until one is committed.
I could recreate the old patch with the save/restore compatibility included (currently split into D68644) implemented on top of save/restore, if that is worth doing?

Off topic, this makes me think Phabricator needs some way to add multiple alternative diffs for applying a given patch on top of other patches...

Thanks Lewis, I applied the patch and I am not detecting any new failure with this feature.

Lewis, I am not seeing any code change.

Btw to turn it on by default on a the target you need to add enableShrinkWrapping implementation in RISCVFrameLowering.h

+ /// Returns true if the target will correctly handle shrink wrapping.
+ bool enableShrinkWrapping(const MachineFunction &MF) const override {
+ return true;
+ }
+

So I've been looking into the benchmarks that I've been running to figure out why code doesn't change with this enabled.

I found that GCC's shrink wrapping does make a difference, and that comes from a loop which looks like the following:

int foo(int rpt) {
  int i;
  int r;

  for (i = 0; i < rpt; ++i) {
    srand(0);
    r = ...;
  }

  return r;
}

GCC appears to convert this into the equivalent of:

int foo(int rpt) {
  if (rpt == 0)
    return 0;

  int i;
  int r;

  for (i = 0; i < rpt; ++i) {
    srand(0);
    r = ...;
  }

  return r;
}

And is then able to shrink wrap around the if statement. Sure enough, if I feed the modified function above into LLVM, we are able to shrink wrap it as well.

(I'm not certain if this conversion is sane from GCC, but I haven't properly looked into it.)

So personally I'm not concerned about the fact that shrink wrapping doesn't make a difference in my benchmarks; I can show that it changes C code we expect it to change... I'm not so sure about the lack of differences with SPEC though.

asb added a comment.Nov 21 2019, 7:34 AM

I note that the TargetFrameLowering hooks canUseAsPrologue and canUseAsEpilogue are both called by the shrink wrapper. They default to true, but targets may need to override this for correctness. Looking at e.g. AArch64, I see it overrides canUseAsPrologue and returns false in the case that the function needs stack realignment and there are no scratch registers available. Are you certain that no such case is needed for RISC-V?

In D62190#1755159, @asb wrote:

I note that the TargetFrameLowering hooks canUseAsPrologue and canUseAsEpilogue are both called by the shrink wrapper. They default to true, but targets may need to override this for correctness. Looking at e.g. AArch64, I see it overrides canUseAsPrologue and returns false in the case that the function needs stack realignment and there are no scratch registers available. Are you certain that no such case is needed for RISC-V?

So I see that the stack realignment implementation for larger values does indeed require the use of a scratch register, so that implies we should attempt to prevent the situation where we don't have one available. However the scratch register is created as a virtual GPR register, rather than attempting to scavenge a real free register. I can imagine that if this was changed, we could directly perform the same check in canUseAsPrologue to guarantee a register being available in a much more transparent and obvious way. EG the way in which AArch64 uses the function findScratchNonCalleeSaveRegister.

In D62190#1755159, @asb wrote:

I note that the TargetFrameLowering hooks canUseAsPrologue and canUseAsEpilogue are both called by the shrink wrapper. They default to true, but targets may need to override this for correctness. Looking at e.g. AArch64, I see it overrides canUseAsPrologue and returns false in the case that the function needs stack realignment and there are no scratch registers available. Are you certain that no such case is needed for RISC-V?

So I see that the stack realignment implementation for larger values does indeed require the use of a scratch register, so that implies we should attempt to prevent the situation where we don't have one available. However the scratch register is created as a virtual GPR register, rather than attempting to scavenge a real free register. I can imagine that if this was changed, we could directly perform the same check in canUseAsPrologue to guarantee a register being available in a much more transparent and obvious way. EG the way in which AArch64 uses the function findScratchNonCalleeSaveRegister.

I'm working on a patch for the realignment implementation based off issues that arose with implementing the ilp32e ABI. I will update this thread when that patch is ready.

apazos added a comment.Dec 3 2019, 4:45 PM

Lewis, try rebase again, not applying cleaning, nor https://reviews.llvm.org/D62686

apazos added a comment.Jan 7 2020, 8:34 AM

Lewis, same question, is the patch final? It would be good to merge it before the 10.0 release branch creation on Jan 15th

This patch is yes, so long as it applies cleanly. So if it's agreed that it should be landed then I can do that as soon as I get a chance.

luismarques accepted this revision.Jan 13 2020, 5:44 AM

Thanks for the patch Lewis. Overall it LGTM, and it would be great to merge this in time for the LLVM 10.0 release.

The patch would benefit from having more extensive tests, although those could be provided by a follow-up patch (but please do follow up). For instance, the current tests don't exercise the if (!MBB.empty()) else condition. It would also be great to include tests that illustrate the impact on debug instructions (even if that has been an area where testing has been limited in the past). See for instance my inline comment for a possible issue regarding correctly updating the debug locations.

llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
261 ↗(On Diff #234072)

This declaration shadows the DL above; I assume you meant to update the previously declared DL.

This revision is now accepted and ready to land.Jan 13 2020, 5:44 AM

Quick fix of shadow variable assignment.

This revision was automatically updated to reflect the committed changes.