This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Enable the LocalStackSlotAllocation pass support
ClosedPublic

Authored by craig.topper on Mar 5 2021, 9:35 PM.

Details

Summary

For RISC-V, load/store(exclude vector load/store) instructions only has a 12 bit immediate operand. If the offset is out-of-range, it must make use of a temp register to make up this offset. If between these offsets, they have a small(IsInt<12>) relative offset, LocalStackSlotAllocation pass can find a value as frame base register's value, and replace the origin offset with this register's value plus the relative offset.

Diff Detail

Event Timeline

StephenFan created this revision.Mar 5 2021, 9:35 PM
StephenFan requested review of this revision.Mar 5 2021, 9:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 5 2021, 9:35 PM
StephenFan edited the summary of this revision. (Show Details)Mar 5 2021, 9:36 PM
craig.topper added inline comments.Mar 5 2021, 9:58 PM
llvm/test/CodeGen/RISCV/local-stack-allocation.ll
1 ↗(On Diff #328731)

Is it possible to pre-commit this test case and only show the changes here in this patch?

craig.topper added inline comments.Mar 5 2021, 10:03 PM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
316

I think vector load/store can have a frameindex operand. There's no immediate field so it will get split out into an ADDI in eliminateFrameIndex, but I think that happens later.

StephenFan added inline comments.Mar 5 2021, 10:10 PM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
316

In the RISCVIselDAGToDAG.cpp:

case ISD::FrameIndex: {
    SDValue Imm = CurDAG->getTargetConstant(0, DL, XLenVT);
    int FI = cast<FrameIndexSDNode>(Node)->getIndex();
    SDValue TFI = CurDAG->getTargetFrameIndex(FI, VT);
    ReplaceNode(Node, CurDAG->getMachineNode(RISCV::ADDI, DL, VT, TFI, Imm));
    return;
  }

So the llvm ir like this :

%addr = alloca i8
call void callee(i8* %addr)

the %addr will be selected to ADDI

craig.topper added inline comments.Mar 5 2021, 10:20 PM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
316

Right. I was just asking why the comment says vector load/store are different?

StephenFan added inline comments.Mar 5 2021, 10:25 PM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
316

Oh, sorry. You're right, the vector load/store can have a FrameIndex Operand. I made a mistake.

StephenFan marked 2 inline comments as done.Mar 5 2021, 10:25 PM

Pre-commit local-stack-allocation.ll test.

StephenFan marked an inline comment as done.Mar 5 2021, 10:32 PM
StephenFan added inline comments.Mar 5 2021, 10:39 PM
llvm/test/CodeGen/RISCV/local-stack-allocation.ll
17 ↗(On Diff #328736)

This instruction will be eliminated in D92479 which I will land it soon after.

Fix comment error.

asb requested changes to this revision.Apr 1 2021, 3:37 AM

Could you please rebase this to account for D98716?

I _think_ the right fix is to change the call for needsStackRealignment to shouldRealignStack. But I'm seeing multiple runtime failures for the GCC torture suite with the patch applied. e.g. 20031012-1.c at O0 for rv32imafdc ilp32.

This revision now requires changes to proceed.Apr 1 2021, 3:37 AM
StephenFan updated this revision to Diff 336311.Apr 8 2021, 9:51 PM

Fix error.

In D98101#2663432, @asb wrote:

Could you please rebase this to account for D98716?

I _think_ the right fix is to change the call for needsStackRealignment to shouldRealignStack. But I'm seeing multiple runtime failures for the GCC torture suite with the patch applied. e.g. 20031012-1.c at O0 for rv32imafdc ilp32.

Hi @asb . I wrote a script to compile and run the gcc c torture tests on qemu. According to the test result and as you said, multiple runtime failures appeared. However, 20031012-1.c at O0 for rv32imafdc ilp32 didn't fail on qemu. And I have fixed the failures and updated the patch. Can you help me testing the gcc c torture test again? Thanks!

asb added a comment.Apr 15 2021, 8:06 AM

The GCC torture suite now gets a 100% pass rate for me - thanks for fixing. I've left various minor notes about comment phrasing and formatting etc.

I don't feel I've stepped through the logic of the patch carefully enough yet, but hopefully the suggested edits are actionable in the meantime.

llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
298

No need to copy this doc comment TargetRegisterInfo.h.

305

returns => Returns

307

I think this sentence needs rewriting - I can't quite follow it.

312

clang-format prefers this with a space after the first ;

316

This thread is marked as done, but the comment still same to make the same claim that vector load/store don't have a frameindex?

335

I think: "the maximum possible offset relative to the frame pointer."?

341

bytes => byte and "maximum possible offset relative to the stack pointer." I think

357

Nit: end with full stop

376

Nit: re-wrap and end sentence in full stop.

386

Nit: end with full stop

394

Nit: end with full stop.

Address @asb 's comment

StephenFan marked an inline comment as done.

Delete "that" in comment

StephenFan marked 9 inline comments as done.Apr 16 2021, 11:50 PM
StephenFan added inline comments.
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
307

emm, This sentence is copied from TargetRegisterInfo.h. It means that, For RISCV, if a frame index operand has a Offset that out-of-range 12 bits, this function will return true to indicate this frame index operand needs a frame base register.

asb added inline comments.May 13 2021, 12:50 AM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
307

Yes that was my bad - the sentence does indeed make sense as written.

I've benchmarked the impact of this patch with CoreMark and Embench and it looked good. In summary, there were no or minimal performance differences but there were various small improvements to code size, which are probably similar to your test case.
I was going to add other checks like running the llvm test suite but the patch no longer applies. Can you refresh it?

jrtc27 added inline comments.May 13 2021, 7:43 AM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
317–321

Can we not just look and see if it both has an FI and is an I or S type instruction? This isn't really flexible, and I don't see why it's needed either (and I'd have to more than double the size of this switch statement downstream in CHERI-LLVM to add both capability base and capability value instructions).

332–334

What about floating point registers? Can we also generalise this to iterate over the list of saved registers rather than copying the ABI to yet another place (and thus yet another place we'd have to change downstream in CHERI-LLVM)?

339

Why 128? A hard-coded constant common across all ISAs seems fishy.

379

We use the less-confusing name FIOperandNum elsewhere

381

The copies of this in RegisterScavenging and other backends have assert(i < MI.getNumOperands() && "Instr doesn't have FrameIndex operand!");

383–385

I don't see why the first part matters. Only include the relevant information, that "FrameIndex operands are always represented as a register followed by an immediate".

386

This variable seems pointless to me, just inline the +1

llvm/test/CodeGen/RISCV/local-stack-allocation.ll
30 ↗(On Diff #338495)

Please fix this test, either before you pre-commit it or now as another pre-commit if you've already done so.

craig.topper added inline comments.Sep 29 2022, 1:09 PM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
316

I think vector load/store no longer have frame indices.

317–321

This is missing LBU, LHU, and LWU.

I agree with @jrtc27 we should check the I or S type instead.

330

No need to say 'int' llvm almost always uses unsigned by itself.

332–334

Are X3 and X4 really callee saved to the stack? They're reserved so I don't think we ever spill them.

Address Comments.

StephenFan marked 9 inline comments as done.Sep 30 2022, 8:24 AM

Improve comments.

StephenFan added inline comments.Sep 30 2022, 8:48 AM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
339

This value should be an experimental target-dependent value. But I don't know yet which value is appropriate for riscv.

craig.topper added inline comments.Sep 30 2022, 10:26 AM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
332

Is ReservedByUser not already cover by getReservedRegs?

339

Can we copy the comments from ARM and AArch64 here including the FIXME for 128?

StephenFan updated this revision to Diff 464553.Oct 2 2022, 7:38 AM
  1. Remove check user reserved regsiter list.
  2. Improve comments.
craig.topper added inline comments.Oct 4 2022, 9:52 PM
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
399

Should we assert that it is I or S format?

StephenFan updated this revision to Diff 465382.Oct 5 2022, 6:58 AM

Add assertion.

I think we need this change from D134851 to match SVE from D83859

 bool isStackIdSafeForLocalArea(unsigned StackId) const override {
  // We don't support putting RVV objects into the pre-allocated local
  // frame block at the moment.
  return StackId != TargetStackID::ScalableVector;
}

Implement target hook.

craig.topper added inline comments.Oct 12 2022, 10:48 AM
llvm/test/CodeGen/RISCV/local-stack-slot-allocation.ll
7–8

Remove TODO

Remove TODO.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 19 2022, 1:16 AM
This revision was automatically updated to reflect the committed changes.

I'm seeing failures in the llvm-testsuite on riscv64-unknown-linux-gnu with -O2 (no vectors) which git bisect attributes to this change.

Failed Tests (3):
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test
  test-suite :: SingleSource/UnitTests/matrix-types-spec.test

Has anyone seen something similar?

I'm seeing failures in the llvm-testsuite on riscv64-unknown-linux-gnu with -O2 (no vectors) which git bisect attributes to this change.

Failed Tests (3):
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test
  test-suite :: SingleSource/UnitTests/matrix-types-spec.test

Has anyone seen something similar?

I think we're seeing those failures too.

craig.topper reopened this revision.Nov 1 2022, 8:21 PM
This comment was removed by dnpetrov-sc.
LiDongjin removed a subscriber: LiDongjin.
LiDongjin commandeered this revision.Nov 27 2022, 7:12 PM
LiDongjin added a reviewer: StephenFan.

Fix the failures in the llvm-testsuite on riscv64-unknown-linux-gnu with -O2 (no vectors).

LiDongjin updated this revision to Diff 478127.Nov 27 2022, 7:24 PM

For the failures in the llvm-testsuite, it seems some problems for Add instruction which generates wrong frame index.
Also in Arm and AArch64, No need to support for Add instruction.
Therefore load/store(exclude vector load/store) instruction is enough for LocalStackSlotAllocation pass.

If it would be better to add a test case that tests we don't do local stack slot allocation on ADDI instruction?

LiDongjin updated this revision to Diff 478853.Nov 30 2022, 1:35 AM

Reverse ping. Can we commit this?

This revision was not accepted when it landed; it landed in state Needs Review.Dec 21 2022, 12:46 AM
This revision was automatically updated to reflect the committed changes.

I'm seeing failures on the llvm testsuite after this change (scalar only).

Failed Tests (3):
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test
  test-suite :: SingleSource/UnitTests/matrix-types-spec.test

Does anyone else see them too? Thanks!

I'm seeing failures on the llvm testsuite after this change (scalar only).

Failed Tests (3):
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test
  test-suite :: SingleSource/UnitTests/matrix-types-spec.test

Does anyone else see them too? Thanks!

Are those the exact same tests you reported last time?

rogfer01 added a comment.EditedDec 22 2022, 11:30 PM

I'm seeing failures on the llvm testsuite after this change (scalar only).

Failed Tests (3):
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test
  test-suite :: SingleSource/UnitTests/matrix-types-spec.test

Does anyone else see them too? Thanks!

Are those the exact same tests you reported last time?

Yes, I see the same failures. I admit I'm a bit puzzled.

Might be a problem on our side, so if someone can reproduce it too, that'd be helpful.

craig.topper reopened this revision.Dec 25 2022, 12:58 PM

I've reverted the patch so I'm reopening.

craig.topper added a comment.EditedJan 5 2023, 12:49 PM

It looks like maybe we're somehow creating load/stores with negative immediates that are supposed to be large positive immediates.

For example,
I see s9 pointing to sp+2152. A store that is supposed to be accessing sp+4288 was created with s9-1960. If we consider that -1960 as a uimm12 it would be +2136. 2152+2136 is the 4288 we were after.

This is backed up by running -verify-machineinstrs which generates a ton of errors about immediates that are uimm12 instead of simm12.

I think we failed to consider the offset in the store itself in RISCVRegisterInfo::isFrameOffsetLegal

craig.topper commandeered this revision.Jan 5 2023, 3:17 PM
craig.topper edited reviewers, added: LiDongjin; removed: craig.topper.

Consider the instruction's local offset in isFrameOffsetLegal.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 6 2023, 9:54 AM
This revision was automatically updated to reflect the committed changes.
LiDongjin reopened this revision.Jan 6 2023, 5:58 PM

modify the isFrameOffsetLegal to (Offset <= maxIntN(12)) && (Offset >= minIntN(12)),
Simply constrain offset to be within the range of 12-bit unsigned integers.

@LiDongjin I made changes and re-commited this already. Are you proposing additional changes?

LiDongjin commandeered this revision.Jan 6 2023, 6:02 PM
LiDongjin edited reviewers, added: craig.topper; removed: LiDongjin.
LiDongjin updated this revision to Diff 487025.Jan 6 2023, 6:05 PM
This comment was removed by LiDongjin.
jrtc27 added a comment.Jan 6 2023, 6:06 PM

modify the isFrameOffsetLegal to (Offset <= maxIntN(12)) && (Offset >= minIntN(12)),

That’s what isIntN<12>(Offset) does

Simply constrain offset to be within the range of 12-bit unsigned integers.

Signed, not unsigned, which is already done

Sorry I miss the message, I have no additional changes @craig.topper .

jrtc27 added a comment.Jan 6 2023, 6:08 PM

Can we please have the diff reverted back to what was committed and closed? This is a mess now…

LiDongjin abandoned this revision.Jan 6 2023, 6:09 PM
LiDongjin updated this revision to Diff 487026.Jan 6 2023, 6:12 PM

Just change the isFrameOffsetLegal like:

return (Offset <= maxIntN(12)) && (Offset >= minIntN(12));
LiDongjin abandoned this revision.Jan 6 2023, 6:14 PM
craig.topper commandeered this revision.Jan 6 2023, 6:24 PM
craig.topper edited reviewers, added: LiDongjin; removed: craig.topper.

Commandeering so I can fix the history

craig.topper reclaimed this revision.Jan 6 2023, 6:24 PM
craig.topper accepted this revision.

Well I guess I can't reclose this properly as long as @asb has a blocking review.

Restoring to commited version

This revision is now accepted and ready to land.Jan 6 2023, 6:29 PM
craig.topper closed this revision.Jan 6 2023, 6:30 PM

Bumped @asb from reviewers to unblock.

This should be back to the version commited in 4554663bc0da71d61ab488641c95ef98430cb451