For RISC-V, load/store(exclude vector load/store) instructions only has a 12 bit immediate operand. If the offset is out-of-range, it must make use of a temp register to make up this offset. If between these offsets, they have a small(IsInt<12>) relative offset, LocalStackSlotAllocation pass can find a value as frame base register's value, and replace the origin offset with this register's value plus the relative offset.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/RISCV/local-stack-allocation.ll | ||
---|---|---|
1 ↗ | (On Diff #328731) | Is it possible to pre-commit this test case and only show the changes here in this patch? |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
519 | I think vector load/store can have a frameindex operand. There's no immediate field so it will get split out into an ADDI in eliminateFrameIndex, but I think that happens later. |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
519 | In the RISCVIselDAGToDAG.cpp: case ISD::FrameIndex: { SDValue Imm = CurDAG->getTargetConstant(0, DL, XLenVT); int FI = cast<FrameIndexSDNode>(Node)->getIndex(); SDValue TFI = CurDAG->getTargetFrameIndex(FI, VT); ReplaceNode(Node, CurDAG->getMachineNode(RISCV::ADDI, DL, VT, TFI, Imm)); return; } So the llvm ir like this : %addr = alloca i8 call void callee(i8* %addr) the %addr will be selected to ADDI |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
519 | Right. I was just asking why the comment says vector load/store are different? |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
519 | Oh, sorry. You're right, the vector load/store can have a FrameIndex Operand. I made a mistake. |
Could you please rebase this to account for D98716?
I _think_ the right fix is to change the call for needsStackRealignment to shouldRealignStack. But I'm seeing multiple runtime failures for the GCC torture suite with the patch applied. e.g. 20031012-1.c at O0 for rv32imafdc ilp32.
Hi @asb . I wrote a script to compile and run the gcc c torture tests on qemu. According to the test result and as you said, multiple runtime failures appeared. However, 20031012-1.c at O0 for rv32imafdc ilp32 didn't fail on qemu. And I have fixed the failures and updated the patch. Can you help me testing the gcc c torture test again? Thanks!
The GCC torture suite now gets a 100% pass rate for me - thanks for fixing. I've left various minor notes about comment phrasing and formatting etc.
I don't feel I've stepped through the logic of the patch carefully enough yet, but hopefully the suggested edits are actionable in the meantime.
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
501 | No need to copy this doc comment TargetRegisterInfo.h. | |
508 | returns => Returns | |
510 | I think this sentence needs rewriting - I can't quite follow it. | |
515 | clang-format prefers this with a space after the first ; | |
519 | This thread is marked as done, but the comment still same to make the same claim that vector load/store don't have a frameindex? | |
538 | I think: "the maximum possible offset relative to the frame pointer."? | |
544 | bytes => byte and "maximum possible offset relative to the stack pointer." I think | |
560 | Nit: end with full stop | |
579 | Nit: re-wrap and end sentence in full stop. | |
589 | Nit: end with full stop | |
597 | Nit: end with full stop. |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
510 | emm, This sentence is copied from TargetRegisterInfo.h. It means that, For RISCV, if a frame index operand has a Offset that out-of-range 12 bits, this function will return true to indicate this frame index operand needs a frame base register. |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
510 | Yes that was my bad - the sentence does indeed make sense as written. |
I've benchmarked the impact of this patch with CoreMark and Embench and it looked good. In summary, there were no or minimal performance differences but there were various small improvements to code size, which are probably similar to your test case.
I was going to add other checks like running the llvm test suite but the patch no longer applies. Can you refresh it?
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
520–524 | Can we not just look and see if it both has an FI and is an I or S type instruction? This isn't really flexible, and I don't see why it's needed either (and I'd have to more than double the size of this switch statement downstream in CHERI-LLVM to add both capability base and capability value instructions). | |
535–537 | What about floating point registers? Can we also generalise this to iterate over the list of saved registers rather than copying the ABI to yet another place (and thus yet another place we'd have to change downstream in CHERI-LLVM)? | |
542 | Why 128? A hard-coded constant common across all ISAs seems fishy. | |
582 | We use the less-confusing name FIOperandNum elsewhere | |
584 | The copies of this in RegisterScavenging and other backends have assert(i < MI.getNumOperands() && "Instr doesn't have FrameIndex operand!"); | |
586–588 | I don't see why the first part matters. Only include the relevant information, that "FrameIndex operands are always represented as a register followed by an immediate". | |
589 | This variable seems pointless to me, just inline the +1 | |
llvm/test/CodeGen/RISCV/local-stack-allocation.ll | ||
30 ↗ | (On Diff #338495) | Please fix this test, either before you pre-commit it or now as another pre-commit if you've already done so. |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
519 | I think vector load/store no longer have frame indices. | |
520–524 | This is missing LBU, LHU, and LWU. I agree with @jrtc27 we should check the I or S type instead. | |
533 | No need to say 'int' llvm almost always uses unsigned by itself. | |
535–537 | Are X3 and X4 really callee saved to the stack? They're reserved so I don't think we ever spill them. |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
542 | This value should be an experimental target-dependent value. But I don't know yet which value is appropriate for riscv. |
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp | ||
---|---|---|
602 | Should we assert that it is I or S format? |
llvm/test/CodeGen/RISCV/local-stack-slot-allocation.ll | ||
---|---|---|
7–8 | Remove TODO |
I'm seeing failures in the llvm-testsuite on riscv64-unknown-linux-gnu with -O2 (no vectors) which git bisect attributes to this change.
Failed Tests (3): test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test test-suite :: SingleSource/UnitTests/matrix-types-spec.test
Has anyone seen something similar?
Fix the failures in the llvm-testsuite on riscv64-unknown-linux-gnu with -O2 (no vectors).
For the failures in the llvm-testsuite, it seems some problems for Add instruction which generates wrong frame index.
Also in Arm and AArch64, No need to support for Add instruction.
Therefore load/store(exclude vector load/store) instruction is enough for LocalStackSlotAllocation pass.
If it would be better to add a test case that tests we don't do local stack slot allocation on ADDI instruction?
I'm seeing failures on the llvm testsuite after this change (scalar only).
Failed Tests (3): test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test test-suite :: SingleSource/UnitTests/matrix-types-spec.test
Does anyone else see them too? Thanks!
Yes, I see the same failures. I admit I'm a bit puzzled.
Might be a problem on our side, so if someone can reproduce it too, that'd be helpful.
It looks like maybe we're somehow creating load/stores with negative immediates that are supposed to be large positive immediates.
For example,
I see s9 pointing to sp+2152. A store that is supposed to be accessing sp+4288 was created with s9-1960. If we consider that -1960 as a uimm12 it would be +2136. 2152+2136 is the 4288 we were after.
This is backed up by running -verify-machineinstrs which generates a ton of errors about immediates that are uimm12 instead of simm12.
I think we failed to consider the offset in the store itself in RISCVRegisterInfo::isFrameOffsetLegal
modify the isFrameOffsetLegal to (Offset <= maxIntN(12)) && (Offset >= minIntN(12)),
Simply constrain offset to be within the range of 12-bit unsigned integers.
@LiDongjin I made changes and re-commited this already. Are you proposing additional changes?
That’s what isIntN<12>(Offset) does
Simply constrain offset to be within the range of 12-bit unsigned integers.
Signed, not unsigned, which is already done
Can we please have the diff reverted back to what was committed and closed? This is a mess now…
Just change the isFrameOffsetLegal like:
return (Offset <= maxIntN(12)) && (Offset >= minIntN(12));
Bumped @asb from reviewers to unblock.
This should be back to the version commited in 4554663bc0da71d61ab488641c95ef98430cb451
No need to copy this doc comment TargetRegisterInfo.h.