For large integers (for example, magic numbers generated by
TargetLowering::BuildSDIV when dividing by constant), we may
need about 4~8 instructions to build them.
In the same time, it just takes two instructions to load
constants (with extra cycles to access memory), so it may be
profitable to put these integers into constant pool.
Details
- Reviewers
asb craig.topper - Commits
- rG41454ab25645: [RISCV] Use constant pool for large integers
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
183 | This makes it sound like the immediate is global which doesn't make sense. selectImmWithConstantPool or selectImmUsingConstantPool would be a better name, I think. | |
184 | MVT should always be passed by value. It's an 8-bit integer. | |
190 | could -> can | |
191 | LD assumes VT is MVT::i64. Should we assert that? | |
530 | Can we call this from inside selectImm instead of returning nullptr? selectImm only has 1 caller right now, but it has add other callers in the past. | |
llvm/lib/Target/RISCV/RISCVSubtarget.cpp | ||
114 | LoadLatency is going to very for different CPUs so saying LoadLatency+1 and then having a hardcoded constant doesn't make sense. I'm not opposed to the hard coded constant, it matches what gcc uses. I just don't like the comment. |
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
191 | This needs a MemOperand to be added. |
llvm/lib/Target/RISCV/RISCVSubtarget.cpp | ||
---|---|---|
113 | Since we can get schedmodel and load latency, is there still need that such API to adapt different subtarget cost? Other body can give some suggestion to talk. |
Thanks for the patch!
This is generating illegal code in some cases, please try the following test case with RV64:
$ cat tc.ll define dso_local fp128 @foo(fp128 %x) { entry: %x.addr = alloca fp128, align 16 store fp128 %x, fp128* %x.addr, align 16 %0 = load fp128, fp128* %x.addr, align 16 %1 = bitcast fp128 %0 to i128 %2 = icmp slt i128 %1, 0 %3 = zext i1 %2 to i64 %cond = select i1 %2, fp128 0xL8469898CC51701B84000921FB54442D1, fp128 0xL00000000000000000000000000000000 ret fp128 %cond } ; Function Attrs: nounwind define dso_local signext i32 @main() { entry: %retval = alloca i32, align 4 store i32 0, i32* %retval, align 4 %call = call fp128 @foo(fp128 0xL0000000000000000BFFF000000000000) %cmp = fcmp une fp128 %call, 0xL8469898CC51701B84000921FB54442D1 br i1 %cmp, label %if.then, label %if.end if.then: ; preds = %entry call void @abort() unreachable if.end: ; preds = %entry ret i32 0 } declare dso_local void @abort()
It produces a ld a3, zero(a3) which is of course invalid.
There's a question of what to do about some of the imm.ll test cases. Perhaps it makes sense to add an option to either change the threshold or disable the constant pool, then some of these tests can still usefully check the materialisation logic and it's easier for people to experiment with different thresholds if their microarch might benefit? I'd be open to counter-arguments that it's not worth the hassle though.
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
137 | This should be getTargetConstant(0, DL, VT) |
I am agree with you and I have added two options (one to disable promotion and another to set threshold).
Currently, promotion is disabled in imm.ll, is there any other tests that should also be disabled?
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
130 | We should probably pass int64_t Imm here and call ConstantInt::get(EVT(VT).getTypeForEVT(CurDAG->getContext(), Imm)` to get the ConstantInt* that getConstantPool needs. That would solve the problems in my other comments. | |
747–748 | N1C does not contain the same constant as ShiftedC1. |
Do we have a test case that shows the bug @asb found with "zero" previously? I skimmed the tests but I might have missed it.
llvm/lib/Target/RISCV/RISCVSubtarget.cpp | ||
---|---|---|
53 | Make this a disable flag that defaults to 0. | |
113 | I'm not sure Promote is clear term in this context. Something like useConstantPoolForLargeInts might be better? | |
llvm/test/CodeGen/RISCV/double-imm.ll | ||
8 | Seems like this TODO is addressed by this patch? |
CodeGen/RISCV/srem-vector-lkk.ll and CodeGen/RISCV/urem-vector-lkk.ll generated such codes, so I think there is no need to add extra tests as these two have covered this case.
GCC will merge constant pool entries with the same value in the whole compilation unit, can LLVM do this too?
It seems that ConstantPool is owned by MachineFunction and can't be shared with other MachineFunctions.
Are there something I missed? Is it possible for LLVM to create module-scoped constant pool that can be used by all functions?
Thanks.
So it is a legacy problem and linker may do the optimization (however I found current GNU ld can't do this).
This LGTM, but please wait for @craig.topper to confirm all his outstanding comments were addressed. You'll need to regenerate test/CodeGen/RISCV/div-by-constant.ll as well.
We should probably pass int64_t Imm here and call ConstantInt::get(EVT(VT).getTypeForEVT(CurDAG->getContext(), Imm)` to get the ConstantInt* that getConstantPool needs. That would solve the problems in my other comments.