Currently, AND is used for zero extension when neither Zbb nor Zbp is enabled.
It may be better to use shift operations if the trailing-ones mask exceeds simm12.
This patch optimizes LUI + ADDI + AND to SLLI + SRLI.
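The equivalence behind this rewrite can be sketched in C. This is only an illustration assuming 64-bit registers (RV64); the helper names (`is_trailing_ones_mask`, `exceeds_simm12`, `mask_width`, `zext_and`, `zext_shift`) are hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if m is a contiguous run of ones starting at bit 0 (e.g. 0xffff). */
static bool is_trailing_ones_mask(uint64_t m) {
    return m != 0 && (m & (m + 1)) == 0;
}

/* True if a non-negative mask is too large for a 12-bit signed immediate,
   i.e. it cannot be encoded directly in ANDI (maximum positive value 2047). */
static bool exceeds_simm12(uint64_t mask) {
    return mask > 0x7ff;
}

/* Number of one bits in a trailing-ones mask. */
static unsigned mask_width(uint64_t mask) {
    unsigned w = 0;
    while (mask != 0) {
        w++;
        mask >>= 1;
    }
    return w;
}

/* Zero extension via AND; the mask must first be built in a register
   (LUI + ADDI) when it exceeds simm12. */
static uint64_t zext_and(uint64_t x, uint64_t mask) {
    return x & mask;
}

/* Zero extension via shifts: shift left by (64 - width), then shift
   right logically by the same amount (SLLI + SRLI). width is in 1..64. */
static uint64_t zext_shift(uint64_t x, unsigned width) {
    unsigned sh = 64 - width;
    return (x << sh) >> sh;
}
```

For a mask such as 0xffff, `zext_shift(x, mask_width(0xffff))` yields the same value as `zext_and(x, 0xffff)` without needing the constant in a register.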
Differential D116720
[RISCV] Use shift for zero extension when Zbb and Zbp are not enabled
Closed, Public. Authored by Luhaocong on Jan 5 2022, 7:41 PM.
The AND could be better for loops if LICM can move the constant materialization out of the loop. I've wondered about doing this as a machine IR peephole after LICM has had its chance, but I haven't spent any time on it.

I am also concerned about the case where 0xffff has multiple uses, such as

    unsigned short foo(unsigned short a, unsigned short b, int c, int d) {
      return (a >> c) + (b >> d);
    }

Here only one immediate 0xffff is materialized:

    foo:
      srl a0, a0, a2
      srl a1, a1, a3
      add a0, a0, a1
      lui a1, 16
      addi a1, a1, -1
      and a0, a0, a1
      ret

Using a pass seems to be better; two cases should be excluded,

What's more, the pass can be generalized to a common immediate optimization pass that combines several small optimization rules, just like AArch64's AArch64MIPeepholeOpt pass. However, we need not make it so complex in the current patch; that would be something to do in the future.
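The multiple-use concern can be made concrete with a rough count of static instructions. The sketch below is a hypothetical cost model, not something taken from the patch: it assumes the mask is materialized once with LUI + ADDI and shared by every AND, while the shift form spends SLLI + SRLI at each use. It counts instructions only and ignores latency, register pressure, and hoisting out of loops.

```c
/* Hypothetical cost model: static instruction counts only, not cycles. */

/* LUI + ADDI once to build the mask, then one AND per use. */
static unsigned cost_and_sequence(unsigned uses) {
    return uses == 0 ? 0 : 2 + uses;
}

/* SLLI + SRLI at every use; no constant needs to be materialized. */
static unsigned cost_shift_sequence(unsigned uses) {
    return 2 * uses;
}
```

Under these assumptions the shift form wins for a single use (2 instructions against 3), ties at two uses, and loses from three uses onward, which is one way to see why a one-use check matters.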
Would it be better as follows?
Luhaocong retitled this revision from [RISCV] Use shift for zext.h when Zbb and Zbp are not enabled to [RISCV] Use shift for zero extension when Zbb and Zbp are not enabled.

Generalise this optimization

PeepholeOptimizer is after LICM so that would be too late. It would need to be before LICM. Another thing I just realized. MachineIR LICM doesn't visit every loop, just the outermost loop with a preheader. So LICM would probably move the LUI+ADDI as far out as it can. And rematerialization in register allocation isn't powerful enough to bring it back in when necessary to avoid a spill since it is two instructions.

This revision is now accepted and ready to land. Jan 10 2022, 10:12 AM
This revision was landed with ongoing or failed builds. Jan 10 2022, 6:40 PM
Closed by commit rGbd653f6406e7: [RISCV] Use shift for zero extension when Zbb and Zbp are not enabled (authored by Luhaocong, committed by benshi001).
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 398584
llvm/lib/Target/RISCV/RISCVInstrInfo.td
llvm/test/CodeGen/RISCV/alu16.ll
llvm/test/CodeGen/RISCV/and.ll
llvm/test/CodeGen/RISCV/atomic-rmw.ll
llvm/test/CodeGen/RISCV/atomic-signext.ll
llvm/test/CodeGen/RISCV/bswap-ctlz-cttz-ctpop.ll
llvm/test/CodeGen/RISCV/calling-conv-half.ll
llvm/test/CodeGen/RISCV/calling-conv-ilp32-ilp32f-ilp32d-common.ll
llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll
llvm/test/CodeGen/RISCV/copysign-casts.ll
llvm/test/CodeGen/RISCV/div.ll
llvm/test/CodeGen/RISCV/double-arith.ll
llvm/test/CodeGen/RISCV/double-bitmanip-dagcombines.ll
llvm/test/CodeGen/RISCV/double-intrinsics.ll
llvm/test/CodeGen/RISCV/float-arith.ll
llvm/test/CodeGen/RISCV/float-bit-preserving-dagcombines.ll
llvm/test/CodeGen/RISCV/float-bitmanip-dagcombines.ll
llvm/test/CodeGen/RISCV/float-intrinsics.ll
llvm/test/CodeGen/RISCV/half-arith.ll
llvm/test/CodeGen/RISCV/half-bitmanip-dagcombines.ll
llvm/test/CodeGen/RISCV/half-convert-strict.ll
llvm/test/CodeGen/RISCV/half-convert.ll
llvm/test/CodeGen/RISCV/half-intrinsics.ll
llvm/test/CodeGen/RISCV/rem.ll
llvm/test/CodeGen/RISCV/rv32zbb.ll
llvm/test/CodeGen/RISCV/rv32zbp.ll
llvm/test/CodeGen/RISCV/rv32zbs.ll
llvm/test/CodeGen/RISCV/rv64zbb.ll
llvm/test/CodeGen/RISCV/rv64zbp.ll
llvm/test/CodeGen/RISCV/rv64zbs.ll
llvm/test/CodeGen/RISCV/rv64zfh-half-convert.ll
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-int-vp.ll
llvm/test/CodeGen/RISCV/rvv/vreductions-int-vp.ll
llvm/test/CodeGen/RISCV/sext-zext-trunc.ll
llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll
llvm/test/CodeGen/RISCV/urem-seteq-illegal-types.ll
Make this a PatLeaf like AddiPair and move the one use check in here to get rid of and_const_oneuse.
I think you can then do
No need for Subtarget check.