When AMOs are used to implement parallel reduction operations, typically the return value would be discarded.
This patch adds a peephole pass RISCVDeadRegisterDefinitions. It rewrites rd to x0 when rd is marked as dead.
It may improve the register allocation and reduce pipeline hazards on CPUs without register renaming and OOO.
Comparison with GCC: https://godbolt.org/z/bKaxnEcec
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Indeed. But I could not find any pattern like lw\tzero in the GCC testsuite. Actually it also applies to sc.w/d and amocas.w/d/q. This pass doesn't handle them since they are not common.
Could you please show me some real-world examples?
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | ||
---|---|---|
420 | Does this depend on DetectDeadLanes? AArch64 has a pass called AArch64DeadRegisterDefinitions that is inserted in addPreRegAlloc. |
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | ||
---|---|---|
420 | Yes, it currently depends on DetectDeadLanes to mark dead defs. |
At a macro level, this could be generalized to be target independent (as AArch64 also has a zero register), but I support generalizing this within RISCV first.
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp | ||
---|---|---|
66 | I don't think we need to worry about the reserved nops, as they won't have an MI representation will they? Unless we've implemented one, and we could just guard explicitly for the reserved nop cases? | |
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | ||
411–414 | If you place your new pass after InsertVSETVLI, you should be able to delete this block of code from that pass. // Once we're fully done rewriting all the instructions, do a final pass // through to check for VSETVLIs which write to an unused destination. // For the non X0, X0 variant, we can replace the destination register // with X0 to reduce register pressure. This is really a generic // optimization which can be applied to any dead def (TODO: generalize). for (MachineBasicBlock &MBB : MF) { for (MachineInstr &MI : MBB) { if (MI.getOpcode() == RISCV::PseudoVSETVLI || MI.getOpcode() == RISCV::PseudoVSETIVLI) { Register VRegDef = MI.getOperand(0).getReg(); if (VRegDef != RISCV::X0 && MRI->use_nodbg_empty(VRegDef)) MI.getOperand(0).setReg(RISCV::X0); } } } | |
llvm/test/CodeGen/RISCV/atomic-rmw-discard.ll | ||
1 | Please go ahead and precommit the new test file so that we can see the change being made in this review. |
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp | ||
---|---|---|
66 | I think this was trying to prevent turning a regular instruction like add into a reserved nop encoding? If for some reason dead code elimination didn't already delete a dead add. | |
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | ||
411–414 | We'd have to remove the mayLoad check I think, but then we'd have to block changing PseudoVSETVLIX0? |
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp | ||
---|---|---|
67 | I cannot handle PseudoVSETVLI/PseudoVSETIVLI now since the diff is too large (about 300+ KLOCs with full context). | |
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll | ||
12396 ↗ | (On Diff #554599) | I don't know why DCE does not remove this dead instruction. |
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp | ||
---|---|---|
67 | If you're getting a lot of changes, it's probably a bug. It's not safe to change the destination to X0 for RISCV::PseudoVSETVLIX0 which is a different instruction than PseudoVSETVLI/PseudoVSETIVLI. Without an explicit check for it, that instruction it would be included. | |
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll | ||
12396 ↗ | (On Diff #554622) | I don't know why it didn't get removed either, but its not ok for this pass to change this instruction. |
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll | ||
---|---|---|
12396 ↗ | (On Diff #554622) | It only became dead after InsertVSETVLI and we don't run DCE after that. @craig.topper Why isn't this legal? If the instruction is dead, this does appear to be a valid instruction encoding? I don't see anything in the spec about addi zero, zero, imm being reserved or anything. |
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll | ||
---|---|---|
12396 ↗ | (On Diff #554622) | Isn't it one of the hint encodings? It won't crash, but it might cause unintended effects from hint. If it was an ORI it could be a non-temporal hint for example. If it was an AUIPC it would become a landing pad from the CFI proposal. |
This may not be much of an issue in practice, but I'd note that rewriting rd to x0 can sometimes result in a non-compressible instruction when previously it was compressible (e.g. rewriting loads).
I agree with the concern about this unwittingly introducing new HINT instructions. Although we expect instructions without side-effects that aren't caught elsewhere to be very rare, wouldn't it make sense to go ahead and delete them in this case? That said, I'm not sure our current modeling of CSR instructions would handle this properly for a theoretical architecture that had side-effecting CSR reads (e.g. a FIFO accessed via a CSR).
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll | ||
---|---|---|
12396 ↗ | (On Diff #554622) | Agreed this is undesirable for HINT instructions. These are standard RV32I instructions that do not alter architectural state, so of course very commonly have rd==x0. |
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir | ||
---|---|---|
365 | This looks like a regression |
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir | ||
---|---|---|
365 | It will be addressed by the following pass RISCVDeadRegisterDefinitions. This test only runs the pass RISCVInsertVSETVLI. |
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp | ||
---|---|---|
93 | Why are we using MRI->getRegClass instead of TII->getRegClass like AArch64? |
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp | ||
---|---|---|
93 | It protects defs with reg class GPRNoX0 being rewritten to x0. I have a separate patch to address this: https://github.com/llvm/llvm-project/pull/65934 |
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp | ||
---|---|---|
93 | Does the `!Desc.mayLoad() && !Desc.mayStore() && !Desc.hasUnmodeledSideEffects()` check already prevent that? |
I don't think we need to worry about the reserved nops, as they won't have an MI representation will they? Unless we've implemented one, and we could just guard explicitly for the reserved nop cases?