This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add a pass to rewrite rd to x0 for non-computational instrs whose return values are unused
ClosedPublic

Authored by dtcxzyw on Aug 24 2023, 10:50 AM.

Details

Summary

When AMOs are used to implement parallel reduction operations, typically the return value would be discarded.
This patch adds a peephole pass RISCVDeadRegisterDefinitions. It rewrites rd to x0 when rd is marked as dead.
It may improve the register allocation and reduce pipeline hazards on CPUs without register renaming and OOO.
Comparison with GCC: https://godbolt.org/z/bKaxnEcec

Diff Detail

Event Timeline

dtcxzyw created this revision.Aug 24 2023, 10:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 24 2023, 10:50 AM
dtcxzyw requested review of this revision.Aug 24 2023, 10:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 24 2023, 10:50 AM

This also applies to volatile regular loads.

This also applies to volatile regular loads.

Indeed. But I could not find any pattern like lw\tzero in the GCC testsuite. Actually it also applies to sc.w/d and amocas.w/d/q. This pass doesn't handle them since they are not common.
Could you please show me some real-world examples?

craig.topper added inline comments.Aug 24 2023, 11:46 AM
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
409

Does this depend on DetectDeadLanes?

AArch64 has a pass called AArch64DeadRegisterDefinitions that is inserted in addPreRegAlloc.

dtcxzyw added inline comments.Aug 24 2023, 11:58 AM
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
409

Yes, it currently depends on DetectDeadLanes to mark dead defs.
AArch64DeadRegisterDefinitions is better than this implementation.
I will update the implementation soon. Thank you!

dtcxzyw updated this revision to Diff 553440.Aug 25 2023, 4:53 AM
dtcxzyw retitled this revision from [RISCV] Add a pass to rewrite rd to x0 for AMO instrs whose return values are unused to [RISCV] Add a pass to rewrite rd to x0 for non-computational instrs whose return values are unused.
  • Rebase
  • Handle volatile loads

At a macro level, this could be generalized to be target independent (as AArch64 also has a zero register), but I support generalizing this within RISCV first.

llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
66

I don't think we need to worry about the reserved nops, as they won't have an MI representation will they? Unless we've implemented one, and we could just guard explicitly for the reserved nop cases?

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
400–403

If you place your new pass after InsertVSETVLI, you should be able to delete this block of code from that pass.

// Once we're fully done rewriting all the instructions, do a final pass
// through to check for VSETVLIs which write to an unused destination.
// For the non X0, X0 variant, we can replace the destination register
// with X0 to reduce register pressure.  This is really a generic
// optimization which can be applied to any dead def (TODO: generalize).
for (MachineBasicBlock &MBB : MF) {
  for (MachineInstr &MI : MBB) {
    if (MI.getOpcode() == RISCV::PseudoVSETVLI ||
        MI.getOpcode() == RISCV::PseudoVSETIVLI) {
      Register VRegDef = MI.getOperand(0).getReg();
      if (VRegDef != RISCV::X0 && MRI->use_nodbg_empty(VRegDef))
        MI.getOperand(0).setReg(RISCV::X0);
    }
  }
}
llvm/test/CodeGen/RISCV/atomic-rmw-discard.ll
1

Please go ahead and precommit the new test file so that we can see the change being made in this review.

craig.topper added inline comments.Aug 29 2023, 12:20 PM
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
66

I think this was trying to prevent turning a regular instruction like add into a reserved nop encoding? If for some reason dead code elimination didn't already delete a dead add.

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
400–403

We'd have to remove the mayLoad check I think, but then we'd have to block changing PseudoVSETVLIX0?

dtcxzyw updated this revision to Diff 554599.Aug 30 2023, 12:19 AM
dtcxzyw retitled this revision from [RISCV] Add a pass to rewrite rd to x0 for non-computational instrs whose return values are unused to [RISCV] Add a pass to rewrite rd to x0 for instrs whose return values are unused.
dtcxzyw edited the summary of this revision. (Show Details)
  • Rebase
  • Handle computational instrs
dtcxzyw marked 3 inline comments as done.Aug 30 2023, 12:19 AM
dtcxzyw added inline comments.Aug 30 2023, 12:28 AM
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
67

I cannot handle PseudoVSETVLI/PseudoVSETIVLI now since the diff is too large (about 300+ KLOCs with full context).
Could I remove this check after committing this patch?

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
12396 ↗(On Diff #554599)

I don't know why DCE does not remove this dead instruction.

dtcxzyw updated this revision to Diff 554622.Aug 30 2023, 1:15 AM
  • Rebase on the top of pre-commit tests
  • Drop curly braces
dtcxzyw marked an inline comment as done.Aug 30 2023, 1:15 AM
craig.topper added inline comments.Aug 30 2023, 8:58 AM
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
67

If you're getting a lot of changes, it's probably a bug. It's not safe to change the destination to X0 for RISCV::PseudoVSETVLIX0 which is a different instruction than PseudoVSETVLI/PseudoVSETIVLI. Without an explicit check for it, that instruction it would be included.

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
12396 ↗(On Diff #554622)

I don't know why it didn't get removed either, but its not ok for this pass to change this instruction.

reames added inline comments.Aug 31 2023, 10:34 AM
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
12396 ↗(On Diff #554622)

It only became dead after InsertVSETVLI and we don't run DCE after that.

@craig.topper Why isn't this legal? If the instruction is dead, this does appear to be a valid instruction encoding? I don't see anything in the spec about addi zero, zero, imm being reserved or anything.

craig.topper added inline comments.Aug 31 2023, 10:45 AM
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
12396 ↗(On Diff #554622)

Isn't it one of the hint encodings? It won't crash, but it might cause unintended effects from hint. If it was an ORI it could be a non-temporal hint for example. If it was an AUIPC it would become a landing pad from the CFI proposal.

asb added a comment.Sep 5 2023, 7:10 AM

This may not be much of an issue in practice, but I'd note that rewriting rd to x0 can sometimes result in a non-compressible instruction when previously it was compressible (e.g. rewriting loads).

I agree with the concern about this unwittingly introducing new HINT instructions. Although we expect instructions without side-effects that aren't caught elsewhere to be very rare, wouldn't it make sense to go ahead and delete them in this case? That said, I'm not sure our current modeling of CSR instructions would handle this properly for a theoretical architecture that had side-effecting CSR reads (e.g. a FIFO accessed via a CSR).

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
12396 ↗(On Diff #554622)

Agreed this is undesirable for HINT instructions. These are standard RV32I instructions that do not alter architectural state, so of course very commonly have rd==x0.

dtcxzyw updated this revision to Diff 556338.Sep 8 2023, 11:17 PM
dtcxzyw retitled this revision from [RISCV] Add a pass to rewrite rd to x0 for instrs whose return values are unused to [RISCV] Add a pass to rewrite rd to x0 for non-computational instrs whose return values are unused.
  • Rebase
  • Skip computational instrs
  • Handle PseudoVSETVLIs (related patch: D124961)
dtcxzyw marked 3 inline comments as done.Sep 8 2023, 11:19 PM
craig.topper added inline comments.Sep 16 2023, 9:38 AM
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir
365

This looks like a regression

dtcxzyw marked an inline comment as done.Sep 16 2023, 9:53 AM
dtcxzyw added inline comments.
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir
365

It will be addressed by the following pass RISCVDeadRegisterDefinitions. This test only runs the pass RISCVInsertVSETVLI.

craig.topper added inline comments.Sep 16 2023, 10:42 AM
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
93

Why are we using MRI->getRegClass instead of TII->getRegClass like AArch64?

dtcxzyw marked an inline comment as done.Sep 16 2023, 11:51 AM
dtcxzyw added inline comments.
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
93

It protects defs with reg class GPRNoX0 being rewritten to x0.
See also https://reviews.llvm.org/D158759?id=554599#inline-1539364.
For inst li a3, 32, MRI->getRegClass returns GPRNoX0 while TTI->getRegClass returns GPR.

I have a separate patch to address this: https://github.com/llvm/llvm-project/pull/65934

craig.topper added inline comments.Sep 16 2023, 12:27 PM
llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
93

Does the `!Desc.mayLoad() && !Desc.mayStore() &&

!Desc.hasUnmodeledSideEffects()` check already prevent that?
dtcxzyw updated this revision to Diff 556907.Sep 17 2023, 1:19 AM
  • Rebase
  • Use TargetInstrInfo::getRegClass
dtcxzyw marked 2 inline comments as done.Sep 17 2023, 1:20 AM
This revision is now accepted and ready to land.Sep 19 2023, 9:15 AM
This revision was landed with ongoing or failed builds.Sep 19 2023, 10:03 AM
This revision was automatically updated to reflect the committed changes.