This is an archive of the discontinued LLVM Phabricator instance.

Teach PeepholeOpt to eliminate redundant copy from constant physreg (e.g VLENB on RISCV)
ClosedPublic

Authored by reames on May 13 2022, 10:18 AM.

Details

Summary

The existing redundant copy elimination required a virtual register source, but the same logic works for any physreg where we don't have to worry about clobbers.

(Aside: The code can be pretty easily generalized to handle the clobber case too, but that'll be a separate patch.)

On RISCV, this helps eliminate redundant CSR reads from VLENB.

Diff Detail

Event Timeline

reames created this revision.May 13 2022, 10:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2022, 10:18 AM
reames requested review of this revision.May 13 2022, 10:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2022, 10:18 AM

This looks ok to me, but I'd like to understand some things about the larger roadmap here.

If SelectionDAG CSEd the read_register this test case wouldn't need this patch. The peephole pass code here seems to be restricted to a single basic block, same as SelectionDAG. read_register isn't CSEd in SelectionDAG because read_register and read_volatile_register use the same code and have a chain dependency.

The main places we create PseudoVLENB are from ISD::VSCALE lowering and frame lowering.

For ISD::VSCALE lowering we go through RISCVISD::READ_VLENB. Do you plan to replace with RISCVISD::READ_VLENB with ISD::READ_REGISTER? Or make RISCVISD::READ_VLENB isel to a COPY? Or maybe use ISD::CopyFromReg? RISCVISD::READ_VLENB and ISD::CopyFromReg should CSE as long as the chain input is EntryNode and there is no Glue.

For the frame lowering case, that happens after regalloc so this patch doesn't help with that.

Do you have other cases that this patch helps that wouldn't already be covered by SelectionDAG CSE?

This looks ok to me, but I'd like to understand some things about the larger roadmap here.

If SelectionDAG CSEd the read_register this test case wouldn't need this patch. The peephole pass code here seems to be restricted to a single basic block, same as SelectionDAG. read_register isn't CSEd in SelectionDAG because read_register and read_volatile_register use the same code and have a chain dependency.

The main places we create PseudoVLENB are from ISD::VSCALE lowering and frame lowering.

For ISD::VSCALE lowering we go through RISCVISD::READ_VLENB. Do you plan to replace with RISCVISD::READ_VLENB with ISD::READ_REGISTER? Or make RISCVISD::READ_VLENB isel to a COPY? Or maybe use ISD::CopyFromReg? RISCVISD::READ_VLENB and ISD::CopyFromReg should CSE as long as the chain input is EntryNode and there is no Glue.

For the frame lowering case, that happens after regalloc so this patch doesn't help with that.

Do you have other cases that this patch helps that wouldn't already be covered by SelectionDAG CSE?

Ok, two level response here.

First, this is a generic improvement to target independent code. We have the notion of constant physical registers. Regardless of whether there are practice benefits on the particular example constant register, having the generic infrastructure handle more cases (at near zero cost) seems worthwhile on it's own.

Your right that we should also be able to get this case in SelectionDAG. This was already on my punch list (https://github.com/preames/public-notes/blob/master/llvm-riscv-status.rst#optimizations-for-constant-physregs-vlenb-x0), and I don't see any reason to only implement it in one place.

Second, you raise the question of whether this will be useful in practice on RISCV for VLENB. The honest answer is I don't know. I want to be able to replace PsuedoReadVLENB the whole way thought, but whether that will end up being more than a very minor clean is non obvious.

What got me looking at this was the recent change to PsuedoReadVL which we've turned into something much more than just a register read. If we can use COPY as canonical form for VTYPE, I was going to prototype splitting the PsueodReadVL node into component pieces. I haven't done that yet. I hit too many independent "we should fix that"s and didn't bother until the general infrastructure for constant physical registers had been fleshed out a bit more. (To prevent confusion, VL is *not* constant.)

craig.topper accepted this revision.May 16 2022, 2:26 PM

This looks ok to me, but I'd like to understand some things about the larger roadmap here.

If SelectionDAG CSEd the read_register this test case wouldn't need this patch. The peephole pass code here seems to be restricted to a single basic block, same as SelectionDAG. read_register isn't CSEd in SelectionDAG because read_register and read_volatile_register use the same code and have a chain dependency.

The main places we create PseudoVLENB are from ISD::VSCALE lowering and frame lowering.

For ISD::VSCALE lowering we go through RISCVISD::READ_VLENB. Do you plan to replace with RISCVISD::READ_VLENB with ISD::READ_REGISTER? Or make RISCVISD::READ_VLENB isel to a COPY? Or maybe use ISD::CopyFromReg? RISCVISD::READ_VLENB and ISD::CopyFromReg should CSE as long as the chain input is EntryNode and there is no Glue.

For the frame lowering case, that happens after regalloc so this patch doesn't help with that.

Do you have other cases that this patch helps that wouldn't already be covered by SelectionDAG CSE?

Ok, two level response here.

First, this is a generic improvement to target independent code. We have the notion of constant physical registers. Regardless of whether there are practice benefits on the particular example constant register, having the generic infrastructure handle more cases (at near zero cost) seems worthwhile on it's own.

Yeah I'm not opposed to this. I guess part of my concern was that this becomes untested if we fix SelectionDAG CSE. Maybe we should have a directed MIR test?

This revision is now accepted and ready to land.May 16 2022, 2:26 PM
This revision was landed with ongoing or failed builds.May 16 2022, 4:39 PM
This revision was automatically updated to reflect the committed changes.