This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add a prepass to vsetvli insertion to propagate VLMAX vsetvli to the instructions.
AbandonedPublic

Authored by craig.topper on Mar 30 2022, 4:05 PM.

Details

Summary

If the VL value for a vector instruction comes from a vsetvli for
VLMAX and the SEW/LMUL ratio matches, we can replace the VL on the
instruction with X0. This allows all the vsetvlis generated for the
instruction to use VLMAX vsetvlis. Or it can prevent vsetvli
instructions from being generated in loops.

This is based on a test case rogfer01 sent me a while back and the code
found here https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/raw/EPI/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp.
We could propagate any AVL, but looking at some of the test changes for that
I'm not sure it makes sense. So I've restricted to just the VLMAX case.

Diff Detail

Event Timeline

craig.topper created this revision.Mar 30 2022, 4:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2022, 4:05 PM
craig.topper requested review of this revision.Mar 30 2022, 4:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2022, 4:05 PM

Not sure how I missed this, sorry

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
927

Is there a reason not to propagate right away here and save us a boolean variable?

934

Good question. We aren't required to set dead, are we? From what I recall it can't be set if it's not true but can be left unset if it is true?

llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
586

Could pre-commit this test so better show the effect?

reames added a subscriber: reames.Apr 25 2022, 9:27 AM
craig.topper edited the summary of this revision. (Show Details)Apr 25 2022, 11:06 AM

Address feedback
Rebase on top of pre-committed test.

Any chance we can get more tests here?

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
900

Pulling out a helper function to match this would help readability for me.

918

Can you use an early continue to reduce nesting here?

More early outs.
Helper function for determining if it is a VLMAX vsetvli.
Restoring the comparison that the vsetvli and instruction have the same SEW/LMUL ratio.

Clearly I need more tests, the last version was missing a critical check.

craig.topper planned changes to this revision.Apr 27 2022, 1:26 PM
reames added a comment.May 6 2022, 3:47 PM

Taking another look at this triggered by our offline chat.

I can't help but think the forward direction here is a bit awkward. It would seem more natural to visit all the instructions with VL operands, and then check for the VReg which defines it whether we could prove it be equivalent to VLMAX. Doing this either as a pre-pass, or maybe even inside getInfoForVSETVLI does seem to be worthwhile as it could eliminate inserted vsetvlis. On the other hand, a post-pass could be easier to test (via explicit vsetvli tests), and might catch some intrinsic cases.

In short, I don't have strong opinions on pre-vs-post, but might encourage working back from the consumer instruction instead of forward.

Hm, just a thought, if we are working backwards in a pre-pass this starts looking like a dag combine? Would we benefit from doing this there instead?

Taking another look at this triggered by our offline chat.

I can't help but think the forward direction here is a bit awkward. It would seem more natural to visit all the instructions with VL operands, and then check for the VReg which defines it whether we could prove it be equivalent to VLMAX. Doing this either as a pre-pass, or maybe even inside getInfoForVSETVLI does seem to be worthwhile as it could eliminate inserted vsetvlis. On the other hand, a post-pass could be easier to test (via explicit vsetvli tests), and might catch some intrinsic cases.

In short, I don't have strong opinions on pre-vs-post, but might encourage working back from the consumer instruction instead of forward.

Makes sense. I'll make that change

Hm, just a thought, if we are working backwards in a pre-pass this starts looking like a dag combine? Would we benefit from doing this there instead?

DAGCombine can't do cross basic block.

reames added a comment.May 7 2022, 7:42 AM

I played around with this a bit last night and realized you can achieve the same effect with a small change inside insertVSETVLI. I also implemented a simple version of the post-pass, and getting that to cover your case of interest turns out to be difficult as you also need the cross block reasoning.

@@ -682,6 +705,16 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
     return;
   }
 
+  
+  if (AVLReg.isVirtual()) {
+    MachineInstr &DefMI = *MRI->getVRegDef(AVLReg);
+    if (isVLMaxVSETVLI(DefMI)) {
+      VSETVLIInfo DefInfo = getInfoForVSETVLI(DefMI);
+      if (Info.hasSameVLMAX(DefInfo))
+        AVLReg = RISCV::X0;
+    }
+  }
   if (AVLReg.isVirtual())
     MRI->constrainRegClass(AVLReg, &RISCV::GPRNoX0RegClass);
craig.topper abandoned this revision.May 17 2022, 10:59 AM

I think D125812 covers this