Page MenuHomePhabricator

[RISCV] Initial version of a demand based vsetvli insertion pass.

Authored by craig.topper on May 5 2021, 1:06 PM.



This patch adds a new pass to conservativley insert vsetvli's.
This replaces the current approach of aggressively adding them
and then trying to remove unneeded.

I've removed the custom inserter and implicit uses of vl/vtype
from the pseudo instructions. The implicit uses will be added
when the vsetvli instruction is inserted.

I've placed this pass after the SSA optimizations but while we
are still in SSE form. I'm considering pushing it later to after
the machine scheduler, but that affects more tests and requires
live intervals to be handled. Moving it after the scheduler may
allow us to add heuristics to the scheduler to group same vtype
and vsetvli operations together.

Other goals for this are to enable more intelligent selection of
vtype for mask instructions like vmand/vmor/vmxor, etc which
don't use SEW. More intelligent vtype selection for vmv.x.s which
doesn't use LMUL. If we are able to see the previous vtypes we can
avoid inserting a vsetvli for these instructions in some cases. This
is hard to do in the current design since we can't track the users
of a vsetvli when we are trying to see if it is unneeded.

Diff Detail

Event Timeline

craig.topper created this revision.May 5 2021, 1:06 PM
craig.topper requested review of this revision.May 5 2021, 1:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2021, 1:06 PM
Herald added a subscriber: MaskRay. · View Herald Transcript
HsiangKai added inline comments.May 6 2021, 8:03 PM

Should we check the range of immediate? vsetivli only has 5-bits immediate field.


Is 'the' redundant in the comment?

craig.topper added inline comments.May 6 2021, 8:13 PM

I can assert it. It should have been checked by SelectionDAG when the instruction was created.


Yes. Thanks

HsiangKai added inline comments.May 6 2021, 9:28 PM

Got it. That's fine without the assertion.


I think is still useful here.
How do you think if I move it into RISCVInsertVSETVLI.cpp instead?


is it enough to say this instruction is not reading VL?
If yes, we don't have to add VL implicit use if HasVLOpMask is false.

HsiangKai added inline comments.May 6 2021, 11:23 PM

This test case needs to add -mattr=+experimental-v in the command line. Otherwise, the PseudoVSETVLI will disappear.

Don't add VL implicit use if VL isn't used.
Add VL implicit use to X0, X0 vsetvli.
Add experimental-v flag to addi-scalable-offset.mir

Remove extra word from comment


I think is still useful here.
How do you think if I move it into RISCVInsertVSETVLI.cpp instead?

I agree it's useful, but we should probably keep it in RISCVCleanupVSETVLI.cpp so we can keep the skipFunction() check in it's runOnMachineFunction and the getOptLevel() check in RISCVTargetMachine.

Fix up the mask-reg-alloc.mir test to not have noreg for AVL.

Rebase on recent test additions.

Remove implicit VL/VTYPE from another test.

Add a new MIR test to get more direct coverage and show the manipulation of the implicit operands.

Don't add VSETVLI before vmv.x.s if the SEW matches. The other fields of VTYPE don't matter.

Rebase after removal of RISCVII:VSEW

Thanks @craig.topper. This is a great cleanup compared to our existing approach (which was actually very conservative).

Some higher level questions:

  • I understand we still want a global cleanup after this pass, don't we?
  • Would be this pass a good moment to try to optimise those cases where the AVL and SEW/LMUL are not changed and the instruction has explicit EEW. (Maybe we need to add some extra TSFlag for those)
    • For instance if we are under SEW=64 LMUL=2 and AVL=%avl a PseudoVLE32 that switches to SEW=32, LMUL=½, AVL=%avl I understand it can skip the vsetvli as it will behave as expected.
frasercrmck accepted this revision.May 13 2021, 3:13 AM

LGTM, good stuff. One minor nit and a question.


Not for this patch but should we make sure calls are marked as clobbering VL/VTYPE? Or is there a reason to keep them separate?


ST has already been defined above

This revision is now accepted and ready to land.May 13 2021, 3:13 AM
craig.topper added inline comments.May 17 2021, 2:48 PM

I think any call with a regmask operand probably does appear to clobber VL since the regmask marks non-clobbered registers and not include VL. But I think I wasn't sure all calls would end up with a regmask argument.

Address some review comments.
Clear dead flags for VL and VTYPE defs on existing vsetvli instructions in case
we make them live.

craig.topper abandoned this revision.May 24 2021, 1:30 PM

Abandoned in favor of D102737