This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add a vsetvli insert pass that can be extended to be aware of incoming VL/VTYPE from other basic blocks.
ClosedPublic

Authored by craig.topper on May 18 2021, 4:40 PM.

Details

Summary

This is a replacement for D101938 for inserting vsetvli
instructions where needed. This new version changes how
we track the information in such a way that we can extend
it to be aware of VL/VTYPE changes in other blocks. Given
how much it changes the previous patch, I've decided to
abandon the previous patch and post this from scratch.

The pass now consists of 3 phases. The first phase inserts
intra block vsetvlis and collects the requirements for the
first instruction in the block. The second phase calculates
the incoming value for each block. The third phase checks
the requirements against the incoming value and inserts a
vsetvli if needed.

For now I've stubbed out phase 2 to make it equivalent to
D101938. The only change from the tests in D101938 is the
numbering of virtual registers in one test in vsetvli-insert.mir.

Diff Detail

Event Timeline

craig.topper created this revision.May 18 2021, 4:40 PM
craig.topper requested review of this revision.May 18 2021, 4:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 18 2021, 4:40 PM
Herald added a subscriber: MaskRay. · View Herald Transcript

This is looking great.

I was wondering if we can delay all the insertions to phase 3. I understand this would require keep tracking the insertion points. I mention that because it could help if we want to be able to simplify further the insertion points after phase1 or phase2. In that setting, phase 1 could be named something like computeDemandedVSETVL, phase 2 could be computeIncomingVLVTYPE and then phase 3 emitVSETVL

An example of possible optimization, could be a loop that loads SEW=32 vectors and does a widening operation (say vfwmac) and stores the resulting SEW=64 (without further processing that vector) using vse64. Looks to me we could just go with a single vsetvl here with SEW=32.

What do you think?

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
318

Not sure what this RequirePending refers to?

arcbbb added inline comments.May 20 2021, 12:35 AM
llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
302

is the assertion avoidable by doing setAVLReg(RISCV::NoRegister) for vtype-only change, and something like
BBInfo.change = Merge(BBInfo.change, getInfoForVSETVLI) to keep the latest VL & vtype change ?

Remove reference to RequirePending.

craig.topper added inline comments.May 20 2021, 8:50 AM
llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
318

At one point I was using AMDGPUs SIModeRegister pass as a reference for this cross basic block stuff and that got copied from there before I realized I didn't need it.

craig.topper added a comment.EditedMay 20 2021, 9:01 AM

This is looking great.

I was wondering if we can delay all the insertions to phase 3. I understand this would require keep tracking the insertion points. I mention that because it could help if we want to be able to simplify further the insertion points after phase1 or phase2. In that setting, phase 1 could be named something like computeDemandedVSETVL, phase 2 could be computeIncomingVLVTYPE and then phase 3 emitVSETVL

An example of possible optimization, could be a loop that loads SEW=32 vectors and does a widening operation (say vfwmac) and stores the resulting SEW=64 (without further processing that vector) using vse64. Looks to me we could just go with a single vsetvl here with SEW=32.

What do you think?

We can probably do it in phase 1 at least when the previous instruction is in the same block. Harder if the instruction is in a predecessor block. Though if the previous instruction isn't in the same block and there are no instructions after it, I'm not sure what value to propagate to the successors for the data flow.

Handle the all insertion in "phase 3" which is the only phase in this patch.

LGTM

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
302

Sorry I forgot it was not possible because vsetvli wasn't inserted at this stage. Please ignore it.

HsiangKai added inline comments.May 22 2021, 1:10 AM
llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
273

"ask" the requirements start "from" this ...?

Cleanup a poorly written comment

frasercrmck accepted this revision.May 24 2021, 1:58 AM

LGTM.

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
11

Documenting this pass and its phases in line with the commit message would be good.

19

nit: I can imagine this is used by later patches, but it's not used by this one.

This revision is now accepted and ready to land.May 24 2021, 1:58 AM
This revision was landed with ongoing or failed builds.May 24 2021, 11:48 AM
This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-int.ll