This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Fold vmv.v.v into vops
AbandonedPublic

Authored by luke on Jun 20 2023, 7:34 AM.

Details

Summary

A vmv.v.v can be thought of a vmerge, where instead of masking with a mask,
we're masking the tail with VL. For example, in the sequence below the vmv.v.v
copies over the first 2 elements from the vadd.vv:

vsetivli zero, 4, e32, m1, ta, ma
vadd.vv v9, v10, v11
vsetivli zero, 2, e32, m1, tu, ma
vmv.v.v v8, v9

This patch adds an optimisation to fold away the vmv.v.v into the preceding op
if it has only use, by modifying the op's VL:

vsetivli zero, 2, e32, m1, ta, ma
vadd.vv v8, v10, v11

In general, we can just replace the VL of the op with the VL of the vmv.v.v
(Unless it's a load, in which case we make sure we're only loading a VL less
than or equal to the original, so we don't end up loading more elements than
before)

The actual optimisation shares largely the same structure as
performCombineVMergeAndVOps: I've just duplicated the code for now, but if
people wanted I could try and abstract some of the shared bits away.

Diff Detail

Event Timeline

luke created this revision.Jun 20 2023, 7:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2023, 7:34 AM
luke requested review of this revision.Jun 20 2023, 7:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2023, 7:34 AM
luke updated this revision to Diff 532925.Jun 20 2023, 7:36 AM

Fix some formatting

luke updated this revision to Diff 532929.Jun 20 2023, 7:38 AM

Remove TODO

luke updated this revision to Diff 532933.Jun 20 2023, 7:52 AM

Fix assert message

What prevents this from changing the VL of a vcompress or a reduction?

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
3499

Why aren't we using RISCVII::hasVLOp if you want to know there is a VL operand.

luke planned changes to this revision.Jun 22 2023, 3:47 PM

What prevents this from changing the VL of a vcompress or a reduction?

This patch assumes every operation is lanewise, back to the drawing board!
It doesn't look like we have any information on the pseudos at the moment as to what's lanewise and what's not. Is there an easier way to work this out other than adding a new bit to RVInst?

luke updated this revision to Diff 533953.Jun 23 2023, 7:21 AM

Check for existence of RISCVMaskedPseudo to prevent non element-wise
pseudos from being transformed

luke marked an inline comment as done.Jun 23 2023, 7:26 AM

I hadn't realised that RISCVMaskedPseudo was used to explicitly mark pseudos for the vmerge peephole. They all appear to be element wise, so I've update it to check that the masked pseudo info always exists.

reames requested changes to this revision.Jul 6 2023, 11:07 AM

The example code from the description is wrong. The "ta" needs to be "tu" to be correct.

vsetivli zero, 2, e32, m1, ta, ma
vadd.vv v8, v10, v11

Glancing at a few of the test, this appears wrong in the implementation as well.

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
3517

Please rebase this over D154620, and then use the exact same code structure as the vmerge transform. Having them different makes it very hard to follow.

Hm, this is just a thought. But could we take the existing vmerge transform, and split it into two steps? If a vmv.v.v is like a vmerge with a all ones mask, could the vmerge transform become "first form vmv.v.v" and "then fold vmv.v.v into true"? We in fact already have both of these pieces, can we simply reorder and restructure?

This revision now requires changes to proceed.Jul 6 2023, 11:07 AM
reames added inline comments.Jul 10 2023, 7:40 AM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
3517

Ignore the just a thought bit. That handles the sub-case where the mask is all ones, but does not handle the arbitrary mask case. I got myself confused with thinking about the unmasking peephole, but that's a different bit of logic.

luke added inline comments.Jul 10 2023, 9:02 AM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
3517

Yeah, I don't think we can express all vmv.v.v as vmerges but I took another look at the specification and noticed this note in 11.16:

The vector integer move instructions share the encoding with the vector merge instructions, but with vm=1 and vs2=v0.

So a vmv.v.v isn't just like vmerge with an all ones mask, it literally is one (without a mask). I presume we can then generalise the existing vmerge fold to handle vmv.v.v, so I'll give that a try first

luke abandoned this revision.Jul 13 2023, 6:07 AM

Superseded by D155101