This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Use mask agnostic policy for isel patterns where the merge operand is IMPLICIT_DEF.
ClosedPublic

Authored by craig.topper on Oct 6 2022, 11:47 AM.

Details

Summary

I tend to think we should ignore the policy bit in vsetvli insertion
if the tied operand is IMPLICIT_DEF. But that raises questions about
what the policy operand on RVV intrinsics means if you also pass
vundefined().

This change at least fixes some cases. I'll post a separate patch
for vsetvli insertion for discussion.

Diff Detail

Event Timeline

craig.topper created this revision.Oct 6 2022, 11:47 AM
craig.topper requested review of this revision.Oct 6 2022, 11:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 6 2022, 11:47 AM
reames added inline comments.Oct 6 2022, 1:09 PM
llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
1957

Slide down has special wording around the handling of the mask. Could you move this to it's own patch? I think this needs careful consideration on it's own.

(Same for slide up)

llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
14

Hm, this example and several others in the diff appear to be a regression in practice.

It looks like we're missing a couple of cases in InsertVSETVLI. The general pattern is promoting a MA use to MU if doing so would allow avoiding an extra vsetvli. I see two forms in this test case.

There's a general profitability question though. Does switching from MA to MU ever cost enough execution wise to be worth the extra vsetvli? I can't think of any cases, but maybe?

craig.topper added inline comments.Oct 6 2022, 1:38 PM
llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
1957

Slide down has special wording around the handling of the mask. Could you move this to it's own patch? I think this needs careful consideration on it's own.

(Same for slide up)

Can you paste the special wording you are referring to?

This patch doesn't touch slideup because slideup always reads the destination register for the elements less than OFFSET. So it doesn't have an equivalent of this pattern.

llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
14

Definitely in this case, using MU can't have any performance effect. Using mu creates a false dependency on v9 for the vfcvt.x.f.v. But v9 was already used to create the mask, so last writer of v9 already finished executing.

Though there's nothing forcing the vfcvt.x.f.v to have the same destination as the vfabs.v. That's just lucky register allocation.

reames accepted this revision.Oct 6 2022, 3:06 PM

LGTM

llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
1957

From 16.3:
The tail agnostic/undisturbed policy is followed for tail elements.
The slide instructions may be masked, with mask element i controlling whether destination element i is written. The mask
undisturbed/agnostic policy is followed for inactive elements.

Reading that again, that looks like the natural interpretation. On first read, I'd gotten confused on the wording of the last sentence.

llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
14

Let's chat about this a bit offline. I don't think this is a blocking concern, I just want to figure out what's a reasonable heuristic for InsertVSETVLI.

This revision is now accepted and ready to land.Oct 6 2022, 3:08 PM