This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Avoid changing etype for splat of 0 or -1
ClosedPublic

Authored by reames on Jun 16 2022, 1:48 PM.

Details

Summary

A splat of the values 0 and -1 as sign extended 12 bit immediates are always the same bit pattern regardless of the etype used to perform the operation. As a result, we can sometimes avoid introducing a vsetvli just for the purposes of a splat.

Looking at the diffs, we don't get a huge amount of immediate value out of this. We mostly push the vsetvli one instruction down, usually in front of a vmerge. We also don't get the corresponding fixed length vector cases because VL typically is changed despite the actual bits written being the same. Both of these are areas I plan to explore in future patches.

Interestingly, this makes a great example of why we need the forward and backward implementation to be consistent. Before we merged the demanded field handling, if we implement only the forward direction, we lost the ability to mutate a prior vsetvli and eliminate a later one entirely. This resulted in practical regressions instead of improvements. It's always nice when practice matches theory. :)

Diff Detail

Event Timeline

reames created this revision.Jun 16 2022, 1:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 1:48 PM
reames requested review of this revision.Jun 16 2022, 1:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 1:48 PM
reames updated this revision to Diff 437727.Jun 16 2022, 3:16 PM
reames edited the summary of this revision. (Show Details)

Rebase over landed changes and simplify as a result

craig.topper accepted this revision.Jun 16 2022, 10:03 PM

LGTM with one stray formatting change and one possible future improvment.

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
608

Stray blank line change?

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extload-truncstore.ll
10

Seems like this could be e16, mf4 to avoid the later vsetvli

This revision is now accepted and ready to land.Jun 16 2022, 10:03 PM
reames added inline comments.Jun 17 2022, 8:07 AM
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extload-truncstore.ll
10

We don't have any demanded rules for vlm.v, and as such, can't rewrite the vsetvli before it.

I went and glanced at the spec for that instruction, and honestly, the wording is vague enough I'm not quite sure what we're allowed to do with it.

This revision was landed with ongoing or failed builds.Jun 17 2022, 8:10 AM
This revision was automatically updated to reflect the committed changes.
craig.topper added inline comments.Jun 28 2022, 8:29 PM
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-calling-conv.ll
1571

I don't think this patch is correct. In the old code we zeroed the entire register because lmul was 1. Now we're only zeroing a quarter of the register because lmul is 1/4. The VL in bytes the comment in the code mentions is more or less LMUL.

craig.topper added inline comments.Jun 28 2022, 8:45 PM
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-calling-conv.ll
1571

Ignore that part of LMUL being VL in bytes. It's really more like LMUL is the upper bound on how much of the register can be updated. If AVL is X0 then you would need the LMUL to be the same independent of SEW. For other AVLs its more complicated.

reames added inline comments.Jun 29 2022, 10:17 AM
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-calling-conv.ll
1571

I agree, this is definitely wrong. Revert in progress now.

Looking at this again, the whole logic is wrong. This depends on VLInBytes, not VLMAX. I don't know what I was thinking here.

Fraser reported another problem with this patch to me privately. In addition to the wrong number of bits being written Craig found, we could also construct illegal instruction encodings. This could happen because we changed the VLMUL of the splat, but did not change the LMUL on the pseudo itself. As a result, the register allocator would assign e.g. an LMUL1 register, when the VSETVLI was now e.g. LMUL2. Given only have of registers are legal operands at LMUL2, this has a high chance of producing an illegal instruction encoding.