This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Optimize INSERT_VECTOR_ELT sequences
ClosedPublic

Authored by frasercrmck on Mar 9 2021, 4:39 AM.

Details

Summary

This patch optimizes the codegen for INSERT_VECTOR_ELT in various ways.
Primarily, it removes the use of vslidedown during lowering, and the
vector element is inserted entirely using vslideup with a custom VL and
slide index.

Additionally, lowering of i64-element vectors on RV32 has been optimized
in several ways. When the 64-bit value to insert is the same as the
sign-extension of the lower 32-bits, the codegen can follow the regular
path. When this is not possible, a new sequence of two i32 vslide1up
instructions is used to get the vector element into a vector. This
sequence was suggested by @craig.topper. From there, the value is slid
into the final position for more consistent lowering across RV32 and
RV64.

Diff Detail

Event Timeline

frasercrmck created this revision.Mar 9 2021, 4:39 AM
frasercrmck requested review of this revision.Mar 9 2021, 4:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2021, 4:39 AM
craig.topper added inline comments.Mar 9 2021, 2:56 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2253

Would it be simpler to just check that Val is a ConstantSDNode that fits in sign extended 32 bits and if it is just call getConstant with the lower 32 bits?

2279

I can believe that doesn't work. Undef sources that aren't tied don't seem to really allocate and just pick the first register.

I'm now a little worried someone might try that in the intrinsic interface.

2287

Don't we need a bitcast here to get back to the ContainerVT to match the SDTypeProfile for the next operation?

frasercrmck marked 3 inline comments as done.
  • rebase
  • address review feedback
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2253

It would, thanks. I'll still extract ValLo and ValHi for the slow rv32/i64 case, but the lo/hi constant checking is unnecessarily complicated.

2279

Indeed. I went through some of the intrinsic tests to see if they were testing undef but didn't get any hits. I suspect we have problems there. I haven't worked with the builtin/intrinsic layers but presumably it'd be possible to do something like this in pseudo C:

nxvi32_t v;
v = rvv_vslide1up(v, a, m, vl);

and have that first operand lower to undef? I don't know how far we need to take this issue at this point, though I suspect it's not the last we've seen of it. Perhaps we'll just need to sanitize any problematic undef operands.

2287

Good catch, thanks. I've fixed that up.

On a related note, I noticed that it was perfectly content when using the original container mask, even though SDTCisSameNumEltsAs shouldn't have let (e.g.) nxv2i1 match against nxv4i1. The code for the TableGen constraint seems like it should be working, so maybe it's something in our patterns? I didn't have time to dig in more deeply.

craig.topper added inline comments.Mar 10 2021, 9:09 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2287

The SDTypeProfiles cause tablegen to remove type checks from the isel table to minimize table size assuming the rules have been followed.. So if you tell it that something has the same number of elements as another result or operand and tell it the specific element type it won't emit a check.

craig.topper added inline comments.Mar 10 2021, 6:00 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2254

I'm not sure how I feel about creating these extracts that might not get used. Having them here instead of just in the non-simple path only saves us from having to call getConstant where we assign Val = ValLo?

  • rebase
  • address review feedback (pt. 2)
frasercrmck marked an inline comment as done.Mar 11 2021, 1:36 AM
frasercrmck added inline comments.
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2254

Yeah fair enough, looking at it again in the cold light of day it doesn't seem like it brings any benefit and adds extra variables to scope in which they're never going to be used.

craig.topper accepted this revision.Mar 11 2021, 8:56 AM

Thanks! LGTM

This revision is now accepted and ready to land.Mar 11 2021, 8:56 AM
This revision was automatically updated to reflect the committed changes.
frasercrmck marked an inline comment as done.