
[RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32.

Authored by craig.topper on Mon, Apr 5, 5:47 PM.



This can't use our normal strategy of splatting the scalar and using
a .vv operation instead of .vx.

Instead this patch bitcasts the vector to the equivalent SEW=32
vector and inserts the scalar parts using two vslide1up/down. We
do that unmasked and apply the mask separately at the end with
a vmerge.
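
The strategy above can be modelled in a few lines of Python. This is only a sketch of the intended semantics, not the patch's lowering code: it assumes a little-endian element layout (low half first), models only the first vl elements as a list, and all function names are made up for illustration. The order in which the two halves are inserted follows from that layout assumption.

```python
MASK32 = 0xFFFFFFFF


def split64(elems):
    """View 64-bit elements as 32-bit halves, low half first (assumed layout)."""
    out = []
    for e in elems:
        out.append(e & MASK32)
        out.append((e >> 32) & MASK32)
    return out


def join32(halves):
    """Inverse of split64: pair up 32-bit halves into 64-bit elements."""
    return [halves[i] | (halves[i + 1] << 32) for i in range(0, len(halves), 2)]


def slide1down(src, scalar):
    """Reference model: dest[i] = src[i+1], dest[vl-1] = scalar; no-op when vl == 0."""
    return (src[1:] + [scalar]) if src else []


def slide1up(src, scalar):
    """Reference model: dest[0] = scalar, dest[i] = src[i-1]; no-op when vl == 0."""
    return ([scalar] + src[:-1]) if src else []


def slide1down_e64_via_e32(src64, scalar):
    """SEW=64 slide1down as two SEW=32 slide1downs: low half first, then high."""
    lo, hi = scalar & MASK32, (scalar >> 32) & MASK32
    v32 = split64(src64)          # the "bitcast"; the element count doubles
    v32 = slide1down(v32, lo)
    v32 = slide1down(v32, hi)
    return join32(v32)


def slide1up_e64_via_e32(src64, scalar):
    """SEW=64 slide1up as two SEW=32 slide1ups: high half first, then low."""
    lo, hi = scalar & MASK32, (scalar >> 32) & MASK32
    v32 = split64(src64)
    v32 = slide1up(v32, hi)
    v32 = slide1up(v32, lo)
    return join32(v32)


def vmerge(mask, on_true, on_false):
    """Apply the mask afterwards: take the slide result where the mask bit is set."""
    return [t if m else f for m, t, f in zip(mask, on_true, on_false)]
```

In this model the two-step SEW=32 versions agree with the 64-bit reference functions for any list length, including the empty (vl = 0) case, and the mask is applied per 64-bit element only after both slides are done.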

For vslide1up there may be some other options here like getting
i64 into element 0 and using with this vector as
vd and the original source as vs1. Masking would still need to
be done afterwards.

That idea doesn't work for vslide1down. We need to slidedown and
then insert a single scalar at vl-1 which we could do with a
vslideup, but that assumes vl > 0 which I don't think we can assume.
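
To make the vl > 0 concern concrete, here is a small Python model of that alternative. It is a sketch under assumptions: vectors are modelled as lists holding only the first vl elements, the function names are hypothetical, and raising an error for vl == 0 is just this sketch's way of marking the case the paragraph above is worried about, not a claim about what hardware does.

```python
def slide1down_reference(src, scalar):
    """Reference model: dest[i] = src[i+1], dest[vl-1] = scalar; no-op when vl == 0."""
    return (src[1:] + [scalar]) if src else []


def slide1down_via_insert(src, scalar):
    """Alternative: plain slidedown by one, then write the scalar at index vl-1."""
    vl = len(src)
    if vl == 0:
        # There is no element at index vl-1, so the insert step has no valid
        # target. The sketch refuses here instead of guessing a behaviour.
        raise ValueError("vl == 0: no element at index vl - 1")
    # Plain slidedown by one; the last slot is a placeholder to be overwritten.
    slid = src[1:] + [src[-1]]
    slid[vl - 1] = scalar
    return slid
```

For any non-empty input the two functions agree; only the reference version has a well-defined answer when vl is 0.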

The i32 double slide1down implemented here is the best I could come
up with and I just made vslide1up consistent.

Event Timeline

craig.topper created this revision. Mon, Apr 5, 5:47 PM
craig.topper requested review of this revision. Mon, Apr 5, 5:47 PM
Herald added a project: Restricted Project. Mon, Apr 5, 5:47 PM
Herald added a subscriber: MaskRay.
frasercrmck accepted this revision. Wed, Apr 7, 7:35 AM

Seems like a good strategy to me.


I can't remember exactly how the intrinsics work but is it able to omit this if you're using zero as the vector length (i.e. VLMAX)?

This revision is now accepted and ready to land. Wed, Apr 7, 7:35 AM
craig.topper added inline comments. Wed, Apr 7, 10:01 AM

Passing 0 to the intrinsic really means a vector length of 0, not VLMAX. If you want VLMAX, you need to call one of the vsetvlmax_*() intrinsics before this and pass its return value, which is the real VLMAX for the vtype.

I suppose we could try to figure out that the producing instruction is a vsetvlmax intrinsic with SEW=64 vtype and create a new vsetvlmax intrinsic with SEW=32 to pass here.
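
As a rough illustration of why a SEW=32 vsetvlmax would line up with the SEW=64 one, the arithmetic can be sketched in Python. This assumes the usual relation VLMAX = VLEN * LMUL / SEW with an integer LMUL; the VLEN value used in the checks and the helper names are hypothetical, chosen only for the example.

```python
def vlmax(vlen_bits, sew_bits, lmul=1):
    """Number of elements in a register group: VLEN * LMUL / SEW (integer LMUL assumed)."""
    return (vlen_bits * lmul) // sew_bits


def e32_vl_for(e64_vl):
    """Each 64-bit element becomes two 32-bit elements after the bitcast."""
    return 2 * e64_vl


# With the same VLEN and LMUL, the SEW=32 VLMAX is twice the SEW=64 VLMAX,
# so doubling a SEW=64 VLMAX lands exactly on the SEW=32 VLMAX.
assert e32_vl_for(vlmax(128, 64)) == vlmax(128, 32)
# A vl of 0 stays 0 after doubling: it never turns into VLMAX.
assert e32_vl_for(0) == 0
```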