This is an archive of the discontinued LLVM Phabricator instance.

[VP] Declaration and docs for vp.select intrinsic
ClosedPublic

Authored by simoll on Jul 2 2021, 8:04 AM.

Details

Summary

llvm.vp.select extends the regular select instruction with an explicit
vector length (%evl).

All lanes with indexes at and above %evl are
undefined. Lanes below %evl are taken from the first input where the
mask is true and from the second input otherwise.

Diff Detail

Event Timeline

simoll created this revision.Jul 2 2021, 8:04 AM
simoll requested review of this revision.Jul 2 2021, 8:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 2 2021, 8:04 AM
simoll retitled this revision from [VP} vp.select selects lanes on condition and pivot to [VP] vp.select selects lanes on condition and pivot.Jul 2 2021, 8:06 AM
simoll added a comment.EditedJul 2 2021, 8:11 AM

FYI This is also intended for the vp passthru idiom.
vp.select does not use the %evl like the other intrinsics. Instead, everything that is masked out comes from the second input.

%data = vp.gather(%ptr, %mask, %evl)
%gather_with_passthru = vp.select(%mask, %data, %passthru_value, %evl)
simoll updated this revision to Diff 362694.Jul 29 2021, 2:58 AM
simoll retitled this revision from [VP] vp.select selects lanes on condition and pivot to [VP] Declaration and docs for vp.select intrinsic.
simoll edited the summary of this revision. (Show Details)
  • Added i1 tail_passthru immediate to set passthru behavior above the explicit vector length.

For the case of pass-thru, does it make sense to give it a different name like vp.overwrite / vp.insert / vp.update / vp.merge (in lack of better names). When there is pass through, this intrinsic looks to me it can be understood as first taking the whole on_false value and then, to build the result, selectively (as defined by the mask below the EVL) replacing elements with the corresponding values from on_true.

Then the case without pass-thru I'd call it vp.select because this does seem the obvious extension from the IR instruction select to VPred.

I understood from an earlier VPred sync call that you already had considered this scheme with two intrinsics and it had some drawbacks, right? Mind to elaborate? Perhaps a vectorizer will mostly always use vp.merge and never vp.select so the latter is less interesting to have?

llvm/docs/LangRef.rst
17794

I might be reading this wrong, shouldn't it be

The pivot creates a mask, %pivotMask, with all elements ``0 <= i < %pivot`` set to ``1``

(i.e. swap %pivot and %pivotMask occurrences)

17827

I think this should be %m (instead of %M)

For the case of pass-thru, does it make sense to give it a different name like vp.overwrite / vp.insert / vp.update / vp.merge (in lack of better names). When there is pass through, this intrinsic looks to me it can be understood as first taking the whole on_false value and then, to build the result, selectively (as defined by the mask below the EVL) replacing elements with the corresponding values from on_true.

I tend to agree. Mostly, i am worried that the intrinsic id is no longer sufficient to know whether everything above %evl can be ignored and then there is that extra parameter. That's unnecessarily irregular.

Then the case without pass-thru I'd call it vp.select because this does seem the obvious extension from the IR instruction select to VPred.

Agreed.

I understood from an earlier VPred sync call that you already had considered this scheme with two intrinsics and it had some drawbacks, right? Mind to elaborate? Perhaps a vectorizer will mostly always use vp.merge and never vp.select so the latter is less interesting to have?

The original scheme had a vp.merge intrinsic with pivot and no mask and vp.select as you'd expect it. That was designed only with VE in mind where a select will always retain the lanes above evl.
I mostly disliked the verbosity and hoped to fuse the functionality of both in one intrinsic.
There is no way around this now that we need a concise way to express select with evl.
Having a vp.merge(%m, %x, %y, %evl) that reains lanes above evl and vp.select(%m, %x, %y, %evl)that doesn't would be fine though.

simoll updated this revision to Diff 367240.Aug 18 2021, 10:10 AM
simoll edited the summary of this revision. (Show Details)
  • removed i1 mode parameter (will address this with another intrinsic, eg vp.merge, which retains lane above %evl(
  • rebased

Just nits and questions are this stage, really.

llvm/docs/LangRef.rst
17771

Nit: the select instruction is defined as having a "condition" whereas this is a "mask". There's a bit of mixing of "mask" and "condition" in the follow paragraphs too. Should we try to stick to "condition"?

llvm/include/llvm/IR/Intrinsics.td
1510

Nit: is this really a shuffle? I guess in a way it is, but it feels like "shuffles" and "selects" are already two distinct terms in the non-VP world.

simoll added inline comments.Aug 19 2021, 11:41 PM
llvm/docs/LangRef.rst
17771

+1 for consistency. The difference to regular select is that the condition/mask has to be a vector. How about "condition vector"?

llvm/include/llvm/IR/Intrinsics.td
1510

I think it's nice to group all shuffle intrinsics together (intrinsics that rearrange lanes). Actually, vp.splice and the (upcoming) vp.merge/thePivotThing should be in the shuffle category as well.
It's pretty inconsequential anyway as this understanding of "shuffle" only exists in these source code comments and does not appear in the docs.

simoll updated this revision to Diff 367750.Aug 20 2021, 2:53 AM

Phrasing: mask --> condition vector

rogfer01 accepted this revision.Sep 1 2021, 11:52 PM

LGTM. This will enable using those in VPlan as a later followup of @vkmr patches.

This revision is now accepted and ready to land.Sep 1 2021, 11:52 PM
This revision was landed with ongoing or failed builds.Sep 2 2021, 2:18 AM
This revision was automatically updated to reflect the committed changes.
rogfer01 added inline comments.Sep 2 2021, 9:38 AM
llvm/include/llvm/IR/Intrinsics.td
1511

Deep apologies because I didn't realise this earlier: shouldn't this intrinsic be at least IntrNoMem? It was in the reference patch and it'll probably make a difference when you add VP_SELECT.