This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Add activemask intrinsic.
AbandonedPublic

Authored by ABataev on Sep 3 2019, 3:25 PM.

Details

Summary

Since PTX62 vote.ballot cannot be used to get the mask of active
threads, instead activemask.b32 instruction must be used. Required for
Cuda10.

LLVM part of the patches to fix PR43156.

Diff Detail

Event Timeline

ABataev created this revision.Sep 3 2019, 3:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 3 2019, 3:25 PM
Herald added a subscriber: jdoerfert. · View Herald Transcript
tra added inline comments.Sep 3 2019, 3:55 PM
include/llvm/IR/IntrinsicsNVVM.td
4096

Are these attribute sufficient to prevent CSE'ing out of divergent branches.

E.g. we must not allow transforming this:

if (cond(threadIdx.x))
  foo(activemask());
else
  bar(activemask());

into that:

int x = activemask();
if(cond(threadIdx.x)
  foo(x);
else
  bar(x);

It would be great to add a test for that.

4097

AFAICT NVCC does not provide __nvvm_vote_activemask builtin.

ABataev marked 2 inline comments as done.Sep 3 2019, 3:58 PM
ABataev added inline comments.
include/llvm/IR/IntrinsicsNVVM.td
4096

Will add a test

4097

Will remove it, copy-paste.

ABataev abandoned this revision.Sep 4 2019, 7:22 AM
ABataev marked an inline comment as done.