This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+
Closed · Public

Authored by tra on Aug 23 2019, 10:14 AM.

Details

Summary

The vote.ballot instruction is no longer available in recent CUDA versions, and
vote.sync.ballot cannot be used in its place because it requires a thread mask
parameter. Fortunately, PTX 6.2 (introduced with CUDA-9.2) provides the
activemask.b32 instruction for this purpose.
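As a rough illustration of the approach described in the summary, the sketch below wraps activemask.b32 in inline PTX and calls it from a divergent branch. It is not the code from this patch: the names active_mask_sketch and record_masks are hypothetical, and the sketch assumes a CUDA 9.2 or newer toolkit and a GPU that supports PTX 6.2.

```cuda
#include <cstdio>
#include <cuda.h>          // CUDA_VERSION
#include <cuda_runtime.h>

// Assumption: this sketch only targets toolkits that ship PTX 6.2.
#if CUDA_VERSION < 9020
#error "This sketch assumes CUDA 9.2 or newer (PTX 6.2)."
#endif

// Hypothetical helper (not the name used in the patch).
// activemask.b32 writes a 32-bit mask of the lanes that are
// currently active in the calling warp. Unlike vote.sync.ballot,
// it takes no thread mask operand.
__device__ inline unsigned int active_mask_sketch() {
  unsigned int mask;
  asm volatile("activemask.b32 %0;" : "=r"(mask));
  return mask;
}

// Hypothetical kernel: only even lanes query the mask, so the
// recorded value reflects whichever lanes happened to be executing
// together at that point.
__global__ void record_masks(unsigned int *out) {
  unsigned int lane = threadIdx.x % 32;
  out[threadIdx.x] = 0;
  if (lane % 2 == 0) {
    out[threadIdx.x] = active_mask_sketch();
  }
}

int main() {
  unsigned int *d_out = nullptr;
  unsigned int h_out[32];

  if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return 1;

  record_masks<<<1, 32>>>(d_out);

  // cudaMemcpy on the default stream waits for the kernel to finish.
  if (cudaMemcpy(h_out, d_out, sizeof(h_out),
                 cudaMemcpyDeviceToHost) != cudaSuccess) {
    cudaFree(d_out);
    return 1;
  }
  cudaFree(d_out);

  for (int i = 0; i < 32; ++i) {
    printf("lane %2d: 0x%08x\n", i, h_out[i]);
  }
  return 0;
}
```

The printed masks are deliberately not asserted here. The set of lanes reported as active depends on how the compiler lowers the branch and on how the hardware schedules the warp, so the result should be treated as a snapshot rather than a guarantee about which threads will later converge.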

Event Timeline

tra created this revision. · Aug 23 2019, 10:14 AM
timshen accepted this revision. · Aug 23 2019, 3:37 PM
This revision is now accepted and ready to land. · Aug 23 2019, 3:37 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · Sep 3 2019, 10:34 AM