Extends the changes in D104847 and adds another MMA instruction variant and corresponding intrinsics & builtins.
That should allow clang to compile mma.h from CUDA-11.3.
Didn't test it much yet. There may still be some sharp corners.
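For readers unfamiliar with the b1 MMA variants, the arithmetic can be sketched in plain host C++. This is only a scalar reference for an assumed 8x8x128 tile, where each accumulator element gains the population count of a bitwise combination of one 128-bit row of A and one 128-bit column of B. The names `bmma_reference`, `BitOp`, and `BitRow` are hypothetical and are not part of the patch, the clang builtins, or mma.h.

```cpp
#include <array>
#include <bitset>
#include <cstdint>

// Assumed tile shape for the 1-bit MMA: 8x8 accumulators, 128 bits along K.
constexpr int M = 8, N = 8, K = 128;
using BitRow = std::bitset<K>;
using Accum = std::array<std::array<std::int32_t, N>, M>;

// Hypothetical selector for the bitwise operation applied before the popcount.
enum class BitOp { Xor, And };

// Scalar reference: d[i][j] = c[i][j] + popcount(op(a[i], b_cols[j])).
// a holds the rows of A; b_cols holds the columns of B.
inline Accum bmma_reference(const std::array<BitRow, M>& a,
                            const std::array<BitRow, N>& b_cols,
                            const Accum& c, BitOp op) {
  Accum d = c;
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < N; ++j) {
      const BitRow combined =
          (op == BitOp::And) ? (a[i] & b_cols[j]) : (a[i] ^ b_cols[j]);
      d[i][j] += static_cast<std::int32_t>(combined.count());
    }
  }
  return d;
}
```

With `BitOp::And` the result counts the bit positions set in both operands, while `BitOp::Xor` counts the positions where they differ.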
Differential D105384
[NVPTX, CUDA] Add .and.popc variant of the b1 MMA instruction.
Closed · Public · Authored by tra on Jul 2 2021, 5:07 PM.
Event Timeline
Herald added subscribers: bixia, hiraditya, yaxunl, jholewinski. · Jul 2 2021, 5:07 PM
Herald added projects: Restricted Project, Restricted Project. · Jul 2 2021, 5:07 PM
Comment: Good stuff! Thanks for adding this and adjusting the test generator. I have requested some minor changes, though nothing critical. Are the test failures related to these changes?
This revision now requires changes to proceed. · Jul 12 2021, 5:25 AM
Comment: Thank you for addressing my concerns. I am happy with the changes. Great work!
This revision is now accepted and ready to land. · Jul 13 2021, 2:22 AM
Comment: AFAICT, no, the test failures don't seem to be related. It appears that the test runs rarely succeed in general. They are mostly red for nearly all build attempts (https://reviews.llvm.org/harbormaster/build/?plan=PHID-HMCP-p2oc4ocen3l2yzymvg2l), even though LLVM buildbots are green (https://lab.llvm.org/buildbot/#/).
This revision was landed with ongoing or failed builds. · Jul 15 2021, 12:02 PM
Closed by commit rGd774b4aa5eac: [NVPTX, CUDA] Add .and.popc variant of the b1 MMA instruction. (authored by tra).
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 359085
clang/include/clang/Basic/BuiltinsNVPTX.def
clang/lib/CodeGen/CGBuiltin.cpp
clang/test/CodeGen/builtins-nvptx-mma.cu
clang/test/CodeGen/builtins-nvptx-mma.py
llvm/include/llvm/IR/IntrinsicsNVVM.td
llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
llvm/test/CodeGen/NVPTX/wmma.py