Differential D131265
Fixed sm version for .and bmma operator.
Authored by JackAKirk on Aug 5 2022, 8:30 AM.
Details
As stated here (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-mma):
".and operation in single-bit wmma requires sm_80 or higher."
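For illustration only, the requirement quoted from the PTX documentation (".and operation in single-bit wmma requires sm_80 or higher") can be sketched as a small compile-time check. The names `BmmaBitOp`, `min_sm_for_bmma`, and `bmma_supported` are hypothetical and are not taken from this patch; the sm_75 baseline for `.xor` is an assumption based on the general single-bit wmma requirement in the same PTX section.

```cpp
// Hypothetical names for illustration; not the code changed by this patch.
enum class BmmaBitOp { Xor, And };

// Minimum SM version (major * 10 + minor) for a single-bit wmma bit operation.
// .and -> sm_80, per the PTX note quoted in the summary.
// .xor -> sm_75 is an assumption (baseline for single-bit wmma).
constexpr int min_sm_for_bmma(BmmaBitOp op) {
    return op == BmmaBitOp::And ? 80 : 75;
}

// True if a target with the given SM version may use the bit operation.
constexpr bool bmma_supported(BmmaBitOp op, int sm) {
    return sm >= min_sm_for_bmma(op);
}

static_assert(!bmma_supported(BmmaBitOp::And, 75), ".and must be rejected on sm_75");
static_assert(bmma_supported(BmmaBitOp::And, 80), ".and is accepted on sm_80");
static_assert(bmma_supported(BmmaBitOp::Xor, 75), ".xor is accepted on sm_75 (assumed)");
```

A gate written this way rejects the `.and` variant on sm_75 targets, which is the kind of mismatch the summary describes.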
Event Timeline
Thanks. If you could land it for me that would be much appreciated. I don't have the rights.
Looks like the tests needed to be updated (and I've found one bug which explains how we've missed this).