gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/include/llvm/IR/IntrinsicsAMDGPU.td | ||
---|---|---|
1990 | Missing the clang changes and tests for this |
Use Register() in place of NoRegister. Add implicit operands in convertToThreeAddress.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
3270 | I think you could just raise the upper bound of the loop above to E = MI.getNumOperands() instead of adding this extra call? |
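The suggestion above can be sketched outside of LLVM as follows. This is a minimal model, not the code under review: `Operand`, `Instr`, `copyOperands` and `cloneOperands` are hypothetical stand-ins, and only the method names `getNumOperands()` and `getNumExplicitOperands()` mirror the real `MachineInstr` interface. It assumes implicit operands are stored after the explicit ones, so raising the loop's upper bound is enough to carry them over without a second call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine operand (not llvm::MachineOperand).
struct Operand {
  int Reg;
  bool IsImplicit;
};

// Hypothetical stand-in for a machine instruction. Assumption: explicit
// operands come first and implicit operands are appended after them.
struct Instr {
  std::vector<Operand> Ops;
  std::size_t NumExplicit = 0;

  std::size_t getNumExplicitOperands() const { return NumExplicit; }
  std::size_t getNumOperands() const { return Ops.size(); }
};

// Append operands [Begin, End) of Src to Dst.
inline void copyOperands(const Instr &Src, Instr &Dst, std::size_t Begin,
                         std::size_t End) {
  assert(Begin <= End && End <= Src.getNumOperands() && "range out of bounds");
  for (std::size_t I = Begin; I != End; ++I)
    Dst.Ops.push_back(Src.Ops[I]);
}

// Build a copy of Src's operand list. With IncludeImplicit set, the upper
// bound is getNumOperands(), so the same loop also copies implicit operands.
inline Instr cloneOperands(const Instr &Src, bool IncludeImplicit) {
  Instr Dst;
  Dst.NumExplicit = Src.getNumExplicitOperands();
  const std::size_t E = IncludeImplicit ? Src.getNumOperands()
                                        : Src.getNumExplicitOperands();
  copyOperands(Src, Dst, 0, E);
  return Dst;
}
```

Whether a single loop is appropriate in the real patch depends on how the new instruction's operand list is built, which this sketch does not model.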
Better way to copy implicit operands.
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3739–3740 | I don't know what to do here. The intrinsic has an i1 field. If you put in a non-i1 value, that will be reported, right? Given that, we are only asserting that other parts of ISel haven't transformed this value incorrectly. Also, we do the same thing in selectDotIUVOP3PMods, line 3726. Please let me know what could be done. |
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3739–3740 | I'd just invert the check below to != 0. The machine verifier is certainly not enforcing that this be 0/-1 for booleans. Practically speaking, this would only come up for hand-written MIR. |
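As a rough illustration of the difference between the two checks, the sketch below compares a strict form that asserts the immediate is 0 or -1 with a relaxed form that treats any non-zero value as true. The names `kModBit`, `modsStrict` and `modsNonZero` are hypothetical, and the exact shape of the code in AMDGPUInstructionSelector.cpp is not shown in this review, so this is an assumption about its general pattern rather than a copy of it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical modifier bit, named for illustration only.
constexpr unsigned kModBit = 1u << 0;

// Strict form: asserts that the boolean immediate is exactly 0 or -1 and
// sets the modifier only for -1.
inline unsigned modsStrict(std::int64_t Imm) {
  assert((Imm == 0 || Imm == -1) && "expected a 0/-1 boolean immediate");
  return Imm == -1 ? kModBit : 0u;
}

// Relaxed form: any non-zero immediate counts as true, so no assert is
// needed and values such as 1 from hand-written input are accepted.
inline unsigned modsNonZero(std::int64_t Imm) {
  return Imm != 0 ? kModBit : 0u;
}
```

Both forms agree on 0 and -1; they differ only for other values, which the strict form rejects in assert-enabled builds.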
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3739–3740 | Indeed, this was strongly inspired by the existing code in selectDotIUVOP3PMods, which handles the intrinsic in a similar way. |
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3739 | No real point to the assert anymore |