The common GPU transformation that lowers math operations to function
calls in the gpu-to-nvvm and gpu-to-rocdl passes handles vector types by
applying the function to each scalar element and returning a new vector.
However, a typo caused the result vector to be accumulated incorrectly,
so the rewrite returned an llvm.mlir.undef result instead of the correct
vector. This patch fixes the accumulation and strengthens the tests.
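
The failure mode described above can be sketched in plain C++, without any MLIR dependency. This is only an illustrative model, not the code from the patch: `Lane`, `Vec`, `insertElement`, `scalarizeBuggy`, and `scalarizeFixed` are hypothetical names, and the assumption is that the typo amounted to dropping the value produced by each element insertion, so the initial undef vector was returned unchanged.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one lane of an SSA vector value.
// `defined == false` models a lane that is still undef.
struct Lane {
  bool defined = false;
  float value = 0.0f;
};
using Vec = std::vector<Lane>;

// Models an SSA-style element insertion: it returns a NEW vector and
// leaves its input untouched, as an insertelement-like op would.
Vec insertElement(const Vec &v, float x, std::size_t i) {
  Vec out = v;
  out[i].defined = true;
  out[i].value = x;
  return out;
}

// Assumed shape of the bug: the value returned by each insertion is
// discarded, so `result` is still the all-undef vector at the end.
Vec scalarizeBuggy(const std::vector<float> &in, float (*fn)(float)) {
  Vec result(in.size()); // starts as undef in every lane
  for (std::size_t i = 0; i < in.size(); ++i)
    (void)insertElement(result, fn(in[i]), i); // new vector is dropped
  return result;
}

// Assumed shape of the fix: thread each new vector through the loop.
Vec scalarizeFixed(const std::vector<float> &in, float (*fn)(float)) {
  Vec result(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    result = insertElement(result, fn(in[i]), i); // accumulate
  return result;
}
```

In this model, `scalarizeBuggy` returns a vector whose lanes are all undefined, while `scalarizeFixed` returns the per-element results, which mirrors the difference between the undef result and the correct vector.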
Diff Detail
- Repository: rG LLVM Github Monorepo