Don't outline the kernel in the test file, as this prevents some debug info from being stripped out. The CUDA driver doesn't support PTX with debug info, causing the conversion to cubin to fail.
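As a rough illustration of the failure mode, a heuristic check for debug info in a PTX module might look like the sketch below. The specific markers (a `,debug` option on the `.target` directive and `.debug_*` sections) are assumptions about how debug-enabled PTX typically looks, not an exhaustive or authoritative check:

```python
def ptx_has_debug_info(ptx: str) -> bool:
    """Heuristically detect whether a PTX module carries debug info.

    Assumption for illustration: PTX emitted with debug info marks the
    target line with a ',debug' option and/or contains .debug_* sections.
    """
    for line in ptx.splitlines():
        stripped = line.strip()
        # A target line like '.target sm_75, debug' signals debug compilation.
        if stripped.startswith(".target") and "debug" in stripped:
            return True
        # DWARF-style debug sections embedded in the module.
        if ".debug_info" in stripped or ".debug_abbrev" in stripped:
            return True
    return False

plain = """
.version 7.0
.target sm_75
.address_size 64
"""

debug = """
.version 7.0
.target sm_75, debug
.address_size 64
.section .debug_info { }
"""

print(ptx_has_debug_info(plain))  # False
print(ptx_has_debug_info(debug))  # True
```

A module for which such a check returns true would be the kind of input the driver rejects when converting PTX to cubin, which is why the debug info needs to be stripped (or not emitted) for these tests.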
Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
The test started failing with https://github.com/llvm/llvm-project/commit/81467f500f6ad106a69088bc276024c5e1938571. I'll also enable these tests on the Google buildbots that have Tesla T4 GPUs once this is fixed.
Thanks! This is something I wasn't aware of. BTW, I tested these on a Turing GPU with CUDA 10.2 and they passed, but maybe they fail on some other devices.
mlir/test/Integration/GPU/CUDA/TensorCore/wmma-matmul-f16.mlir, line 46:

I think this has been present since this file was added, but it is not required anymore. Can you please drop it? Or should I remove it in a subsequent patch?
Yes, I ended up doing a more generic fix for the problem, as it was causing more general problems: https://reviews.llvm.org/D103187
This still feels like a small improvement so I'll move forward with this patch unless you have any concerns.