Specifically, we upgrade the following llvm.nvvm.* intrinsics:
- brev{32,64}
- clz.{i,ll}
- popc.{i,ll}
- abs.{i,ll}
- {min,max}.{i,ll,u,ull}
- h2f
These either map directly to an existing LLVM target-generic
intrinsic or map to a simple LLVM target-generic idiom.
In all cases, we check that the code we generate is lowered to PTX as we
expect.
This patch also adds implementations of the corresponding builtins to
clang.
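
As a rough illustration of what "target-generic intrinsic or idiom" means for the list above, the sketch below models the intended semantics in plain C++. The function names are hypothetical and are not the code in this patch. It assumes that clz maps to llvm.ctlz with the zero-is-undefined flag set to false (so clz of 0 yields the bit width, as PTX clz does), that popc.ll truncates its count to 32 bits, and that abs/min/max become a compare followed by a select. h2f is left out to keep the sketch short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: each one models the generic idiom an NVVM
// intrinsic is assumed to be upgraded to. None of this is the patch itself.

// brev32: models llvm.bitreverse.i32.
inline uint32_t brev32(uint32_t x) {
  uint32_t r = 0;
  for (int i = 0; i < 32; ++i) {
    r = (r << 1) | (x & 1u);  // move the lowest bit of x to the bottom of r
    x >>= 1;
  }
  return r;
}

// clz.i: models llvm.ctlz.i32 with is_zero_undef = false (assumption),
// so an input of 0 returns 32 instead of being undefined.
inline int32_t clz_i(uint32_t x) {
  int32_t n = 0;
  for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
    ++n;
  return n;
}

// popc.ll: models llvm.ctpop.i64 followed by a truncation to i32.
inline int32_t popc_ll(uint64_t x) {
  int32_t n = 0;
  for (; x != 0; x &= x - 1)  // clear the lowest set bit each iteration
    ++n;
  return n;
}

// abs.i: models select(x < 0, -x, x). The negation is done in unsigned
// arithmetic so that it wraps instead of overflowing.
inline int32_t abs_i(int32_t x) {
  return x < 0 ? static_cast<int32_t>(0u - static_cast<uint32_t>(x)) : x;
}

// max.ll / max.ull / min.u: model icmp + select. Only the signedness of
// the comparison differs between the signed and unsigned variants.
inline int64_t max_ll(int64_t a, int64_t b) { return a > b ? a : b; }
inline uint64_t max_ull(uint64_t a, uint64_t b) { return a > b ? a : b; }
inline uint32_t min_u(uint32_t a, uint32_t b) { return a < b ? a : b; }

// Small runtime check of the modelled semantics.
inline void check_idioms() {
  assert(brev32(0x80000000u) == 1u);
  assert(clz_i(1u) == 31);
  assert(popc_ll(~0ull) == 64);
  assert(abs_i(-5) == 5);
  assert(max_ll(-1, 2) == 2);
  assert(min_u(3u, 7u) == 3u);
}
```

The signed and unsigned max variants show why the suffix matters: the same bit pattern for -1 loses under max_ll but wins under max_ull.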
Shouldn't that be _ll? That was the name of the max variant taking long long arguments in BuiltinsNVPTX.def.
Speaking of which, BuiltinsNVPTX.def would need to have the corresponding builtins removed, too.