Presently the AMDGPU backend fails with an "unsupported call to function" error upon encountering a call to the llvm.log{,10}.{f16,f32} intrinsics. This patch adds custom lowering to avoid that error on both R600 and SI.
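For readers unfamiliar with this kind of lowering, here is a rough host-side sketch of one common approach: rewriting log and log10 in terms of log2 through the change-of-base identity, then scaling by a constant. The summary above does not say which approach the patch takes, so this is an assumption, and the names `kLn2`, `kLn2OverLn10`, `log_via_log2` and `log10_via_log2` are hypothetical and do not come from the patch.

```cpp
#include <cmath>

// Scale factors for the change-of-base identity log_b(x) = log2(x) * log_b(2).
// Hypothetical names; written as literals so they are compile-time constants.
constexpr float kLn2 = 0.693147180559945309f;         // ln(2)
constexpr float kLn2OverLn10 = 0.301029995663981195f; // ln(2) / ln(10) = log10(2)

// Natural logarithm expressed through log2: ln(x) = log2(x) * ln(2).
float log_via_log2(float x) { return std::log2(x) * kLn2; }

// Base-10 logarithm expressed through log2: log10(x) = log2(x) * log10(2).
float log10_via_log2(float x) { return std::log2(x) * kLn2OverLn10; }
```

In a backend the same shape would be built from DAG nodes instead of library calls. The sketch only illustrates the arithmetic.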
Event Timeline
Better version: handle log10 as well, define constants locally, handle f16, do not pretend to handle f64. Tests incoming.
lib/Target/AMDGPU/AMDGPUISelLowering.cpp | |
---|---|
250 | v2f32 and v4f32 can be moved to |
1891 | Does FDIV have good enough precision to do this? OCL requires 2.5 ULP, and I'm not sure how good the EG/CM hw is. |
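
The precision question above can at least be explored on the host. The following is a minimal sketch, assuming the two candidate formulations are "multiply by ln(2)" and "divide by log2(e)"; it counts how many representable floats separate each result from a double-precision reference rounded to float. All names (`ulp_distance`, `log_mul`, `log_div`, `log_ref`) are hypothetical. Host division is IEEE correctly rounded, so this says nothing about the accuracy of the EG/CM hardware FDIV; it only shows how such a measurement could be set up.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float's bit pattern onto an ordered integer line so that adjacent
// representable floats differ by exactly 1 (and -0.0f and +0.0f coincide).
static int64_t ordered_bits(float f) {
    int32_t i;
    std::memcpy(&i, &f, sizeof i);
    return i < 0 ? -static_cast<int64_t>(i & 0x7fffffff) : i;
}

// Number of representable floats between a and b (both assumed finite).
int64_t ulp_distance(float a, float b) {
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return d < 0 ? -d : d;
}

constexpr float kLn2 = 0.693147180559945309f;   // ln(2)
constexpr float kLog2E = 1.44269504088896341f;  // log2(e)

// Two candidate formulations of ln(x) on top of log2.
float log_mul(float x) { return std::log2(x) * kLn2; }
float log_div(float x) { return std::log2(x) / kLog2E; }

// Reference: evaluate in double, then round once to float.
float log_ref(float x) { return static_cast<float>(std::log(static_cast<double>(x))); }
```

Calling `ulp_distance(log_div(x), log_ref(x))` over a sweep of inputs gives an approximate error count in ULPs. Because the reference is itself rounded, the true error can differ by up to half a ULP.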
Sorry for the delay. Updated patch: it now passes tests and addresses all the comments. I would appreciate it if you could check whether I got all the R_ and A_ prefixes right in the f16 tests (it's just variable naming, but there is some logic behind it -- register vs address, I presume?).
No, absolutely not, it was a combination of limited time for Clover/Clang/LLVM and reluctance to finish things that are almost done but not "interesting" to complete. Thanks for the OK, full patch coming much sooner than this one. :-)
This should be named something else that doesn't collide with the standard names.