Fixes OCL CTS rsqrt and half_rsqrt (1 thread, scalar) tests on AMD Turks.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Comment Actions
Looks good to me from a code standpoint.
The test fails CTS both before and after this change on my Barts (HD 6850), but that is with libclc master on LLVM 9.0, so there might be a precision issue that is fixed in newer code. At least the change doesn't make the issue worse.
Comment Actions
Thanks.
That's surprising. Does it fail for scalar only? Do you have the asm dump?
I thought Barts and Turks were largely identical when it came to the compute pipeline.
This should really produce just a single instruction, so a newer LLVM is unlikely to fix it.