Fixes OCL CTS rsqrt and half_rsqrt (1 thread, scalar) tests on AMD Turks.
Looks good to me from a code standpoint.
The test fails CTS before and after on my BARTS (6850), but that's when running libclc master with LLVM 9.0, so there might be a precision issue that's fixed in newer code. It doesn't make the issue worse, at least.
That's surprising. Does it fail for scalar only? Do you have the asm dump?
I thought Barts and Turks were largely identical when it came to the compute pipeline.
This should really produce just a single instruction, so a newer LLVM is unlikely to fix it.