This is an archive of the discontinued LLVM Phabricator instance.

libclc/r600: Use target specific builtins to implement rsqrt and native_rsqrt
ClosedPublic

Authored by jvesely on Feb 4 2020, 6:08 PM.

Details

Summary

Fixes OCL CTS rsqrt and half_rsqrt (1 thread, scalar) tests on AMD Turks.

Diff Detail

Event Timeline

jvesely created this revision. Feb 4 2020, 6:08 PM
Herald added a project: Restricted Project. Feb 4 2020, 6:08 PM
awatry accepted this revision. Feb 7 2020, 5:50 PM

Looks good to me from a code standpoint.

The test fails CTS before and after on my BARTS (6850), but that's when running libclc master with LLVM 9.0, so there might be a precision issue that's fixed in newer code. It doesn't make the issue worse, at least.

This revision is now accepted and ready to land. Feb 7 2020, 5:50 PM

> Looks good to me from a code standpoint.
>
> The test fails CTS before and after on my BARTS (6850), but that's when running libclc master with LLVM 9.0, so there might be a precision issue that's fixed in newer code. It doesn't make the issue worse, at least.

Thanks.
That's surprising. Does it fail for scalar only? Do you have the asm dump?
I thought Barts and Turks were largely identical when it comes to the compute pipeline.
This should really produce just a single instruction, so a newer LLVM is unlikely to fix it.

This revision was automatically updated to reflect the committed changes.