This is an archive of the discontinued LLVM Phabricator instance.

libclc: clspv: fix fma, add vstore and fix inlining issues
ClosedPublic

Authored by rjodinchr on Apr 7 2023, 2:14 AM.

Details

Reviewers
alan-baker
kpet
Summary

fma:
Although the previous fma implementation passed the CTS fma test, cos (which uses fma) failed because of it.
The new version is closer to the default libclc implementation. It still avoids ulong so as not to introduce a dependency on 64-bit integers (u64).
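
As a rough illustration of what working without ulong involves (this is not the code from the patch), a full 32x32-bit product can be carried as two 32-bit halves. The sketch below is plain C rather than OpenCL C, and `u32pair` and `mul_wide_u32` are hypothetical names standing in for a uint2-style representation.

```c
#include <stdint.h>

/* Hypothetical 64-bit value held as two 32-bit halves (stand-in for a uint2). */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} u32pair;

/* Full 32x32 -> 64-bit product computed with 32-bit arithmetic only. */
u32pair mul_wide_u32(uint32_t a, uint32_t b) {
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Four 16x16 partial products, each of which fits in 32 bits. */
    uint32_t ll = a_lo * b_lo;
    uint32_t lh = a_lo * b_hi;
    uint32_t hl = a_hi * b_lo;
    uint32_t hh = a_hi * b_hi;

    /* Contributions to bits 16..31 of the result; the sum cannot overflow. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    u32pair r;
    r.lo = (ll & 0xFFFFu) | (mid << 16);
    r.hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16); /* carries from mid */
    return r;
}
```

In OpenCL C the high half would more likely come from mul_hi, but the split above shows the carry handling explicitly.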

vstore:
Based on the default libclc implementation. I added several conditions that were needed for the vstore tests to pass on different platforms (tested on Nvidia, Intel and AMD).
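
For reference, the addressing that vstoreN is expected to follow (element offset multiplied by the vector width) can be sketched in plain C. This only illustrates the OpenCL semantics and is not taken from the patch. `sketch_vstore4_f32` is a hypothetical name, and the platform-specific conditions mentioned above are not modelled.

```c
#include <stddef.h>

/* Hypothetical scalar stand-in for OpenCL's vstore4(data, offset, p):
 * writes the four components to p[offset * 4] .. p[offset * 4 + 3]. */
void sketch_vstore4_f32(const float data[4], size_t offset, float *p) {
    float *dst = p + offset * 4;
    for (size_t i = 0; i < 4; ++i) {
        dst[i] = data[i];
    }
}
```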

inlining issues:
At the moment, fma is inlined into functions like cos. But for architectures that have a compliant native implementation of fma, it is better not to inline it, so that the call can be replaced by the native implementation.
This change forces libclc builtins not to be inlined and adds an assume, so that clspv can remove the noinline once the NativeMathPass has run.
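
A minimal sketch of the idea, assuming a GCC/Clang-style noinline attribute: the builtin stays an out-of-line call inside its callers, so a later pass can still find and replace it. `SKETCH_BUILTIN_DEF`, `sketch_fma` and `sketch_poly` are hypothetical names, and the assume annotation is left out because its exact spelling is not given here.

```c
/* Hypothetical definition macro; noinline is a GCC/Clang attribute. */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_BUILTIN_DEF __attribute__((noinline))
#else
#define SKETCH_BUILTIN_DEF
#endif

/* Placeholder body: a plain multiply-add, not a correctly rounded fma. */
SKETCH_BUILTIN_DEF float sketch_fma(float a, float b, float c) {
    return a * b + c;
}

/* The caller keeps a real call to sketch_fma instead of an inlined copy. */
float sketch_poly(float x) {
    return sketch_fma(x, x, 1.0f);
}
```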

Also add missing builtins needed for some platforms (at least SwiftShader and Nvidia).

Event Timeline

rjodinchr created this revision. Apr 7 2023, 2:14 AM
Herald added a project: Restricted Project. Apr 7 2023, 2:14 AM
Herald added a subscriber: jvesely.
rjodinchr requested review of this revision. Apr 7 2023, 2:14 AM

I have a regression in the CTS fma test on SwiftShader, while it passes on Nvidia, Intel and AMD. I need to look at it more closely.

rjodinchr updated this revision to Diff 512398. Apr 11 2023, 4:51 AM

Update diff to fix clang-format issues

rjodinchr updated this revision to Diff 512836. Apr 12 2023, 8:01 AM

Add missing builtins needed for some platforms (at least swiftshader and nvidia)

rjodinchr edited the summary of this revision. Apr 12 2023, 8:01 AM
rjodinchr updated this revision to Diff 512879. Apr 12 2023, 9:46 AM

Add a missing file that was removed in the previous update

This revision is now accepted and ready to land. Apr 17 2023, 7:45 AM
kpet closed this revision. May 9 2023, 8:53 AM

Pushed as 21508fa76914a5e4281dc5bc77cac7f2e8bc3aef.