This is an archive of the discontinued LLVM Phabricator instance.

clang/OpenCL: Add inline implementations of sqrt in builtin header
ClosedPublic

Authored by arsenm on Jul 31 2023, 2:15 PM.

Details

Summary

We want the !fpmath metadata to be attached to the sqrt intrinsic to
make it to the backend lowering. Emit an available_externally
definition which uses the builtin, which emits the !fpmath.

Fixes #64264

Diff Detail

Event Timeline

arsenm created this revision.Jul 31 2023, 2:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 31 2023, 2:15 PM
Herald added a subscriber: Naghasan. · View Herald Transcript
arsenm requested review of this revision.Jul 31 2023, 2:15 PM
arsenm updated this revision to Diff 546164.Aug 1 2023, 11:36 AM
arsenm retitled this revision from [wip] clang/OpenCL: Add inline implementations of sqrt in builtin header to clang/OpenCL: Add inline implementations of sqrt in builtin header.
arsenm edited the summary of this revision. (Show Details)

Move to base header and fix test. update_cc_test_checks --include-generated-funcs does a terrible job here so write manually checks

Anastasia added inline comments.Aug 1 2023, 3:41 PM
clang/lib/Headers/opencl-c-base.h
832

Is this a generic implementation enough? Would some targets not need to do something different for this built-in?

Ideally this header is to be kept light so I am a bit worried about adding definitions of the functions here. Otherwise we will end up in the same situation as we one day were with opencl-c.h. So could these be left there instead? It might be good to check with @svenvh if TableGen header has already a way to do this function forwarding or can be extended to do such a thing. Then it would be implementable in the both header mechanisms. I don't know if Sven has some other ideas or opinions...

svenvh added a comment.Aug 2 2023, 7:13 AM

If we go with this approach, please also remove sqrt from clang/lib/Sema/OpenCLBuiltins.td (and ideally add a comment pointing out that sqrt is handled in opencl-c-base.h)

clang/lib/Headers/opencl-c-base.h
832

We did already discuss this a bit on the GitHub issue: https://github.com/llvm/llvm-project/issues/64264

856–858
arsenm updated this revision to Diff 546814.Aug 3 2023, 5:21 AM
arsenm marked an inline comment as done.
arsenm added inline comments.
clang/lib/Headers/opencl-c-base.h
832

As I mentioned on the ticket, it's only this one case so I'm not worried about adding a lot more to the base header. I think we can start by assuming llvm.sqrt always works correctly, I don't want to add more complexity to handle this case without a specific reason

Anastasia added inline comments.Aug 4 2023, 5:51 AM
clang/lib/Headers/opencl-c-base.h
832

As I mentioned on the ticket, it's only this one case so I'm not worried about adding a lot more to the base header.

This is how things normally start. Someone else might want to continue this approach because it is already there.

I think we can start by assuming llvm.sqrt always works correctly, I don't want to add more complexity to handle this case without a specific reason

Do you mean it would apply to all implementations? What I am missing here is why it is required to be in the headers? Is this because it needs to be inlined or is it because the compiler must see __builtin_elementwise_sqrt with the surrounding code where it is called from?

arsenm added inline comments.Aug 7 2023, 3:36 PM
clang/lib/Headers/opencl-c-base.h
832

Yes, I would expect any llvm consumer to correctly lower llvm.sqrt, and such an implementation would be correctly rounded and pass conformance with -cl-fp32-correctly-rounded-divide-sqrt

The goal is to get !fpmath attached to an llvm.sqrt call. The way this currently happens is based on the language, it gets emitted from the various __builtin_sqrt* calls

ping

The alternative is to directly put the !fpmath on the sqrt call sites but I have no idea how to do that

Anastasia accepted this revision.Sep 12 2023, 7:06 AM

If we think there are no better alternatives and implementation is generic enough for every vendor, LGTM!

This revision is now accepted and ready to land.Sep 12 2023, 7:06 AM

If we think there are no better alternatives and implementation is generic enough for every vendor, LGTM!

You could argue annotating the raw callsite is better but I don't know how to implement that