This is an archive of the discontinued LLVM Phabricator instance.

[libc] Implement memory fences on NVPTX
ClosedPublic

Authored by jhuber6 on Mar 23 2023, 7:52 AM.

Details

Summary

Memory fences are not handled by the NVPTX backend, so we need to replace
them with a memory barrier intrinsic. The intrinsic does not take the
requested memory ordering into account, but it should still provide the
necessary functionality, albeit more slowly.
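
As a rough illustration of the substitution described above, here is a minimal sketch. It is not the code from this revision: the names `sketch_thread_fence`, `Mailbox`, `publish`, and `try_consume` are hypothetical, and the sketch assumes a Clang or GCC style compiler that provides the `__atomic_*` builtins, the `__NVPTX__` target macro, and (when targeting NVPTX) the `__nvvm_membar_sys` builtin.

```cpp
// Hypothetical helper; the name is illustrative, not taken from the patch.
// `order` is one of the compiler's __ATOMIC_* constants.
inline void sketch_thread_fence(int order) {
#if defined(__NVPTX__)
  // Assumption: the backend cannot lower a fence, so the requested ordering
  // is dropped and a system-scope memory barrier is issued instead.
  (void)order;
  __nvvm_membar_sys();
#else
  // Other targets can use the ordinary fence builtin.
  __atomic_thread_fence(order);
#endif
}

// Illustrative message-passing pattern built on the fence (hypothetical).
struct Mailbox {
  int payload = 0;
  int ready = 0;
};

// Write the payload, fence, then raise the flag with a relaxed store.
inline void publish(Mailbox &box, int value) {
  box.payload = value;
  sketch_thread_fence(__ATOMIC_RELEASE);
  __atomic_store_n(&box.ready, 1, __ATOMIC_RELAXED);
}

// Read the flag with a relaxed load, fence, then read the payload.
inline bool try_consume(Mailbox &box, int &out) {
  if (__atomic_load_n(&box.ready, __ATOMIC_RELAXED) == 0)
    return false;
  sketch_thread_fence(__ATOMIC_ACQUIRE);
  out = box.payload;
  return true;
}
```

On the NVPTX path both the release and the acquire fence turn into the same barrier, which is the sense in which the ordering is not included.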

Event Timeline

jhuber6 created this revision. Mar 23 2023, 7:52 AM
jhuber6 requested review of this revision. Mar 23 2023, 7:52 AM

Does it have to be sys? Does gl (kernel level) work?

It should be sys, as far as I understand, because this is intended to be used on the Nvidia USM to implement RPC. Also, I believe __atomic_thread_fence defaults to system scope on AMDGPU as well.

tianshilei1992 accepted this revision. Mar 23 2023, 8:17 AM

Oh I see. That's for shared memory. LG.

This revision is now accepted and ready to land. Mar 23 2023, 8:17 AM
This revision was automatically updated to reflect the committed changes.