This is an archive of the discontinued LLVM Phabricator instance.

[libc] Fix the `send_n` and `recv_n` utilities under divergent lanes
ClosedPublic

Authored by jhuber6 on May 19 2023, 1:06 PM.

Details

Summary

We provide the `send_n` and `recv_n` utilities as a generic way to
stream data between the two sides of the process. These were previously
tested, and performed as expected, with a string of constant size.
However, when the size was allowed to diverge between the threads in the
warp or wavefront, they could deadlock. This did not occur on NVPTX
because of the explicit warp sync. On AMD, however, one of the
work items in the wavefront could continue executing and hit the next
recv call before the other threads. That violated the RPC invariants
and caused a deadlock.

This patch replaces the for loop with a loop whose exit condition is a
thread ballot. Every thread in the warp or wavefront keeps executing the
loop until all of them can exit. This acts as a more explicit wavefront sync.
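
As a rough illustration of the ballot loop described above (not the code from this patch), the following host-side C++ sketch simulates a small group of lanes whose chunk counts diverge. The names `kLanes`, `ballot`, `Trace`, and `simulate_send_n` are hypothetical, and `ballot` here is a plain CPU stand-in for the hardware ballot, assumed to return a bitmask of the active lanes whose predicate is true. The point it shows is that every active lane runs the same number of loop rounds, while only lanes that still have data do real work.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical lane count for the simulation; real warps/wavefronts are wider.
constexpr std::size_t kLanes = 4;

// CPU stand-in for a hardware ballot (an assumption, not the libc API):
// returns a bitmask of the lanes in lane_mask whose predicate is true.
std::uint64_t ballot(std::uint64_t lane_mask,
                     const std::array<bool, kLanes> &pred) {
  std::uint64_t result = 0;
  for (std::size_t lane = 0; lane < kLanes; ++lane)
    if (((lane_mask >> lane) & 1) && pred[lane])
      result |= std::uint64_t{1} << lane;
  return result;
}

struct Trace {
  std::array<std::uint64_t, kLanes> chunks_sent{}; // real chunks per lane
  std::uint64_t rounds = 0;                        // lockstep loop rounds
};

// Simulates a send_n-style loop: num_sends[lane] is how many chunks each
// lane needs, and lane_mask selects the lanes taking part in the call.
Trace simulate_send_n(const std::array<std::uint64_t, kLanes> &num_sends,
                      std::uint64_t lane_mask) {
  Trace trace;
  for (std::uint64_t idx = 0;; ++idx) {
    // Each lane evaluates its own "do I still have work?" predicate.
    std::array<bool, kLanes> has_work{};
    for (std::size_t lane = 0; lane < kLanes; ++lane)
      has_work[lane] = idx < num_sends[lane];

    // Keep looping while ANY active lane has work, so no lane exits early.
    if (ballot(lane_mask, has_work) == 0)
      break;

    // All active lanes take part in this round; only lanes with work copy.
    for (std::size_t lane = 0; lane < kLanes; ++lane)
      if (((lane_mask >> lane) & 1) && has_work[lane])
        ++trace.chunks_sent[lane];
    ++trace.rounds;
  }
  return trace;
}
```

With per-lane chunk counts of 1, 3, 0, and 2 and all four lanes active, the simulated loop runs three rounds for every lane, matching the largest count, so no lane moves on to a later call while its neighbours are still streaming.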

Event Timeline

jhuber6 created this revision. May 19 2023, 1:06 PM
Herald added projects: Restricted Project, Restricted Project. May 19 2023, 1:06 PM
jhuber6 requested review of this revision. May 19 2023, 1:06 PM
jhuber6 updated this revision to Diff 523932. May 19 2023, 1:57 PM

Accidentally removed the old test.

JonChesterfield accepted this revision. May 23 2023, 8:35 AM

Does `process.get_packet(index).header.mask` hoist out of the loop? If not, it might be worth doing that manually. I'm also interested in whether index is in an SGPR at this point.

Those are codegen effectiveness questions though, and the high-level approach of continuing while any lane thinks there's work to do seems reasonable.

This revision is now accepted and ready to land. May 23 2023, 8:35 AM