This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Add DS append/consume intrinsics
Closed, Public

Authored by arsenm on Jan 28 2019, 9:58 AM.

Details

Reviewers
rampitec
b-sumner
Summary

Unlike other DS instructions, these pass the pointer in m0, so they
need to worry about whether the address is uniform. This assumes the
address is dynamically uniform and just uses readfirstlane to get a
copy into an SGPR.
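
To illustrate why uniformity matters, here is a rough host-side C++ model of the lowering described above. It is only a sketch, not the actual backend code: the names `readFirstLane` and `dsAppendModel`, the 64-lane wave size, and the use of dword indices as addresses are all assumptions made for the example. It also assumes that append returns the old counter value and bumps the counter by the number of active lanes.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWaveSize = 64; // assumed wavefront size

using LaneValues = std::array<uint32_t, kWaveSize>;
using ExecMask = std::bitset<kWaveSize>;

// Model of readfirstlane: return the value held by the lowest active lane.
uint32_t readFirstLane(const LaneValues &vals, const ExecMask &exec) {
  assert(exec.any() && "at least one lane must be active");
  for (std::size_t lane = 0; lane < kWaveSize; ++lane)
    if (exec[lane])
      return vals[lane];
  return 0; // unreachable when the assertion holds
}

// Model of an append: the per-lane address is collapsed to one scalar
// (standing in for the copy into m0), then the counter at that address
// is advanced once for the whole wave. Addresses are dword indices here
// purely to keep the sketch short.
uint32_t dsAppendModel(std::vector<uint32_t> &lds, const LaneValues &addrs,
                       const ExecMask &exec) {
  uint32_t addr = readFirstLane(addrs, exec);
  uint32_t old = lds[addr];
  lds[addr] += static_cast<uint32_t>(exec.count());
  return old; // assumed: every active lane sees the pre-op value
}
```

If the per-lane addresses differ, this model silently uses the first active lane's address for the whole wave, which is the consequence of assuming a dynamically uniform address.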

I don't know whether these have the same problem on SI with the 16-bit
add for the addressing mode offset, but I've just assumed they do.

Also includes some misc. changes to avoid test differences between the
LDS and GDS versions.

Event Timeline

arsenm created this revision. Jan 28 2019, 9:58 AM
arsenm marked an inline comment as done. Jan 28 2019, 9:59 AM
arsenm added inline comments.
include/llvm/IR/IntrinsicsAMDGPU.td
412–413

Not sure if we really need these; I should probably drop them.

rampitec added inline comments. Jan 28 2019, 10:06 AM
lib/Target/AMDGPU/SIISelLowering.cpp
5505

Enable it or drop it.

I think it is perfectly reasonable to treat these as essentially relaxed-only atomic RMW operations and require the application to use fences or barriers if necessary. The ordering and scope are only needed if we ever need this operation to act as a non-relaxed atomic RMW.
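
As a loose host-side analogy for that position, the following C++ sketch reserves a slot with a relaxed atomic RMW and leaves ordering to explicit fences supplied by the caller. It is not AMDGPU code; `AppendBuffer`, `appendRelaxed`, `publish`, `tryObserve`, and the capacity of 16 are hypothetical names and values chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr uint32_t kCapacity = 16; // hypothetical buffer size

struct AppendBuffer {
  std::atomic<uint32_t> count{0};
  uint32_t data[kCapacity] = {};
};

// Reserve a slot with a relaxed RMW. This gives atomicity of the counter
// update only; it does not order the data store against other threads.
uint32_t appendRelaxed(AppendBuffer &buf, uint32_t value) {
  uint32_t slot = buf.count.fetch_add(1, std::memory_order_relaxed);
  assert(slot < kCapacity && "buffer overflow");
  buf.data[slot] = value;
  return slot;
}

// The application supplies ordering: a release fence before the flag
// store makes earlier writes visible to a reader that synchronizes on it.
void publish(std::atomic<bool> &ready) {
  std::atomic_thread_fence(std::memory_order_release);
  ready.store(true, std::memory_order_relaxed);
}

// Reader side: a relaxed load followed by an acquire fence pairs with the
// release fence in publish().
bool tryObserve(const std::atomic<bool> &ready) {
  if (!ready.load(std::memory_order_relaxed))
    return false;
  std::atomic_thread_fence(std::memory_order_acquire);
  return true;
}
```

The design point mirrors the comment above: the RMW itself carries no ordering or scope, and callers that need stronger guarantees add fences or barriers around it.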

arsenm updated this revision to Diff 183932. Jan 28 2019, 11:48 AM

Remove leftovers

This revision is now accepted and ready to land. Jan 28 2019, 11:50 AM
arsenm closed this revision. Jan 28 2019, 12:22 PM

r352422

phani added a subscriber: phani. Feb 12 2019, 7:49 PM