Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+.
32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.
Differential D17280
AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} • tstellarAMD on Feb 15 2016, 3:10 PM. Authored by
Details Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.
Diff Detail
Event TimelineComment Actions Please ignore changes in lib/Target/AMDGPU/SIIntrinsics.td, that's unrelated and was uploaded by accident. Will fix on reupload.
Comment Actions You could also try adding this to the set of handled memory nodes in performDAGCombine. This as is won't get that combine since currently it only runs on the legal dag, so the LDS atomic_cmp_swap won't appear anymore Comment Actions Fixed all comments except intrinsics, will upload a new revision with tests and without intrinsics chages when I have tests ready. Comment Actions Yes, I will post the updated patch per arsenm's suggestions tomorrow and one with tests over the weekend. I didn't find time to write the tests, sorry.
Comment Actions Hi, I worked on a similar patch. The tests probably need adapting for your solution, but feel free to reuse any part that you find useful: Jan
Comment Actions Merged Jan's approach. It would be nice if this works, because it's a much smaller diff. Currently fails with: llc: workspace/llvm/include/llvm/CodeGen/ValueTypes.h:249: unsigned int llvm::EVT::getVectorNumElements() const: Assertion `isVector() && "Invalid vector type!"' failed. I have to figure out why, will post an updated one later today. Comment Actions Works now. Since there were no changes to MUBUF_Atomic multiclass, this addresses the concerns by @nhaehnle. Existing tests with flat instructions need to be updated since now we respect the constraint output = input. Before doing that, I would like to make sure this is correct, especially LowerATOMIC_CMP_SWAP(). Comment Actions I had to make some changes to this patch in order to avoid breaking As a result of this change, I also had to add stand-alone patterns for
Comment Actions LGTM, just one inline comment.
|
This should probably include the address space check of global_binary_atomic_op as well, right?
On a second look, the _nortn variant seems to be unused.