Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+.
32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.
|  Differential  D17280  
AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} Authored by • tstellarAMD on Feb 15 2016, 3:10 PM. 
Details Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý. 
Diff Detail Event TimelineComment Actions Please ignore changes in lib/Target/AMDGPU/SIIntrinsics.td, that's unrelated and was uploaded by accident. Will fix on reupload. 
 
 Comment Actions You could also try adding this to the set of handled memory nodes in performDAGCombine. This as is won't get that combine since currently it only runs on the legal dag, so the LDS atomic_cmp_swap won't appear anymore Comment Actions Fixed all comments except intrinsics, will upload a new revision with tests and without intrinsics chages when I have tests ready. Comment Actions Yes, I will post the updated patch per arsenm's suggestions tomorrow and one with tests over the weekend. I didn't find time to write the tests, sorry. 
 Comment Actions Hi, I worked on a similar patch. The tests probably need adapting for your solution, but feel free to reuse any part that you find useful: Jan 
 
 Comment Actions Merged Jan's approach. It would be nice if this works, because it's a much smaller diff. Currently fails with: llc: workspace/llvm/include/llvm/CodeGen/ValueTypes.h:249: unsigned int llvm::EVT::getVectorNumElements() const: Assertion `isVector() && "Invalid vector type!"' failed. I have to figure out why, will post an updated one later today. Comment Actions Works now. Since there were no changes to MUBUF_Atomic multiclass, this addresses the concerns by @nhaehnle. Existing tests with flat instructions need to be updated since now we respect the constraint output = input. Before doing that, I would like to make sure this is correct, especially LowerATOMIC_CMP_SWAP(). Comment Actions I had to make some changes to this patch in order to avoid breaking As a result of this change, I also had to add stand-alone patterns for 
 Comment Actions LGTM, just one inline comment. 
 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
You should only need the one node. You need to remove the hardcoded vector type. What you need is SDTCisEltOfVec for the type constraint