Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+.
32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.
Paths
| Differential D17280
AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} ClosedPublic Authored by • tstellarAMD on Feb 15 2016, 3:10 PM.
Details Summary Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.
Diff Detail Event Timelinerivanvx updated this object. Comment Actions Please ignore changes in lib/Target/AMDGPU/SIIntrinsics.td, that's unrelated and was uploaded by accident. Will fix on reupload.
Comment Actions You could also try adding this to the set of handled memory nodes in performDAGCombine. This as is won't get that combine since currently it only runs on the legal dag, so the LDS atomic_cmp_swap won't appear anymore Comment Actions Fixed all comments except intrinsics, will upload a new revision with tests and without intrinsics chages when I have tests ready. rivanvx retitled this revision from Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} to AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2}.Feb 17 2016, 4:42 AM rivanvx edited edge metadata. Comment Actions Yes, I will post the updated patch per arsenm's suggestions tomorrow and one with tests over the weekend. I didn't find time to write the tests, sorry. nhaehnle edited edge metadata. nhaehnle added inline comments.
This revision now requires changes to proceed.Mar 10 2016, 5:11 PM Comment Actions Hi, I worked on a similar patch. The tests probably need adapting for your solution, but feel free to reuse any part that you find useful: Jan
rivanvx edited edge metadata. Comment ActionsMerged Jan's approach. It would be nice if this works, because it's a much smaller diff. Currently fails with: llc: workspace/llvm/include/llvm/CodeGen/ValueTypes.h:249: unsigned int llvm::EVT::getVectorNumElements() const: Assertion `isVector() && "Invalid vector type!"' failed. I have to figure out why, will post an updated one later today. rivanvx updated this object. rivanvx edited edge metadata. Comment ActionsWorks now. Since there were no changes to MUBUF_Atomic multiclass, this addresses the concerns by @nhaehnle. Existing tests with flat instructions need to be updated since now we respect the constraint output = input. Before doing that, I would like to make sure this is correct, especially LowerATOMIC_CMP_SWAP(). Comment Actions I had to make some changes to this patch in order to avoid breaking As a result of this change, I also had to add stand-alone patterns for rivanvx edited edge metadata. Comment ActionsLGTM and 32-bit version of the instruction tested on Tonga, works.
Closed by commit rL265170: AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} (authored by tstellar). · Explain WhyApr 1 2016, 11:33 AM This revision was automatically updated to reflect the committed changes. Comment Actions LGTM, just one inline comment.
Revision Contents
Diff 48018 lib/Target/AMDGPU/AMDGPUISelLowering.h
lib/Target/AMDGPU/AMDGPUISelLowering.cpp
lib/Target/AMDGPU/AMDGPUInstrInfo.td
lib/Target/AMDGPU/AMDGPUInstructions.td
lib/Target/AMDGPU/CIInstructions.td
lib/Target/AMDGPU/SIISelLowering.h
lib/Target/AMDGPU/SIISelLowering.cpp
lib/Target/AMDGPU/SIInstrInfo.td
lib/Target/AMDGPU/SIInstructions.td
lib/Target/AMDGPU/SIIntrinsics.td
|
You should only need the one node. You need to remove the hardcoded vector type. What you need is SDTCisEltOfVec for the type constraint