Extend the atomic optimizer to handle signed and unsigned max and min
operations, as well as add and subtract.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll | ||
---|---|---|
240 ↗ | (On Diff #208380) | Can you add tests for i64? |
llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll | ||
---|---|---|
240 ↗ | (On Diff #208380) | Will do, though I'm little nervous of both (a) the combinatorial explosion of unit tests ({add,sub,max,min,umax,umin} x {i32,i64} x {constant,uniform,varying} x {buffer,rawbuffer,structbuffer,localpoint,globalpointer} ...) and (b) the lack of any end-to-end tests to check that the optimization is actually doing the right thing and computing the right result. |
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp | ||
---|---|---|
322 ↗ | (On Diff #208664) | Can you move this to a separate getIdentityValue function? |
328 ↗ | (On Diff #208664) | Shouldn't this just be zero for add/sub? |
330–338 ↗ | (On Diff #208664) | The min and max values seem backwards? min(MinValue, X) -> MinValue, not X |
388 ↗ | (On Diff #208664) | I don't understand why these are reading lane 63. Why not lane 0? Why not readfirstlane? This won't work with wave64, but I guess that's an existing problem |
llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll | ||
240 ↗ | (On Diff #208380) | I pretty consistently find bugs by enumerating all of the combinations. End-to-end execution tests would be nice, but won't be in the llvm tests |
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp | ||
---|---|---|
328 ↗ | (On Diff #208664) | Yes, getMinValue means min unsigned value which is zero. |
330–338 ↗ | (On Diff #208664) | I got one of these cases wrong in the first patch but they should all be correct now. The identity for min() is MaxValue and vice versa. |
388 ↗ | (On Diff #208664) | Here's how the exclusive scan works: |
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp | ||
---|---|---|
276 ↗ | (On Diff #208887) | As a future step you can handle and/or/xor |