Extend the atomic optimizer to handle signed and unsigned max and min
operations, as well as add and subtract.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
| llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll | ||
|---|---|---|
| 240 ↗ | (On Diff #208380) | Can you add tests for i64? |
| llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll | ||
|---|---|---|
| 240 ↗ | (On Diff #208380) | Will do, though I'm little nervous of both (a) the combinatorial explosion of unit tests ({add,sub,max,min,umax,umin} x {i32,i64} x {constant,uniform,varying} x {buffer,rawbuffer,structbuffer,localpoint,globalpointer} ...) and (b) the lack of any end-to-end tests to check that the optimization is actually doing the right thing and computing the right result. |
| llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp | ||
|---|---|---|
| 322 ↗ | (On Diff #208664) | Can you move this to a separate getIdentityValue function? |
| 328 ↗ | (On Diff #208664) | Shouldn't this just be zero for add/sub? |
| 330–338 ↗ | (On Diff #208664) | The min and max values seem backwards? min(MinValue, X) -> MinValue, not X |
| 388 ↗ | (On Diff #208664) | I don't understand why these are reading lane 63. Why not lane 0? Why not readfirstlane? This won't work with wave64, but I guess that's an existing problem |
| llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll | ||
| 240 ↗ | (On Diff #208380) | I pretty consistently find bugs by enumerating all of the combinations. End-to-end execution tests would be nice, but won't be in the llvm tests |
| llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp | ||
|---|---|---|
| 328 ↗ | (On Diff #208664) | Yes, getMinValue means min unsigned value which is zero. |
| 330–338 ↗ | (On Diff #208664) | I got one of these cases wrong in the first patch but they should all be correct now. The identity for min() is MaxValue and vice versa. |
| 388 ↗ | (On Diff #208664) | Here's how the exclusive scan works: |
| llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp | ||
|---|---|---|
| 276 ↗ | (On Diff #208887) | As a future step you can handle and/or/xor |