This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Optimize atomic max/min
ClosedPublic

Authored by foad on Jul 8 2019, 6:00 AM.

Details

Summary

Extend the atomic optimizer to handle signed and unsigned max and min
operations, as well as add and subtract.

Diff Detail

Repository
rL LLVM

Event Timeline

foad created this revision.Jul 8 2019, 6:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2019, 6:00 AM
arsenm added inline comments.Jul 8 2019, 7:09 AM
llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
240 ↗(On Diff #208380)

Can you add tests for i64?

foad marked an inline comment as done.Jul 8 2019, 7:35 AM
foad added inline comments.
llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
240 ↗(On Diff #208380)

Will do, though I'm little nervous of both (a) the combinatorial explosion of unit tests ({add,sub,max,min,umax,umin} x {i32,i64} x {constant,uniform,varying} x {buffer,rawbuffer,structbuffer,localpoint,globalpointer} ...) and (b) the lack of any end-to-end tests to check that the optimization is actually doing the right thing and computing the right result.

foad updated this revision to Diff 208424.Jul 8 2019, 8:12 AM

Add i64 tests.

foad updated this revision to Diff 208664.Jul 9 2019, 6:30 AM

Fix identity for signed min.

arsenm added inline comments.Jul 9 2019, 7:04 AM
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
322 ↗(On Diff #208664)

Can you move this to a separate getIdentityValue function?

328 ↗(On Diff #208664)

Shouldn't this just be zero for add/sub?

330–338 ↗(On Diff #208664)

The min and max values seem backwards? min(MinValue, X) -> MinValue, not X

388 ↗(On Diff #208664)

I don't understand why these are reading lane 63. Why not lane 0? Why not readfirstlane? This won't work with wave64, but I guess that's an existing problem

llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
240 ↗(On Diff #208380)

I pretty consistently find bugs by enumerating all of the combinations. End-to-end execution tests would be nice, but won't be in the llvm tests

foad updated this revision to Diff 208887.Jul 10 2019, 1:10 AM

New helper function getIdentityValueForAtomicOp.

foad marked 4 inline comments as done.Jul 10 2019, 1:23 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
328 ↗(On Diff #208664)

Yes, getMinValue means min unsigned value which is zero.

330–338 ↗(On Diff #208664)

I got one of these cases wrong in the first patch but they should all be correct now. The identity for min() is MaxValue and vice versa.

388 ↗(On Diff #208664)

Here's how the exclusive scan works:
ExclScan lane 0 is 0
ExclScan lane 1 is V lane 0
ExclScan lane 2 is V lane 0+1
ExclScan lane 3 is V lane 0+1+2
...
ExclScan lane 63 is V lane 0+...+62
Here we're reading lane 63 of NewV = ExclScan+V, i.e. the sum of all lanes in V.

arsenm accepted this revision.Jul 16 2019, 10:00 AM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
276 ↗(On Diff #208887)

As a future step you can handle and/or/xor

This revision is now accepted and ready to land.Jul 16 2019, 10:00 AM
This revision was automatically updated to reflect the committed changes.
foad marked an inline comment as done.Jul 16 2019, 10:49 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
276 ↗(On Diff #208887)

Yup, see D64809!