Page MenuHomePhabricator

[AMDGPU] Ressociate 'add (add x, y), z' to use SALU
ClosedPublic

Authored by rampitec on Feb 13 2019, 4:51 PM.

Details

Summary

Reassociate adds to collect scalar operands in a single
instruction when possible. That will result in a scalar
add followed by vector instead of two vector adds, thus
better utilizing SALU.

Diff Detail

Repository
rL LLVM

Event Timeline

rampitec created this revision.Feb 13 2019, 4:51 PM
arsenm added inline comments.Feb 13 2019, 5:18 PM
lib/Target/AMDGPU/SIISelLowering.cpp
8547 ↗(On Diff #186778)

Is preserving the flags really correct? ReassociateOps seems to not do it

test/CodeGen/AMDGPU/reassoc-scalar.ll
1 ↗(On Diff #186778)

Should have a gfx9 run line also because of add3

rampitec marked an inline comment as done.Feb 13 2019, 5:45 PM
rampitec added inline comments.
lib/Target/AMDGPU/SIISelLowering.cpp
8547 ↗(On Diff #186778)

I think the final add has the same behavior as initial. Otherwise how could analysis tell us it was something like nuw or nsw in a first place?

But if you think this is questionable I can remove it. We do not use these flags in case of full dword adds anyway.

rampitec updated this revision to Diff 186795.Feb 13 2019, 9:14 PM
rampitec marked 3 inline comments as done.

Removed flags.
Added gfx9 run line.

test/CodeGen/AMDGPU/reassoc-scalar.ll
1 ↗(On Diff #186778)

Added. However v_add3 was not generated for these tests.

arsenm added inline comments.Feb 14 2019, 6:17 AM
lib/Target/AMDGPU/SIISelLowering.cpp
8532–8534 ↗(On Diff #186795)

Maybe move this to a helper function with the opcode as a parameter like ReassociateOps. We probably want to do this for the other reassociatable opcodes next

rampitec updated this revision to Diff 186879.Feb 14 2019, 11:14 AM
rampitec marked an inline comment as done.

Factored out reassociateScalarOps().

rampitec updated this revision to Diff 186882.Feb 14 2019, 11:30 AM

Moved VT check right into function.

rampitec updated this revision to Diff 186889.Feb 14 2019, 11:45 AM

Added v2i32 test.

arsenm accepted this revision.Feb 14 2019, 1:58 PM

LGTM

This revision is now accepted and ready to land.Feb 14 2019, 1:58 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2019, 2:11 PM