Reassociate adds to collect scalar operands in a single
instruction when possible. This results in a scalar add
followed by a vector add instead of two vector adds, thus
making better use of the SALU.
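As a rough illustration of the effect described above, here is a minimal toy model in C++. It is not the code from this patch and does not use any LLVM API: `Operand`, `AddCost`, `addNaive`, and `addReassociated` are hypothetical names, and the only assumption is that an add can run on the scalar unit when both of its inputs are uniform.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not LLVM code: a 32-bit value plus a flag saying whether it is
// uniform across the wavefront (scalar-eligible) or divergent (vector).
struct Operand {
  uint32_t value;
  bool uniform;
};

// Counts how many adds of each kind were needed.
struct AddCost {
  int scalarAdds = 0;
  int vectorAdds = 0;
};

// Assumption of this sketch: an add is scalar only if both inputs are uniform.
Operand add(Operand a, Operand b, AddCost &cost) {
  bool uniform = a.uniform && b.uniform;
  if (uniform)
    ++cost.scalarAdds;
  else
    ++cost.vectorAdds;
  // Unsigned arithmetic wraps, so both association orders give the same value.
  return {a.value + b.value, uniform};
}

// Original association: (v + s0) + s1. The divergent operand taints both adds.
Operand addNaive(Operand v, Operand s0, Operand s1, AddCost &cost) {
  return add(add(v, s0, cost), s1, cost);
}

// Reassociated form: v + (s0 + s1). The two uniform operands are combined first.
Operand addReassociated(Operand v, Operand s0, Operand s1, AddCost &cost) {
  return add(v, add(s0, s1, cost), cost);
}
```

With one divergent operand and two uniform ones, `addNaive` records two vector adds, while `addReassociated` records one scalar add and one vector add, and both return the same value.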
Event Timeline
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
8584 | I think the final add has the same behavior as the initial one. Otherwise, how could the analysis tell us it was nuw or nsw in the first place? But if you think this is questionable, I can remove it. We do not use these flags for full dword adds anyway. |
Removed flags.
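For context on why dropping the flags is the conservative choice, the following self-contained C++ sketch shows the general hazard with signed no-wrap assumptions under reassociation. It is an illustration only, not code from this patch; `signedAddOverflows` and `wrappingAdd` are hypothetical helpers written for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true if the 32-bit signed add a + b would overflow, i.e. if an
// nsw-style "no signed wrap" assumption on that add would be violated.
bool signedAddOverflows(int32_t a, int32_t b) {
  int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  return wide > std::numeric_limits<int32_t>::max() ||
         wide < std::numeric_limits<int32_t>::min();
}

// Two's-complement wrapping add, computed in unsigned arithmetic so the
// addition itself has no undefined behavior. The conversion back to int32_t
// is modular in C++20 and on common compilers before that.
int32_t wrappingAdd(int32_t a, int32_t b) {
  return static_cast<int32_t>(static_cast<uint32_t>(a) +
                              static_cast<uint32_t>(b));
}
```

Take `a = -1`, `b = INT32_MAX`, `c = 1`. In `(a + b) + c`, neither add overflows. In `a + (b + c)`, the inner add overflows, and the final add then overflows as well, even though the wrapped result is identical. So in this example neither add in the reassociated form could keep a no-signed-wrap flag.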
Added gfx9 run line.
test/CodeGen/AMDGPU/reassoc-scalar.ll | ||
---|---|---|
2 | Added. However, v_add3 was not generated for these tests. |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
8569–8571 | Maybe move this to a helper function that takes the opcode as a parameter, like ReassociateOps. We probably want to do this for the other reassociatable opcodes next. |