This is an archive of the discontinued LLVM Phabricator instance.

SI Load Store Optimizer: When merging with offset, use V_ADD_{I|U}32_e64
ClosedPublic

Authored by msearles on Jan 16 2018, 12:16 PM.

Details

Summary
  • Change the inserted add from the _e32 encoding ( V_ADD_{I|U}32_e32 ) to the _e64 encoding ( V_ADD_{I|U}32_e64 ) so that the add writes its carry to a vreg. This prevents the inserted v_add from clobbering VCC. The _e64 encoding doesn't accept a literal, so a mov instruction is also introduced to get the immediate into a register for the add to use.
  • Change pass name to "SI Load Store Optimizer"; this removes the '/', which complicates scripts.
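
To illustrate the first bullet, here is a minimal toy model in plain C++. It is not LLVM code: the `Inst` struct, both `emit*` helpers, `clobbers`, and the register names `%offset` and `%carry` are hypothetical stand-ins. It only records which registers each emitted sequence defines, assuming the _e32 add implicitly writes its carry to VCC while the _e64 add takes an explicit carry destination and needs its immediate materialized by a separate mov.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (not an LLVM type).
struct Inst {
    std::string opcode;
    std::vector<std::string> defs;  // registers written
    std::vector<std::string> uses;  // registers or literals read
};

// Hypothetical helper: the _e32 form takes the literal directly,
// but its carry-out implicitly lands in VCC.
std::vector<Inst> emitAddE32(const std::string &dst, const std::string &base,
                             uint32_t imm) {
    return {Inst{"V_ADD_I32_e32", {dst, "vcc"}, {std::to_string(imm), base}}};
}

// Hypothetical helper: the _e64 form writes its carry to a virtual register,
// but the literal has to be moved into a register first.
std::vector<Inst> emitAddE64(const std::string &dst, const std::string &base,
                             uint32_t imm) {
    return {
        Inst{"S_MOV_B32", {"%offset"}, {std::to_string(imm)}},
        Inst{"V_ADD_I32_e64", {dst, "%carry"}, {"%offset", base}},
    };
}

// True if any instruction in the sequence writes the given register.
bool clobbers(const std::vector<Inst> &seq, const std::string &reg) {
    for (const Inst &inst : seq)
        for (const std::string &def : inst.defs)
            if (def == reg)
                return true;
    return false;
}
```

In this model the _e64 sequence costs one extra instruction but leaves VCC untouched, which is the trade-off the summary describes.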

Event Timeline

msearles created this revision. Jan 16 2018, 12:16 PM
rampitec added inline comments. Jan 16 2018, 1:01 PM
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
501

Use COPY, not MOV.

506–507

If this is a no-carry version, a carry should not be added.
In general, do not create the instruction manually; use SIInstrInfo::getAddNoCarry().
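
(Editorial illustration, not part of the original comment.) The point about carry operands can be sketched with a toy model in plain C++. It does not use the LLVM API: `ToyInst`, `buildAdd`, the `hasAddNoCarry` flag, and the `%dead_carry` register name are all hypothetical. It assumes V_ADD_U32_e64 is the no-carry opcode and V_ADD_I32_e64 is the carry-writing one, and shows a single helper deciding whether a carry def is attached so that callers never add one by hand.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (not an LLVM type).
struct ToyInst {
    std::string opcode;
    std::vector<std::string> defs;  // registers written
};

// Hypothetical helper, loosely modeled on the idea of letting one function
// pick the add opcode and its operands instead of building it manually.
ToyInst buildAdd(bool hasAddNoCarry, const std::string &dst) {
    if (hasAddNoCarry)
        return ToyInst{"V_ADD_U32_e64", {dst}};  // no carry-out operand at all
    // Carry-writing form: the carry goes to an otherwise unused register.
    return ToyInst{"V_ADD_I32_e64", {dst, "%dead_carry"}};
}
```

With this shape, a caller that appended a carry operand unconditionally would produce a malformed no-carry add, which is the mistake the comment warns about.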

test/CodeGen/AMDGPU/merge-load-store-vreg.mir
2

Need tests for both the add with carry and the add without carry.

msearles planned changes to this revision. Jan 16 2018, 2:16 PM

test/CodeGen/AMDGPU/ds-combine-large-stride.ll is failing with this patch; it likely just needs a test update. No review is needed until I resolve this.

msearles updated this revision to Diff 130948. Jan 22 2018, 12:32 PM
  • Use scalar mov, not vector mov
  • Add test to merge-load-store-vreg.mir for nocarryadd opcodes
  • Update ds-combine-large-stride.ll. Note that the immediate is not folded when the nocarryadd opcodes are not used; this is a limitation in foldImmediates() and will be addressed in a follow-on patch
msearles marked 3 inline comments as done. Jan 22 2018, 12:33 PM
msearles added inline comments.
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
501

COPY doesn't allow an immediate; changed it to a scalar mov

This revision is now accepted and ready to land. Jan 22 2018, 12:40 PM
This revision was automatically updated to reflect the committed changes.
msearles marked an inline comment as done.