This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler
ClosedPublic

Authored by tstellarAMD on Aug 23 2016, 2:02 PM.

Details

Summary

The SILoadStoreOptimizer can now look ahead more then one instruction when
looking for instructions to merge, which greatly improves the number of
loads/stores that we are able to merge.

Moving the pass before scheduling avoids increasing register pressure after
the scheduler, so that the scheduler's register pressure estimates will be
more accurate. It also gives more consistent results, since it is no longer
affected by minor scheduling changes.

Diff Detail

Event Timeline

tstellarAMD retitled this revision from to AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler.
tstellarAMD updated this object.
tstellarAMD added a reviewer: arsenm.
tstellarAMD added a subscriber: llvm-commits.
arsenm added inline comments.Aug 23 2016, 2:45 PM
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
499

This isn't needed here since addMachineSSAOptimization isn't called with None anyway

542–543

The extra run should be removed now

lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
78

ArrayRef

84

ArrayRef

107

Does this need to be required? If it's not required is it available in the current pass pipeline? I think there's a subtarget hook we needed to enable to use this

198–241

Should be a SmallVector

226–249

It is not possible to have a sub register def in SSA so I don't think you need any of this

lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
542–543

The SIShrinkInstructionsPass depends on the being run. I don't think we can remove it until we fix the SIShrinkInstructionsPass.

tstellarAMD marked 6 inline comments as done.

Remove extra RegisterCoalescer run, and address other review comments.

lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
107

This is required so we don't regress tests in llvm.memcpy.ll, The alias analysis passes are always run in TargetPassConfig::addIRPasses(). The subtarget hook looks like it is only used by the SelectionDAG passes.

226–249

I dropped the code that checks for writes, but the code that checks for reads doesn't have anything to do with sub-registers.

arsenm added inline comments.Aug 24 2016, 7:07 PM
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
107

I think when I was looking at the hook it was for AA for scheduling

141–143

I think you can directly initialize the SmallVector with the range, SmallVector<> Foo(MI.defs())

149

SIInstrInfo

155

Can use ->mayLoadOrStore

216

ditto

364

Missing space

442

SmallVector

tstellarAMD marked 4 inline comments as done.

Address review comments.

arsenm accepted this revision.Aug 25 2016, 10:00 AM
arsenm edited edge metadata.

LGTM

This revision is now accepted and ready to land.Aug 25 2016, 10:00 AM
This revision was automatically updated to reflect the committed changes.