This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler
ClosedPublic

Authored by tstellarAMD on Aug 23 2016, 2:02 PM.

Details

Summary

The SILoadStoreOptimizer can now look ahead more then one instruction when
looking for instructions to merge, which greatly improves the number of
loads/stores that we are able to merge.

Moving the pass before scheduling avoids increasing register pressure after
the scheduler, so that the scheduler's register pressure estimates will be
more accurate. It also gives more consistent results, since it is no longer
affected by minor scheduling changes.

Diff Detail

Repository
rL LLVM

Event Timeline

tstellarAMD retitled this revision from to AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler.
tstellarAMD updated this object.
tstellarAMD added a reviewer: arsenm.
tstellarAMD added a subscriber: llvm-commits.
arsenm added inline comments.Aug 23 2016, 2:45 PM
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
499 ↗(On Diff #69035)

This isn't needed here since addMachineSSAOptimization isn't called with None anyway

547–552 ↗(On Diff #69035)

The extra run should be removed now

lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
78 ↗(On Diff #69035)

ArrayRef

84 ↗(On Diff #69035)

ArrayRef

107 ↗(On Diff #69035)

Does this need to be required? If it's not required is it available in the current pass pipeline? I think there's a subtarget hook we needed to enable to use this

198 ↗(On Diff #69035)

Should be a SmallVector

226–249 ↗(On Diff #69035)

It is not possible to have a sub register def in SSA so I don't think you need any of this

lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
547–552 ↗(On Diff #69035)

The SIShrinkInstructionsPass depends on the being run. I don't think we can remove it until we fix the SIShrinkInstructionsPass.

tstellarAMD marked 6 inline comments as done.

Remove extra RegisterCoalescer run, and address other review comments.

lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
107 ↗(On Diff #69035)

This is required so we don't regress tests in llvm.memcpy.ll, The alias analysis passes are always run in TargetPassConfig::addIRPasses(). The subtarget hook looks like it is only used by the SelectionDAG passes.

226–249 ↗(On Diff #69035)

I dropped the code that checks for writes, but the code that checks for reads doesn't have anything to do with sub-registers.

arsenm added inline comments.Aug 24 2016, 7:07 PM
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
107 ↗(On Diff #69194)

I think when I was looking at the hook it was for AA for scheduling

141–143 ↗(On Diff #69194)

I think you can directly initialize the SmallVector with the range, SmallVector<> Foo(MI.defs())

149 ↗(On Diff #69194)

SIInstrInfo

155 ↗(On Diff #69194)

Can use ->mayLoadOrStore

216 ↗(On Diff #69194)

ditto

364 ↗(On Diff #69194)

Missing space

442 ↗(On Diff #69194)

SmallVector

tstellarAMD marked 4 inline comments as done.

Address review comments.

arsenm accepted this revision.Aug 25 2016, 10:00 AM
arsenm edited edge metadata.

LGTM

This revision is now accepted and ready to land.Aug 25 2016, 10:00 AM
This revision was automatically updated to reflect the committed changes.