This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Run AMDGPUCodeGenPrepare after scalar opts
Closed, Public

Authored by arsenm on Aug 24 2019, 1:51 PM.

Details

Reviewers
rampitec
cfang
Summary

The mul24 matching could interfere with SLSR (straight-line strength
reduction) and the other addressing-mode-related passes. This is
probably not the optimal placement, but it is an intermediate step.
The pass should probably be moved after all the generic IR passes,
particularly LSR (loop strength reduction). Moving it after LSR seems
to help in some cases and hurt in others.
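
For readers unfamiliar with the transform, here is a minimal sketch of the
property that mul24 matching relies on: a 24-bit multiply agrees with a
full 32-bit multiply only when both operands are known to fit in 24 bits.
The helper names (mulU24, fitsInU24, mul32, checkMul24Examples) are
hypothetical and only model the arithmetic; they are not LLVM APIs and
not the code in this patch.

```cpp
#include <cassert>
#include <cstdint>

// Model of an unsigned 24-bit multiply: only the low 24 bits of each
// operand are read, and the low 32 bits of the product are returned.
// (Hypothetical helper for illustration; not an LLVM API.)
constexpr uint32_t mulU24(uint32_t a, uint32_t b) {
  const uint64_t lowA = a & 0xFFFFFFu;
  const uint64_t lowB = b & 0xFFFFFFu;
  return static_cast<uint32_t>(lowA * lowB);
}

// True when the value fits in 24 bits, i.e. bits 24..31 are zero. A
// matcher would have to prove this before rewriting a 32-bit multiply.
constexpr bool fitsInU24(uint32_t v) { return (v >> 24) == 0; }

// Plain 32-bit multiply (wraps modulo 2^32).
constexpr uint32_t mul32(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(static_cast<uint64_t>(a) * b);
}

// Small self-check: the two multiplies agree when both operands fit in
// 24 bits and can disagree when one of them does not.
inline void checkMul24Examples() {
  assert(fitsInU24(0x123456u) && fitsInU24(0x10u));
  assert(mulU24(0x123456u, 0x10u) == mul32(0x123456u, 0x10u));

  assert(!fitsInU24(0x01000001u));
  assert(mulU24(0x01000001u, 3u) != mul32(0x01000001u, 3u));
}
```

Once a multiply has been rewritten into the 24-bit form, later generic
passes no longer see an ordinary multiply, which is one plausible way the
matching can get in the way of strength reduction.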

As-is in this patch, in idiv-licm, it saves 1-2 instructions inside
some of the loop bodies, but increases the number in others. Moving
the pass later helps these loops. In the new LSR tests in
mul24-pass-ordering, the intrinsic prevents introducing more
instructions in the loop preheader, so moving the pass later ends up
hurting them. This shouldn't be any worse than before the intrinsics
were introduced in r366094, and LSR should probably be smarter. I
think it's because LSR doesn't know that the 'and' instruction inside
the loop will be folded away.
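
One reading of that last guess, sketched under stated assumptions: if the
multiply only reads the low 24 bits of its operands, an explicit 24-bit
mask on an operand is redundant and can be dropped once the multiply is
in its 24-bit form. The names below (kMask24, mul24Model,
withExplicitAnd, withAndFolded) are hypothetical and model only the
arithmetic, not the actual tests or the pass.

```cpp
#include <cstdint>

constexpr uint32_t kMask24 = 0xFFFFFFu;

// Hypothetical model of a 24-bit multiply that ignores operand bits
// 24..31 and returns the low 32 bits of the product.
constexpr uint32_t mul24Model(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(static_cast<uint64_t>(a & kMask24) *
                               (b & kMask24));
}

// Loop body as it might look before folding: mask the index, then
// multiply by the stride.
constexpr uint32_t withExplicitAnd(uint32_t i, uint32_t stride) {
  return mul24Model(i & kMask24, stride);
}

// Same computation with the mask folded away: the multiply already
// discards the high bits, so the result is unchanged.
constexpr uint32_t withAndFolded(uint32_t i, uint32_t stride) {
  return mul24Model(i, stride);
}

// Compile-time check that the two forms agree even when the high bits
// of the index are set.
static_assert(withExplicitAnd(0xFF000005u, 7u) ==
                  withAndFolded(0xFF000005u, 7u),
              "mask is redundant under the 24-bit multiply model");
```

A cost model that counts the mask as a real instruction in the loop would
overestimate the loop body by one operation in this situation.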

Event Timeline

arsenm created this revision. Aug 24 2019, 1:51 PM
This revision is now accepted and ready to land. Aug 24 2019, 2:04 PM
arsenm closed this revision. Aug 26 2019, 5:07 PM

r369991