This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Process SDWA block at a time
Closed, Public

Authored by arsenm on Feb 8 2018, 12:21 PM.

Details

Summary

Right now the pass loops over the entire function every time there
is a change, which is not very efficient. There's no practical
reason to track this so globally: the code motion optimization
passes should be sinking instructions with single uses, and
the pass currently will not fold when there are multiple uses.
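
The cost difference described above can be illustrated with a small
standalone model. This is only a sketch under stated assumptions, not
the code from this revision: ToyBlock, scanBlock, runWholeFunction and
runPerBlock are hypothetical names, and each "scan" is assumed to
perform at most one fold that may expose another candidate in the same
block only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a basic block. PendingFolds models a chain of
// folds where each successful fold exposes the next one in the same block.
struct ToyBlock {
  int PendingFolds;
};

// Scan one block once and count the scan. Returns true if a fold happened.
static bool scanBlock(ToyBlock &B, std::size_t &Scans) {
  ++Scans;
  if (B.PendingFolds == 0)
    return false;
  --B.PendingFolds;
  return true;
}

// Function-wide fixed point: a change in any block rescans every block.
static std::size_t runWholeFunction(std::vector<ToyBlock> Blocks) {
  std::size_t Scans = 0;
  bool Changed;
  do {
    Changed = false;
    for (ToyBlock &B : Blocks)
      Changed |= scanBlock(B, Scans);
  } while (Changed);
  return Scans;
}

// Per-block fixed point: only the block that changed is rescanned.
static std::size_t runPerBlock(std::vector<ToyBlock> Blocks) {
  std::size_t Scans = 0;
  for (ToyBlock &B : Blocks)
    while (scanBlock(B, Scans))
      ; // Keep rescanning this block until it stops changing.
  return Scans;
}
```

For the hypothetical input {3, 0, 0, 1}, runWholeFunction performs 16
block scans while runPerBlock performs 8, and both carry out the same
folds. The model relies on the summary's premise that a fold never
exposes a new candidate in a different block.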

Event Timeline

arsenm created this revision. Feb 8 2018, 12:21 PM

I agree theoretically. Anyway, are there any regressions?

None of the tests regressed. I tried a few small samples with multiple blocks, and they all sank and were handled.

This revision is now accepted and ready to land. Feb 8 2018, 12:52 PM
arsenm closed this revision. Feb 8 2018, 2:48 PM

r324667