This is an archive of the discontinued LLVM Phabricator instance.

[LV] Improve register pressure estimation if MaxLocalUsers is zero
Needs Review · Public

Authored by yrouban on Apr 3 2023, 3:48 AM.

Details

Summary

Do not limit the LoopVectorize interleave count by MaxLocalUsers when MaxLocalUsers is zero.

Event Timeline

yrouban created this revision. Apr 3 2023, 3:48 AM
Herald added a project: Restricted Project. Apr 3 2023, 3:48 AM
yrouban requested review of this revision. Apr 3 2023, 3:48 AM
yrouban retitled this revision from "[LV] Improve register pressure estimation for MaxLocalUsers is zero" to "[LV] Improve register pressure estimation if MaxLocalUsers is zero". Apr 4 2023, 7:59 AM
fhahn added a comment. Apr 4 2023, 9:39 AM

Do you have any performance data motivating the change and ruling out any regressions?

llvm/test/Transforms/LoopVectorize/X86/interleave-count.ll, line 38

Please update the test to use opaque pointers. Also, it would be good to put up a separate patch that just adds the test, so that the diff here only includes the changes caused by this patch.

yrouban marked an inline comment as done. Apr 4 2023, 9:16 PM

Do you have any performance data motivating the change and ruling out any regressions?

No. I investigated AVX512 memset code generated by a non-LLVM compiler; it used vectorized move instructions on zmm registers and was unrolled as if it had an interleave count of 16. While trying to achieve the same result with LoopVectorize, I found this nit. It is not a problem as long as the number of vector registers is large enough, that is, as long as the register count, once decremented and bit-floored, is still larger than the other interleave count limits (e.g. X86TTIImpl::getMaxInterleaveFactor() returns 4 for AVX).
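
To make the arithmetic in the reply above concrete, here is a minimal sketch; it is not the actual LoopVectorize code. It assumes, as the reply describes, that with no in-loop register users the register-based cap is the vector register count decremented by one and then bit-floored, and that the final interleave count is the minimum of that cap and the target limit. The helpers bitFloor, registerBasedCap, and effectiveInterleaveCount are hypothetical names introduced for illustration only.

```cpp
#include <algorithm>
#include <cassert>

// Largest power of two that is <= x (returns 0 for x == 0).
unsigned bitFloor(unsigned x) {
  unsigned result = 0;
  // b wraps to 0 after the top bit, which ends the loop.
  for (unsigned b = 1; b != 0 && b <= x; b <<= 1)
    result = b;
  return result;
}

// Hypothetical model of the register-based cap when no value in the loop
// occupies a vector register: decrement the register count, then bit-floor.
// This is an assumption taken from the description in the thread.
unsigned registerBasedCap(unsigned numVectorRegs) {
  assert(numVectorRegs > 0 && "need at least one vector register");
  return bitFloor(numVectorRegs - 1);
}

// The effective interleave count is limited by both the register-based cap
// and the target's own maximum interleave factor (again, a simplification).
unsigned effectiveInterleaveCount(unsigned numVectorRegs,
                                  unsigned targetMaxInterleave) {
  return std::min(registerBasedCap(numVectorRegs), targetMaxInterleave);
}
```

Under these assumptions, a target with 16 or 32 vector registers and a target limit of 4 ends up at 4 either way, so the register-based cap stays invisible. It only becomes the binding limit once the target limit exceeds it: effectiveInterleaveCount(32, 32) is 16 rather than 32.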

llvm/test/Transforms/LoopVectorize/X86/interleave-count.ll, line 38

Done. See D147588.

yrouban updated this revision to Diff 510993. Apr 4 2023, 9:17 PM
yrouban marked an inline comment as done.

Extracted the test into a separate patch.

yrouban updated this revision to Diff 520930. May 10 2023, 1:16 AM

Just rebased on top of the updated D147588.