This is an archive of the discontinued LLVM Phabricator instance.

[LV] Predicated epilog vectorization
Needs Review · Public

Authored by dmgreen on Jan 30 2023, 1:18 AM.

Details

Summary

This extends the creation of epilogs in loop vectorization to include the ability to generate predicated loops from plans that are FoldTailByMasking. Provided that the main loop is unpredicated and FoldTailByMasking plans are available, it can be profitable to pick one of them, as long as it is quicker than a scalar loop and has a VF smaller than or equal to that of the main body.
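The selection rule described above can be sketched as a small standalone model. This is only an illustration under stated assumptions, not the code in the patch: `Candidate`, `pickPredicatedEpilog` and the per-iteration cost fields are hypothetical names invented here, and the real cost model in LoopVectorize.cpp is considerably more involved.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vectorization candidate; not an LLVM type.
struct Candidate {
  unsigned VF;            // vectorization factor
  double CostPerIter;     // assumed cost per original loop iteration
  bool FoldTailByMasking; // true if the plan is predicated
};

// Returns the index in Plans of the cheapest predicated candidate that is
// strictly cheaper than the scalar loop and has VF <= the main loop's VF.
// Returns -1 if the main loop is already predicated or nothing qualifies.
int pickPredicatedEpilog(const Candidate &Main, double ScalarCostPerIter,
                         const std::vector<Candidate> &Plans) {
  if (Main.FoldTailByMasking)
    return -1; // the main loop already folds the tail, no epilog needed
  int Best = -1;
  for (std::size_t I = 0; I < Plans.size(); ++I) {
    const Candidate &C = Plans[I];
    if (!C.FoldTailByMasking)
      continue; // only predicated plans are considered here
    if (C.VF > Main.VF)
      continue; // epilog VF must not exceed the main VF
    if (C.CostPerIter >= ScalarCostPerIter)
      continue; // must be quicker than the scalar loop
    if (Best < 0 || C.CostPerIter < Plans[Best].CostPerIter)
      Best = static_cast<int>(I);
  }
  return Best;
}
```

With an unpredicated main loop at VF 4 and a scalar cost of 3.0 per iteration, a predicated VF 8 plan is rejected for being too wide, while a predicated VF 4 plan costing 1.5 is accepted.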

The minimum epilog iteration check in iter.check and the vec.epilog.iter.check are both changed to always jump to the epilog loop. It is otherwise fairly straightforward, although some of the details may change as we start to use them.
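As a rough picture of the resulting loop shape, the following scalar model runs an unpredicated main loop followed by a predicated epilog, with no scalar remainder loop. It is a simplified sketch, not the IR the vectorizer emits: `Result` and `sumWithPredicatedEpilog` are hypothetical names, and the per-lane `Active` test stands in for a real lane mask.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result record: the sum plus how often each loop body ran.
struct Result {
  long Sum;
  std::size_t MainIters;
  std::size_t EpilogIters;
};

// Sums A using a main loop of width MainVF (full vectors only) and a
// predicated epilog of width EpiVF. Assumes MainVF > 0 and EpiVF > 0.
Result sumWithPredicatedEpilog(const std::vector<int> &A, std::size_t MainVF,
                               std::size_t EpiVF) {
  Result R{0, 0, 0};
  const std::size_t N = A.size();
  std::size_t I = 0;
  // Main vector loop: runs only while a full, unmasked vector fits.
  for (; I + MainVF <= N; I += MainVF) {
    for (std::size_t L = 0; L < MainVF; ++L)
      R.Sum += A[I + L];
    ++R.MainIters;
  }
  // Predicated epilog: entered whenever elements remain, even if the main
  // loop never ran. A lane is active only while its index is in bounds.
  for (; I < N; I += EpiVF) {
    for (std::size_t L = 0; L < EpiVF; ++L) {
      const bool Active = I + L < N;
      if (Active)
        R.Sum += A[I + L];
    }
    ++R.EpilogIters;
  }
  return R;
}
```

For ten elements with both widths set to 4, the main loop runs twice and the epilog runs once with two active lanes. For three elements the main loop is skipped and the epilog handles everything, which mirrors the checks jumping to the epilog rather than to a scalar loop.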

Diff Detail

Event Timeline

dmgreen created this revision. Jan 30 2023, 1:18 AM
Herald added a project: Restricted Project. Jan 30 2023, 1:18 AM
dmgreen requested review of this revision. Jan 30 2023, 1:18 AM

Hi @dmgreen, thanks for this patch - it adds some very useful functionality to the vectoriser and allows us to reduce the code size for epilogues too! I just had a few questions ...

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5696

Hi @dmgreen, to be honest this logic feels really counter-intuitive to me. At least on the surface, it seems to be saying if we explicitly do not want an epilogue, then ignore it and generate an epilogue. Do you know under which circumstances we would even reach this code?

5713–5714

I assume we also reach this point with tail-folded schemes and choose the lowest cost out of all the tail-folded and normal vectorized plans?

10448–10449

I think it's better if we can avoid calling this if we don't need to. We may go to a lot of effort to choose a suitable epilogue VF, looking through all the plans, only to just ignore the result completely if we're already tail-folding.

dmgreen updated this revision to Diff 504572. Mar 13 2023, 3:24 AM

Sorry for the delay - I've had less time than I would like to get back to this. I have updated and rebased the patch. There is still one large MVE issue I need to work through, and the combination of epilog vectorization + DataAndControlFlow is currently not working correctly. I will split that off into another patch, though, as it is a more involved change. There is also another patch for letting this be controlled by the target or an option.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5696

Yeah, because of how we got here, !isScalarEpilogueAllowed is equivalent to mayFoldTailByMasking(). I have updated the logic in this function.

5713–5714

We currently assume that, if a predicated remainder is possible, we want either the predicated remainder or a scalar one, i.e. an unpredicated vector epilog + scalar loop will not be better. As the trip count of the vector remainder is expected to be low, I think this should be an OK decision to take.

10448–10449

Yep sounds good. This is now handled at the start of selectEpilogueVectorizationFactor.

dmgreen updated this revision to Diff 509386. Mar 29 2023, 9:51 AM

Rebase. (And I've cleaned up a few of the unrelated changes, some of which have moved to other patches).