This is an archive of the discontinued LLVM Phabricator instance.

[IAI,LV] Avoid creating a scalar epilogue due to gaps in interleave-groups when optimizing for size
ClosedPublic

Authored by dorit on Oct 19 2018, 1:17 AM.

Details

Summary

LV is careful to respect -Os and not to create a scalar epilog in all cases (runtime tests, trip-counts that require a remainder loop) except for peeling due to gaps in interleave-groups. This patch fixes that; -Os will now have us invalidate such interleave-groups and vectorize without an epilog.

(Heads up: an upcoming patch will allow us to vectorize such interleave-group-with-gaps using masking instead of peeling, so we won't have to invalidate such groups for targets that support masked interleaving).

The patch also removes a related FIXME comment that is now obsolete, and was also inaccurate :
"FIXME: return None if loop requiresScalarEpilog(<MaxVF>), or look for a smaller MaxVF that does not require a scalar epilog."
(requiresScalarEpilog() has nothing to do with VF).

Diff Detail

Event Timeline

dorit created this revision.Oct 19 2018, 1:17 AM
dorit updated this revision to Diff 170313.Oct 20 2018, 11:34 AM

updated to top of trunk.

Ayal accepted this revision.Oct 21 2018, 8:18 AM

LGTM, only minor comments and optional suggestions.

include/llvm/Analysis/VectorUtils.h
312

Fix comment - this Group may require that at-least a single scalar iteration will run after the vector loop, which potentially requires VF(*UF) such scalar iterations when the trip count is a multiple of VF(*UF).

317

clang-format?

lib/Analysis/VectorUtils.cpp
938

May also indicate which group is being invalidated, e.g., dump along its insert pos insn.

lib/Transforms/Vectorize/LoopVectorize.cpp
1140

Is this change really needed? If a scalar epilogue is not allowed, all interleave groups that require it are invalidated, and InterleaveInfo.requiresScalarEpilogue() should return false. Right?

If needed, slight preference to place the method call second.

1146

Document this new field.

Technically speaking, the scalar epilogue might need to be destroyed rather than being "created", as it's based on the original *existing* version . Perhaps a more accurate name is IsScalarEpilogueAllowed, or CanEnsureScalarEpilogue.

4577

Suffice to check this only once, preferably in callee only, which also checks the EpilogCreationAllowed flag.

test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
46

Perhaps it's better to positively CHECK-NEXT instead of these multiple CHECK-NOT's.

This revision is now accepted and ready to land.Oct 21 2018, 8:18 AM
dorit marked 3 inline comments as done.Oct 21 2018, 8:23 PM

Addressed comments. See couple responses below. Thanks!

lib/Analysis/VectorUtils.cpp
938

The current printout is in the same format of the other "LV: Invalidate candidate interleaved group" printouts elsewhere, and it's nice/clear to keep it that way. So if we want to add the insert-pos insn in the printout (which is not a bad idea :-)) we should probably add it to all these occurrences, which I think is more appropriate for a separate patch.

lib/Transforms/Vectorize/LoopVectorize.cpp
1140

yes, I think you're right. This is a leftover from a previous version that included other changes (coming up next in a separate patch) before I noticed this bug in LV which requires invalidating groups.

1146

Field removed for now due to previous comment (will reappear in followup patch that adds masking support instead of epilog, with the proposed new name :-)).

test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
46

I like that it's clear and concise exactly what this test want to check...

Ayal added inline comments.Oct 21 2018, 10:22 PM
include/llvm/Analysis/VectorUtils.h
317

Suffice to check only getMember(getFactor() - 1).

Could have simply returned it, but for the asserts.

This revision was automatically updated to reflect the committed changes.
dorit added inline comments.Oct 21 2018, 11:25 PM
include/llvm/Analysis/VectorUtils.h
317

right! missed your response before the commit unfortunately. will commit separately