This is an archive of the discontinued LLVM Phabricator instance.

[LV] Use VScaleForTuning to allow wider epilogue VFs.
ClosedPublic

Authored by sdesmalen on Feb 1 2022, 8:50 AM.

Details

Summary

When the main loop uses e.g. VF=vscale x 1, the epilogue VF cannot be
any smaller scalable VF. In that case the vectorizer should estimate how
many lanes are executed at runtime and allow a suitable fixed-width VF to
be chosen for the epilogue. It can use VScaleForTuning to make that
estimate. For example, when the main loop VF is vscale x 1 and
VScaleForTuning=8, it could still choose an epilogue VF up to VF=4.
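
The arithmetic behind that example can be sketched in a few lines of standalone C++. This is an illustration only, not the code from the patch: the `VF` struct, both function names, and the use of 0 to mean "no tuning value available" are assumptions made for the sketch, and it assumes an epilogue VF must be strictly narrower than the estimated main-loop width.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a vectorization factor (not LLVM's ElementCount).
struct VF {
  unsigned MinLanes; // known minimum number of lanes
  bool Scalable;     // true: the real width is vscale x MinLanes
};

// Estimate how many lanes the main loop executes per iteration.
// VScaleForTuning == 0 is a sentinel (assumption) for "no tuning value".
unsigned estimateRuntimeLanes(VF Main, unsigned VScaleForTuning) {
  if (Main.Scalable && VScaleForTuning != 0)
    return Main.MinLanes * VScaleForTuning;
  return Main.MinLanes;
}

// Pick the widest fixed-width candidate that is strictly narrower than the
// estimated main-loop width. Returns 0 when no candidate qualifies.
unsigned pickFixedEpilogueVF(VF Main, unsigned VScaleForTuning,
                             const std::vector<unsigned> &Candidates) {
  unsigned Limit = estimateRuntimeLanes(Main, VScaleForTuning);
  unsigned Best = 0;
  for (unsigned C : Candidates)
    if (C < Limit && C > Best)
      Best = C;
  return Best;
}
```

With a main loop of vscale x 1 and a tuning value of 8, the estimated width is 8 lanes, so among fixed-width candidates 2, 4, 8 and 16 the sketch selects 4. Without a tuning value the estimate stays at 1 lane and no fixed-width candidate qualifies.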

This was a bit tricky to test, so this patch also introduces a wrapper
function that returns 'VScaleForTuning' while also considering
vscale_range. If the min and max of vscale_range are equal, then that
value is the vscale we compile for. It makes little sense to tune for a
different width if the code will not be portable to other widths.
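
The wrapper's decision rule can be sketched as follows. Again this is a minimal standalone illustration under stated assumptions, not the function added by the patch: `VScaleRange`, its `Known` flag, `vscaleForTuning`, and the `TargetDefault` parameter are hypothetical names introduced here.

```cpp
#include <cassert>

// Hypothetical description of a function's vscale_range information.
struct VScaleRange {
  bool Known;   // false: no vscale_range information is available
  unsigned Min; // lower bound on vscale
  unsigned Max; // upper bound on vscale
};

// Return the vscale value to tune for. If the range pins vscale to a single
// value, that value wins over the target's default tuning value, since the
// code cannot run at any other width anyway.
unsigned vscaleForTuning(VScaleRange Range, unsigned TargetDefault) {
  if (Range.Known && Range.Min == Range.Max)
    return Range.Min;
  return TargetDefault;
}
```

For instance, a range with min and max both equal to 2 yields 2 even if the target's default tuning value is 8, while a range of 1 to 16 leaves the default of 8 in place.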

Event Timeline

sdesmalen created this revision. Feb 1 2022, 8:50 AM
sdesmalen requested review of this revision. Feb 1 2022, 8:50 AM
Herald added a project: Restricted Project. Feb 1 2022, 8:50 AM
david-arm accepted this revision. Feb 2 2022, 6:59 AM

LGTM! Seems like a sensible change.

This revision is now accepted and ready to land. Feb 2 2022, 6:59 AM
This revision was landed with ongoing or failed builds. Feb 3 2022, 7:40 AM
This revision was automatically updated to reflect the committed changes.