This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Allow smaller VMOVL in tail predicated loops
ClosedPublic

Authored by dmgreen on Sep 13 2021, 10:28 AM.

Details

Summary

This allows VMOVL in tail predicated loops so long as the vector size the VMOVL is extending into is less than or equal to the size of the VCTP in the tail predicated loop. These cases represent a sign-extend-inreg (or zero-extend-inreg), which needn't block tail predication, as in https://godbolt.org/z/hdTsEbx8Y.

For this, a vecsize has been added to the TSFlag bits of MVE instructions, which stores the size of the elements that the MVE instruction operates on. In the case of multiple sizes (such as an MVE_VMOVLs8bh, which extends from i8 to i16), the largest size is chosen. The sizes are encoded as 00 = i8, 01 = i16, 10 = i32 and 11 = i64, which often (but not always) comes from the instruction encoding directly. A unit test was added, and although only a subset of the vecsizes are currently used, the rest should be useful for other cases.
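
As a rough illustration of the encoding and the check described above, here is a minimal standalone sketch. It is not the code from the patch: the names `VecSize`, `kVecSizeShift`, `kVecSizeMask`, `withVecSize`, `elementBits` and `vmovlAllowedUnderVCTP` are hypothetical, and the bit position of the field within the flags word is an assumption, since the summary does not state it. Only the 2-bit encoding (00 = i8, 01 = i16, 10 = i32, 11 = i64) and the "less than or equal to the VCTP size" rule are taken from the summary.

```cpp
#include <cstdint>

// 2-bit element-size encoding from the summary: 00 = i8, 01 = i16, 10 = i32, 11 = i64.
enum VecSize : uint64_t { VecI8 = 0, VecI16 = 1, VecI32 = 2, VecI64 = 3 };

// ASSUMPTION: the field's position in the flags word is made up for this sketch.
constexpr unsigned kVecSizeShift = 0;
constexpr uint64_t kVecSizeMask = 0x3;

// Store a vecsize in a (hypothetical) flags word, replacing any previous value.
constexpr uint64_t withVecSize(uint64_t Flags, VecSize VS) {
  return (Flags & ~(kVecSizeMask << kVecSizeShift)) |
         (static_cast<uint64_t>(VS) << kVecSizeShift);
}

// Decode the field back to an element width in bits: 8, 16, 32 or 64.
constexpr unsigned elementBits(uint64_t Flags) {
  return 8u << ((Flags >> kVecSizeShift) & kVecSizeMask);
}

// A VMOVL is compatible with the loop's VCTP when the element size it extends
// into (the larger of its two sizes) does not exceed the VCTP element size.
constexpr bool vmovlAllowedUnderVCTP(uint64_t VmovlFlags, unsigned VctpElementBits) {
  return elementBits(VmovlFlags) <= VctpElementBits;
}
```

Under these assumptions, an instruction like MVE_VMOVLs8bh (i8 to i16) would carry the i16 encoding, so it would pass the check in a loop predicated by a 16-bit or 32-bit VCTP but not in one predicated by an 8-bit VCTP.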

Event Timeline

dmgreen created this revision. Sep 13 2021, 10:28 AM
dmgreen requested review of this revision. Sep 13 2021, 10:28 AM
Herald added a project: Restricted Project. Sep 13 2021, 10:28 AM
samtebbs accepted this revision. Sep 21 2021, 9:15 AM

Nice idea. LGTM

This revision is now accepted and ready to land. Sep 21 2021, 9:15 AM
This revision was landed with ongoing or failed builds. Sep 22 2021, 4:08 AM
This revision was automatically updated to reflect the committed changes.