
[LoopVectorize][AArch64] Use get.active.lane.mask intrinsic when SVE is enabled
Closed · Public

Authored by david-arm on Jan 12 2022, 6:04 AM.

Details

Summary

When SVE is enabled for AArch64 targets, it makes more sense to use the
get.active.lane.mask intrinsic, because SVE has an exact 1-1 mapping
from the intrinsic to the 'whilelo' instruction for legal vector types.
This instruction neatly takes overflow into account as well. The patch
also fixes an issue in VPInstruction::generateInstruction, which assumed
we were only dealing with fixed-width vectors.
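
For readers unfamiliar with the intrinsic, here is a rough scalar model of the mask it describes: lane i is active when base + i is less than the trip count n, with the addition treated as non-wrapping. This is only an illustrative sketch, not code from the patch. The function names, the 32-bit index type, and the explicit `lanes` parameter are assumptions made for the example (with SVE the lane count is only known at run time).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reference model of the mask: lane i is active iff
// (base + i) < n, where the addition is evaluated without wrapping.
std::vector<bool> activeLaneMask(uint32_t base, uint32_t n, unsigned lanes) {
  assert(lanes > 0 && "a vector has at least one lane");
  std::vector<bool> mask(lanes);
  for (unsigned i = 0; i < lanes; ++i) {
    // Widen to 64 bits so that base + i cannot wrap around.
    uint64_t index = static_cast<uint64_t>(base) + i;
    mask[i] = index < n;
  }
  return mask;
}

// For contrast only: the same comparison with a wrapping 32-bit add.
// Near the top of the index range this wrongly re-enables trailing lanes.
std::vector<bool> wrappingLaneMask(uint32_t base, uint32_t n, unsigned lanes) {
  std::vector<bool> mask(lanes);
  for (unsigned i = 0; i < lanes; ++i)
    mask[i] = static_cast<uint32_t>(base + i) < n;
  return mask;
}
```

With base = 0xFFFFFFFE and n = 0xFFFFFFFF, the first function activates only lane 0, while the wrapping variant also activates lanes 2 and 3. That is the kind of overflow case the summary refers to.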

Event Timeline

david-arm created this revision. Jan 12 2022, 6:04 AM
david-arm requested review of this revision. Jan 12 2022, 6:04 AM
Herald added a project: Restricted Project. Jan 12 2022, 6:04 AM
Matt added a subscriber: Matt. Jan 13 2022, 10:49 AM
kmclaughlin accepted this revision. Jan 14 2022, 6:05 AM

This seems like a sensible change to me, LGTM!

This revision is now accepted and ready to land. Jan 14 2022, 6:05 AM
sdesmalen accepted this revision. Jan 14 2022, 8:59 AM
This revision was landed with ongoing or failed builds. Jan 18 2022, 3:59 AM
This revision was automatically updated to reflect the committed changes.

I landed this patch with dependencies open, because we don't need to improve the fixed-width code generation of get.active.lane.mask right now.