This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Perform fixed-width predicate OR reduction on SVE predicate vectors.
Needs Review · Public

Authored by sdesmalen on Feb 9 2022, 8:29 AM.

Details

Summary

By default, fixed-width i1 vectors are promoted. When SVE is available,
some expression trees can instead be rewritten to use <vscale x M x i1>
types, so that all operations are performed on predicate registers and
the unnecessary sign-extends and truncates are avoided. This patch does
so by bubbling the 'sign-extend + extract' operations up the expression
tree to nodes that can be performed on SVE predicate registers.

The example chosen in this patch is to optimise an OR reduction
of an <N x i1> vector, which can be implemented directly with a PTEST
instruction.
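
As a rough illustration only (not code from this patch), the scalar C++ model below contrasts the two ways of computing the reduction described above. It assumes the promoted path widens each i1 lane to i8; the actual promoted type depends on the vector type and target. All names (Mask, orReducePromoted, orReducePredicate, verifyPathsAgree) are hypothetical. The point is that the sign-extend, OR, and truncate sequence yields the same answer as simply asking whether any lane of the predicate is active, which is the question the summary says a PTEST can answer directly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of an <8 x i1> value; N = 8 is an arbitrary choice.
constexpr std::size_t N = 8;
using Mask = std::array<bool, N>;

// Promoted path (assumed i8 lanes): sign-extend every i1 lane,
// OR the widened lanes together, then truncate the result back to i1.
bool orReducePromoted(const Mask &m) {
  std::int8_t acc = 0;
  for (bool lane : m) {
    std::int8_t ext = lane ? -1 : 0;            // sign-extend i1 -> i8
    acc = static_cast<std::int8_t>(acc | ext);  // OR in the wide type
  }
  return (acc & 1) != 0;                        // truncate i8 -> i1
}

// Predicate path: the reduction is just "is any lane active?".
bool orReducePredicate(const Mask &m) {
  for (bool lane : m)
    if (lane)
      return true;
  return false;
}

// Exhaustively check that both paths agree for every 8-lane mask.
void verifyPathsAgree() {
  for (unsigned bits = 0; bits < (1u << N); ++bits) {
    Mask m{};
    for (std::size_t i = 0; i < N; ++i)
      m[i] = ((bits >> i) & 1u) != 0;
    assert(orReducePromoted(m) == orReducePredicate(m));
  }
}
```

The model says nothing about instruction selection; it only shows why the extends and truncates are redundant for this reduction.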

This patch is a rework of D117574.

Event Timeline

sdesmalen created this revision. Feb 9 2022, 8:29 AM
sdesmalen requested review of this revision. Feb 9 2022, 8:29 AM
Herald added a project: Restricted Project. Feb 9 2022, 8:29 AM
Matt added a subscriber: Matt. Feb 9 2022, 8:33 AM
efriedma added inline comments. Feb 9 2022, 1:23 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp, line 14379

This feels weird; we're adding nodes to the worklist without having actually done any transforms. Maybe this should be somewhere else?

llvm/test/CodeGen/AArch64/sve-fixed-length-float-compares.ll, line 372

This is a nice improvement.

llvm/test/CodeGen/AArch64/sve-fixed-length-ptest.ll, line 19

We should probably prefer to do this unpacking in predicate registers. But not necessary for this patch.