This is an archive of the discontinued LLVM Phabricator instance.

Teach the AArch64 backend that vector reduction NEON instructions implicitly zero the high lanes of the result, meaning that we can eliminate explicit zeroing.
Abandoned · Public

Authored by resistor on Mar 24 2022, 2:42 PM.

Details

Reviewers
dmgreen
efriedma

Event Timeline

resistor created this revision. Mar 24 2022, 2:42 PM
Herald added a project: Restricted Project. Mar 24 2022, 2:42 PM
resistor requested review of this revision. Mar 24 2022, 2:42 PM
dmgreen edited reviewers, added: dmgreen, efriedma; removed: greened. Apr 4 2022, 5:03 AM
dmgreen added a subscriber: dmgreen.

My worry with this is that the top lanes are not always defined to be zero by the DAG nodes. There is a comment in the header that says:

// Vector across-lanes addition
// Only the lower result lane is defined.

And they can be selected in a number of ways: things like ADDPv2i64p are defined to produce a scalar result, which is inserted into an undef vector.

Maybe that's OK, but we are relying on shaky semantics. Whilst it is true that the ADDV/ADDP instructions clear the top bits (as do many other instructions that set s/d regs), it's not clear to me where that is ensured through the pipeline.
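
To make the concern concrete, here is a small Python model of the two views of the same reduction. It is only a sketch, not LLVM code: the function names are hypothetical, a 128-bit register is modelled as four 32-bit lanes, and `None` stands in for an undefined lane.

```python
from typing import List, Optional

LANES = 4          # assumption: model a 128-bit register as four 32-bit lanes
MASK = 0xFFFFFFFF  # keep lane values within 32 bits


def addv_hardware(v: List[int]) -> List[int]:
    """ADDV Sd, Vn.4S as executed: the sum lands in lane 0 and the
    write to the scalar register zeroes the upper lanes."""
    return [sum(v) & MASK] + [0] * (LANES - 1)


def addv_dag(v: List[int]) -> List[Optional[int]]:
    """The same reduction as the DAG node documents it: only the lower
    result lane is defined (None marks an undefined lane)."""
    return [sum(v) & MASK] + [None] * (LANES - 1)


def zero_high_lanes(v: List[Optional[int]]) -> List[Optional[int]]:
    """Explicit zeroing: keep lane 0 and force every other lane to zero."""
    return [v[0]] + [0] * (LANES - 1)


vec = [1, 2, 3, 4]
hw = addv_hardware(vec)
dag = addv_dag(vec)

# On the instruction itself the explicit zeroing is a no-op.
assert zero_high_lanes(hw) == hw
# Under the documented node semantics it is not, so dropping it needs
# some guarantee that the node is always selected to such an instruction.
assert zero_high_lanes(dag) != dag
```

In this model the fold is safe only if every selection of the node ends up as an instruction behaving like `addv_hardware`, which is the guarantee in question.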

llvm/test/CodeGen/AArch64/vecreduce-zeroing.ll
Line 6

We can usually remove dso_local and local_unnamed_addr #0 to clean up the tests a bit.
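
As an illustration of that cleanup, the sketch below strips those tokens from a `define` line. The helper `clean_define` and the function signature in the example are hypothetical and not taken from the test file; it assumes the attribute group reference sits directly before the opening brace.

```python
import re


def clean_define(line: str) -> str:
    """Strip clang-emitted noise from an LLVM IR `define` line.
    Lines that are not function definitions are returned unchanged."""
    if not line.lstrip().startswith("define"):
        return line
    line = re.sub(r"\bdso_local\s+", "", line)           # drop dso_local
    line = re.sub(r"\s+local_unnamed_addr\b", "", line)  # drop local_unnamed_addr
    line = re.sub(r"\s+#\d+(?=\s*\{)", "", line)         # drop attribute group ref, e.g. #0
    return line


# Hypothetical function signature, for illustration only.
before = "define dso_local i32 @reduce(<4 x i32> %v) local_unnamed_addr #0 {"
after = clean_define(before)
print(after)
```

Dropping the `#0` reference is only appropriate when the test does not depend on the attributes in that group; the matching `attributes #0 = { ... }` line then becomes unused and can be deleted as well.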

Line 62

Can these be removed?

resistor abandoned this revision. Jan 9 2023, 7:26 PM