This is an archive of the discontinued LLVM Phabricator instance.

[ARM][MVE] Tail predicate VMAXV(unsigned) and VMAXAV
AbandonedPublic

Authored by samparker on Mar 24 2020, 10:27 AM.

Details

Summary

With predicated false lanes now tracked to guarantee zeros in the false bytes, we can allow vmaxv.u* and vmaxav instructions to be tail predicated if the argument requirements are met.

Event Timeline

samparker created this revision. Mar 24 2020, 10:27 AM

The unit test failure looks genuine, does that need fixing?

A minor question inlined.

llvm/lib/Target/ARM/ARMInstrMVE.td
850

same for these 3 (why not suitable for TP?)

897

I am being lazy here (haven't checked the ARMARM), but why is this vmin not suitable for TP?

The unit test failure looks genuine, does that need fixing?

Ah, I haven't updated the unit test in D76708.

For min/max, we can't support an implicit vmin because the results may not be the same after the conversion. So, say we only have three 32-bit elements left to process (and the fourth element is the 0x00 on the left-hand side):

| opcode | input | original result | tail predicated result |
| --- | --- | --- | --- |
| VMAXV.u32 | 0x00010203 | 0x03 | 0x03 |
| VMINV.u32 | 0x00010203 | 0x00 | 0x01 |

The tail predicated instruction will ignore the predicated lanes/bytes, whereas the original doesn't.
We're also only supporting unsigned values because we know that the 'FalseLaneZeros' can't interfere with the result: the zero will only be the answer if the rest of the elements are also zero. This is not true for signed values though, where the false zero may be the largest value.
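
To make the example above concrete, here is a small Python model of the reductions. It is only a sketch: `horizontal_reduce`, `ALL_LANES` and `TAIL` are hypothetical names, each lane is modelled as one small integer, and the scalar accumulator operand of the real instructions is ignored.

```python
# Simplified model of a horizontal min/max reduction over vector lanes.
# Assumption: the scalar accumulator of the real instructions is left out.

def horizontal_reduce(op, lanes, active):
    """Reduce with `op` (min or max) over the lanes whose predicate is set."""
    return op(value for value, on in zip(lanes, active) if on)

ALL_LANES = [True, True, True, True]   # original, unpredicated instruction
TAIL = [False, True, True, True]       # leftmost lane is predicated off

# Unsigned lanes; the false (leftmost) lane already holds zero.
unsigned = [0x00, 0x01, 0x02, 0x03]
vmaxv_original = horizontal_reduce(max, unsigned, ALL_LANES)
vmaxv_tail = horizontal_reduce(max, unsigned, TAIL)
vminv_original = horizontal_reduce(min, unsigned, ALL_LANES)
vminv_tail = horizontal_reduce(min, unsigned, TAIL)

# Signed lanes; the zero in the false lane is now the largest value.
signed = [0, -1, -2, -3]
smax_original = horizontal_reduce(max, signed, ALL_LANES)
smax_tail = horizontal_reduce(max, signed, TAIL)

print("unsigned max:", vmaxv_original, vmaxv_tail)
print("unsigned min:", vminv_original, vminv_tail)
print("signed max:  ", smax_original, smax_tail)
```

In this model the unsigned max agrees in both forms, while the unsigned min and the signed max do not, which matches the reasoning given for restricting the change to unsigned max.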

Thanks for that example! I asked this question because I expected the vmax and vmin to behave roughly the same. In your example, if you change the input and example from 0x00010203 to 0x04010203, then the VMAX will also give a different result after tail-predication, or am I still missing something?

if you change the input and example from 0x00010203 to 0x04010203, then the VMAX will also give a different result after tail-predication

Indeed! Which is why we track our zeroed false lanes. With vmax being a horizontal operation, we check that it operates on registers that we know have zeroed false lanes.
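
The role of the zeroed false lanes can be checked with a short Python sketch. The helper names (`vmaxv_u`, `same_after_tail_predication`) are hypothetical, lanes are modelled as small non-negative integers, the accumulator operand is ignored, and at least one lane is assumed to be active.

```python
from itertools import product

def vmaxv_u(lanes, active=None):
    """Unsigned horizontal max; `active` masks lanes when tail predicated."""
    if active is None:
        active = [True] * len(lanes)
    return max(value for value, on in zip(lanes, active) if on)

def same_after_tail_predication(lanes, active):
    """True if the unpredicated and tail predicated results agree."""
    return vmaxv_u(lanes) == vmaxv_u(lanes, active)

active = [False, True, True, True]  # leftmost lane is predicated off

# A false lane holding stale data (0x04) changes the unpredicated result.
stale_lane_ok = same_after_tail_predication([0x04, 0x01, 0x02, 0x03], active)

# With the false lane known to be zero, the results agree for every
# combination of small unsigned values in the active lanes.
zeroed_lane_ok = all(
    same_after_tail_predication([0, a, b, c], active)
    for a, b, c in product(range(8), repeat=3)
)

print("stale false lane safe: ", stale_lane_ok)
print("zeroed false lane safe:", zeroed_lane_ok)
```

The exhaustive loop only covers a small value range, so it illustrates the argument rather than proving it.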

Ah yes, this was before my first coffee...!

samparker updated this revision to Diff 255673. Apr 7 2020, 7:27 AM

Using predicate logic to determine validity.

I was going through my Phabricator list... is this still relevant?

samparker abandoned this revision. Feb 16 2021, 9:13 AM

Maybe? Not something that I'll be continuing with though.

@dmgreen, is this interesting for you?