This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV
ClosedPublic

Authored by samtebbs on Sep 17 2020, 8:19 AM.

Details

Summary

This folds a select_cc and (select(set_cc) of a max or min vector reduction with a scalar value into a VMAXV or VMINV.

Diff Detail

Event Timeline

samtebbs created this revision.Sep 17 2020, 8:19 AM
samtebbs requested review of this revision.Sep 17 2020, 8:19 AM
dmgreen added inline comments.Sep 17 2020, 9:45 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
5265

This could do with a few extra tests to make sure it's a min/max. LHS==TrueVal and RHS==FalseVal for example.

Plus it's worth checking if commuted min/max work too. min(vecreduce.min, x) or min(x, vecreduce.min)

samtebbs updated this revision to Diff 292763.Sep 18 2020, 5:25 AM

Support commuted operands and add tests to cover them.

samtebbs marked an inline comment as done.Sep 18 2020, 5:25 AM
samtebbs added inline comments.
llvm/lib/Target/ARM/ARMISelLowering.cpp
5265

Thanks for pointing that out, done now.

dmgreen added inline comments.Sep 21 2020, 12:56 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
5264

It's might be worth pulling this out into a separate function, similar to isLowerSaturatingConditional or something like PerformSplittingToNarrowingStores where we return the new SDValue.

5296–5298

This could use std::swap?

5307–5309

I think we might need to check these things, otherwise we might be matching things incorrectly.

I believe the exact semantics of a vminv.s8 are that it reads the bottom 8 bits of Rn to do the final scalar min. So a 32bit min could actually produce a different value. We should make sure we are getting this correct too.

samtebbs updated this revision to Diff 293156.Sep 21 2020, 6:35 AM
samtebbs marked 4 inline comments as done.

Fold using the LHS and check for LHS and RHS opcodes explicitly.

llvm/lib/Target/ARM/ARMISelLowering.cpp
5264

That's much cleaner! Done.

5296–5298

Ah I didn't know that existed. Done.

5307–5309

Thanks. I've added explicit checks for the LHS opcode and have made the lowering use the LHS instead of the TrueVal as the scalar so that we know the top 24 bits aren't set.

dmgreen added inline comments.Sep 21 2020, 7:34 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
5168

else return false;
Then it doesn't need the extra indenting.

5186

This looks like it need to be a bit stricter still. The AND needs to go with a umin/umax, and the sign extend with smin/smax (they shouldn't be the other way around). The AND needs a mask of the right size (it'll be 255 for i8 for example, which is the same as a "zero extend inreg"). The sign extend will have a type as the second argument, which should be the same as the vecreduce's scalar type.

Some of this might be difficult to get to come up in practice, but we should make sure it won't be subtly wrong and cause bugs.

samtebbs updated this revision to Diff 293200.Sep 21 2020, 9:35 AM

Be a bit more strict.

samtebbs marked 2 inline comments as done.Sep 21 2020, 9:35 AM
samtebbs added inline comments.
llvm/lib/Target/ARM/ARMISelLowering.cpp
5168

Thumbs up

5186

I thought about those two but hoped that the IR type system would make those situations impossible. Will add these, thanks.

samtebbs updated this revision to Diff 293689.Sep 23 2020, 3:48 AM
samtebbs marked 2 inline comments as done.
samtebbs retitled this revision from [ARM]Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV to [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Allow inverted condition codes

samtebbs updated this revision to Diff 294041.Sep 24 2020, 6:45 AM

Add truncate_sext test

samtebbs updated this revision to Diff 294999.EditedSep 29 2020, 8:16 AM

Add type promotion checks and two new tests

dmgreen added inline comments.Oct 1 2020, 1:24 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
5143

Please clang-format, and possible just make it a static function?

5195

-> FalseVal->getOperand(0).getValueType().getVectorElementType();

5201

Hmm. This is getting complicated.

I would suggest changing this to something more like:

if (Type == i32)
  return Selected == Compared;
else
  // Check the opcodes are and/signexted/assertzext/assertsext.

But I don't think that will work very well. For i8/i16 I think we might actually need to be matching:

    t7: i32 = AssertZext t5, ValueType:ch:i8
      t17: i32 = vecreduce_umin t3
    t25: i32 = and t17, Constant:i32<255>
  t27: i32 = select_cc t25, t7, t17, t7, setult:ch
t21: i32 = and t27, Constant:i32<255>

Including that root "and" (which is the same as the two TrueVal/FalseVal sides of the select_cc being extended). But one loop I tried had no visible way I could see to prove the value needed to be a i16. It might be easier to do this from a combine, not during lowering. So before we have legalized the types.

What do you think about getting i32 working first with this patch, and doing i8/i16 as a followup? That way we can get all the boilerplate out of the way, and it's just this code that produces the ARMISD::VMINVs that we need to work further on.

samtebbs updated this revision to Diff 295792.Oct 2 2020, 4:49 AM
samtebbs edited the summary of this revision. (Show Details)

Perform the folding in a DAG combination rather than lowering.

Yeah, nice. This is a lot easier to follow.

llvm/lib/Target/ARM/ARMISelLowering.cpp
12173

I think it needs to check that operand is a Setcc too, to be sure it's not something strange.

12246

The top bits of this are not read by the instruction. Can we change it to an ANY_EXTEND?

12247

I think the last operand isn't needed here.

16115

Move the hasMVEIntegerOps check into the function? So it can look like the rest.

samtebbs added inline comments.Oct 2 2020, 6:52 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
12173

Good idea.

12246

Sure.

12247

You are indeed correct.

samtebbs updated this revision to Diff 295822.Oct 2 2020, 6:59 AM
samtebbs marked 3 inline comments as done.
samtebbs marked 2 inline comments as done.

One last thing I thought of. Because this is pre-lowering we have to be careful with illegal types.

Can you check the types are the ones we expect them to be (v16i8/v8i16/v4i32). And add some tests for v2i64 (which should be scalarized).

llvm/lib/Target/ARM/ARMISelLowering.h
758

This is no longer used?

samtebbs added a comment.EditedOct 5 2020, 2:53 AM

One last thing I thought of. Because this is pre-lowering we have to be careful with illegal types.

Can you check the types are the ones we expect them to be (v16i8/v8i16/v4i32). And add some tests for v2i64 (which should be scalarized).

Thanks for noticing this. I added tests for v2i64 and it failed an assertion so I have added checks for the valid vector types.

samtebbs updated this revision to Diff 296132.Oct 5 2020, 3:00 AM
samtebbs marked 2 inline comments as done.

Add check for the valid vector types.

dmgreen accepted this revision.Oct 5 2020, 3:22 AM

Thanks. LGTM

llvm/lib/Target/ARM/ARMISelLowering.cpp
12234

-> getOperand(0).getValueType()

This revision is now accepted and ready to land.Oct 5 2020, 3:22 AM

Thanks. LGTM

Thanks!

Hi Amara.

Apologies for this. The fix is just a matter of updating the two failing
tests. What is the process for getting the change back in?

Sam

Hi Amara.

Apologies for this. The fix is just a matter of updating the two failing
tests. What is the process for getting the change back in?

If it's just a trivial update of a test, you can just go ahead and recommit it. If you're in doubt, you can always put it for review again.

Hi Amara.

Apologies for this. The fix is just a matter of updating the two failing
tests. What is the process for getting the change back in?

If it's just a trivial update of a test, you can just go ahead and recommit it. If you're in doubt, you can always put it for review again.

Thanks Sjoerd, I have done so in 68e002e1819f1598fc6815226a353ad2f04cd509

samtebbs closed this revision.Oct 8 2020, 9:24 AM