Page MenuHomePhabricator

[Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)
ClosedPublic

Authored by lebedev.ri on Jul 2 2019, 11:05 AM.

Details

Summary

This is the backend part of PR42457.
In middle-end, we'd want to prefer the form with two adds - D63992,
but as this diff shows, not every target will prefer that pattern.

Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars,
but only X86 prefer that same pattern for vectors.

Here i'm adding a new TLI hook, always defaulting to the inc-of-add,
but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars.

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.Jul 2 2019, 11:05 AM
efriedma added inline comments.Jul 2 2019, 12:23 PM
test/CodeGen/ARM/inc-of-add.ll
528 ↗(On Diff #207592)

Before we treat vectors differently from scalars, should we check the vector is actually legal?

In this case, we actually save an instruction (Thumb1 doesn't support adc with immediate), but that seems unintended.

lebedev.ri marked an inline comment as done.Jul 2 2019, 1:13 PM
lebedev.ri added inline comments.
test/CodeGen/ARM/inc-of-add.ll
528 ↗(On Diff #207592)

I have tried VT.isScalarInteger() || !isTypeLegal(VT); on ARM and PPC, and the diff says it's worse in total.
Could you be a bit more specific as to what you envision?

efriedma added inline comments.Jul 2 2019, 2:02 PM
test/CodeGen/ARM/inc-of-add.ll
528 ↗(On Diff #207592)

On targets where a vector is going to be scalarized, we should give the same result for the vector type and the scalarized type. For example, we should give <2 x i64> and i64 the same treatment on ARM without NEON.

If that's making code worse for some vector testcases, maybe we aren't making the correct choice for some scalar types. For example, it looks like we should prefer not+sub for i64 on Thumb1.

lebedev.ri marked an inline comment as not done.Jul 2 2019, 2:42 PM
lebedev.ri added inline comments.
test/CodeGen/ARM/inc-of-add.ll
528 ↗(On Diff #207592)

Ok, i'm feeling dumber than usual at the moment.
How should that all look? Could you please be a bit more specific?

bool ARMTargetLowering::preferIncOfAddToSubOfNot(EVT VT,
                                                 CombineLevel Level) const {
  return VT.isScalarInteger() || Level >= AfterLegalizeVectorOps;
}

^ does nothing.

efriedma added inline comments.Jul 2 2019, 2:55 PM
test/CodeGen/ARM/inc-of-add.ll
528 ↗(On Diff #207592)

Maybe something along the lines of:

if (!Subtarget->hasNEON()) {
  if (Thumb1Only())
    return VT.getScalarSizeInBits() <= 32;
  return true;
}
return VT.isScalarInteger();

That might not do the right thing for some edge cases with really weird types, but I think it should be roughly correct.

lebedev.ri added inline comments.Jul 2 2019, 3:11 PM
test/CodeGen/ARM/inc-of-add.ll
528 ↗(On Diff #207592)

I'm sorry, i believe i'm still missing the point here.
In this function, for THUMB6 check lines, *this* is an improvement currently, no? 15 lines vs 14 lines
Do we want to preserve LHS of the diff here?

efriedma added inline comments.Jul 2 2019, 3:49 PM
test/CodeGen/ARM/inc-of-add.ll
528 ↗(On Diff #207592)

There are two separate issues here:

  1. We want to use sub-of-not for i64 on Thumb1 i.e., use the RHS here, and the LHS from scalar_i64.
  2. We want to treat scalar types and vector types the same way on targets without neon (i.e. always return true on non-Thumb1 targets that don't have NEON).

My suggestion addresses both.

lebedev.ri updated this revision to Diff 207653.Jul 2 2019, 4:25 PM
lebedev.ri marked 8 inline comments as done.

Tune ARM hook.

test/CodeGen/ARM/inc-of-add.ll
528 ↗(On Diff #207592)

We want to use sub-of-not for i64 on Thumb1 i.e., use the RHS here, and the LHS from scalar_i64.

Ooh, that clears things up a bit :)
Thank you, hopefully this is better.

efriedma accepted this revision.Jul 2 2019, 4:49 PM

LGTM

This revision is now accepted and ready to land.Jul 2 2019, 4:49 PM

LGTM

Thank you for the review.

PPC hook may also need tuning, but i'm not sure as to the incantation needed.

This revision was automatically updated to reflect the committed changes.

LGTM

Thank you for the review.

PPC hook may also need tuning, but i'm not sure as to the incantation needed.

Nothing in the test changes sticks out to me. Go ahead and we'll need to see if we see any performance regressions.