This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Inefficient pattern for high-bits checking 2 (PR38708)
ClosedPublic

Authored by lebedev.ri on Sep 12 2018, 1:14 PM.

Details

Summary

It is sometimes important to check that some newly-computed value
is non-negative and only n bits wide (where n is a variable.)
There are many ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

More complicated, canonical pattern:
https://rise4fun.com/Alive/uhA

We do need to have two switch()'es like this,
to not mismatch the swappable predicates.

https://bugs.llvm.org/show_bug.cgi?id=38708

Diff Detail

Repository
rL LLVM

Event Timeline

I was wondering if it would be easier to read if we swapped everything as a first step, so something like this in the existing code:

// We want X to be the icmp's second operand, so swap if not.
Value *Cmp0 = Cmp.getOperand(0), *X = Cmp.getOperand(1);
ICmpInst::Predicate Pred = Cmp.getPredicate(), NewPred;
if (match(X, m_OneUse(m_Shl(m_One(), m_Value())))) {
  std::swap(Cmp0, X);
  Pred = Cmp.getSwappedPredicate();
}
Value *Y;
if (!match(Cmp0, m_OneUse(m_Shl(m_One(), m_Value(Y)))))
  return nullptr;
if (Pred == ICmpInst::ICMP_ULE) NewPred = ICmpInst::ICMP_NE;
else if (Pred == ICmpInst::ICMP_UGT) NewPred = ICmpInst::ICMP_EQ;
else return nullptr;

It's mostly a matter of taste, but there's a subtle logic difference: what if both sides of the icmp match the shift pattern?

define i1 @p0(i8 %shamt0, i8 %shamt1) {
  %t0 = shl i8 1, %shamt0
  %t1 = shl i8 1, %shamt1
  %r = icmp ugt i8 %t0, %t1
  ret i1 %r
}

With the current code, it gets transformed to:

define i1 @p0(i8 %shamt0, i8 %shamt1) {
  %t1 = shl i8 1, %shamt1
  %t1.highbits = lshr i8 %t1, %shamt0
  %r = icmp eq i8 %t1.highbits, 0
  ret i1 %r
}

It should be reduced:

%t0 = shl i8 1, %shamt0
%t1 = shl i8 1, %shamt1
%r = icmp ugt i8 %t0, %t1
=>
%r = icmp ugt i8 %shamt0, %shamt1

I don't think we need to hold up this patch for that, but maybe it changes the way we want to implement it?

I was wondering if it would be easier to read if we swapped everything as a first step, so something like this in the existing code:

// We want X to be the icmp's second operand, so swap if not.
Value *Cmp0 = Cmp.getOperand(0), *X = Cmp.getOperand(1);
ICmpInst::Predicate Pred = Cmp.getPredicate(), NewPred;
if (match(X, m_OneUse(m_Shl(m_One(), m_Value())))) {
  std::swap(Cmp0, X);
  Pred = Cmp.getSwappedPredicate();
}
Value *Y;
if (!match(Cmp0, m_OneUse(m_Shl(m_One(), m_Value(Y)))))
  return nullptr;
if (Pred == ICmpInst::ICMP_ULE) NewPred = ICmpInst::ICMP_NE;
else if (Pred == ICmpInst::ICMP_UGT) NewPred = ICmpInst::ICMP_EQ;
else return nullptr;

I'm not sure.
I think we really should have two separate matches, and two switch()es.
Else i think we may use the wrong predicate..
Also, i *think* i will add one/two patterns to this new matcher (less canonical variants with extra uses),
so specifying the pattern twice seems sub-par.

It's mostly a matter of taste, but there's a subtle logic difference: what if both sides of the icmp match the shift pattern?

define i1 @p0(i8 %shamt0, i8 %shamt1) {
  %t0 = shl i8 1, %shamt0
  %t1 = shl i8 1, %shamt1
  %r = icmp ugt i8 %t0, %t1
  ret i1 %r
}

With the current code, it gets transformed to:

define i1 @p0(i8 %shamt0, i8 %shamt1) {
  %t1 = shl i8 1, %shamt1
  %t1.highbits = lshr i8 %t1, %shamt0
  %r = icmp eq i8 %t1.highbits, 0
  ret i1 %r
}

It should be reduced:

%t0 = shl i8 1, %shamt0
%t1 = shl i8 1, %shamt1
%r = icmp ugt i8 %t0, %t1
=>
%r = icmp ugt i8 %shamt0, %shamt1

I don't think we need to hold up this patch for that, but maybe it changes the way we want to implement it?

I acknowledge that there is some problem when we have the same/similar pattern on the both sides,
i have thought about it a bit (rL342076), but i don't have anything concrete on that.

spatel accepted this revision.Sep 13 2018, 12:22 PM

I was wondering if it would be easier to read if we swapped everything as a first step, so something like this in the existing code:

// We want X to be the icmp's second operand, so swap if not.
Value *Cmp0 = Cmp.getOperand(0), *X = Cmp.getOperand(1);
ICmpInst::Predicate Pred = Cmp.getPredicate(), NewPred;
if (match(X, m_OneUse(m_Shl(m_One(), m_Value())))) {
  std::swap(Cmp0, X);
  Pred = Cmp.getSwappedPredicate();
}
Value *Y;
if (!match(Cmp0, m_OneUse(m_Shl(m_One(), m_Value(Y)))))
  return nullptr;
if (Pred == ICmpInst::ICMP_ULE) NewPred = ICmpInst::ICMP_NE;
else if (Pred == ICmpInst::ICMP_UGT) NewPred = ICmpInst::ICMP_EQ;
else return nullptr;

I'm not sure.
I think we really should have two separate matches, and two switch()es.
Else i think we may use the wrong predicate..
Also, i *think* i will add one/two patterns to this new matcher (less canonical variants with extra uses),
so specifying the pattern twice seems sub-par.

Ok - just wanted to throw it out as a possibility. I agree that the switch version is less likely to go buggy.

define i1 @p0(i8 %shamt0, i8 %shamt1) {
  %t0 = shl i8 1, %shamt0
  %t1 = shl i8 1, %shamt1
  %r = icmp ugt i8 %t0, %t1
  ret i1 %r
}

With the current code, it gets transformed to:

define i1 @p0(i8 %shamt0, i8 %shamt1) {
  %t1 = shl i8 1, %shamt1
  %t1.highbits = lshr i8 %t1, %shamt0
  %r = icmp eq i8 %t1.highbits, 0
  ret i1 %r
}

It should be reduced:

%t0 = shl i8 1, %shamt0
%t1 = shl i8 1, %shamt1
%r = icmp ugt i8 %t0, %t1
=>
%r = icmp ugt i8 %shamt0, %shamt1

I don't think we need to hold up this patch for that, but maybe it changes the way we want to implement it?

I acknowledge that there is some problem when we have the same/similar pattern on the both sides,
i have thought about it a bit (rL342076), but i don't have anything concrete on that.

Sounds good. Please add tests where both sides match, so we have some evidence of the missing folds. Mostly, I'm paranoid that we'll open up some infinite looping scenario if we don't have tests for those unexpected patterns. As we're finding with min/max, it's hard to see those problems in advance.

LGTM

This revision is now accepted and ready to land.Sep 13 2018, 12:22 PM

I was wondering if it would be easier to read if we swapped everything as a first step, so something like this in the existing code:

// We want X to be the icmp's second operand, so swap if not.
Value *Cmp0 = Cmp.getOperand(0), *X = Cmp.getOperand(1);
ICmpInst::Predicate Pred = Cmp.getPredicate(), NewPred;
if (match(X, m_OneUse(m_Shl(m_One(), m_Value())))) {
  std::swap(Cmp0, X);
  Pred = Cmp.getSwappedPredicate();
}
Value *Y;
if (!match(Cmp0, m_OneUse(m_Shl(m_One(), m_Value(Y)))))
  return nullptr;
if (Pred == ICmpInst::ICMP_ULE) NewPred = ICmpInst::ICMP_NE;
else if (Pred == ICmpInst::ICMP_UGT) NewPred = ICmpInst::ICMP_EQ;
else return nullptr;

I'm not sure.
I think we really should have two separate matches, and two switch()es.
Else i think we may use the wrong predicate..
Also, i *think* i will add one/two patterns to this new matcher (less canonical variants with extra uses),
so specifying the pattern twice seems sub-par.

Ok - just wanted to throw it out as a possibility. I agree that the switch version is less likely to go buggy.

define i1 @p0(i8 %shamt0, i8 %shamt1) {
  %t0 = shl i8 1, %shamt0
  %t1 = shl i8 1, %shamt1
  %r = icmp ugt i8 %t0, %t1
  ret i1 %r
}

With the current code, it gets transformed to:

define i1 @p0(i8 %shamt0, i8 %shamt1) {
  %t1 = shl i8 1, %shamt1
  %t1.highbits = lshr i8 %t1, %shamt0
  %r = icmp eq i8 %t1.highbits, 0
  ret i1 %r
}

It should be reduced:

%t0 = shl i8 1, %shamt0
%t1 = shl i8 1, %shamt1
%r = icmp ugt i8 %t0, %t1
=>
%r = icmp ugt i8 %shamt0, %shamt1

I don't think we need to hold up this patch for that, but maybe it changes the way we want to implement it?

I acknowledge that there is some problem when we have the same/similar pattern on the both sides,
i have thought about it a bit (rL342076), but i don't have anything concrete on that.

Sounds good.

Please add tests where both sides match, so we have some evidence of the missing folds.

Will do.

Mostly, I'm paranoid that we'll open up some infinite looping scenario if we don't have tests for those unexpected patterns. As we're finding with min/max, it's hard to see those problems in advance.

LGTM

Thank you for the review!

This revision was automatically updated to reflect the committed changes.