This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X)
ClosedPublic

Authored by vdsered on Jun 13 2021, 6:39 AM.

Details

Summary

InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case.

Alive says that it is a correct transformation:
https://alive2.llvm.org/ce/z/27ryac
https://alive2.llvm.org/ce/z/k4l6Hj

Diff Detail

Event Timeline

vdsered created this revision.Jun 13 2021, 6:39 AM
vdsered requested review of this revision.Jun 13 2021, 6:39 AM

There's two cases here

  1. X == Y, where we should produce zext regardless of use count
  2. and this. but there is a number of tests missing - can this not happen when the operands of mul are different instructions?

Correction: what i meant to say is, there are two cases:

  1. X == Y, in which case we do not care about use count. this implicitly handles the case where both operands are the same instruction
  2. either operand is one-use

Correction: what i meant to say is, there are two cases:

  1. X == Y, in which case we do not care about use count. this implicitly handles the case where both operands are the same instruction
  2. either operand is one-use

I'd add one more case where one operand has one use and an other more than one. We check this for zext in mul_bools_use2 and for sext in mul_bools_sext_use1 and mul_bools_sext_use2 (they seem to differ only in operand's order, should it be the same for zext?).

Regarding your cases...

I can be wrong, but we do care about use count when X == Y because the way how we optimize depends on it. Look at the example below

declare void @use(i32)
define i32 @src(i1 %x, i1 %y) {
  %zx = zext i1 %x to i32
  %r = mul i32 %zx, %zx
  call void @use(i32 %zx)
  ret i32 %r
}

If there is more than one user (or 2 uses in other words), we cannot replace

%zx = zext i1 %x to i32
%r = mul i32 %zx, %zx

by

%r = and i1 %x, %y
%zy = zext i1 %r to i32

because zx has another user (call of @use(i32)). Otherwise, we need to leave zx as-is. I'm not sure if we should do this (would it be beneficial?). I didn't intend to handle this particular case as a part of that patch.

The second case where either of ops has one use. Yes, it must be legal to do such a transformation when both ops has one use and they are different instruction. As an addition alive says it is correct. Previously, it was Op0->hasOneUse() || Op1->hasOneUse() and when there is one use, then there is one user so it must be equivalent to I.isOnlyUserOfAnyOperand() in that case.

I guess, tests like mul_bools_sext_use2 must partially check the second case, but yes, there are no tests like.

define i32 @func(i1 %x, i1 %y) {
  %sx = sext i1 %x to i32
  %sy = sext i1 %y to i32
  %r = mul i32 %sy, %sx
  ret i32 %r

I'll update the diff

I think I should update the patch because LLVM already handles zext

%sx = zext i1 %x to i32
%r = mul i32 %sx, %sx
ret i32 %r

Correction: what i meant to say is, there are two cases:

  1. X == Y, in which case we do not care about use count. this implicitly handles the case where both operands are the same instruction
  2. either operand is one-use

I'd add one more case where one operand has one use and an other more than one. We check this for zext in mul_bools_use2 and for sext in mul_bools_sext_use1 and mul_bools_sext_use2 (they seem to differ only in operand's order, should it be the same for zext?).

Regarding your cases...

I can be wrong, but we do care about use count when X == Y because the way how we optimize depends on it. Look at the example below

declare void @use(i32)
define i32 @src(i1 %x, i1 %y) {
  %zx = zext i1 %x to i32
  %r = mul i32 %zx, %zx
  call void @use(i32 %zx)
  ret i32 %r
}

If there is more than one user (or 2 uses in other words), we cannot replace

%zx = zext i1 %x to i32
%r = mul i32 %zx, %zx

by

%r = and i1 %x, %y
%zy = zext i1 %r to i32

because zx has another user (call of @use(i32)). Otherwise, we need to leave zx as-is. I'm not sure if we should do this (would it be beneficial?). I didn't intend to handle this particular case as a part of that patch.

I'm not sure i follow.
%r = and i1 %x, %x is just %x, is it not?
So you end up with a single zext.

The second case where either of ops has one use. Yes, it must be legal to do such a transformation when both ops has one use and they are different instruction. As an addition alive says it is correct. Previously, it was Op0->hasOneUse() || Op1->hasOneUse() and when there is one use, then there is one user so it must be equivalent to I.isOnlyUserOfAnyOperand() in that case.

I guess, tests like mul_bools_sext_use2 must partially check the second case, but yes, there are no tests like.

define i32 @func(i1 %x, i1 %y) {
  %sx = sext i1 %x to i32
  %sy = sext i1 %y to i32
  %r = mul i32 %sy, %sx
  ret i32 %r

I'll update the diff

vdsered updated this revision to Diff 351777.EditedJun 13 2021, 11:10 PM
vdsered retitled this revision from [InstCombine] Require one user (not one use) for operands when optimize both (sext bool X) * (sext bool Y) and (zext bool X) * (zext bool Y) to [InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X).
vdsered edited the summary of this revision. (Show Details)
  1. Added an equality check of operands (previously checked if they have exactly one user)
  2. Added more tests
vdsered added a comment.EditedJun 13 2021, 11:15 PM

We're going to see difference only in tests @mul_bool_sext_one_extra_user and mul_bool_sext_one_user as other transformations are performed even without the change in the patch. Can someone push baseline tests to show that in the diff?

lebedev.ri accepted this revision.Jun 15 2021, 2:17 AM

Seems fine to me. Please precommit the tests before committing this.
Thanks.

llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
331

Equality check is faster, it should go first.

This revision is now accepted and ready to land.Jun 15 2021, 2:17 AM

@lebedev.ri I don't have commit access

Time to ask for one then?