This is an archive of the discontinued LLVM Phabricator instance.

Added InstCombine transform for pattern " ((X % Z) + (Y % Z)) % Z -> (X + Y) % Z ".
Needs Review | Public

Authored by ankur29.garg on Sep 15 2014, 3:38 AM.

Details

Reviewers

majnemer
suyog
dexonsmith

Summary

Hi Everyone,

This patch implements the following transformation

" ((X % Z) + (Y % Z)) % Z -> (X + Y) % Z "

Please help in reviewing it.

Regards,
Ankur.

Diff Detail

Event Timeline

ankur29.garg updated this revision to Diff 13700. Sep 15 2014, 3:38 AM

ankur29.garg retitled this revision to Added InstCombine transform for pattern " ((X % Z) + (Y % Z)) % Z -> (X + Y) % Z ".

ankur29.garg updated this object.

ankur29.garg edited the test plan for this revision.

ankur29.garg added reviewers: majnemer, suyog, dexonsmith.

ankur29.garg set the repository for this revision to rL LLVM.

ankur29.garg added a subscriber: Unknown Object (MLST).

This patch is incorrect:

$ ./alive.py < a.opt

Optimization: 1
Precondition: true
%1 = srem %x, %z
%2 = srem %y, %z
%3 = add %1, %2
%r = srem %3, %z

=>

%6 = add %x, %y
%r = srem %6, %z

Done: 1
ERROR: Domain of definedness of Target is smaller than Source's for i2 %r

Example:
%x i2 = 1 (0x1)
%z i2 = 3 (0x3)
%1 i2 = 0 (0x0)
%y i2 = 1 (0x1)
%2 i2 = 0 (0x0)
%3 i2 = 0 (0x0)
%6 i2 = 2 (0x2)
Source value: 0 (0x0)
Target value: undef

This means that the optimized code introduces undefined behaviour
that the original code did not have.

Nuno
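
The counterexample above can be replayed by hand. What follows is a minimal C++17 sketch, not part of the patch or of Alive, that models srem on small signed integers and evaluates both sides of the rewrite. The helper names wrap, srem, source, and target are invented for this illustration, and std::nullopt is used as a stand-in for undefined behavior.

```cpp
#include <cassert>
#include <optional>

// Wrap v into the signed range of a two's-complement integer with `bits` bits.
int wrap(int v, int bits) {
  assert(bits >= 2 && bits <= 16);
  int mod = 1 << bits;
  int half = 1 << (bits - 1);
  int r = ((v % mod) + mod) % mod;  // now in 0 .. mod-1
  return r >= half ? r - mod : r;
}

// Assumed model of srem on `bits`-bit values; nullopt means undefined behavior.
std::optional<int> srem(int a, int b, int bits) {
  int intMin = -(1 << (bits - 1));
  if (b == 0) return std::nullopt;                  // remainder by zero
  if (a == intMin && b == -1) return std::nullopt;  // INT_MIN srem -1
  return a % b;  // C++ % truncates toward zero, as srem does
}

// Source pattern: ((x srem z) + (y srem z)) srem z, with a wrapping add.
std::optional<int> source(int x, int y, int z, int bits) {
  std::optional<int> a = srem(x, z, bits);
  std::optional<int> b = srem(y, z, bits);
  if (!a || !b) return std::nullopt;
  return srem(wrap(*a + *b, bits), z, bits);
}

// Target pattern: (x + y) srem z, with a wrapping add.
std::optional<int> target(int x, int y, int z, int bits) {
  return srem(wrap(x + y, bits), z, bits);
}
```

For the reported i2 inputs (x = 1, y = 1, z = 0x3, i.e. -1 as a signed value), source(1, 1, -1, 2) yields 0 while target(1, 1, -1, 2) yields no value, which matches "Source value: 0" and "Target value: undef".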

Quoting Ankur Garg <ankur29.garg@samsung.com>:

Hi majnemer, suyog, dexonsmith,

Hi Everyone,

This patch implements the following transformation

" ((X % Z) + (Y % Z)) % Z -> (X + Y) % Z "

Please help in reviewing it.

Regards,
Ankur.

http://reviews.llvm.org/D5351

Files:
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
test/Transforms/InstCombine/rem.ll

Thanks nlopes for the comments.
I think the transformation only works in cases where the addition doesn't overflow. I will add that check and resubmit the patch.

Thanks.

Hi nlopes,
I have modified the patch to check for cases in which the addition overflows. The check is only valid when the operands, or some of their bits, are known before the calculation, though. The transformation won't work when the operands are completely unknown. I have included test cases for both situations: one in which the addition overflows and one in which it doesn't.
I don't think it can be extended to cases where the operands are arbitrary variables, because it is not possible to determine whether their addition will overflow.
Please suggest any other way to handle variable operands.

Thanks.
Regards,
Ankur Garg

majnemer added inline comments. Sep 17 2014, 11:47 AM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1338–1340	Why are you mutating the LHS of I? This is unsafe if it has any other users.
test/Transforms/InstCombine/rem.ll
229–243	You don't have any CHECK statements here.

Hi David,
I have made the changes as per your comments. I have made sure that I isn't mutated and have added CHECK statements to the test cases.
Please review the updated revision.

Thanks.

Gentle ping!

Sorry, still not correct:

Pre: WillNotOverflowSignedAdd(%x, %y)
%1 = srem %x, %z
%2 = srem %y, %z
%3 = add %1, %2
%r = srem %3, %z
  =>
%6 = add %x, %y
%r = srem %6, %z

Done: 1
ERROR: Domain of definedness of Target is smaller than Source's for i2 %r

Example:
%x i2 = 3 (0x3)
%z i2 = 3 (0x3)
%y i2 = 3 (0x3)
%1 i2 = 0 (0x0)
%2 i2 = 0 (0x0)
%3 i2 = 0 (0x0)
%6 i2 = 2 (0x2)
Source value: 0 (0x0)
Target value: undef
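
To see why the added precondition does not help, here is a rough C++17 sketch (not taken from the patch) that searches small bit widths for inputs where the add of x and y does not overflow, the source expression is defined, and the target is not. The function names are hypothetical, and addDoesNotOverflow is only an assumed model of what WillNotOverflowSignedAdd is meant to guarantee.

```cpp
#include <cassert>
#include <optional>
#include <tuple>

int minVal(int bits) { assert(bits >= 2 && bits <= 8); return -(1 << (bits - 1)); }
int maxVal(int bits) { return (1 << (bits - 1)) - 1; }

// Wrap v into the signed `bits`-bit range (two's complement).
int wrap(int v, int bits) {
  int mod = 1 << bits;
  int r = ((v % mod) + mod) % mod;
  return r > maxVal(bits) ? r - mod : r;
}

// Assumed model of the precondition: the exact sum x + y is representable.
bool addDoesNotOverflow(int x, int y, int bits) {
  int s = x + y;
  return s >= minVal(bits) && s <= maxVal(bits);
}

// srem a, b is defined unless b is 0 or (a, b) is (INT_MIN, -1).
bool sremDefined(int a, int b, int bits) {
  return b != 0 && !(a == minVal(bits) && b == -1);
}

// True when the precondition holds and the source is defined but the target is not.
bool isCounterexample(int x, int y, int z, int bits) {
  if (!addDoesNotOverflow(x, y, bits)) return false;
  if (!sremDefined(x, z, bits) || !sremDefined(y, z, bits)) return false;
  int sum = wrap(x % z + y % z, bits);  // the source's add wraps
  if (!sremDefined(sum, z, bits)) return false;
  return !sremDefined(wrap(x + y, bits), z, bits);
}

// Exhaustively search all (x, y, z) of the given width, in increasing order.
std::optional<std::tuple<int, int, int>> findCounterexample(int bits) {
  for (int x = minVal(bits); x <= maxVal(bits); ++x)
    for (int y = minVal(bits); y <= maxVal(bits); ++y)
      for (int z = minVal(bits); z <= maxVal(bits); ++z)
        if (isCounterexample(x, y, z, bits))
          return std::make_tuple(x, y, z);
  return std::nullopt;
}
```

With bits = 2, the search returns (-1, -1, -1), which is the same assignment as %x = %y = %z = 0x3 in the report: the sum -2 is representable, so the precondition holds, yet the target still evaluates srem of -2 by -1.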

Hi nlopes,
Thanks for reviewing the patch.

In the example you have stated,

%x i2 = 3

since we are talking about signed integers, that is equivalent to %x i2 = -1

Similarly, %y i2 = -1 and %z i2 = -1.

I am not sure how [%6 = add %x, %y] calculates to [%6 i2 = 2]. Shouldn't it be [%6 i2 = -2] ?

Please clarify.
Thanks.

In D5351#15, @ankur29.garg wrote:

Hi nlopes,
Thanks for reviewing the patch.

In the example you have stated,

%x i2 = 3

since we are talking about signed integers, that is equivalent to %x i2 = -1

Similarly, %y i2 = -1 and %z i2 = -1.

I am not sure how [%6 = add %x, %y] calculates to [%6 i2 = 2]. Shouldn't it be [%6 i2 = -2] ?

Please clarify.
Thanks.

I think our output is now slightly better:

Example:
%x i4 = 0xC (12, -4)
%z i4 = 0xF (15, -1)
%y i4 = 0xC (12, -4)
%1 i4 = 0x0 (0)
%2 i4 = 0x0 (0)
%3 i4 = 0x0 (0)
%6 i4 = 0x8 (8, -8)
Source value: 0x0 (0)
Target value: undef

The idea is that %6 is INT_MIN and %z=-1. And so we get 'srem INT_MIN, -1', which is undefined behavior per the manual (http://llvm.org/docs/LangRef.html#srem-instruction).
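
One way to read this rule: the remainder of INT_MIN by -1 would be 0, but the matching quotient does not fit in the type, and the LangRef treats that overflow as undefined for srem as well. The small C++ sketch below illustrates the i4 case under that reading. The name quotientFits is made up for this example and is not an LLVM API.

```cpp
#include <cassert>

// True when the truncated quotient a / b fits in a signed integer with
// `bits` bits. Assumes b != 0 and that a and b are already in range.
bool quotientFits(int a, int b, int bits) {
  assert(b != 0 && bits >= 2 && bits <= 16);
  int lo = -(1 << (bits - 1));
  int hi = (1 << (bits - 1)) - 1;
  int q = a / b;  // computed in plain int, which is wider than `bits`
  return q >= lo && q <= hi;
}
```

For i4 the range is -8 to 7, so quotientFits(-8, -1, 4) is false because the quotient would be 8, while quotientFits(-4, -1, 4) is true.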

Thanks for clarifying that. I now understand where the undefined behavior comes from.

Another thing: in these examples, all three inputs (%x, %y, and %z) are known, that is, they are constants. I think such cases wouldn't make it to the transformation. When I tested the transformation with these inputs, the part of the code where the transformation takes place was never reached.
The transformation is useful only when some (or all) bits of these inputs are unknown; otherwise the expression is computed directly. So there should be no change in behavior for the above input, because the transformation will never be applied to it.

Thanks.

In D5351#17, @ankur29.garg wrote:

Thanks for clarifying that. I understand now, what that undefined behavior is for.

Another thing, in these examples, all three inputs, %x, %y, & %z are known. That is, they are constants. I think such cases wouldn't make it to the transformation. When I was testing the transformation, I tested it with these inputs, the part of the code where the transformation takes place was never reached.
The transformation is useful only when some (or all) bits of any of these inputs are unknown. Otherwise, direct calculation of the expression is done. So, there should be no change in the behavior for the above input, as, in case of this input, this transformation will never be applied.

Thanks.

Yes, the counterexample shows concrete, constant values. But that doesn't mean that you need to have these constants in the IR.
This counterexample means that if you do this transformation, and if at run time, it happens that %x,%y,%z have the values that I've given you previously, then the program would execute undefined behavior. For example, if a subsequent optimization pass folds all %x,%y,%z to the constants I gave you, it will detect the undefined behavior and it can then remove all the code.
So, the optimization you're proposing introduces undefined behavior for one particular input (of %x,%y,%z) where the original code had defined behavior. Therefore the transformation is not safe.

Oh, thanks for clarifying that. I will work on the comments and resubmit the transformation.

Thanks.

dexonsmith resigned from this revision. Aug 25 2019, 8:22 AM

Revision Contents

Path                                                   Size
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp    11 lines
test/Transforms/InstCombine/rem.ll                     37 lines

Diff 14270

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

[... 1,324 lines not shown ...]
     if (!isa<Constant>(RHSNeg) ||
         (isa<ConstantInt>(RHSNeg) &&
          cast<ConstantInt>(RHSNeg)->getValue().isStrictlyPositive())) {
       // X % -Y -> X % Y
       Worklist.AddValue(I.getOperand(1));
       I.setOperand(1, RHSNeg);
       return &I;
     }

+  // ((X % Z) + (Y % Z)) % Z -> (X + Y) % Z
+  Value *X, *Y, *Z;
+  if (match(Op1, m_Value(Z)) &&
+      match(Op0, m_Add(m_SRem(m_Value(X), m_Specific(Z)),
+                       m_SRem(m_Value(Y), m_Specific(Z))))) {
+    if (WillNotOverflowSignedAdd(X, Y, (Instruction *)&I)) {
+      return ReplaceInstUsesWith(
+          I, Builder->CreateSRem(Builder->CreateAdd(X, Y), Z));
+    }
+  }
+
   // If the sign bits of both operands are zero (i.e. we can prove they are
   // unsigned inputs), turn this into a urem.
   if (I.getType()->isIntegerTy()) {
     APInt Mask(APInt::getSignBit(I.getType()->getPrimitiveSizeInBits()));
     if (MaskedValueIsZero(Op1, Mask, 0, &I) &&
         MaskedValueIsZero(Op0, Mask, 0, &I)) {
       // X srem Y -> X urem Y, iff X and Y don't have sign bit set
       return BinaryOperator::CreateURem(Op0, Op1, I.getName());
[... 59 lines not shown ...]

test/Transforms/InstCombine/rem.ll

[... 207 lines not shown ...]
 define <2 x i64> @test20(<2 x i64> %X, <2 x i1> %C) {
 ; CHECK-LABEL: @test20(
 ; CHECK-NEXT: select <2 x i1> %C, <2 x i64> <i64 1, i64 2>, <2 x i64> zeroinitializer
 ; CHECK-NEXT: ret <2 x i64>
   %V = select <2 x i1> %C, <2 x i64> <i64 1, i64 2>, <2 x i64> <i64 8, i64 9>
   %R = urem <2 x i64> %V, <i64 2, i64 3>
   ret <2 x i64> %R
 }

+define i32 @test21(i32 %x, i32 %y, i32 %n) {
+; CHECK-LABEL: @test21(
+; CHECK-NEXT: %1 = add i32 %x, %y
+; CHECK-NEXT: %2 = srem i32 %1, %n
+; CHECK-NEXT: ret i32 %2
+  %mod = srem i32 %x, %n
+  %mod1 = srem i32 %y, %n
+  %add = add i32 %mod, %mod1
+  %mod2 = srem i32 %add, %n
+  ret i32 %mod2
+}
+
+define i2 @test22(i2 %c) {
+; CHECK-LABEL: @test22(
+; CHECK-NEXT: %lhs = srem i2 -1, %c
+; CHECK-NEXT: %rhs = srem i2 -2, %c
+; CHECK-NEXT: %add = add i2 %lhs, %rhs
+; CHECK-NEXT: %mod = srem i2 %add, %c
+; CHECK-NEXT: ret i2 %mod
+  %lhs = srem i2 3, %c
+  %rhs = srem i2 2, %c
+  %add = add i2 %lhs, %rhs
+  %mod = srem i2 %add, %c
+  ret i2 %mod
+}
+
+define i4 @test23(i4 %c) {
+; CHECK-LABEL: @test23(
+; CHECK-NEXT: %1 = srem i4 5, %c
+; CHECK-NEXT: ret i4 %1
+  %lhs = srem i4 3, %c
+  %rhs = srem i4 2, %c
+  %add = add i4 %lhs, %rhs
+  %mod = srem i4 %add, %c
+  ret i4 %mod
+}
