This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
319	Can we factor out code of `if(match) { ... }` into a lambda that takes a matcher and returns instruction and avoid this copy-paste?
606	The result of `tryReassociateMinOrMax` is only used as instruction, and non-instruction values are discarded by `dyn_cast_or_null`. Maybe change API to return instruction from here?
635	`findClosestMatchingDominator` returns an `Instruction *`; do we really need to cast it to `Value *`?
649	Does it make more sense to name it `".reassociate"` instead?
llvm/test/Transforms/NaryReassociate/nary-smax.ll
1	Please commit these tests with auto-generated checks without your patch and rebase on top of it to see what your patch changes.
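The refactoring suggested for line 319 could look roughly like the sketch below. It is plain C++14 with hypothetical stand-ins (`Node`, `Kind`, `KindMatcher`, `reassociateMinOrMax`) in place of LLVM's `Instruction`, the `m_UMin`/`m_SMin`/`m_UMax`/`m_SMax` matchers and `tryReassociateMinOrMax`, so it only illustrates the shape: a generic lambda holds the per-matcher logic once, and each matcher type is tried in turn. It is not the code that was committed.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for LLVM IR and PatternMatch; none of these are
// real LLVM types.
enum class Kind { UMin, SMin, UMax, SMax, Add };

struct Node {
  Kind kind;
  int lhs;
  int rhs;
};

// Each Kind yields a distinct matcher type, much as m_UMin, m_SMin, ... do.
template <Kind K> struct KindMatcher {
  int &lhs;
  int &rhs;
  bool match(const Node &n) const {
    if (n.kind != K)
      return false;
    lhs = n.lhs; // bind the operands on success, like m_Value(LHS)
    rhs = n.rhs;
    return true;
  }
};

const char *kindName(Kind k) {
  switch (k) {
  case Kind::UMin: return "umin";
  case Kind::SMin: return "smin";
  case Kind::UMax: return "umax";
  case Kind::SMax: return "smax";
  default:         return "other";
  }
}

// Stand-in for tryReassociateMinOrMax: it only reports what it was given.
template <Kind K>
std::string reassociateMinOrMax(const KindMatcher<K> &, int lhs, int rhs) {
  return std::string(kindName(K)) + "(" + std::to_string(lhs) + ", " +
         std::to_string(rhs) + ")";
}

// Returns an empty string when no min/max pattern matches.
std::string tryReassociate(const Node &n) {
  int lhs = 0, rhs = 0;
  // Generic lambda: the "if (match) { ... }" body is written only once.
  auto tryMatcher = [&](auto matcher) -> std::string {
    if (!matcher.match(n))
      return std::string();
    assert(lhs == n.lhs && rhs == n.rhs && "matcher must bind both operands");
    return reassociateMinOrMax(matcher, lhs, rhs);
  };
  std::string result = tryMatcher(KindMatcher<Kind::UMin>{lhs, rhs});
  if (result.empty()) result = tryMatcher(KindMatcher<Kind::SMin>{lhs, rhs});
  if (result.empty()) result = tryMatcher(KindMatcher<Kind::UMax>{lhs, rhs});
  if (result.empty()) result = tryMatcher(KindMatcher<Kind::SMax>{lhs, rhs});
  return result;
}
```

Calling `tryReassociate({Kind::SMax, 1, 2})` goes through the same lambda as every other kind; a non-min/max node such as `{Kind::Add, 1, 2}` falls through all four attempts and yields an empty string.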

mkazantsev added inline comments. Oct 29 2020, 9:56 PM

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
606	Never mind, I think what you have is fine (I didn't notice it's a mutator).

ebrevnov added a child revision: D88285: [NARY-REASSOCIATE] Simplify traversal logic by post deleting dead instructions. Oct 29 2020, 10:44 PM

ebrevnov added a child revision: D88286: [NFC][NARY-REASSOCIATE] Restructure code to aviod isPotentiallyReassociatable.

ebrevnov removed a child revision: D88285: [NARY-REASSOCIATE] Simplify traversal logic by post deleting dead instructions.

ebrevnov added inline comments. Dec 2 2020, 2:39 AM

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
319	Not sure how many lines of the code it will save.... let me try that... will see if it's any better
635	I don't think it will make any difference in this particular case. No problem changing it to Instruction*.
649	In addition to NaryReassociate there is the Reassociate pass. I want the name to clearly point to the NaryReassociate pass. Using "reassociate" seems too long to me...
llvm/test/Transforms/NaryReassociate/nary-smax.ll
1	Ok

ebrevnov removed a child revision: D88286: [NFC][NARY-REASSOCIATE] Restructure code to aviod isPotentiallyReassociatable. Dec 3 2020, 1:59 AM

ebrevnov added a parent revision: D88286: [NFC][NARY-REASSOCIATE] Restructure code to aviod isPotentiallyReassociatable. Dec 3 2020, 2:00 AM

Rebase

Harbormaster completed remote builds in B80941: Diff 309213. Dec 3 2020, 3:18 AM

mkazantsev added inline comments. Dec 4 2020, 4:10 AM

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
319	I still think it would be useful, these lines copy-paste over and over.
llvm/test/Transforms/NaryReassociate/nary-smax.ll
1	Still relevant.

Addressed comments

Harbormaster completed remote builds in B90767: Diff 326312. Feb 25 2021, 12:55 AM

Update

LGTM

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
340	{ } not needed

This revision is now accepted and ready to land. Feb 25 2021, 3:04 AM

Harbormaster completed remote builds in B90784: Diff 326334. Feb 25 2021, 3:07 AM

ebrevnov added inline comments. Feb 25 2021, 3:13 AM

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
340	will remove before commit

This revision was landed with ongoing or failed builds. Feb 25 2021, 3:23 AM

Closed by commit rG83d134c3c422: [NARY-REASSOCIATE] Support reassociation of min/max (authored by Evgeniy Brevnov <ybrevnov@azul.com>).

This revision was automatically updated to reflect the committed changes.

Evgeniy Brevnov <ybrevnov@azul.com> added a commit: rG83d134c3c422: [NARY-REASSOCIATE] Support reassociation of min/max.

RKSimon added a subscriber: RKSimon. Feb 25 2021, 3:45 AM

RKSimon added inline comments.

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
615	@ebrevnov uint is an unknown type on most targets and is breaking builds. Just use int?

This patch regressed the following tests:

LLVM :: Transforms/NaryReassociate/nary-smax.ll
LLVM :: Transforms/NaryReassociate/nary-smin.ll
LLVM :: Transforms/NaryReassociate/nary-umax.ll
LLVM :: Transforms/NaryReassociate/nary-umin.ll

The reason is that it doesn't account for undef values. Example:

----------------------------------------
define i32 @smax_test1(i32 %a, i32 %b, i32 %c) {
%0:
  %c1 = icmp sgt i32 %a, %b
  %smax1 = select i1 %c1, i32 %a, i32 %b
  %c2 = icmp sgt i32 %b, %c
  %smax2 = select i1 %c2, i32 %b, i32 %c
  %c3 = icmp sgt i32 %smax2, %a
  %smax3 = select i1 %c3, i32 %smax2, i32 %a
  %res = add i32 %smax1, %smax3
  ret i32 %res
}
=>
define i32 @smax_test1(i32 %a, i32 %b, i32 %c) {
%0:
  %c1 = icmp sgt i32 %a, %b
  %smax1 = select i1 %c1, i32 %a, i32 %b
  %1 = icmp sgt i32 %smax1, %c
  %smax3.nary = select i1 %1, i32 %smax1, i32 %c
  %res = add i32 %smax1, %smax3.nary
  ret i32 %res
}
Transformation doesn't verify!
ERROR: Target's return value is more undefined

Example:
i32 %a = #x7fffffff (2147483647)
i32 %b = #x00000000 (0)
i32 %c = undef

Source:
i1 %c1 = #x1 (1)
i32 %smax1 = #x7fffffff (2147483647)
i1 %c2 = any
i32 %smax2 = any
i1 %c3 = #x0 (0)
i32 %smax3 = #x7fffffff (2147483647)
i32 %res = #xfffffffe (4294967294, -2)

Target:
i1 %c1 = #x1 (1)
i32 %smax1 = #x7fffffff (2147483647)
i1 %1 = #x0 (0)
i32 %smax3.nary = #x03002006 (50339846)
i32 %res = #x83002005 (2197823493, -2097143803)
Source value: #xfffffffe (4294967294, -2)
Target value: #x83002005 (2197823493, -2097143803)

I think this is an issue of verification itself. In the first case max(0, undef)=>any and max(any, max_int)=>max_int. In the second case max(max_int, undef)=>x03002006. I believe the behavior of the verifier is inconsistent in these two cases and max(max_int, undef) should be evaluated to max_int as well. We can do the following trivial transformations to prove that: max(max_int, undef) is trivially equal to max(max_int, max(undef, undef)) and max(undef, undef) should be evaluated to 'any' since max(0, undef) is evaluated to 'any' in the first case. Thus we get max(max_int, any) which is evaluated to 'max_int' in the first case. So max(max_int, undef) should be evaluated to 'max_int' but not 'x03002006'.

Makes sense?

In D88287#2589378, @ebrevnov wrote:

I think this is an issue of verification itself. In the first case max(0, undef)=>any and max(any, max_int)=>max_int. In the second case max(max_int, undef)=>x03002006. I believe the behavior of the verifier is inconsistent in these two cases and max(max_int, undef) should be evaluated to max_int as well. We can do the following trivial transformations to prove that: max(max_int, undef) is trivially equal to max(max_int, max(undef, undef)) and max(undef, undef) should be evaluated to 'any' since max(0, undef) is evaluated to 'any' in the first case. Thus we get max(max_int, any) which is evaluated to 'max_int' in the first case. So max(max_int, undef) should be evaluated to 'max_int' but not 'x03002006'.

Makes sense?

Note that given

%a = undef
%b = %a

, %a and %b have undefined values, and there are no guarantees that they are equal/not equal.
Since you emitted icmp+select, you 'read' from undefined %c twice, and you are free to get different result each time.

In D88287#2589590, @lebedev.ri wrote:

Note that given
%a = undef
%b = %a
, %a and %b have undefined values, and there are no guarantees that they are equal/not equal.
Since you emitted icmp+select, you 'read' from undefined %c twice, and you are free to get different result each time.

I must be missing something but I don't see how that applies to the above case. I think the problem is not connected with evaluation of %c twice (to different values).
I think the problem is that in the second case "%1 = icmp sgt i32 %smax1, %c" was evaluated to 'false' even though 'smax1' is known to be max_int. But if we replace "undef" with "any" like in the first case it is evaluated to max_int....

I think I understand the problem now. 'any' > max_int is known to be false (first case), while max_int > 'any' is unknown (second case)... so this is not a verification issue....
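The counterexample can be replayed by brute force. The sketch below is an illustration, not a substitute for Alive2: it assumes an 8-bit signed domain in place of i32, models every read of an undef value as an independent choice, and tracks only the possible values of %smax3 (not %res). The function names are made up for this example.

```cpp
#include <cassert>
#include <set>

// 8-bit signed stand-in for i32, so every undef choice can be enumerated.
constexpr int kMin = -128;
constexpr int kMax = 127;

// Inserts every value of the domain: the result is an unconstrained undef.
static void insertAny(std::set<int> &out) {
  for (int v = kMin; v <= kMax; ++v)
    out.insert(v);
}

// Possible values of %smax3 in the source function when %c is undef.
std::set<int> sourceSmax3(int a, int b) {
  assert(kMin <= a && a <= kMax && kMin <= b && b <= kMax);
  std::set<int> out;
  for (int r1 = kMin; r1 <= kMax; ++r1) { // value read for %c in %c2
    if (b > r1) {
      out.insert(b > a ? b : a);          // %smax2 is the concrete %b
      continue;
    }
    // %smax2 is %c itself, i.e. undef: each later use may read a new value.
    for (int u = kMin; u <= kMax; ++u) {  // value read for %smax2 in %c3
      if (u > a)
        insertAny(out);                   // select returns the undef %smax2
      else
        out.insert(a);
    }
  }
  return out;
}

// Possible values of %smax3.nary in the target function when %c is undef.
std::set<int> targetSmax3(int a, int b) {
  assert(kMin <= a && a <= kMax && kMin <= b && b <= kMax);
  std::set<int> out;
  const int smax1 = a > b ? a : b;
  for (int r1 = kMin; r1 <= kMax; ++r1) { // value read for %c in %1
    if (smax1 > r1)
      out.insert(smax1);
    else
      insertAny(out);                     // select returns the undef %c
  }
  return out;
}
```

With `a = kMax` and `b = 0`, `sourceSmax3` contains only `kMax`, because nothing compares greater than the maximum. `targetSmax3` contains every value of the domain: the compare can read `%c` as `kMax`, evaluate to false, and the select then reads `%c` again as anything. That is the "more undefined" result the verifier reports.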

In D88287#2589378, @ebrevnov wrote:

I think this is an issue of verification itself. In the first case max(0, undef)=>any and max(any, max_int)=>max_int. In the second case max(max_int, undef)=>x03002006. I believe the behavior of the verifier is inconsistent in these two cases and max(max_int, undef) should be evaluated to max_int as well. We can do the following trivial transformations to prove that: max(max_int, undef) is trivially equal to max(max_int, max(undef, undef)) and max(undef, undef) should be evaluated to 'any' since max(0, undef) is evaluated to 'any' in the first case. Thus we get max(max_int, any) which is evaluated to 'max_int' in the first case. So max(max_int, undef) should be evaluated to 'max_int' but not 'x03002006'.

Makes sense?

Just to add to what Roman wrote, thinking of the code as max(,) is misleading. The code is doing icmp sgt INT_MAX, undef which can evaluate to true or false. But we cannot assume that undef from now on is equal to INT_MAX just because the comparison evaluated to true.
There are 2 possible fixes:

1. only do the optimization if %c is known non-undef (use ValueTracking's utility), e.g., https://alive2.llvm.org/ce/z/sLeOE0
2. freeze %c, e.g., https://alive2.llvm.org/ce/z/ZX557G
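To see why freezing helps, the possible results can be enumerated exhaustively on a tiny domain. This sketch assumes a 4-bit signed type instead of i32, uses a 16-bit mask as the set of possible %smax3 values, and treats freeze as picking one value of %c that all later uses share. All names are hypothetical, and the Alive2 links remain the authority.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kMin = -8;               // 4-bit signed stand-in for i32
constexpr int kMax = 7;
constexpr std::uint16_t kAll = 0xFFFF; // mask with every value present

constexpr std::uint16_t bit(int v) {
  return static_cast<std::uint16_t>(1u << (v - kMin));
}

// Source: smax(smax(b, c), a) spelled as icmp+select, with %c undef.
std::uint16_t sourceMask(int a, int b) {
  assert(kMin <= a && a <= kMax && kMin <= b && b <= kMax);
  std::uint16_t out = 0;
  for (int r1 = kMin; r1 <= kMax; ++r1) { // read of %c in the inner compare
    if (b > r1) {
      out |= bit(b > a ? b : a);          // inner select yields concrete %b
      continue;
    }
    for (int u = kMin; u <= kMax; ++u)    // read of the undef inner result
      out |= (u > a) ? kAll : bit(a);
  }
  return out;
}

// Target without freeze: the compare and the select read %c independently.
std::uint16_t unfrozenTargetMask(int a, int b) {
  const int smax1 = a > b ? a : b;
  std::uint16_t out = 0;
  for (int r1 = kMin; r1 <= kMax; ++r1)
    out |= (smax1 > r1) ? bit(smax1) : kAll;
  return out;
}

// Target with freeze: one value f is chosen for %c and used consistently.
std::uint16_t frozenTargetMask(int a, int b) {
  const int smax1 = a > b ? a : b;
  std::uint16_t out = 0;
  for (int f = kMin; f <= kMax; ++f)
    out |= bit(smax1 > f ? smax1 : f);
  return out;
}

// Refinement in this model: every target result is also a source result.
bool refinesEverywhere(std::uint16_t (*target)(int, int)) {
  for (int a = kMin; a <= kMax; ++a)
    for (int b = kMin; b <= kMax; ++b)
      if (target(a, b) & ~sourceMask(a, b))
        return false;
  return true;
}
```

In this model `refinesEverywhere(frozenTargetMask)` holds while `refinesEverywhere(unfrozenTargetMask)` does not; the failing inputs are those with `a == kMax`, which matches the shape of the reported counterexample.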

In D88287#2589791, @nlopes wrote:

Just to add to what Roman wrote, thinking of the code as max(,) is misleading. The code is doing icmp sgt INT_MAX, undef which can evaluate to true or false. But we cannot assume that undef from now on is equal to INT_MAX just because the comparison evaluated to true.
There are 2 possible fixes:

1. only do the optimization if %c is known non-undef (use ValueTracking's utility), e.g., https://alive2.llvm.org/ce/z/sLeOE0
2. freeze %c, e.g., https://alive2.llvm.org/ce/z/ZX557G

0. fix SCEV's https://github.com/llvm/llvm-project/blob/00fe10c6a65173c9c578babd19f8fee44d07a761/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp#L1788-L1789 to emit proper intrinsics?

Just to add to what Roman wrote, thinking of the code as max(,) is misleading. The code is doing icmp sgt INT_MAX, undef which can evaluate to true or false. But we cannot assume that undef from now on is equal to INT_MAX just because the comparison evaluated to true.

Sure, "undef" can be evaluated to any value each time. That's why in the first test case "%c2 = icmp sgt i32 %b, %c" is evaluated to 'any'. I don't see where 'undef' is assumed to be always equal to INT_MAX. Give me some time to think about a possible fix. If I can't find a solution quickly I will revert. Thank you.

I think the solution is to use >= instead of > when we do min/max reassociation. In other words, originally we had 'any' > MAX_INT which is known to be false. If we want semantically equal but reassociated expression we should invert the comparison logic. In other words we should check MAX_INT >= 'any' which is known to be true and MAX_INT will be selected.

In D88287#2589841, @ebrevnov wrote:

I think the solution is to use >= instead of > when we do min/max reassociation. In other words, originally we had 'any' > MAX_INT which is known to be false. If we want semantically equal but reassociated expression we should invert the comparison logic. In other words we should check MAX_INT >= 'any' which is known to be true and MAX_INT will be selected.

That's clever yes!
Alive2 says it's correct: https://alive2.llvm.org/ce/z/gieHpn

Current: https://alive2.llvm.org/ce/z/x8NBH8

In D88287#2589841, @ebrevnov wrote:

I think the solution is to use >= instead of > when we do min/max reassociation. In other words, originally we had 'any' > MAX_INT which is known to be false. If we want semantically equal but reassociated expression we should invert the comparison logic. In other words we should check MAX_INT >= 'any' which is known to be true and MAX_INT will be selected.

No.

Like i have already said, the fix is https://alive2.llvm.org/ce/z/RkBWxC.
Let me just fix this then.

No.

Like i have already said, the fix is https://alive2.llvm.org/ce/z/RkBWxC.
Let me just fix this then.

This is because you transform "sgt, select" to smax, which are not semantically equal for the same reason:
https://alive2.llvm.org/ce/z/zk_RrZ

I think it's better to revert it for now. Looks like the topic is not that trivial and we should first agree on the fix. There is no need to hurry.

Evgeniy Brevnov <ybrevnov@azul.com> added a reverting change: rG13a5cac2ba91: Revert "[NARY-REASSOCIATE] Support reassociation of min/max". Feb 26 2021, 4:48 AM

commit 13a5cac2ba919b4d02a296428b58919231e08569 (HEAD -> main, origin/main)
Author: Evgeniy Brevnov <ybrevnov@azul.com>
Date: Fri Feb 26 19:23:32 2021 +0700

Revert "[NARY-REASSOCIATE] Support reassociation of min/max"

This reverts commit 83d134c3c4222e8b8d3d90c099f749a3b3abc8e0

JonChesterfield added a subscriber: JonChesterfield. Mar 1 2021, 11:13 AM

ebrevnov reopened this revision. Mar 2 2021, 12:18 AM

This revision is now accepted and ready to land. Mar 2 2021, 12:18 AM

Fix found verification issue

I don't think we should reinvent SCEV Expander.
I was looking at fixing this properly, but got sidetracked by having to first fix the isHighCostExpansion().

This revision now requires changes to proceed. Mar 2 2021, 12:20 AM

In D88287#2596538, @lebedev.ri wrote:

I don't think we should reinvent SCEV Expander.
I was looking at fixing this properly, but got sidetracked by having to first fix the isHighCostExpansion().

I tried to adjust SCEV Expander but it's pretty challenging by itself. The reason is that it should support not only integer types but floating point and pointer types as well. For example, for floating point types, changing a "cmp" and "select" pair to the related intrinsic may change the semantics for corner cases like NaNs, INFs, exceptions, etc...

I'm not saying that enhancing SCEV Expander is definitely the wrong way to go. It's pretty challenging and can potentially take a long time. I think these two should not be mixed. Once SCEV Expander is reworked it will be a trivial change in the reassociation pass to employ it.
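The floating-point concern can be shown without LLVM. In the sketch below, `selectMax` mirrors a compare-and-select pair, and `std::fmax` is used as a stand-in for a maxnum-style operation; which intrinsic would actually be emitted is not stated in this thread, so treat the pairing as an assumption. The helper names are made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Mirrors "select (a > b), a, b": an ordered compare is false whenever
// either operand is NaN, so the select falls through to b.
double selectMax(double a, double b) { return a > b ? a : b; }

// Stand-in for a maxnum-style operation: std::fmax returns the non-NaN
// operand when exactly one operand is NaN.
double maxnumLike(double a, double b) { return std::fmax(a, b); }

// The two forms agree on ordinary values but not when the second operand
// is NaN.
void checkNaNBehaviour() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(selectMax(2.0, 1.0) == maxnumLike(2.0, 1.0));
  assert(std::isnan(selectMax(1.0, nan))); // compare is false, picks b = NaN
  assert(maxnumLike(1.0, nan) == 1.0);     // NaN operand is ignored
  assert(selectMax(nan, 1.0) == 1.0);      // compare is false, picks b = 1.0
}
```

So replacing the select form with a maxnum-style call changes the result for `(1.0, NaN)`, which is the kind of corner case the comment above is worried about.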

In D88287#2596556, @ebrevnov wrote:

I tried to adjust SCEV Expander but it's pretty challenging by itself.

The reason is it should support not only integer types but floating point and pointer types as well. For example, for floating point types changing "cmp" and "select" pair to related intrinsic may potentially change the semantics for corner cases like NANs, INFs, exceptions, etc...

Uhm, no? SCEV only supports integers and pointers.

I'm not saying that enhancing SCEV Expander is definitely wrong way to go. It's pretty challenging and can potentially take long time. I think these two should not be mixed. Once SCEV Expander is reworked it will be a trivial change in reassociation pass to employ it.

Uhm, no? SCEV only supports integers and pointers.

I looked at the failing list one more time and indeed it was a pointer-to-double case.... In total, there are 54 failing tests which need to be investigated... Roman, are you working in this direction?
SCEV Expander should be fixed separately anyway and I don't think there is strong dependence between these pieces.

In D88287#2596715, @ebrevnov wrote:

I looked at the failing list one more time and indeed it was a pointer to double case.... In total, there are 54 failing tests which needs to be investigated...

I actually think this patch should not have been reverted in the first place.
SCEV is known to be not good with undef, i don't really see why we should start blocking on that now.
So i would personally recommend to directly revert your revert.

Roman, Are you working in this direction?
SCEV Expander should be fixed separately anyway and I don't think there is strong dependence between these pieces.

Right now i'm not looking at that, but i already had a patch to fix this,
but got sidetracked into isHighCostExpansion(), which then resulted in rage-closing IDE...
Right now i'm trying to deal with another SCEV issue.

Harbormaster completed remote builds in B91513: Diff 327370. Mar 2 2021, 6:22 AM

I actually think this patch should not have been reverted in the first place.
SCEV is known to be not good with undef, i don't really see why we should start blocking on that now.
So i would personally recommend to directly revert your revert.

If I'm getting it right you are OK to commit the patch in its current state. If this is the case please unblock it.

In D88287#2599227, @ebrevnov wrote:

If I'm getting it right you are OK to commit the patch in its current state. If this is the case please unblock it.

Sure can do, but i don't want to unblock/accept *this* current code state, so please upload the result of git revert.

Hi Roman, I'm somewhat confused too. If we just revert the revert, we introduce an undef value where there was none (in fact, we create poison where it wasn't present). Do we have plans to fix SCEV expander's behavior with undefs in the near term? If not, reverting the revert will simply introduce the bug. I'd rather stick to the current solution, saying that "ok, SCEVExpander is broken, let's not use it". We can add a FIXME to replace this with the expander in the future, but I don't understand why we would want to use it even though it's buggy.

lebedev.ri mentioned this in rGb46c085d2b6d: [NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV…. Mar 6 2021, 10:52 AM

Fixed SCEVExpander in b46c085d2b6d15873fb53718f0a70b3848e19e4a, please rebase/update this patch accordingly.

But, i think we may have a problem still.
Does this patch intend to reassociate only integers, or pointers too? :)

kzhuravl added a subscriber: kzhuravl. Mar 8 2021, 12:24 PM

Stepped back to use SCEVExpander

Harbormaster completed remote builds in B93239: Diff 329868. Mar 11 2021, 1:06 AM

In D88287#2609126, @lebedev.ri wrote:

Fixed SCEVExpander in b46c085d2b6d15873fb53718f0a70b3848e19e4a, please rebase/update this patch accordingly.

But, i think we may have a problem still.
Does this patch intend to reassociate only integers, or pointers too? :)

Updated to use SCEVExpander and limited to integer types only.

Please do add precommit tests for pointers (unless i just missed them)

In D88287#2618821, @ebrevnov wrote:

Updated to use SCEVExpander and limited to integer types only.

So i guess we may want to extend those intrinsics to support pointer types after all... @spatel @nikic

llvm/lib/Transforms/Scalar/NaryReassociate.cpp
285–290	Early return may be cleaner?
648

This revision is now accepted and ready to land. Mar 11 2021, 1:20 AM

Update

Harbormaster completed remote builds in B93264: Diff 329904. Mar 11 2021, 3:41 AM

This commit may cause multiple CI failures on TensorFlow/NVPTX backend. See https://github.com/tensorflow/tensorflow/commit/b1758bd553dfc2ebbfd07eec01d1e3254eda25b8#commitcomment-48080097.

I am trying to find a minimal reproducible test case here.

In D88287#2619803, @byronyi wrote:

This commit may cause multiple CI failures on TensorFlow/NVPTX backend. See https://github.com/tensorflow/tensorflow/commit/b1758bd553dfc2ebbfd07eec01d1e3254eda25b8#commitcomment-48080097.

I am trying to find a minimal reproducible test case here.

According to bisect log the guilty commit is "first bad commit: [b46c085d2b6d15873fb53718f0a70b3848e19e4a] [NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressions". Am I missing something?

In D88287#2621211, @ebrevnov wrote:

According to bisect log the guilty commit is "first bad commit: [b46c085d2b6d15873fb53718f0a70b3848e19e4a] [NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressions". Am I missing something?

Right now this seems like a downstream problem. I've yet to see a reproducer.

In D88287#2621319, @lebedev.ri wrote:

Right now this seems like a downstream problem. I've yet to see a reproducer.

Sorry, posted to the wrong venue. Downstream seems to get a workaround and I will test again and get back to you.

spatel mentioned this in D98152: [InstCombine] Canonicalize SPF to min/max intrinsics. Mar 15 2021, 7:50 AM

This revision was landed with ongoing or failed builds. Apr 2 2021, 1:30 AM

Closed by commit rG2388aae401dc: [NARY-REASSOCIATE] Support reassociation of min/max (authored by Evgeniy Brevnov <ybrevnov@azul.com>).

This revision was automatically updated to reflect the committed changes.

Evgeniy Brevnov <ybrevnov@azul.com> added a commit: rG2388aae401dc: [NARY-REASSOCIATE] Support reassociation of min/max.

Raising this here as well, as it seems the previous concern I raised with the commit was ignored.

I am seeing this change go into an infinite loop attempting to expand an expression, adding more and more .nary postfixes.
I have reduced the failure to the attached bugpoint.

llc -march=amdgcn -mcpu=gfx700 < bugpoint-reduced-simplified.ll

bugpoint-reduced-simplified.ll (3 KB)

ebrevnov added a comment. Apr 7 2021, 5:46 AM

This comment was removed by ebrevnov.

In D88287#2672928, @critson wrote:

Raising this here as well, as it seems the previous concern I raised with the commit was ignored.

Thanks for letting me know. Hadn't seen this before. I can reproduce the issue and working on the fix.

I am seeing this change go into an infinite loop attempting to expand an expression, adding more and more .nary postfixes.
I have reduced the failure to the attached bugpoint.

llc -march=amdgcn -mcpu=gfx700 < bugpoint-reduced-simplified.ll

bugpoint-reduced-simplified.ll (3 KB)

In D88287#2673846, @ebrevnov wrote:

Thanks for letting me know. Hadn't seen this before. I can reproduce the issue and working on the fix.

Fix has been submitted for review https://reviews.llvm.org/D100170

cc @Hipony
This patch causes an infinite loop with the following example and "opt -nary-reassociate":

define i32 @nary_infinite_loop_minmax(i32 %d0, i32 %d1, i32 %d2, i32 %d3) {
  %cmp0 = icmp slt i32 %d2, %d1
  %sel0 = select i1 %cmp0, i32 %d1, i32 %d2

  %cmp1 = icmp slt i32 %d3, %d0
  %sel1 = select i1 %cmp1, i32 %d0, i32 %d3

  %cmp2 = icmp slt i32 %sel1, %sel0
  %sel2 = select i1 %cmp2, i32 %sel1, i32 %sel0

  %cmp3 = icmp slt i32 %d3, %d0
  %sel3 = select i1 %cmp3, i32 %d0, i32 %d3

  %cmp4 = icmp slt i32 %sel3, %d2
  %sel4 = select i1 %cmp4, i32 %d2, i32 %sel3

  %cmp5 = icmp slt i32 %sel4, %d1
  %sel5 = select i1 %cmp5, i32 %d1, i32 %sel4
  ret i32 %sel5
}

There was another infinite loop example attached to the commit message for this patch - https://reviews.llvm.org/rG83d134c3c4222e8b8d3d90c099f749a3b3abc8e0
I'm not sure if that is the same root cause, but I don't think there was any reply to that failure.

There was another infinite loop example attached to the commit message for this patch - https://reviews.llvm.org/rG83d134c3c4222e8b8d3d90c099f749a3b3abc8e0
I'm not sure if that is the same root cause, but I don't think there was any reply to that failure.

This must be another problem. Previously reported issue has been fixed here https://reviews.llvm.org/rG36b932d6a385bb97efe17818a7a47d29d2d8acf3.
Working on the fix...

The fix is sent for review https://reviews.llvm.org/D112060

spatel removed a subscriber: Hipony. Oct 19 2021, 9:10 AM

Revision Contents

Path                                                              Size
llvm/include/llvm/IR/PatternMatch.h                               1 line
llvm/include/llvm/Transforms/Scalar/NaryReassociate.h             5 lines
llvm/lib/Transforms/Scalar/NaryReassociate.cpp                    114 lines
llvm/test/Transforms/NaryReassociate/ (4 files, names not shown)  151 lines each

Diff 294253

llvm/include/llvm/IR/PatternMatch.h

[1,582 lines not shown; "+" marks lines added by the patch]

 //===----------------------------------------------------------------------===//
 // Matchers for max/min idioms, eg: "select (sgt x, y), x, y" -> smax(x,y).
 //

 template <typename CmpInst_t, typename LHS_t, typename RHS_t, typename Pred_t,
           bool Commutable = false>
 struct MaxMin_match {
+  using PredType = Pred_t;
   LHS_t L;
   RHS_t R;

   // The evaluation order is always stable, regardless of Commutability.
   // The LHS is always matched first.
   MaxMin_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}

   template <typename OpTy> bool match(OpTy *V) {

[645 lines not shown]

llvm/include/llvm/Transforms/Scalar/NaryReassociate.h

[152 lines not shown; context: private:; "+" marks lines added by the patch]

   const SCEV *getBinarySCEV(BinaryOperator *I, const SCEV *LHS,
                             const SCEV *RHS);

   // Returns the closest dominator of \c Dominatee that computes
   // \c CandidateExpr. Returns null if not found.
   Instruction *findClosestMatchingDominator(const SCEV *CandidateExpr,
                                             Instruction *Dominatee);

+  // Reassociate Min/Max.
+  template <typename MaxMinT>
+  Value *tryReassociateMinOrMax(Instruction *I, MaxMinT MaxMinMatch, Value *LHS,
+                                Value *RHS);
+
   // GetElementPtrInst implicitly sign-extends an index if the index is shorter
   // than the pointer size. This function returns whether Index is shorter than
   // GEP's pointer size, i.e., whether Index needs to be sign-extended in order
   // to be an index of GEP.
   bool requiresSignExtension(Value *Index, GetElementPtrInst *GEP);

   AssumptionCache *AC;
   const DataLayout *DL;

[20 lines not shown]

llvm/lib/Transforms/Scalar/NaryReassociate.cpp

Show First 20 Lines • Show All 74 Lines • ▼ Show 20 Lines

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#include "llvm/Transforms/Scalar/NaryReassociate.h" #include "llvm/Transforms/Scalar/NaryReassociate.h"

#include "llvm/ADT/DepthFirstIterator.h" #include "llvm/ADT/DepthFirstIterator.h"

#include "llvm/ADT/SmallVector.h" #include "llvm/ADT/SmallVector.h"

#include "llvm/Analysis/AssumptionCache.h" #include "llvm/Analysis/AssumptionCache.h"

#include "llvm/Analysis/ScalarEvolution.h" #include "llvm/Analysis/ScalarEvolution.h"

#include "llvm/Analysis/ScalarEvolutionExpressions.h"

#include "llvm/Analysis/TargetLibraryInfo.h" #include "llvm/Analysis/TargetLibraryInfo.h"

#include "llvm/Analysis/TargetTransformInfo.h" #include "llvm/Analysis/TargetTransformInfo.h"

#include "llvm/Analysis/ValueTracking.h" #include "llvm/Analysis/ValueTracking.h"

#include "llvm/IR/BasicBlock.h" #include "llvm/IR/BasicBlock.h"

#include "llvm/IR/Constants.h" #include "llvm/IR/Constants.h"

#include "llvm/IR/DataLayout.h" #include "llvm/IR/DataLayout.h"

#include "llvm/IR/DerivedTypes.h" #include "llvm/IR/DerivedTypes.h"

#include "llvm/IR/Dominators.h" #include "llvm/IR/Dominators.h"

Show All 10 Lines

#include "llvm/IR/Value.h" #include "llvm/IR/Value.h"

#include "llvm/IR/ValueHandle.h" #include "llvm/IR/ValueHandle.h"

#include "llvm/InitializePasses.h" #include "llvm/InitializePasses.h"

#include "llvm/Pass.h" #include "llvm/Pass.h"

#include "llvm/Support/Casting.h" #include "llvm/Support/Casting.h"

#include "llvm/Support/ErrorHandling.h" #include "llvm/Support/ErrorHandling.h"

#include "llvm/Transforms/Scalar.h" #include "llvm/Transforms/Scalar.h"

#include "llvm/Transforms/Utils/Local.h" #include "llvm/Transforms/Utils/Local.h"

#include "llvm/Transforms/Utils/ScalarEvolutionExpander.h"

#include <cassert> #include <cassert>

#include <cstdint> #include <cstdint>

using namespace llvm; using namespace llvm;

using namespace PatternMatch; using namespace PatternMatch;

#define DEBUG_TYPE "nary-reassociate" #define DEBUG_TYPE "nary-reassociate"

bool NaryReassociatePass::doOneIteration(Function &F) {

... (150 lines not shown) ...

  RecursivelyDeleteTriviallyDeadInstructionsPermissive(
      DeadInsts, TLI, nullptr, [this](Value *V) { SE->forgetValue(V); });
  return Changed;
}

Instruction *NaryReassociatePass::tryReassociate(Instruction *I,
                                                 const SCEV *&OrigSCEV) {
  Value *LHS = nullptr;
  Value *RHS = nullptr;

  if (!SE->isSCEVable(I->getType()))
    return nullptr;

  {
    // Try to match unsigned Min.
    auto UMinMatch = m_UMin(m_Value(LHS), m_Value(RHS));
    if (match(I, UMinMatch)) {
      OrigSCEV = SE->getSCEV(I);
      return dyn_cast_or_null<Instruction>(
          tryReassociateMinOrMax(I, UMinMatch, LHS, RHS));
    }

lebedev.ri (Not Done): Early return may be cleaner?

  }

  {
    // Try to match signed Min.
    auto SMinMatch = m_SMin(m_Value(LHS), m_Value(RHS));
    if (match(I, SMinMatch)) {
      OrigSCEV = SE->getSCEV(I);
      return dyn_cast_or_null<Instruction>(
          tryReassociateMinOrMax(I, SMinMatch, LHS, RHS));
    }
  }

  {
    // Try to match unsigned Max.
    auto UMaxMatch = m_UMax(m_Value(LHS), m_Value(RHS));
    if (match(I, UMaxMatch)) {
      OrigSCEV = SE->getSCEV(I);
      return dyn_cast_or_null<Instruction>(
          tryReassociateMinOrMax(I, UMaxMatch, LHS, RHS));
    }
  }

  {
    // Try to match signed Max.
    auto SMaxMatch = m_SMax(m_Value(LHS), m_Value(RHS));
    if (match(I, SMaxMatch)) {
      OrigSCEV = SE->getSCEV(I);
      return dyn_cast_or_null<Instruction>(
          tryReassociateMinOrMax(I, SMaxMatch, LHS, RHS));

mkazantsev (Not Done): Can we factor out code of `if(match) { ... }` into a lambda that takes a matcher and returns instruction and avoid this copy-paste?

ebrevnov (Author, Done): Not sure how many lines of code it will save... let me try that... will see if it's any better

mkazantsev (Not Done): I still think it would be useful, these lines copy-paste over and over.

    }
  }

  switch (I->getOpcode()) {
  case Instruction::Add:
  case Instruction::Mul:
    OrigSCEV = SE->getSCEV(I);
    return tryReassociateBinaryOp(cast<BinaryOperator>(I));
  case Instruction::GetElementPtr:
    OrigSCEV = SE->getSCEV(I);
    return tryReassociateGEP(cast<GetElementPtrInst>(I));
  default:
    return nullptr;
  }

  llvm_unreachable("should not be reached");
  return nullptr;
}
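mkazantsev's suggestion above (one lambda that takes a matcher, instead of four copies of the `if (match)` block) could look roughly like the following. This is a standalone sketch, not LLVM code: `Kind`, `Inst`, `Matcher`, `kindName`, `describeMatch`, and `tryReassociateToy` are hypothetical names invented for illustration, with `describeMatch` standing in for the role `tryReassociateMinOrMax` plays in the patch. The lambda takes an `auto` parameter because each matcher is a distinct type.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for LLVM's instruction and matcher types.
enum class Kind { UMin, SMin, UMax, SMax, Add };

struct Inst {
  Kind K;
  int LHS;
  int RHS;
};

// Each instantiation is a distinct type, like m_UMin / m_SMin / ... in the patch.
template <Kind K> struct Matcher {
  static constexpr Kind Value = K;
  // On success, binds both operands to the caller's variables.
  bool match(const Inst &I, int &L, int &R) const {
    if (I.K != K)
      return false;
    L = I.LHS;
    R = I.RHS;
    return true;
  }
};

inline const char *kindName(Kind K) {
  switch (K) {
  case Kind::UMin: return "umin";
  case Kind::SMin: return "smin";
  case Kind::UMax: return "umax";
  case Kind::SMax: return "smax";
  case Kind::Add:  return "add";
  }
  return "unknown";
}

// Placeholder for the per-matcher work done after a successful match.
template <typename MatcherT>
std::string describeMatch(const MatcherT &, int L, int R) {
  return std::string(kindName(MatcherT::Value)) + "(" + std::to_string(L) +
         ", " + std::to_string(R) + ")";
}

inline std::optional<std::string> tryReassociateToy(const Inst &I) {
  int L = 0, R = 0;
  // One generic lambda replaces the four copy-pasted `if (match)` blocks.
  auto TryMatch = [&](auto M) -> std::optional<std::string> {
    if (!M.match(I, L, R))
      return std::nullopt;
    return describeMatch(M, L, R);
  };
  if (auto Res = TryMatch(Matcher<Kind::UMin>{})) return Res;
  if (auto Res = TryMatch(Matcher<Kind::SMin>{})) return Res;
  if (auto Res = TryMatch(Matcher<Kind::UMax>{})) return Res;
  if (auto Res = TryMatch(Matcher<Kind::SMax>{})) return Res;
  return std::nullopt;
}
```

Whether this saves lines in the real pass is the open question ebrevnov raises; the sketch only shows that the four blocks can share one body.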

static bool isGEPFoldable(GetElementPtrInst *GEP,
                          const TargetTransformInfo *TTI) {

mkazantsev (Not Done): { } not needed

ebrevnov (Author, Done): will remove before commit

  SmallVector<const Value*, 4> Indices;
  for (auto I = GEP->idx_begin(); I != GEP->idx_end(); ++I)
    Indices.push_back(*I);
  return TTI->getGEPCost(GEP->getSourceElementType(), GEP->getPointerOperand(),
                         Indices) == TargetTransformInfo::TCC_Free;
}

Instruction *NaryReassociatePass::tryReassociateGEP(GetElementPtrInst *GEP) {

... (233 lines not shown) ...

    if (Value *Candidate = Candidates.back()) {
      Instruction *CandidateInstruction = cast<Instruction>(Candidate);
      if (DT->dominates(CandidateInstruction, Dominatee))
        return CandidateInstruction;
    }
    Candidates.pop_back();
  }
  return nullptr;
}

template <typename MaxMinT> static SCEVTypes convertToSCEVype(MaxMinT &MM) {
  if (std::is_same<smax_pred_ty, typename MaxMinT::PredType>::value)
    return scSMaxExpr;
  else if (std::is_same<umax_pred_ty, typename MaxMinT::PredType>::value)
    return scUMaxExpr;
  else if (std::is_same<smin_pred_ty, typename MaxMinT::PredType>::value)
    return scSMinExpr;
  else if (std::is_same<umin_pred_ty, typename MaxMinT::PredType>::value)
    return scUMinExpr;

  llvm_unreachable("Can't convert MinMax pattern to SCEV type");
  return scUnknown;
}
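The helper above maps a matcher's predicate type to an expression kind with a chain of `std::is_same` checks and falls back to `llvm_unreachable` at run time. As an alternative, and only as a sketch under the assumption that C++17 is available, the same dispatch can be written with `if constexpr` so that an unsupported predicate type is rejected at compile time. All names here (`SMaxPred`, `UMaxPred`, `SMinPred`, `UMinPred`, `ExprKind`, `ToyMatcher`, `AlwaysFalse`, `toExprKind`) are hypothetical stand-ins, not LLVM's types.

```cpp
#include <type_traits>

// Hypothetical tag types standing in for the matcher predicate types.
struct SMaxPred {};
struct UMaxPred {};
struct SMinPred {};
struct UMinPred {};

enum class ExprKind { SMax, UMax, SMin, UMin };

// A toy matcher that only carries its predicate type.
template <typename PredT> struct ToyMatcher {
  using PredType = PredT;
};

// Dependent false, so the static_assert fires only when instantiated.
template <typename T> struct AlwaysFalse : std::false_type {};

template <typename MatcherT> constexpr ExprKind toExprKind() {
  using P = typename MatcherT::PredType;
  if constexpr (std::is_same_v<P, SMaxPred>) {
    return ExprKind::SMax;
  } else if constexpr (std::is_same_v<P, UMaxPred>) {
    return ExprKind::UMax;
  } else if constexpr (std::is_same_v<P, SMinPred>) {
    return ExprKind::SMin;
  } else if constexpr (std::is_same_v<P, UMinPred>) {
    return ExprKind::UMin;
  } else {
    static_assert(AlwaysFalse<P>::value, "unsupported predicate type");
  }
}
```

The run-time chain in the patch behaves the same for the four supported types; the difference is only in when a misuse is reported.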

template <typename MaxMinT>
Value *NaryReassociatePass::tryReassociateMinOrMax(Instruction *I,
                                                   MaxMinT MaxMinMatch,
                                                   Value *LHS, Value *RHS) {

mkazantsev (Not Done): The result of `tryReassociateMinOrMax` is only used as instruction, and non-instruction values are discarded by `dyn_cast_or_null`. Maybe change API to return instruction from here?

mkazantsev (Not Done): Never mind, I think what you have is fine (I didn't notice it's a mutator).

  Value *A = nullptr, *B = nullptr;
  MaxMinT m_MaxMin(m_Value(A), m_Value(B));
  for (uint i = 0; i < 2; ++i) {
    if (match(LHS, m_MaxMin)) {
      const SCEV *AExpr = SE->getSCEV(A), *BExpr = SE->getSCEV(B);
      const SCEV *RHSExpr = SE->getSCEV(RHS);
      for (uint j = 0; j < 2; ++j) {

RKSimon (Not Done): @ebrevnov uint is an unknown type on most targets - breaking builds - just use int?

        if (j == 0) {
          if (BExpr == RHSExpr)
            continue;
          // Transform 'I = (A op B) op RHS' to 'I = (A op RHS) op B' on the
          // first iteration.
          std::swap(BExpr, RHSExpr);
        } else {
          if (AExpr == RHSExpr)
            continue;
          // Transform 'I = (A op RHS) op B' to 'I = (B op RHS) op A' on the
          // second iteration.
          std::swap(AExpr, RHSExpr);
        }

        SCEVExpander Expander(*SE, *DL, "nary-reassociate");
        SmallVector<const SCEV *, 2> Ops1{ BExpr, AExpr };
        const SCEVTypes SCEVType = convertToSCEVype(m_MaxMin);
        const SCEV *R1Expr = SE->getMinMaxExpr(SCEVType, Ops1);

        Value *R1MinMax = findClosestMatchingDominator(R1Expr, I);

mkazantsev (Not Done): `findClosestMatchingDominator` returns an `Instruction *`, do we really need to downcast it to `Value *`?

ebrevnov (Author, Done): I don't think it will make any difference in this particular case. No problems to change to Instruction*

        if (!R1MinMax) {
          continue;
        }

        LLVM_DEBUG(dbgs() << "NARY: Found common sub-expr: " << *R1MinMax
                          << "\n");

        R1Expr = SE->getUnknown(R1MinMax);
        SmallVector<const SCEV *, 2> Ops2{ RHSExpr, R1Expr };
        const SCEV *R2Expr = SE->getMinMaxExpr(SCEVType, Ops2);

        Value *NewMinMax = Expander.expandCodeFor(R2Expr, I->getType(), I);

lebedev.ri (Not Done), suggested edit:

  Value *NewMinMax = Expander.expandCodeFor(R2Expr, I->getType(), I);
- NewMinMax->setName(Twine(I->getName()).concat(".nary"));
+ NewMinMax->setName(I->getName() + ".nary");
  LLVM_DEBUG(dbgs() << "NARY: Deleting: " << *I << "\n"

        NewMinMax->setName(Twine(I->getName()).concat(".nary"));

mkazantsev (Not Done): Does it make more sense to name it `".reassociate"` instead?

ebrevnov (Author, Done): In addition to NaryReassociate there is the Reassociate pass. I want the name to clearly point to the NaryReassociate pass. Using "reassociate" seems too long to me...

        LLVM_DEBUG(dbgs() << "NARY: Deleting: " << *I << "\n"
                          << "NARY: Inserting: " << *NewMinMax << "\n");
        return NewMinMax;
      }
    }
    std::swap(LHS, RHS);
  }
  return nullptr;
}
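The rewrite performed by the function above relies on min/max being associative and commutative: `(A op B) op RHS` can be regrouped so that an already computed `A op RHS` or `B op RHS` is reused. The sketch below checks that identity for signed max on a few sample values, using the shape from the tests (`m1 = max(a, b)`, `m2 = max(max(b, c), a)`). It is a plain C++ illustration, not part of the patch; `originalForm`, `reassociatedForm`, and `formsAgreeOnGrid` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// m2 as written in the input: max(max(b, c), a).
inline int32_t originalForm(int32_t a, int32_t b, int32_t c) {
  return std::max(std::max(b, c), a);
}

// m2 after reusing m1 = max(a, b): max(m1, c).
inline int32_t reassociatedForm(int32_t a, int32_t b, int32_t c) {
  const int32_t m1 = std::max(a, b);
  return std::max(m1, c);
}

// Compares both forms on a small grid of values that includes the extremes.
inline bool formsAgreeOnGrid() {
  const int32_t Vals[] = {std::numeric_limits<int32_t>::min(), -1, 0, 1,
                          std::numeric_limits<int32_t>::max()};
  for (int32_t a : Vals)
    for (int32_t b : Vals)
      for (int32_t c : Vals)
        if (originalForm(a, b, c) != reassociatedForm(a, b, c))
          return false;
  return true;
}
```

A grid check is not a proof, but it mirrors what the `*_test1` and `*_test2` cases exercise for each of the four operations.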

llvm/test/Transforms/NaryReassociate/nary-smax.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				mkazantsev (Not Done): Please commit these tests with auto-generated checks without your patch and rebase on top of it to see what your patch changes.
				ebrevnov (Author, Done): Ok
				mkazantsev (Not Done): Still relevant.
				; RUN: opt < %s -nary-reassociate -S | FileCheck %s
				; RUN: opt < %s -passes='nary-reassociate' -S | FileCheck %s

				declare i32 @llvm.smax.i32(i32 %a, i32 %b)

				; m1 = smax(a,b) ; has side uses
				; m2 = smax(smax((b,c), a) -> m2 = smax(m1, c)
				define i32 @smax_test1(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smax_test1(
				; CHECK-NEXT: [[C1:%.*]] = icmp sgt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[SMAX1]], [[C:%.*]]
				; CHECK-NEXT: [[SMAX3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMAX1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMAX1]], [[SMAX3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp sgt i32 %a, %b
				%smax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp sgt i32 %b, %c
				%smax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp sgt i32 %smax2, %a
				%smax3 = select i1 %c3, i32 %smax2, i32 %a
				%res = add i32 %smax1, %smax3
				ret i32 %res
				}

				; m1 = smax(a,b) ; has side uses
				; m2 = smax(b, (smax(a, c))) -> m2 = smax(m1, c)
				define i32 @smax_test2(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smax_test2(
				; CHECK-NEXT: [[C1:%.*]] = icmp sgt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[SMAX1]], [[C:%.*]]
				; CHECK-NEXT: [[SMAX3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMAX1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMAX1]], [[SMAX3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp sgt i32 %a, %b
				%smax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp sgt i32 %a, %c
				%smax2 = select i1 %c2, i32 %a, i32 %c
				%c3 = icmp sgt i32 %b, %smax2
				%smax3 = select i1 %c3, i32 %b, i32 %smax2
				%res = add i32 %smax1, %smax3
				ret i32 %res
				}

				; Same test as smax_test1 but uses @llvm.smax intrinsic
				define i32 @smax_test3(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smax_test3(
				; CHECK-NEXT: [[SMAX1:%.*]] = call i32 @llvm.smax.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[SMAX1]], [[C:%.*]]
				; CHECK-NEXT: [[SMAX3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMAX1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMAX1]], [[SMAX3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%smax1 = call i32 @llvm.smax.i32(i32 %a, i32 %b)
				%smax2 = call i32 @llvm.smax.i32(i32 %b, i32 %c)
				%smax3 = call i32 @llvm.smax.i32(i32 %smax2, i32 %a)
				%res = add i32 %smax1, %smax3
				ret i32 %res
				}

				; m1 = smax(a,b) ; has side uses
				; m2 = smax(smax_or_eq((b,c), a) -> m2 = smax(m1, c)
				define i32 @umax_test4(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umax_test4(
				; CHECK-NEXT: [[C1:%.*]] = icmp sgt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[SMAX1]], [[C:%.*]]
				; CHECK-NEXT: [[SMAX3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMAX1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMAX1]], [[SMAX3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp sgt i32 %a, %b
				%smax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp sge i32 %b, %c
				%smax_or_eq2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp sgt i32 %smax_or_eq2, %a
				%smax3 = select i1 %c3, i32 %smax_or_eq2, i32 %a
				%res = add i32 %smax1, %smax3
				ret i32 %res
				}

				; m1 = smax_or_eq(a,b) ; has side uses
				; m2 = smax_or_eq(smax((b,c), a) -> m2 = smax(m1, c)
				define i32 @smax_test5(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smax_test5(
				; CHECK-NEXT: [[C1:%.*]] = icmp sge i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMAX_OR_EQ1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[SMAX_OR_EQ1]], [[C:%.*]]
				; CHECK-NEXT: [[SMAX_OR_EQ3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMAX_OR_EQ1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMAX_OR_EQ1]], [[SMAX_OR_EQ3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp sge i32 %a, %b
				%smax_or_eq1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp sgt i32 %b, %c
				%smax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp sge i32 %smax2, %a
				%smax_or_eq3 = select i1 %c3, i32 %smax2, i32 %a
				%res = add i32 %smax_or_eq1, %smax_or_eq3
				ret i32 %res
				}
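The `_or_eq` tests above use `sge` in place of `sgt` for the compare feeding the `select`, and their CHECK lines expect the same reassociation. At the value level the two spellings cannot differ: they pick different operands only when `a == b`, and then both operands hold the same value. A small standalone sketch of that observation follows; `selectSgt`, `selectSge`, and `selectsAgreeOnGrid` are hypothetical names used only here.

```cpp
#include <cstdint>
#include <limits>

// select(icmp sgt a, b), a, b
inline int32_t selectSgt(int32_t a, int32_t b) { return a > b ? a : b; }

// select(icmp sge a, b), a, b
inline int32_t selectSge(int32_t a, int32_t b) { return a >= b ? a : b; }

// Both spellings return the signed maximum; check a few representative pairs.
inline bool selectsAgreeOnGrid() {
  const int32_t Vals[] = {std::numeric_limits<int32_t>::min(), -1, 0, 1,
                          std::numeric_limits<int32_t>::max()};
  for (int32_t a : Vals)
    for (int32_t b : Vals)
      if (selectSgt(a, b) != selectSge(a, b))
        return false;
  return true;
}
```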

				; m1 = smax(a,b) ; has side uses
				; m2 = smax(umax((b,c), a) ; check that signed and unsigned maxs are not mixed
				define i32 @smax_test6(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smax_test6(
				; CHECK-NEXT: [[C1:%.*]] = icmp sgt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[C2:%.*]] = icmp ugt i32 [[B]], [[C:%.*]]
				; CHECK-NEXT: [[UMAX2:%.*]] = select i1 [[C2]], i32 [[B]], i32 [[C]]
				; CHECK-NEXT: [[C3:%.*]] = icmp sgt i32 [[UMAX2]], [[A]]
				; CHECK-NEXT: [[SMAX3:%.*]] = select i1 [[C3]], i32 [[UMAX2]], i32 [[A]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMAX1]], [[SMAX3]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp sgt i32 %a, %b
				%smax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ugt i32 %b, %c
				%umax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp sgt i32 %umax2, %a
				%smax3 = select i1 %c3, i32 %umax2, i32 %a
				%res = add i32 %smax1, %smax3
				ret i32 %res
				}
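`smax_test6` expects the pass to leave `smax(umax(b, c), a)` untouched. A concrete counterexample shows why regrouping across signedness would be wrong. The sketch below is standalone C++ with hypothetical helpers (`smaxOf`, `umaxOf`, `mixedForm`, `wronglyReassociated`); `umaxOf` compares the operands as unsigned 32-bit values and returns the winning operand unchanged, as an unsigned compare plus `select` would.

```cpp
#include <algorithm>
#include <cstdint>

inline int32_t smaxOf(int32_t x, int32_t y) { return std::max(x, y); }

// Unsigned comparison on the same bit patterns; returns the chosen operand.
inline int32_t umaxOf(int32_t x, int32_t y) {
  return static_cast<uint32_t>(x) > static_cast<uint32_t>(y) ? x : y;
}

// The shape in smax_test6: smax(umax(b, c), a).
inline int32_t mixedForm(int32_t a, int32_t b, int32_t c) {
  return smaxOf(umaxOf(b, c), a);
}

// What regrouping around m1 = smax(a, b) would compute: smax(m1, c).
inline int32_t wronglyReassociated(int32_t a, int32_t b, int32_t c) {
  return smaxOf(smaxOf(a, b), c);
}
```

With `a = 0`, `b = -1`, `c = 1`, the unsigned max picks `b` (all bits set), so the mixed form yields 0 while the regrouped form yields 1.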

				; m1 = smax(a,b) ; has side uses
				; m2 = smax(smin((b,c), a) ; check that max and min are not mixed
				define i32 @smax_test7(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smax_test7(
				; CHECK-NEXT: [[C1:%.*]] = icmp sgt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[C2:%.*]] = icmp slt i32 [[B]], [[C:%.*]]
				; CHECK-NEXT: [[SMIN2:%.*]] = select i1 [[C2]], i32 [[B]], i32 [[C]]
				; CHECK-NEXT: [[C3:%.*]] = icmp slt i32 [[SMIN2]], [[A]]
				; CHECK-NEXT: [[SMAX3:%.*]] = select i1 [[C3]], i32 [[SMIN2]], i32 [[A]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMAX1]], [[SMAX3]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp sgt i32 %a, %b
				%smax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp slt i32 %b, %c
				%smin2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp slt i32 %smin2, %a
				%smax3 = select i1 %c3, i32 %smin2, i32 %a
				%res = add i32 %smax1, %smax3
				ret i32 %res
				}
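The comment on `smax_test7` describes the shape `smax(smin(b, c), a)` and expects no rewrite. As with mixed signedness, a single counterexample shows that a min nested inside a max cannot be regrouped as if both were max. This is a standalone sketch; `maxOfMin` and `regroupedAsMax` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>

// The shape named in the test comment: smax(smin(b, c), a).
inline int32_t maxOfMin(int32_t a, int32_t b, int32_t c) {
  return std::max(std::min(b, c), a);
}

// What regrouping around m1 = smax(a, b) would compute: smax(m1, c).
inline int32_t regroupedAsMax(int32_t a, int32_t b, int32_t c) {
  return std::max(std::max(a, b), c);
}
```

For `a = 0`, `b = 5`, `c = 1` the first form gives 1 and the second gives 5.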

llvm/test/Transforms/NaryReassociate/nary-smin.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -nary-reassociate -S | FileCheck %s
				; RUN: opt < %s -passes='nary-reassociate' -S | FileCheck %s

				declare i32 @llvm.smin.i32(i32 %a, i32 %b)

				; m1 = smin(a,b) ; has side uses
				; m2 = smin(smin((b,c), a) -> m2 = smin(m1, c)
				define i32 @smin_test1(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smin_test1(
				; CHECK-NEXT: [[C1:%.*]] = icmp slt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[SMIN1]], [[C:%.*]]
				; CHECK-NEXT: [[SMIN3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMIN1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMIN1]], [[SMIN3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp slt i32 %a, %b
				%smin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp slt i32 %b, %c
				%smin2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp slt i32 %smin2, %a
				%smin3 = select i1 %c3, i32 %smin2, i32 %a
				%res = add i32 %smin1, %smin3
				ret i32 %res
				}

				; m1 = smin(a,b) ; has side uses
				; m2 = smin(b, (smin(a, c))) -> m2 = smin(m1, c)
				define i32 @smin_test2(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smin_test2(
				; CHECK-NEXT: [[C1:%.*]] = icmp slt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[SMIN1]], [[C:%.*]]
				; CHECK-NEXT: [[SMIN3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMIN1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMIN1]], [[SMIN3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp slt i32 %a, %b
				%smin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp slt i32 %a, %c
				%smin2 = select i1 %c2, i32 %a, i32 %c
				%c3 = icmp slt i32 %b, %smin2
				%smin3 = select i1 %c3, i32 %b, i32 %smin2
				%res = add i32 %smin1, %smin3
				ret i32 %res
				}

				; Same test as smin_test1 but uses @llvm.smin intrinsic
				define i32 @smin_test3(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smin_test3(
				; CHECK-NEXT: [[SMIN1:%.*]] = call i32 @llvm.smin.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[SMIN1]], [[C:%.*]]
				; CHECK-NEXT: [[SMIN3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMIN1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMIN1]], [[SMIN3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%smin1 = call i32 @llvm.smin.i32(i32 %a, i32 %b)
				%smin2 = call i32 @llvm.smin.i32(i32 %b, i32 %c)
				%smin3 = call i32 @llvm.smin.i32(i32 %smin2, i32 %a)
				%res = add i32 %smin1, %smin3
				ret i32 %res
				}

				; m1 = smin(a,b) ; has side uses
				; m2 = smin(smin_or_eq((b,c), a) -> m2 = smin(m1, c)
				define i32 @umin_test4(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umin_test4(
				; CHECK-NEXT: [[C1:%.*]] = icmp slt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[SMIN1]], [[C:%.*]]
				; CHECK-NEXT: [[SMIN3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMIN1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMIN1]], [[SMIN3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp slt i32 %a, %b
				%smin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp sle i32 %b, %c
				%smin_or_eq2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp slt i32 %smin_or_eq2, %a
				%smin3 = select i1 %c3, i32 %smin_or_eq2, i32 %a
				%res = add i32 %smin1, %smin3
				ret i32 %res
				}

				; m1 = smin_or_eq(a,b) ; has side uses
				; m2 = smin_or_eq(smin((b,c), a) -> m2 = smin(m1, c)
				define i32 @smin_test5(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smin_test5(
				; CHECK-NEXT: [[C1:%.*]] = icmp sle i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMIN_OR_EQ1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[SMIN_OR_EQ1]], [[C:%.*]]
				; CHECK-NEXT: [[SMIN_OR_EQ3_NARY:%.*]] = select i1 [[TMP1]], i32 [[SMIN_OR_EQ1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMIN_OR_EQ1]], [[SMIN_OR_EQ3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp sle i32 %a, %b
				%smin_or_eq1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp slt i32 %b, %c
				%smin2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp sle i32 %smin2, %a
				%smin_or_eq3 = select i1 %c3, i32 %smin2, i32 %a
				%res = add i32 %smin_or_eq1, %smin_or_eq3
				ret i32 %res
				}

				; m1 = smin(a,b) ; has side uses
				; m2 = smin(umin((b,c), a) ; check that signed and unsigned mins are not mixed
				define i32 @smin_test6(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smin_test6(
				; CHECK-NEXT: [[C1:%.*]] = icmp slt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[C2:%.*]] = icmp ult i32 [[B]], [[C:%.*]]
				; CHECK-NEXT: [[UMIN2:%.*]] = select i1 [[C2]], i32 [[B]], i32 [[C]]
				; CHECK-NEXT: [[C3:%.*]] = icmp slt i32 [[UMIN2]], [[A]]
				; CHECK-NEXT: [[SMIN3:%.*]] = select i1 [[C3]], i32 [[UMIN2]], i32 [[A]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMIN1]], [[SMIN3]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp slt i32 %a, %b
				%smin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ult i32 %b, %c
				%umin2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp slt i32 %umin2, %a
				%smin3 = select i1 %c3, i32 %umin2, i32 %a
				%res = add i32 %smin1, %smin3
				ret i32 %res
				}

				; m1 = smin(a,b) ; has side uses
				; m2 = smin(smax((b,c), a) ; check that min and max are not mixed
				define i32 @smin_test7(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @smin_test7(
				; CHECK-NEXT: [[C1:%.*]] = icmp slt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[SMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[C2:%.*]] = icmp sgt i32 [[B]], [[C:%.*]]
				; CHECK-NEXT: [[SMAX2:%.*]] = select i1 [[C2]], i32 [[B]], i32 [[C]]
				; CHECK-NEXT: [[C3:%.*]] = icmp slt i32 [[SMAX2]], [[A]]
				; CHECK-NEXT: [[SMIN3:%.*]] = select i1 [[C3]], i32 [[SMAX2]], i32 [[A]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[SMIN1]], [[SMIN3]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp slt i32 %a, %b
				%smin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp sgt i32 %b, %c
				%smax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp slt i32 %smax2, %a
				%smin3 = select i1 %c3, i32 %smax2, i32 %a
				%res = add i32 %smin1, %smin3
				ret i32 %res
				}

llvm/test/Transforms/NaryReassociate/nary-umax.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -nary-reassociate -S | FileCheck %s
				; RUN: opt < %s -passes='nary-reassociate' -S | FileCheck %s

				declare i32 @llvm.umax.i32(i32 %a, i32 %b)

				; m1 = umax(a,b) ; has side uses
				; m2 = umax(umax((b,c), a) -> m2 = umax(m1, c)
				define i32 @umax_test1(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umax_test1(
				; CHECK-NEXT: [[C1:%.*]] = icmp ugt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[UMAX1]], [[C:%.*]]
				; CHECK-NEXT: [[UMAX3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMAX1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMAX1]], [[UMAX3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ugt i32 %a, %b
				%umax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ugt i32 %b, %c
				%umax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ugt i32 %umax2, %a
				%umax3 = select i1 %c3, i32 %umax2, i32 %a
				%res = add i32 %umax1, %umax3
				ret i32 %res
				}

				; m1 = umax(a,b) ; has side uses
				; m2 = umax(b, (umax(a, c))) -> m2 = umax(m1, c)
				define i32 @umax_test2(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umax_test2(
				; CHECK-NEXT: [[C1:%.*]] = icmp ugt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[UMAX1]], [[C:%.*]]
				; CHECK-NEXT: [[UMAX3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMAX1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMAX1]], [[UMAX3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ugt i32 %a, %b
				%umax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ugt i32 %a, %c
				%umax2 = select i1 %c2, i32 %a, i32 %c
				%c3 = icmp ugt i32 %b, %umax2
				%umax3 = select i1 %c3, i32 %b, i32 %umax2
				%res = add i32 %umax1, %umax3
				ret i32 %res
				}

				; Same test as umax_test1 but uses @llvm.umax intrinsic
				define i32 @umax_test3(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umax_test3(
				; CHECK-NEXT: [[UMAX1:%.*]] = call i32 @llvm.umax.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[UMAX1]], [[C:%.*]]
				; CHECK-NEXT: [[UMAX3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMAX1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMAX1]], [[UMAX3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%umax1 = call i32 @llvm.umax.i32(i32 %a, i32 %b)
				%umax2 = call i32 @llvm.umax.i32(i32 %b, i32 %c)
				%umax3 = call i32 @llvm.umax.i32(i32 %umax2, i32 %a)
				%res = add i32 %umax1, %umax3
				ret i32 %res
				}

				; m1 = umax(a,b) ; has side uses
				; m2 = umax(umax_or_eq((b,c), a) -> m2 = umax(m1, c)
				define i32 @umax_test4(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umax_test4(
				; CHECK-NEXT: [[C1:%.*]] = icmp ugt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[UMAX1]], [[C:%.*]]
				; CHECK-NEXT: [[UMAX3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMAX1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMAX1]], [[UMAX3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ugt i32 %a, %b
				%umax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp uge i32 %b, %c
				%umax_or_eq2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ugt i32 %umax_or_eq2, %a
				%umax3 = select i1 %c3, i32 %umax_or_eq2, i32 %a
				%res = add i32 %umax1, %umax3
				ret i32 %res
				}

				; m1 = umax_or_eq(a,b) ; has side uses
				; m2 = umax_or_eq(umax((b,c), a) -> m2 = umax(m1, c)
				define i32 @umax_test5(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umax_test5(
				; CHECK-NEXT: [[C1:%.*]] = icmp uge i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMAX_OR_EQ1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[UMAX_OR_EQ1]], [[C:%.*]]
				; CHECK-NEXT: [[UMAX_OR_EQ3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMAX_OR_EQ1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMAX_OR_EQ1]], [[UMAX_OR_EQ3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp uge i32 %a, %b
				%umax_or_eq1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ugt i32 %b, %c
				%umax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp uge i32 %umax2, %a
				%umax_or_eq3 = select i1 %c3, i32 %umax2, i32 %a
				%res = add i32 %umax_or_eq1, %umax_or_eq3
				ret i32 %res
				}

				; m1 = umax(a,b) ; has side uses
				; m2 = umax(smax((b,c), a) ; check that signed and unsigned maxs are not mixed
				define i32 @umax_test6(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umax_test6(
				; CHECK-NEXT: [[C1:%.*]] = icmp ugt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[C2:%.*]] = icmp sgt i32 [[B]], [[C:%.*]]
				; CHECK-NEXT: [[SMAX2:%.*]] = select i1 [[C2]], i32 [[B]], i32 [[C]]
				; CHECK-NEXT: [[C3:%.*]] = icmp ugt i32 [[SMAX2]], [[A]]
				; CHECK-NEXT: [[UMAX3:%.*]] = select i1 [[C3]], i32 [[SMAX2]], i32 [[A]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMAX1]], [[UMAX3]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ugt i32 %a, %b
				%umax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp sgt i32 %b, %c
				%smax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ugt i32 %smax2, %a
				%umax3 = select i1 %c3, i32 %smax2, i32 %a
				%res = add i32 %umax1, %umax3
				ret i32 %res
				}

				; m1 = umax(a,b) ; has side uses
				; m2 = umax(umin((b,c), a) ; check that max and min are not mixed
				define i32 @umax_test7(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umax_test7(
				; CHECK-NEXT: [[C1:%.*]] = icmp ugt i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMAX1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[C2:%.*]] = icmp ult i32 [[B]], [[C:%.*]]
				; CHECK-NEXT: [[UMAX2:%.*]] = select i1 [[C2]], i32 [[B]], i32 [[C]]
				; CHECK-NEXT: [[C3:%.*]] = icmp ugt i32 [[UMAX2]], [[A]]
				; CHECK-NEXT: [[UMAX3:%.*]] = select i1 [[C3]], i32 [[UMAX2]], i32 [[A]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMAX1]], [[UMAX3]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ugt i32 %a, %b
				%umax1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ult i32 %b, %c
				%umax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ugt i32 %umax2, %a
				%umax3 = select i1 %c3, i32 %umax2, i32 %a
				%res = add i32 %umax1, %umax3
				ret i32 %res
				}

llvm/test/Transforms/NaryReassociate/nary-umin.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -nary-reassociate -S | FileCheck %s
				; RUN: opt < %s -passes='nary-reassociate' -S | FileCheck %s

				declare i32 @llvm.umin.i32(i32 %a, i32 %b)

				; m1 = umin(a,b) ; has side uses
				; m2 = umin(umin((b,c), a) -> m2 = umin(m1, c)
				define i32 @umin_test1(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umin_test1(
				; CHECK-NEXT: [[C1:%.*]] = icmp ult i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[UMIN1]], [[C:%.*]]
				; CHECK-NEXT: [[UMIN3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMIN1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMIN1]], [[UMIN3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ult i32 %a, %b
				%umin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ult i32 %b, %c
				%umin2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ult i32 %umin2, %a
				%umin3 = select i1 %c3, i32 %umin2, i32 %a
				%res = add i32 %umin1, %umin3
				ret i32 %res
				}

				; m1 = umin(a,b) ; has side uses
				; m2 = umin(b, (umin(a, c))) -> m2 = umin(m1, c)
				define i32 @umin_test2(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umin_test2(
				; CHECK-NEXT: [[C1:%.*]] = icmp ult i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[UMIN1]], [[C:%.*]]
				; CHECK-NEXT: [[UMIN3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMIN1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMIN1]], [[UMIN3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ult i32 %a, %b
				%umin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ult i32 %a, %c
				%umin2 = select i1 %c2, i32 %a, i32 %c
				%c3 = icmp ult i32 %b, %umin2
				%umin3 = select i1 %c3, i32 %b, i32 %umin2
				%res = add i32 %umin1, %umin3
				ret i32 %res
				}

				; Same test as umin_test1 but uses @llvm.umin intrinsic
				define i32 @umin_test3(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umin_test3(
				; CHECK-NEXT: [[UMIN1:%.*]] = call i32 @llvm.umin.i32(i32 [[A:%.*]], i32 [[B:%.*]])
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[UMIN1]], [[C:%.*]]
				; CHECK-NEXT: [[UMIN3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMIN1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMIN1]], [[UMIN3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%umin1 = call i32 @llvm.umin.i32(i32 %a, i32 %b)
				%umin2 = call i32 @llvm.umin.i32(i32 %b, i32 %c)
				%umin3 = call i32 @llvm.umin.i32(i32 %umin2, i32 %a)
				%res = add i32 %umin1, %umin3
				ret i32 %res
				}

				; m1 = umin(a,b) ; has side uses
				; m2 = umin(umin_or_eq((b,c), a) -> m2 = umin(m1, c)
				define i32 @umin_test4(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umin_test4(
				; CHECK-NEXT: [[C1:%.*]] = icmp ult i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[UMIN1]], [[C:%.*]]
				; CHECK-NEXT: [[UMIN3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMIN1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMIN1]], [[UMIN3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ult i32 %a, %b
				%umin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ule i32 %b, %c
				%umin_or_eq2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ult i32 %umin_or_eq2, %a
				%umin3 = select i1 %c3, i32 %umin_or_eq2, i32 %a
				%res = add i32 %umin1, %umin3
				ret i32 %res
				}
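
The rewrite checked by umin_test4 depends on unsigned min being associative and commutative, and on the strict (`ult`) and non-strict (`ule`) select forms selecting the same value. A minimal sketch that brute-forces this identity over a small bit width is given below; the helper names (`umin`, `umin_or_eq`, `reassociation_holds`) are illustrative only and are not part of the patch.

```python
from itertools import product


def umin(x, y):
    """Unsigned min as a select on a strict compare (icmp ult)."""
    return x if x < y else y


def umin_or_eq(x, y):
    """Unsigned min as a select on a non-strict compare (icmp ule)."""
    return x if x <= y else y


def reassociation_holds(bits=4):
    """Check umin(umin_or_eq(b, c), a) == umin(umin(a, b), c) for all inputs."""
    values = range(1 << bits)
    for a, b, c in product(values, repeat=3):
        original = umin(umin_or_eq(b, c), a)   # shape of %umin3 in the test input
        rewritten = umin(umin(a, b), c)        # shape of the expected output
        if original != rewritten:
            return False
    return True
```

The check is exhaustive only for the chosen bit width; it illustrates why the rewrite is value-preserving and says nothing about how the pass finds the candidate.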

				; m1 = umin_or_eq(a,b) ; has side uses
				; m2 = umin_or_eq(umin(b,c), a) -> m2 = umin(m1, c)
				define i32 @umin_test5(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umin_test5(
				; CHECK-NEXT: [[C1:%.*]] = icmp ule i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMIN_OR_EQ1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[UMIN_OR_EQ1]], [[C:%.*]]
				; CHECK-NEXT: [[UMIN_OR_EQ3_NARY:%.*]] = select i1 [[TMP1]], i32 [[UMIN_OR_EQ1]], i32 [[C]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMIN_OR_EQ1]], [[UMIN_OR_EQ3_NARY]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ule i32 %a, %b
				%umin_or_eq1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ult i32 %b, %c
				%umin2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ule i32 %umin2, %a
				%umin_or_eq3 = select i1 %c3, i32 %umin2, i32 %a
				%res = add i32 %umin_or_eq1, %umin_or_eq3
				ret i32 %res
				}

				; m1 = umin(a,b) ; has side uses
				; m2 = umin(smin(b,c), a) ; check that signed and unsigned mins are not mixed
				define i32 @umin_test6(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umin_test6(
				; CHECK-NEXT: [[C1:%.*]] = icmp ult i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[C2:%.*]] = icmp slt i32 [[B]], [[C:%.*]]
				; CHECK-NEXT: [[SMIN2:%.*]] = select i1 [[C2]], i32 [[B]], i32 [[C]]
				; CHECK-NEXT: [[C3:%.*]] = icmp ult i32 [[SMIN2]], [[A]]
				; CHECK-NEXT: [[UMIN3:%.*]] = select i1 [[C3]], i32 [[SMIN2]], i32 [[A]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMIN1]], [[UMIN3]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ult i32 %a, %b
				%umin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp slt i32 %b, %c
				%smin2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ult i32 %smin2, %a
				%umin3 = select i1 %c3, i32 %smin2, i32 %a
				%res = add i32 %umin1, %umin3
				ret i32 %res
				}
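
umin_test6 expects the IR to stay unchanged because a signed min nested inside an unsigned min cannot be reassociated as if both were unsigned. A small sketch that searches for a counterexample at a reduced bit width follows; the helpers (`to_signed`, `smin`, `find_counterexample`) and the 4-bit width are assumptions made for illustration.

```python
from itertools import product


def to_signed(x, bits):
    """Reinterpret an unsigned bit pattern as a two's-complement value."""
    return x - (1 << bits) if x >= 1 << (bits - 1) else x


def umin(x, y):
    """Unsigned min (icmp ult + select)."""
    return x if x < y else y


def smin(x, y, bits):
    """Signed min (icmp slt + select) on unsigned bit patterns."""
    return x if to_signed(x, bits) < to_signed(y, bits) else y


def find_counterexample(bits=4):
    """Return (a, b, c) where umin(smin(b, c), a) != umin(umin(a, b), c)."""
    for a, b, c in product(range(1 << bits), repeat=3):
        if umin(smin(b, c, bits), a) != umin(umin(a, b), c):
            return a, b, c
    return None
```

Any triple returned by `find_counterexample` shows that rewriting the expression to `umin(m1, c)` would change the result, which is why the test expects no transformation.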

				; m1 = umin(a,b) ; has side uses
				; m2 = umin(umax(b,c), a) ; check that min and max are not mixed
				define i32 @umin_test7(i32 %a, i32 %b, i32 %c) {
				; CHECK-LABEL: @umin_test7(
				; CHECK-NEXT: [[C1:%.*]] = icmp ult i32 [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[UMIN1:%.*]] = select i1 [[C1]], i32 [[A]], i32 [[B]]
				; CHECK-NEXT: [[C2:%.*]] = icmp ugt i32 [[B]], [[C:%.*]]
				; CHECK-NEXT: [[UMAX2:%.*]] = select i1 [[C2]], i32 [[B]], i32 [[C]]
				; CHECK-NEXT: [[C3:%.*]] = icmp ult i32 [[UMAX2]], [[A]]
				; CHECK-NEXT: [[UMIN3:%.*]] = select i1 [[C3]], i32 [[UMAX2]], i32 [[A]]
				; CHECK-NEXT: [[RES:%.*]] = add i32 [[UMIN1]], [[UMIN3]]
				; CHECK-NEXT: ret i32 [[RES]]
				;
				%c1 = icmp ult i32 %a, %b
				%umin1 = select i1 %c1, i32 %a, i32 %b
				%c2 = icmp ugt i32 %b, %c
				%umax2 = select i1 %c2, i32 %b, i32 %c
				%c3 = icmp ult i32 %umax2, %a
				%umin3 = select i1 %c3, i32 %umax2, i32 %a
				%res = add i32 %umin1, %umin3
				ret i32 %res
				}
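
umin_test7 is the analogous negative case for mixing min with max. A brief sketch with one hand-picked input shows why `umin(umax(b, c), a)` must not be rewritten to `umin(umin(a, b), c)`; the function names and the sample values are illustrative assumptions.

```python
def umin(x, y):
    """Unsigned min (icmp ult + select)."""
    return x if x < y else y


def umax(x, y):
    """Unsigned max (icmp ugt + select)."""
    return x if x > y else y


def original_form(a, b, c):
    """Shape of %umin3 in the test input: umin(umax(b, c), a)."""
    return umin(umax(b, c), a)


def wrongly_rewritten_form(a, b, c):
    """What a min-only reassociation would produce: umin(umin(a, b), c)."""
    return umin(umin(a, b), c)


# Sample input where the two forms disagree.
a, b, c = 5, 1, 3
mismatch = original_form(a, b, c) != wrongly_rewritten_form(a, b, c)
```

With these values the original form selects `3` while the min-only form selects `1`, so the pass has to leave the instructions as they are.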

[NARY-REASSOCIATE] Support reassociation of min/max
Closed, Public

Diff 294253

llvm/include/llvm/IR/PatternMatch.h

llvm/include/llvm/Transforms/Scalar/NaryReassociate.h

llvm/lib/Transforms/Scalar/NaryReassociate.cpp

llvm/test/Transforms/NaryReassociate/nary-smax.ll

llvm/test/Transforms/NaryReassociate/nary-smin.ll

llvm/test/Transforms/NaryReassociate/nary-umax.ll

llvm/test/Transforms/NaryReassociate/nary-umin.ll
