This is an archive of the discontinued LLVM Phabricator instance.

Differential D99354

[SimpleLoopUnswitch] Port partially invariant unswitch from LoopUnswitch to SimpleLoopUnswitch
Closed · Public

Authored by jaykang10 on Mar 25 2021, 10:30 AM.

Details

Reviewers

fhahn
sanwou01
jdoerfert
jonpa
chandlerc

Commits

rGf3a27511c9f8: [SimpleLoopUnswitch] Port partially invariant unswitch from LoopUnswitch to…
rG88b259c01463: [SimpleLoopUnswitch] Port partially invariant unswitch from LoopUnswitch to…

Summary

Partially invariant unswitching has already been implemented in LoopUnswitch (https://reviews.llvm.org/D93764). We need to port the feature to SimpleLoopUnswitch for the new pass manager. It is related to https://bugs.llvm.org/show_bug.cgi?id=49128
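
For readers unfamiliar with the transformation, here is a minimal source-level sketch of what a partially invariant condition looks like and what unswitching on it achieves. It is only an illustration under assumptions: the function names `sumOriginal` and `sumUnswitched` are hypothetical, and the real pass works on LLVM IR with MemorySSA rather than on C++ source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example. The branch condition depends on a load of *flag.
// The only store to *flag sits on the "non-zero" path, so if *flag is zero
// on entry it stays zero for every iteration: the condition is invariant
// along that path only ("partially invariant").
int sumOriginal(const std::vector<int> &data, int *flag) {
  int sum = 0;
  for (int v : data) {
    if (*flag != 0) { // re-loaded and re-tested on every iteration
      *flag = v;      // the only write to *flag inside the loop
      sum += 2 * v;
    } else {
      sum += v;
    }
  }
  return sum;
}

// Hand-written equivalent of the unswitched form: test the condition once
// before the loop, run a specialized loop when it is known to stay false,
// and fall back to the original loop otherwise.
int sumUnswitched(const std::vector<int> &data, int *flag) {
  if (*flag == 0) {
    int sum = 0;
    for (int v : data) // no load, no branch: nothing here writes *flag
      sum += v;
    return sum;
  }
  return sumOriginal(data, flag); // the condition may still change here
}
```

Both functions return the same result and leave `*flag` in the same state; the specialized loop simply avoids the per-iteration load and branch.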

Diff Detail

Event Timeline

Herald added a project: Restricted Project. Mar 25 2021, 10:30 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B95719: Diff 333339. Mar 25 2021, 11:10 AM

jaykang10 added reviewers: jdoerfert, jonpa. Mar 26 2021, 1:37 AM

Thanks for putting up this patch! Would it be possible to move the detection code (hasPartialIVCondition) somewhere, so both versions of unswitching can use the same code?

In D99354#2652524, @fhahn wrote:

Thanks for putting up this patch! Would it be possible to move the detection code (hasPartialIVCondition) somewhere, so both versions of unswitching can use the same code?

Yep, I also wanted to share it, so I have tried to keep the existing implementation as much as possible. I am not sure which place is good for this. Maybe somewhere in lib/Transforms/Utils?

Moved hasPartialIVCondition to LoopUtils.h following comment of @fhahn

Harbormaster completed remote builds in B95998: Diff 333716. Mar 28 2021, 5:27 AM

jaykang10 updated this revision to Diff 333782. Mar 29 2021, 1:12 AM

Harbormaster completed remote builds in B96054: Diff 333782. Mar 29 2021, 1:13 AM

Thank you for looking into this!
Unhelpful comment: have you considered splitting this into two patches, moving code from LoopUnswitch to a common place, and enhancing SimpleLoopUnswitch ?

In D99354#2655155, @lebedev.ri wrote:

Thank you for looking into this!
Unhelpful comment: have you considered splitting this into two patches, moving code from LoopUnswitch to a common place, and enhancing SimpleLoopUnswitch ?

You are right! Let me split this patch into two patches.

jaykang10 mentioned this in D99490: [NFC][LoopUnswitch] Move hasPartialIVCondition to LoopUtils. Mar 29 2021, 2:17 AM

Following comment of @lebedev.ri, split previous patch into two patches. This one works on top of https://reviews.llvm.org/D99490

lebedev.ri added inline comments. Mar 29 2021, 2:28 AM

llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll
1	Precommit this?

jaykang10 added inline comments. Mar 29 2021, 2:42 AM

llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll
1	Sorry... This test file is a copy of `test/Transforms/LoopUnswitch/partial-unswitch.ll` using SimpleLoopUnswitch. The output of SimpleLoopUnswitch is slightly different from the LoopUnswitch one, so I ran the script to generate the assertions. I have checked each test's output. If it is not good to use the script, I will add the assertions manually. Please let me know.

lebedev.ri added inline comments. Mar 29 2021, 2:47 AM

llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll
1	No, I mean just commit this test file as-is, obviously after regenerating the check lines to pass on main.

jaykang10 added inline comments. Mar 29 2021, 2:51 AM

llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll
1	Ah, sorry. Let me commit this test separately after checking it again.

jaykang10 mentioned this in D99493: [SimpleLoopUnswitch] Add tests to check partially invariant unswitch. Mar 29 2021, 3:16 AM

Harbormaster completed remote builds in B96062: Diff 333792. Mar 29 2021, 3:18 AM

Following comment of @lebedev.ri, moved the test to separate patch.

Harbormaster completed remote builds in B96069: Diff 333804. Mar 29 2021, 4:07 AM

Rebased

Rebase

Harbormaster completed remote builds in B96094: Diff 333833. Mar 29 2021, 6:35 AM

Harbormaster completed remote builds in B96097: Diff 333837. Mar 29 2021, 7:05 AM

jaykang10 updated this revision to Diff 334076. Mar 30 2021, 1:49 AM

Harbormaster completed remote builds in B96263: Diff 334076. Mar 30 2021, 2:39 AM

Any comments please?

Any objection to pushing this change, please? Or can someone let me know whether I need to do something more for this change?

jaykang10 added a reviewer: chandlerc. Apr 6 2021, 5:28 AM

Rebased

Harbormaster completed remote builds in B97518: Diff 335821. Apr 7 2021, 9:16 AM

@fhahn Can you review this change when you have time please?

Does this patch give the expected speedup on omnetpp?

In D99354#2676137, @jaykang10 wrote:

@fhahn Can you review this change when you have time please?

please keep the 'common courtesy ping rate' of 1 week in mind (https://llvm.org/docs/CodeReview.html#code-reviews-speed-and-reciprocity), also considering that doing a proper review can take a substantial amount of time for a reviewer and there can be a number of reasons for a delayed response (like holidays, other urgent work).

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
212	`Invariants` here is not really accurate I think. Those are the instructions we need to duplicate outside the loop, right?
219	nit: perhaps a bit simpler `NewInst->insertBefore(BB->getTerminator())`
2015	`struct` not needed.
2027–2029	I think we should probably update `IVConditionInfo` to provide a better way to check if there are partially invariant conditions, e.g. an `isPartiallyInvariant` accessor. WDYT?
2151	nit: `In the partially invariant case, if UnswitchedSuccBB is an exit block, do not ....`
2153	no need for `llvm::`, also can we just capture `[&SuccBB]`?
2366	We don't need to execute this loop at all for the `PartiallyInvariant` case, right?
2683	`struct` not needed.
2736	nit: no `llvm::` should be needed.
2740	nit: no `llvm::` should be needed
2868	Do we need to check here that we only do this for blocks where we partially unswitch on their conditions?
2947–2948	this change seems unrelated?
3151	can you add a comment here to explain the check?

In D99354#2678873, @fhahn wrote:

Does this patch give the expected speedup on omnetpp?

Yep, it gives the expected speedup on omnetpp.

In D99354#2676137, @jaykang10 wrote:

@fhahn Can you review this change when you have time please?

please keep the 'common courtesy ping rate' of 1 week in mind (https://llvm.org/docs/CodeReview.html#code-reviews-speed-and-reciprocity), also considering that doing a proper review can take a substantial amount of time for a reviewer and there can be a number of reasons for a delayed response (like holidays, other urgent work).

I am sorry for the inconvenience. I will follow the review guidelines.

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
212	Yep, I will change it to `ToDuplicate`.
219	The BB is an empty block, so we cannot use `BB.getTerminator()` here.
2015	Yep, I will remove it.
2027–2029	I agree with you. It is better to provide `isPartiallyInvariant` in `IVConditionInfo`. If possible, it could be good to create a separate patch for it after pushing this patch.
2151	Yep, I will update it.
2153	Yep, I will update it.
2366	Yep, you are right! I will update it.
2683	Yep, I will update it.
2736	Yep, I will update it.
2740	Yep, I will update it.
2868	We need to check the cost of the successors which are duplicated. Only the blocks that are partially unswitched on their conditions are duplicated.
2947–2948	Ah, I ran clang-format. It is the result of clang-format.
3151	Yep, I will add a comment.

Following comments of @fhahn, updated patch.

Harbormaster completed remote builds in B97932: Diff 336388. Apr 9 2021, 4:58 AM

Thanks for the latest update! I gave the patch a try on SPEC2006 on X86 with -O3 -flto and it looks like there's a crash when building 445.gobmk.

This revision now requires changes to proceed. Apr 15 2021, 8:13 AM

In D99354#2691707, @fhahn wrote:

Thanks for the latest update! I gave the patch a try on SPEC2006 on X86 with -O3 -flto and it looks like there's a crash when building 445.gobmk.

Ah, sorry, I may have made a mistake while rebasing and updating the change. Let me check.

Fixed a bug

In D99354#2694514, @jaykang10 wrote:

Fixed a bug

Could you add a test case for the problem you fixed?

@fhahn There was a bug. I have fixed it. I have re-run spec2006 on x86 and it was fine.

In D99354#2694518, @fhahn wrote:

Could you add a test case for the problem you fixed?

Yep, let me try to add it.

Harbormaster completed remote builds in B99152: Diff 338080. Apr 16 2021, 6:46 AM

Added a test case for previous bug

Harbormaster completed remote builds in B99196: Diff 338138. Apr 16 2021, 10:53 AM

@fhahn Is it ok to push this change? If you need something more, please let me know.

Rebased

Harbormaster completed remote builds in B99983: Diff 339237. Apr 21 2021, 8:28 AM

@fhahn Can we push this change please? If you need something more, please let me know.

@fhahn I think it is ready to push this change.

@fhahn Sorry for ping.

One final round of comments, after that I think this LG.

Could you also add a test for the threshold (like llvm/test/Transforms/LoopUnswitch/partial-unswitch-mssa-threshold.ll) and the interesting MemorySSA update cases from llvm/test/Transforms/LoopUnswitch/partial-unswitch-update-memoryssa.ll?

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
210	nit: `Copy a set of loop invariant values \p ToDuplicate and insert them at the end of \p BB....`? ... and conditional branch on the copied condition.` We only branch on a single value.
2947–2948	ok, can you undo it? As it is not related to the change.

fhahn added inline comments. Apr 27 2021, 3:22 AM

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
2027–2029	Sounds good, looking forward to a follow-up!
2053	message in assert needs updating?
2735–2749	enough to capture `[this]`?
2744	Why `TinyPtrVector` here? In most cases we will have at least 2 instructions here anyways? Probably better to use a `SmallVector`?
2747	`emplace_back`?

In D99354#2719131, @fhahn wrote:

One final round of comments, after that I think this LG.

Could you also add a test for the threshold (like llvm/test/Transforms/LoopUnswitch/partial-unswitch-mssa-threshold.ll) and the interesting MemorySSA update cases from llvm/test/Transforms/LoopUnswitch/partial-unswitch-update-memoryssa.ll?

Yep, let me try to add the tests.

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
210	Yep, I will update it.
2027–2029	Yep
2053	Yep, I will update it.
2735–2749	This code is inside static function rather than class's member function so we can not use `[this]`. I will update it with `[&L]`.
2744	The SimpleLoopUnswitchPass uses `TinyPtrVector` for `UnswitchCandidates`. In order to follow the interface, `TinyPtrVector` is being used here.
2747	`TinyPtrVector` does not have emplace_back.
2947–2948	Yep, I will undo it.

Following comments of @fhahn, updated code and added tests.

Herald added subscribers: asbirlea, george.burgess.iv. Apr 28 2021, 2:48 AM

Harbormaster completed remote builds in B101360: Diff 341113. Apr 28 2021, 3:36 AM

LGTM, thanks! A few small comments that can be addressed before committing without further review I think.

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
3151	nit: perhaps `unswitched using a partially invariant condition, ...`?
llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll
1104	Could you add a brief comment and a more descriptive name for the test?
1143	It would be great if you could use descriptive labels in the test both for basic blocks and values, to make it easier to read in case people need to take a look in the future.
1150	please avoid using `null` as pointer, as this is UB

This revision is now accepted and ready to land. Apr 29 2021, 1:32 PM

In D99354#2726745, @fhahn wrote:

LGTM, thanks! A few small comments that can be addressed before committing without further review I think.

Thanks @fhahn! After updating code following your comments, I will push this change.

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
3151	Yep, I will update it.
llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll
1104	Yep, I will add them.
1143	Yep, I will update it.
1150	Yep, I will update it.

Following comments of @fhahn, updated code and test

This revision was landed with ongoing or failed builds. Apr 30 2021, 7:58 AM

Closed by commit rG88b259c01463: [SimpleLoopUnswitch] Port partially invariant unswitch from LoopUnswitch to… (authored by jaykang10).

This revision was automatically updated to reflect the committed changes.

jaykang10 added a commit: rG88b259c01463: [SimpleLoopUnswitch] Port partially invariant unswitch from LoopUnswitch to….

Harbormaster completed remote builds in B101930: Diff 341905. Apr 30 2021, 8:55 AM

It looks like this patch causes SimpleLoopUnswitch to not terminate on some inputs, see https://bugs.llvm.org/show_bug.cgi?id=50302 for an example

I think this causes https://bugs.llvm.org/show_bug.cgi?id=50279 as well

In D99354#2750756, @fhahn wrote:

I think this causes https://bugs.llvm.org/show_bug.cgi?id=50279 as well

@fhahn Thanks for letting me know. Let me have a look.

In D99354#2750822, @jaykang10 wrote:

@fhahn Thanks for letting me know. Let me have a look.

Thanks. If it's not straight-forward to resolve, it would be best to revert the patch for now.

In D99354#2754363, @fhahn wrote:

Thanks. If it's not straight-forward to resolve, it would be best to revert the patch for now.

@fhahn Thanks for the kind suggestion. I have figured out what causes the endless compilation. It looks like we need to move the partially invariant instructions rather than duplicating them. Once I resolve this issue, I need to check the benchmark scores again. Therefore, I will post a patch for review tomorrow or the day after tomorrow.

In D99354#2754454, @jaykang10 wrote:

@fhahn Thanks for the kind suggestion. I have figured out what causes the endless compilation. It looks like we need to move the partially invariant instructions rather than duplicating them. Once I resolve this issue, I need to check the benchmark scores again. Therefore, I will post a patch for review tomorrow or the day after tomorrow.

Great, thanks! It sounds like it would make sense to revert until this is resolved then, so there's no need to rush :)

In D99354#2754816, @fhahn wrote:

Great, thanks! It sounds like it would make sense to revert until this is resolved then, so there's no need to rush :)

Yep, I will revert it now. Additionally, I have seen that SimpleLoopUnswitch with this patch does not affect the performance of omnetpp in SPEC2017. I need to figure that out too. Once I have fixed these issues, I will let you know.

jaykang10 added a reverting change: rG107d19eb017f: Revert "[SimpleLoopUnswitch] Port partially invariant unswitch from…. May 13 2021, 12:41 AM

In D99354#2756257, @jaykang10 wrote:

Yep, I will revert it now. Additionally, I have seen that SimpleLoopUnswitch with this patch does not affect the performance of omnetpp in SPEC2017. I need to figure that out too. Once I have fixed these issues, I will let you know.

Thanks!

The below bugs are fixed.
https://bugs.llvm.org/show_bug.cgi?id=50279
https://bugs.llvm.org/show_bug.cgi?id=50302

Even though this patch does not add the processed loop to the loop pass manager again, later loop transformations can create a loop whose header contains the partially invariant instructions and add it to the loop pass manager again. The loop pass manager then runs SimpleLoopUnswitchPass on that loop again, which sometimes causes endless unswitching with some CFGs.

In order to avoid the endless unswitching, the new change removes the partially invariant instruction from the header, as shown below.

//    +--------------------+
//    |     preheader      |
//    |  %a = load %ptr    |
//    +--------------------+
//             |     /----------------\
//    +--------v----v------+          |
//    |      header        |---\      |
//    | %c = phi %a, %b    |   |      |
//    | %cond = cmp %c, .. |   |      |
//    | br %cond           |   |      |
//    +--------------------+   |      |
//             |               |      |
//    +--------v-----------+   |      |
//    |  store %ptr        |   |      |
//    +--------------------+   |      |
//             |               |      |
//    +--------v-----------<---/      |
//    |       latch        >----------/
//    |  %b = load %ptr    |
//    +--------------------+

@fhahn I have updated the diff, which fixes the bugs. I have added a phi node to the header in order to remove the partially invariant instructions from the header. If anything looks wrong, please let me know.
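
As a rough source-level reading of the diagram above, the sketch below contrasts a loop that loads in the header with one that loads once before the loop and again in the latch, carrying the value in a variable that plays the role of the phi. The function names, the comparison, and the stored value are all hypothetical stand-ins; only the shape (preheader load, phi, latch load) comes from the diagram.

```cpp
#include <cassert>

// Hypothetical "before" shape: the header loads *ptr on every iteration.
int storesHeaderLoad(int *ptr, int n) {
  int stores = 0;
  for (int i = 0; i < n; ++i) {
    int c = *ptr;      // load in the header
    if (c > i) {       // %cond = cmp %c, ..
      *ptr = c - 1;    // store %ptr on one path only
      ++stores;
    }
  }
  return stores;
}

// Hypothetical "after" shape: %a is loaded in the preheader, %b in the
// latch, and `c` stands in for %c = phi %a, %b. The header no longer
// contains a load. Note the preheader load runs even when n == 0, so this
// sketch assumes ptr is always valid to read.
int storesPhiForm(int *ptr, int n) {
  int stores = 0;
  int c = *ptr;        // %a = load %ptr (preheader)
  for (int i = 0; i < n; ++i) {
    if (c > i) {       // %cond = cmp %c, ..
      *ptr = c - 1;    // store %ptr
      ++stores;
    }
    c = *ptr;          // %b = load %ptr (latch)
  }
  return stores;
}
```

Assuming nothing else writes `*ptr`, the two functions observe the same value before each comparison, so they perform the same number of stores.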

For the performance impact on omnetpp from SPEC2017, it looks like inlining differs between the legacy and new pass managers on this benchmark, which leaves fewer opportunities for loop unswitching. I may try to figure out later what causes the different inlining between the legacy and new pass managers.

Harbormaster completed remote builds in B104780: Diff 345815. May 17 2021, 4:41 AM

@fhahn Can we push this patch again please? I have checked the benchmarks and the scores are fine.

In D99354#2763024, @jaykang10 wrote:

In order to avoid the endless unswitch, the new change removes the partially invariant instruction from header [...]

Hm, this seems quite fragile, as another pass may decide to move the load to the header again. In the legacy pass manager, the loop is annotated with metadata to avoid partially unswitching the same loop multiple times (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Scalar/LoopUnswitch.cpp#L849). Can we do the same here?

llvm/test/Transforms/SimpleLoopUnswitch/endless-unswitch.ll
1 ↗	(On Diff #345815)	We should avoid adding tests using `-O3` here. Can we reproduce the issue by running `simple-loop-unswitch` twice? Also it would be good to clean up the test a bit.

In D99354#2773344, @fhahn wrote:
Hm, this seems quite fragile, as another pass may decide to move the load to the header again.

Um... I thought the load would not be moved back to the header later, because it is loop invariant and we usually try to move it outside the loop.

In the legacy pass manager, the loop is annotated with metadata to avoid partially unswitching the same loop multiple times (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Scalar/LoopUnswitch.cpp#L849). Can we do the same here?

Yep, I have seen the metadata. I was just not sure whether people will accept it or not. Let me try to add it.

llvm/test/Transforms/SimpleLoopUnswitch/endless-unswitch.ll
1 ↗	(On Diff #345815)	um... in order to reproduce the endless unswitch, it needs other passes. Let me try to reduce the test.

Following comments of @fhahn, updated code.

Added metadata to avoid endless unswitch
Reduced test case

@fhahn I have followed the LoopUnswitch pass's metadata and the benchmark scores look OK. If anything looks wrong with the change, please let me know.
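
To illustrate why a per-loop "already partially unswitched" marker bounds the work, here is a toy model of a worklist that keeps re-queuing a loop. Everything in it is an assumption made for illustration: `ToyLoop`, `runToyPipeline`, and the boolean flag merely stand in for the loop metadata and the loop pass manager, and none of it is LLVM API.

```cpp
#include <cassert>
#include <deque>

// Toy stand-in for a loop. The boolean plays the role of the
// "partial unswitch disabled" loop metadata discussed in this review.
struct ToyLoop {
  bool hasPartiallyInvariantCond = true;
  bool partialUnswitchDisabled = false;
};

// Counts how often the toy loop gets partially unswitched. A later
// transform is modelled as re-queuing the loop with the same condition
// back in its header. maxVisits is a safety cap for the toy model only.
int runToyPipeline(ToyLoop loop, bool useDisableFlag, int maxVisits) {
  std::deque<ToyLoop *> worklist{&loop};
  int unswitched = 0;
  int visits = 0;
  while (!worklist.empty() && visits < maxVisits) {
    ToyLoop *l = worklist.front();
    worklist.pop_front();
    ++visits;
    if (l->hasPartiallyInvariantCond && !l->partialUnswitchDisabled) {
      ++unswitched;
      if (useDisableFlag)
        l->partialUnswitchDisabled = true; // mark the loop as done
      worklist.push_back(l);               // loop shows up again later
    }
  }
  return unswitched;
}
```

With the flag the toy loop is unswitched once and the worklist drains; without it, the count grows until the visit cap is hit.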

Harbormaster completed remote builds in B105632: Diff 347025. May 21 2021, 8:13 AM

@fhahn Can we push this change again please?

@fhahn Sorry for ping.

Thanks for the update! Looks good, as this is in line with the legacy loop-unswitch implementation. I think the last piece of work remaining is to update the tests to check that the metadata is emitted? Also, could you add a test with a partially unswitchable loop that already has llvm.loop.unswitch.partial.disable and ensure it is not unswitched?

Thanks for comments @fhahn! Let me add tests for the metadata.

Following comment of @fhahn, added tests for metadata with llvm.loop.unswitch.partial.disable.

Harbormaster completed remote builds in B107028: Diff 348957. Jun 1 2021, 7:04 AM

fhahn reopened this revision. Jun 2 2021, 2:21 AM

fhahn added inline comments.

llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch-check-metadata.ll
9 ↗	(On Diff #348957)	Can this test just be part of llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll?
llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch-generate-metadata.ll
8 ↗	(On Diff #348957)	Can you also include checks for the generated IR, ensuring that the whole metadata chain is correct (!llvm.loop attached to the right unswitched loop and contains the disable metadata)? Also, can this just be part of `llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll`? I think we can just extend the checks there, rather than adding a new file + function.

This revision is now accepted and ready to land. Jun 2 2021, 2:21 AM

LGTM thanks, with the additional suggestions for the tests.

In D99354#2793045, @fhahn wrote:

LGTM thanks, with the additional suggestions for the tests.

Thanks @fhahn! After updating tests, I will push it.

llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch-check-metadata.ll
9 ↗	(On Diff #348957)	Yep, I will add the metadata check in llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll.
llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch-generate-metadata.ll
8 ↗	(On Diff #348957)	Yep, I will add the metadata check in llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll.

Following comments from @fhahn, updated tests.

This revision was landed with ongoing or failed builds. Jun 2 2021, 3:26 AM

Closed by commit rGf3a27511c9f8: [SimpleLoopUnswitch] Port partially invariant unswitch from LoopUnswitch to… (authored by jaykang10).

This revision was automatically updated to reflect the committed changes.

jaykang10 added a commit: rGf3a27511c9f8: [SimpleLoopUnswitch] Port partially invariant unswitch from LoopUnswitch to….

Harbormaster completed remote builds in B107209: Diff 349216. Jun 2 2021, 4:04 AM

Great to see progress on this :-)

I don't see the speedup on omnetpp that we saw before on SystemZ. Do I need to pass some option or adjust some target threshold value somewhere?

In D99354#2794815, @jonpa wrote:

Great to see progress on this :-)

I don't see the speedup on omnetpp that we saw before on SystemZ. Do I need to pass some option or adjust some target threshold value somewhere?

@jonpa Thanks for checking the performance number on SystemZ.

I was able to see the speedup on omnetpp from SPEC2006, but there was no speedup on omnetpp from SPEC2017. I guess you are looking at the SPEC2017 one.

It seems inlining differs between the new pass manager and the legacy pass manager, which reduces the partial loop unswitch opportunities on omnetpp from SPEC2017.

I am trying to get the speedup with the partial loop unswitch on omnetpp of SPEC2017 again. Once I fix it, I will let you know.

In D99354#2795599, @jaykang10 wrote:

@jonpa Thanks for checking the performance number on SystemZ.

I was able to see the speedup on omnetpp from SPEC2006, but there was no speedup on omnetpp from SPEC2017. I guess you are looking at the SPEC2017 one.

yes

It seems inlining differs between the new pass manager and the legacy pass manager, which reduces the partial loop unswitch opportunities on omnetpp from SPEC2017.

I am trying to get the speedup with the partial loop unswitch on omnetpp of SPEC2017 again. Once I fix it, I will let you know.

awesome!

In D99354#2763024, @jaykang10 wrote:

The below bugs are fixed.
https://bugs.llvm.org/show_bug.cgi?id=50279
https://bugs.llvm.org/show_bug.cgi?id=50302

@jaykang10 can you verify the patch fixes the linked issue & close them?

In D99354#2802472, @fhahn wrote:

@jaykang10 can you verify the patch fixes the linked issue & close them?

Thanks for reminding me! @fhahn

Yep, I have checked that those bugs are fixed. I will close them.

@jaykang10 it looks like this patch regressed codegen in some cases: https://bugs.llvm.org/show_bug.cgi?id=51139. It would be great if you could take a look.

In D99354#2903817, @fhahn wrote:

@jaykang10 it looks like this patch regressed codegen in some cases: https://bugs.llvm.org/show_bug.cgi?id=51139. It would be great if you could take a look.

Ah, let me have a look.

In D99354#2912791, @jaykang10 wrote:

Ah, let me have a look.

Maybe revert while investigating?

In D99354#2949059, @xbolva00 wrote:

Maybe revert while investigating?

As I mentioned on https://bugs.llvm.org/show_bug.cgi?id=51141, I am not sure the unswitch pass has to detect the cases which loop load PRE, SCCP, or other passes can optimize, and then not unswitch the loop...
I am creating a patch to fix https://bugs.llvm.org/show_bug.cgi?id=51141 first. The patch does not fix https://bugs.llvm.org/show_bug.cgi?id=51139. Once I create it, I will add you as a reviewer.

For https://bugs.llvm.org/show_bug.cgi?id=51139, I have added the comment below to it.

In this case, the inliner pass fails to inline the function g after the unswitch because of the cost, as the debug message below shows.

NOT Inlining (cost=250, threshold=250), Call:   call void @g(i32 %2) #3

Originally, after inlining function g, the JumpThreading pass made the block with call @foo dead and the SimplifyCFG pass deleted it.

If you add the `always_inline` attribute to the function g's prototype as below, you can see the callfoo is gone.

void g(int h) __attribute__((always_inline));

In this case, as I mentioned previously, I am not sure the unswitch pass has to check the inline cost or something like that... If someone has idea about it, please let me know.
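
A tiny model of the decision in the debug message may help here. The sketch assumes, consistent with the `cost=250, threshold=250` rejection quoted above, that a call is inlined only when its cost is strictly below the threshold, and that `always_inline` bypasses the check. `toyShouldInline` is a hypothetical helper and deliberately ignores everything else the real inliner considers.

```cpp
#include <cassert>

// Simplified, hypothetical model of the accept/reject step; not the
// inliner's actual cost analysis.
bool toyShouldInline(int cost, int threshold, bool alwaysInline) {
  if (alwaysInline)
    return true;           // the attribute forces inlining in this model
  return cost < threshold; // cost == threshold is rejected, as in the log
}
```

Under this model a single extra unit of cost on `g`, for instance from code duplicated by unswitching, is enough to flip the decision at the boundary.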

The SimpleLoopUnswitch pass already checks the cost of duplicating the loop, so we could adjust that. Let me try to adjust the cost calculation for partially invariant unswitching.
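
For context, the transformation whose cost is being discussed can be sketched at the source level. This is an illustrative model only, not code from the patch: the function names and the loop body are hypothetical. The condition reloads `*flag` every iteration, but the path taken when it is true never writes to `*flag`, so once the condition holds before the loop it holds for every iteration.

```cpp
#include <vector>

// Original loop: the branch condition is reloaded on every iteration.
int sumOriginal(int *flag, const std::vector<int> &data) {
  int sum = 0;
  for (int x : data) {
    if (*flag == 0) {
      sum += x; // no store to *flag on this path
    } else {
      sum -= x;
      if (x < 0)
        *flag = 0; // only this path may change the condition
    }
  }
  return sum;
}

// Hand-unswitched form: the condition is tested once before the loop.
int sumUnswitched(int *flag, const std::vector<int> &data) {
  int sum = 0;
  if (*flag == 0) {
    // Specialized clone: the condition is known to stay true.
    for (int x : data)
      sum += x;
    return sum;
  }
  // Otherwise fall back to the original loop, which keeps the check.
  for (int x : data) {
    if (*flag == 0) {
      sum += x;
    } else {
      sum -= x;
      if (x < 0)
        *flag = 0;
    }
  }
  return sum;
}
```

The duplicated loop body is the cost the pass has to weigh: the specialized clone is smaller, but the fallback loop is kept in full.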

In D99354#2795599, @jaykang10 wrote:

I was able to see the speedup on omnetpp in SPEC2006, but there was no speedup on omnetpp in SPEC2017. I guess you are looking at the SPEC2017 one.

@jaykang10, can you share the key loops that contribute to the speedup you observed from partial unswitch in spec2k6 omnetpp by any chance? Thanks.

Herald added a project: Restricted Project. · May 18 2022, 9:14 PM

Revision Contents

Path	Size
llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp	209 lines
llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll	315 lines

Diff 338138

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp

... (97 lines hidden) ...	static cl::opt<bool> UnswitchGuards(
"simple-loop-unswitch-guards", cl::init(true), cl::Hidden,		"simple-loop-unswitch-guards", cl::init(true), cl::Hidden,
cl::desc("If enabled, simple loop unswitching will also consider "		cl::desc("If enabled, simple loop unswitching will also consider "
"llvm.experimental.guard intrinsics as unswitch candidates."));		"llvm.experimental.guard intrinsics as unswitch candidates."));
static cl::opt<bool> DropNonTrivialImplicitNullChecks(		static cl::opt<bool> DropNonTrivialImplicitNullChecks(
"simple-loop-unswitch-drop-non-trivial-implicit-null-checks",		"simple-loop-unswitch-drop-non-trivial-implicit-null-checks",
cl::init(false), cl::Hidden,		cl::init(false), cl::Hidden,
cl::desc("If enabled, drop make.implicit metadata in unswitched implicit "		cl::desc("If enabled, drop make.implicit metadata in unswitched implicit "
"null checks to save time analyzing if we can keep it."));		"null checks to save time analyzing if we can keep it."));
		static cl::opt<unsigned>
		MSSAThreshold("simple-loop-unswitch-memoryssa-threshold",
		cl::desc("Max number of memory uses to explore during "
		"partial unswitching analysis"),
		cl::init(100), cl::Hidden);

/// Collect all of the loop invariant input values transitively used by the		/// Collect all of the loop invariant input values transitively used by the
/// homogeneous instruction graph from a given root.		/// homogeneous instruction graph from a given root.
///		///
/// This essentially walks from a root recursively through loop variant operands		/// This essentially walks from a root recursively through loop variant operands
/// which have the exact same opcode and finds all inputs which are loop		/// which have the exact same opcode and finds all inputs which are loop
/// invariant. For some operations these can be re-associated and unswitched out		/// invariant. For some operations these can be re-associated and unswitched out
/// of the loop entirely.		/// of the loop entirely.
... (83 lines hidden) ...	static void buildPartialUnswitchConditionalBranch(BasicBlock &BB,
IRBuilder<> IRB(&BB);		IRBuilder<> IRB(&BB);

Value *Cond = Direction ? IRB.CreateOr(Invariants) :		Value *Cond = Direction ? IRB.CreateOr(Invariants) :
IRB.CreateAnd(Invariants);		IRB.CreateAnd(Invariants);
IRB.CreateCondBr(Cond, Direction ? &UnswitchedSucc : &NormalSucc,		IRB.CreateCondBr(Cond, Direction ? &UnswitchedSucc : &NormalSucc,
Direction ? &NormalSucc : &UnswitchedSucc);		Direction ? &NormalSucc : &UnswitchedSucc);
}		}

		/// Copy a set of loop invariant values, and conditionally branch on them.
		fhahn: nit: `Copy a set of loop invariant values \p ToDuplicate and insert them at the end of \p BB ... and conditional branch on the copied condition.`? We only branch on a single value.
		jaykang10: Yep, I will update it.
		static void buildPartialInvariantUnswitchConditionalBranch(
		BasicBlock &BB, ArrayRef<Value *> ToDuplicate, bool Direction,
		fhahn: `Invariants` here is not really accurate I think. Those are the instructions we need to duplicate outside the loop, right?
		jaykang10: Yep, I will change it to `ToDuplicate`.
		BasicBlock &UnswitchedSucc, BasicBlock &NormalSucc, Loop &L,
		MemorySSAUpdater *MSSAU) {
		ValueToValueMapTy VMap;
		for (auto *Val : reverse(ToDuplicate)) {
		Instruction *Inst = cast<Instruction>(Val);
		Instruction *NewInst = Inst->clone();
		BB.getInstList().insert(BB.end(), NewInst);
		fhahn: nit: perhaps a bit simpler `NewInst->insertBefore(BB->getTerminator())`
		jaykang10: The BB is an empty block, so we cannot use `BB.getTerminator()` here.
		RemapInstruction(NewInst, VMap,
		RF_NoModuleLevelChanges | RF_IgnoreMissingLocals);
		VMap[Val] = NewInst;

		if (!MSSAU)
		continue;

		MemorySSA *MSSA = MSSAU->getMemorySSA();
		if (auto *MemUse =
		dyn_cast_or_null<MemoryUse>(MSSA->getMemoryAccess(Inst))) {
		auto *DefiningAccess = MemUse->getDefiningAccess();
		// Get the first defining access before the loop.
		while (L.contains(DefiningAccess->getBlock())) {
		// If the defining access is a MemoryPhi, get the incoming
		// value for the pre-header as defining access.
		if (auto *MemPhi = dyn_cast<MemoryPhi>(DefiningAccess))
		DefiningAccess =
		MemPhi->getIncomingValueForBlock(L.getLoopPreheader());
		else
		DefiningAccess = cast<MemoryDef>(DefiningAccess)->getDefiningAccess();
		}
		MSSAU->createMemoryAccessInBB(NewInst, DefiningAccess,
		NewInst->getParent(),
		MemorySSA::BeforeTerminator);
		}
		}

		IRBuilder<> IRB(&BB);
		Value *Cond = VMap[ToDuplicate[0]];
		IRB.CreateCondBr(Cond, Direction ? &UnswitchedSucc : &NormalSucc,
		Direction ? &NormalSucc : &UnswitchedSucc);
		}

/// Rewrite the PHI nodes in an unswitched loop exit basic block.		/// Rewrite the PHI nodes in an unswitched loop exit basic block.
///		///
/// Requires that the loop exit and unswitched basic block are the same, and		/// Requires that the loop exit and unswitched basic block are the same, and
/// that the exiting block was a unique predecessor of that block. Rewrites the		/// that the exiting block was a unique predecessor of that block. Rewrites the
/// PHI nodes in that block such that what were LCSSA PHI nodes become trivial		/// PHI nodes in that block such that what were LCSSA PHI nodes become trivial
/// PHI nodes from the old preheader that now contains the unswitched		/// PHI nodes from the old preheader that now contains the unswitched
/// terminator.		/// terminator.
static void rewritePHINodesForUnswitchedExitBlock(BasicBlock &UnswitchedBB,		static void rewritePHINodesForUnswitchedExitBlock(BasicBlock &UnswitchedBB,
... (1,745 lines hidden) ...	for (DomTreeNode *ChildN : *N) {
"Cannot visit a node twice when walking a tree!");		"Cannot visit a node twice when walking a tree!");
DomWorklist.push_back(ChildN);		DomWorklist.push_back(ChildN);
}		}
} while (!DomWorklist.empty());		} while (!DomWorklist.empty());
}		}

static void unswitchNontrivialInvariants(		static void unswitchNontrivialInvariants(
Loop &L, Instruction &TI, ArrayRef<Value *> Invariants,		Loop &L, Instruction &TI, ArrayRef<Value *> Invariants,
SmallVectorImpl<BasicBlock *> &ExitBlocks, DominatorTree &DT, LoopInfo &LI,		SmallVectorImpl<BasicBlock *> &ExitBlocks, IVConditionInfo &PartialIVInfo,
AssumptionCache &AC, function_ref<void(bool, ArrayRef<Loop *>)> UnswitchCB,		DominatorTree &DT, LoopInfo &LI, AssumptionCache &AC,
		fhahn: `struct` not needed.
		jaykang10: Yep, I will remove it.
		function_ref<void(bool, bool, ArrayRef<Loop *>)> UnswitchCB,
ScalarEvolution *SE, MemorySSAUpdater *MSSAU) {		ScalarEvolution *SE, MemorySSAUpdater *MSSAU) {
auto *ParentBB = TI.getParent();		auto *ParentBB = TI.getParent();
BranchInst *BI = dyn_cast<BranchInst>(&TI);		BranchInst *BI = dyn_cast<BranchInst>(&TI);
SwitchInst *SI = BI ? nullptr : cast<SwitchInst>(&TI);		SwitchInst *SI = BI ? nullptr : cast<SwitchInst>(&TI);

// We can only unswitch switches, conditional branches with an invariant		// We can only unswitch switches, conditional branches with an invariant
// condition, or combining invariant conditions with an instruction.		// condition, or combining invariant conditions with an instruction or
		// partially invariant instructions.
assert((SI || (BI && BI->isConditional())) &&		assert((SI || (BI && BI->isConditional())) &&
"Can only unswitch switches and conditional branch!");		"Can only unswitch switches and conditional branch!");
bool FullUnswitch = SI || BI->getCondition() == Invariants[0];		bool PartiallyInvariant = !PartialIVInfo.InstToDuplicate.empty();
		bool FullUnswitch =
		SI || (BI->getCondition() == Invariants[0] && !PartiallyInvariant);
		fhahn: I think we should probably update `IVConditionInfo` to provide a better way to check if there are partially invariant conditions, e.g. an `isPartiallyInvariant` accessor. WDYT?
		jaykang10: I agree with you. It is better to provide `isPartiallyInvariant` in `IVConditionInfo`. If possible, it could be good to create a separate patch for it after pushing this patch.
		fhahn: Sounds good, looking forward to a follow-up!
		jaykang10: Yep
if (FullUnswitch)		if (FullUnswitch)
assert(Invariants.size() == 1 &&		assert(Invariants.size() == 1 &&
"Cannot have other invariants with full unswitching!");		"Cannot have other invariants with full unswitching!");
else		else
assert(isa<Instruction>(BI->getCondition()) &&		assert(isa<Instruction>(BI->getCondition()) &&
"Partial unswitching requires an instruction as the condition!");		"Partial unswitching requires an instruction as the condition!");

if (MSSAU && VerifyMemorySSA)		if (MSSAU && VerifyMemorySSA)
MSSAU->getMemorySSA()->verifyMemorySSA();		MSSAU->getMemorySSA()->verifyMemorySSA();

// Constant and BBs tracking the cloned and continuing successor. When we are		// Constant and BBs tracking the cloned and continuing successor. When we are
// unswitching the entire condition, this can just be trivially chosen to		// unswitching the entire condition, this can just be trivially chosen to
// unswitch towards `true`. However, when we are unswitching a set of		// unswitch towards `true`. However, when we are unswitching a set of
// invariants combined with `and` or `or`, the combining operation determines		// invariants combined with `and` or `or` or partially invariant instructions,
// the best direction to unswitch: we want to unswitch the direction that will		// the combining operation determines the best direction to unswitch: we want
// collapse the branch.		// to unswitch the direction that will collapse the branch.
bool Direction = true;		bool Direction = true;
int ClonedSucc = 0;		int ClonedSucc = 0;
if (!FullUnswitch) {		if (!FullUnswitch) {
Value *Cond = BI->getCondition();		Value *Cond = BI->getCondition();
(void)Cond;		(void)Cond;
assert((match(Cond, m_LogicalAnd()) ^ match(Cond, m_LogicalOr())) &&		assert(((match(Cond, m_LogicalAnd()) ^ match(Cond, m_LogicalOr())) ||
		PartiallyInvariant) &&
"Only `or`, `and`, an `select` instructions can combine "		"Only `or`, `and`, an `select` instructions can combine "
		fhahn: message in assert needs updating?
		jaykang10: Yep, I will update it.
"invariants being unswitched.");		"invariants being unswitched.");
if (!match(BI->getCondition(), m_LogicalOr())) {		if (!match(BI->getCondition(), m_LogicalOr())) {
		if (match(BI->getCondition(), m_LogicalAnd()) ||
		(PartiallyInvariant && !PartialIVInfo.KnownValue->isOneValue())) {
Direction = false;		Direction = false;
ClonedSucc = 1;		ClonedSucc = 1;
}		}
}		}
		}

BasicBlock *RetainedSuccBB =		BasicBlock *RetainedSuccBB =
BI ? BI->getSuccessor(1 - ClonedSucc) : SI->getDefaultDest();		BI ? BI->getSuccessor(1 - ClonedSucc) : SI->getDefaultDest();
SmallSetVector<BasicBlock *, 4> UnswitchedSuccBBs;		SmallSetVector<BasicBlock *, 4> UnswitchedSuccBBs;
if (BI)		if (BI)
UnswitchedSuccBBs.insert(BI->getSuccessor(ClonedSucc));		UnswitchedSuccBBs.insert(BI->getSuccessor(ClonedSucc));
else		else
for (auto Case : SI->cases())		for (auto Case : SI->cases())
... (72 lines hidden) ...	static void unswitchNontrivialInvariants(
// Clone the loop for each unswitched successor.		// Clone the loop for each unswitched successor.
SmallVector<std::unique_ptr<ValueToValueMapTy>, 4> VMaps;		SmallVector<std::unique_ptr<ValueToValueMapTy>, 4> VMaps;
VMaps.reserve(UnswitchedSuccBBs.size());		VMaps.reserve(UnswitchedSuccBBs.size());
SmallDenseMap<BasicBlock *, BasicBlock *, 4> ClonedPHs;		SmallDenseMap<BasicBlock *, BasicBlock *, 4> ClonedPHs;
for (auto *SuccBB : UnswitchedSuccBBs) {		for (auto *SuccBB : UnswitchedSuccBBs) {
VMaps.emplace_back(new ValueToValueMapTy());		VMaps.emplace_back(new ValueToValueMapTy());
ClonedPHs[SuccBB] = buildClonedLoopBlocks(		ClonedPHs[SuccBB] = buildClonedLoopBlocks(
L, LoopPH, SplitBB, ExitBlocks, ParentBB, SuccBB, RetainedSuccBB,		L, LoopPH, SplitBB, ExitBlocks, ParentBB, SuccBB, RetainedSuccBB,
DominatingSucc, *VMaps.back(), DTUpdates, AC, DT, LI, MSSAU);		DominatingSucc, *VMaps.back(), DTUpdates, AC, DT, LI, MSSAU);
		fhahn: nit: `In the partially invariant case, if UnswitchedSuccBB is an exit block, do not ....`
		jaykang10: Yep, I will update it.
}		}

		fhahn: no need for `llvm::`, also can we just capture `[&SuccBB]`?
		jaykang10: Yep, I will update it.
// Drop metadata if we may break its semantics by moving this instr into the		// Drop metadata if we may break its semantics by moving this instr into the
// split block.		// split block.
if (TI.getMetadata(LLVMContext::MD_make_implicit)) {		if (TI.getMetadata(LLVMContext::MD_make_implicit)) {
if (DropNonTrivialImplicitNullChecks)		if (DropNonTrivialImplicitNullChecks)
// Do not spend time trying to understand if we can keep it, just drop it		// Do not spend time trying to understand if we can keep it, just drop it
// to save compile time.		// to save compile time.
TI.setMetadata(LLVMContext::MD_make_implicit, nullptr);		TI.setMetadata(LLVMContext::MD_make_implicit, nullptr);
else {		else {
... (108 lines hidden) ...	if (FullUnswitch) {
BranchInst::Create(RetainedSuccBB, ParentBB);		BranchInst::Create(RetainedSuccBB, ParentBB);
} else {		} else {
assert(BI && "Only branches have partial unswitching.");		assert(BI && "Only branches have partial unswitching.");
assert(UnswitchedSuccBBs.size() == 1 &&		assert(UnswitchedSuccBBs.size() == 1 &&
"Only one possible unswitched block for a branch!");		"Only one possible unswitched block for a branch!");
BasicBlock *ClonedPH = ClonedPHs.begin()->second;		BasicBlock *ClonedPH = ClonedPHs.begin()->second;
// When doing a partial unswitch, we have to do a bit more work to build up		// When doing a partial unswitch, we have to do a bit more work to build up
// the branch in the split block.		// the branch in the split block.
		if (PartiallyInvariant)
		buildPartialInvariantUnswitchConditionalBranch(
		*SplitBB, Invariants, Direction, *ClonedPH, *LoopPH, L, MSSAU);
		else
buildPartialUnswitchConditionalBranch(*SplitBB, Invariants, Direction,		buildPartialUnswitchConditionalBranch(*SplitBB, Invariants, Direction,
*ClonedPH, *LoopPH);		*ClonedPH, *LoopPH);
DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});		DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});

if (MSSAU) {		if (MSSAU) {
DT.applyUpdates(DTUpdates);		DT.applyUpdates(DTUpdates);
DTUpdates.clear();		DTUpdates.clear();

// Perform MSSA cloning updates.		// Perform MSSA cloning updates.
for (auto &VMap : VMaps)		for (auto &VMap : VMaps)
... (35 lines hidden) ...	static void unswitchNontrivialInvariants(

// This transformation has a high risk of corrupting the dominator tree, and		// This transformation has a high risk of corrupting the dominator tree, and
// the below steps to rebuild loop structures will result in hard to debug		// the below steps to rebuild loop structures will result in hard to debug
// errors in that case so verify that the dominator tree is sane first.		// errors in that case so verify that the dominator tree is sane first.
// FIXME: Remove this when the bugs stop showing up and rely on existing		// FIXME: Remove this when the bugs stop showing up and rely on existing
// verification steps.		// verification steps.
assert(DT.verify(DominatorTree::VerificationLevel::Fast));		assert(DT.verify(DominatorTree::VerificationLevel::Fast));

if (BI) {		if (BI && !PartiallyInvariant) {
// If we unswitched a branch which collapses the condition to a known		// If we unswitched a branch which collapses the condition to a known
// constant we want to replace all the uses of the invariants within both		// constant we want to replace all the uses of the invariants within both
// the original and cloned blocks. We do this here so that we can use the		// the original and cloned blocks. We do this here so that we can use the
// now updated dominator tree to identify which side the users are on.		// now updated dominator tree to identify which side the users are on.
assert(UnswitchedSuccBBs.size() == 1 &&		assert(UnswitchedSuccBBs.size() == 1 &&
"Only one possible unswitched block for a branch!");		"Only one possible unswitched block for a branch!");
BasicBlock *ClonedPH = ClonedPHs.begin()->second;		BasicBlock *ClonedPH = ClonedPHs.begin()->second;

// When considering multiple partially-unswitched invariants		// When considering multiple partially-unswitched invariants
// we cant just go replace them with constants in both branches.		// we cant just go replace them with constants in both branches.
//		//
// For 'AND' we infer that true branch ("continue") means true		// For 'AND' we infer that true branch ("continue") means true
// for each invariant operand.		// for each invariant operand.
// For 'OR' we can infer that false branch ("continue") means false		// For 'OR' we can infer that false branch ("continue") means false
// for each invariant operand.		// for each invariant operand.
// So it happens that for multiple-partial case we dont replace		// So it happens that for multiple-partial case we dont replace
// in the unswitched branch.		// in the unswitched branch.
bool ReplaceUnswitched = FullUnswitch || (Invariants.size() == 1);		bool ReplaceUnswitched =
		FullUnswitch || (Invariants.size() == 1) || PartiallyInvariant;

ConstantInt *UnswitchedReplacement =		ConstantInt *UnswitchedReplacement =
Direction ? ConstantInt::getTrue(BI->getContext())		Direction ? ConstantInt::getTrue(BI->getContext())
: ConstantInt::getFalse(BI->getContext());		: ConstantInt::getFalse(BI->getContext());
ConstantInt *ContinueReplacement =		ConstantInt *ContinueReplacement =
Direction ? ConstantInt::getFalse(BI->getContext())		Direction ? ConstantInt::getFalse(BI->getContext())
: ConstantInt::getTrue(BI->getContext());		: ConstantInt::getTrue(BI->getContext());
for (Value *Invariant : Invariants)		for (Value *Invariant : Invariants)
// Use make_early_inc_range here as set invalidates the iterator.		// Use make_early_inc_range here as set invalidates the iterator.
for (Use &U : llvm::make_early_inc_range(Invariant->uses())) {		for (Use &U : llvm::make_early_inc_range(Invariant->uses())) {
Instruction *UserI = dyn_cast<Instruction>(U.getUser());		Instruction *UserI = dyn_cast<Instruction>(U.getUser());
if (!UserI)		if (!UserI)
		fhahn: We don't need to execute this loop at all for the `PartiallyInvariant` case, right?
		jaykang10: Yep, you are right! I will update it.
continue;		continue;

// Replace it with the 'continue' side if in the main loop body, and the		// Replace it with the 'continue' side if in the main loop body, and the
// unswitched if in the cloned blocks.		// unswitched if in the cloned blocks.
if (DT.dominates(LoopPH, UserI->getParent()))		if (DT.dominates(LoopPH, UserI->getParent()))
U.set(ContinueReplacement);		U.set(ContinueReplacement);
else if (ReplaceUnswitched &&		else if (ReplaceUnswitched &&
DT.dominates(ClonedPH, UserI->getParent()))		DT.dominates(ClonedPH, UserI->getParent()))
... (66 lines hidden) ...	#endif

// Now that we've unswitched something, make callbacks to report the changes.		// Now that we've unswitched something, make callbacks to report the changes.
// For that we need to merge together the updated loops and the cloned loops		// For that we need to merge together the updated loops and the cloned loops
// and check whether the original loop survived.		// and check whether the original loop survived.
SmallVector<Loop *, 4> SibLoops;		SmallVector<Loop *, 4> SibLoops;
for (Loop *UpdatedL : llvm::concat<Loop *>(NonChildClonedLoops, HoistedLoops))		for (Loop *UpdatedL : llvm::concat<Loop *>(NonChildClonedLoops, HoistedLoops))
if (UpdatedL->getParentLoop() == ParentL)		if (UpdatedL->getParentLoop() == ParentL)
SibLoops.push_back(UpdatedL);		SibLoops.push_back(UpdatedL);
UnswitchCB(IsStillLoop, SibLoops);		UnswitchCB(IsStillLoop, PartiallyInvariant, SibLoops);

if (MSSAU && VerifyMemorySSA)		if (MSSAU && VerifyMemorySSA)
MSSAU->getMemorySSA()->verifyMemorySSA();		MSSAU->getMemorySSA()->verifyMemorySSA();

if (BI)		if (BI)
++NumBranches;		++NumBranches;
else		else
++NumSwitches;		++NumSwitches;
... (198 lines hidden) ...	static int CalculateUnswitchCostMultiplier(

LLVM_DEBUG(dbgs() << " Computed multiplier " << CostMultiplier		LLVM_DEBUG(dbgs() << " Computed multiplier " << CostMultiplier
<< " (siblings " << SiblingsMultiplier << " * clones "		<< " (siblings " << SiblingsMultiplier << " * clones "
<< (1 << ClonesPower) << ")"		<< (1 << ClonesPower) << ")"
<< " for unswitch candidate: " << TI << "\n");		<< " for unswitch candidate: " << TI << "\n");
return CostMultiplier;		return CostMultiplier;
}		}

static bool		static bool unswitchBestCondition(
unswitchBestCondition(Loop &L, DominatorTree &DT, LoopInfo &LI,		Loop &L, DominatorTree &DT, LoopInfo &LI, AssumptionCache &AC,
AssumptionCache &AC, TargetTransformInfo &TTI,		AAResults &AA, TargetTransformInfo &TTI,
function_ref<void(bool, ArrayRef<Loop *>)> UnswitchCB,		function_ref<void(bool, bool, ArrayRef<Loop *>)> UnswitchCB,
ScalarEvolution *SE, MemorySSAUpdater *MSSAU) {		ScalarEvolution *SE, MemorySSAUpdater *MSSAU) {
// Collect all invariant conditions within this loop (as opposed to an inner		// Collect all invariant conditions within this loop (as opposed to an inner
// loop which would be handled when visiting that inner loop).		// loop which would be handled when visiting that inner loop).
SmallVector<std::pair<Instruction *, TinyPtrVector<Value *>>, 4>		SmallVector<std::pair<Instruction *, TinyPtrVector<Value *>>, 4>
UnswitchCandidates;		UnswitchCandidates;

// Whether or not we should also collect guards in the loop.		// Whether or not we should also collect guards in the loop.
bool CollectGuards = false;		bool CollectGuards = false;
if (UnswitchGuards) {		if (UnswitchGuards) {
auto *GuardDecl = L.getHeader()->getParent()->getParent()->getFunction(		auto *GuardDecl = L.getHeader()->getParent()->getParent()->getFunction(
Intrinsic::getName(Intrinsic::experimental_guard));		Intrinsic::getName(Intrinsic::experimental_guard));
if (GuardDecl && !GuardDecl->use_empty())		if (GuardDecl && !GuardDecl->use_empty())
CollectGuards = true;		CollectGuards = true;
}		}

		IVConditionInfo PartialIVInfo;
		fhahn: `struct` not needed.
		jaykang10: Yep, I will update it.
for (auto *BB : L.blocks()) {		for (auto *BB : L.blocks()) {
if (LI.getLoopFor(BB) != &L)		if (LI.getLoopFor(BB) != &L)
continue;		continue;

if (CollectGuards)		if (CollectGuards)
for (auto &I : *BB)		for (auto &I : *BB)
if (isGuard(&I)) {		if (isGuard(&I)) {
auto *Cond = cast<IntrinsicInst>(&I)->getArgOperand(0);		auto *Cond = cast<IntrinsicInst>(&I)->getArgOperand(0);
... (24 lines hidden) ...	for (auto *BB : L.blocks()) {
BI->setCondition(Cond);		BI->setCondition(Cond);

if (L.isLoopInvariant(BI->getCondition())) {		if (L.isLoopInvariant(BI->getCondition())) {
UnswitchCandidates.push_back({BI, {BI->getCondition()}});		UnswitchCandidates.push_back({BI, {BI->getCondition()}});
continue;		continue;
}		}

Instruction &CondI = *cast<Instruction>(BI->getCondition());		Instruction &CondI = *cast<Instruction>(BI->getCondition());
if (!match(&CondI, m_CombineOr(m_LogicalAnd(), m_LogicalOr())))		if (match(&CondI, m_CombineOr(m_LogicalAnd(), m_LogicalOr()))) {
continue;

TinyPtrVector<Value *> Invariants =		TinyPtrVector<Value *> Invariants =
collectHomogenousInstGraphLoopInvariants(L, CondI, LI);		collectHomogenousInstGraphLoopInvariants(L, CondI, LI);
if (Invariants.empty())		if (Invariants.empty())
continue;		continue;

UnswitchCandidates.push_back({BI, std::move(Invariants)});		UnswitchCandidates.push_back({BI, std::move(Invariants)});
		continue;
		}
		}

		if (MSSAU && !any_of(UnswitchCandidates, [&](auto &TerminatorAndInvariants) {
		return TerminatorAndInvariants.first == L.getHeader()->getTerminator();
		fhahn: nit: no `llvm::` should be needed.
		jaykang10: Yep, I will update it.
		})) {
		MemorySSA *MSSA = MSSAU->getMemorySSA();
		if (auto Info = hasPartialIVCondition(L, MSSAThreshold, *MSSA, AA)) {
		LLVM_DEBUG(
		fhahn: nit: no `llvm::` should be needed
		jaykang10: Yep, I will update it.
		dbgs() << "simple-loop-unswitch: Found partially invariant condition "
		<< *Info->InstToDuplicate[0] << "\n");
		PartialIVInfo = *Info;
		TinyPtrVector<Value *> ValsToDuplicate;
		fhahn: Why `TinyPtrVector` here? In most cases we will have at least 2 instructions here anyway? Probably better to use a `SmallVector`?
		jaykang10: The SimpleUnswitchPass uses `TinyPtrVector` for `UnswitchCandidates`. `TinyPtrVector` is used here in order to follow that interface.
		for (auto *Inst : Info->InstToDuplicate)
		ValsToDuplicate.push_back(Inst);
		UnswitchCandidates.push_back(
		fhahn: `emplace_back`?
		jaykang10: `TinyPtrVector` does not have emplace_back.
		{L.getHeader()->getTerminator(), std::move(ValsToDuplicate)});
		}
		fhahn: enough to capture `[this]`?
		jaykang10: This code is inside a static function rather than a class member function, so we cannot use `[this]`. I will update it to capture `[&L]`.
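
The capture point in the reply above can be illustrated outside LLVM. A minimal sketch, assuming a hypothetical `Loop` stand-in and plain `std::any_of` rather than LLVM's `any_of` wrapper: a lambda inside a static (non-member) function has no `this` to capture, so it names the one local it needs.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical stand-in for a loop; an int plays the role of the header's
// terminator instruction.
struct Loop {
  int HeaderTerminator;
};

// Free (static) function: there is no `this` here, so `[this]` would not
// compile. Capturing `[&L]` is enough because L is all the lambda uses.
static bool headerIsCandidate(
    const Loop &L, const std::vector<std::pair<int, int>> &Candidates) {
  return std::any_of(Candidates.begin(), Candidates.end(),
                     [&L](const std::pair<int, int> &Candidate) {
                       return Candidate.first == L.HeaderTerminator;
                     });
}
```

Naming the capture explicitly instead of using `[&]` also documents which outer state the predicate depends on.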
}		}

// If we didn't find any candidates, we're done.		// If we didn't find any candidates, we're done.
if (UnswitchCandidates.empty())		if (UnswitchCandidates.empty())
return false;		return false;

// Check if there are irreducible CFG cycles in this loop. If so, we cannot		// Check if there are irreducible CFG cycles in this loop. If so, we cannot
// easily unswitch non-trivial edges out of the loop. Doing so might turn the		// easily unswitch non-trivial edges out of the loop. Doing so might turn the
... (89 lines hidden) ...	auto ComputeUnswitchedCost = [&](Instruction &TI,

InstructionCost Cost = 0;		InstructionCost Cost = 0;
for (BasicBlock *SuccBB : successors(&BB)) {		for (BasicBlock *SuccBB : successors(&BB)) {
// Don't count successors more than once.		// Don't count successors more than once.
if (!Visited.insert(SuccBB).second)		if (!Visited.insert(SuccBB).second)
continue;		continue;

// If this is a partial unswitch candidate, then it must be a conditional		// If this is a partial unswitch candidate, then it must be a conditional
// branch with a condition of either `or`, `and`, or their corresponding		// branch with a condition of either `or`, `and`, their corresponding
// select forms. In that case, one of the successors is necessarily		// select forms or partially invariant instructions. In that case, one of
// duplicated, so don't even try to remove its cost.		// the successors is necessarily duplicated, so don't even try to remove
		// its cost.
if (!FullUnswitch) {		if (!FullUnswitch) {
auto &BI = cast<BranchInst>(TI);		auto &BI = cast<BranchInst>(TI);
if (match(BI.getCondition(), m_LogicalAnd())) {		if (match(BI.getCondition(), m_LogicalAnd())) {
if (SuccBB == BI.getSuccessor(1))		if (SuccBB == BI.getSuccessor(1))
continue;		continue;
} else {		} else if (match(BI.getCondition(), m_LogicalOr())) {
assert(match(BI.getCondition(), m_LogicalOr()) &&
"Only `and` and `or` conditions can result in a partial "
"unswitch!");
if (SuccBB == BI.getSuccessor(0))		if (SuccBB == BI.getSuccessor(0))
continue;		continue;
		} else if (!PartialIVInfo.InstToDuplicate.empty()) {
		if (PartialIVInfo.KnownValue->isOneValue() &&
		fhahn: Do we need to check here that we only do this for blocks where we partially unswitch on their conditions?
		jaykang10: We need to check the cost of the successors which are duplicated. Only the blocks that are partially unswitched on their conditions are duplicated.
		SuccBB == BI.getSuccessor(1))
		continue;
		else if (!PartialIVInfo.KnownValue->isOneValue() &&
		SuccBB == BI.getSuccessor(0))
		continue;
}		}
}		}

// This successor's domtree will not need to be duplicated after		// This successor's domtree will not need to be duplicated after
// unswitching if the edge to the successor dominates it (and thus the		// unswitching if the edge to the successor dominates it (and thus the
// entire tree). This essentially means there is no other path into this		// entire tree). This essentially means there is no other path into this
// subtree and so it will end up live in only one clone of the loop.		// subtree and so it will end up live in only one clone of the loop.
if (SuccBB->getUniquePredecessor() ||		if (SuccBB->getUniquePredecessor() ||
▲ Show 20 Lines • Show All 57 Lines • ▼ Show 20 Lines	if (BestUnswitchCost >= UnswitchThreshold) {
return false;		return false;
}		}

// If the best candidate is a guard, turn it into a branch.		// If the best candidate is a guard, turn it into a branch.
if (isGuard(BestUnswitchTI))		if (isGuard(BestUnswitchTI))
BestUnswitchTI = turnGuardIntoBranch(cast<IntrinsicInst>(BestUnswitchTI), L,		BestUnswitchTI = turnGuardIntoBranch(cast<IntrinsicInst>(BestUnswitchTI), L,
ExitBlocks, DT, LI, MSSAU);		ExitBlocks, DT, LI, MSSAU);

LLVM_DEBUG(dbgs() << " Unswitching non-trivial (cost = "		LLVM_DEBUG(dbgs() << " Unswitching non-trivial (cost = " << BestUnswitchCost
<< BestUnswitchCost << ") terminator: " << *BestUnswitchTI		<< ") terminator: " << *BestUnswitchTI << "\n");
		fhahn: This change seems unrelated?
		jaykang10: Ah, I ran clang-format. It is the result of clang-format.
		fhahn: OK, can you undo it? It is not related to the change.
		jaykang10: Yep, I will undo it.
<< "\n");
unswitchNontrivialInvariants(L, *BestUnswitchTI, BestUnswitchInvariants,		unswitchNontrivialInvariants(L, *BestUnswitchTI, BestUnswitchInvariants,
ExitBlocks, DT, LI, AC, UnswitchCB, SE, MSSAU);		ExitBlocks, PartialIVInfo, DT, LI, AC,
		UnswitchCB, SE, MSSAU);
return true;		return true;
}		}

/// Unswitch control flow predicated on loop invariant conditions.		/// Unswitch control flow predicated on loop invariant conditions.
///		///
/// This first hoists all branches or switches which are trivial (IE, do not		/// This first hoists all branches or switches which are trivial (IE, do not
/// require duplicating any part of the loop) out of the loop body. It then		/// require duplicating any part of the loop) out of the loop body. It then
/// looks at other loop invariant control flows and tries to unswitch those as		/// looks at other loop invariant control flows and tries to unswitch those as
/// well by cloning the loop if the result is small enough.		/// well by cloning the loop if the result is small enough.
///		///
/// The `DT`, `LI`, `AC`, `TTI` parameters are required analyses that are also		/// The `DT`, `LI`, `AC`, `AA`, `TTI` parameters are required analyses that are
/// updated based on the unswitch.		/// also updated based on the unswitch. The `MSSA` analysis is also updated if
/// The `MSSA` analysis is also updated if valid (i.e. its use is enabled).		/// valid (i.e. its use is enabled).
///		///
/// If either `NonTrivial` is true or the flag `EnableNonTrivialUnswitch` is		/// If either `NonTrivial` is true or the flag `EnableNonTrivialUnswitch` is
/// true, we will attempt to do non-trivial unswitching as well as trivial		/// true, we will attempt to do non-trivial unswitching as well as trivial
/// unswitching.		/// unswitching.
///		///
/// The `UnswitchCB` callback provided will be run after unswitching is		/// The `UnswitchCB` callback provided will be run after unswitching is
/// complete, with the first parameter set to `true` if the provided loop		/// complete, with the first parameter set to `true` if the provided loop
/// remains a loop, and a list of new sibling loops created.		/// remains a loop, and a list of new sibling loops created.
///		///
/// If `SE` is non-null, we will update that analysis based on the unswitching		/// If `SE` is non-null, we will update that analysis based on the unswitching
/// done.		/// done.
static bool unswitchLoop(Loop &L, DominatorTree &DT, LoopInfo &LI,		static bool
AssumptionCache &AC, TargetTransformInfo &TTI,		unswitchLoop(Loop &L, DominatorTree &DT, LoopInfo &LI, AssumptionCache &AC,
bool NonTrivial,		AAResults &AA, TargetTransformInfo &TTI, bool NonTrivial,
function_ref<void(bool, ArrayRef<Loop *>)> UnswitchCB,		function_ref<void(bool, bool, ArrayRef<Loop *>)> UnswitchCB,
ScalarEvolution *SE, MemorySSAUpdater *MSSAU) {		ScalarEvolution *SE, MemorySSAUpdater *MSSAU) {
assert(L.isRecursivelyLCSSAForm(DT, LI) &&		assert(L.isRecursivelyLCSSAForm(DT, LI) &&
"Loops must be in LCSSA form before unswitching.");		"Loops must be in LCSSA form before unswitching.");

// Must be in loop simplified form: we need a preheader and dedicated exits.		// Must be in loop simplified form: we need a preheader and dedicated exits.
if (!L.isLoopSimplifyForm())		if (!L.isLoopSimplifyForm())
return false;		return false;

// Try trivial unswitch first before loop over other basic blocks in the loop.		// Try trivial unswitch first before loop over other basic blocks in the loop.
if (unswitchAllTrivialConditions(L, DT, LI, SE, MSSAU)) {		if (unswitchAllTrivialConditions(L, DT, LI, SE, MSSAU)) {
// If we unswitched successfully we will want to clean up the loop before		// If we unswitched successfully we will want to clean up the loop before
// processing it further so just mark it as unswitched and return.		// processing it further so just mark it as unswitched and return.
UnswitchCB(/*CurrentLoopValid*/ true, {});		UnswitchCB(/*CurrentLoopValid*/ true, false, {});
return true;		return true;
}		}

// Check whether we should continue with non-trivial conditions.		// Check whether we should continue with non-trivial conditions.
// EnableNonTrivialUnswitch: Global variable that forces non-trivial		// EnableNonTrivialUnswitch: Global variable that forces non-trivial
// unswitching for testing and debugging.		// unswitching for testing and debugging.
// NonTrivial: Parameter that enables non-trivial unswitching for this		// NonTrivial: Parameter that enables non-trivial unswitching for this
// invocation of the transform. But this should be allowed only		// invocation of the transform. But this should be allowed only
Show All 19 Lines	unswitchLoop(Loop &L, DominatorTree &DT, LoopInfo &LI, AssumptionCache &AC,
// For non-trivial unswitching, because it often creates new loops, we rely on		// For non-trivial unswitching, because it often creates new loops, we rely on
// the pass manager to iterate on the loops rather than trying to immediately		// the pass manager to iterate on the loops rather than trying to immediately
// reach a fixed point. There is no substantial advantage to iterating		// reach a fixed point. There is no substantial advantage to iterating
// internally, and if any of the new loops are simplified enough to contain		// internally, and if any of the new loops are simplified enough to contain
// trivial unswitching we want to prefer those.		// trivial unswitching we want to prefer those.

// Try to unswitch the best invariant condition. We prefer this full unswitch to		// Try to unswitch the best invariant condition. We prefer this full unswitch to
// a partial unswitch when possible below the threshold.		// a partial unswitch when possible below the threshold.
if (unswitchBestCondition(L, DT, LI, AC, TTI, UnswitchCB, SE, MSSAU))		if (unswitchBestCondition(L, DT, LI, AC, AA, TTI, UnswitchCB, SE, MSSAU))
return true;		return true;

// No other opportunities to unswitch.		// No other opportunities to unswitch.
return false;		return false;
}		}

PreservedAnalyses SimpleLoopUnswitchPass::run(Loop &L, LoopAnalysisManager &AM,		PreservedAnalyses SimpleLoopUnswitchPass::run(Loop &L, LoopAnalysisManager &AM,
LoopStandardAnalysisResults &AR,		LoopStandardAnalysisResults &AR,
LPMUpdater &U) {		LPMUpdater &U) {
Function &F = *L.getHeader()->getParent();		Function &F = *L.getHeader()->getParent();
(void)F;		(void)F;

LLVM_DEBUG(dbgs() << "Unswitching loop in " << F.getName() << ": " << L		LLVM_DEBUG(dbgs() << "Unswitching loop in " << F.getName() << ": " << L
<< "\n");		<< "\n");

// Save the current loop name in a variable so that we can report it even		// Save the current loop name in a variable so that we can report it even
// after it has been deleted.		// after it has been deleted.
std::string LoopName = std::string(L.getName());		std::string LoopName = std::string(L.getName());

auto UnswitchCB = [&L, &U, &LoopName](bool CurrentLoopValid,		auto UnswitchCB = [&L, &U, &LoopName](bool CurrentLoopValid,
		bool PartiallyInvariant,
ArrayRef<Loop *> NewLoops) {		ArrayRef<Loop *> NewLoops) {
// If we did a non-trivial unswitch, we have added new (cloned) loops.		// If we did a non-trivial unswitch, we have added new (cloned) loops.
if (!NewLoops.empty())		if (!NewLoops.empty())
U.addSiblingLoops(NewLoops);		U.addSiblingLoops(NewLoops);

// If the current loop remains valid, we should revisit it to catch any		// If the current loop remains valid, we should revisit it to catch any
// other unswitch opportunities. Otherwise, we need to mark it as deleted.		// other unswitch opportunities. Otherwise, we need to mark it as deleted.
if (CurrentLoopValid)		if (CurrentLoopValid) {
		if (!PartiallyInvariant)
U.revisitCurrentLoop();		U.revisitCurrentLoop();
else		} else
U.markLoopAsDeleted(L, LoopName);		U.markLoopAsDeleted(L, LoopName);
};		};

Optional<MemorySSAUpdater> MSSAU;		Optional<MemorySSAUpdater> MSSAU;
if (AR.MSSA) {		if (AR.MSSA) {
MSSAU = MemorySSAUpdater(AR.MSSA);		MSSAU = MemorySSAUpdater(AR.MSSA);
if (VerifyMemorySSA)		if (VerifyMemorySSA)
AR.MSSA->verifyMemorySSA();		AR.MSSA->verifyMemorySSA();
}		}
if (!unswitchLoop(L, AR.DT, AR.LI, AR.AC, AR.TTI, NonTrivial, UnswitchCB,		if (!unswitchLoop(L, AR.DT, AR.LI, AR.AC, AR.AA, AR.TTI, NonTrivial,
&AR.SE, MSSAU.hasValue() ? MSSAU.getPointer() : nullptr))		UnswitchCB, &AR.SE,
		MSSAU.hasValue() ? MSSAU.getPointer() : nullptr))
return PreservedAnalyses::all();		return PreservedAnalyses::all();

if (AR.MSSA && VerifyMemorySSA)		if (AR.MSSA && VerifyMemorySSA)
AR.MSSA->verifyMemorySSA();		AR.MSSA->verifyMemorySSA();

// Historically this pass has had issues with the dominator tree so verify it		// Historically this pass has had issues with the dominator tree so verify it
// in asserts builds.		// in asserts builds.
assert(AR.DT.verify(DominatorTree::VerificationLevel::Fast));		assert(AR.DT.verify(DominatorTree::VerificationLevel::Fast));
Show All 40 Lines	bool SimpleLoopUnswitchLegacyPass::runOnLoop(Loop *L, LPPassManager &LPM) {
Function &F = *L->getHeader()->getParent();		Function &F = *L->getHeader()->getParent();

LLVM_DEBUG(dbgs() << "Unswitching loop in " << F.getName() << ": " << *L		LLVM_DEBUG(dbgs() << "Unswitching loop in " << F.getName() << ": " << *L
<< "\n");		<< "\n");

auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();		auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
auto &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();		auto &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
auto &AC = getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);		auto &AC = getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
		auto &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();
auto &TTI = getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);		auto &TTI = getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
MemorySSA *MSSA = nullptr;		MemorySSA *MSSA = nullptr;
Optional<MemorySSAUpdater> MSSAU;		Optional<MemorySSAUpdater> MSSAU;
if (EnableMSSALoopDependency) {		if (EnableMSSALoopDependency) {
MSSA = &getAnalysis<MemorySSAWrapperPass>().getMSSA();		MSSA = &getAnalysis<MemorySSAWrapperPass>().getMSSA();
MSSAU = MemorySSAUpdater(MSSA);		MSSAU = MemorySSAUpdater(MSSA);
}		}

auto *SEWP = getAnalysisIfAvailable<ScalarEvolutionWrapperPass>();		auto *SEWP = getAnalysisIfAvailable<ScalarEvolutionWrapperPass>();
auto *SE = SEWP ? &SEWP->getSE() : nullptr;		auto *SE = SEWP ? &SEWP->getSE() : nullptr;

auto UnswitchCB = [&L, &LPM](bool CurrentLoopValid,		auto UnswitchCB = [&L, &LPM](bool CurrentLoopValid, bool PartiallyInvariant,
ArrayRef<Loop *> NewLoops) {		ArrayRef<Loop *> NewLoops) {
// If we did a non-trivial unswitch, we have added new (cloned) loops.		// If we did a non-trivial unswitch, we have added new (cloned) loops.
for (auto *NewL : NewLoops)		for (auto *NewL : NewLoops)
LPM.addLoop(*NewL);		LPM.addLoop(*NewL);

// If the current loop remains valid, re-add it to the queue. This is		// If the current loop remains valid, re-add it to the queue. This is
// a little wasteful as we'll finish processing the current loop as well,		// a little wasteful as we'll finish processing the current loop as well,
// but it is the best we can do in the old PM.		// but it is the best we can do in the old PM.
if (CurrentLoopValid)		if (CurrentLoopValid) {
		// If the current loop has been unswitched with partially invariant
		fhahn: Can you add a comment here to explain the check?
		jaykang10: Yep, I will add a comment.
		fhahn: nit: perhaps `unswitched using a partially invariant condition, ...`?
		jaykang10: Yep, I will update it.
		// condition, we should not re-add the current loop to avoid unswitching
		// on the same condition again.
		if (!PartiallyInvariant)
LPM.addLoop(*L);		LPM.addLoop(*L);
else		} else
LPM.markLoopAsDeleted(*L);		LPM.markLoopAsDeleted(*L);
};		};

if (MSSA && VerifyMemorySSA)		if (MSSA && VerifyMemorySSA)
MSSA->verifyMemorySSA();		MSSA->verifyMemorySSA();

bool Changed = unswitchLoop(*L, DT, LI, AC, TTI, NonTrivial, UnswitchCB, SE,		bool Changed =
		unswitchLoop(*L, DT, LI, AC, AA, TTI, NonTrivial, UnswitchCB, SE,
MSSAU.hasValue() ? MSSAU.getPointer() : nullptr);		MSSAU.hasValue() ? MSSAU.getPointer() : nullptr);

if (MSSA && VerifyMemorySSA)		if (MSSA && VerifyMemorySSA)
MSSA->verifyMemorySSA();		MSSA->verifyMemorySSA();

// Historically this pass has had issues with the dominator tree so verify it		// Historically this pass has had issues with the dominator tree so verify it
// in asserts builds.		// in asserts builds.
assert(DT.verify(DominatorTree::VerificationLevel::Fast));		assert(DT.verify(DominatorTree::VerificationLevel::Fast));

Show All 18 Lines

llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
		lebedev.ri: Precommit this?
		jaykang10: Sorry... This test file is a copy of `test/Transforms/LoopUnswitch/partial-unswitch.ll` using SimpleLoopUnswitch. The output of SimpleLoopUnswitch is slightly different from that of LoopUnswitch, so I ran the script to generate the assertions. I have checked each test's output. If it is not good to use the script, I will add the assertions manually. Please let me know.
		lebedev.ri: No, I mean just commit this test file as-is, obviously after regenerating the check lines to pass on main.
		jaykang10: Ah, sorry. Let me commit this test separately after checking it again.
; RUN: opt -passes='loop-mssa(unswitch<nontrivial>),verify<loops>' -S < %s | FileCheck %s		; RUN: opt -passes='loop-mssa(unswitch<nontrivial>),verify<loops>' -S < %s | FileCheck %s

declare void @clobber()		declare void @clobber()

define i32 @partial_unswitch_true_successor(i32* %ptr, i32 %N) {		define i32 @partial_unswitch_true_successor(i32* %ptr, i32 %N) {
; CHECK-LABEL: @partial_unswitch_true_successor(		; CHECK-LABEL: @partial_unswitch_true_successor(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[PTR:%.*]], align 4
		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 100
		; CHECK-NEXT: br i1 [[TMP1]], label [[ENTRY_SPLIT_US:%.*]], label [[ENTRY_SPLIT:%.*]]
		; CHECK: entry.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: br label [[NOCLOBBER_US:%.*]]
		; CHECK: noclobber.us:
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_SPLIT_US:%.*]]
		; CHECK: exit.split.us:
		; CHECK-NEXT: br label [[EXIT:%.*]]
		; CHECK: entry.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR:%.*]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]
; CHECK: noclobber:		; CHECK: noclobber:
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_SPLIT:%.*]]
		; CHECK: exit.split:
		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret i32 10		; CHECK-NEXT: ret i32 10
;		;
entry:		entry:
br label %loop.header		br label %loop.header

loop.header:		loop.header:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]		%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]
Show All 15 Lines

exit:		exit:
ret i32 10		ret i32 10
}		}

define i32 @partial_unswitch_false_successor(i32* %ptr, i32 %N) {		define i32 @partial_unswitch_false_successor(i32* %ptr, i32 %N) {
; CHECK-LABEL: @partial_unswitch_false_successor(		; CHECK-LABEL: @partial_unswitch_false_successor(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[PTR:%.*]], align 4
		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 100
		; CHECK-NEXT: br i1 [[TMP1]], label [[ENTRY_SPLIT:%.*]], label [[ENTRY_SPLIT_US:%.*]]
		; CHECK: entry.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: br label [[NOCLOBBER_US:%.*]]
		; CHECK: noclobber.us:
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_SPLIT_US:%.*]]
		; CHECK: exit.split.us:
		; CHECK-NEXT: br label [[EXIT:%.*]]
		; CHECK: entry.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR:%.*]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[CLOBBER:%.*]], label [[NOCLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[CLOBBER:%.*]], label [[NOCLOBBER:%.*]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: noclobber:		; CHECK: noclobber:
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_SPLIT:%.*]]
		; CHECK: exit.split:
		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret i32 10		; CHECK-NEXT: ret i32 10
;		;
entry:		entry:
br label %loop.header		br label %loop.header

loop.header:		loop.header:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]		%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]
Show All 15 Lines

exit:		exit:
ret i32 10		ret i32 10
}		}

define i32 @partial_unswtich_gep_load_icmp(i32** %ptr, i32 %N) {		define i32 @partial_unswtich_gep_load_icmp(i32** %ptr, i32 %N) {
; CHECK-LABEL: @partial_unswtich_gep_load_icmp(		; CHECK-LABEL: @partial_unswtich_gep_load_icmp(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[TMP0:%.*]] = getelementptr i32*, i32** [[PTR:%.*]], i32 1
		; CHECK-NEXT: [[TMP1:%.*]] = load i32*, i32** [[TMP0]], align 8
		; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[TMP1]], align 4
		; CHECK-NEXT: [[TMP3:%.*]] = icmp eq i32 [[TMP2]], 100
		; CHECK-NEXT: br i1 [[TMP3]], label [[ENTRY_SPLIT_US:%.*]], label [[ENTRY_SPLIT:%.*]]
		; CHECK: entry.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: br label [[NOCLOBBER_US:%.*]]
		; CHECK: noclobber.us:
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_SPLIT_US:%.*]]
		; CHECK: exit.split.us:
		; CHECK-NEXT: br label [[EXIT:%.*]]
		; CHECK: entry.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[GEP:%.*]] = getelementptr i32*, i32** [[PTR:%.*]], i32 1		; CHECK-NEXT: [[GEP:%.*]] = getelementptr i32*, i32** [[PTR]], i32 1
; CHECK-NEXT: [[LV_1:%.*]] = load i32*, i32** [[GEP]], align 8		; CHECK-NEXT: [[LV_1:%.*]] = load i32*, i32** [[GEP]], align 8
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[LV_1]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[LV_1]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]
; CHECK: noclobber:		; CHECK: noclobber:
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_SPLIT:%.*]]
		; CHECK: exit.split:
		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret i32 10		; CHECK-NEXT: ret i32 10
;		;
entry:		entry:
br label %loop.header		br label %loop.header

loop.header:		loop.header:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]		%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]
Show All 17 Lines

exit:		exit:
ret i32 10		ret i32 10
}		}

define i32 @partial_unswitch_reduction_phi(i32* %ptr, i32 %N) {		define i32 @partial_unswitch_reduction_phi(i32* %ptr, i32 %N) {
; CHECK-LABEL: @partial_unswitch_reduction_phi(		; CHECK-LABEL: @partial_unswitch_reduction_phi(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[PTR:%.*]], align 4
		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 100
		; CHECK-NEXT: br i1 [[TMP1]], label [[ENTRY_SPLIT:%.*]], label [[ENTRY_SPLIT_US:%.*]]
		; CHECK: entry.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: [[RED_US:%.*]] = phi i32 [ 20, [[ENTRY_SPLIT_US]] ], [ [[RED_NEXT_US:%.*]], [[LOOP_LATCH_US]] ]
		; CHECK-NEXT: br label [[NOCLOBBER_US:%.*]]
		; CHECK: noclobber.us:
		; CHECK-NEXT: [[ADD_10_US:%.*]] = add i32 [[RED_US]], 10
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[RED_NEXT_US]] = phi i32 [ [[ADD_10_US]], [[NOCLOBBER_US]] ]
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_SPLIT_US:%.*]]
		; CHECK: exit.split.us:
		; CHECK-NEXT: [[RED_NEXT_LCSSA_US:%.*]] = phi i32 [ [[RED_NEXT_US]], [[LOOP_LATCH_US]] ]
		; CHECK-NEXT: br label [[EXIT:%.*]]
		; CHECK: entry.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[RED:%.*]] = phi i32 [ 20, [[ENTRY]] ], [ [[RED_NEXT:%.*]], [[LOOP_LATCH]] ]		; CHECK-NEXT: [[RED:%.*]] = phi i32 [ 20, [[ENTRY_SPLIT]] ], [ [[RED_NEXT:%.*]], [[LOOP_LATCH]] ]
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR:%.*]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[CLOBBER:%.*]], label [[NOCLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[CLOBBER:%.*]], label [[NOCLOBBER:%.*]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: [[ADD_5:%.*]] = add i32 [[RED]], 5		; CHECK-NEXT: [[ADD_5:%.*]] = add i32 [[RED]], 5
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: noclobber:		; CHECK: noclobber:
; CHECK-NEXT: [[ADD_10:%.*]] = add i32 [[RED]], 10		; CHECK-NEXT: [[ADD_10:%.*]] = add i32 [[RED]], 10
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[RED_NEXT]] = phi i32 [ [[ADD_5]], [[CLOBBER]] ], [ [[ADD_10]], [[NOCLOBBER]] ]		; CHECK-NEXT: [[RED_NEXT]] = phi i32 [ [[ADD_5]], [[CLOBBER]] ], [ [[ADD_10]], [[NOCLOBBER]] ]
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_SPLIT:%.*]]
; CHECK: exit:		; CHECK: exit.split:
; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi i32 [ [[RED_NEXT]], [[LOOP_LATCH]] ]		; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi i32 [ [[RED_NEXT]], [[LOOP_LATCH]] ]
; CHECK-NEXT: ret i32 [[RED_NEXT_LCSSA]]		; CHECK-NEXT: br label [[EXIT]]
		; CHECK: exit:
		; CHECK-NEXT: [[DOTUS_PHI:%.*]] = phi i32 [ [[RED_NEXT_LCSSA]], [[EXIT_SPLIT]] ], [ [[RED_NEXT_LCSSA_US]], [[EXIT_SPLIT_US]] ]
		; CHECK-NEXT: ret i32 [[DOTUS_PHI]]
;		;
entry:		entry:
br label %loop.header		br label %loop.header

loop.header:		loop.header:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]		%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]
%red = phi i32 [ 20, %entry ], [ %red.next, %loop.latch ]		%red = phi i32 [ 20, %entry ], [ %red.next, %loop.latch ]
%lv = load i32, i32* %ptr		%lv = load i32, i32* %ptr
Show All 20 Lines	exit:
ret i32 %red.next.lcssa		ret i32 %red.next.lcssa
}		}

; Partial unswitching is possible, because the store in %noclobber does not		; Partial unswitching is possible, because the store in %noclobber does not
; alias the load of the condition.		; alias the load of the condition.
define i32 @partial_unswitch_true_successor_noclobber(i32* noalias %ptr.1, i32* noalias %ptr.2, i32 %N) {		define i32 @partial_unswitch_true_successor_noclobber(i32* noalias %ptr.1, i32* noalias %ptr.2, i32 %N) {
; CHECK-LABEL: @partial_unswitch_true_successor_noclobber(		; CHECK-LABEL: @partial_unswitch_true_successor_noclobber(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[PTR_1:%.*]], align 4
		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 100
		; CHECK-NEXT: br i1 [[TMP1]], label [[ENTRY_SPLIT_US:%.*]], label [[ENTRY_SPLIT:%.*]]
		; CHECK: entry.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: [[LV_US:%.*]] = load i32, i32* [[PTR_1]], align 4
		; CHECK-NEXT: br label [[NOCLOBBER_US:%.*]]
		; CHECK: noclobber.us:
		; CHECK-NEXT: [[GEP_1_US:%.*]] = getelementptr i32, i32* [[PTR_2:%.*]], i32 [[IV_US]]
		; CHECK-NEXT: store i32 [[LV_US]], i32* [[GEP_1_US]], align 4
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_SPLIT_US:%.*]]
		; CHECK: exit.split.us:
		; CHECK-NEXT: br label [[EXIT:%.*]]
		; CHECK: entry.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR_1:%.*]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR_1]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]
; CHECK: noclobber:		; CHECK: noclobber:
; CHECK-NEXT: [[GEP_1:%.*]] = getelementptr i32, i32* [[PTR_2:%.*]], i32 [[IV]]		; CHECK-NEXT: [[GEP_1:%.*]] = getelementptr i32, i32* [[PTR_2]], i32 [[IV]]
; CHECK-NEXT: store i32 [[LV]], i32* [[GEP_1]], align 4		; CHECK-NEXT: store i32 [[LV]], i32* [[GEP_1]], align 4
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_SPLIT:%.*]]
		; CHECK: exit.split:
		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret i32 10		; CHECK-NEXT: ret i32 10
;		;
entry:		entry:
br label %loop.header		br label %loop.header

loop.header:		loop.header:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]		%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]
[262 lines not shown]
; because it is already checked in the @partial_unswitch_true_successor test		; because it is already checked in the @partial_unswitch_true_successor test
; case.		; case.
define i32 @partial_unswitch_true_successor_preheader_insertion(i32* %ptr, i32 %N) {		define i32 @partial_unswitch_true_successor_preheader_insertion(i32* %ptr, i32 %N) {
; CHECK-LABEL: @partial_unswitch_true_successor_preheader_insertion(		; CHECK-LABEL: @partial_unswitch_true_successor_preheader_insertion(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EC:%.*]] = icmp ne i32* [[PTR:%.*]], null		; CHECK-NEXT: [[EC:%.*]] = icmp ne i32* [[PTR:%.*]], null
; CHECK-NEXT: br i1 [[EC]], label [[LOOP_PH:%.*]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[EC]], label [[LOOP_PH:%.*]], label [[EXIT:%.*]]
; CHECK: loop.ph:		; CHECK: loop.ph:
		; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[PTR]], align 4
		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 100
		; CHECK-NEXT: br i1 [[TMP1]], label [[LOOP_PH_SPLIT_US:%.*]], label [[LOOP_PH_SPLIT:%.*]]
		; CHECK: loop.ph.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[LOOP_PH_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: br label [[NOCLOBBER_US:%.*]]
		; CHECK: noclobber.us:
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_LOOPEXIT_SPLIT_US:%.*]]
		; CHECK: exit.loopexit.split.us:
		; CHECK-NEXT: br label [[EXIT_LOOPEXIT:%.*]]
		; CHECK: loop.ph.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[LOOP_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[LOOP_PH_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]
; CHECK: noclobber:		; CHECK: noclobber:
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_LOOPEXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_LOOPEXIT_SPLIT:%.*]]
		; CHECK: exit.loopexit.split:
		; CHECK-NEXT: br label [[EXIT_LOOPEXIT]]
; CHECK: exit.loopexit:		; CHECK: exit.loopexit:
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret i32 10		; CHECK-NEXT: ret i32 10
;		;

entry:		entry:
%ec = icmp ne i32* %ptr, null		%ec = icmp ne i32* %ptr, null
[26 lines not shown]

; Make sure the duplicated instructions are hoisted just before the branch of		; Make sure the duplicated instructions are hoisted just before the branch of
; the preheader. Do not check the unswitched code, because it is already checked		; the preheader. Do not check the unswitched code, because it is already checked
; in the @partial_unswitch_true_successor test case		; in the @partial_unswitch_true_successor test case
define i32 @partial_unswitch_true_successor_insert_point(i32* %ptr, i32 %N) {		define i32 @partial_unswitch_true_successor_insert_point(i32* %ptr, i32 %N) {
; CHECK-LABEL: @partial_unswitch_true_successor_insert_point(		; CHECK-LABEL: @partial_unswitch_true_successor_insert_point(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
		; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[PTR:%.*]], align 4
		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 100
		; CHECK-NEXT: br i1 [[TMP1]], label [[ENTRY_SPLIT_US:%.*]], label [[ENTRY_SPLIT:%.*]]
		; CHECK: entry.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: br label [[NOCLOBBER_US:%.*]]
		; CHECK: noclobber.us:
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_SPLIT_US:%.*]]
		; CHECK: exit.split.us:
		; CHECK-NEXT: br label [[EXIT:%.*]]
		; CHECK: entry.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR:%.*]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]
; CHECK: noclobber:		; CHECK: noclobber:
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_SPLIT:%.*]]
		; CHECK: exit.split:
		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret i32 10		; CHECK-NEXT: ret i32 10
;		;
entry:		entry:
call void @clobber()		call void @clobber()
br label %loop.header		br label %loop.header

loop.header:		loop.header:
[19 lines not shown]
}		}

; Make sure invariant instructions in the loop are also hoisted to the preheader.		; Make sure invariant instructions in the loop are also hoisted to the preheader.
; Do not check the unswitched code, because it is already checked in the		; Do not check the unswitched code, because it is already checked in the
; @partial_unswitch_true_successor test case		; @partial_unswitch_true_successor test case
define i32 @partial_unswitch_true_successor_hoist_invariant(i32* %ptr, i32 %N) {		define i32 @partial_unswitch_true_successor_hoist_invariant(i32* %ptr, i32 %N) {
; CHECK-LABEL: @partial_unswitch_true_successor_hoist_invariant(		; CHECK-LABEL: @partial_unswitch_true_successor_hoist_invariant(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[TMP0:%.*]] = getelementptr i32, i32* [[PTR:%.*]], i64 1
		; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP0]], align 4
		; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i32 [[TMP1]], 100
		; CHECK-NEXT: br i1 [[TMP2]], label [[ENTRY_SPLIT_US:%.*]], label [[ENTRY_SPLIT:%.*]]
		; CHECK: entry.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: br label [[NOCLOBBER_US:%.*]]
		; CHECK: noclobber.us:
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_SPLIT_US:%.*]]
		; CHECK: exit.split.us:
		; CHECK-NEXT: br label [[EXIT:%.*]]
		; CHECK: entry.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[GEP:%.*]] = getelementptr i32, i32* [[PTR:%.*]], i64 1		; CHECK-NEXT: [[GEP:%.*]] = getelementptr i32, i32* [[PTR]], i64 1
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[GEP]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[GEP]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[NOCLOBBER:%.*]], label [[CLOBBER:%.*]]
; CHECK: noclobber:		; CHECK: noclobber:
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_SPLIT:%.*]]
		; CHECK: exit.split:
		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret i32 10		; CHECK-NEXT: ret i32 10
;		;
entry:		entry:
br label %loop.header		br label %loop.header

loop.header:		loop.header:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]		%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]
[243 lines not shown]

exit:		exit:
ret i32 10		ret i32 10
}		}

define i32 @partial_unswitch_true_to_latch(i32* %ptr, i32 %N) {		define i32 @partial_unswitch_true_to_latch(i32* %ptr, i32 %N) {
; CHECK-LABEL: @partial_unswitch_true_to_latch(		; CHECK-LABEL: @partial_unswitch_true_to_latch(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[PTR:%.*]], align 4
		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[TMP0]], 100
		; CHECK-NEXT: br i1 [[TMP1]], label [[ENTRY_SPLIT_US:%.*]], label [[ENTRY_SPLIT:%.*]]
		; CHECK: entry.split.us:
		; CHECK-NEXT: br label [[LOOP_HEADER_US:%.*]]
		; CHECK: loop.header.us:
		; CHECK-NEXT: [[IV_US:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT_US]] ], [ [[IV_NEXT_US:%.*]], [[LOOP_LATCH_US:%.*]] ]
		; CHECK-NEXT: br label [[LOOP_LATCH_US]]
		; CHECK: loop.latch.us:
		; CHECK-NEXT: [[C_US:%.*]] = icmp ult i32 [[IV_US]], [[N:%.*]]
		; CHECK-NEXT: [[IV_NEXT_US]] = add i32 [[IV_US]], 1
		; CHECK-NEXT: br i1 [[C_US]], label [[LOOP_HEADER_US]], label [[EXIT_SPLIT_US:%.*]]
		; CHECK: exit.split.us:
		; CHECK-NEXT: br label [[EXIT:%.*]]
		; CHECK: entry.split:
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY_SPLIT]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR:%.*]], align 4		; CHECK-NEXT: [[LV:%.*]] = load i32, i32* [[PTR]], align 4
; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100		; CHECK-NEXT: [[SC:%.*]] = icmp eq i32 [[LV]], 100
; CHECK-NEXT: br i1 [[SC]], label [[LOOP_LATCH]], label [[CLOBBER:%.*]]		; CHECK-NEXT: br i1 [[SC]], label [[LOOP_LATCH]], label [[CLOBBER:%.*]]
; CHECK: clobber:		; CHECK: clobber:
; CHECK-NEXT: call void @clobber()		; CHECK-NEXT: call void @clobber()
; CHECK-NEXT: br label [[LOOP_LATCH]]		; CHECK-NEXT: br label [[LOOP_LATCH]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N:%.*]]		; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], [[N]]
; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C]], label [[LOOP_HEADER]], label [[EXIT_SPLIT:%.*]]
		; CHECK: exit.split:
		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret i32 10		; CHECK-NEXT: ret i32 10
;		;
entry:		entry:
br label %loop.header		br label %loop.header

loop.header:		loop.header:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]		%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop.latch ]
%lv = load i32, i32* %ptr		%lv = load i32, i32* %ptr
%sc = icmp eq i32 %lv, 100		%sc = icmp eq i32 %lv, 100
br i1 %sc, label %loop.latch, label %clobber		br i1 %sc, label %loop.latch, label %clobber

clobber:		clobber:
call void @clobber()		call void @clobber()
br label %loop.latch		br label %loop.latch

loop.latch:		loop.latch:
%c = icmp ult i32 %iv, %N		%c = icmp ult i32 %iv, %N
%iv.next = add i32 %iv, 1		%iv.next = add i32 %iv, 1
br i1 %c, label %loop.header, label %exit		br i1 %c, label %loop.header, label %exit

exit:		exit:
ret i32 10		ret i32 10
}		}

		define void @test(i32 %0, i32 %1, i32 %2, ...) {
		fhahn: Could you add a brief comment and a more descriptive name for the test?
		jaykang10: Yep, I will add them.
		; CHECK-LABEL: @test(
		; CHECK-NEXT: [[TMP4:%.*]] = icmp ne i32 [[TMP1:%.*]], 0
		; CHECK-NEXT: [[TMP5:%.*]] = icmp ne i32 [[TMP1]], 0
		; CHECK-NEXT: br i1 [[TMP5]], label [[DOTSPLIT:%.*]], label [[DOTSPLIT_US:%.*]]
		; CHECK: .split.us:
		; CHECK-NEXT: br label [[TMP6:%.*]]
		; CHECK: 6:
		; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* null, align 16
		; CHECK-NEXT: [[TMP8:%.*]] = icmp ult i32 [[TMP7]], 41
		; CHECK-NEXT: br i1 [[TMP8]], label [[TMP9:%.*]], label [[TMP10:%.*]]
		; CHECK: 9:
		; CHECK-NEXT: store i32 undef, i32* null, align 16
		; CHECK-NEXT: br label [[TMP10]]
		; CHECK: 10:
		; CHECK-NEXT: br label [[DOTSPLIT2_US:%.*]]
		; CHECK: .split2.us:
		; CHECK-NEXT: [[TMP11:%.*]] = phi i32 [ 1, [[TMP10]] ]
		; CHECK-NEXT: br label [[TMP18:%.*]]
		; CHECK: .split:
		; CHECK-NEXT: br label [[TMP12:%.*]]
		; CHECK: 12:
		; CHECK-NEXT: [[TMP13:%.*]] = load i32, i32* null, align 16
		; CHECK-NEXT: [[TMP14:%.*]] = icmp ult i32 [[TMP13]], 41
		; CHECK-NEXT: br i1 [[TMP14]], label [[TMP15:%.*]], label [[TMP16:%.*]]
		; CHECK: 15:
		; CHECK-NEXT: store i32 undef, i32* null, align 16
		; CHECK-NEXT: br label [[TMP16]]
		; CHECK: 16:
		; CHECK-NEXT: br i1 [[TMP4]], label [[TMP12]], label [[DOTSPLIT2:%.*]]
		; CHECK: .split2:
		; CHECK-NEXT: [[TMP17:%.*]] = phi i32 [ 1, [[TMP16]] ]
		; CHECK-NEXT: br label [[TMP18]]
		; CHECK: 18:
		; CHECK-NEXT: [[DOTUS_PHI:%.*]] = phi i32 [ [[TMP17]], [[DOTSPLIT2]] ], [ [[TMP11]], [[DOTSPLIT2_US]] ]
		; CHECK-NEXT: ret void
		;
		%4 = icmp ne i32 %1, 0
		br label %5

		fhahn: It would be great if you could use descriptive labels in the test, both for basic blocks and values, to make it easier to read in case people need to take a look in the future.
		jaykang10: Yep, I will update it.
		5:
		%6 = load i32, i32* null, align 16
		%7 = icmp ult i32 %6, 41
		br i1 %7, label %8, label %9

		8:
		store i32 undef, i32* null, align 16
		fhahn: Please avoid using `null` as a pointer, as this is UB.
		jaykang10: Yep, I will update it.
		br label %9

		9:
		br i1 %4, label %5, label %10

		10:
		%11 = phi i32 [ 1, %9 ]
		ret void
		}
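
The three inline requests above (a descriptive test name with a brief comment, named blocks and values, and no access through `null`) could be addressed along the following lines. This is only a sketch, not the version that was committed: the function name, the parameter names `%a`, `%b` and `%ptr`, and all block and value names are hypothetical, the stored value is chosen arbitrarily, and the CHECK lines would need to be regenerated. It uses the typed-pointer syntax (`i32*`) that appears throughout this diff; newer LLVM releases spell the type `ptr`.

```llvm
; Hypothetical rewrite of @test: the exit condition %exit.cond is loop
; invariant, while the condition in %loop.header depends on a load that the
; store in %if.then may change.
define void @partial_unswitch_invariant_exit_cond(i32 %a, i32 %b, i32* %ptr) {
entry:
  %exit.cond = icmp ne i32 %b, 0
  br label %loop.header

loop.header:
  ; Load through an argument pointer instead of null.
  %lv = load i32, i32* %ptr, align 4
  %sc = icmp ult i32 %lv, 41
  br i1 %sc, label %if.then, label %loop.latch

if.then:
  store i32 %a, i32* %ptr, align 4
  br label %loop.latch

loop.latch:
  br i1 %exit.cond, label %loop.header, label %exit

exit:
  %exit.val = phi i32 [ 1, %loop.latch ]
  ret void
}
```

Passing the pointer as an argument keeps the control flow of the original test while removing the load and store through `null` that the reviewer flagged.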


Diff 338138

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp

llvm/test/Transforms/SimpleLoopUnswitch/partial-unswitch.ll
