This is an archive of the discontinued LLVM Phabricator instance.

Differential D116327

[Coroutines] Enhance symmetric transfer for constant CmpInst
ClosedPublic

Authored by ChuanqiXu on Dec 27 2021, 9:17 PM.

Details

Reviewers

rjmccall
lxfind
aeubanks
junparser

Commits

rG403772ff1ce5: [Coroutines] Enhance symmetric transfer for constant CmpInst

Summary

This fixes bug52896: https://github.com/llvm/llvm-project/issues/52896.

In short, some symmetric transfer optimization opportunities were invalidated because we deleted some inlined optimization passes in https://github.com/llvm/llvm-project/commit/822b92aae439c4ba2946980c8a27bd2c8a62d90c#diff-8cf21bda84c593733aa099f89fe7d197fd83203c6bc4f6fbdcd94f8bb4256d23L1138-L1147 . This can cause a stack overflow in situations that the design of coroutines is supposed to avoid: https://godbolt.org/z/E55h9Pv9f

This patch tries to fix this by folding constant CmpInst instructions, a transformation that the deleted passes used to perform.

Diff Detail

Unit Tests: Failed

	Time	Test
	8,380 ms	x64 debian > libarcher.task::task_late_fulfill.c

Event Timeline

ChuanqiXu created this revision. Dec 27 2021, 9:17 PM

Herald added a subscriber: hiraditya. Dec 27 2021, 9:17 PM

ChuanqiXu requested review of this revision. Dec 27 2021, 9:17 PM

Herald added a subscriber: llvm-commits. Dec 27 2021, 9:17 PM

Harbormaster completed remote builds in B140771: Diff 396364. Dec 27 2021, 10:08 PM

ChuanqiXu mentioned this in D116330: [Coroutines] Handle lifetime markers, bitcast and unused instruciton for symmetric transfer. Dec 28 2021, 12:43 AM

ChuanqiXu added a child revision: D116330: [Coroutines] Handle lifetime markers, bitcast and unused instruciton for symmetric transfer.

ChuanqiXu added a reviewer: junparser. Jan 5 2022, 11:17 PM

@rjmccall @lxfind @junparser gentle ping~

This generally LGTM, with some comments.

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1227	Instead of simplifying each instruction piece by piece, could you use SimplifyInstructionsInBlock to handle each basic block at once?

ChuanqiXu added inline comments. Jan 10 2022, 8:23 PM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1227	What this function does is slightly different from `SimplifyInstructionsInBlock`. The function tries to simplify a sequence of terminators leading to a ret, so it is an inter-BB optimization, which does not seem to be something `SimplifyInstructionsInBlock` can handle. This function is also more efficient than running `SimplifyInstructionsInBlock` on each basic block.
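
The cross-block walk described in this reply can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `Block`, `Kind`, `makeRet`, `makeBr`, `makeCondBrEq` and `leadsToRet` are hypothetical names invented for the illustration, not LLVM types or functions, and the `known` map merely plays the role that `ResolvedValues` plays in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for IR terminators; none of these types are LLVM's.
enum class Kind { Ret, Br, CondBrEq };

struct Block {
  Kind kind = Kind::Ret;
  int succ[2] = {-1, -1}; // Br uses succ[0]; CondBrEq uses both
  std::string lhs;        // CondBrEq: name of the compared value
  int rhs = 0;            // CondBrEq: literal it is compared against
};

Block makeRet() { return Block{}; }

Block makeBr(int target) {
  Block b;
  b.kind = Kind::Br;
  b.succ[0] = target;
  return b;
}

// Models "if (lhs == rhs) goto ifTrue; else goto ifFalse;".
Block makeCondBrEq(const std::string &lhs, int rhs, int ifTrue, int ifFalse) {
  Block b;
  b.kind = Kind::CondBrEq;
  b.lhs = lhs;
  b.rhs = rhs;
  b.succ[0] = ifTrue;
  b.succ[1] = ifFalse;
  return b;
}

// Follow terminators from `start` across blocks and report whether they
// provably reach a Ret. `known` maps value names to their known constants.
bool leadsToRet(const std::vector<Block> &blocks, int start,
                const std::map<std::string, int> &known) {
  assert(start >= 0 && static_cast<std::size_t>(start) < blocks.size());
  int cur = start;
  // Bound the walk so that a cycle of branches cannot loop forever.
  for (std::size_t steps = 0; steps <= blocks.size(); ++steps) {
    const Block &b = blocks[cur];
    if (b.kind == Kind::Ret)
      return true;
    if (b.kind == Kind::Br) {
      cur = b.succ[0];
      continue;
    }
    // CondBrEq: the comparison can only be folded if the value is known.
    auto it = known.find(b.lhs);
    if (it == known.end())
      return false;
    cur = (it->second == b.rhs) ? b.succ[0] : b.succ[1];
  }
  return false;
}
```

The point of the model is that the decision is made along a path through several blocks, which a per-block simplifier does not see.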

rjmccall added inline comments. Jan 10 2022, 9:00 PM

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1244	I liked the old comment that was here that explained why we bother handling this case.
1247	Should we do the same lookup in `ResolvedValues` here?
	I would suggest having a general helper that you can put at the top of this function:
	auto tryResolveConstant = [&](Value *V) -> Constant * {
	  auto it = ResolvedValues.find(V);
	  if (it != ResolvedValues.end())
	    V = it->second;
	  return dyn_cast<Constant>(V);
	};
	You don't need to check specifically for `ConstantInt`; LLVM can constant-fold some conditions on other kinds of constants.

Address comments.

ChuanqiXu marked 2 inline comments as done. Jan 10 2022, 11:07 PM

ChuanqiXu added inline comments.

llvm/lib/Transforms/Coroutines/CoroSplit.cpp
1247	> Should we do the same lookup in ResolvedValues here?
	According to the comment and the details of the transformation, the second operand should be a literal constant, so we shouldn't call TryResolveConstant for it. I've added this to the comment.
	> I would suggest having a general helper that you can put at the top of this function:
	Done. Thanks for the suggestion!
	> You don't need to check specifically for ConstantInt; LLVM can constant-fold some conditions on other kinds of constants.
	If we check for `Constant` instead of `ConstantInt`, then we must write `auto *Cond = dyn_cast<ConstantInt>(TryResolveConstant(...));` on line 1277, since the argument type of `findCaseValue` is `ConstantInt`, which looks odd to me. We wouldn't lose anything by checking for `ConstantInt` only, so I still chose to check specifically for ConstantInt in this revision.
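
The agreed approach (resolve the first operand through the map, treat the second operand as a literal, fold only when both are constants) can be sketched outside LLVM as follows. Everything here is an assumption made for illustration: `Value`, `ResolvedMap`, `Fold`, `tryResolveConstant` and `foldEq` are hypothetical stand-ins, and the real patch works on `llvm::Value` and `ConstantInt` instead.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for an IR value: either a known integer constant
// or something whose value is not known yet (for example a PHI).
struct Value {
  bool isConstant;
  long value;
};

// Plays the role of ResolvedValues: maps a value to what it resolves to.
using ResolvedMap = std::unordered_map<const Value *, const Value *>;

// Returns the constant that `v` resolves to, or nullptr if it is not known.
const Value *tryResolveConstant(const ResolvedMap &resolved, const Value *v) {
  auto it = resolved.find(v);
  if (it != resolved.end())
    v = it->second;
  return v->isConstant ? v : nullptr;
}

enum class Fold { Unknown, False, True };

// Fold "lhs == rhs". Only lhs is looked up in the map; rhs is expected to be
// a literal constant already, mirroring the reasoning in the reply above.
Fold foldEq(const ResolvedMap &resolved, const Value *lhs, const Value *rhs) {
  const Value *l = tryResolveConstant(resolved, lhs);
  if (!l || !rhs->isConstant)
    return Fold::Unknown;
  return l->value == rhs->value ? Fold::True : Fold::False;
}
```

When the result is `Fold::Unknown` the caller would give up, which corresponds to the early `return false` paths in the revised function.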

Harbormaster completed remote builds in B142589: Diff 398848. Jan 10 2022, 11:36 PM

Hmm, alright. I really don't love that high-level semantics are dependent on exactly the right transforms happening in exactly the right way, but I guess that's what the C++ committee has stuck us with.

This revision is now accepted and ready to land. Jan 11 2022, 11:01 AM

In reply to D116327#3234969 by @rjmccall:

Yeah, agreed. BTW, symmetric transfer is not part of the C++ standard, just as tail calls are not in the standard either. It is just a compiler optimization that end users are familiar with. So it has become something of a de facto standard, and users would find it very odd if the compiler failed to do it.

This revision was landed with ongoing or failed builds. Jan 11 2022, 6:15 PM

Closed by commit rG403772ff1ce5: [Coroutines] Enhance symmetric transfer for constant CmpInst (authored by ChuanqiXu).

This revision was automatically updated to reflect the committed changes.

ChuanqiXu added a commit: rG403772ff1ce5: [Coroutines] Enhance symmetric transfer for constant CmpInst.

ChuanqiXu mentioned this in rG22225cc5e665: [Coroutines] Handle lifetime markers, bitcast and unused instruciton for…. Jan 12 2022, 12:00 AM

@ChuanqiXu hi, with this patch the generated LLVM IR snippet looks like:

%_Z5task0v.Frame = type { void (%_Z5task0v.Frame*)*, void (%_Z5task0v.Frame*)*, %struct.TaskPromise, i1 }

(see https://godbolt.org/z/b3s9a4xnW - it's the same example as in the patch's header, but with -S -emit-llvm).

In the following patch you add a regression test, clang/test/CodeGenCoroutines/coro-elide.cpp, which checks the opposite, which is a bit confusing:

// CHECK: %_Z5task1v.Frame = type {{.*}}%_Z5task0v.Frame

Could you please tell me whether it's a typo or intentional? If it's intentional, I'm keen to know where I could learn more about the coroutine frame jump, thanks :)

In reply to D116327#3243349 by @sidorovd:

Yeah, it is intentional. I added two regression tests in the following commit, but only one is related to this revision; the other one, which you mentioned, is related to another coroutine optimization called Coroutine Elision. You can find the original design in: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0981r0.html. For the implementation details, you might need to take a look at the CoroElide pass. For the coroutine intrinsics, it is good to visit https://llvm.org/docs/Coroutines.html. That document is not complete yet. For example, I just found that it lacks an ABI document...

In reply to D116327#3247285 by @ChuanqiXu:

So the problem I'm seeing in our downstream (which might be behind) is that the _Z5task1v.Frame is:

%_Z5task1v.Frame = type { void (%_Z5task1v.Frame*)*, void (%_Z5task1v.Frame*)*, %"struct.Task::promise_type", i2 }

Note that this does NOT match the 'CHECK' line.

HOWEVER, in upstream, I see:
%_Z5task1v.Frame = type { void (%_Z5task1v.Frame*)*, void (%_Z5task1v.Frame*)*, %"struct.Task::promise_type", %_Z5task0v.Frame, i2 }

Note the %_Z5task0v.Frame before the 'i2'.

Do you perhaps have any hints as to where that could have gone?

In reply to D116327#3273168 by @erichkeane:

Are you talking about https://github.com/llvm/llvm-project/commit/bf5f2354fa6e3f31a1acea75a229fee54359e279#diff-0b5d68fb7c4bb0c5aad6c883444d80715cf6879e3d27e24e38c22a61c6971730?

This is a regression test for an optimization called CoroElide, which aims to eliminate the heap allocation.

For example, without the optimization, calling task1 would call operator new to allocate task1v.Frame and call operator new again to allocate task0v.Frame, so there are 2 heap allocations. With the optimization, calling task1 calls operator new only once, so we reduce the number of heap allocations.

If this test fails, it implies that the optimization failed in this case. Hmm, since it fails downstream, I am not sure how I could help you. Any thoughts?

BTW, if coroutines are not important to you, I think you could delete the test downstream. That shouldn't harm the semantics; it is just an optimization and doesn't affect the semantics in the standard. (The idea of CoroElide comes from a language proposal, but it doesn't show up in the standard. I guess the reason may be that the standard doesn't care about optimizations like this one.)
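
The allocation counts in this explanation can be mimicked with a toy model of the two frame layouts. This is a hedged sketch, not what the compiler generates: `Task0Frame`, `Task1FrameSeparate`, `Task1FrameElided`, `allocateFrame` and the counter are hypothetical names, and real coroutine frames are laid out by LLVM rather than written by hand. The elided layout mirrors the upstream `%_Z5task1v.Frame` type quoted earlier, which contains `%_Z5task0v.Frame` as a member.

```cpp
#include <cassert>
#include <memory>

// Counts calls to allocateFrame, standing in for calls to operator new.
static int heapAllocations = 0;

template <typename T> std::unique_ptr<T> allocateFrame() {
  ++heapAllocations;
  return std::unique_ptr<T>(new T());
}

// Toy frame of the inner coroutine (task0).
struct Task0Frame {
  int resumeIndex = 0;
};

// Without elision: the outer frame only refers to a separately allocated
// inner frame.
struct Task1FrameSeparate {
  std::unique_ptr<Task0Frame> callee;
  int resumeIndex = 0;
};

// With elision: the inner frame is stored inside the outer frame.
struct Task1FrameElided {
  Task0Frame callee;
  int resumeIndex = 0;
};

int allocationsWithoutElision() {
  heapAllocations = 0;
  auto outer = allocateFrame<Task1FrameSeparate>();
  outer->callee = allocateFrame<Task0Frame>();
  return heapAllocations;
}

int allocationsWithElision() {
  heapAllocations = 0;
  auto outer = allocateFrame<Task1FrameElided>();
  return heapAllocations;
}
```

In this model the first function performs two allocations and the second performs one, which is the difference the `CHECK` line in coro-elide.cpp is looking for in the frame type.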

In reply to D116327#3274414 by @ChuanqiXu:

Yep, that's the file.

I'm just shocked to see a CFE test that depends on optimization/does optimization checks like this. This is pretty far from typical, and of course causes confusion. I presume one of our optimizations is working a bit differently from the community's (another reason we typically don't have lit tests like this in the CFE).

We LIKE to keep as many of the tests as possible downstream, as it helps with correctness, but I'm beginning to doubt the validity of this test; it seems it'll be quite fragile.

In reply to D116327#3275975 by @erichkeane:

Yeah, before I created the file, I searched for O2/O3 in the clang/test/CodeGen* directories and found many results, so I felt it might not be bad to have a clang test depend directly on middle-end optimizations. The best practice may be to check the pattern generated in the frontend and to check the transformation of that pattern in the middle end. I would like to do this, but I am a little busy now and I am going to take a vacation next week, so it might have to wait a while.

Revision Contents

Path	Size
llvm/lib/Transforms/Coroutines/CoroSplit.cpp	76 lines
llvm/test/Transforms/Coroutines/coro-split-musttail4.ll	4 lines

Diff 396364

llvm/lib/Transforms/Coroutines/CoroSplit.cpp

	Show All 23 Lines
	#include "llvm/ADT/DenseMap.h"			#include "llvm/ADT/DenseMap.h"
	#include "llvm/ADT/SmallPtrSet.h"			#include "llvm/ADT/SmallPtrSet.h"
	#include "llvm/ADT/SmallVector.h"			#include "llvm/ADT/SmallVector.h"
	#include "llvm/ADT/StringRef.h"			#include "llvm/ADT/StringRef.h"
	#include "llvm/ADT/Twine.h"			#include "llvm/ADT/Twine.h"
	#include "llvm/Analysis/CFG.h"			#include "llvm/Analysis/CFG.h"
	#include "llvm/Analysis/CallGraph.h"			#include "llvm/Analysis/CallGraph.h"
	#include "llvm/Analysis/CallGraphSCCPass.h"			#include "llvm/Analysis/CallGraphSCCPass.h"
				#include "llvm/Analysis/ConstantFolding.h"
	#include "llvm/Analysis/LazyCallGraph.h"			#include "llvm/Analysis/LazyCallGraph.h"
	#include "llvm/IR/Argument.h"			#include "llvm/IR/Argument.h"
	#include "llvm/IR/Attributes.h"			#include "llvm/IR/Attributes.h"
	#include "llvm/IR/BasicBlock.h"			#include "llvm/IR/BasicBlock.h"
	#include "llvm/IR/CFG.h"			#include "llvm/IR/CFG.h"
	#include "llvm/IR/CallingConv.h"			#include "llvm/IR/CallingConv.h"
	#include "llvm/IR/Constants.h"			#include "llvm/IR/Constants.h"
	#include "llvm/IR/DataLayout.h"			#include "llvm/IR/DataLayout.h"
	▲ Show 20 Lines • Show All 1,152 Lines • ▼ Show 20 Lines
	}			}

	// Replace a sequence of branches leading to a ret, with a clone of a ret			// Replace a sequence of branches leading to a ret, with a clone of a ret
	// instruction. Suspend instruction represented by a switch, track the PHI			// instruction. Suspend instruction represented by a switch, track the PHI
	// values and select the correct case successor when possible.			// values and select the correct case successor when possible.
	static bool simplifyTerminatorLeadingToRet(Instruction *InitialInst) {			static bool simplifyTerminatorLeadingToRet(Instruction *InitialInst) {
	DenseMap<Value *, Value *> ResolvedValues;			DenseMap<Value *, Value *> ResolvedValues;
	BasicBlock *UnconditionalSucc = nullptr;			BasicBlock *UnconditionalSucc = nullptr;
				assert(InitialInst->getModule());
				const DataLayout &DL = InitialInst->getModule()->getDataLayout();

	Instruction *I = InitialInst;			Instruction *I = InitialInst;
	while (I->isTerminator() ||			while (I->isTerminator() ||
	(isa<CmpInst>(I) && I->getNextNode()->isTerminator())) {			(isa<CmpInst>(I) && I->getNextNode()->isTerminator())) {
	if (isa<ReturnInst>(I)) {			if (isa<ReturnInst>(I)) {
	if (I != InitialInst) {			if (I != InitialInst) {
	// If InitialInst is an unconditional branch,			// If InitialInst is an unconditional branch,
	// remove PHI values that come from basic block of InitialInst			// remove PHI values that come from basic block of InitialInst
	if (UnconditionalSucc)			if (UnconditionalSucc)
	UnconditionalSucc->removePredecessor(InitialInst->getParent(), true);			UnconditionalSucc->removePredecessor(InitialInst->getParent(), true);
	ReplaceInstWithInst(InitialInst, I->clone());			ReplaceInstWithInst(InitialInst, I->clone());
	}			}
	return true;			return true;
	}			}
	if (auto *BR = dyn_cast<BranchInst>(I)) {			if (auto *BR = dyn_cast<BranchInst>(I)) {
	if (BR->isUnconditional()) {			if (BR->isUnconditional()) {
	BasicBlock *BB = BR->getSuccessor(0);			BasicBlock *Succ = BR->getSuccessor(0);
	if (I == InitialInst)			if (I == InitialInst)
	UnconditionalSucc = BB;			UnconditionalSucc = Succ;
	scanPHIsAndUpdateValueMap(I, BB, ResolvedValues);			scanPHIsAndUpdateValueMap(I, Succ, ResolvedValues);
	I = BB->getFirstNonPHIOrDbgOrLifetime();			I = Succ->getFirstNonPHIOrDbgOrLifetime();
				continue;
				}

				BasicBlock *BB = BR->getParent();
				// Handle the case the condition of the conditional branch is constant.
				// e.g.,
				//
				// br i1 false, label %cleanup, label %CoroEnd
				//
				// It is possible during the transformation. We could continue the
				// simplifying in this case.
				if (ConstantFoldTerminator(BB, /*DeleteDeadConditions=*/true)) {
				// Handle this branch in next iteration.
				I = BB->getTerminator();
	continue;			continue;
	}			}
	} else if (auto *CondCmp = dyn_cast<CmpInst>(I)) {			} else if (auto *CondCmp = dyn_cast<CmpInst>(I)) {
	auto *BR = dyn_cast<BranchInst>(I->getNextNode());			auto *BR = dyn_cast<BranchInst>(I->getNextNode());
	if (BR && BR->isConditional() && CondCmp == BR->getCondition()) {			if (!BR || !BR->isConditional() || CondCmp != BR->getCondition())
	// If the case number of suspended switch instruction is reduced to			return false;
	// 1, then it is simplified to CmpInst in llvm::ConstantFoldTerminator.
	// And the comparsion looks like : %cond = icmp eq i8 %V, constant.			auto *CondConst = dyn_cast<ConstantInt>(CondCmp->getOperand(1));
	ConstantInt *CondConst = dyn_cast<ConstantInt>(CondCmp->getOperand(1));			if (!CondConst)
	if (CondConst && CondCmp->getPredicate() == CmpInst::ICMP_EQ) {			return false;

	Value *V = CondCmp->getOperand(0);			Value *V = CondCmp->getOperand(0);
	auto it = ResolvedValues.find(V);			auto it = ResolvedValues.find(V);
	if (it != ResolvedValues.end())			if (it != ResolvedValues.end())
	V = it->second;			V = it->second;

	if (ConstantInt *Cond0 = dyn_cast<ConstantInt>(V)) {			auto *Cond0 = dyn_cast<ConstantInt>(V);
	BasicBlock *BB = Cond0->equalsInt(CondConst->getZExtValue())			if (!Cond0)
	? BR->getSuccessor(0)			return false;
	: BR->getSuccessor(1);
	scanPHIsAndUpdateValueMap(I, BB, ResolvedValues);			// Both operands of the CmpInst are Constant. So that we could evaluate
	I = BB->getFirstNonPHIOrDbgOrLifetime();			// it immediately to get the destination.
				auto *ConstResult =
				dyn_cast_or_null<ConstantInt>(ConstantFoldCompareInstOperands(
				CondCmp->getPredicate(), Cond0, CondConst, DL));
				if (!ConstResult)
				return false;

				CondCmp->replaceAllUsesWith(ConstResult);
				CondCmp->eraseFromParent();

				// Handle this branch in next iteration.
				I = BR;
	continue;			continue;
	}
	}
	}
	} else if (auto *SI = dyn_cast<SwitchInst>(I)) {			} else if (auto *SI = dyn_cast<SwitchInst>(I)) {
	Value *V = SI->getCondition();			Value *V = SI->getCondition();
	auto it = ResolvedValues.find(V);			auto it = ResolvedValues.find(V);
	if (it != ResolvedValues.end())			if (it != ResolvedValues.end())
	V = it->second;			V = it->second;
	if (ConstantInt *Cond = dyn_cast<ConstantInt>(V)) {			if (ConstantInt *Cond = dyn_cast<ConstantInt>(V)) {
	BasicBlock *BB = SI->findCaseValue(Cond)->getCaseSuccessor();			BasicBlock *BB = SI->findCaseValue(Cond)->getCaseSuccessor();
	scanPHIsAndUpdateValueMap(I, BB, ResolvedValues);			scanPHIsAndUpdateValueMap(I, BB, ResolvedValues);
	▲ Show 20 Lines • Show All 1,052 Lines • Show Last 20 Lines

llvm/test/Transforms/Coroutines/coro-split-musttail4.ll

Show All 36 Lines	coro.free:
call void @delete(i8* nonnull %free.handle) #2		call void @delete(i8* nonnull %free.handle) #2
br label %coro.end		br label %coro.end

coro.end:		coro.end:
call i1 @llvm.coro.end(i8* null, i1 false)		call i1 @llvm.coro.end(i8* null, i1 false)
ret void		ret void
}		}

; FIXME: The fakerresume1 here should be musttail call.
; CHECK-LABEL: @f.resume(		; CHECK-LABEL: @f.resume(
; CHECK-NOT: musttail call fastcc void @fakeresume1(		; CHECK: musttail call fastcc void @fakeresume1(
		; CHECK-NEXT: ret void

declare token @llvm.coro.id(i32, i8* readnone, i8* nocapture readonly, i8*) #1		declare token @llvm.coro.id(i32, i8* readnone, i8* nocapture readonly, i8*) #1
declare i1 @llvm.coro.alloc(token) #2		declare i1 @llvm.coro.alloc(token) #2
declare i64 @llvm.coro.size.i64() #3		declare i64 @llvm.coro.size.i64() #3
declare i8* @llvm.coro.begin(token, i8* writeonly) #2		declare i8* @llvm.coro.begin(token, i8* writeonly) #2
declare token @llvm.coro.save(i8*) #2		declare token @llvm.coro.save(i8*) #2
declare i8* @llvm.coro.frame() #3		declare i8* @llvm.coro.frame() #3
declare i8 @llvm.coro.suspend(token, i1) #2		declare i8 @llvm.coro.suspend(token, i1) #2
Show All 10 Lines