This is an archive of the discontinued LLVM Phabricator instance.

Differential D132084

[Cloning] handle blockaddress array clone in the same module
Needs Review · Public

Authored by ychen on Aug 17 2022, 5:12 PM.

Details

Reviewers

efriedma
dexonsmith

Summary

CloneFunction.cpp currently says "It is only legal to clone a function if a block address within that function is never referenced outside of the function.".
This is only the case for cloning into the same module (this patch handles this); for cloning/moving into a different module, it is actually okay.

I'm still working on a unit test, but I'd like to hear the reviewers' opinions early, since I'm not sure why this was not handled before: is there no valid use case, or are there some constraints?

Fixes https://github.com/llvm/llvm-project/issues/56436

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ychen created this revision. Aug 17 2022, 5:12 PM

Herald added a project: Restricted Project. Aug 17 2022, 5:12 PM

Herald added a subscriber: hiraditya.

ychen requested review of this revision. Aug 17 2022, 5:12 PM

Herald added a project: Restricted Project. Aug 17 2022, 5:12 PM

Herald added a subscriber: llvm-commits.

ychen added reviewers: efriedma, dexonsmith. Aug 17 2022, 5:12 PM

ychen edited the summary of this revision.

ychen added a subscriber: ChuanqiXu.

add a test case for discussion.
TODO: check blockaddress cloning is done correctly in verifier.

Harbormaster completed remote builds in B181874: Diff 453488. Aug 17 2022, 6:28 PM

For the coroutine examples, I don't think this is a perfect solution. Consider the case where initial_suspend is not std::suspend_always and we can edit the dispatch_table during the function: https://godbolt.org/z/rx37TaPde

In this case, it is not quite right for one coroutine to have multiple dispatch tables (the destroy functions could, in theory, access the dispatch table). There are consistency problems, and there may be other tricky cases.

Besides the coroutine example, it will be problematic if multiple functions share one dispatch_table.

So at least I think it is OK to disable this use in coroutines, since taking the address of a label is not a standard feature.

llvm/lib/Transforms/Utils/CloneFunction.cpp
225	This assertion does not look robust. It is easy to generate code that breaks the assumption.

In general, you can't clone when indirectbrs are involved: since the blockaddress is a constant, it needs to be valid for all the cloned versions of the indirectbr. And you can't make different basic blocks with the same address.

I'm not sure why you need to clone code in this context, but it's possible to lower indirectbrs to switch instructions if necessary (at the cost of slightly slower code at runtime). See IndirectBrExpandPass.

In D132084#3732941, @efriedma wrote:

In general, you can't clone when indirectbrs are involved: since the blockaddress is a constant, it needs to be valid for all the cloned versions of the indirectbr. And you can't make different basic blocks with the same address.

I mean, there are existing VMap for new blockaddress and old blockaddress. So each cloned indirectbr has its own set of blockaddress. For this case, the cloning logic does not apply this VMap to GlobalValue that uses blockaddress. no?

I'm not sure why you need to clone code in this context, but it's possible to lower indirectbrs to switch instructions if necessary (at the cost of slightly slower code at runtime). See IndirectBrExpandPass.

Oh, this is helpful. I'll take a look. Thanks.

In D132084#3733036, @ychen wrote:

In D132084#3732941, @efriedma wrote:

In general, you can't clone when indirectbrs are involved: since the blockaddress is a constant, it needs to be valid for all the cloned versions of the indirectbr. And you can't make different basic blocks with the same address.

I mean, there are existing VMap for new blockaddress and old blockaddress. So each cloned indirectbr has its own set of blockaddress. For this case, the cloning logic does not apply this VMap to GlobalValue that uses blockaddress. no?

And the side effect is that any GlobalValue referencing blockaddress needs to be cloned too. Most of the time, this is just an array of blockaddress.

In D132084#3733041, @ychen wrote:

In D132084#3733036, @ychen wrote:

In D132084#3732941, @efriedma wrote:

In general, you can't clone when indirectbrs are involved: since the blockaddress is a constant, it needs to be valid for all the cloned versions of the indirectbr. And you can't make different basic blocks with the same address.

I mean, there are existing VMap for new blockaddress and old blockaddress. So each cloned indirectbr has its own set of blockaddress. For this case, the cloning logic does not apply this VMap to GlobalValue that uses blockaddress. no?

And the side effect is that any GlobalValue referencing blockaddress needs to be cloned too. Most of the time, this is just an array of blockaddress.

Never mind, I don't think it is legal to clone any GlobalValue here. Only GlobalValue that contains only blockaddress.

In D132084#3733041, @ychen wrote:

In D132084#3733036, @ychen wrote:

In D132084#3732941, @efriedma wrote:

In general, you can't clone when indirectbrs are involved: since the blockaddress is a constant, it needs to be valid for all the cloned versions of the indirectbr. And you can't make different basic blocks with the same address.

I mean, there are existing VMap for new blockaddress and old blockaddress. So each cloned indirectbr has its own set of blockaddress. For this case, the cloning logic does not apply this VMap to GlobalValue that uses blockaddress. no?

And the side effect is that any GlobalValue referencing blockaddress needs to be cloned too. Most of the time, this is just an array of blockaddress.

The problem isn't a mechanical "how do I clone a value".

The problem is this: suppose you have a function with an indirectbr. Then you clone it. Now you have two indirectbrs, each with a different destination. So you need to rewrite each use of a blockaddress to use the right value, depending on which indirectbr it's going to be passed to. But in general, you can't determine that at the point of use of the indirectbr; you can't tell until the indirectbr is about to be executed. blockaddresses can be stored in global variables, passed to functions, returned from functions, etc.

There are only two ways out of this, in general: either avoid cloning, or transform the indirectbr into a switch (or equivalent). We avoid the indirectbr->switch transform where possible because it adds an extra table lookup to code which is probably performance-sensitive. (We use IndirectBrExpandPass on targets where basic blocks don't have addresses, like wasm.)
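The constraint described above can be illustrated with a toy model. This is only a sketch, not LLVM code: `BlockAddr`, `isValidTarget`, and `dispatchTable` are hypothetical names, and function ids stand in for the original function (0) and its clone (1). It shows that a table built for the original holds values that are invalid for the clone's indirect branch, and nothing at the point where the table is read says which copy will consume the value.

```cpp
#include <array>

// Toy stand-in for a blockaddress constant: which function, which block.
struct BlockAddr {
  int func;
  int block;
};

// An indirect branch may only target a block of the function executing it.
bool isValidTarget(int executingFunc, BlockAddr target) {
  return target.func == executingFunc;
}

// A shared global table whose entries were built for the original
// function (id 0). A clone (id 1) reading the same table gets values
// that are invalid for its own indirect branch.
const std::array<BlockAddr, 3> dispatchTable = {{{0, 0}, {0, 1}, {0, 2}}};
```

In this model, the original function can use every entry of `dispatchTable`, while the clone can use none of them unless the table itself is duplicated and every reader is known.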

In D132084#3733066, @efriedma wrote:

In D132084#3733041, @ychen wrote:

In D132084#3733036, @ychen wrote:

In D132084#3732941, @efriedma wrote:

In general, you can't clone when indirectbrs are involved: since the blockaddress is a constant, it needs to be valid for all the cloned versions of the indirectbr. And you can't make different basic blocks with the same address.

I mean, there are existing VMap for new blockaddress and old blockaddress. So each cloned indirectbr has its own set of blockaddress. For this case, the cloning logic does not apply this VMap to GlobalValue that uses blockaddress. no?

And the side effect is that any GlobalValue referencing blockaddress needs to be cloned too. Most of the time, this is just an array of blockaddress.

The problem isn't a mechanical "how do I clone a value".

The problem is this: suppose you have a function with an indirectbr. Then you clone it. Now you have two indirectbrs, each with a different destination. So you need to rewrite each use of a blockaddress to use the right value, depending on which indirectbr it's going to be passed to. But in general, you can't determine that at the point of use of the indirectbr; you can't tell until the indirectbr is about to be executed. blockaddresses can be stored in global variables, passed to functions, returned from functions, etc.

There are only two ways out of this, in general: either avoid cloning, or transform the indirectbr into a switch (or equivalent). We avoid the indirectbr->switch transform where possible because it adds an extra table lookup to code which is probably performance-sensitive. (We use IndirectBrExpandPass on targets where basic blocks don't have addresses, like wasm.)

Thanks, I see your point now. The issue is indeed not solvable in general for the reason you mentioned, but for certain commonly used patterns like global blockaddress array, it is legal to clone since each element of such array could not be shared by functions. How about we handle this use case in CloneFunction and let IndirectBrExpandPass handle other use cases? No matter what, the current situation of silently creating broken IR when cloning blockaddress is not ideal.

Thanks, I see your point now. The issue is indeed not solvable in general for the reason you mentioned, but for certain commonly used patterns like global blockaddress array, it is legal to clone since each element of such array could not be shared by functions.

You'd have to prove that values loaded from the array don't escape the function, which seems difficult to prove in cases where I'd expect to see blockaddress used.

the current situation of silently creating broken IR when cloning blockaddress is not ideal.

I'm not sure what invariant we can enforce here. Maybe something related to the CloneFunctionChangeType?

In D132084#3733512, @efriedma wrote:

Thanks, I see your point now. The issue is indeed not solvable in general for the reason you mentioned, but for certain commonly used patterns like global blockaddress array, it is legal to clone since each element of such array could not be shared by functions.

You'd have to prove that values loaded from the array don't escape the function, which seems difficult to prove in cases where I'd expect to see blockaddress used.

That's right. Maybe just handle the case where the blockaddress is never written at all?

the current situation of silently creating broken IR when cloning blockaddress is not ideal.

I'm not sure what invariant we can enforce here. Maybe something related to the CloneFunctionChangeType?

That's hard in general and feels insufficient to just handle some cases but not the others. How about emitting a warning of sorts all the way back to the frontend if any blockaddress is written? Also, add a boolean flag for CloneFunction API to optionally call IndirectBrExpandPass's logic.

In D132084#3735812, @ychen wrote:

the current situation of silently creating broken IR when cloning blockaddress is not ideal.

I'm not sure what invariant we can enforce here. Maybe something related to the CloneFunctionChangeType?

That's hard in general and feels insufficient to just handle some cases but not the others. How about emitting a warning of sorts all the way back to the frontend if any blockaddress is written? Also, add a boolean flag for CloneFunction API to optionally call IndirectBrExpandPass's logic.

Oh, I was just talking about the API invariant. Clearly coroutine lowering needs to do something different here. And probably clang should warn or error if indirect goto can't generate the expected code.

I forget, why do we have to clone the coroutine body? I think I read an explanation at some point, but I'm not remembering where it was; it seems like it should be possible to avoid cloning.

In D132084#3736000, @efriedma wrote:

In D132084#3735812, @ychen wrote:

the current situation of silently creating broken IR when cloning blockaddress is not ideal.

I'm not sure what invariant we can enforce here. Maybe something related to the CloneFunctionChangeType?

That's hard in general and feels insufficient to just handle some cases but not the others. How about emitting a warning of sorts all the way back to the frontend if any blockaddress is written? Also, add a boolean flag for CloneFunction API to optionally call IndirectBrExpandPass's logic.

Oh, I was just talking about the API invariant. Clearly coroutine lowering needs to do something different here. And probably clang should warn or error if indirect goto can't generate the expected code.

Add a new boolean flag HandleIndirectBranch, defaulted to true: if a caller knows the blockaddress is safe to clone, they could set this to false (I expect this to be false most of the time). When HandleIndirectBranch is true, special-case patterns like a global blockaddress array for function F, where F has no blockaddress captured/escaped (https://llvm.org/docs/LangRef.html#pointer-capture), by cloning that array; for other cases, call IndirectBrExpandPass. Regardless of the HandleIndirectBranch value, if a function references any global value and a blockaddress may be captured, we emit the warning/error, because IndirectBrExpandPass cannot handle all cases correctly.

I forget, why do we have to clone the coroutine body? I think I read an explanation at some point, but I'm not remembering where it was; it seems like it should be possible to avoid cloning.

Because the instructions executed before the first co_await call need to stay in the original function (the ramp function). For patterns like the one below, however, which instructions those are cannot be decided statically, so cloning is needed.

task corofoo(){
  while(...){
    if(..) co_await();
    else <code>;
  }
}

PS: I think it is helpful to add bound check for indirectbr. But I'm not sure where to put it, a new clang flag?

In D132084#3736472, @ychen wrote:

In D132084#3736000, @efriedma wrote:

In D132084#3735812, @ychen wrote:

the current situation of silently creating broken IR when cloning blockaddress is not ideal.

I'm not sure what invariant we can enforce here. Maybe something related to the CloneFunctionChangeType?

That's hard in general and feels insufficient to just handle some cases but not the others. How about emitting a warning of sorts all the way back to the frontend if any blockaddress is written? Also, add a boolean flag for CloneFunction API to optionally call IndirectBrExpandPass's logic.

Oh, I was just talking about the API invariant. Clearly coroutine lowering needs to do something different here. And probably clang should warn or error if indirect goto can't generate the expected code.

Add a new boolean flag HandleIndirectBranch, defaulted to true: if a caller knows the blockaddress is safe to clone, they could set this to false (I expect this to be false most of the time). When HandleIndirectBranch is true, special-case patterns like a global blockaddress array for function F, where F has no blockaddress captured/escaped (https://llvm.org/docs/LangRef.html#pointer-capture), by cloning that array; for other cases, call IndirectBrExpandPass. Regardless of the HandleIndirectBranch value, if a function references any global value and a blockaddress may be captured, we emit the warning/error, because IndirectBrExpandPass cannot handle all cases correctly.

I suspect the special case for "global blockaddress array" won't really trigger in practice; most uses of indirectbr involve the value escaping somehow, because otherwise there isn't really any value over just using a switch. (I don't think your testcase is representative.)

IndirectBrExpandPass can handle all cases correctly; it's just not efficient, like I mentioned before. And you probably don't want to do it without the caller explicitly asking for it.

I forget, why do we have to clone the coroutine body? I think I read an explanation at some point, but I'm not remembering where it was; it seems like it should be possible to avoid cloning.

Because the instructions executed before the first co_await call need to stay in the original function (the ramp function). For patterns like the one below, however, which instructions those are cannot be decided statically, so cloning is needed.
task corofoo(){
  while(...){
    if(..) co_await();
    else <code>;
  }
}

Cloning is one way to implement that, sure. I don't think it's the only way, though; you could move the body into a helper function, and have both coroutine entry-points call it. Maybe at some cost to performance. It might make sense to implement a scheme like this in case we run into other situations where we want to avoid cloning.
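As a rough source-level analogy for the helper-function scheme (not the actual coroutine lowering), the sketch below keeps a single copy of the body and has two thin entry points call it. `Frame`, `body`, `ramp`, and `resume` are hypothetical names chosen for illustration. Because the body exists only once, any label addresses taken inside it would also exist only once.

```cpp
// Hypothetical frame holding the state that survives across suspensions.
struct Frame {
  int counter = 0;
  int limit = 0;
};

// The single copy of the body. Each call runs until the next suspension
// point. Returns true when finished, false when it suspends.
static bool body(Frame &f) {
  if (f.counter >= f.limit)
    return true;
  ++f.counter;  // work done before the suspension point
  return false; // suspend
}

// First entry point: sets up the frame, then runs the shared body.
bool ramp(Frame &f, int limit) {
  f.counter = 0;
  f.limit = limit;
  return body(f);
}

// Second entry point: re-enters the same body with the saved frame.
bool resume(Frame &f) { return body(f); }
```

The trade-off mentioned above shows up here as an extra call on every entry, in exchange for never duplicating the body.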

PS: I think it is helpful to add bound check for indirectbr. But I'm not sure where to put it, a new clang flag?

We could add a check to -fsanitize=undefined (https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html) to check that the address passed to an indirectbr is one of the expected addresses? Normally the undefined-behavior sanitizers insert the code early, though, in clang IR generation, so we can give good diagnostics. Not sure it would have caught a bug in a transform, though.

Alternatively, you could implement something more like CFI (https://clang.llvm.org/docs/ControlFlowIntegrity.html), which inserts checks late. So when the indirect-goto sanitizer is enabled, clang adds a function attribute, then the backend adds some checks just before isel, or something like that.
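A minimal sketch of the kind of check being discussed, written as ordinary C++ rather than as a sanitizer: before an indirect jump, confirm that the target is in the set of expected destinations. `isExpectedTarget` is a hypothetical helper, and in this sketch the addresses of plain objects stand in for label addresses; a real check would be inserted by the compiler.

```cpp
#include <algorithm>
#include <cstddef>

// Returns true if `target` is one of the `count` addresses in `expected`.
// A sanitizer would trap or report a diagnostic instead of returning false.
bool isExpectedTarget(const void *target, const void *const *expected,
                      std::size_t count) {
  return std::find(expected, expected + count, target) != expected + count;
}
```

A linear scan is used only to keep the sketch short; how such a check would be implemented efficiently is left open in the discussion.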

you could move the body into a helper function, and have both coroutine entry-points call it

That's the solution I used in the end to work around this issue, by the way. I adjusted my C++ program accordingly; see https://godbolt.org/z/17G8jGbse

In D132084#3736529, @efriedma wrote:

I suspect the special case for "global blockaddress array" won't really trigger in practice; most uses of indirectbr involve the value escaping somehow, because otherwise there isn't really any value over just using a switch. (I don't think your testcase is representative.)

Could you share a test case that escapes or some example URL, please? I could not find many by googling computed goto. Probably there are better keywords.

IndirectBrExpandPass can handle all cases correctly; it's just not efficient, like I mentioned before. And you probably don't want to do it without the caller explicitly asking for it.

One test case I was thinking of is

bool doInterpret(Instruction*& nextIntruction, int& out, int x) {
    static void* dispatch_table[] = {&&inc, &&suspend, &&stop};
    #define DISPATCH() goto *dispatch_table[*nextIntruction++]

    if(x%2){ // x==7: store &&suspend, x==9: store &&inc
       dispatch_table[0] = dispatch_table[x%3];
       return false;
    } else {
       DISPATCH();
    }

inc:
    ++out;
    DISPATCH();
suspend:
    return false;
stop:
    return true;
}

I'm not sure this is representative, but I guess it is valid?

If this is cloned even after IndirectBrExpandPass (not necessarily for coroutine, but in general), it may or may not do the right thing depending on how the two instances are called.

Cloning is one way to implement that, sure. I don't think it's the only way, though; you could move the body into a helper function, and have both coroutine entry-points call it. Maybe at some cost to performance. It might make sense to implement a scheme like this in case we run into other situations where we want to avoid cloning.

Sounds good, I'll look into it.

PS: I think it is helpful to add bound check for indirectbr. But I'm not sure where to put it, a new clang flag?

We could add a check to -fsanitize=undefined (https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html) to check that the address passed to an indirectbr is one of the expected addresses? Normally the undefined-behavior sanitizers insert the code early, though, in clang IR generation, so we can give good diagnostics. Not sure it would have caught a bug in a transform, though.

Alternatively, you could implement something more like CFI (https://clang.llvm.org/docs/ControlFlowIntegrity.html), which inserts checks late. So when the indirect-goto sanitizer is enabled, clang adds a function attribute, then the backend adds some checks just before isel, or something like that.

Sounds great. CFI approach seems more promising to me too. I'll look into it.

In D132084#3736593, @ychen wrote:

In D132084#3736529, @efriedma wrote:

I suspect the special case for "global blockaddress array" won't really trigger in practice; most uses of indirectbr involve the value escaping somehow, because otherwise there isn't really any value over just using a switch. (I don't think your testcase is representative.)

Could you share a test case that escapes or some example URL, please? I could not find many by googling computed goto. Probably there are better keywords.

It's really not that frequently used in general; people usually only go for it to squeeze out the last bit of performance out of an interpreter.

Maybe take a look at OCaml: https://github.com/ocaml/ocaml/blob/trunk/runtime/interp.c

IndirectBrExpandPass can handle all cases correctly; it's just not efficient, like I mentioned before. And you probably don't want to do it without the caller explicitly asking for it.

One test case I was thinking of is
bool doInterpret(Instruction*& nextIntruction, int& out, int x) {
    static void* dispatch_table[] = {&&inc, &&suspend, &&stop};
    #define DISPATCH() goto *dispatch_table[*nextIntruction++]

    if(x%2){ // x==7: store &&suspend, x==9: store &&inc
       dispatch_table[0] = dispatch_table[x%3];
       return false;
    } else {
       DISPATCH();
    }

inc:
    ++out;
    DISPATCH();
suspend:
    return false;
stop:
    return true;
}
I'm not sure this is representative, but I guess it is valid?

If this is cloned even after IndirectBrExpandPass (not necessarily for coroutine, but in general), it may or may not do the right thing depending on how the two instances are called.

The way IndirectBrExpandPass works is that it rewrites away all "blockaddress" constants. Once the IR no longer contains any "blockaddress", whether or not you clone is irrelevant.
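The effect can be sketched at source level, assuming a tiny three-opcode interpreter. `Op` and `run` are made-up names, and this is not what IndirectBrExpandPass emits; it only shows that once targets are plain integers dispatched through a switch, no block address is stored anywhere, so a copy of the function behaves identically.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opcodes; each stands in for a label whose address would
// otherwise be stored in a dispatch table.
enum class Op : unsigned char { Inc, Suspend, Stop };

// Runs the program starting at `pc`. Returns true on Stop, and false on
// Suspend or when the program runs out of instructions.
bool run(const std::vector<Op> &program, std::size_t &pc, int &out) {
  while (pc < program.size()) {
    switch (program[pc++]) { // integer dispatch instead of `goto *addr`
    case Op::Inc:
      ++out;
      break;
    case Op::Suspend:
      return false;
    case Op::Stop:
      return true;
    }
  }
  return false;
}
```

The cost mentioned earlier in the thread is visible in the shape of the code: every dispatch goes back through the switch rather than jumping straight to the next handler.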

I'm not sure I have all the context, but it sounds like perhaps interpret.c from pr3120 could be a useful test case.

avogelsgesang mentioned this in D131938: [C++20] [Coroutines] Disable to take the address of labels in coroutines. Nov 14 2022, 7:06 PM

I suspect this is related to the broken behavior for llvm-reduce mentioned in D140909

In D132084#4057252, @arsenm wrote:

I suspect this is related to the broken behavior for llvm-reduce mentioned in D140909

The llvm-reduce path has an easy workaround which is to stop using CloneModule and only go through bitcode serialization (which is actually what we do for the multithread path anyway to get a unique context for each)

In D132084#4057254, @arsenm wrote:

In D132084#4057252, @arsenm wrote:

I suspect this is related to the broken behavior for llvm-reduce mentioned in D140909

The llvm-reduce path has an easy workaround which is to stop using CloneModule and only go through bitcode serialization (which is actually what we do for the multithread path anyway to get a unique context for each)

I agree with the summary in D140909. But I am not sure how we can fix the current problem with the method in D140909.

All the discussion here is centered around cloning individual functions, and inserting the result in the same module. If you're using CloneModule, the concerns are different. I have no idea if that codepath works, but it's implementable in a straightforward manner.

Revision Contents

Path	Size
llvm/lib/Transforms/Utils/CloneFunction.cpp	50 lines
llvm/unittests/Transforms/Utils/CloningTest.cpp	39 lines

Diff 453488

llvm/lib/Transforms/Utils/CloneFunction.cpp

//===- CloneFunction.cpp - Clone a function into another function ---------===//		//===- CloneFunction.cpp - Clone a function into another function ---------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements the CloneFunctionInto interface, which is used as the		// This file implements the CloneFunctionInto interface, which is used as the
// low-level function cloner. This is used by the CloneFunction and function		// low-level function cloner. This is used by the CloneFunction and function
// inliner to do the dirty work of copying the body of a function around.		// inliner to do the dirty work of copying the body of a function around.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/DomTreeUpdater.h"		#include "llvm/Analysis/DomTreeUpdater.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/CFG.h"		#include "llvm/IR/CFG.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DebugInfo.h"		#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/MDBuilder.h"		#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/Metadata.h"		#include "llvm/IR/Metadata.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"		#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/Cloning.h"		#include "llvm/Transforms/Utils/Cloning.h"
(37 lines not shown)	BasicBlock *llvm::CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,

if (CodeInfo) {		if (CodeInfo) {
CodeInfo->ContainsCalls |= hasCalls;		CodeInfo->ContainsCalls |= hasCalls;
CodeInfo->ContainsDynamicAllocas |= hasDynamicAllocas;		CodeInfo->ContainsDynamicAllocas |= hasDynamicAllocas;
}		}
return NewBB;		return NewBB;
}		}

		static void cloneGlobalVariablesUsingBlockAddress(
		SmallSet<GlobalVariable *, 8> &GVsUsingBlockAddress, Module *M,
		ValueToValueMapTy &VMap) {
		for (GlobalVariable *I : GVsUsingBlockAddress) {
		GlobalVariable *NewGV = new GlobalVariable(
		*M, I->getValueType(), I->isConstant(), I->getLinkage(),
		(Constant *)nullptr, I->getName(), I, I->getThreadLocalMode(),
		I->getType()->getAddressSpace());
		NewGV->copyAttributesFrom(I);
		VMap[I] = NewGV;

		SmallVector<std::pair<unsigned, MDNode *>, 1> MDs;
		I->getAllMetadata(MDs);
		for (auto MD : MDs)
		NewGV->addMetadata(MD.first, *MapMetadata(MD.second, VMap));

		assert(I->hasInitializer());
		NewGV->setInitializer(MapValue(I->getInitializer(), VMap));

		if (const Comdat *SC = I->getComdat()) {
		Comdat *DC = M->getOrInsertComdat(SC->getName());
		DC->setSelectionKind(SC->getSelectionKind());
		NewGV->setComdat(DC);
		}
		}
		}

// Clone OldFunc into NewFunc, transforming the old arguments into references to		// Clone OldFunc into NewFunc, transforming the old arguments into references to
// VMap values.		// VMap values.
//		//
void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,		void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
ValueToValueMapTy &VMap,		ValueToValueMapTy &VMap,
CloneFunctionChangeType Changes,		CloneFunctionChangeType Changes,
SmallVectorImpl<ReturnInst *> &Returns,		SmallVectorImpl<ReturnInst *> &Returns,
const char *NameSuffix, ClonedCodeInfo *CodeInfo,		const char *NameSuffix, ClonedCodeInfo *CodeInfo,
(43 lines not shown)	#endif

// When we remap instructions within the same module, we want to avoid		// When we remap instructions within the same module, we want to avoid
// duplicating inlined DISubprograms, so record all subprograms we find as we		// duplicating inlined DISubprograms, so record all subprograms we find as we
// duplicate instructions and then freeze them in the MD map. We also record		// duplicate instructions and then freeze them in the MD map. We also record
// information about dbg.value and dbg.declare to avoid duplicating the		// information about dbg.value and dbg.declare to avoid duplicating the
// types.		// types.
Optional<DebugInfoFinder> DIFinder;		Optional<DebugInfoFinder> DIFinder;

		bool CurrentModuleChanges =
		Changes < CloneFunctionChangeType::DifferentModule;

// Track the subprogram attachment that needs to be cloned to fine-tune the		// Track the subprogram attachment that needs to be cloned to fine-tune the
// mapping within the same module.		// mapping within the same module.
DISubprogram *SPClonedWithinModule = nullptr;		DISubprogram *SPClonedWithinModule = nullptr;
if (Changes < CloneFunctionChangeType::DifferentModule) {		if (CurrentModuleChanges) {
assert((NewFunc->getParent() == nullptr ||		assert((NewFunc->getParent() == nullptr ||
NewFunc->getParent() == OldFunc->getParent()) &&		NewFunc->getParent() == OldFunc->getParent()) &&
"Expected NewFunc to have the same parent, or no parent");		"Expected NewFunc to have the same parent, or no parent");

// Need to find subprograms, types, and compile units.		// Need to find subprograms, types, and compile units.
DIFinder.emplace();		DIFinder.emplace();

SPClonedWithinModule = OldFunc->getSubprogram();		SPClonedWithinModule = OldFunc->getSubprogram();
if (SPClonedWithinModule)		if (SPClonedWithinModule)
DIFinder->processSubprogram(SPClonedWithinModule);		DIFinder->processSubprogram(SPClonedWithinModule);
} else {		} else {
assert((NewFunc->getParent() == nullptr ||		assert((NewFunc->getParent() == nullptr ||
NewFunc->getParent() != OldFunc->getParent()) &&		NewFunc->getParent() != OldFunc->getParent()) &&
"Expected NewFunc to have different parents, or no parent");		"Expected NewFunc to have different parents, or no parent");

if (Changes == CloneFunctionChangeType::DifferentModule) {		if (Changes == CloneFunctionChangeType::DifferentModule) {
assert(NewFunc->getParent() &&		assert(NewFunc->getParent() &&
"Need parent of new function to maintain debug info invariants");		"Need parent of new function to maintain debug info invariants");

// Need to find all the compile units.		// Need to find all the compile units.
DIFinder.emplace();		DIFinder.emplace();
}		}
}		}

		SmallSet<GlobalVariable *, 8> GVsUsingBlockAddress;

// Loop over all of the basic blocks in the function, cloning them as		// Loop over all of the basic blocks in the function, cloning them as
// appropriate. Note that we save BE this way in order to handle cloning of		// appropriate. Note that we save BE this way in order to handle cloning of
// recursive functions into themselves.		// recursive functions into themselves.
for (const BasicBlock &BB : *OldFunc) {		for (const BasicBlock &BB : *OldFunc) {

// Create a new basic block and copy instructions into it!		// Create a new basic block and copy instructions into it!
BasicBlock *CBB = CloneBasicBlock(&BB, VMap, NameSuffix, NewFunc, CodeInfo,		BasicBlock *CBB = CloneBasicBlock(&BB, VMap, NameSuffix, NewFunc, CodeInfo,
DIFinder ? &*DIFinder : nullptr);		DIFinder ? &*DIFinder : nullptr);

// Add basic block mapping.		// Add basic block mapping.
VMap[&BB] = CBB;		VMap[&BB] = CBB;

// It is only legal to clone a function if a block address within that		// It is only legal to clone a function if a block address within that
// function is never referenced outside of the function. Given that, we		// function is never referenced outside of the function. Given that, we
// want to map block addresses from the old function to block addresses in		// want to map block addresses from the old function to block addresses in
// the clone. (This is different from the generic ValueMapper		// the clone. (This is different from the generic ValueMapper
// implementation, which generates an invalid blockaddress when		// implementation, which generates an invalid blockaddress when
// cloning a function.)		// cloning a function.)
if (BB.hasAddressTaken()) {		if (BB.hasAddressTaken()) {
Constant *OldBBAddr = BlockAddress::get(const_cast<Function *>(OldFunc),		Constant *OldBBAddr = BlockAddress::get(const_cast<Function *>(OldFunc),
const_cast<BasicBlock *>(&BB));		const_cast<BasicBlock *>(&BB));
VMap[OldBBAddr] = BlockAddress::get(NewFunc, CBB);		VMap[OldBBAddr] = BlockAddress::get(NewFunc, CBB);

		if (CurrentModuleChanges)
		for (User *U : OldBBAddr->users()) {
		while (isa<Constant>(U) && !isa<GlobalVariable>(U)) {
		assert(U->getNumUses() == 1);
		ChuanqiXu (inline comment): This assertion does not look robust. It is easy to generate code that breaks the assumption.
		U = cast<Constant>(U)->user_back();
		}
		if (auto *GV = dyn_cast<GlobalVariable>(U))
		GVsUsingBlockAddress.insert(GV);
		}
}		}

// Note return instructions for the caller.		// Note return instructions for the caller.
if (ReturnInst *RI = dyn_cast<ReturnInst>(CBB->getTerminator()))		if (ReturnInst *RI = dyn_cast<ReturnInst>(CBB->getTerminator()))
Returns.push_back(RI);		Returns.push_back(RI);
}		}

		if (CurrentModuleChanges)
		cloneGlobalVariablesUsingBlockAddress(GVsUsingBlockAddress,
		NewFunc->getParent(), VMap);

if (Changes < CloneFunctionChangeType::DifferentModule &&		if (Changes < CloneFunctionChangeType::DifferentModule &&
DIFinder->subprogram_count() > 0) {		DIFinder->subprogram_count() > 0) {
// Turn on module-level changes, since we need to clone (some of) the		// Turn on module-level changes, since we need to clone (some of) the
// debug info metadata.		// debug info metadata.
//		//
// FIXME: Metadata effectively owned by a function should be made		// FIXME: Metadata effectively owned by a function should be made
// local, and only that local metadata should be cloned.		// local, and only that local metadata should be cloned.
ModuleLevelChanges = true;		ModuleLevelChanges = true;

llvm/unittests/Transforms/Utils/CloningTest.cpp

TEST(CloneFunction, CloneFunctionWithInalloca) {
CloneFunctionInto(DeclFunction, ImplFunction, VMap,		CloneFunctionInto(DeclFunction, ImplFunction, VMap,
CloneFunctionChangeType::GlobalChanges, Returns, "", &CCI);		CloneFunctionChangeType::GlobalChanges, Returns, "", &CCI);

EXPECT_FALSE(verifyModule(*ImplModule, &errs()));		EXPECT_FALSE(verifyModule(*ImplModule, &errs()));
EXPECT_TRUE(CCI.ContainsCalls);		EXPECT_TRUE(CCI.ContainsCalls);
EXPECT_TRUE(CCI.ContainsDynamicAllocas);		EXPECT_TRUE(CCI.ContainsDynamicAllocas);
}		}

		TEST(CloneFunction, CloneFunctionWithBlockAddressTaken) {
		StringRef ImplAssembly = R"(
		@ba = internal unnamed_addr constant [1 x ptr] [ptr blockaddress(@foo, %stop)]

		define void @foo() {
		entry:
		%arrayidx = getelementptr inbounds [1 x ptr], ptr @ba, i64 0, i64 0
		%0 = load ptr, ptr %arrayidx
		br label %indirectgoto

		stop:
		ret void

		indirectgoto:
		%indirect.goto.dest = phi ptr [ %0, %entry ]
		indirectbr ptr %indirect.goto.dest, [label %stop]
		}

		declare void @bar()
		)";

		LLVMContext Context;
		SMDiagnostic Error;

		auto ImplModule = parseAssemblyString(ImplAssembly, Error, Context);
		EXPECT_TRUE(ImplModule != nullptr);
		auto *ImplFunction = ImplModule->getFunction("foo");
		EXPECT_TRUE(ImplFunction != nullptr);
		auto *DeclFunction = ImplModule->getFunction("bar");
		EXPECT_TRUE(DeclFunction != nullptr);

		ValueToValueMapTy VMap;
		SmallVector<ReturnInst *, 8> Returns;
		CloneFunctionInto(DeclFunction, ImplFunction, VMap,
		CloneFunctionChangeType::GlobalChanges, Returns, "");

		EXPECT_FALSE(verifyModule(*ImplModule, &errs()));
		}

TEST(CloneFunction, CloneFunctionWithSubprograms) {		TEST(CloneFunction, CloneFunctionWithSubprograms) {
// Tests that the debug info is duplicated correctly when a DISubprogram		// Tests that the debug info is duplicated correctly when a DISubprogram
// happens to be one of the operands of the DISubprogram that is being cloned.		// happens to be one of the operands of the DISubprogram that is being cloned.
// In general, operands of "test" that are distinct should be duplicated,		// In general, operands of "test" that are distinct should be duplicated,
// but in this case "my_operator" should not be duplicated. If it is		// but in this case "my_operator" should not be duplicated. If it is
// duplicated, the metadata in the llvm.dbg.declare could end up with		// duplicated, the metadata in the llvm.dbg.declare could end up with
// different duplicates.		// different duplicates.
StringRef ImplAssembly = R"(		StringRef ImplAssembly = R"(