This is an archive of the discontinued LLVM Phabricator instance.

Differential D109707

[HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols
Closed, Public

Authored by gandhi21299 on Sep 13 2021, 11:09 AM.

Details

Reviewers

yaxunl
arsenm
nhaustov
tstellar

Group Reviewers

Restricted Project

Commits

rG0567f0333176: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols
rG03375a3fb33b: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

Summary

By default, clang emits complete constructors as aliases of base constructors when the two are identical.
The backend is supposed to emit symbols for these aliases; otherwise the result is undefined symbols.
@yaxunl observed that this issue is related to the llvm options -amdgpu-early-inline-all=true
and -amdgpu-function-calls=false. This issue is resolved by only inlining global values
with internal linkage. The getCalleeFunction() in AMDGPUResourceUsageAnalysis also had
to be extended to support aliases to functions. inline-calls.ll was corrected appropriately.
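
To illustrate the linkage rule described in the summary, here is a minimal, self-contained sketch in plain C++. It is a toy model, not LLVM code: ToyLinkage, ToyAlias and aliasesToRemove are hypothetical names invented for this example. It only shows the decision being made: aliases with internal linkage may be folded into their aliasee, while externally visible aliases keep their symbol.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for the concepts in the summary; these are NOT LLVM classes.
enum class ToyLinkage { Internal, External };

struct ToyAlias {
  std::string Name;    // symbol name of the alias
  std::string Aliasee; // symbol name of the function it points to
  ToyLinkage Linkage;
};

// Returns the names of the aliases that may be replaced by their aliasee and
// erased. Aliases visible outside the module are skipped, because another
// translation unit may still reference their symbol.
std::vector<std::string> aliasesToRemove(const std::vector<ToyAlias> &Aliases) {
  std::vector<std::string> Removable;
  for (const ToyAlias &A : Aliases) {
    if (A.Linkage != ToyLinkage::Internal)
      continue; // keep the symbol so the linker can still resolve it
    Removable.push_back(A.Name);
  }
  return Removable;
}
```

With the constructor alias from the new test (_ZN1BC1Ei, which is not internal) and one made-up internal alias as input, only the internal alias would be reported as removable.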

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

gandhi21299 created this revision. · Sep 13 2021, 11:09 AM

Herald added subscribers: foad, kerbowa, hiraditya and 4 others. · Sep 13 2021, 11:09 AM

gandhi21299 requested review of this revision. · Sep 13 2021, 11:09 AM

Herald added projects: Restricted Project, Restricted Project. · Sep 13 2021, 11:09 AM

Herald added subscribers: llvm-commits, cfe-commits.

gandhi21299 edited the summary of this revision. · Sep 13 2021, 11:12 AM

While I think the early inliner is largely obsolete, it should still handle aliases correctly

arsenm added inline comments. · Sep 13 2021, 11:15 AM

clang/lib/Driver/ToolChains/Clang.cpp
5094	This looks like an unrelated change?

We cannot disable early inline all since this will cause performance regressions. Instead, the early inline all pass should be fixed so it does not remove aliases.

should the amdgpu-early-inline-all flag be deleted?

Harbormaster completed remote builds in B123708: Diff 372294. · Sep 13 2021, 11:50 AM

In D109707#2998095, @aeubanks wrote:

should the amdgpu-early-inline-all flag be deleted?

No. It is still used by HIP. We will deprecate it in the future, but it is not ready yet.

gandhi21299 marked an inline comment as done. · Sep 13 2021, 12:33 PM

gandhi21299 added inline comments.

clang/lib/Driver/ToolChains/Clang.cpp
5094	Ahh yes, I will get rid of it.

set GlobalOpt parameter to false by default to disallow alias elimination when the options EarlyInlineAll and EnableFunctionCalls are true and false, respectively.

clang/lib/Driver/ToolChains/Clang.cpp
5094	This was part of a revert that is required for this patch to function.

gandhi21299 edited the summary of this revision. · Sep 14 2021, 12:24 PM

gandhi21299 added a reviewer: arsenm.

Herald added a subscriber: wdng. · Sep 14 2021, 12:24 PM

Harbormaster completed remote builds in B123880: Diff 372528. · Sep 14 2021, 1:16 PM

added the include header for HIP runtime

Harbormaster completed remote builds in B123895: Diff 372551. · Sep 14 2021, 2:33 PM

converted the HIP test into a CUDA test

Harbormaster completed remote builds in B123910: Diff 372573. · Sep 14 2021, 3:52 PM

I think you may try fixing the following line:

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp#L97

@yaxunl Under what criteria should an alias not be removed?

In D109707#3004108, @gandhi21299 wrote:

@yaxunl Under what criteria should an alias not be removed?

If the linkage of the alias is not internal, then it should not be removed.

Internal linkage detection works great for our purposes but it causes a failure in llvm/test/CodeGen/AMDGPU/inline-calls.ll because @func_alias cannot be cast to a Function. If we get past that, @kernel3 triggers the error: scalar registers (98) exceeds limit (96) in function 'kernel3'.

@yaxunl I think we have two ways to go from here:

If appropriate, reset the maximum number of scalar registers allowed in @kernel3 (inline-calls.ll) to fix the test.
Determine a stronger condition for inlining.

In D109707#3004869, @gandhi21299 wrote:

Internal linkage detection works great for our purposes but it causes a failure in llvm/test/CodeGen/AMDGPU/inline-calls.ll because @func_alias cannot be cast to a Function. If we get past that, @kernel3 triggers the error: scalar registers (98) exceeds limit (96) in function 'kernel3'.

That almost sounds like using the wrong subtarget for the alias

llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
694 ↗	(On Diff #372573)	This needs a backend test

Prevent removing alias if the GlobalAlias does not have internal linkage

aeubanks removed a reviewer: aeubanks. · Sep 17 2021, 9:42 AM

Harbormaster completed remote builds in B124424: Diff 373257. · Sep 17 2021, 10:03 AM

replaced a cast with a dyn_cast since the return value from getCalleeFunction() is not always a Function
RUN on line 2 was causing 2 more scalar registers to be used on tonga due to @func_alias not being inlined, hence I eliminated that test
RUN on line 3 generated a call instruction to an aliased function which is not supported on r600 (according to @arsenm ), hence I eliminated that test as well

gandhi21299 edited the summary of this revision. · Sep 22 2021, 1:44 PM

Harbormaster completed remote builds in B125215: Diff 374354. · Sep 22 2021, 1:58 PM

refreshing patch

Harbormaster completed remote builds in B125227: Diff 374369. · Sep 22 2021, 2:54 PM

arsenm added inline comments. · Sep 23 2021, 9:40 AM

llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
68	I think this is not the right place for this. If we can determine the callee function, we should have directly set it in the instruction during call lowering

gandhi21299 added inline comments. · Sep 23 2021, 10:05 AM

llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
68	Which file would that be in?

arsenm added inline comments. · Sep 23 2021, 10:06 AM

llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
68	SIISelLowering and AMDGPUCallLowering

Declare an unhandled call lowering in SelectionDAG when a callee is encountered which cannot be casted into a Function
I am still investigating the effects on GlobalISel side of things, there seems to be a problem when lowering a call to @func in @kernel as well.
inline-calls.ll is expected to fail with this patch, we could turn it into a negative test depending on how the work goes.

Harbormaster completed remote builds in B125737: Diff 375079. · Sep 25 2021, 8:56 PM

It does not look like function calls are supported yet in AMDGPUCallLowering, is that correct?

Please update the description of the patch and also make sure it passes internal CI.

clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
2	this test needs // REQUIRES: amdgpu-registered-target, clang-driver
llvm/test/CodeGen/AMDGPU/inline-calls.ll
2–3	why these two lines are removed?

In D109707#3016438, @gandhi21299 wrote:

replaced a cast with a dyn_cast since the return value from getCalleeFunction() is not always a Function

RUN on line 2 was causing 2 more scalar registers to be used on tonga due to @func_alias not being inlined, hence I eliminated that test

RUN on line 3 generated a call instruction to an aliased function which is not supported on r600 (according to @arsenm ), hence I eliminated that test as well

@yaxunl

added the REQUIRES line as requested by Sam

gandhi21299 edited the summary of this revision. · Sep 27 2021, 9:08 AM

Harbormaster completed remote builds in B125889: Diff 375284. · Sep 27 2021, 9:23 AM

@yaxunl Should inline-calls.ll be converted into an expected-failure test or removed, so that the cast failure in AMDGPUResourceUsageAnalysis does not break the test?

gandhi21299 added a reviewer: nhaustov. · Sep 28 2021, 7:55 AM

gandhi21299 added a reviewer: tstellar.

declare failure when lowering an accessor of a callee which is not a function, in GlobalISel

yaxunl added inline comments. · Sep 28 2021, 11:59 AM

llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
98–99	If we do this for older GPU's (e.g. Tonga/redwood), IR's using aliases will fail on them. I don't think it is acceptable. Is it possible to restrict this change to gfx9 and above? Or should we introduce some feature to indicate 'alias support' and use that to restrict this change to subtargets supporting this feature.
llvm/test/CodeGen/AMDGPU/inline-calls.ll
0–3	need to add check for gfx906 and gfx1030

gandhi21299 added inline comments. · Sep 28 2021, 12:01 PM

llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
98–99	Restricting this change to gfx9 and above sounds simpler and more relevant to the problem as well.

Harbormaster completed remote builds in B126151: Diff 375655. · Sep 28 2021, 12:03 PM

Sorry, that was a mistake.

inline-calls.ll failed on gfx908 due to the change in SIISelLowering.cpp, line 3015. Without the change, there is a failure in AMDGPUResourceAnalysis.cpp, line 65 because Op.getGlobal() is not a Function.

Since callees may alias to a function pointer, it makes sense for getCalleeFunction(...) to return a Function which is a cast of the operand of a GlobalAlias.
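
The look-through described in the comment above can be sketched in plain C++ as follows. This is a toy model under stated assumptions, not the patch itself: SketchGlobal, SketchFunction, SketchAlias and resolveCallee are hypothetical names, and dynamic_cast stands in for LLVM's dyn_cast. Unlike the patch, which uses cast<Function> and would assert on a non-function aliasee, this sketch returns nullptr in that case.

```cpp
// Toy class hierarchy; these are NOT the LLVM GlobalValue classes.
struct SketchGlobal {
  virtual ~SketchGlobal() = default;
};

struct SketchFunction : SketchGlobal {
  const char *Name;
  explicit SketchFunction(const char *N) : Name(N) {}
};

struct SketchAlias : SketchGlobal {
  const SketchGlobal *Aliasee; // what the alias points to
  explicit SketchAlias(const SketchGlobal *Target) : Aliasee(Target) {}
};

// If the callee operand is an alias, look through one level to its aliasee.
// Returns nullptr when the result is not a function.
const SketchFunction *resolveCallee(const SketchGlobal *G) {
  if (const auto *A = dynamic_cast<const SketchAlias *>(G))
    G = A->Aliasee;
  return dynamic_cast<const SketchFunction *>(G);
}
```

Calling resolveCallee on an alias of a function yields that function, and calling it on the function itself returns it unchanged.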

eliminated changes in SIISelLowering

gandhi21299 edited the summary of this revision. · Oct 1 2021, 1:00 PM

ping

gandhi21299 added a reviewer: Restricted Project. · Oct 7 2021, 8:09 AM

refreshing patch

Harbormaster completed remote builds in B127570: Diff 377922. · Oct 7 2021, 12:16 PM

added -nogpulib and -nogpuinc flags to amdgpu-alias-undef-symbols.cu

Harbormaster completed remote builds in B127765: Diff 378218. · Oct 8 2021, 9:03 AM

ping

Passed internal CI

gandhi21299 added inline comments. · Oct 12 2021, 1:16 PM

llvm/test/CodeGen/AMDGPU/inline-calls.ll
3	@tstellar Is there a way to restrict the AlwaysInliner to only run on amdgcn architecture?

added a restriction on which architecture the AlwaysInliner change applies to; updated the inline-calls.ll test.
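
As a rough illustration of this architecture restriction, the sketch below combines the architecture check with the linkage check, in the same spirit as the condition in the final diff. It is plain C++ with hypothetical helper names (archOf, keepsAlias); it does not use llvm::Triple and simply assumes the architecture is the text before the first '-' in the triple string.

```cpp
#include <string>

// Returns the architecture component of a target triple string, assumed here
// to be everything before the first '-'. A string without '-' is returned
// unchanged.
std::string archOf(const std::string &Triple) {
  return Triple.substr(0, Triple.find('-'));
}

// Hypothetical predicate: an alias is left in place only when targeting
// amdgcn and the alias does not have internal linkage. On other
// architectures (for example r600) the alias is still folded away.
bool keepsAlias(const std::string &Triple, bool AliasIsInternal) {
  return archOf(Triple) == "amdgcn" && !AliasIsInternal;
}
```

For the triples used in inline-calls.ll, this predicate would keep a non-internal alias for amdgcn-unknown-linux-gnu and drop it for r600-unknown-linux-gnu.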

Harbormaster completed remote builds in B128720: Diff 379530. · Oct 13 2021, 3:51 PM

Passed ePSDB

ping

LGTM. Thanks.

This revision is now accepted and ready to land. · Oct 15 2021, 8:45 AM

Closed by commit rG03375a3fb33b: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols (authored by gandhi21299). · Oct 15 2021, 10:39 AM

This revision was automatically updated to reflect the committed changes.

gandhi21299 added a commit: rG03375a3fb33b: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols.

This breaks tests on Mac: http://45.33.8.238/mac/37119/step_7.txt

Please take a look and revert for now if it takes a while to fix.

@gandhi21299 you may need to add "-target x86_64-unknown-linux-gnu" to your codegen test to avoid issue with Darwin toolchain.

gandhi21299 added a reverting change: rG1830ec94ac02: Revert "[HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined…. · Oct 15 2021, 3:16 PM

gandhi21299 reopened this revision. · Oct 15 2021, 3:25 PM

This revision is now accepted and ready to land. · Oct 15 2021, 3:25 PM

added -target option in the test amdgpu-alias-undef-symbols.cu

@thakis can you please check if this solution is sufficient? Thanks for bringing it up

Harbormaster completed remote builds in B129131: Diff 380110. · Oct 15 2021, 4:08 PM

MaskRay added a subscriber: MaskRay. · Oct 15 2021, 4:35 PM

MaskRay added inline comments.

clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
4	non-driver tests prefer `%clang_cc1`. `%clang` invokes the driver and has varying behaviors on different platforms. Include paths/resource dir may be quite different.

gandhi21299 added inline comments. · Oct 16 2021, 12:33 PM

clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu

Alias is not generated when I make the change to:

// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx906 -aux-triple x86_64-unknown-linux-gnu \
// RUN:   -x hip -fcuda-is-device -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true \
// RUN:   -mllvm -amdgpu-function-calls=false -emit-llvm %s -o - | FileCheck %s

Closed by commit rG0567f0333176: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols (authored by gandhi21299). · Oct 18 2021, 3:53 PM

This revision was automatically updated to reflect the committed changes.

gandhi21299 added a commit: rG0567f0333176: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols.

arsenm added inline comments. · Oct 19 2021, 1:16 PM

llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
67	I thought aliases could include embedded bitcasts of the function type, so the function wouldn't directly appear here

gandhi21299 added inline comments. · Oct 20 2021, 10:59 AM

llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
67	Can you please elaborate on "include embedded bitcasts of the function type"? It's a consequence of the AlwaysInliner where the callee gets replaced by the alias to a function, ie. @func_alias gets replaced by @func in the inline-calls.ll test.

arsenm added inline comments. · Oct 20 2021, 11:15 AM

llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
67	Something like this where the alias changes the type from the original function: @add1alias3 = alias float (float), bitcast (i32 (i32)* @add1 to float(float)*)

gandhi21299 added inline comments. · Oct 20 2021, 11:31 AM

llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
67	I see, that will probably break the compiler since a bitcast expression is not a Function.
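
The concern raised in this thread, an alias whose aliasee is a bitcast expression rather than a function, can be illustrated with a small toy model in plain C++. IrKind, IrNode and stripToFunction are hypothetical names for this sketch and are not LLVM API; the model also assumes the operand chain is acyclic. LLVM has its own helpers for looking through pointer casts (for example Value::stripPointerCasts()), but whether one of them should be used here is not settled in this review.

```cpp
// Toy value kinds; NOT the LLVM Value hierarchy.
enum class IrKind { Function, Alias, BitCast };

struct IrNode {
  IrKind Kind;
  // Aliasee for an Alias, casted value for a BitCast, nullptr for a Function.
  const IrNode *Operand;
};

// Walks through any chain of aliases and bitcasts until a function is found.
// Returns nullptr if the chain ends without reaching a function.
const IrNode *stripToFunction(const IrNode *N) {
  while (N && N->Kind != IrKind::Function)
    N = N->Operand;
  return N;
}
```

Modeling the @add1alias3 example above as Alias -> BitCast -> Function, a single-level look-through would stop at the BitCast node, whereas this loop continues to the underlying function.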

Revision Contents

Path                                                     Size
clang/lib/Driver/ToolChains/Clang.cpp                    6 lines
clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu     17 lines
llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp        5 lines
llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp   5 lines
llvm/test/CodeGen/AMDGPU/inline-calls.ll                 15 lines
Diff 380537

clang/lib/Driver/ToolChains/Clang.cpp

[5,083 lines not shown; enclosing context: if (Args.hasArg(options::OPT_fdebug_pass_structure)) {]
     CmdArgs.push_back("Structure");
   }
   if (Args.hasArg(options::OPT_fdebug_pass_arguments)) {
     CmdArgs.push_back("-mdebug-pass");
     CmdArgs.push_back("Arguments");
   }

   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see <rdar://problem/7651567>), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see <rdar://problem/7651567>), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
     CmdArgs.push_back("-mconstructor-aliases");

   // Darwin's kernel doesn't support guard variables; just die if we
   // try to use them.
   if (KernelOrKext && RawTriple.isOSDarwin())
     CmdArgs.push_back("-fforbid-guard-variables");

   if (Args.hasFlag(options::OPT_mms_bitfields, options::OPT_mno_ms_bitfields,
[2,826 lines not shown]

clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu

This file was added.

// REQUIRES: amdgpu-registered-target, clang-driver

// RUN: %clang -target x86_64-unknown-linux-gnu --offload-arch=gfx906 --cuda-device-only -nogpulib -nogpuinc -x hip -emit-llvm -S -o - %s \
// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
// RUN:   FileCheck %s

#include "Inputs/cuda.h"

// CHECK: %struct.B = type { i8 }
struct B {

  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B, i32), void (%struct.B, i32)* @_ZN1BC2Ei
  __device__ B(int x);
};

__device__ B::B(int x) {
}

llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp

[9 lines not shown]
 /// This pass marks all internal functions as always_inline and creates
 /// duplicates of all other functions and marks the duplicates as always_inline.
 //
 //===----------------------------------------------------------------------===//

 #include "AMDGPU.h"
 #include "AMDGPUTargetMachine.h"
 #include "Utils/AMDGPUBaseInfo.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/CommandLine.h"

 using namespace llvm;

 namespace {

[59 lines not shown; enclosing context: recursivelyVisitUsers(GlobalValue &GV,]
   }
 }

 static bool alwaysInlineImpl(Module &M, bool GlobalOpt) {
   std::vector<GlobalAlias*> AliasesToRemove;

   SmallPtrSet<Function *, 8> FuncsToAlwaysInline;
   SmallPtrSet<Function *, 8> FuncsToNoInline;
+  Triple TT(M.getTargetTriple());

   for (GlobalAlias &A : M.aliases()) {
     if (Function* F = dyn_cast<Function>(A.getAliasee())) {
+      if (TT.getArch() == Triple::amdgcn &&
+          A.getLinkage() != GlobalValue::InternalLinkage)
+        continue;
       A.replaceAllUsesWith(F);
       AliasesToRemove.push_back(&A);
     }

     // FIXME: If the aliasee isn't a function, it's some kind of constant expr
     // cast that won't be inlined through.
   }

[63 lines not shown]

llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp

[23 lines not shown]
 //===----------------------------------------------------------------------===//

 #include "AMDGPUResourceUsageAnalysis.h"
 #include "AMDGPU.h"
 #include "GCNSubtarget.h"
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"

 using namespace llvm;
 using namespace llvm::AMDGPU;

 #define DEBUG_TYPE "amdgpu-resource-usage"

 char llvm::AMDGPUResourceUsageAnalysis::ID = 0;
[16 lines not shown]
 INITIALIZE_PASS(AMDGPUResourceUsageAnalysis, DEBUG_TYPE,
                 "Function register usage analysis", true, true)

 static const Function *getCalleeFunction(const MachineOperand &Op) {
   if (Op.isImm()) {
     assert(Op.getImm() == 0);
     return nullptr;
   }
+  if (auto *GA = dyn_cast<GlobalAlias>(Op.getGlobal()))
+    return cast<Function>(GA->getOperand(0));
   return cast<Function>(Op.getGlobal());
 }

 static bool hasAnyNonFlatUseOfReg(const MachineRegisterInfo &MRI,
                                   const SIInstrInfo &TII, unsigned Reg) {
   for (const MachineOperand &UseOp : MRI.reg_operands(Reg)) {
     if (!UseOp.isImplicit() || !TII.isFLAT(*UseOp.getParent()))
       return true;
   }
[446 lines not shown]

llvm/test/CodeGen/AMDGPU/inline-calls.ll

-; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tahiti -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tonga -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple r600-unknown-linux-gnu -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s --check-prefix=R600

 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
 entry:
   %tmp0 = add i32 %a, 1
   ret i32 %tmp0
 }

-; ALL: {{^}}kernel:
+; CHECK: {{^}}kernel:
 ; GCN-NOT: s_swappc_b64
 define amdgpu_kernel void @kernel(i32 addrspace(1)* %out) {
 entry:
   %tmp0 = call i32 @func(i32 1)
   store i32 %tmp0, i32 addrspace(1)* %out
   ret void
 }

-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; R600-NOT: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func

-; ALL: {{^}}kernel3:
+; CHECK-NOT: {{^}}kernel3:
 ; GCN-NOT: s_swappc_b64
+; R600: {{^}}kernel3:
 define amdgpu_kernel void @kernel3(i32 addrspace(1)* %out) {
 entry:
   %tmp0 = call i32 @func_alias(i32 1)
   store i32 %tmp0, i32 addrspace(1)* %out
   ret void
 }
