This is an archive of the discontinued LLVM Phabricator instance.

Differential D65410

[PassManager] First Pass implementation at -O1 pass pipeline
Closed · Public

Authored by echristo on Jul 29 2019, 11:11 AM.

Details

Reviewers

chandlerc
hfinkel
omjavaid

Commits

rGfd39b1bb20ce: Revert "Revert "As a follow-up to my initial mail to llvm-dev here's a first…
rGc9ddb02659e3: Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at…
rG8ff85ed905a7: As a follow-up to my initial mail to llvm-dev here's a first pass at the O1…

Summary

As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.

Rough internal testing, using a bootstrap build and test run of clang, shows a combined build-and-test time nearly equivalent to O3 and a substantial speedup over O0. It is currently a little slower than the existing O1, likely because the clang+llvm test suite runs the same binaries many times rather than a few times for individual tests. Build time is a bit better. For a larger build with a smaller test time (think a couple of unit tests), this is a bit better than O3, O0, or O1. Overall binary size drops significantly compared to O0.

This change does not move from SelectionDAG to FastISel; that will come later, with other numbers that should help inform that decision. I also haven't done any real debuggability studies with this pipeline yet; I wanted to get the initial version out so that people could see it and we could start tweaking afterwards.

Test updates: Outside of the newpm tests, most of the updates come from optimization passes that are no longer run (and for which there is no compelling argument at the moment) but that were largely used for canonicalization in clang.

Original post:

http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

echristo created this revision. Jul 29 2019, 11:11 AM

Herald added projects: Restricted Project, Restricted Project. Jul 29 2019, 11:11 AM

Herald added subscribers: cfe-commits, jfb, dexonsmith and 5 others.

gbedwell added a subscriber: gbedwell. Jul 30 2019, 1:08 AM

Orlando added a subscriber: Orlando. Jul 30 2019, 1:48 AM

phosek added a subscriber: phosek. Aug 2 2019, 5:25 PM

Thanks for starting on this. Can you go ahead and replace the sroa calls with mem2reg calls for O1 and then see what that does to the performance? That strikes me as a major change, but certainly one that potentially makes sense, so I'd rather we go ahead and test it now before we make decisions about other adjustments.

FWIW, I thought that we might run InstCombine less often (or maybe replace it with InstSimplify, in some places). Did you try that?

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
356	By definition, this loses information from the call stack, no?
427	Yes, I'd fall back to using regular DCE.
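
As a rough illustration of the experiment suggested above (swapping SROA for mem2reg at O1, and InstCombine for InstSimplify), here is a minimal, self-contained C++ sketch. It does not use LLVM's pass manager API: `PipelineOptions`, `buildFunctionPipeline`, and the particular pass ordering are hypothetical stand-ins chosen only to show where such switches might sit.

```cpp
#include <string>
#include <vector>

// Hypothetical knobs for the experiment; none of these names come from LLVM.
struct PipelineOptions {
  unsigned OptLevel = 1;            // 0..3
  bool UseMem2RegAtO1 = false;      // swap SROA for mem2reg at O1 only
  bool UseInstSimplifyAtO1 = false; // swap InstCombine for InstSimplify at O1 only
};

// Returns the ordered pass names a toy function pipeline would run.
// The ordering is illustrative, not the one used by the patch.
std::vector<std::string> buildFunctionPipeline(const PipelineOptions &Opts) {
  std::vector<std::string> Passes;
  if (Opts.OptLevel == 0)
    return Passes; // nothing is scheduled at O0

  const bool IsO1 = Opts.OptLevel == 1;
  Passes.push_back(IsO1 && Opts.UseMem2RegAtO1 ? "mem2reg" : "sroa");
  Passes.push_back("early-cse");
  Passes.push_back("simplifycfg");
  Passes.push_back(IsO1 && Opts.UseInstSimplifyAtO1 ? "instsimplify"
                                                    : "instcombine");
  return Passes;
}
```

Keeping the switches as plain booleans makes it cheap to measure each substitution separately before deciding on other adjustments.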

One high level point that is at least worth clarifying, and maybe others will want to suggest a different approach:

The overall approach here is to have as small of a difference between the O1 and O2 pipelines as possible.

An alternative approach that we could take would be to design a focused O1 pipeline without regard to how much it diverges from the O2 pipeline.

Which approach is used somewhat depends on the goals. I feel like the goal here is to get as close to the level of optimization at O2 as possible without losing compile time or coherent backtraces for test / assertion failures. For that goal, the approach taken makes sense. But it seems important to clarify that goal as otherwise I think we'd want to go in very different directions.

In D65410#1613555, @hfinkel wrote:

Thanks for starting on this. Can you go ahead and replace the sroa calls with mem2reg calls for O1 and then see what that does to the performance? That strikes me as a major change, but certainly one that potentially makes sense, so I'd rather we go ahead and test it now before we make decisions about other adjustments.

I really think we need mem2reg at least at -O1... In fact, I really think we need SROA at O1. If it is actually a compile time problem, I'd like to fix that in SROA. I don't really expect it to be though.

FWIW, I thought that we might run InstCombine less often (or maybe replace it with InstSimplify, in some places). Did you try that?

I think the biggest thing to do would be to avoid repeated runs of instcombine over the same code. I suspect we want at least one run after inliner and inside the CGSCC walk for canonicalization. But it'd be great to limit it to exactly one or maaaaybe one before and one after the loop pipeline.

llvm/lib/Passes/PassBuilder.cpp
412–421	I think you can merge all of these?
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
356	Yeah, I'd have really expected this to be skipped.
427	+1
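
For readers less familiar with the passes being weighed above, the following C++ sketch shows at source level what scalar replacement is meant to achieve. It is only an analogy, assuming a trivially small struct: the function names are made up for illustration, and the real transformation operates on allocas in LLVM IR, not on C++ source.

```cpp
struct Point {
  int X;
  int Y;
};

// As written: the aggregate `P` occupies a stack slot (an alloca in LLVM IR)
// and each field access is a load or store through that slot.
int sumAggregate(int A, int B) {
  Point P;
  P.X = A;
  P.Y = B;
  return P.X + P.Y;
}

// Roughly what is left after the aggregate is split into scalars and those
// scalars are promoted to values: no stack slot, just two independent values.
int sumScalarized(int A, int B) {
  int PX = A;
  int PY = B;
  return PX + PY;
}
```

mem2reg on its own promotes scalar stack slots but does not split aggregates, which is one reason the two passes are not interchangeable for code like `sumAggregate`.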

The goal from the original email was:

The design goal is to rewrite the O1
optimization and code generation pipeline to include the set of
optimizations that minimizes build and test time while retaining our
ability to debug.

"Retaining our ability to debug" is a more constraining goal than Chandler's "coherent backtraces for test / assertion failures." A good debugging experience comes from good variable-location information, and minimal reordering of instructions. This is different from "don't mess with my stack frames" which lets you do pretty much anything that doesn't involve a call.

Without presuming to speak for Eric, I don't think there was an explicit goal to make O1 look like a stripped-down O2? But certain pass orderings make sense, and so where the same passes occur, the same ordering would be pretty likely. So, building an O1 pipeline by erasing some stuff from the O2 sequence makes some sense, because it's the practical thing and not because it's a goal to do it that way.

As for specific pass choices: Based on the data from Greg Bedwell's lightning talk last year (https://llvm.org/devmtg/2018-04/talks.html#Lightning_11) the three worst offenders for debuggability were LICM, InstCombine, and SROA. It looks like this patch does exclude SROA but not the others? Or at least not all cases of LICM.

I'd suspect that SROA's main problem is that we don't do a good job of tracking broken-up fragments (the original SROA rewrite didn't even try, IIRC), and if the pass has a good compile-time versus run-time benefit, it's worth investing in making that work better.
LICM just inherently interferes with smooth debugging; it's hard to do anything useful with loops that remains nicely debuggable.
I don't know what InstCombine's problem is.
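
To illustrate why LICM tends to get in the way of stepping through loops, here is a small hand-written C++ sketch. The second function is a manual approximation of what hoisting a loop-invariant computation looks like; the function names are hypothetical, and the actual pass works on LLVM IR.

```cpp
#include <cstddef>
#include <vector>

// As written: `Scale * Offset` appears inside the loop body, so a debugger
// stepping line by line sees it evaluated on every iteration.
long sumScaled(const std::vector<int> &Values, long Scale, long Offset) {
  long Total = 0;
  for (std::size_t I = 0; I < Values.size(); ++I)
    Total += Values[I] + Scale * Offset;
  return Total;
}

// Hand-hoisted equivalent: the invariant product is computed once before the
// loop, so the multiplication no longer executes at its original source line.
long sumScaledHoisted(const std::vector<int> &Values, long Scale, long Offset) {
  const long Invariant = Scale * Offset; // moved out of the loop
  long Total = 0;
  for (std::size_t I = 0; I < Values.size(); ++I)
    Total += Values[I] + Invariant;
  return Total;
}
```

Both functions return the same result; the difference is only in where the work happens relative to the source lines, which is what a debugger user notices.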

In D65410#1617366, @probinson wrote:

The goal from the original email was:

The design goal is to rewrite the O1
optimization and code generation pipeline to include the set of
optimizations that minimizes build and test time while retaining our
ability to debug.

"Retaining our ability to debug" is a more constraining goal than Chandler's "coherent backtraces for test / assertion failures." A good debugging experience comes from good variable-location information, and minimal reordering of instructions. This is different from "don't mess with my stack frames" which lets you do pretty much anything that doesn't involve a call.

Without presuming to speak for Eric, I don't think there was an explicit goal to make O1 look like a stripped-down O2? But certain pass orderings make sense, and so where the same passes occur, the same ordering would be pretty likely. So, building an O1 pipeline by erasing some stuff from the O2 sequence makes some sense, because it's the practical thing and not because it's a goal to do it that way.

Not an explicit goal of a stripped down O2, but honestly probably not too far off.

(further below)...

As for specific pass choices: Based on the data from Greg Bedwell's lightning talk last year (https://llvm.org/devmtg/2018-04/talks.html#Lightning_11) the three worst offenders for debuggability were LICM, InstCombine, and SROA. It looks like this patch does exclude SROA but not the others? Or at least not all cases of LICM.

I'd suspect that SROA's main problem is that we don't do a good job of tracking broken-up fragments (the original SROA rewrite didn't even try, IIRC), and if the pass has a good compile-time versus run-time benefit, it's worth investing in making that work better.
LICM just inherently interferes with smooth debugging; it's hard to do anything useful with loops that remains nicely debuggable.
I don't know what InstCombine's problem is.

Smooth is going to be somewhat subjective and there's some work in instcombine (rnk took a stab at it and is currently reviewing a pretty good patch I think in this area). SROA just needs some tracking work if there are still problems.

Now LICM - this is one of those I've gone back and forth on and I'd like to see what debugging and performance look like with and without. I like the general perspective of "remove the abstraction penalties" as an O1/Og sort of thing. That said, ultimately something that people are able to use to debug their code is the ultimate goal - it won't be 100%, but as good as we can make it in the general case.

At any rate, this is just a first pass; both the individual passes and the particular pipeline are going to need work for debugging.

-eric

In D65410#1613555, @hfinkel wrote:

Thanks for starting on this. Can you go ahead and replace the sroa calls with mem2reg calls for O1 and then see what that does to the performance? That strikes me as a major change, but certainly one that potentially makes sense, so I'd rather we go ahead and test it now before we make decisions about other adjustments.

I'll give it a shot. I think we'll want it in the long run, but happy to run it through the performance blender just to get an idea of what we're getting for our complexity.

FWIW, I thought that we might run InstCombine less often (or maybe replace it with InstSimplify, in some places). Did you try that?

I haven't. It's one part of pass ordering that we'll want to look at, but I figured some optimizing here could happen later. Happy to try a few things though.

I've gone ahead and enabled SROA here. In the testing I've done so far it has helped execution time quite a bit, and compile time and object size as well. It should also be really good for use with the trivial auto var initialization option. SROA is a bit of a worry right now for debugging, but it's an area that's been improved upon significantly and I'm not worried about it getting a lot better.

Future plans here are likely going to be moving it to a more separate pass pipeline so I can pull out things like superfluous instcombines etc while continuing to do incremental performance improvements. In addition, I'm going to do experiments with enabling fast-isel at O1 so we can look at the compile time and performance impact of individual changes there - and hopefully some debugging analysis in the near term.

Herald added subscribers: aheejin, sbc100, nhaehnle, jvesely. Nov 7 2019, 4:46 PM

Harbormaster failed remote builds in B40664: Diff 228338! Nov 7 2019, 4:46 PM

Update to remove comments around SROA addition.

MaskRay added a subscriber: MaskRay. Nov 7 2019, 4:55 PM

jmorse added a subscriber: jmorse. Nov 8 2019, 3:31 AM

ormris added a subscriber: ormris. Nov 8 2019, 10:36 AM

Ping :)

LGTM to land this and iterate, but you likely want someone else to confirm :)

llvm/test/Other/new-pm-defaults.ll
230	Just a drive-by idea: you could have simplified the changes here with a new prefix: "CHECK-O23sz" that you could add to the non-O1 invocations.

Minor nits around redundant predicates for SROA. With those fixed, LGTM.

I'd really love to find a way to make TCO debuggable so that we don't lose that. I'm particularly worried about code that relies on it to not run out of stack. Not sure what the best thing to do here is though. Anyways, not relevant for this iteration. I mostly feel bad for a potential future re-churn of all the tests. ;]
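
As a concrete picture of the kind of code that depends on tail call optimization to avoid running out of stack, consider this minimal C++ sketch. The function names are hypothetical and chosen for illustration only. Whether the recursive form actually runs in constant stack space depends on the optimizer turning the tail call into a jump.

```cpp
#include <cstdint>

// Tail-recursive: the recursive call is the last action of the function, so
// an optimizer may reuse the current frame. Without that transformation,
// stack usage grows linearly with N.
std::uint64_t sumToTailRecursive(std::uint64_t N, std::uint64_t Acc = 0) {
  if (N == 0)
    return Acc;
  return sumToTailRecursive(N - 1, Acc + N);
}

// Explicit loop: constant stack usage regardless of optimization level.
std::uint64_t sumToIterative(std::uint64_t N) {
  std::uint64_t Acc = 0;
  for (; N != 0; --N)
    Acc += N;
  return Acc;
}
```

For small `N` both forms behave identically. The concern is large `N` in a build where the tail call is left as an ordinary call.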

llvm/lib/Passes/PassBuilder.cpp
397	We know `O0` isn't used here, so this should be a no-op.
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
263	We early exit at `O0` above, so this is a no-op.
293	We only reach here if `OptLevel > 0` so this should be redundant?
324	This doesn't have the assert, but I believe this is only used above `O0` as well. Maybe just add the assert?
llvm/test/Other/new-pm-defaults.ll
230	Good idea!
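
The inline comments on redundant `OptLevel` predicates can be sketched as follows. This is a self-contained C++ illustration, not the code from PassManagerBuilder.cpp: `populatePipeline` and `addFunctionSimplificationPasses` are hypothetical names, and the pass list is a placeholder.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper that is only ever reached when OptLevel > 0. The
// precondition is documented with an assert, so the body needs no
// `if (OptLevel > 0)` guard around individual passes.
void addFunctionSimplificationPasses(unsigned OptLevel,
                                     std::vector<std::string> &Passes) {
  assert(OptLevel > 0 && "must not be called at O0");
  Passes.push_back("sroa"); // unconditional: O0 cannot reach this point
  Passes.push_back("early-cse");
}

std::vector<std::string> populatePipeline(unsigned OptLevel) {
  std::vector<std::string> Passes;
  if (OptLevel == 0)
    return Passes; // early exit makes later OptLevel > 0 checks no-ops
  addFunctionSimplificationPasses(OptLevel, Passes);
  return Passes;
}
```

Stating the invariant once, at the entry point, keeps the pass list itself free of checks that can never fail.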

This revision is now accepted and ready to land. Nov 24 2019, 12:44 AM

Closed by commit rG8ff85ed905a7: As a follow-up to my initial mail to llvm-dev here's a first pass at the O1… (authored by echristo). Nov 25 2019, 5:22 PM

This revision was automatically updated to reflect the committed changes.

Re-opening this because I have reverted the commit due to failures seen on LLDB AArch64 buildbot with this commit.

This revision is now accepted and ready to land. Nov 25 2019, 8:42 PM

omjavaid requested changes to this revision. Nov 25 2019, 8:42 PM

omjavaid added a reviewer: omjavaid.

This revision now requires changes to proceed. Nov 25 2019, 8:42 PM

This revision was not accepted when it landed; it landed in state Needs Revision. Nov 26 2019, 8:32 PM

Closed by commit rGfd39b1bb20ce: Revert "Revert "As a follow-up to my initial mail to llvm-dev here's a first… (authored by echristo).

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Nov 26 2019, 8:32 PM

Herald added a subscriber: lldb-commits.

djtodoro mentioned this in D68209: [LiveDebugValues] Introduce entry values of unmodified params. Dec 4 2019, 3:15 AM

Revision Contents

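Path	Size
clang/test/CodeGen/2008-07-30-implicit-initialization.c	2 lines
clang/test/CodeGen/arm-fp16-arguments.c	6 lines
clang/test/CodeGen/arm-vfp16-arguments2.cpp	6 lines
clang/test/CodeGen/atomic-ops-libcall.c	34 lines
clang/test/CodeGenCXX/atomicinit.cpp	2 lines
clang/test/CodeGenCXX/auto-var-init.cpp	9 lines
clang/test/CodeGenCXX/discard-name-values.cpp	4 lines
clang/test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp	18 lines
clang/test/CodeGenCXX/microsoft-abi-typeid.cpp	8 lines
clang/test/CodeGenCXX/nrvo.cpp	18 lines
clang/test/CodeGenCXX/stack-reuse.cpp	2 lines
clang/test/CodeGenCXX/wasm-args-returns.cpp	12 lines
clang/test/CodeGenObjCXX/arc-blocks.mm	6 lines
clang/test/CodeGenObjCXX/nrvo.mm	4 lines
clang/test/Lexer/minimize_source_to_dependency_directives_invalid_error.c	32 lines
clang/test/PCH/no-escaping-block-tail-calls.cpp	4 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/ambiguous_tail_call_seq1/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/ambiguous_tail_call_seq2/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/disambiguate_call_site/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/disambiguate_tail_call_seq/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/inlining_and_tail_calls/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/sbapi_support/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/thread_step_out_message/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/thread_step_out_or_return/Makefile	2 lines
lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/unambiguous_sequence/Makefile	2 lines
llvm/include/llvm/Passes/PassBuilder.h	10 lines
llvm/lib/Passes/PassBuilder.cpp	48 lines
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp	46 lines
llvm/test/CodeGen/AMDGPU/simplify-libcalls.ll	268 lines
llvm/test/Feature/optnone-opt.ll	6 lines
llvm/test/Other/new-pm-defaults.ll	78 lines
llvm/test/Other/new-pm-thinlto-defaults.ll	46 lines
llvm/test/Transforms/MemCpyOpt/lifetime.ll	2 lines
llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll	8 lines
llvm/test/Transforms/PhaseOrdering/two-shifts-by-sext.ll	4 lines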

Diff 231172

clang/test/CodeGen/2008-07-30-implicit-initialization.c

	// RUN: %clang_cc1 -triple i386-unknown-unknown -O1 -emit-llvm -o - %s | FileCheck %s			// RUN: %clang_cc1 -triple i386-unknown-unknown -O2 -emit-llvm -o - %s | FileCheck %s
	// CHECK-LABEL: define i32 @f0()			// CHECK-LABEL: define i32 @f0()
	// CHECK: ret i32 0			// CHECK: ret i32 0
	// CHECK-LABEL: define i32 @f1()			// CHECK-LABEL: define i32 @f1()
	// CHECK: ret i32 0			// CHECK: ret i32 0
	// CHECK-LABEL: define i32 @f2()			// CHECK-LABEL: define i32 @f2()
	// CHECK: ret i32 0			// CHECK: ret i32 0
	// <rdar://problem/6113085>			// <rdar://problem/6113085>

clang/test/CodeGen/arm-fp16-arguments.c

	// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi soft -fallow-half-arguments-and-returns -emit-llvm -o - -O1 %s | FileCheck %s --check-prefix=CHECK --check-prefix=SOFT			// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi soft -fallow-half-arguments-and-returns -emit-llvm -o - -O2 %s | FileCheck %s --check-prefix=CHECK --check-prefix=SOFT
	// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi hard -fallow-half-arguments-and-returns -emit-llvm -o - -O1 %s | FileCheck %s --check-prefix=CHECK --check-prefix=HARD			// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi hard -fallow-half-arguments-and-returns -emit-llvm -o - -O2 %s | FileCheck %s --check-prefix=CHECK --check-prefix=HARD
	// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi soft -fnative-half-arguments-and-returns -emit-llvm -o - -O1 %s | FileCheck %s --check-prefix=NATIVE			// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi soft -fnative-half-arguments-and-returns -emit-llvm -o - -O2 %s | FileCheck %s --check-prefix=NATIVE

	__fp16 g;			__fp16 g;

	void t1(__fp16 a) { g = a; }			void t1(__fp16 a) { g = a; }
	// SOFT: define void @t1(i32 [[PARAM:%.*]])			// SOFT: define void @t1(i32 [[PARAM:%.*]])
	// SOFT: [[TRUNC:%.*]] = trunc i32 [[PARAM]] to i16			// SOFT: [[TRUNC:%.*]] = trunc i32 [[PARAM]] to i16
	// HARD: define arm_aapcs_vfpcc void @t1(float [[PARAM:%.*]])			// HARD: define arm_aapcs_vfpcc void @t1(float [[PARAM:%.*]])
	// HARD: [[BITCAST:%.*]] = bitcast float [[PARAM]] to i32			// HARD: [[BITCAST:%.*]] = bitcast float [[PARAM]] to i32

clang/test/CodeGen/arm-vfp16-arguments2.cpp

	// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \			// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \
	// RUN: -mfloat-abi soft -target-feature +neon -emit-llvm -o - -O1 %s \			// RUN: -mfloat-abi soft -target-feature +neon -emit-llvm -o - -O2 %s \
	// RUN: | FileCheck %s --check-prefix=CHECK-SOFT			// RUN: | FileCheck %s --check-prefix=CHECK-SOFT
	// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \			// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \
	// RUN: -mfloat-abi hard -target-feature +neon -emit-llvm -o - -O1 %s \			// RUN: -mfloat-abi hard -target-feature +neon -emit-llvm -o - -O2 %s \
	// RUN: | FileCheck %s --check-prefix=CHECK-HARD			// RUN: | FileCheck %s --check-prefix=CHECK-HARD
	// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \			// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \
	// RUN: -mfloat-abi hard -target-feature +neon -target-feature +fullfp16 \			// RUN: -mfloat-abi hard -target-feature +neon -target-feature +fullfp16 \
	// RUN: -emit-llvm -o - -O1 %s \			// RUN: -emit-llvm -o - -O2 %s \
	// RUN: | FileCheck %s --check-prefix=CHECK-FULL			// RUN: | FileCheck %s --check-prefix=CHECK-FULL

	typedef float float32_t;			typedef float float32_t;
	typedef __fp16 float16_t;			typedef __fp16 float16_t;
	typedef __attribute__((neon_vector_type(2))) float32_t float32x2_t;			typedef __attribute__((neon_vector_type(2))) float32_t float32x2_t;
	typedef __attribute__((neon_vector_type(4))) float16_t float16x4_t;			typedef __attribute__((neon_vector_type(4))) float16_t float16x4_t;

	struct S1 {			struct S1 {

clang/test/CodeGen/atomic-ops-libcall.c

	// RUN: %clang_cc1 < %s -triple armv5e-none-linux-gnueabi -emit-llvm -O1 | FileCheck %s			// RUN: %clang_cc1 < %s -triple armv5e-none-linux-gnueabi -emit-llvm -O1 | FileCheck %s

	// FIXME: This file should not be checking -O1 output.			// FIXME: This file should not be checking -O1 output.
	// Ie, it is testing many IR optimizer passes as part of front-end verification.			// Ie, it is testing many IR optimizer passes as part of front-end verification.

	enum memory_order {			enum memory_order {
	memory_order_relaxed, memory_order_consume, memory_order_acquire,			memory_order_relaxed, memory_order_consume, memory_order_acquire,
	memory_order_release, memory_order_acq_rel, memory_order_seq_cst			memory_order_release, memory_order_acq_rel, memory_order_seq_cst
	};			};

	int test_c11_atomic_fetch_add_int_ptr(_Atomic(int ) *p) {			int test_c11_atomic_fetch_add_int_ptr(_Atomic(int ) *p) {
	// CHECK: test_c11_atomic_fetch_add_int_ptr			// CHECK: test_c11_atomic_fetch_add_int_ptr
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_add_4(i8 {{%[0-9]+}}, i32 12, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_add_4(i8 {{%[0-9]+}}, i32 12, i32 5)
	return __c11_atomic_fetch_add(p, 3, memory_order_seq_cst);			return __c11_atomic_fetch_add(p, 3, memory_order_seq_cst);
	}			}

	int test_c11_atomic_fetch_sub_int_ptr(_Atomic(int ) *p) {			int test_c11_atomic_fetch_sub_int_ptr(_Atomic(int ) *p) {
	// CHECK: test_c11_atomic_fetch_sub_int_ptr			// CHECK: test_c11_atomic_fetch_sub_int_ptr
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 20, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 20, i32 5)
	return __c11_atomic_fetch_sub(p, 5, memory_order_seq_cst);			return __c11_atomic_fetch_sub(p, 5, memory_order_seq_cst);
	}			}

	int test_c11_atomic_fetch_add_int(_Atomic(int) *p) {			int test_c11_atomic_fetch_add_int(_Atomic(int) *p) {
	// CHECK: test_c11_atomic_fetch_add_int			// CHECK: test_c11_atomic_fetch_add_int
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_add_4(i8 {{%[0-9]+}}, i32 3, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_add_4(i8 {{%[0-9]+}}, i32 3, i32 5)
	return __c11_atomic_fetch_add(p, 3, memory_order_seq_cst);			return __c11_atomic_fetch_add(p, 3, memory_order_seq_cst);
	}			}

	int test_c11_atomic_fetch_sub_int(_Atomic(int) *p) {			int test_c11_atomic_fetch_sub_int(_Atomic(int) *p) {
	// CHECK: test_c11_atomic_fetch_sub_int			// CHECK: test_c11_atomic_fetch_sub_int
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 5, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 5, i32 5)
	return __c11_atomic_fetch_sub(p, 5, memory_order_seq_cst);			return __c11_atomic_fetch_sub(p, 5, memory_order_seq_cst);
	}			}

	int fp2a(int *p) {			int fp2a(int *p) {
	// CHECK: @fp2a			// CHECK: @fp2a
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 4, i32 0)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 4, i32 0)
	// Note, the GNU builtins do not multiply by sizeof(T)!			// Note, the GNU builtins do not multiply by sizeof(T)!
	return __atomic_fetch_sub(p, 4, memory_order_relaxed);			return __atomic_fetch_sub(p, 4, memory_order_relaxed);
	}			}

	int test_atomic_fetch_add(int *p) {			int test_atomic_fetch_add(int *p) {
	// CHECK: test_atomic_fetch_add			// CHECK: test_atomic_fetch_add
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_add_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_add_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	return __atomic_fetch_add(p, 55, memory_order_seq_cst);			return __atomic_fetch_add(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_fetch_sub(int *p) {			int test_atomic_fetch_sub(int *p) {
	// CHECK: test_atomic_fetch_sub			// CHECK: test_atomic_fetch_sub
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	return __atomic_fetch_sub(p, 55, memory_order_seq_cst);			return __atomic_fetch_sub(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_fetch_and(int *p) {			int test_atomic_fetch_and(int *p) {
	// CHECK: test_atomic_fetch_and			// CHECK: test_atomic_fetch_and
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_and_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_and_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	return __atomic_fetch_and(p, 55, memory_order_seq_cst);			return __atomic_fetch_and(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_fetch_or(int *p) {			int test_atomic_fetch_or(int *p) {
	// CHECK: test_atomic_fetch_or			// CHECK: test_atomic_fetch_or
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_or_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_or_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	return __atomic_fetch_or(p, 55, memory_order_seq_cst);			return __atomic_fetch_or(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_fetch_xor(int *p) {			int test_atomic_fetch_xor(int *p) {
	// CHECK: test_atomic_fetch_xor			// CHECK: test_atomic_fetch_xor
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_xor_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_xor_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	return __atomic_fetch_xor(p, 55, memory_order_seq_cst);			return __atomic_fetch_xor(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_fetch_nand(int *p) {			int test_atomic_fetch_nand(int *p) {
	// CHECK: test_atomic_fetch_nand			// CHECK: test_atomic_fetch_nand
	// CHECK: {{%[^ ]}} = tail call i32 @__atomic_fetch_nand_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: {{%[^ ]}} = call i32 @__atomic_fetch_nand_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	return __atomic_fetch_nand(p, 55, memory_order_seq_cst);			return __atomic_fetch_nand(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_add_fetch(int *p) {			int test_atomic_add_fetch(int *p) {
	// CHECK: test_atomic_add_fetch			// CHECK: test_atomic_add_fetch
	// CHECK: [[CALL:%[^ ]]] = tail call i32 @__atomic_fetch_add_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: [[CALL:%[^ ]]] = call i32 @__atomic_fetch_add_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	// CHECK: {{%[^ ]*}} = add i32 [[CALL]], 55			// CHECK: {{%[^ ]*}} = add i32 [[CALL]], 55
	return __atomic_add_fetch(p, 55, memory_order_seq_cst);			return __atomic_add_fetch(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_sub_fetch(int *p) {			int test_atomic_sub_fetch(int *p) {
	// CHECK: test_atomic_sub_fetch			// CHECK: test_atomic_sub_fetch
	// CHECK: [[CALL:%[^ ]]] = tail call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: [[CALL:%[^ ]]] = call i32 @__atomic_fetch_sub_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	// CHECK: {{%[^ ]*}} = add i32 [[CALL]], -55			// CHECK: {{%[^ ]*}} = add i32 [[CALL]], -55
	return __atomic_sub_fetch(p, 55, memory_order_seq_cst);			return __atomic_sub_fetch(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_and_fetch(int *p) {			int test_atomic_and_fetch(int *p) {
	// CHECK: test_atomic_and_fetch			// CHECK: test_atomic_and_fetch
	// CHECK: [[CALL:%[^ ]]] = tail call i32 @__atomic_fetch_and_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: [[CALL:%[^ ]]] = call i32 @__atomic_fetch_and_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	// CHECK: {{%[^ ]*}} = and i32 [[CALL]], 55			// CHECK: {{%[^ ]*}} = and i32 [[CALL]], 55
	return __atomic_and_fetch(p, 55, memory_order_seq_cst);			return __atomic_and_fetch(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_or_fetch(int *p) {			int test_atomic_or_fetch(int *p) {
	// CHECK: test_atomic_or_fetch			// CHECK: test_atomic_or_fetch
	// CHECK: [[CALL:%[^ ]]] = tail call i32 @__atomic_fetch_or_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: [[CALL:%[^ ]]] = call i32 @__atomic_fetch_or_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	// CHECK: {{%[^ ]*}} = or i32 [[CALL]], 55			// CHECK: {{%[^ ]*}} = or i32 [[CALL]], 55
	return __atomic_or_fetch(p, 55, memory_order_seq_cst);			return __atomic_or_fetch(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_xor_fetch(int *p) {			int test_atomic_xor_fetch(int *p) {
	// CHECK: test_atomic_xor_fetch			// CHECK: test_atomic_xor_fetch
	// CHECK: [[CALL:%[^ ]]] = tail call i32 @__atomic_fetch_xor_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: [[CALL:%[^ ]]] = call i32 @__atomic_fetch_xor_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	// CHECK: {{%[^ ]*}} = xor i32 [[CALL]], 55			// CHECK: {{%[^ ]*}} = xor i32 [[CALL]], 55
	return __atomic_xor_fetch(p, 55, memory_order_seq_cst);			return __atomic_xor_fetch(p, 55, memory_order_seq_cst);
	}			}

	int test_atomic_nand_fetch(int *p) {			int test_atomic_nand_fetch(int *p) {
	// CHECK: test_atomic_nand_fetch			// CHECK: test_atomic_nand_fetch
	// CHECK: [[CALL:%[^ ]]] = tail call i32 @__atomic_fetch_nand_4(i8 {{%[0-9]+}}, i32 55, i32 5)			// CHECK: [[CALL:%[^ ]]] = call i32 @__atomic_fetch_nand_4(i8 {{%[0-9]+}}, i32 55, i32 5)
	// FIXME: We should not be checking optimized IR. It changes independently of clang.			// FIXME: We should not be checking optimized IR. It changes independently of clang.
	// FIXME-CHECK: [[AND:%[^ ]*]] = and i32 [[CALL]], 55			// FIXME-CHECK: [[AND:%[^ ]*]] = and i32 [[CALL]], 55
	// FIXME-CHECK: {{%[^ ]*}} = xor i32 [[AND]], -1			// FIXME-CHECK: {{%[^ ]*}} = xor i32 [[AND]], -1
	return __atomic_nand_fetch(p, 55, memory_order_seq_cst);			return __atomic_nand_fetch(p, 55, memory_order_seq_cst);
	}			}

clang/test/CodeGenCXX/atomicinit.cpp

struct B {		struct B {
B(int x) : i(x) {}		B(int x) : i(x) {}
};		};

_Atomic(B) b;		_Atomic(B) b;

// CHECK-LABEL: define void @_Z11atomic_initR1Ai		// CHECK-LABEL: define void @_Z11atomic_initR1Ai
void atomic_init(A& a, int i) {		void atomic_init(A& a, int i) {
// CHECK-NOT: atomic		// CHECK-NOT: atomic
// CHECK: tail call void @_ZN1BC1Ei		// CHECK: call void @_ZN1BC1Ei
__c11_atomic_init(&b, B(i));		__c11_atomic_init(&b, B(i));
// CHECK-NEXT: ret void		// CHECK-NEXT: ret void
}		}

// CHECK-LABEL: define void @_Z16atomic_init_boolPU7_Atomicbb		// CHECK-LABEL: define void @_Z16atomic_init_boolPU7_Atomicbb
void atomic_init_bool(_Atomic(bool) *ab, bool b) {		void atomic_init_bool(_Atomic(bool) *ab, bool b) {
// CHECK-NOT: atomic		// CHECK-NOT: atomic
// CHECK: {{zext i1.*to i8}}		// CHECK: {{zext i1.*to i8}}

clang/test/CodeGenCXX/auto-var-init.cpp

	// CHECK-NEXT: call void @{{.}}used{{.}}%uninit)			// CHECK-NEXT: call void @{{.}}used{{.}}%uninit)
	// PATTERN-LABEL: @test_smallpartinit_uninit()			// PATTERN-LABEL: @test_smallpartinit_uninit()
	// PATTERN-O0: call void @llvm.memcpy{{.*}} @__const.test_smallpartinit_uninit.uninit			// PATTERN-O0: call void @llvm.memcpy{{.*}} @__const.test_smallpartinit_uninit.uninit
	// PATTERN-O1: store i8 [[I8]], {{.*}} align 1			// PATTERN-O1: store i8 [[I8]], {{.*}} align 1
	// PATTERN-O1: store i8 42, {{.*}} align 1			// PATTERN-O1: store i8 42, {{.*}} align 1
	// ZERO-LABEL: @test_smallpartinit_uninit()			// ZERO-LABEL: @test_smallpartinit_uninit()
	// ZERO-O0: call void @llvm.memset{{.*}}, i8 0,			// ZERO-O0: call void @llvm.memset{{.*}}, i8 0,
	// ZERO-O1-LEGACY: store i16 0, i16* %uninit, align 2			// ZERO-O1-LEGACY: store i16 0, i16* %uninit, align 2
	// ZERO-O1-NEWPM: store i16 42, i16* %uninit, align 2			// ZERO-O1-NEWPM: store i16 0, i16* %uninit, align 2

	TEST_BRACES(smallpartinit, smallpartinit);			TEST_BRACES(smallpartinit, smallpartinit);
	// CHECK-LABEL: @test_smallpartinit_braces()			// CHECK-LABEL: @test_smallpartinit_braces()
	// CHECK: %braces = alloca %struct.smallpartinit, align [[ALIGN:[0-9]*]]			// CHECK: %braces = alloca %struct.smallpartinit, align [[ALIGN:[0-9]*]]
	// CHECK-NEXT: %[[C:[^ ]*]] = getelementptr inbounds %struct.smallpartinit, %struct.smallpartinit* %braces, i32 0, i32 0			// CHECK-NEXT: %[[C:[^ ]*]] = getelementptr inbounds %struct.smallpartinit, %struct.smallpartinit* %braces, i32 0, i32 0
	// CHECK-NEXT: store i8 42, i8* %[[C]], align [[ALIGN]]			// CHECK-NEXT: store i8 42, i8* %[[C]], align [[ALIGN]]
	// CHECK-NEXT: %[[D:[^ ]*]] = getelementptr inbounds %struct.smallpartinit, %struct.smallpartinit* %braces, i32 0, i32 1			// CHECK-NEXT: %[[D:[^ ]*]] = getelementptr inbounds %struct.smallpartinit, %struct.smallpartinit* %braces, i32 0, i32 1
	// CHECK-NEXT: store i8 0, i8* %[[D]], align [[ALIGN]]			// CHECK-NEXT: store i8 0, i8* %[[D]], align [[ALIGN]]
	[56 lines not shown]
	TEST_UNINIT(paddednullinit, paddednullinit);			TEST_UNINIT(paddednullinit, paddednullinit);
	// CHECK-LABEL: @test_paddednullinit_uninit()			// CHECK-LABEL: @test_paddednullinit_uninit()
	// CHECK: %uninit = alloca %struct.paddednullinit, align			// CHECK: %uninit = alloca %struct.paddednullinit, align
	// CHECK-NEXT: call void @{{.*}}paddednullinit{{.*}}%uninit)			// CHECK-NEXT: call void @{{.*}}paddednullinit{{.*}}%uninit)
	// CHECK-NEXT: call void @{{.*}}used{{.*}}%uninit)			// CHECK-NEXT: call void @{{.*}}used{{.*}}%uninit)
	// PATTERN-LABEL: @test_paddednullinit_uninit()			// PATTERN-LABEL: @test_paddednullinit_uninit()
	// PATTERN-O0: call void @llvm.memcpy{{.*}} @__const.test_paddednullinit_uninit.uninit			// PATTERN-O0: call void @llvm.memcpy{{.*}} @__const.test_paddednullinit_uninit.uninit
	// PATTERN-O1-LEGACY: store i64 [[I64]], i64* %uninit, align 8			// PATTERN-O1-LEGACY: store i64 [[I64]], i64* %uninit, align 8
	// PATTERN-O1-NEWPM: store i64 2863311360, i64* %uninit, align 8			// PATTERN-O1-NEWPM: store i64 [[I64]], i64* %uninit, align 8
	// ZERO-LABEL: @test_paddednullinit_uninit()			// ZERO-LABEL: @test_paddednullinit_uninit()
	// ZERO-O0: call void @llvm.memset{{.*}}, i8 0,			// ZERO-O0: call void @llvm.memset{{.*}}, i8 0,
	// ZERO-O1: store i64 0, i64* %uninit, align 8			// ZERO-O1: store i64 0, i64* %uninit, align 8

	TEST_BRACES(paddednullinit, paddednullinit);			TEST_BRACES(paddednullinit, paddednullinit);
	// CHECK-LABEL: @test_paddednullinit_braces()			// CHECK-LABEL: @test_paddednullinit_braces()
	// CHECK: %braces = alloca %struct.paddednullinit, align [[ALIGN:[0-9]*]]			// CHECK: %braces = alloca %struct.paddednullinit, align [[ALIGN:[0-9]*]]
	// CHECK-NEXT: %[[C:[^ ]*]] = getelementptr inbounds %struct.paddednullinit, %struct.paddednullinit* %braces, i32 0, i32 0			// CHECK-NEXT: %[[C:[^ ]*]] = getelementptr inbounds %struct.paddednullinit, %struct.paddednullinit* %braces, i32 0, i32 0
	[609 lines not shown]
	// CHECK: %uninit = alloca %struct.virtualderived, align			// CHECK: %uninit = alloca %struct.virtualderived, align
	// CHECK-NEXT: call void @{{.*}}virtualderived{{.*}}%uninit)			// CHECK-NEXT: call void @{{.*}}virtualderived{{.*}}%uninit)
	// CHECK-NEXT: call void @{{.*}}used{{.*}}%uninit)			// CHECK-NEXT: call void @{{.*}}used{{.*}}%uninit)
	// PATTERN-LABEL: @test_virtualderived_uninit()			// PATTERN-LABEL: @test_virtualderived_uninit()
	// PATTERN-O0: call void @llvm.memcpy{{.*}} @__const.test_virtualderived_uninit.uninit			// PATTERN-O0: call void @llvm.memcpy{{.*}} @__const.test_virtualderived_uninit.uninit
	// ZERO-LABEL: @test_virtualderived_uninit()			// ZERO-LABEL: @test_virtualderived_uninit()
	// ZERO-O0: call void @llvm.memset{{.*}}, i8 0,			// ZERO-O0: call void @llvm.memset{{.*}}, i8 0,
	// ZERO-O1-LEGACY: call void @llvm.memset{{.*}}, i8 0,			// ZERO-O1-LEGACY: call void @llvm.memset{{.*}}, i8 0,
	// ZERO-O1-NEWPM: [[FIELD1:%.*]] = getelementptr inbounds %struct.virtualderived, %struct.virtualderived* %uninit, i64 0, i32 1, i32 0, i32 0			// ZERO-O1-NEWPM: call void @llvm.memset{{.*}}, i8 0,
	// ZERO-O1-NEWPM: [[FIELD0:%.*]] = getelementptr inbounds %struct.virtualderived, %struct.virtualderived* %uninit, i64 0, i32 0, i32 0
	// ZERO-O1-NEWPM: store i32 (...)** bitcast (i8** getelementptr inbounds ({ [7 x i8*], [5 x i8*] }, { [7 x i8*], [5 x i8*] }* @_ZTV14virtualderived, i64 0, inrange i32 0, i64 5) to i32 (...)**), i32 (...)*** [[FIELD0]], align 8
	// ZERO-O1-NEWPM: store i32 (...)** bitcast (i8** getelementptr inbounds ({ [7 x i8*], [5 x i8*] }, { [7 x i8*], [5 x i8*] }* @_ZTV14virtualderived, i64 0, inrange i32 1, i64 3) to i32 (...)**), i32 (...)*** [[FIELD1]], align 8

	TEST_BRACES(virtualderived, virtualderived);			TEST_BRACES(virtualderived, virtualderived);
	// CHECK-LABEL: @test_virtualderived_braces()			// CHECK-LABEL: @test_virtualderived_braces()
	// CHECK: %braces = alloca %struct.virtualderived, align [[ALIGN:[0-9]*]]			// CHECK: %braces = alloca %struct.virtualderived, align [[ALIGN:[0-9]*]]
	// CHECK-NEXT: bitcast			// CHECK-NEXT: bitcast
	// CHECK-NEXT: call void @llvm.memset{{.*}}(i8* align [[ALIGN]] %{{.*}}, i8 0, i64 16, i1 false)			// CHECK-NEXT: call void @llvm.memset{{.*}}(i8* align [[ALIGN]] %{{.*}}, i8 0, i64 16, i1 false)
	// CHECK-NEXT: call void @{{.*}}virtualderived{{.*}}%braces)			// CHECK-NEXT: call void @{{.*}}virtualderived{{.*}}%braces)
	// CHECK-NEXT: call void @{{.*}}used{{.*}}%braces)			// CHECK-NEXT: call void @{{.*}}used{{.*}}%braces)
	[259 lines not shown]

clang/test/CodeGenCXX/discard-name-values.cpp

	// RUN: %clang_cc1 -emit-llvm -triple=armv7-apple-darwin -std=c++11 %s -o - -O1 \			// RUN: %clang_cc1 -emit-llvm -triple=armv7-apple-darwin -std=c++11 %s -o - -O1 \
	// RUN: | FileCheck %s			// RUN: | FileCheck %s
	// RUN: %clang_cc1 -emit-llvm -triple=armv7-apple-darwin -std=c++11 %s -o - -O1 \			// RUN: %clang_cc1 -emit-llvm -triple=armv7-apple-darwin -std=c++11 %s -o - -O1 \
	// RUN: -discard-value-names | FileCheck %s --check-prefix=DISCARDVALUE			// RUN: -discard-value-names | FileCheck %s --check-prefix=DISCARDVALUE

	extern "C" void branch();			extern "C" void branch();

	bool test(bool pred) {			bool test(bool pred) {
	// DISCARDVALUE: br i1 %0, label %2, label %3			// DISCARDVALUE: br i1 %0, label %2, label %3
	// CHECK: br i1 %pred, label %if.then, label %if.end			// CHECK: br i1 %pred, label %if.then, label %if.end

	if (pred) {			if (pred) {
	// DISCARDVALUE: 2:			// DISCARDVALUE: 2:
	// DISCARDVALUE-NEXT: tail call void @branch()			// DISCARDVALUE-NEXT: call void @branch()
	// DISCARDVALUE-NEXT: br label %3			// DISCARDVALUE-NEXT: br label %3

	// CHECK: if.then:			// CHECK: if.then:
	// CHECK-NEXT: tail call void @branch()			// CHECK-NEXT: call void @branch()
	// CHECK-NEXT: br label %if.end			// CHECK-NEXT: br label %if.end
	branch();			branch();
	}			}

	// DISCARDVALUE: 3:			// DISCARDVALUE: 3:
	// DISCARDVALUE-NEXT: ret i1 %0			// DISCARDVALUE-NEXT: ret i1 %0

	// CHECK: if.end:			// CHECK: if.end:
	// CHECK-NEXT: ret i1 %pred			// CHECK-NEXT: ret i1 %pred
	return pred;			return pred;
	}			}

clang/test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp

	// RUN: %clang_cc1 -emit-llvm -O1 -o - -fexceptions -triple=i386-pc-win32 %s | FileCheck %s			// RUN: %clang_cc1 -emit-llvm -O1 -o - -fexceptions -triple=i386-pc-win32 %s | FileCheck %s

	struct S { char a; };			struct S { char a; };
	struct V { virtual void f(); };			struct V { virtual void f(); };
	struct A : virtual V {};			struct A : virtual V {};
	struct B : S, virtual V {};			struct B : S, virtual V {};
	struct T {};			struct T {};

	T* test0() { return dynamic_cast<T*>((B*)0); }			T* test0() { return dynamic_cast<T*>((B*)0); }
	// CHECK-LABEL: define dso_local noalias %struct.T* @"?test0@@YAPAUT@@XZ"()			// CHECK-LABEL: define dso_local noalias %struct.T* @"?test0@@YAPAUT@@XZ"()
	// CHECK: ret %struct.T* null			// CHECK: ret %struct.T* null

	T* test1(V* x) { return &dynamic_cast<T&>(*x); }			T* test1(V* x) { return &dynamic_cast<T&>(*x); }
	// CHECK-LABEL: define dso_local %struct.T* @"?test1@@YAPAUT@@PAUV@@@Z"(%struct.V* %x)			// CHECK-LABEL: define dso_local %struct.T* @"?test1@@YAPAUT@@PAUV@@@Z"(%struct.V* %x)
	// CHECK: [[CAST:%.*]] = bitcast %struct.V* %x to i8*			// CHECK: [[CAST:%.*]] = bitcast %struct.V* %x to i8*
	// CHECK-NEXT: [[CALL:%.*]] = tail call i8* @__RTDynamicCast(i8* [[CAST]], i32 0, i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUV@@@8" to i8*), i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 1)			// CHECK-NEXT: [[CALL:%.*]] = call i8* @__RTDynamicCast(i8* [[CAST]], i32 0, i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUV@@@8" to i8*), i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 1)
	// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[CALL]] to %struct.T*			// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[CALL]] to %struct.T*
	// CHECK-NEXT: ret %struct.T* [[RET]]			// CHECK-NEXT: ret %struct.T* [[RET]]

	T* test2(A* x) { return &dynamic_cast<T&>(*x); }			T* test2(A* x) { return &dynamic_cast<T&>(*x); }
	// CHECK-LABEL: define dso_local %struct.T* @"?test2@@YAPAUT@@PAUA@@@Z"(%struct.A* %x)			// CHECK-LABEL: define dso_local %struct.T* @"?test2@@YAPAUT@@PAUA@@@Z"(%struct.A* %x)
	// CHECK: [[CAST:%.*]] = bitcast %struct.A* %x to i8*			// CHECK: [[CAST:%.*]] = bitcast %struct.A* %x to i8*
	// CHECK-NEXT: [[VBPTRPTR:%.*]] = getelementptr %struct.A, %struct.A* %x, i32 0, i32 0			// CHECK-NEXT: [[VBPTRPTR:%.*]] = getelementptr %struct.A, %struct.A* %x, i32 0, i32 0
	// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4			// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4
	// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1			// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1
	// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4			// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4
	// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 [[VBOFFS]]			// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 [[VBOFFS]]
	// CHECK-NEXT: [[CALL:%.*]] = tail call i8* @__RTDynamicCast(i8* [[ADJ]], i32 [[VBOFFS]], i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUA@@@8" to i8*), i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 1)			// CHECK-NEXT: [[CALL:%.*]] = call i8* @__RTDynamicCast(i8* [[ADJ]], i32 [[VBOFFS]], i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUA@@@8" to i8*), i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 1)
	// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[CALL]] to %struct.T*			// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[CALL]] to %struct.T*
	// CHECK-NEXT: ret %struct.T* [[RET]]			// CHECK-NEXT: ret %struct.T* [[RET]]

	T* test3(B* x) { return &dynamic_cast<T&>(*x); }			T* test3(B* x) { return &dynamic_cast<T&>(*x); }
	// CHECK-LABEL: define dso_local %struct.T* @"?test3@@YAPAUT@@PAUB@@@Z"(%struct.B* %x)			// CHECK-LABEL: define dso_local %struct.T* @"?test3@@YAPAUT@@PAUB@@@Z"(%struct.B* %x)
	// CHECK: [[VOIDP:%.*]] = getelementptr %struct.B, %struct.B* %x, i32 0, i32 0, i32 0			// CHECK: [[VOIDP:%.*]] = getelementptr %struct.B, %struct.B* %x, i32 0, i32 0, i32 0
	// CHECK-NEXT: [[VBPTR:%.*]] = getelementptr inbounds i8, i8* [[VOIDP]], i32 4			// CHECK-NEXT: [[VBPTR:%.*]] = getelementptr inbounds i8, i8* [[VOIDP]], i32 4
	// CHECK-NEXT: [[VBPTRPTR:%.*]] = bitcast i8* [[VBPTR:%.*]] to i32**			// CHECK-NEXT: [[VBPTRPTR:%.*]] = bitcast i8* [[VBPTR:%.*]] to i32**
	// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4			// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4
	// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1			// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1
	// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4			// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4
	// CHECK-NEXT: [[DELTA:%.*]] = add nsw i32 [[VBOFFS]], 4			// CHECK-NEXT: [[DELTA:%.*]] = add nsw i32 [[VBOFFS]], 4
	// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[VOIDP]], i32 [[DELTA]]			// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[VOIDP]], i32 [[DELTA]]
	// CHECK-NEXT: [[CALL:%.*]] = tail call i8* @__RTDynamicCast(i8* [[ADJ]], i32 [[DELTA]], i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUB@@@8" to i8*), i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 1)			// CHECK-NEXT: [[CALL:%.*]] = call i8* @__RTDynamicCast(i8* [[ADJ]], i32 [[DELTA]], i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUB@@@8" to i8*), i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 1)
	// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[CALL]] to %struct.T*			// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[CALL]] to %struct.T*
	// CHECK-NEXT: ret %struct.T* [[RET]]			// CHECK-NEXT: ret %struct.T* [[RET]]

	T* test4(V* x) { return dynamic_cast<T*>(x); }			T* test4(V* x) { return dynamic_cast<T*>(x); }
	// CHECK-LABEL: define dso_local %struct.T* @"?test4@@YAPAUT@@PAUV@@@Z"(%struct.V* %x)			// CHECK-LABEL: define dso_local %struct.T* @"?test4@@YAPAUT@@PAUV@@@Z"(%struct.V* %x)
	// CHECK: [[CAST:%.*]] = bitcast %struct.V* %x to i8*			// CHECK: [[CAST:%.*]] = bitcast %struct.V* %x to i8*
	// CHECK-NEXT: [[CALL:%.*]] = tail call i8* @__RTDynamicCast(i8* [[CAST]], i32 0, i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUV@@@8" to i8*), i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 0)			// CHECK-NEXT: [[CALL:%.*]] = call i8* @__RTDynamicCast(i8* [[CAST]], i32 0, i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUV@@@8" to i8*), i8* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 0)
	// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[CALL]] to %struct.T*			// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[CALL]] to %struct.T*
	// CHECK-NEXT: ret %struct.T* [[RET]]			// CHECK-NEXT: ret %struct.T* [[RET]]

	T* test5(A* x) { return dynamic_cast<T*>(x); }			T* test5(A* x) { return dynamic_cast<T*>(x); }
	// CHECK-LABEL: define dso_local %struct.T* @"?test5@@YAPAUT@@PAUA@@@Z"(%struct.A* %x)			// CHECK-LABEL: define dso_local %struct.T* @"?test5@@YAPAUT@@PAUA@@@Z"(%struct.A* %x)
	// CHECK: [[CHECK:%.*]] = icmp eq %struct.A* %x, null			// CHECK: [[CHECK:%.*]] = icmp eq %struct.A* %x, null
	// CHECK-NEXT: br i1 [[CHECK]]			// CHECK-NEXT: br i1 [[CHECK]]
	// CHECK: [[VOIDP:%.*]] = bitcast %struct.A* %x to i8*			// CHECK: [[VOIDP:%.*]] = bitcast %struct.A* %x to i8*
	// CHECK-NEXT: [[VBPTRPTR:%.*]] = getelementptr %struct.A, %struct.A* %x, i32 0, i32 0			// CHECK-NEXT: [[VBPTRPTR:%.*]] = getelementptr %struct.A, %struct.A* %x, i32 0, i32 0
	// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4			// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4
	// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1			// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1
	// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4			// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4
	// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[VOIDP]], i32 [[VBOFFS]]			// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[VOIDP]], i32 [[VBOFFS]]
	// CHECK-NEXT: [[CALL:%.*]] = tail call i8* @__RTDynamicCast(i8* nonnull [[ADJ]], i32 [[VBOFFS]], i8* {{.*}}bitcast (%rtti.TypeDescriptor7* @"??_R0?AUA@@@8" to i8*), i8* {{.*}}bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 0)			// CHECK-NEXT: [[CALL:%.*]] = call i8* @__RTDynamicCast(i8* nonnull [[ADJ]], i32 [[VBOFFS]], i8* {{.*}}bitcast (%rtti.TypeDescriptor7* @"??_R0?AUA@@@8" to i8*), i8* {{.*}}bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 0)
	// CHECK-NEXT: [[RES:%.*]] = bitcast i8* [[CALL]] to %struct.T*			// CHECK-NEXT: [[RES:%.*]] = bitcast i8* [[CALL]] to %struct.T*
	// CHECK-NEXT: br label			// CHECK-NEXT: br label
	// CHECK: [[RET:%.*]] = phi %struct.T*			// CHECK: [[RET:%.*]] = phi %struct.T*
	// CHECK-NEXT: ret %struct.T* [[RET]]			// CHECK-NEXT: ret %struct.T* [[RET]]

	T* test6(B* x) { return dynamic_cast<T*>(x); }			T* test6(B* x) { return dynamic_cast<T*>(x); }
	// CHECK-LABEL: define dso_local %struct.T* @"?test6@@YAPAUT@@PAUB@@@Z"(%struct.B* %x)			// CHECK-LABEL: define dso_local %struct.T* @"?test6@@YAPAUT@@PAUB@@@Z"(%struct.B* %x)
	// CHECK: [[CHECK:%.*]] = icmp eq %struct.B* %x, null			// CHECK: [[CHECK:%.*]] = icmp eq %struct.B* %x, null
	// CHECK-NEXT: br i1 [[CHECK]]			// CHECK-NEXT: br i1 [[CHECK]]
	// CHECK: [[CAST:%.*]] = getelementptr %struct.B, %struct.B* %x, i32 0, i32 0, i32 0			// CHECK: [[CAST:%.*]] = getelementptr %struct.B, %struct.B* %x, i32 0, i32 0, i32 0
	// CHECK-NEXT: [[VBPTR:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 4			// CHECK-NEXT: [[VBPTR:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 4
	// CHECK-NEXT: [[VBPTRPTR:%.*]] = bitcast i8* [[VBPTR]] to i32**			// CHECK-NEXT: [[VBPTRPTR:%.*]] = bitcast i8* [[VBPTR]] to i32**
	// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4			// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4
	// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1			// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1
	// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4			// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4
	// CHECK-NEXT: [[DELTA:%.*]] = add nsw i32 [[VBOFFS]], 4			// CHECK-NEXT: [[DELTA:%.*]] = add nsw i32 [[VBOFFS]], 4
	// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 [[DELTA]]			// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 [[DELTA]]
	// CHECK-NEXT: [[CALL:%.*]] = tail call i8* @__RTDynamicCast(i8* [[ADJ]], i32 [[DELTA]], i8* {{.*}}bitcast (%rtti.TypeDescriptor7* @"??_R0?AUB@@@8" to i8*), i8* {{.*}}bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 0)			// CHECK-NEXT: [[CALL:%.*]] = call i8* @__RTDynamicCast(i8* [[ADJ]], i32 [[DELTA]], i8* {{.*}}bitcast (%rtti.TypeDescriptor7* @"??_R0?AUB@@@8" to i8*), i8* {{.*}}bitcast (%rtti.TypeDescriptor7* @"??_R0?AUT@@@8" to i8*), i32 0)
	// CHECK-NEXT: [[RES:%.*]] = bitcast i8* [[CALL]] to %struct.T*			// CHECK-NEXT: [[RES:%.*]] = bitcast i8* [[CALL]] to %struct.T*
	// CHECK-NEXT: br label			// CHECK-NEXT: br label
	// CHECK: [[RET:%.*]] = phi %struct.T*			// CHECK: [[RET:%.*]] = phi %struct.T*
	// CHECK-NEXT: ret %struct.T* [[RET]]			// CHECK-NEXT: ret %struct.T* [[RET]]

	void* test7(V* x) { return dynamic_cast<void*>(x); }			void* test7(V* x) { return dynamic_cast<void*>(x); }
	// CHECK-LABEL: define dso_local i8* @"?test7@@YAPAXPAUV@@@Z"(%struct.V* %x)			// CHECK-LABEL: define dso_local i8* @"?test7@@YAPAXPAUV@@@Z"(%struct.V* %x)
	// CHECK: [[CAST:%.*]] = bitcast %struct.V* %x to i8*			// CHECK: [[CAST:%.*]] = bitcast %struct.V* %x to i8*
	// CHECK-NEXT: [[RET:%.*]] = tail call i8* @__RTCastToVoid(i8* [[CAST]])			// CHECK-NEXT: [[RET:%.*]] = call i8* @__RTCastToVoid(i8* [[CAST]])
	// CHECK-NEXT: ret i8* [[RET]]			// CHECK-NEXT: ret i8* [[RET]]

	void* test8(A* x) { return dynamic_cast<void*>(x); }			void* test8(A* x) { return dynamic_cast<void*>(x); }
	// CHECK-LABEL: define dso_local i8* @"?test8@@YAPAXPAUA@@@Z"(%struct.A* %x)			// CHECK-LABEL: define dso_local i8* @"?test8@@YAPAXPAUA@@@Z"(%struct.A* %x)
	// CHECK: [[CHECK:%.*]] = icmp eq %struct.A* %x, null			// CHECK: [[CHECK:%.*]] = icmp eq %struct.A* %x, null
	// CHECK-NEXT: br i1 [[CHECK]]			// CHECK-NEXT: br i1 [[CHECK]]
	// CHECK: [[VOIDP:%.*]] = bitcast %struct.A* %x to i8*			// CHECK: [[VOIDP:%.*]] = bitcast %struct.A* %x to i8*
	// CHECK-NEXT: [[VBPTRPTR:%.*]] = getelementptr %struct.A, %struct.A* %x, i32 0, i32 0			// CHECK-NEXT: [[VBPTRPTR:%.*]] = getelementptr %struct.A, %struct.A* %x, i32 0, i32 0
	// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4			// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4
	// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1			// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1
	// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4			// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4
	// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[VOIDP]], i32 [[VBOFFS]]			// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[VOIDP]], i32 [[VBOFFS]]
	// CHECK-NEXT: [[RES:%.*]] = tail call i8* @__RTCastToVoid(i8* nonnull [[ADJ]])			// CHECK-NEXT: [[RES:%.*]] = call i8* @__RTCastToVoid(i8* nonnull [[ADJ]])
	// CHECK-NEXT: br label			// CHECK-NEXT: br label
	// CHECK: [[RET:%.*]] = phi i8*			// CHECK: [[RET:%.*]] = phi i8*
	// CHECK-NEXT: ret i8* [[RET]]			// CHECK-NEXT: ret i8* [[RET]]

	void* test9(B* x) { return dynamic_cast<void*>(x); }			void* test9(B* x) { return dynamic_cast<void*>(x); }
	// CHECK-LABEL: define dso_local i8* @"?test9@@YAPAXPAUB@@@Z"(%struct.B* %x)			// CHECK-LABEL: define dso_local i8* @"?test9@@YAPAXPAUB@@@Z"(%struct.B* %x)
	// CHECK: [[CHECK:%.*]] = icmp eq %struct.B* %x, null			// CHECK: [[CHECK:%.*]] = icmp eq %struct.B* %x, null
	// CHECK-NEXT: br i1 [[CHECK]]			// CHECK-NEXT: br i1 [[CHECK]]
	// CHECK: [[CAST:%.*]] = getelementptr %struct.B, %struct.B* %x, i32 0, i32 0, i32 0			// CHECK: [[CAST:%.*]] = getelementptr %struct.B, %struct.B* %x, i32 0, i32 0, i32 0
	// CHECK-NEXT: [[VBPTR:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 4			// CHECK-NEXT: [[VBPTR:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 4
	// CHECK-NEXT: [[VBPTRPTR:%.*]] = bitcast i8* [[VBPTR]] to i32**			// CHECK-NEXT: [[VBPTRPTR:%.*]] = bitcast i8* [[VBPTR]] to i32**
	// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4			// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBPTRPTR]], align 4
	// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1			// CHECK-NEXT: [[VBOFFP:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1
	// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4			// CHECK-NEXT: [[VBOFFS:%.*]] = load i32, i32* [[VBOFFP]], align 4
	// CHECK-NEXT: [[DELTA:%.*]] = add nsw i32 [[VBOFFS]], 4			// CHECK-NEXT: [[DELTA:%.*]] = add nsw i32 [[VBOFFS]], 4
	// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 [[DELTA]]			// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[CAST]], i32 [[DELTA]]
	// CHECK-NEXT: [[CALL:%.*]] = tail call i8* @__RTCastToVoid(i8* [[ADJ]])			// CHECK-NEXT: [[CALL:%.*]] = call i8* @__RTCastToVoid(i8* [[ADJ]])
	// CHECK-NEXT: br label			// CHECK-NEXT: br label
	// CHECK: [[RET:%.*]] = phi i8*			// CHECK: [[RET:%.*]] = phi i8*
	// CHECK-NEXT: ret i8* [[RET]]			// CHECK-NEXT: ret i8* [[RET]]

	namespace PR25606 {			namespace PR25606 {
	struct Cleanup {			struct Cleanup {
	~Cleanup();			~Cleanup();
	};			};
	[15 lines not shown]

clang/test/CodeGenCXX/microsoft-abi-typeid.cpp

	[19 lines not shown]
	// CHECK: ret %struct.type_info* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUA@@@8" to %struct.type_info*)			// CHECK: ret %struct.type_info* bitcast (%rtti.TypeDescriptor7* @"??_R0?AUA@@@8" to %struct.type_info*)

	const std::type_info* test2_typeid() { return &typeid(&a); }			const std::type_info* test2_typeid() { return &typeid(&a); }
	// CHECK-LABEL: define dso_local %struct.type_info* @"?test2_typeid@@YAPBUtype_info@@XZ"()			// CHECK-LABEL: define dso_local %struct.type_info* @"?test2_typeid@@YAPBUtype_info@@XZ"()
	// CHECK: ret %struct.type_info* bitcast (%rtti.TypeDescriptor7* @"??_R0PAUA@@@8" to %struct.type_info*)			// CHECK: ret %struct.type_info* bitcast (%rtti.TypeDescriptor7* @"??_R0PAUA@@@8" to %struct.type_info*)

	const std::type_info* test3_typeid() { return &typeid(*fn()); }			const std::type_info* test3_typeid() { return &typeid(*fn()); }
	// CHECK-LABEL: define dso_local %struct.type_info* @"?test3_typeid@@YAPBUtype_info@@XZ"()			// CHECK-LABEL: define dso_local %struct.type_info* @"?test3_typeid@@YAPBUtype_info@@XZ"()
	// CHECK: [[CALL:%.*]] = tail call %struct.A* @"?fn@@YAPAUA@@XZ"()			// CHECK: [[CALL:%.*]] = call %struct.A* @"?fn@@YAPAUA@@XZ"()
	// CHECK-NEXT: [[CMP:%.*]] = icmp eq %struct.A* [[CALL]], null			// CHECK-NEXT: [[CMP:%.*]] = icmp eq %struct.A* [[CALL]], null
	// CHECK-NEXT: br i1 [[CMP]]			// CHECK-NEXT: br i1 [[CMP]]
	// CHECK: tail call i8* @__RTtypeid(i8* null)			// CHECK: call i8* @__RTtypeid(i8* null)
	// CHECK-NEXT: unreachable			// CHECK-NEXT: unreachable
	// CHECK: [[THIS:%.*]] = bitcast %struct.A* [[CALL]] to i8*			// CHECK: [[THIS:%.*]] = bitcast %struct.A* [[CALL]] to i8*
	// CHECK-NEXT: [[VBTBLP:%.*]] = getelementptr %struct.A, %struct.A* [[CALL]], i32 0, i32 0			// CHECK-NEXT: [[VBTBLP:%.*]] = getelementptr %struct.A, %struct.A* [[CALL]], i32 0, i32 0
	// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBTBLP]], align 4			// CHECK-NEXT: [[VBTBL:%.*]] = load i32*, i32** [[VBTBLP]], align 4
	// CHECK-NEXT: [[VBSLOT:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1			// CHECK-NEXT: [[VBSLOT:%.*]] = getelementptr inbounds i32, i32* [[VBTBL]], i32 1
	// CHECK-NEXT: [[VBASE_OFFS:%.*]] = load i32, i32* [[VBSLOT]], align 4			// CHECK-NEXT: [[VBASE_OFFS:%.*]] = load i32, i32* [[VBSLOT]], align 4
	// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[THIS]], i32 [[VBASE_OFFS]]			// CHECK-NEXT: [[ADJ:%.*]] = getelementptr inbounds i8, i8* [[THIS]], i32 [[VBASE_OFFS]]
	// CHECK-NEXT: [[RT:%.*]] = tail call i8* @__RTtypeid(i8* nonnull [[ADJ]])			// CHECK-NEXT: [[RT:%.*]] = call i8* @__RTtypeid(i8* nonnull [[ADJ]])
	// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[RT]] to %struct.type_info*			// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[RT]] to %struct.type_info*
	// CHECK-NEXT: ret %struct.type_info* [[RET]]			// CHECK-NEXT: ret %struct.type_info* [[RET]]

	const std::type_info* test4_typeid() { return &typeid(b); }			const std::type_info* test4_typeid() { return &typeid(b); }
	// CHECK: define dso_local %struct.type_info* @"?test4_typeid@@YAPBUtype_info@@XZ"()			// CHECK: define dso_local %struct.type_info* @"?test4_typeid@@YAPBUtype_info@@XZ"()
	// CHECK: ret %struct.type_info* bitcast (%rtti.TypeDescriptor2* @"??_R0H@8" to %struct.type_info*)			// CHECK: ret %struct.type_info* bitcast (%rtti.TypeDescriptor2* @"??_R0H@8" to %struct.type_info*)

	const std::type_info* test5_typeid() { return &typeid(v); }			const std::type_info* test5_typeid() { return &typeid(v); }
	// CHECK: define dso_local %struct.type_info* @"?test5_typeid@@YAPBUtype_info@@XZ"()			// CHECK: define dso_local %struct.type_info* @"?test5_typeid@@YAPBUtype_info@@XZ"()
	// CHECK: [[RT:%.*]] = tail call i8* @__RTtypeid(i8* bitcast (%struct.V* @"?v@@3UV@@A" to i8*))			// CHECK: [[RT:%.*]] = call i8* @__RTtypeid(i8* bitcast (%struct.V* @"?v@@3UV@@A" to i8*))
	// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[RT]] to %struct.type_info*			// CHECK-NEXT: [[RET:%.*]] = bitcast i8* [[RT]] to %struct.type_info*
	// CHECK-NEXT: ret %struct.type_info* [[RET]]			// CHECK-NEXT: ret %struct.type_info* [[RET]]

	namespace PR26329 {			namespace PR26329 {
	struct Polymorphic {			struct Polymorphic {
	virtual ~Polymorphic();			virtual ~Polymorphic();
	};			};

	[13 lines not shown]

clang/test/CodeGenCXX/nrvo.cpp

[27 lines not shown]	X test0() {
// CHECK-EH: call {{.*}} @_ZN1XC1Ev		// CHECK-EH: call {{.*}} @_ZN1XC1Ev
// CHECK-EH-NEXT: ret void		// CHECK-EH-NEXT: ret void
return x;		return x;
}		}

// CHECK-LABEL: define void @_Z5test1b(		// CHECK-LABEL: define void @_Z5test1b(
// CHECK-EH-LABEL: define void @_Z5test1b(		// CHECK-EH-LABEL: define void @_Z5test1b(
X test1(bool B) {		X test1(bool B) {
// CHECK: tail call {{.*}} @_ZN1XC1Ev		// CHECK: call {{.*}} @_ZN1XC1Ev
// CHECK-NEXT: ret void		// CHECK-NEXT: ret void
X x;		X x;
if (B)		if (B)
return (x);		return (x);
return x;		return x;
// CHECK-EH: tail call {{.*}} @_ZN1XC1Ev		// CHECK-EH: call {{.*}} @_ZN1XC1Ev
// CHECK-EH-NEXT: ret void		// CHECK-EH-NEXT: ret void
}		}

// CHECK-LABEL: define void @_Z5test2b		// CHECK-LABEL: define void @_Z5test2b
// CHECK-EH-LABEL: define void @_Z5test2b		// CHECK-EH-LABEL: define void @_Z5test2b
// CHECK-EH-SAME: personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)		// CHECK-EH-SAME: personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
X test2(bool B) {		X test2(bool B) {
// No NRVO.		// No NRVO.
[74 lines not shown]	X test2(bool B) {
// CHECK-EH-03-NEXT: [[T1:%.*]] = extractvalue { i8*, i32 } [[T0]], 0		// CHECK-EH-03-NEXT: [[T1:%.*]] = extractvalue { i8*, i32 } [[T0]], 0
// CHECK-EH-03-NEXT: call void @__clang_call_terminate(i8* [[T1]]) [[NR_NUW:#[0-9]+]]		// CHECK-EH-03-NEXT: call void @__clang_call_terminate(i8* [[T1]]) [[NR_NUW:#[0-9]+]]
// CHECK-EH-03-NEXT: unreachable		// CHECK-EH-03-NEXT: unreachable

}		}

// CHECK-LABEL: define void @_Z5test3b		// CHECK-LABEL: define void @_Z5test3b
X test3(bool B) {		X test3(bool B) {
// CHECK: tail call {{.*}} @_ZN1XC1Ev		// CHECK: call {{.*}} @_ZN1XC1Ev
// CHECK-NOT: call {{.*}} @_ZN1XC1ERKS_		// CHECK-NOT: call {{.*}} @_ZN1XC1ERKS_
// CHECK: call {{.*}} @_ZN1XC1Ev		// CHECK: call {{.*}} @_ZN1XC1Ev
// CHECK: call {{.*}} @_ZN1XC1ERKS_		// CHECK: call {{.*}} @_ZN1XC1ERKS_
if (B) {		if (B) {
X y;		X y;
return y;		return y;
}		}
// FIXME: we should NRVO this variable too.		// FIXME: we should NRVO this variable too.
X x;		X x;
return x;		return x;
}		}

extern "C" void exit(int) throw();		extern "C" void exit(int) throw();

// CHECK-LABEL: define void @_Z5test4b		// CHECK-LABEL: define void @_Z5test4b
X test4(bool B) {		X test4(bool B) {
{		{
// CHECK: tail call {{.*}} @_ZN1XC1Ev		// CHECK: call {{.*}} @_ZN1XC1Ev
X x;		X x;
// CHECK: br i1		// CHECK: br i1
if (B)		if (B)
return x;		return x;
}		}
// CHECK: tail call {{.*}} @_ZN1XD1Ev		// CHECK: call {{.*}} @_ZN1XD1Ev
// CHECK: tail call void @exit(i32 1)		// CHECK: call void @exit(i32 1)
exit(1);		exit(1);
}		}

#ifdef __EXCEPTIONS		#ifdef __EXCEPTIONS
// CHECK-EH-LABEL: define void @_Z5test5		// CHECK-EH-LABEL: define void @_Z5test5
void may_throw();		void may_throw();
X test5() {		X test5() {
try {		try {
[19 lines not shown]	X test6() {
// CHECK-NEXT: call {{.*}} @_ZN1XC1ERKS_([[X]]* {{%.*}}, [[X]]* nonnull dereferenceable({{[0-9]+}}) [[A]])		// CHECK-NEXT: call {{.*}} @_ZN1XC1ERKS_([[X]]* {{%.*}}, [[X]]* nonnull dereferenceable({{[0-9]+}}) [[A]])
// CHECK-NEXT: call {{.*}} @_ZN1XD1Ev([[X]]* nonnull [[A]])		// CHECK-NEXT: call {{.*}} @_ZN1XD1Ev([[X]]* nonnull [[A]])
// CHECK-NEXT: call void @llvm.lifetime.end.p0i8(i64 1, i8* nonnull [[PTR]])		// CHECK-NEXT: call void @llvm.lifetime.end.p0i8(i64 1, i8* nonnull [[PTR]])
// CHECK-NEXT: ret void		// CHECK-NEXT: ret void
}		}

// CHECK-LABEL: define void @_Z5test7b		// CHECK-LABEL: define void @_Z5test7b
X test7(bool b) {		X test7(bool b) {
// CHECK: tail call {{.*}} @_ZN1XC1Ev		// CHECK: call {{.*}} @_ZN1XC1Ev
// CHECK-NEXT: ret		// CHECK-NEXT: ret
if (b) {		if (b) {
X x;		X x;
return x;		return x;
}		}
return X();		return X();
}		}

// CHECK-LABEL: define void @_Z5test8b		// CHECK-LABEL: define void @_Z5test8b
X test8(bool b) {		X test8(bool b) {
// CHECK: tail call {{.*}} @_ZN1XC1Ev		// CHECK: call {{.*}} @_ZN1XC1Ev
// CHECK-NEXT: ret		// CHECK-NEXT: ret
if (b) {		if (b) {
X x;		X x;
return x;		return x;
} else {		} else {
X y;		X y;
return y;		return y;
}		}
}		}

Y<int> test9() {		Y<int> test9() {
Y<int>::f();		Y<int>::f();
}		}

// CHECK-LABEL: define linkonce_odr void @_ZN1YIiE1fEv		// CHECK-LABEL: define linkonce_odr void @_ZN1YIiE1fEv
// CHECK: tail call {{.*}} @_ZN1YIiEC1Ev		// CHECK: call {{.*}} @_ZN1YIiEC1Ev

// CHECK-EH-03: attributes [[NR_NUW]] = { noreturn nounwind }		// CHECK-EH-03: attributes [[NR_NUW]] = { noreturn nounwind }
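
For readers unfamiliar with what the nrvo.cpp checks above exercise, here is a minimal sketch of named return value optimization. The `Tracked` type, its counters, and both functions are hypothetical and not part of the test. When every `return` names the same local, the compiler may construct that local directly in the caller's return slot, so no copy constructor call (`_ZN1XC1ERKS_` in the checks) is needed.

```cpp
// Hypothetical instrumented type; the counters are for illustration only.
struct Tracked {
  static int copies;
  static int moves;
  Tracked() {}
  Tracked(const Tracked &) { ++copies; }
  Tracked(Tracked &&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Both returns name the same local, so NRVO is permitted (not required):
// `x` may be built directly in the caller's return slot.
Tracked make_tracked(bool b) {
  Tracked x;
  if (b)
    return x;
  return x;
}

// Counts the copy and move constructions observed for a single call.
int constructions_for_one_call() {
  Tracked::copies = 0;
  Tracked::moves = 0;
  Tracked t = make_tracked(true);
  (void)t;
  return Tracked::copies + Tracked::moves;
}
```

With NRVO applied the count is zero. Without it, the local is moved rather than copied, so `Tracked::copies` stays at zero either way.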

clang/test/CodeGenCXX/stack-reuse.cpp

	// RUN: %clang_cc1 -triple armv7-unknown-linux-gnueabihf %s -o - -emit-llvm -O1 | FileCheck %s			// RUN: %clang_cc1 -triple armv7-unknown-linux-gnueabihf %s -o - -emit-llvm -O2 | FileCheck %s

	// Stack should be reused when possible, no need to allocate two separate slots			// Stack should be reused when possible, no need to allocate two separate slots
	// if they have disjoint lifetime.			// if they have disjoint lifetime.

	// Sizes of objects are related to previously existed threshold of 32. In case			// Sizes of objects are related to previously existed threshold of 32. In case
	// of S_large stack size is rounded to 40 bytes.			// of S_large stack size is rounded to 40 bytes.

	// 32B			// 32B
	[... 137 lines not shown ...]

clang/test/CodeGenCXX/wasm-args-returns.cpp

	[... 13 lines not shown ...]

	struct one_field {			struct one_field {
	double d;			double d;
	};			};
	test(one_field);			test(one_field);
	// CHECK: define double @_Z7forward9one_field(double returned %{{.*}})			// CHECK: define double @_Z7forward9one_field(double returned %{{.*}})
	//			//
	// CHECK: define void @_Z14test_one_fieldv()			// CHECK: define void @_Z14test_one_fieldv()
	// CHECK: %[[call:.*]] = tail call double @_Z13def_one_fieldv()			// CHECK: %[[call:.*]] = call double @_Z13def_one_fieldv()
	// CHECK: tail call void @_Z3use9one_field(double %[[call]])			// CHECK: call void @_Z3use9one_field(double %[[call]])
	// CHECK: ret void			// CHECK: ret void
	//			//
	// CHECK: declare void @_Z3use9one_field(double)			// CHECK: declare void @_Z3use9one_field(double)
	// CHECK: declare double @_Z13def_one_fieldv()			// CHECK: declare double @_Z13def_one_fieldv()

	struct two_fields {			struct two_fields {
	double d, e;			double d, e;
	};			};
	[... 45 lines not shown ...]
	// CHECK: declare void @_Z3use17aligned_copy_ctor(%struct.aligned_copy_ctor*)			// CHECK: declare void @_Z3use17aligned_copy_ctor(%struct.aligned_copy_ctor*)
	// CHECK: declare void @_Z21def_aligned_copy_ctorv(%struct.aligned_copy_ctor* sret)			// CHECK: declare void @_Z21def_aligned_copy_ctorv(%struct.aligned_copy_ctor* sret)

	struct empty {};			struct empty {};
	test(empty);			test(empty);
	// CHECK: define void @_Z7forward5empty()			// CHECK: define void @_Z7forward5empty()
	//			//
	// CHECK: define void @_Z10test_emptyv()			// CHECK: define void @_Z10test_emptyv()
	// CHECK: tail call void @_Z9def_emptyv()			// CHECK: call void @_Z9def_emptyv()
	// CHECK: tail call void @_Z3use5empty()			// CHECK: call void @_Z3use5empty()
	// CHECK: ret void			// CHECK: ret void
	//			//
	// CHECK: declare void @_Z3use5empty()			// CHECK: declare void @_Z3use5empty()
	// CHECK: declare void @_Z9def_emptyv()			// CHECK: declare void @_Z9def_emptyv()

	struct one_bitfield {			struct one_bitfield {
	int d : 3;			int d : 3;
	};			};
	test(one_bitfield);			test(one_bitfield);
	// CHECK: define i32 @_Z7forward12one_bitfield(i32 returned %{{.*}})			// CHECK: define i32 @_Z7forward12one_bitfield(i32 returned %{{.*}})
	//			//
	// CHECK: define void @_Z17test_one_bitfieldv()			// CHECK: define void @_Z17test_one_bitfieldv()
	// CHECK: %[[call:.*]] = tail call i32 @_Z16def_one_bitfieldv()			// CHECK: %[[call:.*]] = call i32 @_Z16def_one_bitfieldv()
	// CHECK: tail call void @_Z3use12one_bitfield(i32 %[[call]])			// CHECK: call void @_Z3use12one_bitfield(i32 %[[call]])
	// CHECK: ret void			// CHECK: ret void
	//			//
	// CHECK: declare void @_Z3use12one_bitfield(i32)			// CHECK: declare void @_Z3use12one_bitfield(i32)
	// CHECK: declare i32 @_Z16def_one_bitfieldv()			// CHECK: declare i32 @_Z16def_one_bitfieldv()

clang/test/CodeGenObjCXX/arc-blocks.mm

	[... 116 lines not shown ...]

	// CHECK: [[EH_RESUME]]:			// CHECK: [[EH_RESUME]]:
	// CHECK: resume { i8*, i32 }			// CHECK: resume { i8*, i32 }

	// CHECK: [[TERMINATE_LPAD]]:			// CHECK: [[TERMINATE_LPAD]]:
	// CHECK: call void @__clang_call_terminate(			// CHECK: call void @__clang_call_terminate(

	// CHECK-O1-LABEL: define linkonce_odr hidden void @__copy_helper_block_ea8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(			// CHECK-O1-LABEL: define linkonce_odr hidden void @__copy_helper_block_ea8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(
	// CHECK-O1: tail call void @llvm.objc.release({{.*}}) {{.*}} !clang.imprecise_release			// CHECK-O1: call void @llvm.objc.release({{.*}}) {{.*}} !clang.imprecise_release
	// CHECK-NOEXCP: define linkonce_odr hidden void @__copy_helper_block_8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(			// CHECK-NOEXCP: define linkonce_odr hidden void @__copy_helper_block_8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(

	// CHECK: define linkonce_odr hidden void @__destroy_helper_block_ea8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(			// CHECK: define linkonce_odr hidden void @__destroy_helper_block_ea8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(
	// CHECK: %[[BLOCK:.]] = bitcast i8 %{{.}} to <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>			// CHECK: %[[BLOCK:.]] = bitcast i8 %{{.}} to <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>
	// CHECK: %[[V4:.]] = getelementptr inbounds <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>, <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }> %[[BLOCK]], i32 0, i32 5			// CHECK: %[[V4:.]] = getelementptr inbounds <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>, <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }> %[[BLOCK]], i32 0, i32 5
	// CHECK: %[[V2:.]] = getelementptr inbounds <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>, <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }> %[[BLOCK]], i32 0, i32 6			// CHECK: %[[V2:.]] = getelementptr inbounds <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>, <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }> %[[BLOCK]], i32 0, i32 6
	// CHECK: %[[V3:.]] = getelementptr inbounds <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>, <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }> %[[BLOCK]], i32 0, i32 7			// CHECK: %[[V3:.]] = getelementptr inbounds <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>, <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }> %[[BLOCK]], i32 0, i32 7
	// CHECK: %[[V5:.]] = getelementptr inbounds <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>, <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }> %[[BLOCK]], i32 0, i32 8			// CHECK: %[[V5:.]] = getelementptr inbounds <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }>, <{ i8, i32, i32, i8, %[[STRUCT_BLOCK_DESCRIPTOR]], i8, i8, i8, %[[STRUCT_TEST1_S0]], %[[STRUCT_TEST1_S0]], %[[STRUCT_TRIVIAL_INTERNAL]] }> %[[BLOCK]], i32 0, i32 8
	[... 31 lines not shown ...]

	// CHECK: [[EH_RESUME]]:			// CHECK: [[EH_RESUME]]:
	// CHECK: resume { i8*, i32 }			// CHECK: resume { i8*, i32 }

	// CHECK: [[TERMINATE_LPAD]]:			// CHECK: [[TERMINATE_LPAD]]:
	// CHECK: call void @__clang_call_terminate(			// CHECK: call void @__clang_call_terminate(

	// CHECK-O1-LABEL: define linkonce_odr hidden void @__destroy_helper_block_ea8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(			// CHECK-O1-LABEL: define linkonce_odr hidden void @__destroy_helper_block_ea8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(
	// CHECK-O1: tail call void @llvm.objc.release({{.*}}) {{.*}} !clang.imprecise_release			// CHECK-O1: call void @llvm.objc.release({{.*}}) {{.*}} !clang.imprecise_release
	// CHECK-O1: tail call void @llvm.objc.release({{.*}}) {{.*}} !clang.imprecise_release			// CHECK-O1: call void @llvm.objc.release({{.*}}) {{.*}} !clang.imprecise_release
	// CHECK-NOEXCP: define linkonce_odr hidden void @__destroy_helper_block_8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(			// CHECK-NOEXCP: define linkonce_odr hidden void @__destroy_helper_block_8_32s40r48w56c15_ZTSN5test12S0E60c15_ZTSN5test12S0E(

	namespace {			namespace {
	struct TrivialInternal {			struct TrivialInternal {
	int a;			int a;
	};			};
	}			}

	[... 141 lines not shown ...]

clang/test/CodeGenObjCXX/nrvo.mm

	// RUN: %clang_cc1 -emit-llvm -o - -fblocks %s -O1 -fno-experimental-new-pass-manager -triple x86_64-apple-darwin10.0.0 -fobjc-runtime=macosx-fragile-10.5 | FileCheck %s			// RUN: %clang_cc1 -emit-llvm -o - -fblocks %s -O1 -fno-experimental-new-pass-manager -triple x86_64-apple-darwin10.0.0 -fobjc-runtime=macosx-fragile-10.5 | FileCheck %s

	// PR10835 / <rdar://problem/10050178>			// PR10835 / <rdar://problem/10050178>
	struct X {			struct X {
	X();			X();
	X(const X&);			X(const X&);
	~X();			~X();
	};			};

	@interface NRVO			@interface NRVO
	@end			@end

	@implementation NRVO			@implementation NRVO
	// CHECK: define internal void @"\01-[NRVO getNRVO]"			// CHECK: define internal void @"\01-[NRVO getNRVO]"
	- (X)getNRVO {			- (X)getNRVO {
	X x;			X x;
	// CHECK: tail call void @_ZN1XC1Ev			// CHECK: call void @_ZN1XC1Ev
	// CHECK-NEXT: ret void			// CHECK-NEXT: ret void
	return x;			return x;
	}			}
	@end			@end

	X blocksNRVO() {			X blocksNRVO() {
	return ^{			return ^{
	// CHECK-LABEL: define internal void @___Z10blocksNRVOv_block_invoke			// CHECK-LABEL: define internal void @___Z10blocksNRVOv_block_invoke
	X x;			X x;
	// CHECK: tail call void @_ZN1XC1Ev			// CHECK: call void @_ZN1XC1Ev
	// CHECK-NEXT: ret void			// CHECK-NEXT: ret void
	return x;			return x;
	}() ;			}() ;
	}			}

clang/test/Lexer/minimize_source_to_dependency_directives_invalid_error.c

	// Test CF+LF are properly handled along with quoted, multi-line #error			// Test CF+LF are properly handled along with quoted, multi-line #error
	// RUN: %clang_cc1 -DOTHER -print-dependency-directives-minimized-source %s 2>&1 | FileCheck %s			// RUN: %clang_cc1 -DOTHER -print-dependency-directives-minimized-source %s 2>&1 | FileCheck %s

	#ifndef TEST			#ifndef TEST
	#error "message \			#error "message \
	more message \			more message \
	even more"			even more"
	#endif			#endif

	#ifdef OTHER			#ifdef OTHER
	#include <string>			#include <string>
	#endif			#endif

	// CHECK: #ifdef OTHER			// CHECK: #ifdef OTHER
	// CHECK-NEXT: #include <string>			// CHECK-NEXT: #include <string>
	// CHECK-NEXT: #endif			// CHECK-NEXT: #endif

clang/test/PCH/no-escaping-block-tail-calls.cpp

	// RUN: %clang_cc1 -x c++-header -triple x86_64-apple-darwin11 -emit-pch -O1 -fblocks -fno-escaping-block-tail-calls -o %t %S/no-escaping-block-tail-calls.h			// RUN: %clang_cc1 -x c++-header -triple x86_64-apple-darwin11 -emit-pch -O2 -fblocks -fno-escaping-block-tail-calls -o %t %S/no-escaping-block-tail-calls.h
	// RUN: %clang_cc1 -triple x86_64-apple-darwin11 -include-pch %t -emit-llvm -O1 -fblocks -fno-escaping-block-tail-calls -o - %s | FileCheck %s			// RUN: %clang_cc1 -triple x86_64-apple-darwin11 -include-pch %t -emit-llvm -O2 -fblocks -fno-escaping-block-tail-calls -o - %s | FileCheck %s

	// Check that -fno-escaping-block-tail-calls doesn't disable tail-call			// Check that -fno-escaping-block-tail-calls doesn't disable tail-call
	// optimization if the block is non-escaping.			// optimization if the block is non-escaping.

	// CHECK-LABEL: define internal i32 @___ZN1S1mEv_block_invoke(			// CHECK-LABEL: define internal i32 @___ZN1S1mEv_block_invoke(
	// CHECK: %[[CALL:.*]] = tail call i32 @_ZN1S3fooER2S0(			// CHECK: %[[CALL:.*]] = tail call i32 @_ZN1S3fooER2S0(
	// CHECK-NEXT: ret i32 %[[CALL]]			// CHECK-NEXT: ret i32 %[[CALL]]

	void test() {			void test() {
	S s;			S s;
	s.m();			s.m();
	}			}

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/ambiguous_tail_call_seq1/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/ambiguous_tail_call_seq2/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/disambiguate_call_site/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/disambiguate_tail_call_seq/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/inlining_and_tail_calls/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/sbapi_support/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/thread_step_out_message/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/thread_step_out_or_return/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

lldb/packages/Python/lldbsuite/test/functionalities/tail_call_frames/unambiguous_sequence/Makefile

	CXX_SOURCES := main.cpp			CXX_SOURCES := main.cpp

	CXXFLAGS_EXTRAS := -g -O1 -glldb			CXXFLAGS_EXTRAS := -g -O2 -glldb
	include Makefile.rules			include Makefile.rules

llvm/include/llvm/Passes/PassBuilder.h

[... 145 lines not shown ...]	public:
enum OptimizationLevel {		enum OptimizationLevel {
/// Disable as many optimizations as possible. This doesn't completely		/// Disable as many optimizations as possible. This doesn't completely
/// disable the optimizer in all cases, for example always_inline functions		/// disable the optimizer in all cases, for example always_inline functions
/// can be required to be inlined for correctness.		/// can be required to be inlined for correctness.
O0,		O0,

/// Optimize quickly without destroying debuggability.		/// Optimize quickly without destroying debuggability.
///		///
/// FIXME: The current and historical behavior of this level does not
/// agree with this goal, but we would like to move toward this goal in the
/// future.
///
/// This level is tuned to produce a result from the optimizer as quickly		/// This level is tuned to produce a result from the optimizer as quickly
/// as possible and to avoid destroying debuggability. This tends to result		/// as possible and to avoid destroying debuggability. This tends to result
/// in a very good development mode where the compiled code will be		/// in a very good development mode where the compiled code will be
/// immediately executed as part of testing. As a consequence, where		/// immediately executed as part of testing. As a consequence, where
/// possible, we would like to produce efficient-to-execute code, but not		/// possible, we would like to produce efficient-to-execute code, but not
/// if it significantly slows down compilation or would prevent even basic		/// if it significantly slows down compilation or would prevent even basic
/// debugging of the resulting binary.		/// debugging of the resulting binary.
///		///
/// As an example, complex loop transformations such as versioning,		/// As an example, complex loop transformations such as versioning,
/// vectorization, or fusion might not make sense here due to the degree to		/// vectorization, or fusion don't make sense here due to the degree to
/// which the executed code would differ from the source code, and the		/// which the executed code differs from the source code, and the compile time
/// potential compile time cost.		/// cost.
O1,		O1,

/// Optimize for fast execution as much as possible without triggering		/// Optimize for fast execution as much as possible without triggering
/// significant incremental compile time or code size growth.		/// significant incremental compile time or code size growth.
///		///
/// The key idea is that optimizations at this level should "pay for		/// The key idea is that optimizations at this level should "pay for
/// themselves". So if an optimization increases compile time by 5% or		/// themselves". So if an optimization increases compile time by 5% or
/// increases code size by 5% for a particular benchmark, that benchmark		/// increases code size by 5% for a particular benchmark, that benchmark
[... 592 lines not shown ...]
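
The pipeline changes in this revision gate passes with direct comparisons such as `Level > O1`. The following is a minimal sketch of how such a comparison behaves, not the actual LLVM declaration. It assumes the enumerators are declared in the order `O0, O1, O2, O3, Os, Oz` (only `O0`, `O1`, and `Oz` are visible in this excerpt; the full order is an assumption), and `runsAboveO1` is a hypothetical helper introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for PassBuilder::OptimizationLevel.
// The enumerator order is an assumption made for this sketch.
enum OptimizationLevel { O0, O1, O2, O3, Os, Oz };

// Hypothetical helper: true when a pass guarded by `Level > O1`
// would be added to the pipeline.
constexpr bool runsAboveO1(OptimizationLevel Level) { return Level > O1; }

// Under the assumed order, the size levels also compare greater than O1,
// so a `Level > O1` guard keeps the pass at Os and Oz as well.
static_assert(!runsAboveO1(O1), "O1 skips passes guarded by Level > O1");
static_assert(runsAboveO1(O2), "O2 keeps them");
static_assert(runsAboveO1(Oz), "Oz keeps them under the assumed order");
```

If the real enumerator order differs, the set of levels that satisfy `Level > O1` changes accordingly, which is worth checking before relying on this kind of guard.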

llvm/lib/Passes/PassBuilder.cpp

[... 388 lines not shown ...]
PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,		PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,
ThinLTOPhase Phase,		ThinLTOPhase Phase,
bool DebugLogging) {		bool DebugLogging) {
assert(Level != O0 && "Must request optimizations!");		assert(Level != O0 && "Must request optimizations!");
FunctionPassManager FPM(DebugLogging);		FunctionPassManager FPM(DebugLogging);

// Form SSA out of local memory accesses after breaking apart aggregates into		// Form SSA out of local memory accesses after breaking apart aggregates into
// scalars.		// scalars.
FPM.addPass(SROA());		FPM.addPass(SROA());
		chandlerc: We know `O0` isn't used here, so this should be a no-op.

// Catch trivial redundancies		// Catch trivial redundancies
FPM.addPass(EarlyCSEPass(true /* Enable mem-ssa. */));		FPM.addPass(EarlyCSEPass(true /* Enable mem-ssa. */));

// Hoisting of scalars and load expressions.		// Hoisting of scalars and load expressions.
		if (Level > O1) {
if (EnableGVNHoist)		if (EnableGVNHoist)
FPM.addPass(GVNHoistPass());		FPM.addPass(GVNHoistPass());

// Global value numbering based sinking.		// Global value numbering based sinking.
if (EnableGVNSink) {		if (EnableGVNSink) {
FPM.addPass(GVNSinkPass());		FPM.addPass(GVNSinkPass());
FPM.addPass(SimplifyCFGPass());		FPM.addPass(SimplifyCFGPass());
}		}
		}

// Speculative execution if the target has divergent branches; otherwise nop.		// Speculative execution if the target has divergent branches; otherwise nop.
		if (Level > O1) {
FPM.addPass(SpeculativeExecutionPass());		FPM.addPass(SpeculativeExecutionPass());

// Optimize based on known information about branches, and cleanup afterward.		// Optimize based on known information about branches, and cleanup afterward.
FPM.addPass(JumpThreadingPass());		FPM.addPass(JumpThreadingPass());
FPM.addPass(CorrelatedValuePropagationPass());		FPM.addPass(CorrelatedValuePropagationPass());
		}
		chandlerc: I think you can merge all of these?
FPM.addPass(SimplifyCFGPass());		FPM.addPass(SimplifyCFGPass());
if (Level == O3)		if (Level == O3)
FPM.addPass(AggressiveInstCombinePass());		FPM.addPass(AggressiveInstCombinePass());
FPM.addPass(InstCombinePass());		FPM.addPass(InstCombinePass());

if (!isOptimizingForSize(Level))		if (!isOptimizingForSize(Level))
FPM.addPass(LibCallsShrinkWrapPass());		FPM.addPass(LibCallsShrinkWrapPass());

invokePeepholeEPCallbacks(FPM, Level);		invokePeepholeEPCallbacks(FPM, Level);

// For PGO use pipeline, try to optimize memory intrinsics such as memcpy		// For PGO use pipeline, try to optimize memory intrinsics such as memcpy
// using the size value profile. Don't perform this when optimizing for size.		// using the size value profile. Don't perform this when optimizing for size.
if (PGOOpt && PGOOpt->Action == PGOOptions::IRUse &&		if (PGOOpt && PGOOpt->Action == PGOOptions::IRUse &&
!isOptimizingForSize(Level))		!isOptimizingForSize(Level) && Level > O1)
FPM.addPass(PGOMemOPSizeOpt());		FPM.addPass(PGOMemOPSizeOpt());

		// TODO: Investigate the cost/benefit of tail call elimination on debugging.
		if (Level > O1)
FPM.addPass(TailCallElimPass());		FPM.addPass(TailCallElimPass());
FPM.addPass(SimplifyCFGPass());		FPM.addPass(SimplifyCFGPass());

// Form canonically associated expression trees, and simplify the trees using		// Form canonically associated expression trees, and simplify the trees using
// basic mathematical properties. For example, this will form (nearly)		// basic mathematical properties. For example, this will form (nearly)
// minimal multiplication trees.		// minimal multiplication trees.
FPM.addPass(ReassociatePass());		FPM.addPass(ReassociatePass());

// Add the primary loop simplification pipeline.		// Add the primary loop simplification pipeline.
[... 10 lines not shown ...]	PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,
// Simplify the loop body. We do this initially to clean up after other loop		// Simplify the loop body. We do this initially to clean up after other loop
// passes run, either when iterating on a loop or on inner loops with		// passes run, either when iterating on a loop or on inner loops with
// implications on the outer loop.		// implications on the outer loop.
LPM1.addPass(LoopInstSimplifyPass());		LPM1.addPass(LoopInstSimplifyPass());
LPM1.addPass(LoopSimplifyCFGPass());		LPM1.addPass(LoopSimplifyCFGPass());

// Rotate Loop - disable header duplication at -Oz		// Rotate Loop - disable header duplication at -Oz
LPM1.addPass(LoopRotatePass(Level != Oz));		LPM1.addPass(LoopRotatePass(Level != Oz));
		// TODO: Investigate promotion cap for O1.
LPM1.addPass(LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap));		LPM1.addPass(LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap));
LPM1.addPass(SimpleLoopUnswitchPass());		LPM1.addPass(SimpleLoopUnswitchPass());
LPM2.addPass(IndVarSimplifyPass());		LPM2.addPass(IndVarSimplifyPass());
LPM2.addPass(LoopIdiomRecognizePass());		LPM2.addPass(LoopIdiomRecognizePass());

for (auto &C : LateLoopOptimizationsEPCallbacks)		for (auto &C : LateLoopOptimizationsEPCallbacks)
C(LPM2, Level);		C(LPM2, Level);

[... 51 lines not shown ...]	PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,

// Run instcombine after redundancy and dead bit elimination to exploit		// Run instcombine after redundancy and dead bit elimination to exploit
// opportunities opened up by them.		// opportunities opened up by them.
FPM.addPass(InstCombinePass());		FPM.addPass(InstCombinePass());
invokePeepholeEPCallbacks(FPM, Level);		invokePeepholeEPCallbacks(FPM, Level);

// Re-consider control flow based optimizations after redundancy elimination,		// Re-consider control flow based optimizations after redundancy elimination,
// redo DCE, etc.		// redo DCE, etc.
		if (Level > O1) {
FPM.addPass(JumpThreadingPass());		FPM.addPass(JumpThreadingPass());
FPM.addPass(CorrelatedValuePropagationPass());		FPM.addPass(CorrelatedValuePropagationPass());
FPM.addPass(DSEPass());		FPM.addPass(DSEPass());
FPM.addPass(createFunctionToLoopPassAdaptor(		FPM.addPass(createFunctionToLoopPassAdaptor(
LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap),		LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap),
EnableMSSALoopDependency, DebugLogging));		EnableMSSALoopDependency, DebugLogging));
		}

for (auto &C : ScalarOptimizerLateEPCallbacks)		for (auto &C : ScalarOptimizerLateEPCallbacks)
C(FPM, Level);		C(FPM, Level);

// Finally, do an expensive DCE pass to catch all the dead code exposed by		// Finally, do an expensive DCE pass to catch all the dead code exposed by
// the simplifications and basic cleanup after all the simplifications.		// the simplifications and basic cleanup after all the simplifications.
		// TODO: Investigate if this is too expensive.
FPM.addPass(ADCEPass());		FPM.addPass(ADCEPass());
FPM.addPass(SimplifyCFGPass());		FPM.addPass(SimplifyCFGPass());
FPM.addPass(InstCombinePass());		FPM.addPass(InstCombinePass());
invokePeepholeEPCallbacks(FPM, Level);		invokePeepholeEPCallbacks(FPM, Level);

if (EnableCHR && Level == O3 && PGOOpt &&		if (EnableCHR && Level == O3 && PGOOpt &&
(PGOOpt->Action == PGOOptions::IRUse ||		(PGOOpt->Action == PGOOptions::IRUse ||
PGOOpt->Action == PGOOptions::SampleUse))		PGOOpt->Action == PGOOptions::SampleUse))
[... 1,842 lines not shown ...]
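
The effect of the new `Level > O1` guards can be sketched with a toy model in which the function pipeline is just a list of pass names. This is an abbreviated illustration under stated assumptions, not the real `PassBuilder` code: `buildSketchPipeline` and `hasPass` are hypothetical names, the enumerator order is assumed, and the flag-controlled GVN hoist/sink passes, the PGO pass, and the loop pipeline are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for PassBuilder::OptimizationLevel (order assumed).
enum OptimizationLevel { O0, O1, O2, O3, Os, Oz };

// Toy model of the start of the function simplification pipeline:
// each "pass" is recorded by name instead of being constructed.
std::vector<std::string> buildSketchPipeline(OptimizationLevel Level) {
  assert(Level != O0 && "Must request optimizations!");
  std::vector<std::string> FPM;
  FPM.push_back("SROA");
  FPM.push_back("EarlyCSE");
  // Passes the revision moves behind the O1 guard.
  if (Level > O1) {
    FPM.push_back("SpeculativeExecution");
    FPM.push_back("JumpThreading");
    FPM.push_back("CorrelatedValuePropagation");
  }
  FPM.push_back("SimplifyCFG");
  if (Level == O3)
    FPM.push_back("AggressiveInstCombine");
  FPM.push_back("InstCombine");
  // Tail call elimination is skipped at O1 in the new column of the diff.
  if (Level > O1)
    FPM.push_back("TailCallElim");
  FPM.push_back("SimplifyCFG");
  FPM.push_back("Reassociate");
  return FPM;
}

// Hypothetical helper: does the modeled pipeline contain the named pass?
bool hasPass(const std::vector<std::string> &Pipeline,
             const std::string &Name) {
  return std::find(Pipeline.begin(), Pipeline.end(), Name) != Pipeline.end();
}
```

In this model, `hasPass(buildSketchPipeline(O1), "TailCallElim")` is false while the same query at `O2` is true, which matches the test updates elsewhere in this diff that change `tail call` to `call` in the `-O1` CHECK lines and move the lldb tail-call tests to `-O2`.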

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

[... 254 lines not shown ...]	void PassManagerBuilder::populateFunctionPassManager(
if (LibraryInfo)		if (LibraryInfo)
FPM.add(new TargetLibraryInfoWrapperPass(*LibraryInfo));		FPM.add(new TargetLibraryInfoWrapperPass(*LibraryInfo));

if (OptLevel == 0) return;		if (OptLevel == 0) return;

addInitialAliasAnalysisPasses(FPM);		addInitialAliasAnalysisPasses(FPM);

FPM.add(createCFGSimplificationPass());		FPM.add(createCFGSimplificationPass());
FPM.add(createSROAPass());		FPM.add(createSROAPass());
		chandlerc: We early exit at `O0` above, so this is a no-op.
FPM.add(createEarlyCSEPass());		FPM.add(createEarlyCSEPass());
FPM.add(createLowerExpectIntrinsicPass());		FPM.add(createLowerExpectIntrinsicPass());
}		}

// Do PGO instrumentation generation or use pass as the option specified.		// Do PGO instrumentation generation or use pass as the option specified.
void PassManagerBuilder::addPGOInstrPasses(legacy::PassManagerBase &MPM,		void PassManagerBuilder::addPGOInstrPasses(legacy::PassManagerBase &MPM,
bool IsCS = false) {		bool IsCS = false) {
if (IsCS) {		if (IsCS) {
[... 13 lines not shown ...]	if (OptLevel > 0 && SizeLevel == 0 && !DisablePreInliner &&
// care about are DefaultThreshold and HintThreshold.		// care about are DefaultThreshold and HintThreshold.
InlineParams IP;		InlineParams IP;
IP.DefaultThreshold = PreInlineThreshold;		IP.DefaultThreshold = PreInlineThreshold;
// FIXME: The hint threshold has the same value used by the regular inliner.		// FIXME: The hint threshold has the same value used by the regular inliner.
// This should probably be lowered after performance testing.		// This should probably be lowered after performance testing.
IP.HintThreshold = 325;		IP.HintThreshold = 325;

MPM.add(createFunctionInliningPass(IP));		MPM.add(createFunctionInliningPass(IP));
MPM.add(createSROAPass());		MPM.add(createSROAPass());
		chandlerc: We only reach here if `OptLevel > 0` so this should be redundant?
MPM.add(createEarlyCSEPass()); // Catch trivial redundancies		MPM.add(createEarlyCSEPass()); // Catch trivial redundancies
MPM.add(createCFGSimplificationPass()); // Merge & remove BBs		MPM.add(createCFGSimplificationPass()); // Merge & remove BBs
MPM.add(createInstructionCombiningPass()); // Combine silly seq's		MPM.add(createInstructionCombiningPass()); // Combine silly seq's
addExtensionsToPM(EP_Peephole, MPM);		addExtensionsToPM(EP_Peephole, MPM);
}		}
if ((EnablePGOInstrGen && !IsCS) || (EnablePGOCSInstrGen && IsCS)) {		if ((EnablePGOInstrGen && !IsCS) || (EnablePGOCSInstrGen && IsCS)) {
MPM.add(createPGOInstrumentationGenLegacyPass(IsCS));		MPM.add(createPGOInstrumentationGenLegacyPass(IsCS));
// Add the profile lowering pass.		// Add the profile lowering pass.
[... 13 lines not shown ...]	void PassManagerBuilder::addPGOInstrPasses(legacy::PassManagerBase &MPM,
if (OptLevel > 0 && !IsCS)		if (OptLevel > 0 && !IsCS)
MPM.add(		MPM.add(
createPGOIndirectCallPromotionLegacyPass(false, !PGOSampleUse.empty()));		createPGOIndirectCallPromotionLegacyPass(false, !PGOSampleUse.empty()));
}		}
void PassManagerBuilder::addFunctionSimplificationPasses(		void PassManagerBuilder::addFunctionSimplificationPasses(
legacy::PassManagerBase &MPM) {		legacy::PassManagerBase &MPM) {
// Start of function pass.		// Start of function pass.
// Break up aggregate allocas, using SSAUpdater.		// Break up aggregate allocas, using SSAUpdater.
		assert(OptLevel >= 1 && "Calling function optimizer with no optimization level!");
MPM.add(createSROAPass());		MPM.add(createSROAPass());
		chandlerc: This doesn't have the assert, but I believe this is only used above `O0` as well. Maybe just add the assert?
MPM.add(createEarlyCSEPass(true /* Enable mem-ssa. */)); // Catch trivial redundancies		MPM.add(createEarlyCSEPass(true /* Enable mem-ssa. */)); // Catch trivial redundancies

		if (OptLevel > 1) {
if (EnableGVNHoist)		if (EnableGVNHoist)
MPM.add(createGVNHoistPass());		MPM.add(createGVNHoistPass());
if (EnableGVNSink) {		if (EnableGVNSink) {
MPM.add(createGVNSinkPass());		MPM.add(createGVNSinkPass());
MPM.add(createCFGSimplificationPass());		MPM.add(createCFGSimplificationPass());
}		}
		}

		if (OptLevel > 1) {
// Speculative execution if the target has divergent branches; otherwise nop.		// Speculative execution if the target has divergent branches; otherwise nop.
MPM.add(createSpeculativeExecutionIfHasBranchDivergencePass());		MPM.add(createSpeculativeExecutionIfHasBranchDivergencePass());

MPM.add(createJumpThreadingPass()); // Thread jumps.		MPM.add(createJumpThreadingPass()); // Thread jumps.
MPM.add(createCorrelatedValuePropagationPass()); // Propagate conditionals		MPM.add(createCorrelatedValuePropagationPass()); // Propagate conditionals
		}
MPM.add(createCFGSimplificationPass()); // Merge & remove BBs		MPM.add(createCFGSimplificationPass()); // Merge & remove BBs
// Combine silly seq's		// Combine silly seq's
if (OptLevel > 2)		if (OptLevel > 2)
MPM.add(createAggressiveInstCombinerPass());		MPM.add(createAggressiveInstCombinerPass());
addInstructionCombiningPass(MPM);		addInstructionCombiningPass(MPM);
if (SizeLevel == 0 && !DisableLibCallsShrinkWrap)		if (SizeLevel == 0 && !DisableLibCallsShrinkWrap)
MPM.add(createLibCallsShrinkWrapPass());		MPM.add(createLibCallsShrinkWrapPass());
addExtensionsToPM(EP_Peephole, MPM);		addExtensionsToPM(EP_Peephole, MPM);

// Optimize memory intrinsic calls based on the profiled size information.		// Optimize memory intrinsic calls based on the profiled size information.
if (SizeLevel == 0)		if (SizeLevel == 0)
MPM.add(createPGOMemOPSizeOptLegacyPass());		MPM.add(createPGOMemOPSizeOptLegacyPass());

		// TODO: Investigate the cost/benefit of tail call elimination on debugging.
		hfinkel: By definition, this loses information from the call stack, no?
		chandlerc: Yeah, I'd have really expected this to be skipped.
		if (OptLevel > 1)
MPM.add(createTailCallEliminationPass()); // Eliminate tail calls		MPM.add(createTailCallEliminationPass()); // Eliminate tail calls
MPM.add(createCFGSimplificationPass()); // Merge & remove BBs		MPM.add(createCFGSimplificationPass()); // Merge & remove BBs
MPM.add(createReassociatePass()); // Reassociate expressions		MPM.add(createReassociatePass()); // Reassociate expressions

// Begin the loop pass pipeline.		// Begin the loop pass pipeline.
if (EnableSimpleLoopUnswitch) {		if (EnableSimpleLoopUnswitch) {
// The simple loop unswitch pass relies on separate cleanup passes. Schedule		// The simple loop unswitch pass relies on separate cleanup passes. Schedule
// them first so when we re-process a loop they run before other loop		// them first so when we re-process a loop they run before other loop
// passes.		// passes.
MPM.add(createLoopInstSimplifyPass());		MPM.add(createLoopInstSimplifyPass());
MPM.add(createLoopSimplifyCFGPass());		MPM.add(createLoopSimplifyCFGPass());
}		}
// Rotate Loop - disable header duplication at -Oz		// Rotate Loop - disable header duplication at -Oz
MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));		MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));
		// TODO: Investigate promotion cap for O1.
MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));		MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));
if (EnableSimpleLoopUnswitch)		if (EnableSimpleLoopUnswitch)
MPM.add(createSimpleLoopUnswitchLegacyPass());		MPM.add(createSimpleLoopUnswitchLegacyPass());
else		else
MPM.add(createLoopUnswitchPass(SizeLevel || OptLevel < 3, DivergentTarget));		MPM.add(createLoopUnswitchPass(SizeLevel || OptLevel < 3, DivergentTarget));
// FIXME: We break the loop pass pipeline here in order to do full		// FIXME: We break the loop pass pipeline here in order to do full
// simplify-cfg. Eventually loop-simplifycfg should be enhanced to replace the		// simplify-cfg. Eventually loop-simplifycfg should be enhanced to replace the
// need for this.		// need for this.
[26 lines not shown]	void PassManagerBuilder::addFunctionSimplificationPasses(
// computations, and then ADCE will run later to exploit any new DCE		// computations, and then ADCE will run later to exploit any new DCE
// opportunities that creates).		// opportunities that creates).
MPM.add(createBitTrackingDCEPass()); // Delete dead bit computations		MPM.add(createBitTrackingDCEPass()); // Delete dead bit computations

// Run instcombine after redundancy elimination to exploit opportunities		// Run instcombine after redundancy elimination to exploit opportunities
// opened up by them.		// opened up by them.
addInstructionCombiningPass(MPM);		addInstructionCombiningPass(MPM);
addExtensionsToPM(EP_Peephole, MPM);		addExtensionsToPM(EP_Peephole, MPM);
		if (OptLevel > 1) {
MPM.add(createJumpThreadingPass()); // Thread jumps		MPM.add(createJumpThreadingPass()); // Thread jumps
MPM.add(createCorrelatedValuePropagationPass());		MPM.add(createCorrelatedValuePropagationPass());
MPM.add(createDeadStoreEliminationPass()); // Delete dead stores		MPM.add(createDeadStoreEliminationPass()); // Delete dead stores
MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));		MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));
		}

addExtensionsToPM(EP_ScalarOptimizerLate, MPM);		addExtensionsToPM(EP_ScalarOptimizerLate, MPM);

if (RerollLoops)		if (RerollLoops)
MPM.add(createLoopRerollPass());		MPM.add(createLoopRerollPass());

		// TODO: Investigate if this is too expensive at O1.
		hfinkel: Yes, I'd fall back to using regular DCE.
		chandlerc: +1
MPM.add(createAggressiveDCEPass()); // Delete dead instructions		MPM.add(createAggressiveDCEPass()); // Delete dead instructions
MPM.add(createCFGSimplificationPass()); // Merge & remove BBs		MPM.add(createCFGSimplificationPass()); // Merge & remove BBs
// Clean up after everything.		// Clean up after everything.
addInstructionCombiningPass(MPM);		addInstructionCombiningPass(MPM);
addExtensionsToPM(EP_Peephole, MPM);		addExtensionsToPM(EP_Peephole, MPM);

if (EnableCHR && OptLevel >= 3 &&		if (EnableCHR && OptLevel >= 3 &&
(!PGOInstrUse.empty() || !PGOSampleUse.empty() || EnablePGOCSInstrGen))		(!PGOInstrUse.empty() || !PGOSampleUse.empty() || EnablePGOCSInstrGen))
[471 lines not shown]	void PassManagerBuilder::addLTOOptimizationPasses(legacy::PassManagerBase &PM) {
addExtensionsToPM(EP_Peephole, PM);		addExtensionsToPM(EP_Peephole, PM);
PM.add(createJumpThreadingPass());		PM.add(createJumpThreadingPass());

// Break up allocas		// Break up allocas
PM.add(createSROAPass());		PM.add(createSROAPass());

// LTO provides additional opportunities for tailcall elimination due to		// LTO provides additional opportunities for tailcall elimination due to
// link-time inlining, and visibility of nocapture attribute.		// link-time inlining, and visibility of nocapture attribute.
		if (OptLevel > 1)
PM.add(createTailCallEliminationPass());		PM.add(createTailCallEliminationPass());

// Infer attributes on declarations, call sites, arguments, etc.		// Infer attributes on declarations, call sites, arguments, etc.
PM.add(createPostOrderFunctionAttrsLegacyPass()); // Add nocapture.		PM.add(createPostOrderFunctionAttrsLegacyPass()); // Add nocapture.
// Run a few AA driven optimizations here and now, to cleanup the code.		// Run a few AA driven optimizations here and now, to cleanup the code.
PM.add(createGlobalsAAWrapperPass()); // IP alias analysis.		PM.add(createGlobalsAAWrapperPass()); // IP alias analysis.

PM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));		PM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));
PM.add(createMergedLoadStoreMotionPass()); // Merge ld/st in diamonds.		PM.add(createMergedLoadStoreMotionPass()); // Merge ld/st in diamonds.
[226 lines not shown]
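
The gating pattern in the hunks above can be summarised with a small stand-alone model. This is only a sketch: the function name `functionSimplificationSketch` and the string pass names are hypothetical, nothing here calls into LLVM, and only a subset of the passes from the diff is modelled (flag-controlled passes such as GVN hoist/sink are left out).

```cpp
#include <string>
#include <vector>

// Toy model of the OptLevel gating shown in the diff above.
// Pass names are plain strings chosen for illustration; they are not
// LLVM pass objects and this function is not part of PassManagerBuilder.
std::vector<std::string> functionSimplificationSketch(unsigned OptLevel) {
  std::vector<std::string> Passes;
  Passes.push_back("sroa");
  Passes.push_back("early-cse");
  if (OptLevel > 1) {
    // Newly gated in this patch: skipped at -O1.
    Passes.push_back("jump-threading");
    Passes.push_back("correlated-propagation");
  }
  Passes.push_back("simplifycfg");
  if (OptLevel > 2)
    Passes.push_back("aggressive-instcombine");
  Passes.push_back("instcombine");
  if (OptLevel > 1)
    Passes.push_back("tailcallelim"); // Newly gated: skipped at -O1.
  Passes.push_back("simplifycfg");
  Passes.push_back("reassociate");
  return Passes;
}
```

Calling `functionSimplificationSketch(1)` yields the shorter list without jump threading, correlated value propagation, or tail call elimination, while levels 2 and 3 add them back. The real builder adds many more passes and extension points than this model covers.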

llvm/test/CodeGen/AMDGPU/simplify-libcalls.ll
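
In the test below, the right-hand column drops the `tail` marker from the calls and from the CHECK lines. That fits with tail call elimination no longer being scheduled at -O1 in the PassManagerBuilder.cpp change, although this reading is an inference from the diff rather than something the review states. The reviewers' concern about the call stack can be illustrated with a hand-written pair of functions; the names `sumToRecursive` and `sumToLoop` are hypothetical, and the loop version is written by hand to show the effect of turning tail recursion into iteration, not actual compiler output.

```cpp
#include <cstdint>

// Accumulator-style recursion: the recursive call is the last thing the
// function does, so it is in tail position. Without tail call elimination,
// each call keeps its own stack frame that a debugger can show.
std::uint64_t sumToRecursive(std::uint64_t N, std::uint64_t Acc = 0) {
  if (N == 0)
    return Acc;
  return sumToRecursive(N - 1, Acc + N);
}

// Hand-written loop computing the same value. Once recursion is replaced
// by a loop like this, there is a single frame, so a backtrace no longer
// shows the chain of calls.
std::uint64_t sumToLoop(std::uint64_t N) {
  std::uint64_t Acc = 0;
  while (N != 0) {
    Acc += N;
    --N;
  }
  return Acc;
}
```

Both functions return the same result; the difference is only in how much call history remains visible while debugging.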

	; RUN: opt -S -O1 -mtriple=amdgcn-- -amdgpu-simplify-libcall < %s | FileCheck -enable-var-scope -check-prefix=GCN -check-prefix=GCN-POSTLINK %s			; RUN: opt -S -O1 -mtriple=amdgcn-- -amdgpu-simplify-libcall < %s | FileCheck -enable-var-scope -check-prefix=GCN -check-prefix=GCN-POSTLINK %s
	; RUN: opt -S -O1 -mtriple=amdgcn-- -amdgpu-simplify-libcall -amdgpu-prelink <%s | FileCheck -enable-var-scope -check-prefix=GCN -check-prefix=GCN-PRELINK %s			; RUN: opt -S -O1 -mtriple=amdgcn-- -amdgpu-simplify-libcall -amdgpu-prelink <%s | FileCheck -enable-var-scope -check-prefix=GCN -check-prefix=GCN-PRELINK %s
	; RUN: opt -S -O1 -mtriple=amdgcn-- -amdgpu-use-native -amdgpu-prelink < %s | FileCheck -enable-var-scope -check-prefix=GCN -check-prefix=GCN-NATIVE %s			; RUN: opt -S -O1 -mtriple=amdgcn-- -amdgpu-use-native -amdgpu-prelink < %s | FileCheck -enable-var-scope -check-prefix=GCN -check-prefix=GCN-NATIVE %s

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos
	; GCN-POSTLINK: tail call fast float @_Z3sinf(			; GCN-POSTLINK: call fast float @_Z3sinf(
	; GCN-POSTLINK: tail call fast float @_Z3cosf(			; GCN-POSTLINK: call fast float @_Z3cosf(
	; GCN-PRELINK: call fast float @_Z6sincosfPf(			; GCN-PRELINK: call fast float @_Z6sincosfPf(
	; GCN-NATIVE: tail call fast float @_Z10native_sinf(			; GCN-NATIVE: call fast float @_Z10native_sinf(
	; GCN-NATIVE: tail call fast float @_Z10native_cosf(			; GCN-NATIVE: call fast float @_Z10native_cosf(
	define amdgpu_kernel void @test_sincos(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_sincos(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3sinf(float %tmp)			%call = call fast float @_Z3sinf(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	%call2 = tail call fast float @_Z3cosf(float %tmp)			%call2 = call fast float @_Z3cosf(float %tmp)
	%arrayidx3 = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx3 = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	store float %call2, float addrspace(1)* %arrayidx3, align 4			store float %call2, float addrspace(1)* %arrayidx3, align 4
	ret void			ret void
	}			}

	declare float @_Z3sinf(float)			declare float @_Z3sinf(float)

	declare float @_Z3cosf(float)			declare float @_Z3cosf(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v2			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v2
	; GCN-POSTLINK: tail call fast <2 x float> @_Z3sinDv2_f(			; GCN-POSTLINK: call fast <2 x float> @_Z3sinDv2_f(
	; GCN-POSTLINK: tail call fast <2 x float> @_Z3cosDv2_f(			; GCN-POSTLINK: call fast <2 x float> @_Z3cosDv2_f(
	; GCN-PRELINK: call fast <2 x float> @_Z6sincosDv2_fPS_(			; GCN-PRELINK: call fast <2 x float> @_Z6sincosDv2_fPS_(
	; GCN-NATIVE: tail call fast <2 x float> @_Z10native_sinDv2_f(			; GCN-NATIVE: call fast <2 x float> @_Z10native_sinDv2_f(
	; GCN-NATIVE: tail call fast <2 x float> @_Z10native_cosDv2_f(			; GCN-NATIVE: call fast <2 x float> @_Z10native_cosDv2_f(
	define amdgpu_kernel void @test_sincos_v2(<2 x float> addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_sincos_v2(<2 x float> addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load <2 x float>, <2 x float> addrspace(1)* %a, align 8			%tmp = load <2 x float>, <2 x float> addrspace(1)* %a, align 8
	%call = tail call fast <2 x float> @_Z3sinDv2_f(<2 x float> %tmp)			%call = call fast <2 x float> @_Z3sinDv2_f(<2 x float> %tmp)
	store <2 x float> %call, <2 x float> addrspace(1)* %a, align 8			store <2 x float> %call, <2 x float> addrspace(1)* %a, align 8
	%call2 = tail call fast <2 x float> @_Z3cosDv2_f(<2 x float> %tmp)			%call2 = call fast <2 x float> @_Z3cosDv2_f(<2 x float> %tmp)
	%arrayidx3 = getelementptr inbounds <2 x float>, <2 x float> addrspace(1)* %a, i64 1			%arrayidx3 = getelementptr inbounds <2 x float>, <2 x float> addrspace(1)* %a, i64 1
	store <2 x float> %call2, <2 x float> addrspace(1)* %arrayidx3, align 8			store <2 x float> %call2, <2 x float> addrspace(1)* %arrayidx3, align 8
	ret void			ret void
	}			}

	declare <2 x float> @_Z3sinDv2_f(<2 x float>)			declare <2 x float> @_Z3sinDv2_f(<2 x float>)

	declare <2 x float> @_Z3cosDv2_f(<2 x float>)			declare <2 x float> @_Z3cosDv2_f(<2 x float>)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v3			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v3
	; GCN-POSTLINK: tail call fast <3 x float> @_Z3sinDv3_f(			; GCN-POSTLINK: call fast <3 x float> @_Z3sinDv3_f(
	; GCN-POSTLINK: tail call fast <3 x float> @_Z3cosDv3_f(			; GCN-POSTLINK: call fast <3 x float> @_Z3cosDv3_f(
	; GCN-PRELINK: call fast <3 x float> @_Z6sincosDv3_fPS_(			; GCN-PRELINK: call fast <3 x float> @_Z6sincosDv3_fPS_(
	; GCN-NATIVE: tail call fast <3 x float> @_Z10native_sinDv3_f(			; GCN-NATIVE: call fast <3 x float> @_Z10native_sinDv3_f(
	; GCN-NATIVE: tail call fast <3 x float> @_Z10native_cosDv3_f(			; GCN-NATIVE: call fast <3 x float> @_Z10native_cosDv3_f(
	define amdgpu_kernel void @test_sincos_v3(<3 x float> addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_sincos_v3(<3 x float> addrspace(1)* nocapture %a) {
	entry:			entry:
	%castToVec4 = bitcast <3 x float> addrspace(1)* %a to <4 x float> addrspace(1)*			%castToVec4 = bitcast <3 x float> addrspace(1)* %a to <4 x float> addrspace(1)*
	%loadVec4 = load <4 x float>, <4 x float> addrspace(1)* %castToVec4, align 16			%loadVec4 = load <4 x float>, <4 x float> addrspace(1)* %castToVec4, align 16
	%extractVec4 = shufflevector <4 x float> %loadVec4, <4 x float> undef, <3 x i32> <i32 0, i32 1, i32 2>			%extractVec4 = shufflevector <4 x float> %loadVec4, <4 x float> undef, <3 x i32> <i32 0, i32 1, i32 2>
	%call = tail call fast <3 x float> @_Z3sinDv3_f(<3 x float> %extractVec4)			%call = call fast <3 x float> @_Z3sinDv3_f(<3 x float> %extractVec4)
	%extractVec6 = shufflevector <3 x float> %call, <3 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>			%extractVec6 = shufflevector <3 x float> %call, <3 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
	store <4 x float> %extractVec6, <4 x float> addrspace(1)* %castToVec4, align 16			store <4 x float> %extractVec6, <4 x float> addrspace(1)* %castToVec4, align 16
	%call11 = tail call fast <3 x float> @_Z3cosDv3_f(<3 x float> %extractVec4)			%call11 = call fast <3 x float> @_Z3cosDv3_f(<3 x float> %extractVec4)
	%arrayidx12 = getelementptr inbounds <3 x float>, <3 x float> addrspace(1)* %a, i64 1			%arrayidx12 = getelementptr inbounds <3 x float>, <3 x float> addrspace(1)* %a, i64 1
	%extractVec13 = shufflevector <3 x float> %call11, <3 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>			%extractVec13 = shufflevector <3 x float> %call11, <3 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
	%storetmp14 = bitcast <3 x float> addrspace(1)* %arrayidx12 to <4 x float> addrspace(1)*			%storetmp14 = bitcast <3 x float> addrspace(1)* %arrayidx12 to <4 x float> addrspace(1)*
	store <4 x float> %extractVec13, <4 x float> addrspace(1)* %storetmp14, align 16			store <4 x float> %extractVec13, <4 x float> addrspace(1)* %storetmp14, align 16
	ret void			ret void
	}			}

	declare <3 x float> @_Z3sinDv3_f(<3 x float>)			declare <3 x float> @_Z3sinDv3_f(<3 x float>)

	declare <3 x float> @_Z3cosDv3_f(<3 x float>)			declare <3 x float> @_Z3cosDv3_f(<3 x float>)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v4			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v4
	; GCN-POSTLINK: tail call fast <4 x float> @_Z3sinDv4_f(			; GCN-POSTLINK: call fast <4 x float> @_Z3sinDv4_f(
	; GCN-POSTLINK: tail call fast <4 x float> @_Z3cosDv4_f(			; GCN-POSTLINK: call fast <4 x float> @_Z3cosDv4_f(
	; GCN-PRELINK: call fast <4 x float> @_Z6sincosDv4_fPS_(			; GCN-PRELINK: call fast <4 x float> @_Z6sincosDv4_fPS_(
	; GCN-NATIVE: tail call fast <4 x float> @_Z10native_sinDv4_f(			; GCN-NATIVE: call fast <4 x float> @_Z10native_sinDv4_f(
	; GCN-NATIVE: tail call fast <4 x float> @_Z10native_cosDv4_f(			; GCN-NATIVE: call fast <4 x float> @_Z10native_cosDv4_f(
	define amdgpu_kernel void @test_sincos_v4(<4 x float> addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_sincos_v4(<4 x float> addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load <4 x float>, <4 x float> addrspace(1)* %a, align 16			%tmp = load <4 x float>, <4 x float> addrspace(1)* %a, align 16
	%call = tail call fast <4 x float> @_Z3sinDv4_f(<4 x float> %tmp)			%call = call fast <4 x float> @_Z3sinDv4_f(<4 x float> %tmp)
	store <4 x float> %call, <4 x float> addrspace(1)* %a, align 16			store <4 x float> %call, <4 x float> addrspace(1)* %a, align 16
	%call2 = tail call fast <4 x float> @_Z3cosDv4_f(<4 x float> %tmp)			%call2 = call fast <4 x float> @_Z3cosDv4_f(<4 x float> %tmp)
	%arrayidx3 = getelementptr inbounds <4 x float>, <4 x float> addrspace(1)* %a, i64 1			%arrayidx3 = getelementptr inbounds <4 x float>, <4 x float> addrspace(1)* %a, i64 1
	store <4 x float> %call2, <4 x float> addrspace(1)* %arrayidx3, align 16			store <4 x float> %call2, <4 x float> addrspace(1)* %arrayidx3, align 16
	ret void			ret void
	}			}

	declare <4 x float> @_Z3sinDv4_f(<4 x float>)			declare <4 x float> @_Z3sinDv4_f(<4 x float>)

	declare <4 x float> @_Z3cosDv4_f(<4 x float>)			declare <4 x float> @_Z3cosDv4_f(<4 x float>)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v8			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v8
	; GCN-POSTLINK: tail call fast <8 x float> @_Z3sinDv8_f(			; GCN-POSTLINK: call fast <8 x float> @_Z3sinDv8_f(
	; GCN-POSTLINK: tail call fast <8 x float> @_Z3cosDv8_f(			; GCN-POSTLINK: call fast <8 x float> @_Z3cosDv8_f(
	; GCN-PRELINK: call fast <8 x float> @_Z6sincosDv8_fPS_(			; GCN-PRELINK: call fast <8 x float> @_Z6sincosDv8_fPS_(
	; GCN-NATIVE: tail call fast <8 x float> @_Z10native_sinDv8_f(			; GCN-NATIVE: call fast <8 x float> @_Z10native_sinDv8_f(
	; GCN-NATIVE: tail call fast <8 x float> @_Z10native_cosDv8_f(			; GCN-NATIVE: call fast <8 x float> @_Z10native_cosDv8_f(
	define amdgpu_kernel void @test_sincos_v8(<8 x float> addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_sincos_v8(<8 x float> addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load <8 x float>, <8 x float> addrspace(1)* %a, align 32			%tmp = load <8 x float>, <8 x float> addrspace(1)* %a, align 32
	%call = tail call fast <8 x float> @_Z3sinDv8_f(<8 x float> %tmp)			%call = call fast <8 x float> @_Z3sinDv8_f(<8 x float> %tmp)
	store <8 x float> %call, <8 x float> addrspace(1)* %a, align 32			store <8 x float> %call, <8 x float> addrspace(1)* %a, align 32
	%call2 = tail call fast <8 x float> @_Z3cosDv8_f(<8 x float> %tmp)			%call2 = call fast <8 x float> @_Z3cosDv8_f(<8 x float> %tmp)
	%arrayidx3 = getelementptr inbounds <8 x float>, <8 x float> addrspace(1)* %a, i64 1			%arrayidx3 = getelementptr inbounds <8 x float>, <8 x float> addrspace(1)* %a, i64 1
	store <8 x float> %call2, <8 x float> addrspace(1)* %arrayidx3, align 32			store <8 x float> %call2, <8 x float> addrspace(1)* %arrayidx3, align 32
	ret void			ret void
	}			}

	declare <8 x float> @_Z3sinDv8_f(<8 x float>)			declare <8 x float> @_Z3sinDv8_f(<8 x float>)

	declare <8 x float> @_Z3cosDv8_f(<8 x float>)			declare <8 x float> @_Z3cosDv8_f(<8 x float>)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v16			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_sincos_v16
	; GCN-POSTLINK: tail call fast <16 x float> @_Z3sinDv16_f(			; GCN-POSTLINK: call fast <16 x float> @_Z3sinDv16_f(
	; GCN-POSTLINK: tail call fast <16 x float> @_Z3cosDv16_f(			; GCN-POSTLINK: call fast <16 x float> @_Z3cosDv16_f(
	; GCN-PRELINK: call fast <16 x float> @_Z6sincosDv16_fPS_(			; GCN-PRELINK: call fast <16 x float> @_Z6sincosDv16_fPS_(
	; GCN-NATIVE: tail call fast <16 x float> @_Z10native_sinDv16_f(			; GCN-NATIVE: call fast <16 x float> @_Z10native_sinDv16_f(
	; GCN-NATIVE: tail call fast <16 x float> @_Z10native_cosDv16_f(			; GCN-NATIVE: call fast <16 x float> @_Z10native_cosDv16_f(
	define amdgpu_kernel void @test_sincos_v16(<16 x float> addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_sincos_v16(<16 x float> addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load <16 x float>, <16 x float> addrspace(1)* %a, align 64			%tmp = load <16 x float>, <16 x float> addrspace(1)* %a, align 64
	%call = tail call fast <16 x float> @_Z3sinDv16_f(<16 x float> %tmp)			%call = call fast <16 x float> @_Z3sinDv16_f(<16 x float> %tmp)
	store <16 x float> %call, <16 x float> addrspace(1)* %a, align 64			store <16 x float> %call, <16 x float> addrspace(1)* %a, align 64
	%call2 = tail call fast <16 x float> @_Z3cosDv16_f(<16 x float> %tmp)			%call2 = call fast <16 x float> @_Z3cosDv16_f(<16 x float> %tmp)
	%arrayidx3 = getelementptr inbounds <16 x float>, <16 x float> addrspace(1)* %a, i64 1			%arrayidx3 = getelementptr inbounds <16 x float>, <16 x float> addrspace(1)* %a, i64 1
	store <16 x float> %call2, <16 x float> addrspace(1)* %arrayidx3, align 64			store <16 x float> %call2, <16 x float> addrspace(1)* %arrayidx3, align 64
	ret void			ret void
	}			}

	declare <16 x float> @_Z3sinDv16_f(<16 x float>)			declare <16 x float> @_Z3sinDv16_f(<16 x float>)

	declare <16 x float> @_Z3cosDv16_f(<16 x float>)			declare <16 x float> @_Z3cosDv16_f(<16 x float>)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_native_recip			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_native_recip
	; GCN: store float 0x3FD5555560000000, float addrspace(1)* %a			; GCN: store float 0x3FD5555560000000, float addrspace(1)* %a
	define amdgpu_kernel void @test_native_recip(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_native_recip(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%call = tail call fast float @_Z12native_recipf(float 3.000000e+00)			%call = call fast float @_Z12native_recipf(float 3.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z12native_recipf(float)			declare float @_Z12native_recipf(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_half_recip			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_half_recip
	; GCN: store float 0x3FD5555560000000, float addrspace(1)* %a			; GCN: store float 0x3FD5555560000000, float addrspace(1)* %a
	define amdgpu_kernel void @test_half_recip(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_half_recip(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%call = tail call fast float @_Z10half_recipf(float 3.000000e+00)			%call = call fast float @_Z10half_recipf(float 3.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z10half_recipf(float)			declare float @_Z10half_recipf(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_native_divide			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_native_divide
	; GCN: fmul fast float %tmp, 0x3FD5555560000000			; GCN: fmul fast float %tmp, 0x3FD5555560000000
	define amdgpu_kernel void @test_native_divide(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_native_divide(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z13native_divideff(float %tmp, float 3.000000e+00)			%call = call fast float @_Z13native_divideff(float %tmp, float 3.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z13native_divideff(float, float)			declare float @_Z13native_divideff(float, float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_half_divide			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_half_divide
	; GCN: fmul fast float %tmp, 0x3FD5555560000000			; GCN: fmul fast float %tmp, 0x3FD5555560000000
	define amdgpu_kernel void @test_half_divide(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_half_divide(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z11half_divideff(float %tmp, float 3.000000e+00)			%call = call fast float @_Z11half_divideff(float %tmp, float 3.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z11half_divideff(float, float)			declare float @_Z11half_divideff(float, float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_0f			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_0f
	; GCN: store float 1.000000e+00, float addrspace(1)* %a			; GCN: store float 1.000000e+00, float addrspace(1)* %a
	define amdgpu_kernel void @test_pow_0f(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_0f(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 0.000000e+00)			%call = call fast float @_Z3powff(float %tmp, float 0.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z3powff(float, float)			declare float @_Z3powff(float, float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_0i			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_0i
	; GCN: store float 1.000000e+00, float addrspace(1)* %a			; GCN: store float 1.000000e+00, float addrspace(1)* %a
	define amdgpu_kernel void @test_pow_0i(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_0i(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 0.000000e+00)			%call = call fast float @_Z3powff(float %tmp, float 0.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_1f			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_1f
	; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4			; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4
	; GCN: store float %tmp, float addrspace(1)* %a, align 4			; GCN: store float %tmp, float addrspace(1)* %a, align 4
	define amdgpu_kernel void @test_pow_1f(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_1f(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 1.000000e+00)			%call = call fast float @_Z3powff(float %tmp, float 1.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_1i			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_1i
	; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4			; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4
	; GCN: store float %tmp, float addrspace(1)* %a, align 4			; GCN: store float %tmp, float addrspace(1)* %a, align 4
	define amdgpu_kernel void @test_pow_1i(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_1i(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 1.000000e+00)			%call = call fast float @_Z3powff(float %tmp, float 1.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_2f			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_2f
	; GCN: %tmp = load float, float addrspace(1)* %a, align 4			; GCN: %tmp = load float, float addrspace(1)* %a, align 4
	; GCN: %__pow2 = fmul fast float %tmp, %tmp			; GCN: %__pow2 = fmul fast float %tmp, %tmp
	define amdgpu_kernel void @test_pow_2f(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_2f(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 2.000000e+00)			%call = call fast float @_Z3powff(float %tmp, float 2.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_2i			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_2i
	; GCN: %tmp = load float, float addrspace(1)* %a, align 4			; GCN: %tmp = load float, float addrspace(1)* %a, align 4
	; GCN: %__pow2 = fmul fast float %tmp, %tmp			; GCN: %__pow2 = fmul fast float %tmp, %tmp
	define amdgpu_kernel void @test_pow_2i(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_2i(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 2.000000e+00)			%call = call fast float @_Z3powff(float %tmp, float 2.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_m1f			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_m1f
	; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4			; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4
	; GCN: %__powrecip = fdiv fast float 1.000000e+00, %tmp			; GCN: %__powrecip = fdiv fast float 1.000000e+00, %tmp
	define amdgpu_kernel void @test_pow_m1f(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_m1f(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float -1.000000e+00)			%call = call fast float @_Z3powff(float %tmp, float -1.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_m1i			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_m1i
	; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4			; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4
	; GCN: %__powrecip = fdiv fast float 1.000000e+00, %tmp			; GCN: %__powrecip = fdiv fast float 1.000000e+00, %tmp
	define amdgpu_kernel void @test_pow_m1i(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_m1i(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float -1.000000e+00)			%call = call fast float @_Z3powff(float %tmp, float -1.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_half			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_half
	; GCN-POSTLINK: tail call fast float @_Z3powff(float %tmp, float 5.000000e-01)			; GCN-POSTLINK: call fast float @_Z3powff(float %tmp, float 5.000000e-01)
	; GCN-PRELINK: %__pow2sqrt = tail call fast float @_Z4sqrtf(float %tmp)			; GCN-PRELINK: %__pow2sqrt = call fast float @_Z4sqrtf(float %tmp)
	define amdgpu_kernel void @test_pow_half(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_half(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 5.000000e-01)			%call = call fast float @_Z3powff(float %tmp, float 5.000000e-01)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_mhalf			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_mhalf
	; GCN-POSTLINK: tail call fast float @_Z3powff(float %tmp, float -5.000000e-01)			; GCN-POSTLINK: call fast float @_Z3powff(float %tmp, float -5.000000e-01)
	; GCN-PRELINK: %__pow2rsqrt = tail call fast float @_Z5rsqrtf(float %tmp)			; GCN-PRELINK: %__pow2rsqrt = call fast float @_Z5rsqrtf(float %tmp)
	define amdgpu_kernel void @test_pow_mhalf(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_mhalf(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float -5.000000e-01)			%call = call fast float @_Z3powff(float %tmp, float -5.000000e-01)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_c			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow_c
	; GCN: %__powx2 = fmul fast float %tmp, %tmp			; GCN: %__powx2 = fmul fast float %tmp, %tmp
	; GCN: %__powx21 = fmul fast float %__powx2, %__powx2			; GCN: %__powx21 = fmul fast float %__powx2, %__powx2
	; GCN: %__powx22 = fmul fast float %__powx2, %tmp			; GCN: %__powx22 = fmul fast float %__powx2, %tmp
	; GCN: %[[r0:.*]] = fmul fast float %__powx21, %__powx21			; GCN: %[[r0:.*]] = fmul fast float %__powx21, %__powx21
	; GCN: %__powprod3 = fmul fast float %[[r0]], %__powx22			; GCN: %__powprod3 = fmul fast float %[[r0]], %__powx22
	define amdgpu_kernel void @test_pow_c(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow_c(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 1.100000e+01)			%call = call fast float @_Z3powff(float %tmp, float 1.100000e+01)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_powr_c			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_powr_c
	; GCN: %__powx2 = fmul fast float %tmp, %tmp			; GCN: %__powx2 = fmul fast float %tmp, %tmp
	; GCN: %__powx21 = fmul fast float %__powx2, %__powx2			; GCN: %__powx21 = fmul fast float %__powx2, %__powx2
	; GCN: %__powx22 = fmul fast float %__powx2, %tmp			; GCN: %__powx22 = fmul fast float %__powx2, %tmp
	; GCN: %[[r0:.*]] = fmul fast float %__powx21, %__powx21			; GCN: %[[r0:.*]] = fmul fast float %__powx21, %__powx21
	; GCN: %__powprod3 = fmul fast float %[[r0]], %__powx22			; GCN: %__powprod3 = fmul fast float %[[r0]], %__powx22
	define amdgpu_kernel void @test_powr_c(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_powr_c(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z4powrff(float %tmp, float 1.100000e+01)			%call = call fast float @_Z4powrff(float %tmp, float 1.100000e+01)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z4powrff(float, float)			declare float @_Z4powrff(float, float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pown_c			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pown_c
	; GCN: %__powx2 = fmul fast float %tmp, %tmp			; GCN: %__powx2 = fmul fast float %tmp, %tmp
	; GCN: %__powx21 = fmul fast float %__powx2, %__powx2			; GCN: %__powx21 = fmul fast float %__powx2, %__powx2
	; GCN: %__powx22 = fmul fast float %__powx2, %tmp			; GCN: %__powx22 = fmul fast float %__powx2, %tmp
	; GCN: %[[r0:.*]] = fmul fast float %__powx21, %__powx21			; GCN: %[[r0:.*]] = fmul fast float %__powx21, %__powx21
	; GCN: %__powprod3 = fmul fast float %[[r0]], %__powx22			; GCN: %__powprod3 = fmul fast float %[[r0]], %__powx22
	define amdgpu_kernel void @test_pown_c(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pown_c(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z4pownfi(float %tmp, i32 11)			%call = call fast float @_Z4pownfi(float %tmp, i32 11)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z4pownfi(float, i32)			declare float @_Z4pownfi(float, i32)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pow
	; GCN-POSTLINK: tail call fast float @_Z3powff(float %tmp, float 1.013000e+03)			; GCN-POSTLINK: call fast float @_Z3powff(float %tmp, float 1.013000e+03)
	; GCN-PRELINK: %__fabs = tail call fast float @_Z4fabsf(float %tmp)			; GCN-PRELINK: %__fabs = call fast float @_Z4fabsf(float %tmp)
	; GCN-PRELINK: %__log2 = tail call fast float @_Z4log2f(float %__fabs)			; GCN-PRELINK: %__log2 = call fast float @_Z4log2f(float %__fabs)
	; GCN-PRELINK: %__ylogx = fmul fast float %__log2, 1.013000e+03			; GCN-PRELINK: %__ylogx = fmul fast float %__log2, 1.013000e+03
	; GCN-PRELINK: %__exp2 = tail call fast float @_Z4exp2f(float %__ylogx)			; GCN-PRELINK: %__exp2 = call fast float @_Z4exp2f(float %__ylogx)
	; GCN-PRELINK: %[[r0:.*]] = bitcast float %tmp to i32			; GCN-PRELINK: %[[r0:.*]] = bitcast float %tmp to i32
	; GCN-PRELINK: %__pow_sign = and i32 %[[r0]], -2147483648			; GCN-PRELINK: %__pow_sign = and i32 %[[r0]], -2147483648
	; GCN-PRELINK: %[[r1:.*]] = bitcast float %__exp2 to i32			; GCN-PRELINK: %[[r1:.*]] = bitcast float %__exp2 to i32
	; GCN-PRELINK: %[[r2:.*]] = or i32 %__pow_sign, %[[r1]]			; GCN-PRELINK: %[[r2:.*]] = or i32 %__pow_sign, %[[r1]]
	; GCN-PRELINK: %[[r3:.*]] = bitcast float addrspace(1)* %a to i32 addrspace(1)*			; GCN-PRELINK: %[[r3:.*]] = bitcast float addrspace(1)* %a to i32 addrspace(1)*
	; GCN-PRELINK: store i32 %[[r2]], i32 addrspace(1)* %[[r3]], align 4			; GCN-PRELINK: store i32 %[[r2]], i32 addrspace(1)* %[[r3]], align 4
	define amdgpu_kernel void @test_pow(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pow(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3powff(float %tmp, float 1.013000e+03)			%call = call fast float @_Z3powff(float %tmp, float 1.013000e+03)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_powr			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_powr
	; GCN-POSTLINK: tail call fast float @_Z4powrff(float %tmp, float %tmp1)			; GCN-POSTLINK: call fast float @_Z4powrff(float %tmp, float %tmp1)
	; GCN-PRELINK: %__log2 = tail call fast float @_Z4log2f(float %tmp)			; GCN-PRELINK: %__log2 = call fast float @_Z4log2f(float %tmp)
	; GCN-PRELINK: %__ylogx = fmul fast float %__log2, %tmp1			; GCN-PRELINK: %__ylogx = fmul fast float %__log2, %tmp1
	; GCN-PRELINK: %__exp2 = tail call fast float @_Z4exp2f(float %__ylogx)			; GCN-PRELINK: %__exp2 = call fast float @_Z4exp2f(float %__ylogx)
	; GCN-PRELINK: store float %__exp2, float addrspace(1)* %a, align 4			; GCN-PRELINK: store float %__exp2, float addrspace(1)* %a, align 4
	; GCN-NATIVE: %__log2 = tail call fast float @_Z11native_log2f(float %tmp)			; GCN-NATIVE: %__log2 = call fast float @_Z11native_log2f(float %tmp)
	; GCN-NATIVE: %__ylogx = fmul fast float %__log2, %tmp1			; GCN-NATIVE: %__ylogx = fmul fast float %__log2, %tmp1
	; GCN-NATIVE: %__exp2 = tail call fast float @_Z11native_exp2f(float %__ylogx)			; GCN-NATIVE: %__exp2 = call fast float @_Z11native_exp2f(float %__ylogx)
	; GCN-NATIVE: store float %__exp2, float addrspace(1)* %a, align 4			; GCN-NATIVE: store float %__exp2, float addrspace(1)* %a, align 4
	define amdgpu_kernel void @test_powr(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_powr(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%arrayidx1 = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx1 = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp1 = load float, float addrspace(1)* %arrayidx1, align 4			%tmp1 = load float, float addrspace(1)* %arrayidx1, align 4
	%call = tail call fast float @_Z4powrff(float %tmp, float %tmp1)			%call = call fast float @_Z4powrff(float %tmp, float %tmp1)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pown			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_pown
	; GCN-POSTLINK: tail call fast float @_Z4pownfi(float %tmp, i32 %conv)			; GCN-POSTLINK: call fast float @_Z4pownfi(float %tmp, i32 %conv)
	; GCN-PRELINK: %conv = fptosi float %tmp1 to i32			; GCN-PRELINK: %conv = fptosi float %tmp1 to i32
	; GCN-PRELINK: %__fabs = tail call fast float @_Z4fabsf(float %tmp)			; GCN-PRELINK: %__fabs = call fast float @_Z4fabsf(float %tmp)
	; GCN-PRELINK: %__log2 = tail call fast float @_Z4log2f(float %__fabs)			; GCN-PRELINK: %__log2 = call fast float @_Z4log2f(float %__fabs)
	; GCN-PRELINK: %pownI2F = sitofp i32 %conv to float			; GCN-PRELINK: %pownI2F = sitofp i32 %conv to float
	; GCN-PRELINK: %__ylogx = fmul fast float %__log2, %pownI2F			; GCN-PRELINK: %__ylogx = fmul fast float %__log2, %pownI2F
	; GCN-PRELINK: %__exp2 = tail call fast float @_Z4exp2f(float %__ylogx)			; GCN-PRELINK: %__exp2 = call fast float @_Z4exp2f(float %__ylogx)
	; GCN-PRELINK: %__yeven = shl i32 %conv, 31			; GCN-PRELINK: %__yeven = shl i32 %conv, 31
	; GCN-PRELINK: %[[r0:.*]] = bitcast float %tmp to i32			; GCN-PRELINK: %[[r0:.*]] = bitcast float %tmp to i32
	; GCN-PRELINK: %__pow_sign = and i32 %__yeven, %[[r0]]			; GCN-PRELINK: %__pow_sign = and i32 %__yeven, %[[r0]]
	; GCN-PRELINK: %[[r1:.*]] = bitcast float %__exp2 to i32			; GCN-PRELINK: %[[r1:.*]] = bitcast float %__exp2 to i32
	; GCN-PRELINK: %[[r2:.*]] = or i32 %__pow_sign, %[[r1]]			; GCN-PRELINK: %[[r2:.*]] = or i32 %__pow_sign, %[[r1]]
	; GCN-PRELINK: %[[r3:.*]] = bitcast float addrspace(1)* %a to i32 addrspace(1)*			; GCN-PRELINK: %[[r3:.*]] = bitcast float addrspace(1)* %a to i32 addrspace(1)*
	; GCN-PRELINK: store i32 %[[r2]], i32 addrspace(1)* %[[r3]], align 4			; GCN-PRELINK: store i32 %[[r2]], i32 addrspace(1)* %[[r3]], align 4
	define amdgpu_kernel void @test_pown(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_pown(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%arrayidx1 = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx1 = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp1 = load float, float addrspace(1)* %arrayidx1, align 4			%tmp1 = load float, float addrspace(1)* %arrayidx1, align 4
	%conv = fptosi float %tmp1 to i32			%conv = fptosi float %tmp1 to i32
	%call = tail call fast float @_Z4pownfi(float %tmp, i32 %conv)			%call = call fast float @_Z4pownfi(float %tmp, i32 %conv)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_1			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_1
	; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4			; GCN: %tmp = load float, float addrspace(1)* %arrayidx, align 4
	; GCN: store float %tmp, float addrspace(1)* %a, align 4			; GCN: store float %tmp, float addrspace(1)* %a, align 4
	define amdgpu_kernel void @test_rootn_1(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_rootn_1(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%call = tail call fast float @_Z5rootnfi(float %tmp, i32 1)			%call = call fast float @_Z5rootnfi(float %tmp, i32 1)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z5rootnfi(float, i32)			declare float @_Z5rootnfi(float, i32)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_2			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_2
	; GCN-POSTLINK: tail call fast float @_Z5rootnfi(float %tmp, i32 2)			; GCN-POSTLINK: call fast float @_Z5rootnfi(float %tmp, i32 2)
	; GCN-PRELINK: %__rootn2sqrt = tail call fast float @_Z4sqrtf(float %tmp)			; GCN-PRELINK: %__rootn2sqrt = call fast float @_Z4sqrtf(float %tmp)
	define amdgpu_kernel void @test_rootn_2(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_rootn_2(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z5rootnfi(float %tmp, i32 2)			%call = call fast float @_Z5rootnfi(float %tmp, i32 2)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_3			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_3
	; GCN-POSTLINK: tail call fast float @_Z5rootnfi(float %tmp, i32 3)			; GCN-POSTLINK: call fast float @_Z5rootnfi(float %tmp, i32 3)
	; GCN-PRELINK: %__rootn2cbrt = tail call fast float @_Z4cbrtf(float %tmp)			; GCN-PRELINK: %__rootn2cbrt = call fast float @_Z4cbrtf(float %tmp)
	define amdgpu_kernel void @test_rootn_3(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_rootn_3(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z5rootnfi(float %tmp, i32 3)			%call = call fast float @_Z5rootnfi(float %tmp, i32 3)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_m1			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_m1
	; GCN: fdiv fast float 1.000000e+00, %tmp			; GCN: fdiv fast float 1.000000e+00, %tmp
	define amdgpu_kernel void @test_rootn_m1(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_rootn_m1(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z5rootnfi(float %tmp, i32 -1)			%call = call fast float @_Z5rootnfi(float %tmp, i32 -1)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_m2			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_rootn_m2
	; GCN-POSTLINK: tail call fast float @_Z5rootnfi(float %tmp, i32 -2)			; GCN-POSTLINK: call fast float @_Z5rootnfi(float %tmp, i32 -2)
	; GCN-PRELINK: %__rootn2rsqrt = tail call fast float @_Z5rsqrtf(float %tmp)			; GCN-PRELINK: %__rootn2rsqrt = call fast float @_Z5rsqrtf(float %tmp)
	define amdgpu_kernel void @test_rootn_m2(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_rootn_m2(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z5rootnfi(float %tmp, i32 -2)			%call = call fast float @_Z5rootnfi(float %tmp, i32 -2)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_0x			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_0x
	; GCN: store float %y, float addrspace(1)* %a			; GCN: store float %y, float addrspace(1)* %a
	define amdgpu_kernel void @test_fma_0x(float addrspace(1)* nocapture %a, float %y) {			define amdgpu_kernel void @test_fma_0x(float addrspace(1)* nocapture %a, float %y) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3fmafff(float 0.000000e+00, float %tmp, float %y)			%call = call fast float @_Z3fmafff(float 0.000000e+00, float %tmp, float %y)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z3fmafff(float, float, float)			declare float @_Z3fmafff(float, float, float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_x0			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_x0
	; GCN: store float %y, float addrspace(1)* %a			; GCN: store float %y, float addrspace(1)* %a
	define amdgpu_kernel void @test_fma_x0(float addrspace(1)* nocapture %a, float %y) {			define amdgpu_kernel void @test_fma_x0(float addrspace(1)* nocapture %a, float %y) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3fmafff(float %tmp, float 0.000000e+00, float %y)			%call = call fast float @_Z3fmafff(float %tmp, float 0.000000e+00, float %y)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_mad_0x			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_mad_0x
	; GCN: store float %y, float addrspace(1)* %a			; GCN: store float %y, float addrspace(1)* %a
	define amdgpu_kernel void @test_mad_0x(float addrspace(1)* nocapture %a, float %y) {			define amdgpu_kernel void @test_mad_0x(float addrspace(1)* nocapture %a, float %y) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3madfff(float 0.000000e+00, float %tmp, float %y)			%call = call fast float @_Z3madfff(float 0.000000e+00, float %tmp, float %y)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z3madfff(float, float, float)			declare float @_Z3madfff(float, float, float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_mad_x0			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_mad_x0
	; GCN: store float %y, float addrspace(1)* %a			; GCN: store float %y, float addrspace(1)* %a
	define amdgpu_kernel void @test_mad_x0(float addrspace(1)* nocapture %a, float %y) {			define amdgpu_kernel void @test_mad_x0(float addrspace(1)* nocapture %a, float %y) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3madfff(float %tmp, float 0.000000e+00, float %y)			%call = call fast float @_Z3madfff(float %tmp, float 0.000000e+00, float %y)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_x1y			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_x1y
	; GCN: %fmaadd = fadd fast float %tmp, %y			; GCN: %fmaadd = fadd fast float %tmp, %y
	define amdgpu_kernel void @test_fma_x1y(float addrspace(1)* nocapture %a, float %y) {			define amdgpu_kernel void @test_fma_x1y(float addrspace(1)* nocapture %a, float %y) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3fmafff(float %tmp, float 1.000000e+00, float %y)			%call = call fast float @_Z3fmafff(float %tmp, float 1.000000e+00, float %y)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_1xy			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_1xy
	; GCN: %fmaadd = fadd fast float %tmp, %y			; GCN: %fmaadd = fadd fast float %tmp, %y
	define amdgpu_kernel void @test_fma_1xy(float addrspace(1)* nocapture %a, float %y) {			define amdgpu_kernel void @test_fma_1xy(float addrspace(1)* nocapture %a, float %y) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3fmafff(float 1.000000e+00, float %tmp, float %y)			%call = call fast float @_Z3fmafff(float 1.000000e+00, float %tmp, float %y)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_xy0			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_fma_xy0
	; GCN: %fmamul = fmul fast float %tmp1, %tmp			; GCN: %fmamul = fmul fast float %tmp1, %tmp
	define amdgpu_kernel void @test_fma_xy0(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_fma_xy0(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp = load float, float addrspace(1)* %arrayidx, align 4			%tmp = load float, float addrspace(1)* %arrayidx, align 4
	%tmp1 = load float, float addrspace(1)* %a, align 4			%tmp1 = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3fmafff(float %tmp, float %tmp1, float 0.000000e+00)			%call = call fast float @_Z3fmafff(float %tmp, float %tmp1, float 0.000000e+00)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_exp			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_exp
	; GCN-NATIVE: tail call fast float @_Z10native_expf(float %tmp)			; GCN-NATIVE: call fast float @_Z10native_expf(float %tmp)
	define amdgpu_kernel void @test_use_native_exp(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_exp(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3expf(float %tmp)			%call = call fast float @_Z3expf(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z3expf(float)			declare float @_Z3expf(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_exp2			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_exp2
	; GCN-NATIVE: tail call fast float @_Z11native_exp2f(float %tmp)			; GCN-NATIVE: call fast float @_Z11native_exp2f(float %tmp)
	define amdgpu_kernel void @test_use_native_exp2(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_exp2(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z4exp2f(float %tmp)			%call = call fast float @_Z4exp2f(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z4exp2f(float)			declare float @_Z4exp2f(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_exp10			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_exp10
	; GCN-NATIVE: tail call fast float @_Z12native_exp10f(float %tmp)			; GCN-NATIVE: call fast float @_Z12native_exp10f(float %tmp)
	define amdgpu_kernel void @test_use_native_exp10(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_exp10(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z5exp10f(float %tmp)			%call = call fast float @_Z5exp10f(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z5exp10f(float)			declare float @_Z5exp10f(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_log			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_log
	; GCN-NATIVE: tail call fast float @_Z10native_logf(float %tmp)			; GCN-NATIVE: call fast float @_Z10native_logf(float %tmp)
	define amdgpu_kernel void @test_use_native_log(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_log(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3logf(float %tmp)			%call = call fast float @_Z3logf(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z3logf(float)			declare float @_Z3logf(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_log2			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_log2
	; GCN-NATIVE: tail call fast float @_Z11native_log2f(float %tmp)			; GCN-NATIVE: call fast float @_Z11native_log2f(float %tmp)
	define amdgpu_kernel void @test_use_native_log2(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_log2(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z4log2f(float %tmp)			%call = call fast float @_Z4log2f(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z4log2f(float)			declare float @_Z4log2f(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_log10			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_log10
	; GCN-NATIVE: tail call fast float @_Z12native_log10f(float %tmp)			; GCN-NATIVE: call fast float @_Z12native_log10f(float %tmp)
	define amdgpu_kernel void @test_use_native_log10(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_log10(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z5log10f(float %tmp)			%call = call fast float @_Z5log10f(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z5log10f(float)			declare float @_Z5log10f(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_powr			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_powr
	; GCN-NATIVE: %tmp1 = load float, float addrspace(1)* %arrayidx1, align 4			; GCN-NATIVE: %tmp1 = load float, float addrspace(1)* %arrayidx1, align 4
	; GCN-NATIVE: %__log2 = tail call fast float @_Z11native_log2f(float %tmp)			; GCN-NATIVE: %__log2 = call fast float @_Z11native_log2f(float %tmp)
	; GCN-NATIVE: %__ylogx = fmul fast float %__log2, %tmp1			; GCN-NATIVE: %__ylogx = fmul fast float %__log2, %tmp1
	; GCN-NATIVE: %__exp2 = tail call fast float @_Z11native_exp2f(float %__ylogx)			; GCN-NATIVE: %__exp2 = call fast float @_Z11native_exp2f(float %__ylogx)
	; GCN-NATIVE: store float %__exp2, float addrspace(1)* %a, align 4			; GCN-NATIVE: store float %__exp2, float addrspace(1)* %a, align 4
	define amdgpu_kernel void @test_use_native_powr(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_powr(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%arrayidx1 = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx1 = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp1 = load float, float addrspace(1)* %arrayidx1, align 4			%tmp1 = load float, float addrspace(1)* %arrayidx1, align 4
	%call = tail call fast float @_Z4powrff(float %tmp, float %tmp1)			%call = call fast float @_Z4powrff(float %tmp, float %tmp1)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_sqrt			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_sqrt
	; GCN-NATIVE: tail call fast float @_Z11native_sqrtf(float %tmp)			; GCN-NATIVE: call fast float @_Z11native_sqrtf(float %tmp)
	define amdgpu_kernel void @test_use_native_sqrt(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_sqrt(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z4sqrtf(float %tmp)			%call = call fast float @_Z4sqrtf(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_dont_use_native_sqrt_fast_f64			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_dont_use_native_sqrt_fast_f64
	; GCN: tail call fast double @_Z4sqrtd(double %tmp)			; GCN: call fast double @_Z4sqrtd(double %tmp)
	define amdgpu_kernel void @test_dont_use_native_sqrt_fast_f64(double addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_dont_use_native_sqrt_fast_f64(double addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load double, double addrspace(1)* %a, align 8			%tmp = load double, double addrspace(1)* %a, align 8
	%call = tail call fast double @_Z4sqrtd(double %tmp)			%call = call fast double @_Z4sqrtd(double %tmp)
	store double %call, double addrspace(1)* %a, align 8			store double %call, double addrspace(1)* %a, align 8
	ret void			ret void
	}			}

	declare float @_Z4sqrtf(float)			declare float @_Z4sqrtf(float)
	declare double @_Z4sqrtd(double)			declare double @_Z4sqrtd(double)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_rsqrt			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_rsqrt
	; GCN-NATIVE: tail call fast float @_Z12native_rsqrtf(float %tmp)			; GCN-NATIVE: call fast float @_Z12native_rsqrtf(float %tmp)
	define amdgpu_kernel void @test_use_native_rsqrt(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_rsqrt(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z5rsqrtf(float %tmp)			%call = call fast float @_Z5rsqrtf(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z5rsqrtf(float)			declare float @_Z5rsqrtf(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_tan			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_tan
	; GCN-NATIVE: tail call fast float @_Z10native_tanf(float %tmp)			; GCN-NATIVE: call fast float @_Z10native_tanf(float %tmp)
	define amdgpu_kernel void @test_use_native_tan(float addrspace(1)* nocapture %a) {			define amdgpu_kernel void @test_use_native_tan(float addrspace(1)* nocapture %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%call = tail call fast float @_Z3tanf(float %tmp)			%call = call fast float @_Z3tanf(float %tmp)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z3tanf(float)			declare float @_Z3tanf(float)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_sincos			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_use_native_sincos
	; GCN-NATIVE: tail call float @_Z10native_sinf(float %tmp)			; GCN-NATIVE: call float @_Z10native_sinf(float %tmp)
	; GCN-NATIVE: tail call float @_Z10native_cosf(float %tmp)			; GCN-NATIVE: call float @_Z10native_cosf(float %tmp)
	define amdgpu_kernel void @test_use_native_sincos(float addrspace(1)* %a) {			define amdgpu_kernel void @test_use_native_sincos(float addrspace(1)* %a) {
	entry:			entry:
	%tmp = load float, float addrspace(1)* %a, align 4			%tmp = load float, float addrspace(1)* %a, align 4
	%arrayidx1 = getelementptr inbounds float, float addrspace(1)* %a, i64 1			%arrayidx1 = getelementptr inbounds float, float addrspace(1)* %a, i64 1
	%tmp1 = addrspacecast float addrspace(1)* %arrayidx1 to float*			%tmp1 = addrspacecast float addrspace(1)* %arrayidx1 to float*
	%call = tail call fast float @_Z6sincosfPf(float %tmp, float* %tmp1)			%call = call fast float @_Z6sincosfPf(float %tmp, float* %tmp1)
	store float %call, float addrspace(1)* %a, align 4			store float %call, float addrspace(1)* %a, align 4
	ret void			ret void
	}			}

	declare float @_Z6sincosfPf(float, float*)			declare float @_Z6sincosfPf(float, float*)

	%opencl.pipe_t = type opaque			%opencl.pipe_t = type opaque
	%opencl.reserve_id_t = type opaque			%opencl.reserve_id_t = type opaque

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_read_pipe(%opencl.pipe_t addrspace(1)* %p, i32 addrspace(1)* %ptr)			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_read_pipe(%opencl.pipe_t addrspace(1)* %p, i32 addrspace(1)* %ptr)
	; GCN-PRELINK: call i32 @__read_pipe_2_4(%opencl.pipe_t addrspace(1)* %{{.}}, i32 %{{.*}}) #[[$NOUNWIND:[0-9]+]]			; GCN-PRELINK: call i32 @__read_pipe_2_4(%opencl.pipe_t addrspace(1)* %{{.}}, i32 %{{.*}}) #[[$NOUNWIND:[0-9]+]]
	; GCN-PRELINK: call i32 @__read_pipe_4_4(%opencl.pipe_t addrspace(1)* %{{.}}, %opencl.reserve_id_t addrspace(5) %{{.}}, i32 2, i32 %{{.*}}) #[[$NOUNWIND]]			; GCN-PRELINK: call i32 @__read_pipe_4_4(%opencl.pipe_t addrspace(1)* %{{.}}, %opencl.reserve_id_t addrspace(5) %{{.}}, i32 2, i32 %{{.*}}) #[[$NOUNWIND]]
	define amdgpu_kernel void @test_read_pipe(%opencl.pipe_t addrspace(1)* %p, i32 addrspace(1)* %ptr) local_unnamed_addr {			define amdgpu_kernel void @test_read_pipe(%opencl.pipe_t addrspace(1)* %p, i32 addrspace(1)* %ptr) local_unnamed_addr {
	entry:			entry:
	%tmp = bitcast i32 addrspace(1)* %ptr to i8 addrspace(1)*			%tmp = bitcast i32 addrspace(1)* %ptr to i8 addrspace(1)*
	%tmp1 = addrspacecast i8 addrspace(1)* %tmp to i8*			%tmp1 = addrspacecast i8 addrspace(1)* %tmp to i8*
	%tmp2 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p, i8* %tmp1, i32 4, i32 4) #0			%tmp2 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p, i8* %tmp1, i32 4, i32 4) #0
	%tmp3 = tail call %opencl.reserve_id_t addrspace(5)* @__reserve_read_pipe(%opencl.pipe_t addrspace(1)* %p, i32 2, i32 4, i32 4)			%tmp3 = call %opencl.reserve_id_t addrspace(5)* @__reserve_read_pipe(%opencl.pipe_t addrspace(1)* %p, i32 2, i32 4, i32 4)
	%tmp4 = tail call i32 @__read_pipe_4(%opencl.pipe_t addrspace(1)* %p, %opencl.reserve_id_t addrspace(5)* %tmp3, i32 2, i8* %tmp1, i32 4, i32 4) #0			%tmp4 = call i32 @__read_pipe_4(%opencl.pipe_t addrspace(1)* %p, %opencl.reserve_id_t addrspace(5)* %tmp3, i32 2, i8* %tmp1, i32 4, i32 4) #0
	tail call void @__commit_read_pipe(%opencl.pipe_t addrspace(1)* %p, %opencl.reserve_id_t addrspace(5)* %tmp3, i32 4, i32 4)			call void @__commit_read_pipe(%opencl.pipe_t addrspace(1)* %p, %opencl.reserve_id_t addrspace(5)* %tmp3, i32 4, i32 4)
	ret void			ret void
	}			}

	declare i32 @__read_pipe_2(%opencl.pipe_t addrspace(1), i8, i32, i32)			declare i32 @__read_pipe_2(%opencl.pipe_t addrspace(1), i8, i32, i32)

	declare %opencl.reserve_id_t addrspace(5)* @__reserve_read_pipe(%opencl.pipe_t addrspace(1)*, i32, i32, i32)			declare %opencl.reserve_id_t addrspace(5)* @__reserve_read_pipe(%opencl.pipe_t addrspace(1)*, i32, i32, i32)

	declare i32 @__read_pipe_4(%opencl.pipe_t addrspace(1), %opencl.reserve_id_t addrspace(5), i32, i8*, i32, i32)			declare i32 @__read_pipe_4(%opencl.pipe_t addrspace(1), %opencl.reserve_id_t addrspace(5), i32, i8*, i32, i32)

	declare void @__commit_read_pipe(%opencl.pipe_t addrspace(1), %opencl.reserve_id_t addrspace(5), i32, i32)			declare void @__commit_read_pipe(%opencl.pipe_t addrspace(1), %opencl.reserve_id_t addrspace(5), i32, i32)

	; GCN-LABEL: {{^}}define amdgpu_kernel void @test_write_pipe(%opencl.pipe_t addrspace(1)* %p, i32 addrspace(1)* %ptr)			; GCN-LABEL: {{^}}define amdgpu_kernel void @test_write_pipe(%opencl.pipe_t addrspace(1)* %p, i32 addrspace(1)* %ptr)
	; GCN-PRELINK: call i32 @__write_pipe_2_4(%opencl.pipe_t addrspace(1)* %{{.}}, i32 %{{.*}}) #[[$NOUNWIND]]			; GCN-PRELINK: call i32 @__write_pipe_2_4(%opencl.pipe_t addrspace(1)* %{{.}}, i32 %{{.*}}) #[[$NOUNWIND]]
	; GCN-PRELINK: call i32 @__write_pipe_4_4(%opencl.pipe_t addrspace(1)* %{{.}}, %opencl.reserve_id_t addrspace(5) %{{.}}, i32 2, i32 %{{.*}}) #[[$NOUNWIND]]			; GCN-PRELINK: call i32 @__write_pipe_4_4(%opencl.pipe_t addrspace(1)* %{{.}}, %opencl.reserve_id_t addrspace(5) %{{.}}, i32 2, i32 %{{.*}}) #[[$NOUNWIND]]
	define amdgpu_kernel void @test_write_pipe(%opencl.pipe_t addrspace(1)* %p, i32 addrspace(1)* %ptr) local_unnamed_addr {			define amdgpu_kernel void @test_write_pipe(%opencl.pipe_t addrspace(1)* %p, i32 addrspace(1)* %ptr) local_unnamed_addr {
	entry:			entry:
	%tmp = bitcast i32 addrspace(1)* %ptr to i8 addrspace(1)*			%tmp = bitcast i32 addrspace(1)* %ptr to i8 addrspace(1)*
	%tmp1 = addrspacecast i8 addrspace(1)* %tmp to i8*			%tmp1 = addrspacecast i8 addrspace(1)* %tmp to i8*
	%tmp2 = tail call i32 @__write_pipe_2(%opencl.pipe_t addrspace(1)* %p, i8* %tmp1, i32 4, i32 4) #0			%tmp2 = call i32 @__write_pipe_2(%opencl.pipe_t addrspace(1)* %p, i8* %tmp1, i32 4, i32 4) #0
	%tmp3 = tail call %opencl.reserve_id_t addrspace(5)* @__reserve_write_pipe(%opencl.pipe_t addrspace(1)* %p, i32 2, i32 4, i32 4) #0			%tmp3 = call %opencl.reserve_id_t addrspace(5)* @__reserve_write_pipe(%opencl.pipe_t addrspace(1)* %p, i32 2, i32 4, i32 4) #0
	%tmp4 = tail call i32 @__write_pipe_4(%opencl.pipe_t addrspace(1)* %p, %opencl.reserve_id_t addrspace(5)* %tmp3, i32 2, i8* %tmp1, i32 4, i32 4) #0			%tmp4 = call i32 @__write_pipe_4(%opencl.pipe_t addrspace(1)* %p, %opencl.reserve_id_t addrspace(5)* %tmp3, i32 2, i8* %tmp1, i32 4, i32 4) #0
	tail call void @__commit_write_pipe(%opencl.pipe_t addrspace(1)* %p, %opencl.reserve_id_t addrspace(5)* %tmp3, i32 4, i32 4) #0			call void @__commit_write_pipe(%opencl.pipe_t addrspace(1)* %p, %opencl.reserve_id_t addrspace(5)* %tmp3, i32 4, i32 4) #0
	ret void			ret void
	}			}

	declare i32 @__write_pipe_2(%opencl.pipe_t addrspace(1), i8, i32, i32) local_unnamed_addr			declare i32 @__write_pipe_2(%opencl.pipe_t addrspace(1), i8, i32, i32) local_unnamed_addr

	declare %opencl.reserve_id_t addrspace(5)* @__reserve_write_pipe(%opencl.pipe_t addrspace(1)*, i32, i32, i32) local_unnamed_addr			declare %opencl.reserve_id_t addrspace(5)* @__reserve_write_pipe(%opencl.pipe_t addrspace(1)*, i32, i32, i32) local_unnamed_addr

	declare i32 @__write_pipe_4(%opencl.pipe_t addrspace(1), %opencl.reserve_id_t addrspace(5), i32, i8*, i32, i32) local_unnamed_addr			declare i32 @__write_pipe_4(%opencl.pipe_t addrspace(1), %opencl.reserve_id_t addrspace(5), i32, i8*, i32, i32) local_unnamed_addr
	; GCN-PRELINK: call i32 @__read_pipe_2_16(%opencl.pipe_t addrspace(1)* %{{.}}, <2 x i64> %{{.*}}) #[[$NOUNWIND]]			; GCN-PRELINK: call i32 @__read_pipe_2_16(%opencl.pipe_t addrspace(1)* %{{.}}, <2 x i64> %{{.*}}) #[[$NOUNWIND]]
	; GCN-PRELINK: call i32 @__read_pipe_2_32(%opencl.pipe_t addrspace(1)* %{{.}}, <4 x i64> %{{.*}} #[[$NOUNWIND]]			; GCN-PRELINK: call i32 @__read_pipe_2_32(%opencl.pipe_t addrspace(1)* %{{.}}, <4 x i64> %{{.*}} #[[$NOUNWIND]]
	; GCN-PRELINK: call i32 @__read_pipe_2_64(%opencl.pipe_t addrspace(1)* %{{.}}, <8 x i64> %{{.*}} #[[$NOUNWIND]]			; GCN-PRELINK: call i32 @__read_pipe_2_64(%opencl.pipe_t addrspace(1)* %{{.}}, <8 x i64> %{{.*}} #[[$NOUNWIND]]
	; GCN-PRELINK: call i32 @__read_pipe_2_128(%opencl.pipe_t addrspace(1)* %{{.}}, <16 x i64> %{{.*}} #[[$NOUNWIND]]			; GCN-PRELINK: call i32 @__read_pipe_2_128(%opencl.pipe_t addrspace(1)* %{{.}}, <16 x i64> %{{.*}} #[[$NOUNWIND]]
	; GCN-PRELINK: call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %{{.}}, i8 %{{.*}} i32 400, i32 4) #[[$NOUNWIND]]			; GCN-PRELINK: call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %{{.}}, i8 %{{.*}} i32 400, i32 4) #[[$NOUNWIND]]
	define amdgpu_kernel void @test_pipe_size(%opencl.pipe_t addrspace(1)* %p1, i8 addrspace(1)* %ptr1, %opencl.pipe_t addrspace(1)* %p2, i16 addrspace(1)* %ptr2, %opencl.pipe_t addrspace(1)* %p4, i32 addrspace(1)* %ptr4, %opencl.pipe_t addrspace(1)* %p8, i64 addrspace(1)* %ptr8, %opencl.pipe_t addrspace(1)* %p16, <2 x i64> addrspace(1)* %ptr16, %opencl.pipe_t addrspace(1)* %p32, <4 x i64> addrspace(1)* %ptr32, %opencl.pipe_t addrspace(1)* %p64, <8 x i64> addrspace(1)* %ptr64, %opencl.pipe_t addrspace(1)* %p128, <16 x i64> addrspace(1)* %ptr128, %opencl.pipe_t addrspace(1)* %pu, %struct.S addrspace(1)* %ptru) local_unnamed_addr #0 {			define amdgpu_kernel void @test_pipe_size(%opencl.pipe_t addrspace(1)* %p1, i8 addrspace(1)* %ptr1, %opencl.pipe_t addrspace(1)* %p2, i16 addrspace(1)* %ptr2, %opencl.pipe_t addrspace(1)* %p4, i32 addrspace(1)* %ptr4, %opencl.pipe_t addrspace(1)* %p8, i64 addrspace(1)* %ptr8, %opencl.pipe_t addrspace(1)* %p16, <2 x i64> addrspace(1)* %ptr16, %opencl.pipe_t addrspace(1)* %p32, <4 x i64> addrspace(1)* %ptr32, %opencl.pipe_t addrspace(1)* %p64, <8 x i64> addrspace(1)* %ptr64, %opencl.pipe_t addrspace(1)* %p128, <16 x i64> addrspace(1)* %ptr128, %opencl.pipe_t addrspace(1)* %pu, %struct.S addrspace(1)* %ptru) local_unnamed_addr #0 {
	entry:			entry:
	%tmp = addrspacecast i8 addrspace(1)* %ptr1 to i8*			%tmp = addrspacecast i8 addrspace(1)* %ptr1 to i8*
	%tmp1 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p1, i8* %tmp, i32 1, i32 1) #0			%tmp1 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p1, i8* %tmp, i32 1, i32 1) #0
	%tmp2 = bitcast i16 addrspace(1)* %ptr2 to i8 addrspace(1)*			%tmp2 = bitcast i16 addrspace(1)* %ptr2 to i8 addrspace(1)*
	%tmp3 = addrspacecast i8 addrspace(1)* %tmp2 to i8*			%tmp3 = addrspacecast i8 addrspace(1)* %tmp2 to i8*
	%tmp4 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p2, i8* %tmp3, i32 2, i32 2) #0			%tmp4 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p2, i8* %tmp3, i32 2, i32 2) #0
	%tmp5 = bitcast i32 addrspace(1)* %ptr4 to i8 addrspace(1)*			%tmp5 = bitcast i32 addrspace(1)* %ptr4 to i8 addrspace(1)*
	%tmp6 = addrspacecast i8 addrspace(1)* %tmp5 to i8*			%tmp6 = addrspacecast i8 addrspace(1)* %tmp5 to i8*
	%tmp7 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p4, i8* %tmp6, i32 4, i32 4) #0			%tmp7 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p4, i8* %tmp6, i32 4, i32 4) #0
	%tmp8 = bitcast i64 addrspace(1)* %ptr8 to i8 addrspace(1)*			%tmp8 = bitcast i64 addrspace(1)* %ptr8 to i8 addrspace(1)*
	%tmp9 = addrspacecast i8 addrspace(1)* %tmp8 to i8*			%tmp9 = addrspacecast i8 addrspace(1)* %tmp8 to i8*
	%tmp10 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p8, i8* %tmp9, i32 8, i32 8) #0			%tmp10 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p8, i8* %tmp9, i32 8, i32 8) #0
	%tmp11 = bitcast <2 x i64> addrspace(1)* %ptr16 to i8 addrspace(1)*			%tmp11 = bitcast <2 x i64> addrspace(1)* %ptr16 to i8 addrspace(1)*
	%tmp12 = addrspacecast i8 addrspace(1)* %tmp11 to i8*			%tmp12 = addrspacecast i8 addrspace(1)* %tmp11 to i8*
	%tmp13 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p16, i8* %tmp12, i32 16, i32 16) #0			%tmp13 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p16, i8* %tmp12, i32 16, i32 16) #0
	%tmp14 = bitcast <4 x i64> addrspace(1)* %ptr32 to i8 addrspace(1)*			%tmp14 = bitcast <4 x i64> addrspace(1)* %ptr32 to i8 addrspace(1)*
	%tmp15 = addrspacecast i8 addrspace(1)* %tmp14 to i8*			%tmp15 = addrspacecast i8 addrspace(1)* %tmp14 to i8*
	%tmp16 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p32, i8* %tmp15, i32 32, i32 32) #0			%tmp16 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p32, i8* %tmp15, i32 32, i32 32) #0
	%tmp17 = bitcast <8 x i64> addrspace(1)* %ptr64 to i8 addrspace(1)*			%tmp17 = bitcast <8 x i64> addrspace(1)* %ptr64 to i8 addrspace(1)*
	%tmp18 = addrspacecast i8 addrspace(1)* %tmp17 to i8*			%tmp18 = addrspacecast i8 addrspace(1)* %tmp17 to i8*
	%tmp19 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p64, i8* %tmp18, i32 64, i32 64) #0			%tmp19 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p64, i8* %tmp18, i32 64, i32 64) #0
	%tmp20 = bitcast <16 x i64> addrspace(1)* %ptr128 to i8 addrspace(1)*			%tmp20 = bitcast <16 x i64> addrspace(1)* %ptr128 to i8 addrspace(1)*
	%tmp21 = addrspacecast i8 addrspace(1)* %tmp20 to i8*			%tmp21 = addrspacecast i8 addrspace(1)* %tmp20 to i8*
	%tmp22 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p128, i8* %tmp21, i32 128, i32 128) #0			%tmp22 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %p128, i8* %tmp21, i32 128, i32 128) #0
	%tmp23 = bitcast %struct.S addrspace(1)* %ptru to i8 addrspace(1)*			%tmp23 = bitcast %struct.S addrspace(1)* %ptru to i8 addrspace(1)*
	%tmp24 = addrspacecast i8 addrspace(1)* %tmp23 to i8*			%tmp24 = addrspacecast i8 addrspace(1)* %tmp23 to i8*
	%tmp25 = tail call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %pu, i8* %tmp24, i32 400, i32 4) #0			%tmp25 = call i32 @__read_pipe_2(%opencl.pipe_t addrspace(1)* %pu, i8* %tmp24, i32 400, i32 4) #0
	ret void			ret void
	}			}

	; GCN-PRELINK: declare float @_Z4fabsf(float) local_unnamed_addr #[[$NOUNWIND_READONLY:[0-9]+]]			; GCN-PRELINK: declare float @_Z4fabsf(float) local_unnamed_addr #[[$NOUNWIND_READONLY:[0-9]+]]
	; GCN-PRELINK: declare float @_Z4cbrtf(float) local_unnamed_addr #[[$NOUNWIND_READONLY]]			; GCN-PRELINK: declare float @_Z4cbrtf(float) local_unnamed_addr #[[$NOUNWIND_READONLY]]
	; GCN-PRELINK: declare float @_Z11native_sqrtf(float) local_unnamed_addr #[[$NOUNWIND_READONLY]]			; GCN-PRELINK: declare float @_Z11native_sqrtf(float) local_unnamed_addr #[[$NOUNWIND_READONLY]]

	; CGN-PRELINK: attributes #[[$NOUNWIND]] = { nounwind }			; CGN-PRELINK: attributes #[[$NOUNWIND]] = { nounwind }
	; GCN-PRELINK: attributes #[[$NOUNWIND_READONLY]] = { nounwind readonly }			; GCN-PRELINK: attributes #[[$NOUNWIND_READONLY]] = { nounwind readonly }
	attributes #0 = { nounwind }			attributes #0 = { nounwind }

llvm/test/Feature/optnone-opt.ll

	attributes #0 = { optnone noinline }			attributes #0 = { optnone noinline }

	; Nothing that runs at -O0 gets skipped.			; Nothing that runs at -O0 gets skipped.
	; OPT-O0-NOT: Skipping pass			; OPT-O0-NOT: Skipping pass

	; IR passes run at -O1 and higher.			; IR passes run at -O1 and higher.
	; OPT-O1-DAG: Skipping pass 'Aggressive Dead Code Elimination'			; OPT-O1-DAG: Skipping pass 'Aggressive Dead Code Elimination'
	; OPT-O1-DAG: Skipping pass 'Combine redundant instructions'			; OPT-O1-DAG: Skipping pass 'Combine redundant instructions'
	; OPT-O1-DAG: Skipping pass 'Dead Store Elimination'
	; OPT-O1-DAG: Skipping pass 'Early CSE'			; OPT-O1-DAG: Skipping pass 'Early CSE'
	; OPT-O1-DAG: Skipping pass 'Jump Threading'
	; OPT-O1-DAG: Skipping pass 'MemCpy Optimization'
	; OPT-O1-DAG: Skipping pass 'Reassociate expressions'			; OPT-O1-DAG: Skipping pass 'Reassociate expressions'
	; OPT-O1-DAG: Skipping pass 'Simplify the CFG'			; OPT-O1-DAG: Skipping pass 'Simplify the CFG'
	; OPT-O1-DAG: Skipping pass 'Sparse Conditional Constant Propagation'			; OPT-O1-DAG: Skipping pass 'Sparse Conditional Constant Propagation'
	; OPT-O1-DAG: Skipping pass 'SROA'
	; OPT-O1-DAG: Skipping pass 'Tail Call Elimination'
	; OPT-O1-DAG: Skipping pass 'Value Propagation'

	; Additional IR passes run at -O2 and higher.			; Additional IR passes run at -O2 and higher.
	; OPT-O2O3-DAG: Skipping pass 'Global Value Numbering'			; OPT-O2O3-DAG: Skipping pass 'Global Value Numbering'
	; OPT-O2O3-DAG: Skipping pass 'SLP Vectorizer'			; OPT-O2O3-DAG: Skipping pass 'SLP Vectorizer'

	; Additional IR passes that opt doesn't turn on by default.			; Additional IR passes that opt doesn't turn on by default.
	; OPT-MORE-DAG: Skipping pass 'Dead Code Elimination'			; OPT-MORE-DAG: Skipping pass 'Dead Code Elimination'
	; OPT-MORE-DAG: Skipping pass 'Dead Instruction Elimination'			; OPT-MORE-DAG: Skipping pass 'Dead Instruction Elimination'

llvm/test/Other/new-pm-defaults.ll

	; The IR below was crafted so as:			; The IR below was crafted so as:
	; 1) To have a loop, so we create a loop pass manager			; 1) To have a loop, so we create a loop pass manager
	; 2) To be "immutable" in the sense that no pass in the standard			; 2) To be "immutable" in the sense that no pass in the standard
	; pipeline will modify it.			; pipeline will modify it.
	; Since no transformations take place, we don't expect any analyses			; Since no transformations take place, we don't expect any analyses
	; to be invalidated.			; to be invalidated.
	; Any invalidation that shows up here is a bug, unless we started modifying			; Any invalidation that shows up here is a bug, unless we started modifying
	; the IR, in which case we need to make it immutable harder.			; the IR, in which case we need to make it immutable harder.

	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='default<O1>' -S %s 2>&1 \			; RUN: -passes='default<O1>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O1			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O1
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='default<O2>' -S %s 2>&1 \			; RUN: -passes='default<O2>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O2			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O2 \
				; RUN: --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
				; RUN: --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='default<Os>' -S %s 2>&1 \			; RUN: -passes='default<Os>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-Os			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-Os \
				; RUN: --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='default<Oz>' -S %s 2>&1 \			; RUN: -passes='default<Oz>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-Oz			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-Oz \
				; RUN: --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='lto-pre-link<O2>' -S %s 2>&1 \			; RUN: -passes='lto-pre-link<O2>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O2 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O2 \
	; RUN: --check-prefix=CHECK-O2-LTO			; RUN: --check-prefix=CHECK-O2-LTO --check-prefix=CHECK-O23SZ

	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-peephole='no-op-function' \			; RUN: -passes-ep-peephole='no-op-function' \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-PEEPHOLE			; RUN: --check-prefix=CHECK-EP-PEEPHOLE --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-late-loop-optimizations='no-op-loop' \			; RUN: -passes-ep-late-loop-optimizations='no-op-loop' \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-LOOP-LATE			; RUN: --check-prefix=CHECK-EP-LOOP-LATE --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-loop-optimizer-end='no-op-loop' \			; RUN: -passes-ep-loop-optimizer-end='no-op-loop' \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-LOOP-END			; RUN: --check-prefix=CHECK-EP-LOOP-END --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-scalar-optimizer-late='no-op-function' \			; RUN: -passes-ep-scalar-optimizer-late='no-op-function' \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-SCALAR-LATE			; RUN: --check-prefix=CHECK-EP-SCALAR-LATE --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-cgscc-optimizer-late='no-op-cgscc' \			; RUN: -passes-ep-cgscc-optimizer-late='no-op-cgscc' \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-CGSCC-LATE			; RUN: --check-prefix=CHECK-EP-CGSCC-LATE --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-vectorizer-start='no-op-function' \			; RUN: -passes-ep-vectorizer-start='no-op-function' \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-VECTORIZER-START			; RUN: --check-prefix=CHECK-EP-VECTORIZER-START --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-pipeline-start='no-op-module' \			; RUN: -passes-ep-pipeline-start='no-op-module' \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-PIPELINE-START			; RUN: --check-prefix=CHECK-EP-PIPELINE-START --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-pipeline-start='no-op-module' \			; RUN: -passes-ep-pipeline-start='no-op-module' \
	; RUN: -passes='lto-pre-link<O3>' -S %s 2>&1 \			; RUN: -passes='lto-pre-link<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-PIPELINE-START			; RUN: --check-prefix=CHECK-EP-PIPELINE-START --check-prefix=CHECK-O23SZ
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes-ep-optimizer-last='no-op-function' \			; RUN: -passes-ep-optimizer-last='no-op-function' \
	; RUN: -passes='default<O3>' -S %s 2>&1 \			; RUN: -passes='default<O3>' -S %s 2>&1 \
	; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \			; RUN: \| FileCheck %s --check-prefix=CHECK-O --check-prefix=CHECK-O3 \
	; RUN: --check-prefix=CHECK-EP-OPTIMIZER-LAST			; RUN: --check-prefix=CHECK-EP-OPTIMIZER-LAST --check-prefix=CHECK-O23SZ
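
Note on the prefix change: on the right-hand side, every RUN line except the `default<O1>` one gains `--check-prefix=CHECK-O23SZ`. Expectations moved from `CHECK-O` to `CHECK-O23SZ` are therefore no longer checked for the O1 pipeline. The sketch below is a deliberately simplified model of that prefix filtering, not FileCheck's implementation: the helper name `active_checks` and the sample list are hypothetical, and ordering semantics such as `-NEXT` and `-DAG` are only recorded, not enforced.

```python
import re

# Simplified model (an assumption, not FileCheck itself): a directive looks
# like "; PREFIX[-SUFFIX]: pattern". Only prefix filtering is modelled here.
DIRECTIVE_RE = re.compile(
    r";\s*([A-Za-z0-9_-]+?)(?:-(NEXT|DAG|NOT|LABEL|SAME))?:\s*(.*)"
)

def active_checks(lines, prefixes):
    """Return (prefix, kind, pattern) for directives whose prefix is enabled."""
    selected = []
    for line in lines:
        m = DIRECTIVE_RE.match(line.strip())
        if m and m.group(1) in prefixes:
            selected.append((m.group(1), m.group(2) or "PLAIN", m.group(3)))
    return selected

# Sample directives copied from the right-hand side of this diff.
SAMPLE = [
    "; CHECK-O-NEXT: Running pass: EarlyCSEPass",
    "; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass",
    "; CHECK-O1-NEXT: Running pass: LibCallsShrinkWrapPass",
    "; CHECK-O2-NEXT: Running pass: LibCallsShrinkWrapPass",
]

# Prefix sets taken from the O1 and O2 RUN lines after the change.
o1_checks = active_checks(SAMPLE, {"CHECK-O", "CHECK-O1"})
o2_checks = active_checks(SAMPLE, {"CHECK-O", "CHECK-O2", "CHECK-O23SZ"})

for name, checks in (("O1", o1_checks), ("O2", o2_checks)):
    print(name, [pattern for _, _, pattern in checks])
```

With the O1 prefix set, the `JumpThreadingPass` expectation is filtered out, whereas the O2 prefix set keeps it. This is the effect the prefix split is meant to have.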

	; CHECK-O: Running analysis: PassInstrumentationAnalysis			; CHECK-O: Running analysis: PassInstrumentationAnalysis
	; CHECK-O-NEXT: Starting llvm::Module pass manager run.			; CHECK-O-NEXT: Starting llvm::Module pass manager run.
	; CHECK-O-NEXT: Running pass: PassManager<{{.}}Module{{.}}>			; CHECK-O-NEXT: Running pass: PassManager<{{.}}Module{{.}}>
	; CHECK-O-NEXT: Starting llvm::Module pass manager run.			; CHECK-O-NEXT: Starting llvm::Module pass manager run.
	; CHECK-O-NEXT: Running pass: ForceFunctionAttrsPass			; CHECK-O-NEXT: Running pass: ForceFunctionAttrsPass
	; CHECK-EP-PIPELINE-START-NEXT: Running pass: NoOpModulePass			; CHECK-EP-PIPELINE-START-NEXT: Running pass: NoOpModulePass
	; CHECK-O-NEXT: Running pass: PassManager<{{.}}Module{{.}}>			; CHECK-O-NEXT: Running pass: PassManager<{{.}}Module{{.}}>
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass			; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass
	; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass			; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass
	; CHECK-O-NEXT: Running pass: CGSCCToFunctionPassAdaptor<{{.}}PassManager{{.}}>			; CHECK-O-NEXT: Running pass: CGSCCToFunctionPassAdaptor<{{.}}PassManager{{.}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run.			; CHECK-O-NEXT: Starting llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: SROA			; CHECK-O-NEXT: Running pass: SROA
	; CHECK-O-NEXT: Running pass: EarlyCSEPass			; CHECK-O-NEXT: Running pass: EarlyCSEPass
	; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis			; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
	; CHECK-O-NEXT: Running pass: SpeculativeExecutionPass			; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
	; CHECK-O-NEXT: Running pass: JumpThreadingPass			; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
	; CHECK-O-NEXT: Running analysis: LazyValueAnalysis			; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
	; CHECK-O-NEXT: Running pass: CorrelatedValuePropagationPass			; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O3-NEXT: AggressiveInstCombinePass			; CHECK-O3-NEXT: AggressiveInstCombinePass
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-O1-NEXT: Running pass: LibCallsShrinkWrapPass			; CHECK-O1-NEXT: Running pass: LibCallsShrinkWrapPass
	; CHECK-O2-NEXT: Running pass: LibCallsShrinkWrapPass			; CHECK-O2-NEXT: Running pass: LibCallsShrinkWrapPass
	; CHECK-O3-NEXT: Running pass: LibCallsShrinkWrapPass			; CHECK-O3-NEXT: Running pass: LibCallsShrinkWrapPass
	; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass			; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass
	; CHECK-O-NEXT: Running pass: TailCallElimPass			; CHECK-O23SZ-NEXT: Running pass: TailCallElimPass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Running pass: ReassociatePass			; CHECK-O-NEXT: Running pass: ReassociatePass
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis
	; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.}}LoopStandardAnalysisResults{{.}}>			; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.}}LoopStandardAnalysisResults{{.}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run.			; CHECK-O-NEXT: Starting llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: LoopSimplifyPass			; CHECK-O-NEXT: Running pass: LoopSimplifyPass
	; CHECK-O-NEXT: Running analysis: LoopAnalysis			; CHECK-O-NEXT: Running analysis: LoopAnalysis
	; CHECK-O-NEXT: Running pass: LCSSAPass			; CHECK-O-NEXT: Running pass: LCSSAPass
	; CHECK-O-NEXT: Running pass: IndVarSimplifyPass			; CHECK-O-NEXT: Running pass: IndVarSimplifyPass
	; CHECK-O-NEXT: Running pass: LoopIdiomRecognizePass			; CHECK-O-NEXT: Running pass: LoopIdiomRecognizePass
	; CHECK-EP-LOOP-LATE-NEXT: Running pass: NoOpLoopPass			; CHECK-EP-LOOP-LATE-NEXT: Running pass: NoOpLoopPass
	; CHECK-O-NEXT: Running pass: LoopDeletionPass			; CHECK-O-NEXT: Running pass: LoopDeletionPass
	; CHECK-O-NEXT: Running pass: LoopFullUnrollPass			; CHECK-O-NEXT: Running pass: LoopFullUnrollPass
	; CHECK-EP-LOOP-END-NEXT: Running pass: NoOpLoopPass			; CHECK-EP-LOOP-END-NEXT: Running pass: NoOpLoopPass
	; CHECK-O-NEXT: Finished Loop pass manager run.			; CHECK-O-NEXT: Finished Loop pass manager run.
	; CHECK-O-NEXT: Running pass: SROA on foo			; CHECK-O-NEXT: Running pass: SROA on foo
	; CHECK-Os-NEXT: Running pass: MergedLoadStoreMotionPass			; CHECK-O23SZ-NEXT: Running pass: MergedLoadStoreMotionPass
	; CHECK-Os-NEXT: Running pass: GVN			; CHECK-O23SZ-NEXT: Running pass: GVN
	; CHECK-Os-NEXT: Running analysis: MemoryDependenceAnalysis			; CHECK-O23SZ-NEXT: Running analysis: MemoryDependenceAnalysis
	; CHECK-Os-NEXT: Running analysis: PhiValuesAnalysis			; CHECK-O23SZ-NEXT: Running analysis: PhiValuesAnalysis
	; CHECK-Oz-NEXT: Running pass: MergedLoadStoreMotionPass
	; CHECK-Oz-NEXT: Running pass: GVN
	; CHECK-Oz-NEXT: Running analysis: MemoryDependenceAnalysis
	; CHECK-Oz-NEXT: Running analysis: PhiValuesAnalysis
	; CHECK-O2-NEXT: Running pass: MergedLoadStoreMotionPass
	; CHECK-O2-NEXT: Running pass: GVN
	; CHECK-O2-NEXT: Running analysis: MemoryDependenceAnalysis
	; CHECK-O2-NEXT: Running analysis: PhiValuesAnalysis
	; CHECK-O3-NEXT: Running pass: MergedLoadStoreMotionPass
	; CHECK-O3-NEXT: Running pass: GVN
	; CHECK-O3-NEXT: Running analysis: MemoryDependenceAnalysis
	; CHECK-O3-NEXT: Running analysis: PhiValuesAnalysis
	; CHECK-O-NEXT: Running pass: MemCpyOptPass			; CHECK-O-NEXT: Running pass: MemCpyOptPass
	; CHECK-O1-NEXT: Running analysis: MemoryDependenceAnalysis			; CHECK-O1-NEXT: Running analysis: MemoryDependenceAnalysis
	; CHECK-O1-NEXT: Running analysis: PhiValuesAnalysis			; CHECK-O1-NEXT: Running analysis: PhiValuesAnalysis
	; CHECK-O-NEXT: Running pass: SCCPPass			; CHECK-O-NEXT: Running pass: SCCPPass
	; CHECK-O-NEXT: Running pass: BDCEPass			; CHECK-O-NEXT: Running pass: BDCEPass
	; CHECK-O-NEXT: Running analysis: DemandedBitsAnalysis			; CHECK-O-NEXT: Running analysis: DemandedBitsAnalysis
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass			; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass
	; CHECK-O-NEXT: Running pass: JumpThreadingPass			; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
	; CHECK-O-NEXT: Running pass: CorrelatedValuePropagationPass			; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
	; CHECK-O-NEXT: Running pass: DSEPass			; CHECK-O23SZ-NEXT: Running pass: DSEPass
	; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>			; CHECK-O23SZ-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run.			; CHECK-O23SZ-NEXT: Starting llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: LoopSimplifyPass			; CHECK-O23SZ-NEXT: Running pass: LoopSimplifyPass
	; CHECK-O-NEXT: Running pass: LCSSAPass			; CHECK-O23SZ-NEXT: Running pass: LCSSAPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O23SZ-NEXT: Finished llvm::Function pass manager run.
	; CHECK-EP-SCALAR-LATE-NEXT: Running pass: NoOpFunctionPass			; CHECK-EP-SCALAR-LATE-NEXT: Running pass: NoOpFunctionPass
	; CHECK-O-NEXT: Running pass: ADCEPass			; CHECK-O-NEXT: Running pass: ADCEPass
	; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis			; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass			; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-EP-CGSCC-LATE-NEXT: Running pass: NoOpCGSCCPass			; CHECK-EP-CGSCC-LATE-NEXT: Running pass: NoOpCGSCCPass
	; CHECK-O-NEXT: Finished CGSCC pass manager run.			; CHECK-O-NEXT: Finished CGSCC pass manager run.
	; CHECK-O-NEXT: Finished llvm::Module pass manager run.			; CHECK-O-NEXT: Finished llvm::Module pass manager run.
	; CHECK-O-NEXT: Running pass: PassManager<{{.*}}Module{{.*}}>			; CHECK-O-NEXT: Running pass: PassManager<{{.*}}Module{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Module pass manager run.			; CHECK-O-NEXT: Starting llvm::Module pass manager run.
	; CHECK-O-NEXT: Running pass: GlobalOptPass			; CHECK-O-NEXT: Running pass: GlobalOptPass
	; CHECK-O-NEXT: Running pass: GlobalDCEPass			; CHECK-O-NEXT: Running pass: GlobalDCEPass
	; CHECK-O2-LTO-NOT: Running pass: EliminateAvailableExternallyPass			; CHECK-O2-LTO-NOT: Running pass: EliminateAvailableExternallyPass
	; CHECK-O: Running pass: ReversePostOrderFunctionAttrsPass			; CHECK-O: Running pass: ReversePostOrderFunctionAttrsPass
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
	; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>			; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run.			; CHECK-O-NEXT: Starting llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: Float2IntPass			; CHECK-O-NEXT: Running pass: Float2IntPass
	; CHECK-O-NEXT: Running pass: LowerConstantIntrinsicsPass on foo			; CHECK-O-NEXT: Running pass: LowerConstantIntrinsicsPass on foo
	; CHECK-EP-VECTORIZER-START-NEXT: Running pass: NoOpFunctionPass			; CHECK-EP-VECTORIZER-START-NEXT: Running pass: NoOpFunctionPass
	; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LoopRotatePass			; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LoopRotatePass
	; CHECK-O-NEXT: Starting llvm::Function pass manager run.			; CHECK-O-NEXT: Starting llvm::Function pass manager run.
				mehdi_amini: Just a drive-by idea: you could have simplified the changes here with a new prefix: "CHECK-O23sz" that you could add to the non-O1 invocations.
				chandlerc: Good idea!
	; CHECK-O-NEXT: Running pass: LoopSimplifyPass			; CHECK-O-NEXT: Running pass: LoopSimplifyPass
	; CHECK-O-NEXT: Running pass: LCSSAPass			; CHECK-O-NEXT: Running pass: LCSSAPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: LoopDistributePass			; CHECK-O-NEXT: Running pass: LoopDistributePass
	; CHECK-O-NEXT: Running pass: LoopVectorizePass			; CHECK-O-NEXT: Running pass: LoopVectorizePass
	; CHECK-O-NEXT: Running analysis: BlockFrequencyAnalysis			; CHECK-O-NEXT: Running analysis: BlockFrequencyAnalysis
	; CHECK-O-NEXT: Running analysis: BranchProbabilityAnalysis			; CHECK-O-NEXT: Running analysis: BranchProbabilityAnalysis
	; CHECK-O-NEXT: Running pass: LoopLoadEliminationPass			; CHECK-O-NEXT: Running pass: LoopLoadEliminationPass

llvm/test/Other/new-pm-thinlto-defaults.ll

	; The IR below was crafted so as:			; The IR below was crafted so as:
	; 1) To have a loop, so we create a loop pass manager			; 1) To have a loop, so we create a loop pass manager
	; 2) To be "immutable" in the sense that no pass in the standard			; 2) To be "immutable" in the sense that no pass in the standard
	; pipeline will modify it.			; pipeline will modify it.
	; Since no transformations take place, we don't expect any analyses			; Since no transformations take place, we don't expect any analyses
	; to be invalidated.			; to be invalidated.
	; Any invalidation that shows up here is a bug, unless we started modifying			; Any invalidation that shows up here is a bug, unless we started modifying
	; the IR, in which case we need to make it immutable harder.			; the IR, in which case we need to make it immutable harder.
	;			;
	; Prelink pipelines:			; Prelink pipelines:
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto-pre-link<O1>,name-anon-globals' -S %s 2>&1 \			; RUN: -passes='thinlto-pre-link<O1>,name-anon-globals' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O1,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-O1			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O1,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-O1
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto-pre-link<O2>,name-anon-globals' -S %s 2>&1 \			; RUN: -passes='thinlto-pre-link<O2>,name-anon-globals' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-O2			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-O2
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto-pre-link<O3>,name-anon-globals' -S -passes-ep-pipeline-start='no-op-module' %s 2>&1 \			; RUN: -passes='thinlto-pre-link<O3>,name-anon-globals' -S -passes-ep-pipeline-start='no-op-module' %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O3,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-O3,CHECK-EP-PIPELINE-START			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O3,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-O3,CHECK-EP-PIPELINE-START
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto-pre-link<Os>,name-anon-globals' -S %s 2>&1 \			; RUN: -passes='thinlto-pre-link<Os>,name-anon-globals' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Os,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-Os			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Os,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-Os
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto-pre-link<Oz>,name-anon-globals' -S %s 2>&1 \			; RUN: -passes='thinlto-pre-link<Oz>,name-anon-globals' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Oz,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-Oz			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Oz,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-Oz
	; RUN: opt -disable-verify -debug-pass-manager -new-pm-debug-info-for-profiling \			; RUN: opt -disable-verify -debug-pass-manager -new-pm-debug-info-for-profiling \
	; RUN: -passes='thinlto-pre-link<O2>,name-anon-globals' -S %s 2>&1 \			; RUN: -passes='thinlto-pre-link<O2>,name-anon-globals' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-DIS,CHECK-O,CHECK-O2,CHECK-PRELINK-O,CHECK-PRELINK-O2			; RUN: | FileCheck %s --check-prefixes=CHECK-DIS,CHECK-O,CHECK-O2,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O2
	;			;
	; Postlink pipelines:			; Postlink pipelines:
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto<O1>' -S %s 2>&1 \			; RUN: -passes='thinlto<O1>' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O1,CHECK-POSTLINK-O,CHECK-POSTLINK-O1			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O1,CHECK-POSTLINK-O,CHECK-POSTLINK-O1
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto<O2>' -S %s 2>&1 \			; RUN: -passes='thinlto<O2>' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-POSTLINK-O,CHECK-POSTLINK-O2			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-O23SZ,CHECK-POSTLINK-O,CHECK-POSTLINK-O2
	; RUN: opt -disable-verify -debug-pass-manager -passes-ep-pipeline-start='no-op-module' \			; RUN: opt -disable-verify -debug-pass-manager -passes-ep-pipeline-start='no-op-module' \
	; RUN: -passes='thinlto<O3>' -S %s 2>&1 \			; RUN: -passes='thinlto<O3>' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O3,CHECK-POSTLINK-O,CHECK-POSTLINK-O3			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O3,CHECK-O23SZ,CHECK-POSTLINK-O,CHECK-POSTLINK-O3
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto<Os>' -S %s 2>&1 \			; RUN: -passes='thinlto<Os>' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Os,CHECK-POSTLINK-O,CHECK-POSTLINK-Os			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Os,CHECK-O23SZ,CHECK-POSTLINK-O,CHECK-POSTLINK-Os
	; RUN: opt -disable-verify -debug-pass-manager \			; RUN: opt -disable-verify -debug-pass-manager \
	; RUN: -passes='thinlto<Oz>' -S %s 2>&1 \			; RUN: -passes='thinlto<Oz>' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Oz,CHECK-POSTLINK-O,CHECK-POSTLINK-Oz			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Oz,CHECK-O23SZ,CHECK-POSTLINK-O,CHECK-POSTLINK-Oz
	; RUN: opt -disable-verify -debug-pass-manager -new-pm-debug-info-for-profiling \			; RUN: opt -disable-verify -debug-pass-manager -new-pm-debug-info-for-profiling \
	; RUN: -passes='thinlto<O2>' -S %s 2>&1 \			; RUN: -passes='thinlto<O2>' -S %s 2>&1 \
	; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-POSTLINK-O,CHECK-POSTLINK-O2			; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-O23SZ,CHECK-POSTLINK-O,CHECK-POSTLINK-O2
	;			;
	; CHECK-O: Running analysis: PassInstrumentationAnalysis			; CHECK-O: Running analysis: PassInstrumentationAnalysis
	; CHECK-O-NEXT: Starting llvm::Module pass manager run.			; CHECK-O-NEXT: Starting llvm::Module pass manager run.
	; CHECK-O-NEXT: Running pass: PassManager<{{.*}}Module{{.*}}>			; CHECK-O-NEXT: Running pass: PassManager<{{.*}}Module{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Module pass manager run.			; CHECK-O-NEXT: Starting llvm::Module pass manager run.
	; CHECK-O-NEXT: Running pass: ForceFunctionAttrsPass			; CHECK-O-NEXT: Running pass: ForceFunctionAttrsPass
	; CHECK-EP-PIPELINE-START-NEXT: Running pass: NoOpModulePass			; CHECK-EP-PIPELINE-START-NEXT: Running pass: NoOpModulePass
	; CHECK-DIS-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}AddDiscriminatorsPass{{.*}}>			; CHECK-DIS-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}AddDiscriminatorsPass{{.*}}>
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass			; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass
	; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass			; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass
	; CHECK-O-NEXT: Running pass: CGSCCToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>			; CHECK-O-NEXT: Running pass: CGSCCToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run.			; CHECK-O-NEXT: Starting llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: SROA			; CHECK-O-NEXT: Running pass: SROA
	; CHECK-O-NEXT: Running pass: EarlyCSEPass			; CHECK-O-NEXT: Running pass: EarlyCSEPass
	; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis			; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
	; CHECK-O-NEXT: Running pass: SpeculativeExecutionPass			; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
	; CHECK-O-NEXT: Running pass: JumpThreadingPass			; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
	; CHECK-O-NEXT: Running analysis: LazyValueAnalysis			; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
	; CHECK-O-NEXT: Running pass: CorrelatedValuePropagationPass			; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O3-NEXT: Running pass: AggressiveInstCombinePass			; CHECK-O3-NEXT: Running pass: AggressiveInstCombinePass
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-O1-NEXT: Running pass: LibCallsShrinkWrapPass			; CHECK-O1-NEXT: Running pass: LibCallsShrinkWrapPass
	; CHECK-O2-NEXT: Running pass: LibCallsShrinkWrapPass			; CHECK-O2-NEXT: Running pass: LibCallsShrinkWrapPass
	; CHECK-O3-NEXT: Running pass: LibCallsShrinkWrapPass			; CHECK-O3-NEXT: Running pass: LibCallsShrinkWrapPass
	; CHECK-O-NEXT: Running pass: TailCallElimPass			; CHECK-O23SZ-NEXT: Running pass: TailCallElimPass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Running pass: ReassociatePass			; CHECK-O-NEXT: Running pass: ReassociatePass
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis
	; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LoopStandardAnalysisResults{{.*}}>			; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LoopStandardAnalysisResults{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run			; CHECK-O-NEXT: Starting llvm::Function pass manager run
	; CHECK-O-NEXT: Running pass: LoopSimplifyPass			; CHECK-O-NEXT: Running pass: LoopSimplifyPass
	; CHECK-O-NEXT: Running analysis: LoopAnalysis			; CHECK-O-NEXT: Running analysis: LoopAnalysis
	; CHECK-O-NEXT: Running pass: LCSSAPass			; CHECK-O-NEXT: Running pass: LCSSAPass
	; CHECK-O3-NEXT: Running analysis: PhiValuesAnalysis			; CHECK-O3-NEXT: Running analysis: PhiValuesAnalysis
	; CHECK-O-NEXT: Running pass: MemCpyOptPass			; CHECK-O-NEXT: Running pass: MemCpyOptPass
	; CHECK-O1-NEXT: Running analysis: MemoryDependenceAnalysis			; CHECK-O1-NEXT: Running analysis: MemoryDependenceAnalysis
	; CHECK-O1-NEXT: Running analysis: PhiValuesAnalysis			; CHECK-O1-NEXT: Running analysis: PhiValuesAnalysis
	; CHECK-O-NEXT: Running pass: SCCPPass			; CHECK-O-NEXT: Running pass: SCCPPass
	; CHECK-O-NEXT: Running pass: BDCEPass			; CHECK-O-NEXT: Running pass: BDCEPass
	; CHECK-O-NEXT: Running analysis: DemandedBitsAnalysis			; CHECK-O-NEXT: Running analysis: DemandedBitsAnalysis
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-O-NEXT: Running pass: JumpThreadingPass			; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
	; CHECK-O-NEXT: Running pass: CorrelatedValuePropagationPass			; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
	; CHECK-O-NEXT: Running pass: DSEPass			; CHECK-O23SZ-NEXT: Running pass: DSEPass
	; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>			; CHECK-O23SZ-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run			; CHECK-O23SZ-NEXT: Starting llvm::Function pass manager run
	; CHECK-O-NEXT: Running pass: LoopSimplifyPass			; CHECK-O23SZ-NEXT: Running pass: LoopSimplifyPass
	; CHECK-O-NEXT: Running pass: LCSSAPass			; CHECK-O23SZ-NEXT: Running pass: LCSSAPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run			; CHECK-O23SZ-NEXT: Finished llvm::Function pass manager run
	; CHECK-O-NEXT: Running pass: ADCEPass			; CHECK-O-NEXT: Running pass: ADCEPass
	; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis			; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-O-NEXT: Finished CGSCC pass manager run.			; CHECK-O-NEXT: Finished CGSCC pass manager run.
	; CHECK-O-NEXT: Finished llvm::Module pass manager run.			; CHECK-O-NEXT: Finished llvm::Module pass manager run.
	; CHECK-PRELINK-O-NEXT: Running pass: GlobalOptPass			; CHECK-PRELINK-O-NEXT: Running pass: GlobalOptPass

llvm/test/Transforms/MemCpyOpt/lifetime.ll

	; RUN: opt < %s -O1 -S | FileCheck %s			; RUN: opt < %s -O2 -S | FileCheck %s

	; performCallSlotOptzn in MemCpy should not exchange the calls to			; performCallSlotOptzn in MemCpy should not exchange the calls to
	; @llvm.lifetime.start and @llvm.memcpy.			; @llvm.lifetime.start and @llvm.memcpy.

	declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i1) #1			declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i1) #1
	declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture) #1			declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture) #1
	declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture) #1			declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture) #1


llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -O1 -S < %s | FileCheck %s --check-prefix=ALL --check-prefix=OLDPM			; RUN: opt -O1 -S < %s | FileCheck %s --check-prefix=ALL --check-prefix=OLDPM
	; RUN: opt -passes='default<O1>' -S < %s | FileCheck %s --check-prefix=ALL --check-prefix=NEWPM			; RUN: opt -passes='default<O1>' -S < %s | FileCheck %s --check-prefix=ALL --check-prefix=NEWPM

	; Don't simplify unconditional branches from empty blocks in simplifyCFG			; Don't simplify unconditional branches from empty blocks in simplifyCFG
	; until late in the pipeline because it can destroy canonical loop structure.			; until late in the pipeline because it can destroy canonical loop structure.

	define i1 @PR33605(i32 %a, i32 %b, i32* %c) {			define i1 @PR33605(i32 %a, i32 %b, i32* %c) {
	; ALL-LABEL: @PR33605(			; ALL-LABEL: @PR33605(
	; ALL-NEXT: for.body:			; ALL-NEXT: entry:
	; ALL-NEXT: [[OR:%.*]] = or i32 [[B:%.*]], [[A:%.*]]			; ALL-NEXT: [[OR:%.*]] = or i32 [[B:%.*]], [[A:%.*]]
	; ALL-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 1			; ALL-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 1
	; ALL-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4			; ALL-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
	; ALL-NEXT: [[CMP:%.*]] = icmp eq i32 [[OR]], [[TMP0]]			; ALL-NEXT: [[CMP:%.*]] = icmp eq i32 [[OR]], [[TMP0]]
	; ALL-NEXT: br i1 [[CMP]], label [[IF_END:%.*]], label [[IF_THEN:%.*]]			; ALL-NEXT: br i1 [[CMP]], label [[IF_END:%.*]], label [[IF_THEN:%.*]]
	; ALL: if.then:			; ALL: if.then:
	; ALL-NEXT: store i32 [[OR]], i32* [[ARRAYIDX]], align 4			; ALL-NEXT: store i32 [[OR]], i32* [[ARRAYIDX]], align 4
	; ALL-NEXT: tail call void @foo()			; ALL-NEXT: call void @foo()
	; ALL-NEXT: br label [[IF_END]]			; ALL-NEXT: br label [[IF_END]]
	; ALL: if.end:			; ALL: if.end:
	; ALL-NEXT: [[CHANGED_1_OFF0:%.*]] = phi i1 [ true, [[IF_THEN]] ], [ false, [[FOR_BODY:%.*]] ]			; ALL-NEXT: [[CHANGED_1_OFF0:%.*]] = phi i1 [ true, [[IF_THEN]] ], [ false, [[ENTRY:%.*]] ]
	; ALL-NEXT: [[TMP1:%.*]] = load i32, i32* [[C]], align 4			; ALL-NEXT: [[TMP1:%.*]] = load i32, i32* [[C]], align 4
	; ALL-NEXT: [[CMP_1:%.*]] = icmp eq i32 [[OR]], [[TMP1]]			; ALL-NEXT: [[CMP_1:%.*]] = icmp eq i32 [[OR]], [[TMP1]]
	; ALL-NEXT: br i1 [[CMP_1]], label [[IF_END_1:%.*]], label [[IF_THEN_1:%.*]]			; ALL-NEXT: br i1 [[CMP_1]], label [[IF_END_1:%.*]], label [[IF_THEN_1:%.*]]
	; ALL: if.then.1:			; ALL: if.then.1:
	; ALL-NEXT: store i32 [[OR]], i32* [[C]], align 4			; ALL-NEXT: store i32 [[OR]], i32* [[C]], align 4
	; ALL-NEXT: tail call void @foo()			; ALL-NEXT: call void @foo()
	; ALL-NEXT: br label [[IF_END_1]]			; ALL-NEXT: br label [[IF_END_1]]
	; ALL: if.end.1:			; ALL: if.end.1:
	; ALL-NEXT: [[CHANGED_1_OFF0_1:%.*]] = phi i1 [ true, [[IF_THEN_1]] ], [ [[CHANGED_1_OFF0]], [[IF_END]] ]			; ALL-NEXT: [[CHANGED_1_OFF0_1:%.*]] = phi i1 [ true, [[IF_THEN_1]] ], [ [[CHANGED_1_OFF0]], [[IF_END]] ]
	; ALL-NEXT: ret i1 [[CHANGED_1_OFF0_1]]			; ALL-NEXT: ret i1 [[CHANGED_1_OFF0_1]]
	;			;
	entry:			entry:
	br label %for.cond			br label %for.cond


llvm/test/Transforms/PhaseOrdering/two-shifts-by-sext.ll

%len.reloaded.2 = load i32, i32* %wide_len, align 4		%len.reloaded.2 = load i32, i32* %wide_len, align 4
%shr = ashr i32 %shl, %len.reloaded.2		%shr = ashr i32 %shl, %len.reloaded.2
ret i32 %shr		ret i32 %shr
}		}

define i32 @two_shifts_by_sext_with_extra_use(i32 %val, i8 signext %len) {		define i32 @two_shifts_by_sext_with_extra_use(i32 %val, i8 signext %len) {
; CHECK-LABEL: @two_shifts_by_sext_with_extra_use(		; CHECK-LABEL: @two_shifts_by_sext_with_extra_use(
; CHECK-NEXT: [[CONV:%.*]] = sext i8 [[LEN:%.*]] to i32		; CHECK-NEXT: [[CONV:%.*]] = sext i8 [[LEN:%.*]] to i32
; CHECK-NEXT: tail call void @use_int32(i32 [[CONV]])		; CHECK-NEXT: call void @use_int32(i32 [[CONV]])
; CHECK-NEXT: [[SHL:%.*]] = shl i32 [[VAL:%.*]], [[CONV]]		; CHECK-NEXT: [[SHL:%.*]] = shl i32 [[VAL:%.*]], [[CONV]]
; CHECK-NEXT: [[SHR:%.*]] = ashr i32 [[SHL]], [[CONV]]		; CHECK-NEXT: [[SHR:%.*]] = ashr i32 [[SHL]], [[CONV]]
; CHECK-NEXT: ret i32 [[SHR]]		; CHECK-NEXT: ret i32 [[SHR]]
;		;
%val.addr = alloca i32, align 4		%val.addr = alloca i32, align 4
%len.addr = alloca i8, align 1		%len.addr = alloca i8, align 1
store i32 %val, i32* %val.addr, align 4		store i32 %val, i32* %val.addr, align 4
store i8 %len, i8* %len.addr, align 1		store i8 %len, i8* %len.addr, align 1
ret i32 %shr		ret i32 %shr
}		}

declare void @use_int32(i32)		declare void @use_int32(i32)

define i32 @two_shifts_by_same_sext_with_extra_use(i32 %val, i8 signext %len) {		define i32 @two_shifts_by_same_sext_with_extra_use(i32 %val, i8 signext %len) {
; CHECK-LABEL: @two_shifts_by_same_sext_with_extra_use(		; CHECK-LABEL: @two_shifts_by_same_sext_with_extra_use(
; CHECK-NEXT: [[CONV:%.*]] = sext i8 [[LEN:%.*]] to i32		; CHECK-NEXT: [[CONV:%.*]] = sext i8 [[LEN:%.*]] to i32
; CHECK-NEXT: tail call void @use_int32(i32 [[CONV]])		; CHECK-NEXT: call void @use_int32(i32 [[CONV]])
; CHECK-NEXT: [[SHL:%.*]] = shl i32 [[VAL:%.*]], [[CONV]]		; CHECK-NEXT: [[SHL:%.*]] = shl i32 [[VAL:%.*]], [[CONV]]
; CHECK-NEXT: [[SHR:%.*]] = ashr i32 [[SHL]], [[CONV]]		; CHECK-NEXT: [[SHR:%.*]] = ashr i32 [[SHL]], [[CONV]]
; CHECK-NEXT: ret i32 [[SHR]]		; CHECK-NEXT: ret i32 [[SHR]]
;		;
%val.addr = alloca i32, align 4		%val.addr = alloca i32, align 4
%len.addr = alloca i8, align 1		%len.addr = alloca i8, align 1
%wide_len = alloca i32, align 4		%wide_len = alloca i32, align 4
store i32 %val, i32* %val.addr, align 4		store i32 %val, i32* %val.addr, align 4

Revision Contents

Diff 231172
