This is an archive of the discontinued LLVM Phabricator instance.

Differential D157750

Properly handle -fsplit-machine-functions for fatbinary compilation
ClosedPublic

Authored by shenhan on Aug 11 2023, 12:40 PM.

Details

Reviewers

xur
snehasish
dhoekwater

Commits

rG317a0fe5bd71: [Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary…

Summary

When building a fatbinary, the driver invokes the compiler multiple times with different "--target" values. (For example, with "-x cuda --cuda-gpu-arch=sm_70", clang is invoked twice: once with --target=x86_64_...., once with --target=sm_70.) If -fsplit-machine-functions or -fno-split-machine-functions is used in such an invocation, the driver reports an error.

This CL changes the behavior as follows:

"-fsplit-machine-functions" is now passed to all targets; for non-X86 targets, the flag is a no-op and causes a warning.
"-fno-split-machine-functions" now negates -fsplit-machine-functions (if it appears after any -fsplit-machine-functions) for any target triple; previously, it caused an error.
"-fsplit-machine-functions -Xarch_device -fno-split-machine-functions" enables MFS on the host but disables MFS for GPUs, without warnings or errors.
"-Xarch_host -fsplit-machine-functions" enables MFS on the host but disables MFS for GPUs, without warnings or errors.

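The four cases in the summary can be modeled with a small stand-alone sketch. This is not clang code: `Scope`, `Flag`, `Result`, and `resolve` are hypothetical names, and the model assumes that `-Xarch_host`/`-Xarch_device` flags take part in the same "last applicable flag wins" ordering as unprefixed flags, and that any non-X86 target treats MFS as a no-op.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the behavior described in the summary (not clang code).
enum class Scope { All, Host, Device };  // plain flag, -Xarch_host, -Xarch_device

struct Flag {
  Scope scope;  // which sub-compilations the flag applies to
  bool enable;  // true: -fsplit-machine-functions, false: -fno-split-machine-functions
};

struct Result {
  bool passed;  // the flag is forwarded to this sub-compilation
  bool warns;   // forwarded to a target where MFS is a no-op
};

// Resolve the flags for one sub-compilation: the last applicable flag wins,
// and a forwarded flag on a non-X86 target produces a warning.
Result resolve(const std::vector<Flag>& flags, bool isDevice, bool isX86) {
  bool enabled = false;
  for (const Flag& f : flags) {
    const bool applies = f.scope == Scope::All ||
                         (f.scope == Scope::Host && !isDevice) ||
                         (f.scope == Scope::Device && isDevice);
    if (applies)
      enabled = f.enable;
  }
  return Result{enabled, enabled && !isX86};
}
```

Under these assumptions, a plain `-fsplit-machine-functions` reaches both sub-compilations and warns only on the device side, while appending `-Xarch_device -fno-split-machine-functions` keeps the host enabled and leaves the device side untouched and silent.
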
Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

shenhan created this revision. Aug 11 2023, 12:40 PM

Herald added a project: Restricted Project. Aug 11 2023, 12:40 PM

Herald added subscribers: mattd, asavonic, pengfei, hiraditya.

shenhan requested review of this revision. Aug 11 2023, 12:40 PM

Herald added projects: Restricted Project, Restricted Project. Aug 11 2023, 12:40 PM

Herald added subscribers: llvm-commits, cfe-commits, MaskRay.

snehasish added a subscriber: dhoekwater. Aug 11 2023, 1:30 PM

snehasish added inline comments.

llvm/lib/CodeGen/TargetPassConfig.cpp
1278 ↗	(On Diff #549493)	Can you coordinate with @dhoekwater ? He has some patches in flight for AArch64. I think D157157 is the one which modifies the same logic.
llvm/test/CodeGen/X86/mfs-triple.ll
8 ↗	(On Diff #549493)	Any reason why we can't use the bitcode already in test/CodeGen/machine-function-splitter.ll? (Going to be moved to test/Generic/machine-function-splitter.ll in D157563) IMO we can just reuse the basic test and add these run and check lines.

tra added inline comments.

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	We will still see a warning, right? So, for someone compiling with `-Werror` that's going to be a problem. Also, if the warning is issued from the top-level driver, we may not even be able to suppress it when we disable splitting on GPU side with `-Xarch_device -fno-split-machine-functions`.

Harbormaster completed remote builds in B252031: Diff 549493. Aug 11 2023, 2:13 PM

shenhan added inline comments. Aug 11 2023, 2:53 PM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	> We will still see a warning, right?

Yes, there will still be a warning. We discussed it and think that passing -fsplit-machine-functions in this case is not proper usage and a warning is warranted; it is not good to silently skip splitting when users explicitly ask for it.

> Also, if the warning is issued from the top-level driver

The warning will not be issued from the top-level driver; it will be issued when configuring optimization passes. So:

-fsplit-machine-functions -Xarch_device -fno-split-machine-functions
Enables MFS for the host and disables MFS for GPUs, without any warnings.

-Xarch_host -fsplit-machine-functions
Same as above.

-Xarch_host -fsplit-machine-functions -Xarch_device -fno-split-machine-functions
Same as above.

shenhan edited the summary of this revision. Aug 11 2023, 2:55 PM

shenhan edited the summary of this revision.

shenhan added inline comments. Aug 11 2023, 2:58 PM

llvm/lib/CodeGen/TargetPassConfig.cpp
1278 ↗	(On Diff #549493)	Thanks. Yes, I'll coordinate with @dhoekwater before resolving this.

arsenm added a subscriber: arsenm. Aug 11 2023, 3:05 PM

arsenm added inline comments.

llvm/lib/CodeGen/TargetPassConfig.cpp
1281–1282 ↗	(On Diff #549493)	You cannot spam warnings here. The other instance of printing here looks like a new addition and should be removed

tra added inline comments. Aug 11 2023, 3:14 PM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	> We've discussed it and we think that pass -fsplit-machine-functions in this case is not a proper usage and a warning is warranted, and it is not good that skip doing split silently while uses explicitly ask for it.

I would agree with that assertion if we were talking exclusively about CUDA compilation. However, a common real-world use pattern is that the flags are set globally for all C++ compilations, and then CUDA compilations within the project need to do whatever they need to keep things working. The original user intent was for the option to affect the host compilation. There's no inherent assumption that it will do anything useful for the GPU.

In a number of similar cases in the past we settled on silently ignoring some top-level flags that we expect to encounter in real projects but which make no sense for the GPU, e.g. sanitizers. If the project is built with a sanitizer enabled, the idea is to sanitize the host code; the GPU code continues to be built without the sanitizer.

Anyway, as long as we have a way to deal with it, it's not a big deal one way or another.

> -fsplit-machine-functions -Xarch_device -fno-split-machine-functions
> Will enable MFS for host, disable MFS for gpus and without any warnings.

OK. This will work.

shenhan added inline comments. Aug 14 2023, 12:02 PM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	> In number of similar cases in the past we did settle on silently ignoring some top-level flags that we do expect to encounter in real projects, but which made no sense for the GPU. E.g. sanitizers. If the project is built w/ sanitizer enabled, the idea is to sanitize the host code, The GPU code continues to be built w/o sanitizer enabled.

Can I understand it this way: if the compiler is only building for CPUs, then silently ignoring any optimization flag is not good behavior. If the compiler is building for CPUs and GPUs, it is still not good behavior to silently ignore optimization flags for CPUs, but it is probably OK to silently ignore optimization flags for GPUs.

> OK. This will work.

Thanks for confirming.
llvm/lib/CodeGen/TargetPassConfig.cpp
1281–1282 ↗	(On Diff #549493)	Thanks. Do you suggest moving the warnings to the underlying pass? (Although that means we create passes that only issue warnings.)

arsenm added inline comments. Aug 14 2023, 12:05 PM

llvm/lib/CodeGen/TargetPassConfig.cpp
1281–1282 ↗	(On Diff #549493)	Move it to the pass, and use a backend remark, not directly print to the console (e.g. DiagnosticInfoUnsupported)
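
The suggestion above, reporting through a diagnostic handler from inside the pass instead of printing from pass configuration, can be illustrated with a self-contained model. It deliberately does not use LLVM's headers: `Context`, `Diagnostic`, `Severity`, and `runSplitPass` are hypothetical stand-ins, and the `"x86_64"` check is an assumption made only for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, loosely modeled on a compiler context that owns a
// diagnostic handler. None of these names come from LLVM.
enum class Severity { Remark, Warning, Error };

struct Diagnostic {
  Severity severity;
  std::string message;
};

class Context {
public:
  using Handler = std::function<void(const Diagnostic&)>;
  explicit Context(Handler handler) : handler_(std::move(handler)) {}

  // The pass never writes to the console; the client decides what to do.
  void diagnose(const Diagnostic& d) const { handler_(d); }

private:
  Handler handler_;
};

// Stand-in for the splitting pass. Returns true if splitting would run.
// On an unsupported architecture it reports one diagnostic and does nothing.
bool runSplitPass(const Context& ctx, const std::string& arch) {
  if (arch != "x86_64") {  // assumption for this sketch only
    ctx.diagnose(Diagnostic{
        Severity::Warning,
        "-fsplit-machine-functions is not valid for " + arch});
    return false;
  }
  return true;
}
```

Because the client installs the handler, it can collect, filter, or promote the diagnostic (for example under `-Werror`) without the pass knowing how the message is eventually presented.
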

tra added inline comments. Aug 14 2023, 12:23 PM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	> it is probably ok to silently ignore optimization flags for GPUs.

In this case, yes. I think the most consistent way to handle the situation is to keep the warning in place at cc1 compiler level, but change the driver behavior (and document it) so that it does not pass the splitting options to offloading sub-compilations. This way we'll do the sensible thing for the most common use case, yet would still warn if the user tries to enable the splitting where they should not (e.g. by using `-Xclang -fsplit-machine-functions` during CUDA compilation).

shenhan updated this revision to Diff 550137. Aug 14 2023, 4:44 PM

shenhan marked an inline comment as done.

shenhan added inline comments.

llvm/lib/CodeGen/TargetPassConfig.cpp
1281–1282 ↗	(On Diff #549493)	Thanks, created DiagnosticInfoMachineFunctionSplit and moved the warning to MFS pass.

Harbormaster completed remote builds in B252496: Diff 550137. Aug 14 2023, 8:57 PM

shenhan updated this revision to Diff 550423. Aug 15 2023, 12:11 PM

shenhan marked an inline comment as done.

shenhan marked 3 inline comments as done. Aug 15 2023, 12:15 PM

shenhan added inline comments.

llvm/lib/CodeGen/TargetPassConfig.cpp
1278 ↗	(On Diff #549493)	@dhoekwater will rebase D157157 on top of this.
llvm/test/CodeGen/X86/mfs-triple.ll
8 ↗	(On Diff #549493)	Moved the tests into machine-function-splitter.ll. Either this CL or D157563 can be submitted first, and the other will rebase on top of that.

shenhan added a reviewer: dhoekwater. Aug 15 2023, 2:38 PM

Harbormaster completed remote builds in B252708: Diff 550423. Aug 15 2023, 2:39 PM

shenhan updated this revision to Diff 550834. Aug 16 2023, 11:56 AM

shenhan marked an inline comment as done.

shenhan added inline comments.

llvm/test/CodeGen/X86/mfs-triple.ll
8 ↗	(On Diff #549493)	Rebased on D157563.

Harbormaster completed remote builds in B253007: Diff 550834. Aug 16 2023, 1:47 PM

shenhan updated this revision to Diff 550885. Aug 16 2023, 2:10 PM

lgtm.

This revision is now accepted and ready to land. Aug 16 2023, 2:28 PM

lgtm

This patch will make it difficult to write tests for MFS on AArch64 before it is officially enabled. Currently, because clang performs the Triple check, we can use -enable-split-machine-functions to run tests with MFS on Arm, but after this patch the flag won't do anything. Is there a way that we can land this patch while still making MFS testable on AArch64?

Refined the test case a little bit.

In D157750#4593582, @dhoekwater wrote:

> This patch will make it difficult to write tests for MFS on AArch64 before it is officially enabled. Currently, because clang performs the Triple check, we can use -enable-split-machine-functions to run tests with MFS on Arm, but after this patch the flag won't do anything. Is there a way that we can land this patch while still making MFS testable on AArch64?

Discussed with @dhoekwater offline; we decided to use a temporary hidden flag, something like "enable-mfs-for-debugging/testing", to force-enable MFS for any triple while MFS for Arm is progressively rolled out. Once that is done, we will remove the temporary flag.

Harbormaster completed remote builds in B253060: Diff 550913. Aug 16 2023, 4:44 PM

This revision was landed with ongoing or failed builds. Aug 16 2023, 11:46 PM

Closed by commit rG317a0fe5bd71: [Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary… (authored by shenhan).

This revision was automatically updated to reflect the committed changes.

shenhan added a commit: rG317a0fe5bd71: [Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary….

I disabled two tests without {arm,aarch64}-registered-target in rGeeac4321c517ee8afc30ebe62c5b1778efc1173d; two post-commit comments inline

llvm/test/CodeGen/Generic/machine-function-splitter.ll
18	shouldn't this be `MFS_ON-NOT`?
21

steelannelida added a subscriber: steelannelida. Aug 17 2023, 5:04 AM

steelannelida added inline comments.

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
16	Unfortunately these commands fail in our sandbox due to writing files to readonly directories: `unable to open output file 'fsplit-machine-functions-with-cuda-nvptx.s': 'Permission denied'` Could you please specify the output files via `%t` substitutions? I'm not sure how to do this for cuda compilation.

Hahnfeld added inline comments. Aug 17 2023, 5:08 AM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
16	IIRC the file names are generated based on what you specify with `-o`. Did you try this already?

tra added inline comments. Aug 17 2023, 10:29 AM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
16	The problem is that in this case we didn't pass any -o at all here, so the compiler tries to write into the current directory. We need `-o %t.s` or `-o /dev/null` here.

MaskRay added inline comments. Aug 17 2023, 12:59 PM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	There are excessive spaces before `%clang`. We should keep just one space: `RUN: %clang`
16	Driver tests should not run the backend. Most driver commands should use `-###`. `REQUIRES: shell` disables the internal shell, which essentially disables the test on Windows. This should generally be avoided. An idiom is `// RUN: rm -rf %t && mkdir %t && cd %t` (see `ftime-trace.cpp`) if the driver places output files in the current working directory.

Thanks all. Created D158231 to address the post-submit comments and added all as reviewers. @MaskRay @Hahnfeld @steelannelida

Sorry, but I think this change should be reverted.

(a) -fsplit-machine-functions on an unsupported target now emits a warning instead of an error. This diverges from the regular expectation for target-specific features.

% fclang --target=riscv64 -fsplit-machine-functions -c a.c
warning: -fsplit-machine-functions is not valid for riscv64 [-Wbackend-plugin]

warn_drv_for_elf_only is not necessary. We typically just use err_drv_unsupported_opt_for_target.

(b) the test needs substantial change (D158231)

(c) The error reporting should be on the driver side, not in backend. void DiagnosticInfoMachineFunctionSplit::print(DiagnosticPrinter &DP) const is not necessary.
At the very least, it should not be in the generic IR/DiagnosticInfo.cpp

MaskRay added inline comments. Aug 19 2023, 11:23 AM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	I agree with @tra's analysis. Either do nothing on Clang side and requiring `-fsplit-machine-functions -Xarch_device -fno-split-machine-functions` or ignoring the option when creating a device job works for me. This patch changed the behavior in an unintended direction.

MaskRay mentioned this in D158231: [clang][test] Fix clang machine-function-split tests. Aug 19 2023, 2:52 PM

shenhan added inline comments. Aug 21 2023, 10:16 AM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	> Either do nothing on Clang side and requiring -fsplit-machine-functions -Xarch_device -fno-split-machine-functions or ignoring the option when creating a device job works for me. This patch changed the behavior in an unintended direction.

Thanks Ray. I'm a little confused: what this patch does is indeed "requiring -fsplit-machine-functions -Xarch_device -fno-split-machine-functions"; before this patch, this usage caused an error. What do you suggest?

MaskRay added inline comments. Aug 21 2023, 10:22 AM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	I think we should go back to the state before this patch. Do either
(a) require `-Xarch_device -fno-split-machine-functions` for `--cuda-gpu-arch` users, or
(b) ignore `-fsplit-machine-functions` when creating a `-triple" "nvptx64-nvidia-cuda` job in the driver.

Personally I'd prefer (a), as I don't want to maintain a long list of ignored options in the driver for certain very specific optimization options, but I am fine with (b). In either case, I think we need to revert this patch to go to a clean state.

tra added inline comments. Aug 21 2023, 10:45 AM

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c
10	> (a) require -Xarch_device -fno-split-machine-functions for --cuda-gpu-arch users

It will break the currently valid compilation options, and CUDA users would have no viable option other than to keep spamming their build flags with `-Xarch_device -fno-split-machine-functions`. E.g. if the flag is set build-wide on all compilations, it will break previously working CUDA compilations through no fault of the user. Yes, `-Xarch_device -fno-split-machine-functions` will work around it, but it will have to be done by everyone who ever faces `-fsplit-machine-functions` in a CUDA compilation. If we have to apply the workaround in 100% of the use cases, we should not require every user to do it.

> (b) ignore -fsplit-machine-functions when creating a -triple" "nvptx64-nvidia-cuda job in the driver.

I believe that's the way to go. We already do that in a number of other cases, like sanitizers or other things not supported on the GPU side in principle.
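
Option (b) amounts to filtering the flag out while building the argument list for a device job. A minimal stand-alone sketch of that idea follows; it is not the clang driver's code, `argsForJob` is a hypothetical helper, and a real driver would work on parsed options rather than raw strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not clang code): build the argument list forwarded to
// one sub-compilation, dropping the MFS flags when the job targets a device.
std::vector<std::string> argsForJob(const std::vector<std::string>& userArgs,
                                    bool isDeviceJob) {
  std::vector<std::string> forwarded;
  for (const std::string& arg : userArgs) {
    const bool isSplitFlag = arg == "-fsplit-machine-functions" ||
                             arg == "-fno-split-machine-functions";
    if (isDeviceJob && isSplitFlag)
      continue;  // silently ignored for the GPU, as is done for sanitizers
    forwarded.push_back(arg);
  }
  return forwarded;
}
```

With this shape, a build-wide `-fsplit-machine-functions` keeps working for the host compilation and never reaches the device compilation, so no per-project `-Xarch_device` workaround is needed.
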

MaskRay added a reverting change: rG77596e6b167b: Revert D157750 "[Driver][CodeGen] Properly handle -fsplit-machine-functions for…. Aug 21 2023, 1:54 PM

Revision Contents

Path	Size
clang/include/clang/Basic/DiagnosticDriverKinds.td	4 lines
clang/lib/Driver/ToolChains/Clang.cpp	13 lines
clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c	61 lines
clang/test/Driver/fsplit-machine-functions.c	4 lines
clang/test/Driver/fsplit-machine-functions2.c	3 lines
llvm/include/llvm/IR/DiagnosticInfo.h	15 lines
llvm/lib/CodeGen/MachineFunctionSplitter.cpp	22 lines
llvm/lib/IR/DiagnosticInfo.cpp	4 lines
llvm/test/CodeGen/Generic/Inputs/fsloader-mfs.afdo
llvm/test/CodeGen/Generic/machine-function-splitter.ll	9 lines

Diff 551017

clang/include/clang/Basic/DiagnosticDriverKinds.td

 def warn_drv_jmc_requires_debuginfo : Warning<
   "%0 requires debug info. Use %1 or debug options that enable debugger's "
   "stepping function; option ignored">,
   InGroup<OptionIgnored>;

 def warn_drv_fjmc_for_elf_only : Warning<
   "-fjmc works only for ELF; option ignored">,
   InGroup<OptionIgnored>;

+def warn_drv_for_elf_only : Warning<
+  "'%0' works only for ELF; option ignored">,
+  InGroup<OptionIgnored>;
+
 def warn_target_override_arm64ec : Warning<
   "/arm64EC has been overridden by specified target: %0; option ignored">,
   InGroup<OptionIgnored>;

 def err_drv_target_variant_invalid : Error<
   "unsupported '%0' value '%1'; use 'ios-macabi' instead">;

 def err_drv_invalid_directx_shader_module : Error<

clang/lib/Driver/ToolChains/Clang.cpp

   Args.addOptInFlag(CmdArgs, options::OPT_funique_internal_linkage_names,
                     options::OPT_fno_unique_internal_linkage_names);
   Args.addOptInFlag(CmdArgs, options::OPT_funique_basic_block_section_names,
                     options::OPT_fno_unique_basic_block_section_names);
   Args.addOptInFlag(CmdArgs, options::OPT_fconvergent_functions,
                     options::OPT_fno_convergent_functions);

   if (Arg *A = Args.getLastArg(options::OPT_fsplit_machine_functions,
                                options::OPT_fno_split_machine_functions)) {
-    // This codegen pass is only available on x86-elf targets.
-    if (Triple.isX86() && Triple.isOSBinFormatELF()) {
-      if (A->getOption().matches(options::OPT_fsplit_machine_functions))
-        A->render(Args, CmdArgs);
-    } else {
-      D.Diag(diag::err_drv_unsupported_opt_for_target)
-          << A->getAsString(Args) << TripleStr;
-    }
+    if (A->getOption().matches(options::OPT_fsplit_machine_functions)) {
+      // This codegen pass is only available on elf targets.
+      if (Triple.isOSBinFormatELF())
+        A->render(Args, CmdArgs);
+      else
+        D.Diag(diag::warn_drv_for_elf_only) << A->getAsString(Args);
+    }
+    // Do not issue warnings for -fno-split-machine-functions even it is not
+    // on ELF.
   }

   Args.AddLastArg(CmdArgs, options::OPT_finstrument_functions,
                   options::OPT_finstrument_functions_after_inlining,
                   options::OPT_finstrument_function_entry_bare);

   // NVPTX/AMDGCN doesn't support PGO or coverage. There's no runtime support
   // for sampling, overhead of call arc collection is way too high and there's

clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c

This file was added.

// REQUIRES: system-linux
// REQUIRES: x86-registered-target
// REQUIRES: nvptx-registered-target
// REQUIRES: shell

// Check that -fsplit-machine-functions is passed to both x86 and cuda
// compilation and does not cause driver error.
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -nogpuinc \
// RUN: --cuda-gpu-arch=sm_70 -x cuda -fsplit-machine-functions -S %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=MFS1
// MFS1: "-target-cpu" "x86-64"{{.*}}"-fsplit-machine-functions"
// MFS1: "-target-cpu" "sm_70"{{.*}}"-fsplit-machine-functions"

// Check that -fsplit-machine-functions is passed to cuda and it
// causes a warning.
// RUN: %clang --target=x86_64-unknown-linux-gnu -nogpulib -nogpuinc \
				// RUN: --cuda-gpu-arch=sm_70 -x cuda -fsplit-machine-functions -S %s 2>&1 \
				// RUN: \| FileCheck %s --check-prefix=MFS2
				// MFS2: warning: -fsplit-machine-functions is not valid for nvptx

				// Check that -Xarch_host -fsplit-machine-functions is passed only to
				// native compilation.
				// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -nogpuinc \
				// RUN: --cuda-gpu-arch=sm_70 -x cuda -Xarch_host \
				// RUN: -fsplit-machine-functions -S %s \
				// RUN: 2>&1 \| FileCheck %s --check-prefix=MFS3
				// MFS3: "-target-cpu" "x86-64"{{.*}}"-fsplit-machine-functions"
				// MFS3-NOT: "-target-cpu" "sm_70"{{.*}}"-fsplit-machine-functions"

				// Check that -Xarch_host -fsplit-machine-functions does not cause any warning.
				// RUN: %clang --target=x86_64-unknown-linux-gnu -nogpulib -nogpuinc \
				// RUN: --cuda-gpu-arch=sm_70 -x cuda -Xarch_host \
				// RUN: -fsplit-machine-functions -S %s || { echo \
				// RUN: "warning: -fsplit-machine-functions is not valid for" ; } \
				// RUN: 2>&1 | FileCheck %s --check-prefix=MFS4
				// MFS4-NOT: warning: -fsplit-machine-functions is not valid for

				// Check that -Xarch_device -fsplit-machine-functions does cause the warning.
				// RUN: %clang --target=x86_64-unknown-linux-gnu -nogpulib -nogpuinc \
				// RUN: --cuda-gpu-arch=sm_70 -x cuda -Xarch_device \
				// RUN: -fsplit-machine-functions -S %s 2>&1 | \
				// RUN: FileCheck %s --check-prefix=MFS5
				// MFS5: warning: -fsplit-machine-functions is not valid for

				// Check that -fsplit-machine-functions -Xarch_device
				// -fno-split-machine-functions only passes MFS to x86
				// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -nogpuinc \
				// RUN: --cuda-gpu-arch=sm_70 -x cuda -fsplit-machine-functions \
				// RUN: -Xarch_device -fno-split-machine-functions -S %s \
				// RUN: 2>&1 | FileCheck %s --check-prefix=MFS6
				// MFS6: "-target-cpu" "x86-64"{{.*}}"-fsplit-machine-functions"
				// MFS6-NOT: "-target-cpu" "sm_70"{{.*}}"-fsplit-machine-functions"

				// Check that -fsplit-machine-functions -Xarch_device
				// -fno-split-machine-functions has no warnings
				// RUN: %clang --target=x86_64-unknown-linux-gnu -nogpulib -nogpuinc \
				// RUN: --cuda-gpu-arch=sm_70 -x cuda -fsplit-machine-functions \
				// RUN: -Xarch_device -fno-split-machine-functions -S %s \
				// RUN: || { echo "warning: -fsplit-machine-functions is not valid for"; } \
				// RUN: 2>&1 | FileCheck %s --check-prefix=MFS7
				// MFS7-NOT: warning: -fsplit-machine-functions is not valid for
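
The `-Xarch_host` / `-Xarch_device` cases checked above (MFS3 through MFS7) can be read as a two-step rule: first drop the options wrapped for the other side, then let the last of `-fsplit-machine-functions` / `-fno-split-machine-functions` win. The following is a minimal C++ sketch of that reading. It is not the driver's implementation; `Side` and `mfsRequestedFor` are hypothetical names, and the "off by default" starting value is an assumption.

```cpp
#include <string>
#include <vector>

// Which half of a CUDA fatbinary compilation is being configured.
enum class Side { Host, Device };

// Hypothetical model (not the real driver code): returns true when
// -fsplit-machine-functions would still be requested for side S after
// -Xarch_host / -Xarch_device filtering and last-flag-wins resolution.
bool mfsRequestedFor(Side S, const std::vector<std::string> &Args) {
  bool Enabled = false; // Assumption: off unless explicitly requested.
  for (std::vector<std::string>::size_type I = 0; I < Args.size(); ++I) {
    std::string Arg = Args[I];
    if ((Arg == "-Xarch_host" || Arg == "-Xarch_device") &&
        I + 1 < Args.size()) {
      const bool ForHost = (Arg == "-Xarch_host");
      ++I; // Consume the option wrapped by -Xarch_*.
      if (ForHost != (S == Side::Host))
        continue; // The wrapped option targets the other side.
      Arg = Args[I];
    }
    if (Arg == "-fsplit-machine-functions")
      Enabled = true;
    else if (Arg == "-fno-split-machine-functions")
      Enabled = false;
  }
  return Enabled;
}
```

Under this model, `-Xarch_device -fsplit-machine-functions` yields `true` for `Side::Device`, which corresponds to the MFS5 case where the flag reaches the NVPTX compilation and the backend warns.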

clang/test/Driver/fsplit-machine-functions.c

  // RUN: %clang -### -target x86_64 -fprofile-use=default.profdata -fsplit-machine-functions %s -c 2>&1 | FileCheck -check-prefix=CHECK-OPT %s
  // RUN: %clang -### -target x86_64 -fsplit-machine-functions %s -c 2>&1 | FileCheck -check-prefix=CHECK-OPT %s
  // RUN: %clang -### -target x86_64 -fprofile-use=default.profdata -fsplit-machine-functions -fno-split-machine-functions %s -c 2>&1 | FileCheck -check-prefix=CHECK-NOOPT %s
- // RUN: not %clang -c -target arm-unknown-linux -fsplit-machine-functions %s 2>&1 | FileCheck -check-prefix=CHECK-TRIPLE %s
+ // RUN: %clang -c -target arm-unknown-linux -fsplit-machine-functions %s 2>&1 | FileCheck -check-prefix=CHECK-TRIPLE %s

  // CHECK-OPT: "-fsplit-machine-functions"
  // CHECK-NOOPT-NOT: "-fsplit-machine-functions"
- // CHECK-TRIPLE: error: unsupported option '-fsplit-machine-functions' for target
+ // CHECK-TRIPLE: warning: -fsplit-machine-functions is not valid for arm

clang/test/Driver/fsplit-machine-functions2.c

  // Test -fsplit-machine-functions option pass-through with lto
  // RUN: %clang -### -target x86_64-unknown-linux -flto -fsplit-machine-functions %s 2>&1 | FileCheck %s -check-prefix=CHECK-PASS

  // Test no pass-through to ld without lto
  // RUN: %clang -### -target x86_64-unknown-linux -fsplit-machine-functions %s 2>&1 | FileCheck %s -check-prefix=CHECK-NOPASS

  // Test the mix of -fsplit-machine-functions and -fno-split-machine-functions
  // RUN: %clang -### -target x86_64-unknown-linux -flto -fsplit-machine-functions -fno-split-machine-functions %s 2>&1 | FileCheck %s -check-prefix=CHECK-NOPASS
  // RUN: %clang -### -target x86_64-unknown-linux -flto -fno-split-machine-functions -fsplit-machine-functions %s 2>&1 | FileCheck %s -check-prefix=CHECK-PASS
+ // Check that for non-X86, passing no-split-machine-functions does not cause error.
+ // RUN: %clang -### -target aarch64-unknown-linux -flto -fsplit-machine-functions -fno-split-machine-functions %s 2>&1 | FileCheck %s -check-prefix=CHECK-NOPASS2

  // CHECK-PASS: "-plugin-opt=-split-machine-functions"
  // CHECK-NOPASS-NOT: "-plugin-opt=-split-machine-functions"
+ // CHECK-NOPASS2-NOT: "-plugin-opt=-split-machine-functions"

llvm/include/llvm/IR/DiagnosticInfo.h

Show First 20 Lines • Show All 80 Lines • ▼ Show 20 Lines	enum DiagnosticKind {
    DK_FirstMachineRemark = DK_MachineOptimizationRemark,
    DK_LastMachineRemark = DK_MachineOptimizationRemarkAnalysis,
    DK_MIRParser,
    DK_PGOProfile,
    DK_Unsupported,
    DK_SrcMgr,
    DK_DontCall,
    DK_MisExpect,
+   DK_MachineFunctionSplit,
    DK_FirstPluginKind // Must be last value to work with
                       // getNextAvailablePluginDiagnosticKind
  };

  /// Get the next available kind ID for a plugin diagnostic.
  /// Each time this function is called, it returns a different number.
  /// Therefore, a plugin that wants to "identify" its own classes
  /// with a dynamic identifier, just have to use this method to get a new ID
▲ Show 20 Lines • Show All 1,015 Lines • ▼ Show 20 Lines	public:
    StringRef getNote() const { return Note; }
    unsigned getLocCookie() const { return LocCookie; }
    void print(DiagnosticPrinter &DP) const override;
    static bool classof(const DiagnosticInfo *DI) {
      return DI->getKind() == DK_DontCall;
    }
  };

+ class DiagnosticInfoMachineFunctionSplit : public DiagnosticInfo {
+   StringRef TargetTriple;
+
+ public:
+   DiagnosticInfoMachineFunctionSplit(StringRef TargetTriple,
+                                      DiagnosticSeverity DS)
+       : DiagnosticInfo(DK_MachineFunctionSplit, DS),
+         TargetTriple(TargetTriple) {}
+   void print(DiagnosticPrinter &DP) const override;
+   static bool classof(const DiagnosticInfo *DI) {
+     return DI->getKind() == DK_MachineFunctionSplit;
+   }
+ };

  } // end namespace llvm

  #endif // LLVM_IR_DIAGNOSTICINFO_H

llvm/lib/CodeGen/MachineFunctionSplitter.cpp

Show All 29 Lines
  #include "llvm/Analysis/ProfileSummaryInfo.h"
  #include "llvm/CodeGen/BasicBlockSectionUtils.h"
  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
  #include "llvm/CodeGen/MachineFunction.h"
  #include "llvm/CodeGen/MachineFunctionPass.h"
  #include "llvm/CodeGen/MachineModuleInfo.h"
  #include "llvm/CodeGen/Passes.h"
+ #include "llvm/IR/DiagnosticInfo.h"
  #include "llvm/IR/Function.h"
  #include "llvm/InitializePasses.h"
  #include "llvm/Support/CommandLine.h"
+ #include "llvm/TargetParser/Triple.h"
  #include <optional>

  using namespace llvm;

  // FIXME: This cutoff value is CPU dependent and should be moved to
  // TargetTransformInfo once we consider enabling this on other platforms.
  // The value is expressed as a ProfileSummaryInfo integer percentile cutoff.
  // Defaults to 999950, i.e. all blocks colder than 99.995 percentile are split.
Show All 28 Lines	public:

    StringRef getPassName() const override {
      return "Machine Function Splitter Transformation";
    }

    void getAnalysisUsage(AnalysisUsage &AU) const override;

    bool runOnMachineFunction(MachineFunction &F) override;

+   bool doInitialization(Module &) override;
+
+   static bool isSupportedTriple(const Triple &T) { return T.isX86(); }
+
+ private:
+   bool UnsupportedTriple = false;
  };
  } // end anonymous namespace

  /// setDescendantEHBlocksCold - This splits all EH pads and blocks reachable
  /// only by EH pad as cold. This will help mark EH pads statically cold
  /// instead of relying on profile data.
  static void setDescendantEHBlocksCold(MachineFunction &MF) {
    DenseSet<MachineBasicBlock *> EHBlocks;
Show All 29 Lines	if (PSI->hasInstrumentationProfile() || PSI->hasCSInstrumentationProfile()) {
      // For sample profile, no count means "do not judge coldness".
      if (!Count)
        return false;
    }

    return (*Count < ColdCountThreshold);
  }

+ bool MachineFunctionSplitter::doInitialization(Module &M) {
+   StringRef T = M.getTargetTriple();
+   if (!isSupportedTriple(Triple(T))) {
+     UnsupportedTriple = true;
+     M.getContext().diagnose(
+         DiagnosticInfoMachineFunctionSplit(T, DS_Warning));
+     return false;
+   }
+   return MachineFunctionPass::doInitialization(M);
+ }

  bool MachineFunctionSplitter::runOnMachineFunction(MachineFunction &MF) {
+   if (UnsupportedTriple)
+     return false;
    // We target functions with profile data. Static information in the form
    // of exception handling code may be split to cold if user passes the
    // mfs-split-ehcode flag.
    bool UseProfileData = MF.getFunction().hasProfileData();
    if (!UseProfileData && !SplitAllEHCode)
      return false;

    // TODO: We don't split functions where a section attribute has been set
▲ Show 20 Lines • Show All 86 Lines • Show Last 20 Lines

llvm/lib/IR/DiagnosticInfo.cpp

Show First 20 Lines • Show All 443 Lines • ▼ Show 20 Lines	void DiagnosticInfoDontCall::print(DiagnosticPrinter &DP) const {
    DP << "call to " << demangle(getFunctionName()) << " marked \"dontcall-";
    if (getSeverity() == DiagnosticSeverity::DS_Error)
      DP << "error\"";
    else
      DP << "warn\"";
    if (!getNote().empty())
      DP << ": " << getNote();
  }

+ void DiagnosticInfoMachineFunctionSplit::print(DiagnosticPrinter &DP) const {
+   DP << "-fsplit-machine-functions is not valid for " << TargetTriple;
+ }
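
Taken together, the `doInitialization` hook, the early return in `runOnMachineFunction`, and the `print` method above amount to "warn once per module, then leave every function untouched". The following is a standalone C++ sketch of that control flow, with no LLVM dependencies. `SplitterModel` and `wouldConsider` are hypothetical names, and the string-based `isSupportedTriple` is only a rough stand-in for `Triple::isX86()`; the architecture spellings it accepts are an assumption, not LLVM's full list.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for llvm::Triple::isX86(): only the architecture
// component (the text before the first '-') is inspected. The accepted
// spellings are an assumption made for illustration.
bool isSupportedTriple(const std::string &Triple) {
  const std::string Arch = Triple.substr(0, Triple.find('-'));
  return Arch == "x86_64" || Arch == "i386" || Arch == "i686";
}

// Toy model of the pass state: one warning per module, after which
// every function is skipped.
struct SplitterModel {
  bool UnsupportedTriple = false;
  std::vector<std::string> Diagnostics;

  // Mirrors doInitialization: record the warning and remember the result.
  void initialize(const std::string &Triple) {
    if (!isSupportedTriple(Triple)) {
      UnsupportedTriple = true;
      Diagnostics.push_back(
          "warning: -fsplit-machine-functions is not valid for " + Triple);
    }
  }

  // Mirrors the early exit in runOnMachineFunction: true only when a
  // function would be considered for splitting at all.
  bool wouldConsider() const { return !UnsupportedTriple; }
};
```

Calling `initialize("nvptx64-nvidia-cuda")` on a fresh `SplitterModel` records a single warning and makes `wouldConsider()` return `false`, which is the behaviour the NVPTX driver tests look for by matching the warning text.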

llvm/test/CodeGen/Generic/Inputs/fsloader-mfs.afdo

This binary file was added.

llvm/test/CodeGen/Generic/machine-function-splitter.ll

  ; REQUIRES: x86-registered-target

  ; COM: Machine function splitting with FDO profiles

  ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions | FileCheck %s -check-prefixes=MFS-DEFAULTS,MFS-DEFAULTS-X86

  ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-psi-cutoff=0 -mfs-count-threshold=2000 | FileCheck %s --dump-input=always -check-prefixes=MFS-OPTS1,MFS-OPTS1-X86

  ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-psi-cutoff=950000 | FileCheck %s -check-prefixes=MFS-OPTS2,MFS-OPTS2-X86

  ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-split-ehcode | FileCheck %s -check-prefixes=MFS-EH-SPLIT,MFS-EH-SPLIT-X86

  ; COM: Machine function splitting with AFDO profiles

  ; RUN: sed 's/InstrProf/SampleProfile/g' %s > %t.ll

  ; RUN: llc < %t.ll -mtriple=x86_64-unknown-linux-gnu -split-machine-functions | FileCheck %s --check-prefix=FSAFDO-MFS

  ; RUN: llc < %t.ll -mtriple=x86_64-unknown-linux-gnu -split-machine-functions | FileCheck %s --check-prefix=FSAFDO-MFS2

+ ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -debug-pass=Structure -fs-profile-file=%S/Inputs/fsloader-mfs.afdo -enable-fs-discriminator=true -improved-fs-discriminator=true -split-machine-functions 2>&1 | FileCheck %s --check-prefix=MFS_ON

+ ; RUN: llc < %s -mtriple=aarch64-unknown-linux-gnu -debug-pass=Structure -fs-profile-file=%S/Inputs/fsloader-mfs.afdo -enable-fs-discriminator=true -improved-fs-discriminator=true -split-machine-functions 2>&1 | FileCheck %s --check-prefix=MFS_OFF

+ ;; Check that MFS is on for X86 targets.

+ ; MFS_ON: Machine Function Splitter Transformation

+ ; MFS_ON_NO: warning: -fsplit-machine-functions is not valid for

Hahnfeld (inline comment, suggested edit):

  ; MFS_ON: Machine Function Splitter Transformation
- ; MFS_ON_NO: warning: -fsplit-machine-functions is not valid for
+ ; MFS_ON-NOT: warning: -fsplit-machine-functions is not valid for
  ;; Check that MFS is not on for non-X86 targets.

Hahnfeld: shouldn't this be `MFS_ON-NOT`?
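
The reason the suffix matters: FileCheck only treats a line as a directive when the check prefix is followed directly by `:` or by a recognized suffix such as `-NOT:`. A line starting with `MFS_ON_NO:` matches neither form for the prefix `MFS_ON`, so it is skipped and checks nothing. The following C++ sketch illustrates that rule in a heavily simplified form. `Directive` and `classify` are hypothetical names, only two directive forms are modelled, and this is not FileCheck's actual parser.

```cpp
#include <string>

// Directive kinds modelled by this sketch; real FileCheck has more
// (for example -NEXT, -SAME, -DAG, -LABEL).
enum class Directive { None, Plain, Not };

// Hypothetical, simplified classifier: finds the first occurrence of
// Prefix in Line and inspects the characters immediately after it.
Directive classify(const std::string &Line, const std::string &Prefix) {
  const std::string::size_type Pos = Line.find(Prefix);
  if (Pos == std::string::npos)
    return Directive::None;
  const std::string Rest = Line.substr(Pos + Prefix.size());
  if (Rest.rfind(":", 0) == 0) // Rest starts with ":"
    return Directive::Plain;
  if (Rest.rfind("-NOT:", 0) == 0) // Rest starts with "-NOT:"
    return Directive::Not;
  return Directive::None; // e.g. "_NO:" is not a recognized suffix
}
```

With the prefix `MFS_ON`, this classifies `; MFS_ON_NO: warning: ...` as `Directive::None` and `; MFS_ON-NOT: warning: ...` as `Directive::Not`, which is the difference the suggested edit addresses.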

+ ;; Check that MFS is not on for non-X86 targets.

+ ; MFS_OFF: warning: -fsplit-machine-functions is not valid for

+ ; MFS_OFF_NO: Machine Function Splitter Transformation

Hahnfeld (inline comment, suggested edit):

  ; MFS_OFF: warning: -fsplit-machine-functions is not valid for
- ; MFS_OFF_NO: Machine Function Splitter Transformation
+ ; MFS_OFF-NOT: Machine Function Splitter Transformation
  define void @foo1(i1 zeroext %0) nounwind !prof !14 !section_prefix !15 {

  define void @foo1(i1 zeroext %0) nounwind !prof !14 !section_prefix !15 {

  ;; Check that cold block is moved to .text.split.

  ; MFS-DEFAULTS-LABEL: foo1

  ; MFS-DEFAULTS: .section .text.split.foo1

  ; MFS-DEFAULTS-NEXT: foo1.cold:

  ; MFS-DEFAULTS-X86-NOT: callq bar

  ; MFS-DEFAULTS-X86-NEXT: callq baz

▲ Show 20 Lines • Show All 459 Lines • Show Last 20 Lines


Diff 551017
