This is an archive of the discontinued LLVM Phabricator instance.

Differential D53524

[ThinLTO] Enable LTOUnit only when it is needed
Abandoned, Public

Authored by tejohnson on Oct 22 2018, 1:09 PM.


Details

Reviewers

pcc

Summary

Currently, -flto-unit is specified whenever LTO options are used
(unless using the old LTO API). This causes vtable defs to be processed
using regular LTO, which is needed for CFI and whole program vtable
optimizations, since they need to modify the vtables in a whole program
manner.

However, this causes non-negligible overhead due to the regular
LTO processing. Since this isn't needed when not using CFI or
-fwhole-program-vtables, only enable -flto-unit in those cases.
Otherwise all ThinLTO compiles pay the overhead, even when not needed.
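
The decision described in the summary can be sketched as a small standalone predicate. This is only an illustration under stated assumptions: `CompileConfig`, `LTOKind`, and `shouldPassLTOUnit` are hypothetical names that stand in for the driver state, not part of Clang. The patch itself reports an error when `-flto-unit` is given without `-flto`; the sketch simply returns false in that case.

```cpp
#include <cassert> // used by the usage checks that accompany this sketch

// Hypothetical, simplified stand-ins for the driver state (not Clang types).
enum class LTOKind { None, Full, Thin };

struct CompileConfig {
  LTOKind LTOMode = LTOKind::None;
  bool ExplicitLTOUnit = false;     // -flto-unit was given
  bool WholeProgramVTables = false; // -fwhole-program-vtables was given
  bool SanitizerNeedsLTO = false;   // e.g. a CFI sanitizer is enabled
  bool UsesLegacyLTOAPI = false;    // e.g. the Darwin and PS4 linkers
};

// Returns true when the compile step should receive -flto-unit.
bool shouldPassLTOUnit(const CompileConfig &C) {
  if (C.LTOMode == LTOKind::None)
    return false;
  // The legacy LTO API supports LTO unit features only under full LTO.
  bool SupportsLTOUnit = !C.UsesLegacyLTOAPI || C.LTOMode == LTOKind::Full;
  bool Wanted =
      C.ExplicitLTOUnit || C.WholeProgramVTables || C.SanitizerNeedsLTO;
  return Wanted && SupportsLTOUnit;
}
```

With this shape, a plain ThinLTO compile no longer requests an LTO unit, while adding `-fwhole-program-vtables`, a CFI sanitizer, or an explicit `-flto-unit` does, subject to the legacy-API restriction.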

Diff Detail

Repository

rC Clang

Build Status

Buildable 24468
Build 24467: arc lint + arc unit

Event Timeline

tejohnson created this revision. Oct 22 2018, 1:09 PM

Harbormaster completed remote builds in B24021: Diff 170481. Oct 22 2018, 1:09 PM

Herald added subscribers: dexonsmith, steven_wu, inglorion, mehdi_amini. Oct 22 2018, 1:09 PM

The reason why LTO unit is always enabled is so that you can link translation units compiled with -fsanitize=cfi and/or -fwhole-program-vtables against translation units compiled without CFI/WPD. With this change we will see miscompiles in the translation units compiled with CFI/WPD if they use vtables in the translation units compiled without CFI/WPD. If we really need this option I think it should be an opt out.

This revision now requires changes to proceed. Oct 22 2018, 1:42 PM

In D53524#1271357, @pcc wrote:

The reason why LTO unit is always enabled is so that you can link translation units compiled with -fsanitize=cfi and/or -fwhole-program-vtables against translation units compiled without CFI/WPD. With this change we will see miscompiles in the translation units compiled with CFI/WPD if they use vtables in the translation units compiled without CFI/WPD. If we really need this option I think it should be an opt out.

Is there an important use case for supporting this mixing and matching? The issue is that it comes at a cost to all ThinLTO compiles for code with vtables by requiring them all to process IR during the thin link. Can we detect that TUs compiled with -flto-unit are being mixed with those built without -flto-unit at thin link time and issue an error?

In D53524#1271387, @tejohnson wrote:

In D53524#1271357, @pcc wrote:

The reason why LTO unit is always enabled is so that you can link translation units compiled with -fsanitize=cfi and/or -fwhole-program-vtables against translation units compiled without CFI/WPD. With this change we will see miscompiles in the translation units compiled with CFI/WPD if they use vtables in the translation units compiled without CFI/WPD. If we really need this option I think it should be an opt out.

Is there an important use case for supporting this mixing and matching? The issue is that it comes at a cost to all ThinLTO compiles for code with vtables by requiring them all to process IR during the thin link.

Ping on the question of why this mode needs to be default. If it was a matter of a few percent overhead that would be one thing, but we're talking a *huge* overhead (as noted off-patch for my app I'm seeing >20x thin link time currently, and with improvements to the hashing to always get successful splitting we could potentially get down to closer to 2x - still a big overhead). This kind of overhead should be opt-in. The average ThinLTO user is not going to realize they are paying a big overhead because CFI is always pre-enabled.

Can we detect that TUs compiled with -flto-unit are being mixed with those built without -flto-unit at thin link time and issue an error?

This would be doable pretty easily. E.g. add a flag at the index level that the module would have been split but wasn't. Users who get the error and want to support always-enabled CFI could opt in via -flto-unit.

In D53524#1274505, @tejohnson wrote:

In D53524#1271387, @tejohnson wrote:

In D53524#1271357, @pcc wrote:

The reason why LTO unit is always enabled is so that you can link translation units compiled with -fsanitize=cfi and/or -fwhole-program-vtables against translation units compiled without CFI/WPD. With this change we will see miscompiles in the translation units compiled with CFI/WPD if they use vtables in the translation units compiled without CFI/WPD. If we really need this option I think it should be an opt out.

Is there an important use case for supporting this mixing and matching? The issue is that it comes at a cost to all ThinLTO compiles for code with vtables by requiring them all to process IR during the thin link.

Ping on the question of why this mode needs to be default. If it was a matter of a few percent overhead that would be one thing, but we're talking a *huge* overhead (as noted off-patch for my app I'm seeing >20x thin link time currently, and with improvements to the hashing to always get successful splitting we could potentially get down to closer to 2x - still a big overhead). This kind of overhead should be opt-in. The average ThinLTO user is not going to realize they are paying a big overhead because CFI is always pre-enabled.

Well, the intent was always that the overhead would be minimal, which is why things are set up the way that they are. But it doesn't sound like anyone is going to have the time to fully address the performance problems that you've seen any time soon, so maybe it would be fine to introduce the -flto-unit flag. I guess we can always change the flag so that it has no effect if/when the performance problem is addressed.

Can we detect that TUs compiled with -flto-unit are being mixed with those built without -flto-unit at thin link time and issue an error?

This would be doable pretty easily. E.g. add a flag at the index level that the module would have been split but wasn't. Users who get the error and want to support always-enabled CFI could opt in via -flto-unit.

Yes. I don't think we should make a change like this unless there is something like that in place, though. The documentation (LTOVisibility.rst) needs to be updated too.

In D53524#1276038, @pcc wrote:

In D53524#1274505, @tejohnson wrote:

In D53524#1271387, @tejohnson wrote:

In D53524#1271357, @pcc wrote:

The reason why LTO unit is always enabled is so that you can link translation units compiled with -fsanitize=cfi and/or -fwhole-program-vtables against translation units compiled without CFI/WPD. With this change we will see miscompiles in the translation units compiled with CFI/WPD if they use vtables in the translation units compiled without CFI/WPD. If we really need this option I think it should be an opt out.

Is there an important use case for supporting this mixing and matching? The issue is that it comes at a cost to all ThinLTO compiles for code with vtables by requiring them all to process IR during the thin link.

Ping on the question of why this mode needs to be default. If it was a matter of a few percent overhead that would be one thing, but we're talking a *huge* overhead (as noted off-patch for my app I'm seeing >20x thin link time currently, and with improvements to the hashing to always get successful splitting we could potentially get down to closer to 2x - still a big overhead). This kind of overhead should be opt-in. The average ThinLTO user is not going to realize they are paying a big overhead because CFI is always pre-enabled.

Well, the intent was always that the overhead would be minimal, which is why things are set up the way that they are. But it doesn't sound like anyone is going to have the time to fully address the performance problems that you've seen any time soon, so maybe it would be fine to introduce the -flto-unit flag. I guess we can always change the flag so that it has no effect if/when the performance problem is addressed.

Just to clarify, since there is already a -flto-unit flag: it is currently a cc1 flag, did you want it made into a driver option as well?

Can we detect that TUs compiled with -flto-unit are being mixed with those built without -flto-unit at thin link time and issue an error?

This would be doable pretty easily. E.g. add a flag at the index level that the module would have been split but wasn't. Users who get the error and want to support always-enabled CFI could opt in via -flto-unit.

Yes. I don't think we should make a change like this unless there is something like that in place, though. The documentation (LTOVisibility.rst) needs to be updated too.

Ok, let me work on that now and we can get that in before this one.

In D53524#1279288, @tejohnson wrote:

In D53524#1276038, @pcc wrote:

In D53524#1274505, @tejohnson wrote:

In D53524#1271387, @tejohnson wrote:

In D53524#1271357, @pcc wrote:

The reason why LTO unit is always enabled is so that you can link translation units compiled with -fsanitize=cfi and/or -fwhole-program-vtables against translation units compiled without CFI/WPD. With this change we will see miscompiles in the translation units compiled with CFI/WPD if they use vtables in the translation units compiled without CFI/WPD. If we really need this option I think it should be an opt out.

Is there an important use case for supporting this mixing and matching? The issue is that it comes at a cost to all ThinLTO compiles for code with vtables by requiring them all to process IR during the thin link.

Ping on the question of why this mode needs to be default. If it was a matter of a few percent overhead that would be one thing, but we're talking a *huge* overhead (as noted off-patch for my app I'm seeing >20x thin link time currently, and with improvements to the hashing to always get successful splitting we could potentially get down to closer to 2x - still a big overhead). This kind of overhead should be opt-in. The average ThinLTO user is not going to realize they are paying a big overhead because CFI is always pre-enabled.

Well, the intent was always that the overhead would be minimal, which is why things are set up the way that they are. But it doesn't sound like anyone is going to have the time to fully address the performance problems that you've seen any time soon, so maybe it would be fine to introduce the -flto-unit flag. I guess we can always change the flag so that it has no effect if/when the performance problem is addressed.

Just to clarify, since there is already a -flto-unit flag: it is currently a cc1 flag, did you want it made into a driver option as well?

Yes, that's what I had in mind.

Can we detect that TUs compiled with -flto-unit are being mixed with those built without -flto-unit at thin link time and issue an error?

This would be doable pretty easily. E.g. add a flag at the index level that the module would have been split but wasn't. Users who get the error and want to support always-enabled CFI could opt in via -flto-unit.

Yes. I don't think we should make a change like this unless there is something like that in place, though. The documentation (LTOVisibility.rst) needs to be updated too.

Ok, let me work on that now and we can get that in before this one.

In D53524#1279288, @tejohnson wrote:

In D53524#1276038, @pcc wrote:

In D53524#1274505, @tejohnson wrote:

In D53524#1271387, @tejohnson wrote:

Can we detect that TUs compiled with -flto-unit are being mixed with those built without -flto-unit at thin link time and issue an error?

This would be doable pretty easily. E.g. add a flag at the index level that the module would have been split but wasn't. Users who get the error and want to support always-enabled CFI could opt in via -flto-unit.

Yes. I don't think we should make a change like this unless there is something like that in place, though. The documentation (LTOVisibility.rst) needs to be updated too.

Ok, let me work on that now and we can get that in before this one.

Mailed D53890 for this part.
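
The mixing check discussed in this thread could look roughly like the sketch below. It is an assumption-laden illustration, not the implementation in D53890: `ModuleSummary`, its `BuiltWithLTOUnit` field, and `findLTOUnitMismatch` are hypothetical names, and the sketch records only whether each module was built with an LTO unit rather than the "would have been split but wasn't" flag suggested above.

```cpp
#include <cassert> // used by the usage checks that accompany this sketch
#include <string>
#include <vector>

// Hypothetical per-module record read at thin link time (illustrative only).
struct ModuleSummary {
  std::string Path;
  bool BuiltWithLTOUnit; // recorded when the module was compiled
};

// Returns the path of the first module whose LTO unit setting differs from
// that of the first module, or an empty string if all modules agree.
std::string findLTOUnitMismatch(const std::vector<ModuleSummary> &Modules) {
  for (const ModuleSummary &M : Modules)
    if (M.BuiltWithLTOUnit != Modules.front().BuiltWithLTOUnit)
      return M.Path;
  return std::string();
}
```

A linker using such a check could report the returned path in an error and suggest rebuilding with a consistent `-flto-unit` setting.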

Address comments:
Promote -flto-unit to clang driver option (and test it)
Adjust LTOVisibility.rst to reflect change of default and new option.

Harbormaster completed remote builds in B24468: Diff 172181. Nov 1 2018, 11:06 AM

tejohnson added parent revisions: D53891: [LTO] Add option to enable LTOUnit splitting, and disable unless needed, D53890: [LTO] Record whether LTOUnit splitting is enabled in index. Nov 1 2018, 11:09 AM

pcc added inline comments. Nov 9 2018, 3:59 PM

docs/LTOVisibility.rst
Line 9: It's a little confusing to talk about "LTO units" as a property of a translation unit when there is only one LTO unit per linkage unit. I think this should say that an LTO unit is the subset of the linkage unit compiled with certain flags. Then in the rest of the document you can talk about translation units that are either part of or not part of the LTO unit.

Abandoned in favor of new approach in D53890/D53891.

Revision Contents

Path                                  Size
docs/LTOVisibility.rst                18 lines
include/clang/Driver/CC1Options.td    2 lines
include/clang/Driver/Options.td       3 lines
include/clang/Driver/SanitizerArgs.h  1 line
lib/Driver/SanitizerArgs.cpp          4 lines
lib/Driver/ToolChains/Clang.cpp       40 lines
test/Driver/lto-unit.c                21 lines

Diff 172181

docs/LTOVisibility.rst

 ==============
 LTO Visibility
 ==============

 LTO visibility is a property of an entity that specifies whether it can be
 referenced from outside the current LTO unit. A linkage unit is a set of
 translation units linked together into an executable or DSO, and a linkage
 unit's LTO unit is the subset of the linkage unit that is linked together
-using link-time optimization; in the case where LTO is not being used, the
-linkage unit's LTO unit is empty. Each linkage unit has only a single LTO unit.
+using link-time optimization; in the case where LTO units are not being used,
+the linkage unit's LTO unit is empty. Each linkage unit has only a single LTO
+unit.
+
+LTO units are produced during LTO compiles when also compiling with
+``-fwhole-program-vtables`` or control flow integrity (e.g.
+``-fsanitize=cfi-vcall`` and ``-fsanitize=cfi-mfcall``), or when LTO units
+are explicitly enabled (``-flto-unit``).

 The LTO visibility of a class is used by the compiler to determine which
 classes the whole-program devirtualization (``-fwhole-program-vtables``) and
 control flow integrity (``-fsanitize=cfi-vcall`` and ``-fsanitize=cfi-mfcall``)
 features apply to. These features use whole-program information, so they
 require the entire class hierarchy to be visible in order to work correctly.

 If any translation unit in the program uses either of the whole-program
 devirtualization or control flow integrity features, it is effectively an ODR
 violation to define a class with hidden LTO visibility in multiple linkage
 units. A class with public LTO visibility may be defined in multiple linkage
 units, but the tradeoff is that the whole-program devirtualization and
 control flow integrity features can only be applied to classes with hidden LTO
 visibility. A class's LTO visibility is treated as an ODR-relevant property
 of its definition, so it must be consistent between translation units.

-In translation units built with LTO, LTO visibility is based on the
+In translation units built with LTO units, LTO visibility is based on the
 class's symbol visibility as expressed at the source level (i.e. the
 ``__attribute__((visibility("...")))`` attribute, or the ``-fvisibility=``
 flag) or, on the Windows platform, the dllimport and dllexport attributes. When
 targeting non-Windows platforms, classes with a visibility other than hidden
 visibility receive public LTO visibility. When targeting Windows, classes
 with dllimport or dllexport attributes receive public LTO visibility. All
 other classes receive hidden LTO visibility. Classes with internal linkage
 (e.g. classes declared in unnamed namespaces) also receive hidden LTO
 visibility.

-A class defined in a translation unit built without LTO receives public
+A class defined in a translation unit built without LTO units receives public
 LTO visibility regardless of its object file visibility, linkage or other
 attributes.

 This mechanism will produce the correct result in most cases, but there are
 two cases where it may wrongly infer hidden LTO visibility.

 1. As a corollary of the above rules, if a linkage unit is produced from a
-   combination of LTO object files and non-LTO object files, any hidden
-   visibility class defined in both a translation unit built with LTO and
+   combination of LTO unit object files and non-LTO unit object files, any
+   hidden visibility class defined in both a translation unit built with LTO and
    a translation unit built without LTO must be defined with public LTO
    visibility in order to avoid an ODR violation.

 2. Some ABIs provide the ability to define an abstract base class without
    visibility attributes in multiple linkage units and have virtual calls
    to derived classes in other linkage units work correctly. One example of
    this is COM on Windows platforms. If the ABI allows this, any base class
    used in this way must be defined with public LTO visibility.
[Remaining 59 lines not shown]

include/clang/Driver/CC1Options.td

[First 339 lines not shown; context: def fprofile_instrument_path_EQ : Joined<["-"], "fprofile-instrument-path=">,]
   HelpText<"Generate instrumented code to collect execution counts into "
            "<file> (overridden by LLVM_PROFILE_FILE env var)">;
 def fprofile_instrument_use_path_EQ :
   Joined<["-"], "fprofile-instrument-use-path=">,
   HelpText<"Specify the profile path in PGO use compilation">;
 def flto_visibility_public_std:
   Flag<["-"], "flto-visibility-public-std">,
   HelpText<"Use public LTO visibility for classes in std and stdext namespaces">;
-def flto_unit: Flag<["-"], "flto-unit">,
-  HelpText<"Emit IR to support LTO unit features (CFI, whole program vtable opt)">;
 def fno_lto_unit: Flag<["-"], "fno-lto-unit">;
 def fthin_link_bitcode_EQ : Joined<["-"], "fthin-link-bitcode=">,
   HelpText<"Write minimized bitcode to <file> for the ThinLTO thin link only">;
 def fdebug_pass_manager : Flag<["-"], "fdebug-pass-manager">,
   HelpText<"Prints debug information for the new pass manager">;
 def fno_debug_pass_manager : Flag<["-"], "fno-debug-pass-manager">,
   HelpText<"Disables debug printing for the new pass manager">;

[Remaining 494 lines not shown]

include/clang/Driver/Options.td

[First 1,721 lines not shown]
 def fvisibility_ms_compat : Flag<["-"], "fvisibility-ms-compat">, Group<f_Group>,
   HelpText<"Give global types 'default' visibility and global functions and "
            "variables 'hidden' visibility by default">;
 def fwhole_program_vtables : Flag<["-"], "fwhole-program-vtables">, Group<f_Group>,
   Flags<[CoreOption, CC1Option]>,
   HelpText<"Enables whole-program vtable optimization. Requires -flto">;
 def fno_whole_program_vtables : Flag<["-"], "fno-whole-program-vtables">, Group<f_Group>,
   Flags<[CoreOption]>;
+def flto_unit: Flag<["-"], "flto-unit">, Group<f_Group>,
+  Flags<[CoreOption, CC1Option]>,
+  HelpText<"Emit IR to support LTO unit features (CFI, whole program vtable opt)">;
 def fforce_emit_vtables : Flag<["-"], "fforce-emit-vtables">, Group<f_Group>,
   Flags<[CC1Option]>,
   HelpText<"Emits more virtual tables to improve devirtualization">;
 def fno_force_emit_vtables : Flag<["-"], "fno-force-emit-vtables">, Group<f_Group>,
   Flags<[CoreOption]>;
 def fwrapv : Flag<["-"], "fwrapv">, Group<f_Group>, Flags<[CC1Option]>,
   HelpText<"Treat signed integer overflow as two's complement">;
 def fwritable_strings : Flag<["-"], "fwritable-strings">, Group<f_Group>, Flags<[CC1Option]>,
[Remaining 1,319 lines not shown]

include/clang/Driver/SanitizerArgs.h

[First 72 lines not shown; enclosing context: public:]
   bool needsStatsRt() const { return Stats; }
   bool needsEsanRt() const {
     return Sanitizers.hasOneOf(SanitizerKind::Efficiency);
   }
   bool needsScudoRt() const { return Sanitizers.has(SanitizerKind::Scudo); }

   bool requiresPIE() const;
   bool needsUnwindTables() const;
+  bool needsLTO() const;
   bool linkCXXRuntimes() const { return LinkCXXRuntimes; }
   bool hasCrossDsoCfi() const { return CfiCrossDso; }
   void addArgs(const ToolChain &TC, const llvm::opt::ArgList &Args,
                llvm::opt::ArgStringList &CmdArgs, types::ID InputType) const;
 };

 } // namespace driver
 } // namespace clang

 #endif

lib/Driver/SanitizerArgs.cpp

[First 201 lines not shown]
 bool SanitizerArgs::requiresPIE() const {
   return NeedPIE || (Sanitizers.Mask & RequiresPIE);
 }

 bool SanitizerArgs::needsUnwindTables() const {
   return Sanitizers.Mask & NeedsUnwindTables;
 }

+bool SanitizerArgs::needsLTO() const {
+  return Sanitizers.Mask & NeedsLTO;
+}
+
 SanitizerArgs::SanitizerArgs(const ToolChain &TC,
                              const llvm::opt::ArgList &Args) {
   SanitizerMask AllRemove = 0; // During the loop below, the accumulated set of
                                // sanitizers disabled by the current sanitizer
                                // argument or any argument after it.
   SanitizerMask AllAddedKinds = 0; // Mask of all sanitizers ever enabled by
                                    // -fsanitize= flags (directly or via group
                                    // expansion), some of which may be disabled
[Remaining 806 lines not shown]

lib/Driver/ToolChains/Clang.cpp

[First 3,358 lines not shown; enclosing context: if (isa<AnalyzeJobAction>(JA)) {]
     }

     // Preserve use-list order by default when emitting bitcode, so that
     // loading the bitcode up in 'opt' or 'llc' and running passes gives the
     // same result as running passes here. For LTO, we don't need to preserve
     // the use-list order, since serialization to bitcode is part of the flow.
     if (JA.getType() == types::TY_LLVM_BC)
       CmdArgs.push_back("-emit-llvm-uselists");
-
-    // Device-side jobs do not support LTO.
-    bool isDeviceOffloadAction = !(JA.isDeviceOffloading(Action::OFK_None) ||
-                                   JA.isDeviceOffloading(Action::OFK_Host));
-
-    if (D.isUsingLTO() && !isDeviceOffloadAction) {
-      Args.AddLastArg(CmdArgs, options::OPT_flto, options::OPT_flto_EQ);
-
-      // The Darwin and PS4 linkers currently use the legacy LTO API, which
-      // does not support LTO unit features (CFI, whole program vtable opt)
-      // under ThinLTO.
-      if (!(RawTriple.isOSDarwin() || RawTriple.isPS4()) ||
-          D.getLTOMode() == LTOK_Full)
-        CmdArgs.push_back("-flto-unit");
-    }
   }

   if (const Arg *A = Args.getLastArg(options::OPT_fthinlto_index_EQ)) {
     if (!types::isLLVMIR(Input.getType()))
       D.Diag(diag::err_drv_argument_only_allowed_with) << A->getAsString(Args)
                                                        << "-x ir";
     Args.AddLastArg(CmdArgs, options::OPT_fthinlto_index_EQ);
   }
[1,585 lines not shown]
   if (WholeProgramVTables) {
     if (!D.isUsingLTO())
       D.Diag(diag::err_drv_argument_only_allowed_with)
           << "-fwhole-program-vtables"
           << "-flto";
     CmdArgs.push_back("-fwhole-program-vtables");
   }

+  bool LTOUnit = Args.hasArg(options::OPT_flto_unit);
+  if (LTOUnit && !D.isUsingLTO())
+    D.Diag(diag::err_drv_argument_only_allowed_with) << "-flto-unit"
+                                                     << "-flto";
+
+  // Device-side jobs do not support LTO.
+  bool isDeviceOffloadAction = !(JA.isDeviceOffloading(Action::OFK_None) ||
+                                 JA.isDeviceOffloading(Action::OFK_Host));
+
+  if (D.isUsingLTO() &&
+      (isa<CompileJobAction>(JA) || isa<BackendJobAction>(JA)) &&
+      !isDeviceOffloadAction) {
+    Args.AddLastArg(CmdArgs, options::OPT_flto, options::OPT_flto_EQ);
+
+    // Enable LTO unit if need for CFI or whole program vtable optimization.
+    // The Darwin and PS4 linkers currently use the legacy LTO API, which
+    // does not support LTO unit features (CFI, whole program vtable opt)
+    // under ThinLTO.
+    bool SupportsLTOUnit = !(RawTriple.isOSDarwin() || RawTriple.isPS4()) ||
+                           D.getLTOMode() == LTOK_Full;
+    if ((LTOUnit || WholeProgramVTables || Sanitize.needsLTO()) &&
+        SupportsLTOUnit)
+      CmdArgs.push_back("-flto-unit");
+  }
+
   if (Arg *A = Args.getLastArg(options::OPT_fexperimental_isel,
                                options::OPT_fno_experimental_isel)) {
     CmdArgs.push_back("-mllvm");
     if (A->getOption().matches(options::OPT_fexperimental_isel)) {
       CmdArgs.push_back("-global-isel=1");

       // GISel is on by default on AArch64 -O0, so don't bother adding
       // the fallback remarks for it. Other combinations will add a warning of
[Remaining 937 lines not shown]

test/Driver/lto-unit.c

-// RUN: %clang -target x86_64-unknown-linux -### %s -flto=full 2>&1 | FileCheck --check-prefix=UNIT %s
-// RUN: %clang -target x86_64-unknown-linux -### %s -flto=thin 2>&1 | FileCheck --check-prefix=UNIT %s
-// RUN: %clang -target x86_64-apple-darwin13.3.0 -### %s -flto=full 2>&1 | FileCheck --check-prefix=UNIT %s
-// RUN: %clang -target x86_64-apple-darwin13.3.0 -### %s -flto=thin 2>&1 | FileCheck --check-prefix=NOUNIT %s
-// RUN: %clang -target x86_64-scei-ps4 -### %s -flto=full 2>&1 | FileCheck --check-prefix=UNIT %s
-// RUN: %clang -target x86_64-scei-ps4 -### %s -flto=thin 2>&1 | FileCheck --check-prefix=NOUNIT %s
+// RUN: %clang -target x86_64-unknown-linux -### %s -flto=full -fwhole-program-vtables 2>&1 | FileCheck --check-prefix=UNIT %s
+// RUN: %clang -target x86_64-unknown-linux -### %s -flto=thin -fwhole-program-vtables 2>&1 | FileCheck --check-prefix=UNIT %s
+// RUN: %clang -target x86_64-unknown-linux -### %s -flto=full -flto-unit 2>&1 | FileCheck --check-prefix=UNIT %s
+// RUN: %clang -target x86_64-unknown-linux -### %s -flto=thin -flto-unit 2>&1 | FileCheck --check-prefix=UNIT %s
+// RUN: %clang -target x86_64-apple-darwin13.3.0 -### %s -flto=full -fwhole-program-vtables 2>&1 | FileCheck --check-prefix=UNIT %s
+// RUN: %clang -target x86_64-apple-darwin13.3.0 -### %s -flto=thin -fwhole-program-vtables 2>&1 | FileCheck --check-prefix=NOUNIT %s
+// RUN: %clang -target x86_64-apple-darwin13.3.0 -### %s -flto=full -flto-unit 2>&1 | FileCheck --check-prefix=UNIT %s
+// RUN: %clang -target x86_64-apple-darwin13.3.0 -### %s -flto=thin -flto-unit 2>&1 | FileCheck --check-prefix=NOUNIT %s
+// RUN: %clang -target x86_64-scei-ps4 -### %s -flto=full -fwhole-program-vtables 2>&1 | FileCheck --check-prefix=UNIT %s
+// RUN: %clang -target x86_64-scei-ps4 -### %s -flto=thin -fwhole-program-vtables 2>&1 | FileCheck --check-prefix=NOUNIT %s
+// RUN: %clang -target x86_64-scei-ps4 -### %s -flto=full -flto-unit 2>&1 | FileCheck --check-prefix=UNIT %s
+// RUN: %clang -target x86_64-scei-ps4 -### %s -flto=thin -flto-unit 2>&1 | FileCheck --check-prefix=NOUNIT %s

 // UNIT: "-flto-unit"
 // NOUNIT-NOT: "-flto-unit"
+
+// RUN: %clang -target x86_64-unknown-linux -### %s -flto-unit 2>&1 | FileCheck --check-prefix=NO-LTO %s
+// NO-LTO: invalid argument '-flto-unit' only allowed with '-flto'
