This is an archive of the discontinued LLVM Phabricator instance.

Differential D42453

Use branch funnels for virtual calls when retpoline mitigation is enabled.
ClosedPublic

Authored by pcc on Jan 23 2018, 5:08 PM.

Details

Reviewers

eugenis
vlad.tsyrklevich
chandlerc

Commits

rG2974856ad432: Use branch funnels for virtual calls when retpoline mitigation is enabled.
rCRT327163: Use branch funnels for virtual calls when retpoline mitigation is enabled.
rL327163: Use branch funnels for virtual calls when retpoline mitigation is enabled.

Summary

The retpoline mitigation for variant 2 of CVE-2017-5715 inhibits the
branch predictor, and as a result it can lead to a measurable loss of
performance. We can reduce the performance impact of retpolined virtual
calls by replacing them with a special construct known as a branch
funnel, which is an instruction sequence that implements virtual calls
to a set of known targets using a binary tree of direct branches. This
allows the processor to speculatively execute valid implementations of the
virtual function without allowing for speculative execution of calls
to arbitrary addresses.
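
As a rough illustration of the idea (not the code this patch generates), a three-target branch funnel can be modelled in C++ as a comparison against the middle vtable address followed by direct calls. The function names, the `kVTable*` addresses, and `funnel3` are all hypothetical and exist only for this sketch; a real funnel is emitted as machine code and tail-jumps to the target rather than calling it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical implementations of one virtual function in three classes.
int implA(int X) { return X + 1; }
int implB(int X) { return X * 2; }
int implC(int X) { return X - 3; }

// Hypothetical vtable addresses, assumed to be laid out in ascending order.
constexpr std::uintptr_t kVTableA = 0x1000;
constexpr std::uintptr_t kVTableB = 0x1020;
constexpr std::uintptr_t kVTableC = 0x1040;

// Models "cmp; jb target1; je target2; jmp target3": every callee is named
// directly in the source, so no call here goes through a function pointer.
// Assumes VTable is one of the three known addresses.
int funnel3(std::uintptr_t VTable, int Arg) {
  if (VTable < kVTableB)
    return implA(Arg); // below the middle entry
  if (VTable == kVTableB)
    return implB(Arg); // exactly the middle entry
  return implC(Arg);   // above the middle entry
}
```

With more targets the comparisons nest, giving the binary tree of direct branches that the summary describes.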

This patch extends the whole-program devirtualization pass to replace
certain virtual calls with calls to branch funnels, which are
represented using a new llvm.icall.jumptable intrinsic. It also extends
the LowerTypeTests pass to recognize the new intrinsic, generate code
for the branch funnels (x86_64 only for now) and lay out virtual tables
as required for each branch funnel.

The implementation supports full LTO as well as ThinLTO, and extends the
ThinLTO summary format used for whole-program devirtualization to
support branch funnels.

For more details see RFC:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120672.html

Diff Detail

Build Status

Buildable 15234
Build 15234: arc lint + arc unit

Event Timeline

pcc created this revision. Jan 23 2018, 5:08 PM

Harbormaster completed remote builds in B14167: Diff 131172. Jan 23 2018, 5:08 PM

Herald added subscribers: mgrang, hiraditya, Prazek, mehdi_amini. Jan 23 2018, 5:08 PM

chandlerc added inline comments. Jan 23 2018, 6:00 PM

llvm/include/llvm/IR/Intrinsics.td
877	A jump table usually refers to an indirect jump using a table of addresses... Maybe "search tree" or "branch funnel" or some other term would be more clear here and elsewhere?
llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	Emitting all of this with (I assume) inline assembly seems like a really messy design. It makes all of this entirely x86-specific, but this part is completely independent of architecture. Why not emit LLVM IR? Is there no pattern of IR that actually lowers to reasonable branches? I find that a little bit surprising, but maybe we should just fix the lowering in that case?

pcc added inline comments. Jan 23 2018, 8:26 PM

llvm/include/llvm/IR/Intrinsics.td
877	Yes, I'm a little inconsistent in the terminology in this patch. I'll go through the patch and try to consistently use "branch funnel".
llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	The main reason is that we need to make sure that whatever code we generate conforms with the calling convention used for the virtual call. At this point there is no guarantee that we have correct information about the calling convention to be used here because the "source of truth" for the calling convention is the call site itself, and with ThinLTO there may not be a call site in the current module. We cannot even rely on the function prototype with ThinLTO because we may have discarded the prototype by this point. Furthermore there doesn't seem to be a point in preserving the calling convention because no matter what it is, we will always want the same code. That is what led me to the conclusion that what we need here is an IR construct that represents the notion of a branch funnel call to a function with an unknown prototype -- that is what the `llvm.icall.jumptable` intrinsic is. I considered putting the implementation of this intrinsic in the backend like a regular intrinsic, but one of the problems with that is that one of the things that we need to be able to do is stitch together multiple global variables into a single global in order to implement the branch funnel as a binary search, and it is too late to do that in the backend. The LowerTypeTests pass was a convenient place to do it (since it already needs to know how to do it in order to implement `llvm.type.test`) but admittedly it may not be the best place. It occurred to me that it may be possible to implement the stitching together of globals in a pre-isel pass and the rest of `llvm.icall.jumptable` in the backend, but I think that's something that's best considered for a followup patch since it may have implications for how we implement `llvm.type.test` as well.

chandlerc added inline comments. Jan 23 2018, 8:40 PM

llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	The more I think about this the more I think that implementing this with inline asm is just the wrong design. I understand the challenges you're hitting here, but I think we should solve them by lowering in the backend, not in the IR. The IR really can't model the kinds of things you're doing (and it shouldn't!). My suggestion would be to change the `llvm.icall.<mumble>` intrinsic or add a during-lowering intrinsic that lets you prepare the global variable stuff as necessary in the IR pass, and hand that prepared information cleanly (and abstractly) to the backend where you can effectively lower it to branch instructions. Since you are (notionally) lowering a call, you may even be able to use the intrinsic all the way into the code generator and just lower this manually with an MI pass. Another aspect that this will improve is that you shouldn't need to embed things into naked functions at all. This should be something that you can just expand wherever it is needed into the code.

pcc added inline comments. Jan 24 2018, 11:50 AM

llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	I'm not sure that I understand your proposal. Are you proposing that in the WholeProgramDevirt pass we would transform each virtual call site into an intrinsic call with the list of possible targets, and then in the backend we would lower the intrinsic call into a call to a branch funnel "function" together with its definition?

chandlerc added inline comments. Jan 24 2018, 6:59 PM

llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	I'm imagining in the backend, we would lower the intrinsic call into an inline branch funnel across the destinations. (I can also imagine using a pseudo instead of call to an intrinsic, but as you want to capture a call with some specific calling convention / signature / etc, using a call to an intrinsic seems a reasonable representation... if needed, you could even put the potential destinations into an operand bundle so that the formal argument list is actually the argument list which should be used for calling the target function.)

pcc added inline comments. Jan 24 2018, 7:54 PM

llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	Are you sure you want an inline branch funnel? It will make the code more "branchy", which doesn't seem great for code size. i.e. the two options are

    call branchfunnel
    branchfunnel:
      cmp ...
      jb target1
      je target2
      jmp target3

and

      cmp ...
      jb .Ltmp1
      je .Ltmp2
      call target3
      jmp .Ltmp3
    .Ltmp1:
      call target1
      jmp .Ltmp3
    .Ltmp2:
      call target2
    .Ltmp3:

Putting that aside, there are problems with the idea of putting the list of targets in every intrinsic. Most importantly, it complicates matters significantly for ThinLTO. It means that each backend job would need to know the list of targets as well as the layout of the combined global that stores the vtables. We don't need that for any other purpose, so we would need to include that information in the summary just to support this. Because we are including the information in the summary, it means that any change to the class hierarchy would cause a ThinLTO cache miss. (Right now, in many cases, we can avoid a cache miss as a result of careful summary design.) In other words, this proposal would be making the code more complicated and less efficient. So it doesn't seem like a good direction to me.

I think we can both agree that lowering to inline asm isn't great. But I see a different way for this code to evolve so that it is properly layered. Essentially we would invent a new top-level entity (like a function or a global variable) that represents a thunk. There is already a need for such a top-level entity to represent CFI jump tables (currently we represent them in the same hackish way using inline asm), so to start with there would be two kinds of thunks: jump tables and branch funnels. Target specific code in the backend would lower the thunks to MI or MCInstrs.

This would have a few advantages which would be shared with your intrinsic proposal: the IR would be somewhat platform independent until the backend, thus it can be easily optimized by midlevel passes, the backend would be able to choose whether to emit the branch funnel inline or outline, and most importantly for ThinLTO: it would support separate compilation. Please let me know what you think.

tschuett added a subscriber: tschuett. Jan 24 2018, 11:54 PM

Sorry for the delay, took me a bit to internalize what you were proposing.

llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	Having separate inline branch funnels may actually be nice when there are very few targets as they may be better predicted. There is a size/speed trade-off here. However, the separate compilation issue is very compelling.

I think you can use an intrinsic to implement your proposal and get the advantages you describe without any new top-level construct: You could define / declare a thunk IR function which contains just this intrinsic. Only one TU gets the definition which calls the intrinsic and needs the global information. The others just get a declaration. No new top-level construct, and you lower it exactly as you describe. I'm not sure if this will directly map to the jump table case, but I'm happy for that to be dealt with in a follow-up as that is already in tree.

If this idea doesn't work, let's chat about what alternatives would work or how to model the top-level entity. I feel pretty strongly that the current code with the current amount of inline assembly is really not up to the quality that should go into the tree even as a temporary thing. There doesn't seem to be such urgency here that we can't take the time to engineer this the right way.

Also: We didn't actually clean up the inline asm for the CFI jump tables despite those being in-tree for a long time, so I think it is reasonable to insist we don't make things worse. This is a much more significant usage of inline asm, and so it seems much less reasonable as a temporary solution.

pcc added inline comments. Feb 5 2018, 9:08 PM

llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	> You could define / declare a thunk IR function which contains just this intrinsic. Only one TU gets the definition which calls the intrinsic and needs the global information. The others just get a declaration. No new top-level construct, and you lower it exactly as you describe.

The problem with that is: what would be the signature/calling convention of that function? That is the problem that I described on the first comment on this thread.

> If this idea doesn't work, let's chat about what alternatives would work or how to model the top-level entity. I feel pretty strongly that the current code with the current amount of inline assembly is really not up to the quality that should go into the tree even as a temporary thing. There doesn't seem to be such urgency here that we can't take the time to engineer this the right way. Also: We didn't actually clean up the inline asm for the CFI jump tables despite those being in-tree for a long time, so I think it is reasonable to insist we don't make things worse. This is a much more significant usage of inline asm, and so it seems much less reasonable as a temporary solution.

Okay, fair enough. Assuming that you agree with my objection above, I will try to sketch out a more detailed plan for how I think the new top-level entity should look in the IR.

echristo added inline comments. Feb 8 2018, 11:28 AM

llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	I think you could count it as a naked function yes? I don't see a reason for a top level "thunk" entity in the IR necessarily.

First draft of MI-based lowering -- tests to come

Harbormaster completed remote builds in B15234: Diff 135173. Feb 20 2018, 4:55 PM

pcc added inline comments. Feb 20 2018, 5:12 PM

llvm/lib/Transforms/IPO/LowerTypeTests.cpp
1086–1088	The main issue was that the thunk needs to be calling-convention-independent, which I thought was unrepresentable in IR, but it turns out that we can base it on a representation which is apparently used for thunks on Windows. That representation is to mark the thunk as varargs and mark the call as musttail. That is what this patch implements.

@chandlerc ping -- let me know whether this MI-based implementation seems reasonable to you.

In D42453#1021513, @pcc wrote:

@chandlerc ping -- let me know whether this MI-based implementation seems reasonable to you.

Thanks for the ping, I missed this while on vacation last week. Will have feedback shortly....

I've not looked at all of the stuff yet, but focusing on the MI and lowering, yeah, this looks exactly like the direction I was imagining. Do you want me to go ahead and do a full review, or do you have other cleanups you need to make first? Were there any problems you ran into with this approach (other than the need to wire everything up here) that still need to be sorted out or that I can help with?

The code is ready for review, but the tests still need to be updated. Up to you whether you'd like to take a look now or wait for the tests.

Did a pass over the code. Really only minor nits and naming questions. Generally, the structure and such makes a lot of sense to me now. Thanks so much for working on threading this through all the layers so nicely!

Happy to take another look when you get tests and such updated.

llvm/include/llvm/IR/ModuleSummaryIndex.h
607–608	Is it really a jump table though? You use this term throughout, so only commenting here, but I wonder if there is a better term that makes it more clear what is going on here. I'm worried people will assume this is actually implemented with a jump table as opposed to a branch funnel (or binary-ish search).
llvm/include/llvm/Target/Target.td
1146	Does it make sense to put a comment string here to make reading the dump of MI easier? (I'm not sure, genuine question here.)
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6049	SmallVector? Seems likely to have lots of cases where 8 or 16 would handle..
6064	Same question here.
6070–6071	Is it worth teaching the verifier about this and the other invariants below? Could then make this just an assert. Minor point though, and would be fine as a follow-up.
llvm/lib/Transforms/IPO/LowerTypeTests.cpp
297	nit: I would have capitalized `Icall` as `ICall` throughout.

This revision now requires changes to proceed. Mar 2 2018, 2:23 AM

- Update tests and fix a couple of benign bugs in the code
- SmallVector
- Renaming

Harbormaster completed remote builds in B15813: Diff 137512. Mar 7 2018, 5:02 PM

pcc added inline comments. Mar 7 2018, 5:02 PM

llvm/include/llvm/IR/ModuleSummaryIndex.h
607–608	Sorry, that was one of the things that I forgot to update in the code. Renamed to "branch funnel".
llvm/include/llvm/Target/Target.td
1146	No, the MI dump already says "ICALL_BRANCH_FUNNEL". This string is used in the asm output for instructions that survive until MC, and this one doesn't.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6070–6071	This particular check can't be a verifier check because the wholeprogramdevirt pass creates IR that would not pass this check (it is later fixed up by lowertypetests). The others would be fine as verifier checks and we might do that as a followup.

This looks really, really nice. Thanks for all the hard work here!

llvm/include/llvm/Target/Target.td
1146	Ah, nice. Thanks for checking!

This revision is now accepted and ready to land. Mar 8 2018, 4:03 PM

Closed by commit rL327163: Use branch funnels for virtual calls when retpoline mitigation is enabled. (authored by pcc). Mar 9 2018, 11:14 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: delcypher. Mar 9 2018, 11:14 AM

Revision Contents

Path	Size
compiler-rt/test/cfi/simple-pass.cpp	2 lines
llvm/include/llvm/ADT/PointerUnion.h	6 lines
llvm/include/llvm/CodeGen/TargetOpcodes.def	2 lines
llvm/include/llvm/IR/Intrinsics.td	4 lines
llvm/include/llvm/IR/ModuleSummaryIndex.h	2 lines
llvm/include/llvm/IR/ModuleSummaryIndexYAML.h	1 line
llvm/include/llvm/Target/Target.td	6 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	52 lines
llvm/lib/IR/Verifier.cpp	17 lines
llvm/lib/Target/X86/X86ExpandPseudo.cpp	102 lines
llvm/lib/Transforms/IPO/LowerTypeTests.cpp	115 lines
llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp	211 lines
llvm/test/Transforms/LowerTypeTests/icall-jumptable.ll	117 lines
llvm/test/Transforms/WholeProgramDevirt/Inputs/import-jumptable.yaml	11 lines
llvm/test/Transforms/WholeProgramDevirt/Inputs/import-vcp-jumptable.yaml	23 lines
llvm/test/Transforms/WholeProgramDevirt/import.ll	32 lines
llvm/test/Transforms/WholeProgramDevirt/jumptable.ll	157 lines

Diff 135173

compiler-rt/test/cfi/simple-pass.cpp

 // RUN: %clangxx_cfi -o %t %s
 // RUN: %run %t
+// RUN: %clangxx_cfi -mretpoline -o %t2 %s
+// RUN: %run %t2

 // Tests that the CFI mechanism does not crash the program when making various
 // kinds of valid calls involving classes with various different linkages and
 // types of inheritance, and both virtual and non-virtual member functions.

 #include "utils.h"

 struct A {
 [...]

llvm/include/llvm/ADT/PointerUnion.h

 [...] struct PointerLikeTypeTraits<PointerUnion3<PT1, PT2, PT3>> {

   // The number of bits available are the min of the two pointer types.
   enum {
     NumLowBitsAvailable = PointerLikeTypeTraits<
         typename PointerUnion3<PT1, PT2, PT3>::ValTy>::NumLowBitsAvailable
   };
 };

+template <typename PT1, typename PT2, typename PT3>
+bool operator<(PointerUnion3<PT1, PT2, PT3> lhs,
+               PointerUnion3<PT1, PT2, PT3> rhs) {
+  return lhs.getOpaqueValue() < rhs.getOpaqueValue();
+}

 /// A pointer union of four pointer types. See documentation for PointerUnion
 /// for usage.
 template <typename PT1, typename PT2, typename PT3, typename PT4>
 class PointerUnion4 {
 public:
   using InnerUnion1 = PointerUnion<PT1, PT2>;
   using InnerUnion2 = PointerUnion<PT3, PT4>;
   using ValTy = PointerUnion<InnerUnion1, InnerUnion2>;
 [...]
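
The `operator<` added above orders two unions by their opaque pointer values. One common reason to want such an ordering is to use the type as a key in ordered containers; whether that is the motivation in this patch is an assumption here. The sketch below uses a hypothetical `TinyUnion` class (not LLVM's `PointerUnion3`) to show the effect with `std::set`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a pointer union: holds either an int* or a long*
// as one opaque integer, using the low bit as a tag. Assumes both pointee
// types are at least 2-byte aligned so the low bit is free.
class TinyUnion {
  std::uintptr_t Val = 0;

public:
  explicit TinyUnion(int *P) : Val(reinterpret_cast<std::uintptr_t>(P)) {}
  explicit TinyUnion(long *P) : Val(reinterpret_cast<std::uintptr_t>(P) | 1) {}
  std::uintptr_t getOpaqueValue() const { return Val; }
};

// Same shape as the operator in the diff: compare the opaque values.
bool operator<(TinyUnion L, TinyUnion R) {
  return L.getOpaqueValue() < R.getOpaqueValue();
}

// Inserts A twice and B once; the set keeps one entry per distinct value.
std::size_t countDistinct(int *A, long *B) {
  std::set<TinyUnion> S;
  S.insert(TinyUnion(A));
  S.insert(TinyUnion(A)); // duplicate, ignored by the set
  S.insert(TinyUnion(B));
  return S.size();
}
```

The ordering is by address and tag, so it is stable within one run but carries no meaning beyond making the type usable as a key.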

llvm/include/llvm/CodeGen/TargetOpcodes.def

 [...]
 /// either before or after the tail exit. We use this as a disambiguation from
 /// PATCHABLE_RET which specifically only works for return instructions.
 HANDLE_TARGET_OPCODE(PATCHABLE_TAIL_CALL)

 /// Wraps a logging call and its arguments with nop sleds. At runtime, this can be
 /// patched to insert instrumentation instructions.
 HANDLE_TARGET_OPCODE(PATCHABLE_EVENT_CALL)

+HANDLE_TARGET_OPCODE(ICALL_JUMPTABLE)

 /// The following generic opcodes are not supposed to appear after ISel.
 /// This is something we might want to relax, but for now, this is convenient
 /// to produce diagnostics.

 /// Generic ADD instruction. This is an integer add.
 HANDLE_TARGET_OPCODE(G_ADD)
 HANDLE_TARGET_OPCODE_MARKER(PRE_ISEL_GENERIC_OPCODE_START, G_ADD)

 [...]

llvm/include/llvm/IR/Intrinsics.td

 [...]
 def int_type_test : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty, llvm_metadata_ty],
                               [IntrNoMem]>;

 // Safely loads a function pointer from a virtual table pointer using type metadata.
 def int_type_checked_load : Intrinsic<[llvm_ptr_ty, llvm_i1_ty],
                                       [llvm_ptr_ty, llvm_i32_ty, llvm_metadata_ty],
                                       [IntrNoMem]>;

+// Create a binary search jump table that implements an indirect call to a
+// limited set of callees. This needs to be a musttail call.
+def int_icall_jumptable : Intrinsic<[], [llvm_vararg_ty], []>;

 def int_load_relative: Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_anyint_ty],
                                  [IntrReadMem, IntrArgMemOnly]>;

 // Xray intrinsics
 //===----------------------------------------------------------------------===//
 // Custom event logging for x-ray.
 // Takes a pointer to a string and the length of the string.
 def int_xray_customevent : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty],
 [...]

llvm/include/llvm/IR/ModuleSummaryIndex.h

 [...] struct TypeTestResolution {
   uint8_t BitMask = 0;
   uint64_t InlineBits = 0;
 };

 struct WholeProgramDevirtResolution {
   enum Kind {
     Indir,      ///< Just do a regular virtual call
     SingleImpl, ///< Single implementation devirtualization
+    JumpTable,  ///< When retpoline mitigation is enabled, use a jump table that
+                ///< is defined in the merged module. Otherwise same as Indir.
   } TheKind = Indir;

   std::string SingleImplName;

   struct ByArg {
     enum Kind {
       Indir,         ///< Just do a regular virtual call
       UniformRetVal, ///< Uniform return value optimization
 [...]

llvm/include/llvm/IR/ModuleSummaryIndexYAML.h

 [...] static void output(
     }
   }
 };

 template <> struct ScalarEnumerationTraits<WholeProgramDevirtResolution::Kind> {
   static void enumeration(IO &io, WholeProgramDevirtResolution::Kind &value) {
     io.enumCase(value, "Indir", WholeProgramDevirtResolution::Indir);
     io.enumCase(value, "SingleImpl", WholeProgramDevirtResolution::SingleImpl);
+    io.enumCase(value, "JumpTable", WholeProgramDevirtResolution::JumpTable);
   }
 };

 template <> struct MappingTraits<WholeProgramDevirtResolution> {
   static void mapping(IO &io, WholeProgramDevirtResolution &res) {
     io.mapOptional("Kind", res.TheKind);
     io.mapOptional("SingleImplName", res.SingleImplName);
     io.mapOptional("ResByArg", res.ResByArg);
 [...]

llvm/include/llvm/Target/Target.td

 [...] def FENTRY_CALL : StandardPseudoInstruction {
   let OutOperandList = (outs unknown:$dst);
   let InOperandList = (ins variable_ops);
   let AsmString = "# FEntry call";
   let usesCustomInserter = 1;
   let mayLoad = 1;
   let mayStore = 1;
   let hasSideEffects = 1;
 }
+def ICALL_JUMPTABLE : StandardPseudoInstruction {
+  let OutOperandList = (outs unknown:$dst);
+  let InOperandList = (ins variable_ops);
+  let AsmString = "";
+  let hasSideEffects = 1;
+}

 // Generic opcodes used in GlobalISel.
 include "llvm/Target/GenericOpcodes.td"

 //===----------------------------------------------------------------------===//
 // AsmParser - This class can be implemented by targets that wish to implement
 // .s file parsing.
 //
 [...]

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

	Show First 20 Lines • Show All 6,038 Lines • ▼ Show 20 Lines
	case Intrinsic::experimental_vector_reduce_smax:			case Intrinsic::experimental_vector_reduce_smax:
	case Intrinsic::experimental_vector_reduce_smin:			case Intrinsic::experimental_vector_reduce_smin:
	case Intrinsic::experimental_vector_reduce_umax:			case Intrinsic::experimental_vector_reduce_umax:
	case Intrinsic::experimental_vector_reduce_umin:			case Intrinsic::experimental_vector_reduce_umin:
	case Intrinsic::experimental_vector_reduce_fmax:			case Intrinsic::experimental_vector_reduce_fmax:
	case Intrinsic::experimental_vector_reduce_fmin:			case Intrinsic::experimental_vector_reduce_fmin:
	visitVectorReduce(I, Intrinsic);			visitVectorReduce(I, Intrinsic);
	return nullptr;			return nullptr;

				case Intrinsic::icall_jumptable: {
				std::vector<SDValue> Ops;
				chandlercUnsubmitted Done Reply Inline Actions SmallVector? Seems likely to have lots of cases where 8 or 16 would handle.. chandlerc: SmallVector? Seems likely to have lots of cases where 8 or 16 would handle..
				Ops.push_back(DAG.getRoot());
				Ops.push_back(getValue(I.getArgOperand(0)));

				int64_t Offset;
				auto *Base = dyn_cast<GlobalObject>(GetPointerBaseWithConstantOffset(
				I.getArgOperand(1), Offset, DAG.getDataLayout()));
				if (!Base)
				report_fatal_error("llvm.icall.jumptable operand must be a GlobalValue");
				Ops.push_back(DAG.getTargetGlobalAddress(Base, getCurSDLoc(), MVT::i64, 0));

				struct JumpTableTarget {
				int64_t Offset;
				SDValue Target;
				};
				std::vector<JumpTableTarget> Targets;
				chandlercUnsubmitted Done Reply Inline Actions Same question here. chandlerc: Same question here.

				for (unsigned Op = 1, N = I.getNumArgOperands(); Op != N; Op += 2) {
				auto *ElemBase = dyn_cast<GlobalObject>(GetPointerBaseWithConstantOffset(
				I.getArgOperand(Op), Offset, DAG.getDataLayout()));
				if (ElemBase != Base)
				report_fatal_error("all llvm.icall.jumptable operands must refer to "
				"the same GlobalValue");
				chandlerc [Not Done]: Is it worth teaching the verifier about this and the other invariants below? Could then make this just an assert. Minor point though, and would be fine as a follow-up.
				pcc (author) [Not Done]: This particular check can't be a verifier check because the wholeprogramdevirt pass creates IR that would not pass this check (it is later fixed up by lowertypetests). The others would be fine as verifier checks and we might do that as a followup.

				SDValue Val = getValue(I.getArgOperand(Op + 1));
				auto *GA = dyn_cast<GlobalAddressSDNode>(Val);
				if (!GA)
				report_fatal_error("llvm.icall.jumptable operand must be a GlobalValue");
				Targets.push_back({Offset, DAG.getTargetGlobalAddress(
				GA->getGlobal(), getCurSDLoc(),
				Val.getValueType(), GA->getOffset())});
				}
				std::sort(Targets.begin(), Targets.end(),
				[](const JumpTableTarget &T1, const JumpTableTarget &T2) {
				return T1.Offset < T2.Offset;
				});

				for (auto &T : Targets) {
				Ops.push_back(DAG.getTargetConstant(T.Offset, getCurSDLoc(), MVT::i32));
				Ops.push_back(T.Target);
				}

				SDValue N(DAG.getMachineNode(TargetOpcode::ICALL_JUMPTABLE, getCurSDLoc(),
				MVT::Other, Ops),
				0);
				DAG.setRoot(N);
				setValue(&I, N);
				HasTailCall = true;
				return nullptr;
				}
	}			}
	}			}
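The lowering above sorts the targets by their offset into the combined global and then appends them to the operand list as (offset, target) pairs. A minimal standalone sketch of that step, assuming a hypothetical `FunnelTarget` struct and `flattenTargets` helper that use plain strings in place of `SDValue`s (neither name exists in LLVM):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the patch's JumpTableTarget: a string names the
// callee instead of an SDValue.
struct FunnelTarget {
  int64_t Offset;     // offset of the vtable within the combined global
  std::string Callee; // function the funnel jumps to for that vtable
};

// Sorts the targets by ascending offset and flattens them into
// (offset, callee) pairs, the order in which the patch appends operands.
std::vector<std::pair<int64_t, std::string>>
flattenTargets(std::vector<FunnelTarget> Targets) {
  std::sort(Targets.begin(), Targets.end(),
            [](const FunnelTarget &A, const FunnelTarget &B) {
              return A.Offset < B.Offset;
            });
  std::vector<std::pair<int64_t, std::string>> Ops;
  Ops.reserve(Targets.size());
  for (const FunnelTarget &T : Targets)
    Ops.emplace_back(T.Offset, T.Callee);
  assert(Ops.size() == Targets.size());
  return Ops;
}
```

Sorting by offset is what lets a later stage treat the operand list as an ordered search key space.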

	void SelectionDAGBuilder::visitConstrainedFPIntrinsic(			void SelectionDAGBuilder::visitConstrainedFPIntrinsic(
	const ConstrainedFPIntrinsic &FPI) {			const ConstrainedFPIntrinsic &FPI) {
	SDLoc sdl = getCurSDLoc();			SDLoc sdl = getCurSDLoc();
	unsigned Opcode;			unsigned Opcode;
	switch (FPI.getIntrinsicID()) {			switch (FPI.getIntrinsicID()) {

llvm/lib/IR/Verifier.cpp

	void Verifier::verifyMustTailCall(CallInst &CI) {
Assert(!CI.isInlineAsm(), "cannot use musttail call with inline asm", &CI);		Assert(!CI.isInlineAsm(), "cannot use musttail call with inline asm", &CI);

// - The caller and callee prototypes must match. Pointer types of		// - The caller and callee prototypes must match. Pointer types of
// parameters or return types may differ in pointee type, but not		// parameters or return types may differ in pointee type, but not
// address space.		// address space.
Function *F = CI.getParent()->getParent();		Function *F = CI.getParent()->getParent();
FunctionType *CallerTy = F->getFunctionType();		FunctionType *CallerTy = F->getFunctionType();
FunctionType *CalleeTy = CI.getFunctionType();		FunctionType *CalleeTy = CI.getFunctionType();
		if (!CI.getCalledFunction() || !CI.getCalledFunction()->isIntrinsic()) {
Assert(CallerTy->getNumParams() == CalleeTy->getNumParams(),		Assert(CallerTy->getNumParams() == CalleeTy->getNumParams(),
"cannot guarantee tail call due to mismatched parameter counts", &CI);		"cannot guarantee tail call due to mismatched parameter counts",
Assert(CallerTy->isVarArg() == CalleeTy->isVarArg(),		&CI);
"cannot guarantee tail call due to mismatched varargs", &CI);
Assert(isTypeCongruent(CallerTy->getReturnType(), CalleeTy->getReturnType()),
"cannot guarantee tail call due to mismatched return types", &CI);
for (int I = 0, E = CallerTy->getNumParams(); I != E; ++I) {		for (int I = 0, E = CallerTy->getNumParams(); I != E; ++I) {
Assert(		Assert(
isTypeCongruent(CallerTy->getParamType(I), CalleeTy->getParamType(I)),		isTypeCongruent(CallerTy->getParamType(I), CalleeTy->getParamType(I)),
"cannot guarantee tail call due to mismatched parameter types", &CI);		"cannot guarantee tail call due to mismatched parameter types", &CI);
}		}
		}
		Assert(CallerTy->isVarArg() == CalleeTy->isVarArg(),
		"cannot guarantee tail call due to mismatched varargs", &CI);
		Assert(isTypeCongruent(CallerTy->getReturnType(), CalleeTy->getReturnType()),
		"cannot guarantee tail call due to mismatched return types", &CI);

// - The calling conventions of the caller and callee must match.		// - The calling conventions of the caller and callee must match.
Assert(F->getCallingConv() == CI.getCallingConv(),		Assert(F->getCallingConv() == CI.getCallingConv(),
"cannot guarantee tail call due to mismatched calling conv", &CI);		"cannot guarantee tail call due to mismatched calling conv", &CI);

// - All ABI-impacting function attributes, such as sret, byval, inreg,		// - All ABI-impacting function attributes, such as sret, byval, inreg,
// returned, and inalloca, must match.		// returned, and inalloca, must match.
AttributeList CallerAttrs = F->getAttributes();		AttributeList CallerAttrs = F->getAttributes();

llvm/lib/Target/X86/X86ExpandPseudo.cpp

	return MachineFunctionProperties().set(
MachineFunctionProperties::Property::NoVRegs);		MachineFunctionProperties::Property::NoVRegs);
}		}

StringRef getPassName() const override {		StringRef getPassName() const override {
return "X86 pseudo instruction expansion pass";		return "X86 pseudo instruction expansion pass";
}		}

private:		private:
		void ExpandICallJumpTable(MachineBasicBlock *MBB,
		MachineBasicBlock::iterator MBBI);

bool ExpandMI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);		bool ExpandMI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
bool ExpandMBB(MachineBasicBlock &MBB);		bool ExpandMBB(MachineBasicBlock &MBB);
};		};
char X86ExpandPseudo::ID = 0;		char X86ExpandPseudo::ID = 0;
} // End anonymous namespace.		} // End anonymous namespace.

		void X86ExpandPseudo::ExpandICallJumpTable(MachineBasicBlock *MBB,
		MachineBasicBlock::iterator MBBI) {
		MachineBasicBlock *JTMBB = MBB;
		MachineInstr *JTInst = &*MBBI;
		MachineFunction *MF = MBB->getParent();
		const BasicBlock *BB = MBB->getBasicBlock();
		auto InsPt = MachineFunction::iterator(MBB);
		++InsPt;

		std::vector<std::pair<MachineBasicBlock *, unsigned>> TargetMBBs;
		DebugLoc DL = JTInst->getDebugLoc();
		MachineOperand Selector = JTInst->getOperand(0);
		const GlobalValue *CombinedGlobal = JTInst->getOperand(1).getGlobal();

		auto CmpTarget = [&](unsigned Target) {
		BuildMI(*MBB, MBBI, DL, TII->get(X86::LEA64r), X86::R11)
		.addReg(X86::RIP)
		.addImm(1)
		.addReg(0)
		.addGlobalAddress(CombinedGlobal,
		JTInst->getOperand(2 + 2 * Target).getImm())
		.addReg(0);
		BuildMI(*MBB, MBBI, DL, TII->get(X86::CMP64rr))
		.add(Selector)
		.addReg(X86::R11);
		};

		auto CreateMBB = [&]() {
		auto *NewMBB = MF->CreateMachineBasicBlock(BB);
		MBB->addSuccessor(NewMBB);
		return NewMBB;
		};

		auto EmitCondJump = [&](unsigned Opcode, MachineBasicBlock *ThenMBB) {
		BuildMI(*MBB, MBBI, DL, TII->get(Opcode)).addMBB(ThenMBB);

		auto *ElseMBB = CreateMBB();
		MF->insert(InsPt, ElseMBB);
		MBB = ElseMBB;
		MBBI = MBB->end();
		};

		auto EmitCondJumpTarget = [&](unsigned Opcode, unsigned Target) {
		auto *ThenMBB = CreateMBB();
		TargetMBBs.push_back({ThenMBB, Target});
		EmitCondJump(Opcode, ThenMBB);
		};

		auto EmitTailCall = [&](unsigned Target) {
		BuildMI(*MBB, MBBI, DL, TII->get(X86::TAILJMPd64))
		.add(JTInst->getOperand(3 + 2 * Target));
		};

		std::function<void(unsigned, unsigned)> EmitJumpTable =
		[&](unsigned FirstTarget, unsigned NumTargets) {
		if (NumTargets == 1) {
		EmitTailCall(FirstTarget);
		return;
		}

		if (NumTargets == 2) {
		CmpTarget(FirstTarget + 1);
		EmitCondJumpTarget(X86::JB_1, FirstTarget);
		EmitTailCall(FirstTarget + 1);
		return;
		}

		if (NumTargets < 6) {
		CmpTarget(FirstTarget + 1);
		EmitCondJumpTarget(X86::JB_1, FirstTarget);
		EmitCondJumpTarget(X86::JE_1, FirstTarget + 1);
		EmitJumpTable(FirstTarget + 2, NumTargets - 2);
		return;
		}

		auto *ThenMBB = CreateMBB();
		CmpTarget(FirstTarget + (NumTargets / 2));
		EmitCondJump(X86::JB_1, ThenMBB);
		EmitCondJumpTarget(X86::JE_1, FirstTarget + (NumTargets / 2));
		EmitJumpTable(FirstTarget + (NumTargets / 2),
		NumTargets - (NumTargets / 2));

		MF->insert(InsPt, ThenMBB);
		MBB = ThenMBB;
		MBBI = MBB->end();
		EmitJumpTable(FirstTarget, NumTargets / 2);
		};

		EmitJumpTable(0, (JTInst->getNumOperands() - 2) / 2);
		for (auto P : TargetMBBs) {
		MF->insert(InsPt, P.first);
		BuildMI(P.first, DL, TII->get(X86::TAILJMPd64))
		.add(JTInst->getOperand(3 + 2 * P.second));
		}
		JTMBB->erase(JTInst);
		}
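The recursion in `ExpandICallJumpTable` can be hard to follow through the basic-block bookkeeping. The following is a rough model of the decision tree only, assuming the targets are plain sorted integer addresses and the selector equals one of them. `funnelSelect` and `Addrs` are hypothetical names, and the sketch returns after every case and skips the already-tested middle element when it recurses into the upper half, which are choices of this sketch rather than a statement about the diff as posted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the index of the target chosen for Selector among
// Addrs[First, First + Num). Addrs must be sorted ascending and Selector
// must be equal to one of the addresses in that range.
unsigned funnelSelect(const std::vector<uint64_t> &Addrs, uint64_t Selector,
                      unsigned First, unsigned Num) {
  assert(Num >= 1 && First + Num <= Addrs.size());
  if (Num == 1)
    return First; // single target: unconditional tail call
  if (Num == 2)   // one compare, then jb or fall through
    return Selector < Addrs[First + 1] ? First : First + 1;
  if (Num < 6) {
    // Small case: peel two targets off the front with one compare (jb, je).
    if (Selector < Addrs[First + 1])
      return First;
    if (Selector == Addrs[First + 1])
      return First + 1;
    return funnelSelect(Addrs, Selector, First + 2, Num - 2);
  }
  // General case: compare against the middle target and split.
  unsigned Mid = First + Num / 2;
  if (Selector < Addrs[Mid]) // jb: lower half
    return funnelSelect(Addrs, Selector, First, Num / 2);
  if (Selector == Addrs[Mid]) // je: the middle target itself
    return Mid;
  return funnelSelect(Addrs, Selector, Mid + 1, Num - Num / 2 - 1);
}
```

Every path through the model ends at exactly one target, which corresponds to each path through the emitted code ending in a direct tail jump.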

/// If \p MBBI is a pseudo instruction, this method expands		/// If \p MBBI is a pseudo instruction, this method expands
/// it to the corresponding (sequence of) actual instruction(s).		/// it to the corresponding (sequence of) actual instruction(s).
/// \returns true if \p MBBI has been expanded.		/// \returns true if \p MBBI has been expanded.
bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,		bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI) {		MachineBasicBlock::iterator MBBI) {
MachineInstr &MI = *MBBI;		MachineInstr &MI = *MBBI;
unsigned Opcode = MI.getOpcode();		unsigned Opcode = MI.getOpcode();
DebugLoc DL = MBBI->getDebugLoc();		DebugLoc DL = MBBI->getDebugLoc();
	case X86::LCMPXCHG16B_SAVE_RBX: {
// Finally, restore the value of RBX.		// Finally, restore the value of RBX.
TII->copyPhysReg(MBB, MBBI, DL, ActualInArg, SaveRbx,		TII->copyPhysReg(MBB, MBBI, DL, ActualInArg, SaveRbx,
/*SrcIsKill*/ true);		/*SrcIsKill*/ true);

// Delete the pseudo.		// Delete the pseudo.
MBBI->eraseFromParent();		MBBI->eraseFromParent();
return true;		return true;
}		}
		case TargetOpcode::ICALL_JUMPTABLE:
		ExpandICallJumpTable(&MBB, MBBI);
		return true;
}		}
llvm_unreachable("Previous switch has a fallthrough?");		llvm_unreachable("Previous switch has a fallthrough?");
}		}

/// Expand all pseudo instructions contained in \p MBB.		/// Expand all pseudo instructions contained in \p MBB.
/// \returns true if any expansion occurred for \p MBB.		/// \returns true if any expansion occurred for \p MBB.
bool X86ExpandPseudo::ExpandMBB(MachineBasicBlock &MBB) {		bool X86ExpandPseudo::ExpandMBB(MachineBasicBlock &MBB) {
bool Modified = false;		bool Modified = false;

llvm/lib/Transforms/IPO/LowerTypeTests.cpp

//===- LowerTypeTests.cpp - type metadata lowering pass -------------------===//		//===- LowerTypeTests.cpp - type metadata lowering pass -------------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This pass lowers type metadata and calls to the llvm.type.test intrinsic.		// This pass lowers type metadata and calls to the llvm.type.test intrinsic.
		// It also ensures that globals are properly laid out for the
		// llvm.icall.jumptable intrinsic.
// See http://llvm.org/docs/TypeMetadata.html for more information.		// See http://llvm.org/docs/TypeMetadata.html for more information.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Transforms/IPO/LowerTypeTests.h"		#include "llvm/Transforms/IPO/LowerTypeTests.h"
#include "llvm/ADT/APInt.h"		#include "llvm/ADT/APInt.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/EquivalenceClasses.h"		#include "llvm/ADT/EquivalenceClasses.h"
#include "llvm/ADT/PointerUnion.h"		#include "llvm/ADT/PointerUnion.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/TinyPtrVector.h"		#include "llvm/ADT/TinyPtrVector.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
#include "llvm/Analysis/TypeMetadataUtils.h"		#include "llvm/Analysis/TypeMetadataUtils.h"
		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Attributes.h"		#include "llvm/IR/Attributes.h"
#include "llvm/IR/BasicBlock.h"		#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constant.h"		#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalAlias.h"		#include "llvm/IR/GlobalAlias.h"
	bool isExported() const {
return IsExported;		return IsExported;
}		}

ArrayRef<MDNode *> types() const {		ArrayRef<MDNode *> types() const {
return makeArrayRef(getTrailingObjects<MDNode *>(), NTypes);		return makeArrayRef(getTrailingObjects<MDNode *>(), NTypes);
}		}
};		};

		struct IcallJumptable final
		chandlerc [Done]: nit: I would have capitalized `Icall` as `ICall` throughout.
		: TrailingObjects<IcallJumptable, GlobalTypeMember *> {
		static IcallJumptable *create(BumpPtrAllocator &Alloc, CallInst *CI,
		ArrayRef<GlobalTypeMember *> Targets) {
		auto *Call = static_cast<IcallJumptable *>(
		Alloc.Allocate(totalSizeToAlloc<GlobalTypeMember *>(Targets.size()),
		alignof(IcallJumptable)));
		Call->CI = CI;
		Call->NTargets = Targets.size();
		std::uninitialized_copy(Targets.begin(), Targets.end(),
		Call->getTrailingObjects<GlobalTypeMember *>());
		return Call;
		}

		CallInst *CI;
		ArrayRef<GlobalTypeMember *> targets() const {
		return makeArrayRef(getTrailingObjects<GlobalTypeMember *>(), NTargets);
		}

		private:
		size_t NTargets;
		};
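`IcallJumptable` stores its target list directly after the object in a single allocation, using LLVM's `TrailingObjects` helper. A minimal sketch of the same layout idea without LLVM, assuming hypothetical `FunnelRecord` and `Member` types and using `malloc` where the patch uses a `BumpPtrAllocator`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

struct Member { int Id; }; // hypothetical stand-in for GlobalTypeMember

// A fixed-size header followed, in the same allocation, by N pointers.
class FunnelRecord {
public:
  static FunnelRecord *create(Member *const *Targets, std::size_t N) {
    static_assert(sizeof(FunnelRecord) % alignof(Member *) == 0,
                  "trailing pointers must start suitably aligned");
    void *Mem = std::malloc(sizeof(FunnelRecord) + N * sizeof(Member *));
    if (!Mem)
      throw std::bad_alloc();
    auto *R = new (Mem) FunnelRecord(N);
    // Copy the pointers into the storage that follows the header.
    std::uninitialized_copy(Targets, Targets + N, R->trailing());
    return R;
  }

  static void destroy(FunnelRecord *R) {
    R->~FunnelRecord();
    std::free(R);
  }

  std::size_t size() const { return NTargets; }

  Member *target(std::size_t I) const {
    assert(I < NTargets);
    return trailing()[I];
  }

private:
  explicit FunnelRecord(std::size_t N) : NTargets(N) {}

  Member **trailing() { return reinterpret_cast<Member **>(this + 1); }
  Member *const *trailing() const {
    return reinterpret_cast<Member *const *>(this + 1);
  }

  std::size_t NTargets;
};
```

The single allocation avoids a separate heap-allocated vector per record, which is presumably why the patch follows the existing `GlobalTypeMember` pattern here.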

class LowerTypeTestsModule {		class LowerTypeTestsModule {
Module &M;		Module &M;

ModuleSummaryIndex *ExportSummary;		ModuleSummaryIndex *ExportSummary;
const ModuleSummaryIndex *ImportSummary;		const ModuleSummaryIndex *ImportSummary;

Triple::ArchType Arch;		Triple::ArchType Arch;
Triple::OSType OS;		Triple::OSType OS;
void allocateByteArrays();		void allocateByteArrays();
Value *createBitSetTest(IRBuilder<> &B, const TypeIdLowering &TIL,		Value *createBitSetTest(IRBuilder<> &B, const TypeIdLowering &TIL,
Value *BitOffset);		Value *BitOffset);
void lowerTypeTestCalls(		void lowerTypeTestCalls(
ArrayRef<Metadata *> TypeIds, Constant *CombinedGlobalAddr,		ArrayRef<Metadata *> TypeIds, Constant *CombinedGlobalAddr,
const DenseMap<GlobalTypeMember *, uint64_t> &GlobalLayout);		const DenseMap<GlobalTypeMember *, uint64_t> &GlobalLayout);
Value *lowerTypeTestCall(Metadata *TypeId, CallInst *CI,		Value *lowerTypeTestCall(Metadata *TypeId, CallInst *CI,
const TypeIdLowering &TIL);		const TypeIdLowering &TIL);

void buildBitSetsFromGlobalVariables(ArrayRef<Metadata *> TypeIds,		void buildBitSetsFromGlobalVariables(ArrayRef<Metadata *> TypeIds,
ArrayRef<GlobalTypeMember *> Globals);		ArrayRef<GlobalTypeMember *> Globals);
unsigned getJumpTableEntrySize();		unsigned getJumpTableEntrySize();
Type *getJumpTableEntryType();		Type *getJumpTableEntryType();
void createJumpTableEntry(raw_ostream &AsmOS, raw_ostream &ConstraintOS,		void createJumpTableEntry(raw_ostream &AsmOS, raw_ostream &ConstraintOS,
Triple::ArchType JumpTableArch,		Triple::ArchType JumpTableArch,
SmallVectorImpl<Value *> &AsmArgs, Function *Dest);		SmallVectorImpl<Value *> &AsmArgs, Function *Dest);
void verifyTypeMDNode(GlobalObject *GO, MDNode *Type);		void verifyTypeMDNode(GlobalObject *GO, MDNode *Type);
void buildBitSetsFromFunctions(ArrayRef<Metadata *> TypeIds,		void buildBitSetsFromFunctions(ArrayRef<Metadata *> TypeIds,
ArrayRef<GlobalTypeMember *> Functions);		ArrayRef<GlobalTypeMember *> Functions);
void buildBitSetsFromFunctionsNative(ArrayRef<Metadata *> TypeIds,		void buildBitSetsFromFunctionsNative(ArrayRef<Metadata *> TypeIds,
ArrayRef<GlobalTypeMember *> Functions);		ArrayRef<GlobalTypeMember *> Functions);
void buildBitSetsFromFunctionsWASM(ArrayRef<Metadata *> TypeIds,		void buildBitSetsFromFunctionsWASM(ArrayRef<Metadata *> TypeIds,
ArrayRef<GlobalTypeMember *> Functions);		ArrayRef<GlobalTypeMember *> Functions);
void buildBitSetsFromDisjointSet(ArrayRef<Metadata *> TypeIds,		void
ArrayRef<GlobalTypeMember *> Globals);		buildBitSetsFromDisjointSet(ArrayRef<Metadata *> TypeIds,
		ArrayRef<GlobalTypeMember *> Globals,
		ArrayRef<IcallJumptable *> IcallJumptables);

void replaceWeakDeclarationWithJumpTablePtr(Function *F, Constant *JT);		void replaceWeakDeclarationWithJumpTablePtr(Function *F, Constant *JT);
void moveInitializerToModuleConstructor(GlobalVariable *GV);		void moveInitializerToModuleConstructor(GlobalVariable *GV);
void findGlobalVariableUsersOf(Constant *C,		void findGlobalVariableUsersOf(Constant *C,
SmallSetVector<GlobalVariable *, 8> &Out);		SmallSetVector<GlobalVariable *, 8> &Out);

void createJumpTable(Function *F, ArrayRef<GlobalTypeMember *> Functions);		void createJumpTable(Function *F, ArrayRef<GlobalTypeMember *> Functions);

}		}

void LowerTypeTestsModule::verifyTypeMDNode(GlobalObject *GO, MDNode *Type) {		void LowerTypeTestsModule::verifyTypeMDNode(GlobalObject *GO, MDNode *Type) {
if (Type->getNumOperands() != 2)		if (Type->getNumOperands() != 2)
report_fatal_error("All operands of type metadata must have 2 elements");		report_fatal_error("All operands of type metadata must have 2 elements");

if (GO->isThreadLocal())		if (GO->isThreadLocal())
report_fatal_error("Bit set element may not be thread-local");		report_fatal_error("Bit set element may not be thread-local");
if (isa<GlobalVariable>(GO) && GO->hasSection())		if (isa<GlobalVariable>(GO) && GO->hasSection())
report_fatal_error(		report_fatal_error(
"A member of a type identifier may not have an explicit section");		"A member of a type identifier may not have an explicit section");
		chandlerc [Not Done]: Emitting all of this with (I assume) inline assembly seems like a really messy design. It makes all of this entirely x86-specific, but this part is completely independent of architecture. Why not emit LLVM IR? Is there no pattern of IR that actually lowers to reasonable branches? I find that a little bit surprising, but maybe we should just fix the lowering in that case?
		pcc (author) [Not Done]: The main reason is that we need to make sure that whatever code we generate conforms with the calling convention used for the virtual call. At this point there is no guarantee that we have correct information about the calling convention to be used here because the "source of truth" for the calling convention is the call site itself, and with ThinLTO there may not be a call site in the current module. We cannot even rely on the function prototype with ThinLTO because we may have discarded the prototype by this point. Furthermore there doesn't seem to be a point in preserving the calling convention because no matter what it is, we will always want the same code. That is what led me to the conclusion that what we need here is an IR construct that represents the notion of a branch funnel call to a function with an unknown prototype -- that is what the `llvm.icall.jumptable` intrinsic is. I considered putting the implementation of this intrinsic in the backend like a regular intrinsic, but one of the problems with that is that one of the things that we need to be able to do is stitch together multiple global variables into a single global in order to implement the branch funnel as a binary search, and it is too late to do that in the backend. The LowerTypeTests pass was a convenient place to do it (since it already needs to know how to do it in order to implement `llvm.type.test`) but admittedly it may not be the best place. It occurred to me that it may be possible to implement the stitching together of globals in a pre-isel pass and the rest of `llvm.icall.jumptable` in the backend, but I think that's something that's best considered for a followup patch since it may have implications for how we implement `llvm.type.test` as well.
		chandlerc [Not Done]: The more I think about this the more I think that implementing this with inline asm is just the wrong design. I understand the challenges you're hitting here, but I think we should solve them by lowering in the backend, not in the IR. The IR really can't model the kinds of things you're doing (and it shouldn't!). My suggestion would be to change the `llvm.icall.<mumble>` intrinsic or add a during-lowering intrinsic that lets you prepare the global variable stuff as necessary in the IR pass, and hand that prepared information cleanly (and abstractly) to the backend where you can effectively lower it to branch instructions. Since you are (notionally) lowering a call, you may even be able to use the intrinsic all the way into the code generator and just lower this manually with an MI pass. Another aspect that this will improve is that you shouldn't need to embed things into naked functions at all. This should be something that you can just expand wherever it is needed into the code.
		pcc (author) [Not Done]: I'm not sure that I understand your proposal. Are you proposing that in the WholeProgramDevirt pass we would transform each virtual call site into an intrinsic call with the list of possible targets, and then in the backend we would lower the intrinsic call into a call to a branch funnel "function" together with its definition?
		chandlerc [Not Done]: I'm imagining in the backend, we would lower the intrinsic call into an inline branch funnel across the destinations. (I can also imagine using a pseudo instead of call to an intrinsic, but as you want to capture a call with some specific calling convention / signature / etc, using a call to an intrinsic seems a reasonable representation... if needed, you could even put the potential destinations into an operand bundle so that the formal argument list is actually the argument list which should be used for calling the target function.)
		pcc (author) [Not Done]: Are you sure you want an inline branch funnel? It will make the code more "branchy", which doesn't seem great for code size. i.e. the two options are

    call branchfunnel

    branchfunnel:
    cmp ...
    jb target1
    je target2
    jmp target3

and

    cmp ...
    jb .Ltmp1
    je .Ltmp2
    call target3
    jmp .Ltmp3
    .Ltmp1:
    call target1
    jmp .Ltmp3
    .Ltmp2:
    call target2
    .Ltmp3:

Putting that aside, there are problems with the idea of putting the list of targets in every intrinsic. Most importantly, it complicates matters significantly for ThinLTO. It means that each backend job would need to know the list of targets as well as the layout of the combined global that stores the vtables. We don't need that for any other purpose, so we would need to include that information in the summary just to support this. Because we are including the information in the summary, it means that any change to the class hierarchy would cause a ThinLTO cache miss. (Right now, in many cases, we can avoid a cache miss as a result of careful summary design.) In other words, this proposal would be making the code more complicated and less efficient. So it doesn't seem like a good direction to me.

I think we can both agree that lowering to inline asm isn't great. But I see a different way for this code to evolve so that it is properly layered. Essentially we would invent a new top-level entity (like a function or a global variable) that represents a thunk. There is already a need for such a top-level entity to represent CFI jump tables (currently we represent them in the same hackish way using inline asm), so to start with there would be two kinds of thunks: jump tables and branch funnels. Target specific code in the backend would lower the thunks to MI or MCInstrs.

This would have a few advantages which would be shared with your intrinsic proposal: the IR would be somewhat platform independent until the backend, thus it can be easily optimized by midlevel passes, the backend would be able to choose whether to emit the branch funnel inline or outline, and most importantly for ThinLTO: it would support separate compilation. Please let me know what you think.
		chandlerc [Not Done]: Having separate inline branch funnels may actually be nice when there are very few targets as they may be better predicted. There is a size/speed trade-off here. However, the separate compilation issue is very compelling.

I think you can use an intrinsic to implement your proposal and get the advantages you describe without any new top-level construct: You could define / declare a thunk IR function which contains just this intrinsic. Only one TU gets the definition which calls the intrinsic and needs the global information. The others just get a declaration. No new top-level construct, and you lower it exactly as you describe. I'm not sure if this will directly map to the jump table case, but I'm happy for that to be dealt with in a follow-up as that is already in tree.

If this idea doesn't work, let's chat about what alternatives would work or how to model the top-level entity. I feel pretty strongly that the current code with the current amount of inline assembly is really not up to the quality that should go into the tree even as a temporary thing. There doesn't seem to be such urgency here that we can't take the time to engineer this the right way. Also: We didn't actually clean up the inline asm for the CFI jump tables despite those being in-tree for a long time, so I think it is reasonable to insist we don't make things worse. This is a much more significant usage of inline asm, and so it seems much less reasonable as a temporary solution.
		pcc (author) [Not Done]:
> You could define / declare a thunk IR function which contains just this intrinsic. Only one TU gets the definition which calls the intrinsic and needs the global information. The others just get a declaration. No new top-level construct, and you lower it exactly as you describe.

The problem with that is: what would be the signature/calling convention of that function? That is the problem that I described on the first comment on this thread.

> If this idea doesn't work, let's chat about what alternatives would work or how to model the top-level entity. I feel pretty strongly that the current code with the current amount of inline assembly is really not up to the quality that should go into the tree even as a temporary thing. There doesn't seem to be such urgency here that we can't take the time to engineer this the right way. Also: We didn't actually clean up the inline asm for the CFI jump tables despite those being in-tree for a long time, so I think it is reasonable to insist we don't make things worse. This is a much more significant usage of inline asm, and so it seems much less reasonable as a temporary solution.

Okay, fair enough. Assuming that you agree with my objection above, I will try to sketch out a more detailed plan for how I think the new top-level entity should look in the IR.
		echristo [Not Done]: I think you could count it as a naked function yes? I don't see a reason for a top level "thunk" entity in the IR necessarily.
		pcc (author) [Not Done]: The main issue was that the thunk needs to be calling-convention-independent, which I thought was unrepresentable in IR, but it turns out that we can base it on a representation which is apparently used for thunks on Windows. That representation is to mark the thunk as varargs and mark the call as musttail. That is what this patch implements.

// FIXME: We previously checked that global var member of a type identifier		// FIXME: We previously checked that global var member of a type identifier
// must be a definition, but the IR linker may leave type metadata on		// must be a definition, but the IR linker may leave type metadata on
// declarations. We should restore this check after fixing PR31759.		// declarations. We should restore this check after fixing PR31759.

auto OffsetConstMD = dyn_cast<ConstantAsMetadata>(Type->getOperand(0));		auto OffsetConstMD = dyn_cast<ConstantAsMetadata>(Type->getOperand(0));
if (!OffsetConstMD)		if (!OffsetConstMD)
report_fatal_error("Type offset must be a constant");		report_fatal_error("Type offset must be a constant");
	void LowerTypeTestsModule::buildBitSetsFromFunctionsWASM(

// The indirect function table index space starts at zero, so pass a NULL		// The indirect function table index space starts at zero, so pass a NULL
// pointer as the subtracted "jump table" offset.		// pointer as the subtracted "jump table" offset.
lowerTypeTestCalls(TypeIds, ConstantPointerNull::get(Int32PtrTy),		lowerTypeTestCalls(TypeIds, ConstantPointerNull::get(Int32PtrTy),
GlobalLayout);		GlobalLayout);
}		}

void LowerTypeTestsModule::buildBitSetsFromDisjointSet(		void LowerTypeTestsModule::buildBitSetsFromDisjointSet(
ArrayRef<Metadata *> TypeIds, ArrayRef<GlobalTypeMember *> Globals) {		ArrayRef<Metadata *> TypeIds, ArrayRef<GlobalTypeMember *> Globals,
		ArrayRef<IcallJumptable *> IcallJumptables) {
DenseMap<Metadata *, uint64_t> TypeIdIndices;		DenseMap<Metadata *, uint64_t> TypeIdIndices;
for (unsigned I = 0; I != TypeIds.size(); ++I)		for (unsigned I = 0; I != TypeIds.size(); ++I)
TypeIdIndices[TypeIds[I]] = I;		TypeIdIndices[TypeIds[I]] = I;

// For each type identifier, build a set of indices that refer to members of		// For each type identifier, build a set of indices that refer to members of
// the type identifier.		// the type identifier.
std::vector<std::set<uint64_t>> TypeMembers(TypeIds.size());		std::vector<std::set<uint64_t>> TypeMembers(TypeIds.size());
unsigned GlobalIndex = 0;		unsigned GlobalIndex = 0;
		DenseMap<GlobalTypeMember *, uint64_t> GlobalIndices;
for (GlobalTypeMember *GTM : Globals) {		for (GlobalTypeMember *GTM : Globals) {
for (MDNode *Type : GTM->types()) {		for (MDNode *Type : GTM->types()) {
// Type = { offset, type identifier }		// Type = { offset, type identifier }
unsigned TypeIdIndex = TypeIdIndices[Type->getOperand(1)];		auto I = TypeIdIndices.find(Type->getOperand(1));
TypeMembers[TypeIdIndex].insert(GlobalIndex);		if (I != TypeIdIndices.end())
		TypeMembers[I->second].insert(GlobalIndex);
}		}
		GlobalIndices[GTM] = GlobalIndex;
GlobalIndex++;		GlobalIndex++;
}		}

		for (IcallJumptable *JT : IcallJumptables) {
		TypeMembers.emplace_back();
		std::set<uint64_t> &TMSet = TypeMembers.back();
		for (GlobalTypeMember *T : JT->targets())
		TMSet.insert(GlobalIndices[T]);
		}

// Order the sets of indices by size. The GlobalLayoutBuilder works best		// Order the sets of indices by size. The GlobalLayoutBuilder works best
// when given small index sets first.		// when given small index sets first.
std::stable_sort(		std::stable_sort(
TypeMembers.begin(), TypeMembers.end(),		TypeMembers.begin(), TypeMembers.end(),
[](const std::set<uint64_t> &O1, const std::set<uint64_t> &O2) {		[](const std::set<uint64_t> &O1, const std::set<uint64_t> &O2) {
return O1.size() < O2.size();		return O1.size() < O2.size();
});		});

▲ Show 20 Lines • Show All 71 Lines • ▼ Show 20 Lines	bool LowerTypeTestsModule::runForTesting(Module &M) {
}		}

return Changed;		return Changed;
}		}

bool LowerTypeTestsModule::lower() {		bool LowerTypeTestsModule::lower() {
Function *TypeTestFunc =		Function *TypeTestFunc =
M.getFunction(Intrinsic::getName(Intrinsic::type_test));		M.getFunction(Intrinsic::getName(Intrinsic::type_test));
if ((!TypeTestFunc || TypeTestFunc->use_empty()) && !ExportSummary &&		Function *IcallJumptableFunc =
!ImportSummary)		M.getFunction(Intrinsic::getName(Intrinsic::icall_jumptable));
		if ((!TypeTestFunc || TypeTestFunc->use_empty()) &&
		(!IcallJumptableFunc || IcallJumptableFunc->use_empty()) &&
		!ExportSummary && !ImportSummary)
return false;		return false;

if (ImportSummary) {		if (ImportSummary) {
if (TypeTestFunc) {		if (TypeTestFunc) {
for (auto UI = TypeTestFunc->use_begin(), UE = TypeTestFunc->use_end();		for (auto UI = TypeTestFunc->use_begin(), UE = TypeTestFunc->use_end();
UI != UE;) {		UI != UE;) {
auto CI = cast<CallInst>((UI++).getUser());		auto CI = cast<CallInst>((UI++).getUser());
importTypeTest(CI);		importTypeTest(CI);
}		}
}		}

		if (IcallJumptableFunc && !IcallJumptableFunc->use_empty())
		report_fatal_error(
		"unexpected call to llvm.icall.jumptable during import phase");

SmallVector<Function *, 8> Defs;		SmallVector<Function *, 8> Defs;
SmallVector<Function *, 8> Decls;		SmallVector<Function *, 8> Decls;
for (auto &F : M) {		for (auto &F : M) {
// CFI functions are either external, or promoted. A local function may		// CFI functions are either external, or promoted. A local function may
// have the same name, but it's not the one we are looking for.		// have the same name, but it's not the one we are looking for.
if (F.hasLocalLinkage())		if (F.hasLocalLinkage())
continue;		continue;
if (ImportSummary->cfiFunctionDefs().count(F.getName()))		if (ImportSummary->cfiFunctionDefs().count(F.getName()))
Defs.push_back(&F);		Defs.push_back(&F);
else if (ImportSummary->cfiFunctionDecls().count(F.getName()))		else if (ImportSummary->cfiFunctionDecls().count(F.getName()))
Decls.push_back(&F);		Decls.push_back(&F);
}		}

for (auto F : Defs)		for (auto F : Defs)
importFunction(F, /*isDefinition*/ true);		importFunction(F, /*isDefinition*/ true);
for (auto F : Decls)		for (auto F : Decls)
importFunction(F, /*isDefinition*/ false);		importFunction(F, /*isDefinition*/ false);

return true;		return true;
}		}

// Equivalence class set containing type identifiers and the globals that		// Equivalence class set containing type identifiers and the globals that
// reference them. This is used to partition the set of type identifiers in		// reference them. This is used to partition the set of type identifiers in
// the module into disjoint sets.		// the module into disjoint sets.
using GlobalClassesTy =		using GlobalClassesTy = EquivalenceClasses<
EquivalenceClasses<PointerUnion<GlobalTypeMember *, Metadata *>>;		PointerUnion3<GlobalTypeMember *, Metadata *, IcallJumptable *>>;
GlobalClassesTy GlobalClasses;		GlobalClassesTy GlobalClasses;

// Verify the type metadata and build a few data structures to let us		// Verify the type metadata and build a few data structures to let us
// efficiently enumerate the type identifiers associated with a global:		// efficiently enumerate the type identifiers associated with a global:
// a list of GlobalTypeMembers (a GlobalObject stored alongside a vector		// a list of GlobalTypeMembers (a GlobalObject stored alongside a vector
// of associated type metadata) and a mapping from type identifiers to their		// of associated type metadata) and a mapping from type identifiers to their
// list of GlobalTypeMembers and last observed index in the list of globals.		// list of GlobalTypeMembers and last observed index in the list of globals.
// The indices will be used later to deterministically order the list of type		// The indices will be used later to deterministically order the list of type
▲ Show 20 Lines • Show All 66 Lines • ▼ Show 20 Lines	if (CfiFunctionsMD) {
for (unsigned I = 2; I < FuncMD->getNumOperands(); ++I)		for (unsigned I = 2; I < FuncMD->getNumOperands(); ++I)
F->addMetadata(LLVMContext::MD_type,		F->addMetadata(LLVMContext::MD_type,
*cast<MDNode>(FuncMD->getOperand(I).get()));		*cast<MDNode>(FuncMD->getOperand(I).get()));
}		}
}		}
}		}
}		}

		DenseMap<GlobalObject *, GlobalTypeMember *> GlobalTypeMembers;
for (GlobalObject &GO : M.global_objects()) {		for (GlobalObject &GO : M.global_objects()) {
if (isa<GlobalVariable>(GO) && GO.isDeclarationForLinker())		if (isa<GlobalVariable>(GO) && GO.isDeclarationForLinker())
continue;		continue;

Types.clear();		Types.clear();
GO.getMetadata(LLVMContext::MD_type, Types);		GO.getMetadata(LLVMContext::MD_type, Types);
if (Types.empty())
continue;

bool IsDefinition = !GO.isDeclarationForLinker();		bool IsDefinition = !GO.isDeclarationForLinker();
bool IsExported = false;		bool IsExported = false;
if (isa<Function>(GO) && ExportedFunctions.count(GO.getName())) {		if (isa<Function>(GO) && ExportedFunctions.count(GO.getName())) {
IsDefinition |= ExportedFunctions[GO.getName()].Linkage == CFL_Definition;		IsDefinition |= ExportedFunctions[GO.getName()].Linkage == CFL_Definition;
IsExported = true;		IsExported = true;
}		}

auto *GTM =		auto *GTM =
GlobalTypeMember::create(Alloc, &GO, IsDefinition, IsExported, Types);		GlobalTypeMember::create(Alloc, &GO, IsDefinition, IsExported, Types);
		GlobalTypeMembers[&GO] = GTM;
for (MDNode *Type : Types) {		for (MDNode *Type : Types) {
verifyTypeMDNode(&GO, Type);		verifyTypeMDNode(&GO, Type);
auto &Info = TypeIdInfo[Type->getOperand(1)];		auto &Info = TypeIdInfo[Type->getOperand(1)];
Info.Index = ++I;		Info.Index = ++I;
Info.RefGlobals.push_back(GTM);		Info.RefGlobals.push_back(GTM);
}		}
}		}

Show All 24 Lines	for (const Use &U : TypeTestFunc->uses()) {
auto TypeIdMDVal = dyn_cast<MetadataAsValue>(CI->getArgOperand(1));		auto TypeIdMDVal = dyn_cast<MetadataAsValue>(CI->getArgOperand(1));
if (!TypeIdMDVal)		if (!TypeIdMDVal)
report_fatal_error("Second argument of llvm.type.test must be metadata");		report_fatal_error("Second argument of llvm.type.test must be metadata");
auto TypeId = TypeIdMDVal->getMetadata();		auto TypeId = TypeIdMDVal->getMetadata();
AddTypeIdUse(TypeId).CallSites.push_back(CI);		AddTypeIdUse(TypeId).CallSites.push_back(CI);
}		}
}		}

		if (IcallJumptableFunc) {
		for (const Use &U : IcallJumptableFunc->uses()) {
		if (Arch != Triple::x86_64)
		report_fatal_error("llvm.icall.jumptable not supported on this target");

		auto CI = cast<CallInst>(U.getUser());

		std::vector<GlobalTypeMember *> Targets;
		if (CI->getNumArgOperands() % 2 != 1)
		report_fatal_error("number of arguments should be odd");

		GlobalClassesTy::member_iterator CurSet;
		for (unsigned I = 1; I != CI->getNumArgOperands(); I += 2) {
		int64_t Offset;
		auto *Base = dyn_cast<GlobalObject>(GetPointerBaseWithConstantOffset(
		CI->getOperand(I), Offset, M.getDataLayout()));
		if (!Base)
		report_fatal_error("Expected jump table operand to be global value");

		GlobalTypeMember *GTM = GlobalTypeMembers[Base];
		Targets.push_back(GTM);
		GlobalClassesTy::member_iterator NewSet =
		GlobalClasses.findLeader(GlobalClasses.insert(GTM));
		if (I == 1)
		CurSet = NewSet;
		else
		CurSet = GlobalClasses.unionSets(CurSet, NewSet);
		}

		GlobalClasses.unionSets(CurSet,
		GlobalClasses.findLeader(GlobalClasses.insert(
		IcallJumptable::create(Alloc, CI, Targets))));
		}
		}

if (ExportSummary) {		if (ExportSummary) {
DenseMap<GlobalValue::GUID, TinyPtrVector<Metadata *>> MetadataByGUID;		DenseMap<GlobalValue::GUID, TinyPtrVector<Metadata *>> MetadataByGUID;
for (auto &P : TypeIdInfo) {		for (auto &P : TypeIdInfo) {
if (auto *TypeId = dyn_cast<MDString>(P.first))		if (auto *TypeId = dyn_cast<MDString>(P.first))
MetadataByGUID[GlobalValue::getGUID(TypeId->getString())].push_back(		MetadataByGUID[GlobalValue::getGUID(TypeId->getString())].push_back(
TypeId);		TypeId);
}		}

Show All 36 Lines	std::sort(Sets.begin(), Sets.end(),
return S1.second < S2.second;		return S1.second < S2.second;
});		});

// For each disjoint set we found...		// For each disjoint set we found...
for (const auto &S : Sets) {		for (const auto &S : Sets) {
// Build the list of type identifiers in this disjoint set.		// Build the list of type identifiers in this disjoint set.
std::vector<Metadata *> TypeIds;		std::vector<Metadata *> TypeIds;
std::vector<GlobalTypeMember *> Globals;		std::vector<GlobalTypeMember *> Globals;
		std::vector<IcallJumptable *> IcallJumptables;
for (GlobalClassesTy::member_iterator MI =		for (GlobalClassesTy::member_iterator MI =
GlobalClasses.member_begin(S.first);		GlobalClasses.member_begin(S.first);
MI != GlobalClasses.member_end(); ++MI) {		MI != GlobalClasses.member_end(); ++MI) {
if ((*MI).is<Metadata *>())		if (MI->is<Metadata *>())
TypeIds.push_back(MI->get<Metadata *>());		TypeIds.push_back(MI->get<Metadata *>());
else		else if (MI->is<GlobalTypeMember *>())
Globals.push_back(MI->get<GlobalTypeMember *>());		Globals.push_back(MI->get<GlobalTypeMember *>());
		else
		IcallJumptables.push_back(MI->get<IcallJumptable *>());
}		}

// Order type identifiers by global index for determinism. This ordering is		// Order type identifiers by global index for determinism. This ordering is
// stable as there is a one-to-one mapping between metadata and indices.		// stable as there is a one-to-one mapping between metadata and indices.
std::sort(TypeIds.begin(), TypeIds.end(), [&](Metadata *M1, Metadata *M2) {		std::sort(TypeIds.begin(), TypeIds.end(), [&](Metadata *M1, Metadata *M2) {
return TypeIdInfo[M1].Index < TypeIdInfo[M2].Index;		return TypeIdInfo[M1].Index < TypeIdInfo[M2].Index;
});		});

// Build bitsets for this disjoint set.		// Build bitsets for this disjoint set.
buildBitSetsFromDisjointSet(TypeIds, Globals);		buildBitSetsFromDisjointSet(TypeIds, Globals, IcallJumptables);
}		}

allocateByteArrays();		allocateByteArrays();

// Parse alias data to replace stand-in function declarations for aliases		// Parse alias data to replace stand-in function declarations for aliases
// with an alias to the intended target.		// with an alias to the intended target.
if (ExportSummary) {		if (ExportSummary) {
if (NamedMDNode *AliasesMD = M.getNamedMetadata("aliases")) {		if (NamedMDNode *AliasesMD = M.getNamedMetadata("aliases")) {
▲ Show 20 Lines • Show All 51 Lines • Show Last 20 Lines
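
As a rough illustration of the grouping done in lower() above: each target of an llvm.icall.jumptable call is unioned into a single equivalence class together with the jump table itself, so that buildBitSetsFromDisjointSet later receives those globals as one disjoint set. The sketch below is not the patch's code. It swaps llvm::EquivalenceClasses for a hand-rolled union-find, and `DisjointSets`, `addJumpTable`, and the integer ids are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for llvm::EquivalenceClasses: members are plain
// integer ids instead of GlobalTypeMember*, Metadata* or IcallJumptable*.
class DisjointSets {
  std::vector<std::size_t> Parent;

public:
  explicit DisjointSets(std::size_t N) : Parent(N) {
    // Every id starts out as the leader of its own singleton set.
    std::iota(Parent.begin(), Parent.end(), std::size_t{0});
  }

  // Returns the leader of the set containing X, halving the path as it goes.
  std::size_t findLeader(std::size_t X) {
    assert(X < Parent.size() && "id out of range");
    while (Parent[X] != X) {
      Parent[X] = Parent[Parent[X]];
      X = Parent[X];
    }
    return X;
  }

  // Merges the sets containing A and B and returns the resulting leader.
  std::size_t unionSets(std::size_t A, std::size_t B) {
    std::size_t LA = findLeader(A), LB = findLeader(B);
    if (LA != LB)
      Parent[LB] = LA;
    return LA;
  }
};

// Loosely mirrors the loop over jump table operands: union every target
// into one set, with the jump table's own id as a member of that set.
inline std::size_t addJumpTable(DisjointSets &Sets, std::size_t JumpTableId,
                                const std::vector<std::size_t> &TargetIds) {
  std::size_t Cur = JumpTableId;
  for (std::size_t T : TargetIds)
    Cur = Sets.unionSets(Cur, T);
  return Sets.findLeader(Cur);
}
```

With five ids and a jump table (id 4) whose targets are ids 0 and 2, those three ids end up sharing a leader while ids 1 and 3 stay in their own sets.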

llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp

Show First 20 Lines • Show All 310 Lines • ▼ Show 20 Lines
// VTableSlotInfo class.		// VTableSlotInfo class.
struct CallSiteInfo {		struct CallSiteInfo {
/// The set of call sites for this slot. Used during regular LTO and the		/// The set of call sites for this slot. Used during regular LTO and the
/// import phase of ThinLTO (as well as the export phase of ThinLTO for any		/// import phase of ThinLTO (as well as the export phase of ThinLTO for any
/// call sites that appear in the merged module itself); in each of these		/// call sites that appear in the merged module itself); in each of these
/// cases we are directly operating on the call sites at the IR level.		/// cases we are directly operating on the call sites at the IR level.
std::vector<VirtualCallSite> CallSites;		std::vector<VirtualCallSite> CallSites;

		/// Whether all call sites represented by this CallSiteInfo, including those
		/// in summaries, have been devirtualized. This starts off as true because a
		/// default constructed CallSiteInfo represents no call sites.
		bool AllCallSitesDevirted = true;

// These fields are used during the export phase of ThinLTO and reflect		// These fields are used during the export phase of ThinLTO and reflect
// information collected from function summaries.		// information collected from function summaries.

/// Whether any function summary contains an llvm.assume(llvm.type.test) for		/// Whether any function summary contains an llvm.assume(llvm.type.test) for
/// this slot.		/// this slot.
bool SummaryHasTypeTestAssumeUsers;		bool SummaryHasTypeTestAssumeUsers = false;

/// CFI-specific: a vector containing the list of function summaries that use		/// CFI-specific: a vector containing the list of function summaries that use
/// the llvm.type.checked.load intrinsic and therefore will require		/// the llvm.type.checked.load intrinsic and therefore will require
/// resolutions for llvm.type.test in order to implement CFI checks if		/// resolutions for llvm.type.test in order to implement CFI checks if
/// devirtualization was unsuccessful. If devirtualization was successful, the		/// devirtualization was unsuccessful. If devirtualization was successful, the
/// pass will clear this vector by calling markDevirt(). If at the end of the		/// pass will clear this vector by calling markDevirt(). If at the end of the
/// pass the vector is non-empty, we will need to add a use of llvm.type.test		/// pass the vector is non-empty, we will need to add a use of llvm.type.test
/// to each of the function summaries in the vector.		/// to each of the function summaries in the vector.
std::vector<FunctionSummary *> SummaryTypeCheckedLoadUsers;		std::vector<FunctionSummary *> SummaryTypeCheckedLoadUsers;

bool isExported() const {		bool isExported() const {
return SummaryHasTypeTestAssumeUsers ||		return SummaryHasTypeTestAssumeUsers ||
!SummaryTypeCheckedLoadUsers.empty();		!SummaryTypeCheckedLoadUsers.empty();
}		}

/// As explained in the comment for SummaryTypeCheckedLoadUsers.		void markSummaryHasTypeTestAssumeUsers() {
void markDevirt() { SummaryTypeCheckedLoadUsers.clear(); }		SummaryHasTypeTestAssumeUsers = true;
		AllCallSitesDevirted = false;
		}

		void addSummaryTypeCheckedLoadUser(FunctionSummary *FS) {
		SummaryTypeCheckedLoadUsers.push_back(FS);
		AllCallSitesDevirted = false;
		}

		void markDevirt() {
		AllCallSitesDevirted = true;

		// As explained in the comment for SummaryTypeCheckedLoadUsers.
		SummaryTypeCheckedLoadUsers.clear();
		}
};		};

// Call site information collected for a specific VTableSlot.		// Call site information collected for a specific VTableSlot.
struct VTableSlotInfo {		struct VTableSlotInfo {
// The set of call sites which do not have all constant integer arguments		// The set of call sites which do not have all constant integer arguments
// (excluding "this").		// (excluding "this").
CallSiteInfo CSInfo;		CallSiteInfo CSInfo;

Show All 18 Lines	if (!CI || CI->getBitWidth() > 64)
return CSInfo;		return CSInfo;
Args.push_back(CI->getZExtValue());		Args.push_back(CI->getZExtValue());
}		}
return ConstCSInfo[Args];		return ConstCSInfo[Args];
}		}

void VTableSlotInfo::addCallSite(Value *VTable, CallSite CS,		void VTableSlotInfo::addCallSite(Value *VTable, CallSite CS,
unsigned *NumUnsafeUses) {		unsigned *NumUnsafeUses) {
findCallSiteInfo(CS).CallSites.push_back({VTable, CS, NumUnsafeUses});		auto &CSI = findCallSiteInfo(CS);
		CSI.AllCallSitesDevirted = false;
		CSI.CallSites.push_back({VTable, CS, NumUnsafeUses});
}		}

struct DevirtModule {		struct DevirtModule {
Module &M;		Module &M;
function_ref<AAResults &(Function &)> AARGetter;		function_ref<AAResults &(Function &)> AARGetter;

ModuleSummaryIndex *ExportSummary;		ModuleSummaryIndex *ExportSummary;
const ModuleSummaryIndex *ImportSummary;		const ModuleSummaryIndex *ImportSummary;
▲ Show 20 Lines • Show All 48 Lines • ▼ Show 20 Lines	tryFindVirtualCallTargets(std::vector<VirtualCallTarget> &TargetsForSlot,
uint64_t ByteOffset);		uint64_t ByteOffset);

void applySingleImplDevirt(VTableSlotInfo &SlotInfo, Constant *TheFn,		void applySingleImplDevirt(VTableSlotInfo &SlotInfo, Constant *TheFn,
bool &IsExported);		bool &IsExported);
bool trySingleImplDevirt(MutableArrayRef<VirtualCallTarget> TargetsForSlot,		bool trySingleImplDevirt(MutableArrayRef<VirtualCallTarget> TargetsForSlot,
VTableSlotInfo &SlotInfo,		VTableSlotInfo &SlotInfo,
WholeProgramDevirtResolution *Res);		WholeProgramDevirtResolution *Res);

		void applyIcallJumpTable(VTableSlotInfo &SlotInfo, Constant *JT,
		bool &IsExported);
		void tryIcallJumpTable(MutableArrayRef<VirtualCallTarget> TargetsForSlot,
		VTableSlotInfo &SlotInfo,
		WholeProgramDevirtResolution *Res, VTableSlot Slot);

bool tryEvaluateFunctionsWithArgs(		bool tryEvaluateFunctionsWithArgs(
MutableArrayRef<VirtualCallTarget> TargetsForSlot,		MutableArrayRef<VirtualCallTarget> TargetsForSlot,
ArrayRef<uint64_t> Args);		ArrayRef<uint64_t> Args);

void applyUniformRetValOpt(CallSiteInfo &CSInfo, StringRef FnName,		void applyUniformRetValOpt(CallSiteInfo &CSInfo, StringRef FnName,
uint64_t TheRetVal);		uint64_t TheRetVal);
bool tryUniformRetValOpt(MutableArrayRef<VirtualCallTarget> TargetsForSlot,		bool tryUniformRetValOpt(MutableArrayRef<VirtualCallTarget> TargetsForSlot,
CallSiteInfo &CSInfo,		CallSiteInfo &CSInfo,
Show All 17 Lines	struct DevirtModule {
// This function is called during the import phase to create a reference to		// This function is called during the import phase to create a reference to
// the symbol definition created during the export phase.		// the symbol definition created during the export phase.
Constant *importGlobal(VTableSlot Slot, ArrayRef<uint64_t> Args,		Constant *importGlobal(VTableSlot Slot, ArrayRef<uint64_t> Args,
StringRef Name);		StringRef Name);
Constant *importConstant(VTableSlot Slot, ArrayRef<uint64_t> Args,		Constant *importConstant(VTableSlot Slot, ArrayRef<uint64_t> Args,
StringRef Name, IntegerType *IntTy,		StringRef Name, IntegerType *IntTy,
uint32_t Storage);		uint32_t Storage);

		Constant *getMemberAddr(const TypeMemberInfo *M);

void applyUniqueRetValOpt(CallSiteInfo &CSInfo, StringRef FnName, bool IsOne,		void applyUniqueRetValOpt(CallSiteInfo &CSInfo, StringRef FnName, bool IsOne,
Constant *UniqueMemberAddr);		Constant *UniqueMemberAddr);
bool tryUniqueRetValOpt(unsigned BitWidth,		bool tryUniqueRetValOpt(unsigned BitWidth,
MutableArrayRef<VirtualCallTarget> TargetsForSlot,		MutableArrayRef<VirtualCallTarget> TargetsForSlot,
CallSiteInfo &CSInfo,		CallSiteInfo &CSInfo,
WholeProgramDevirtResolution::ByArg *Res,		WholeProgramDevirtResolution::ByArg *Res,
VTableSlot Slot, ArrayRef<uint64_t> Args);		VTableSlot Slot, ArrayRef<uint64_t> Args);

▲ Show 20 Lines • Show All 239 Lines • ▼ Show 20 Lines	for (auto &&VCallSite : CSInfo.CallSites) {
if (RemarksEnabled)		if (RemarksEnabled)
VCallSite.emitRemark("single-impl", TheFn->getName(), OREGetter);		VCallSite.emitRemark("single-impl", TheFn->getName(), OREGetter);
VCallSite.CS.setCalledFunction(ConstantExpr::getBitCast(		VCallSite.CS.setCalledFunction(ConstantExpr::getBitCast(
TheFn, VCallSite.CS.getCalledValue()->getType()));		TheFn, VCallSite.CS.getCalledValue()->getType()));
// This use is no longer unsafe.		// This use is no longer unsafe.
if (VCallSite.NumUnsafeUses)		if (VCallSite.NumUnsafeUses)
--*VCallSite.NumUnsafeUses;		--*VCallSite.NumUnsafeUses;
}		}
if (CSInfo.isExported()) {		if (CSInfo.isExported())
IsExported = true;		IsExported = true;
CSInfo.markDevirt();		CSInfo.markDevirt();
}
};		};
Apply(SlotInfo.CSInfo);		Apply(SlotInfo.CSInfo);
for (auto &P : SlotInfo.ConstCSInfo)		for (auto &P : SlotInfo.ConstCSInfo)
Apply(P.second);		Apply(P.second);
}		}

bool DevirtModule::trySingleImplDevirt(		bool DevirtModule::trySingleImplDevirt(
MutableArrayRef<VirtualCallTarget> TargetsForSlot,		MutableArrayRef<VirtualCallTarget> TargetsForSlot,
Show All 39 Lines	bool DevirtModule::trySingleImplDevirt(
}		}

Res->TheKind = WholeProgramDevirtResolution::SingleImpl;		Res->TheKind = WholeProgramDevirtResolution::SingleImpl;
Res->SingleImplName = TheFn->getName();		Res->SingleImplName = TheFn->getName();

return true;		return true;
}		}

		void DevirtModule::tryIcallJumpTable(
		MutableArrayRef<VirtualCallTarget> TargetsForSlot, VTableSlotInfo &SlotInfo,
		WholeProgramDevirtResolution *Res, VTableSlot Slot) {
		Triple T(M.getTargetTriple());
		if (T.getArch() != Triple::x86_64)
		return;

		const unsigned kJumpTableThreshold = 10;
		if (TargetsForSlot.size() > kJumpTableThreshold)
		return;

		bool HasNonDevirt = !SlotInfo.CSInfo.AllCallSitesDevirted;
		if (!HasNonDevirt)
		for (auto &P : SlotInfo.ConstCSInfo)
		if (!P.second.AllCallSitesDevirted) {
		HasNonDevirt = true;
		break;
		}

		if (!HasNonDevirt)
		return;

		FunctionType *FT =
		FunctionType::get(Type::getVoidTy(M.getContext()), {Int8PtrTy}, true);
		Function *JT;
		if (isa<MDString>(Slot.TypeID)) {
		JT = Function::Create(FT, Function::ExternalLinkage,
		getGlobalName(Slot, {}, "jumptable"), &M);
		JT->setVisibility(GlobalValue::HiddenVisibility);
		} else {
		JT = Function::Create(FT, Function::InternalLinkage, "jumptable", &M);
		}
		JT->addAttribute(1, Attribute::Nest);

		std::vector<Value *> JTArgs;
		JTArgs.push_back(JT->arg_begin());
		for (auto &T : TargetsForSlot) {
		JTArgs.push_back(getMemberAddr(T.TM));
		JTArgs.push_back(T.Fn);
		}

		BasicBlock *BB = BasicBlock::Create(M.getContext(), "", JT, nullptr);
		Constant *Intr =
		Intrinsic::getDeclaration(&M, llvm::Intrinsic::icall_jumptable, {});

		auto *CI = CallInst::Create(Intr, JTArgs, "", BB);
		CI->setTailCallKind(CallInst::TCK_MustTail);
		ReturnInst::Create(M.getContext(), nullptr, BB);

		bool IsExported = false;
		applyIcallJumpTable(SlotInfo, JT, IsExported);
		if (IsExported)
		Res->TheKind = WholeProgramDevirtResolution::JumpTable;
		}

		void DevirtModule::applyIcallJumpTable(VTableSlotInfo &SlotInfo, Constant *JT,
		bool &IsExported) {
		auto Apply = [&](CallSiteInfo &CSInfo) {
		if (CSInfo.isExported())
		IsExported = true;
		if (CSInfo.AllCallSitesDevirted)
		return;
		for (auto &&VCallSite : CSInfo.CallSites) {
		CallSite CS = VCallSite.CS;

		// Jump tables are only profitable if the retpoline mitigation is enabled.
		Attribute FSAttr = CS.getCaller()->getFnAttribute("target-features");
		if (FSAttr.hasAttribute(Attribute::None) ||
		!FSAttr.getValueAsString().contains("+retpoline"))
		continue;

		if (RemarksEnabled)
		VCallSite.emitRemark("jump-table", JT->getName(), OREGetter);

		// Pass the address of the vtable in the nest register, which is r10 on
		// x86_64.
		std::vector<Type *> NewArgs;
		NewArgs.push_back(Int8PtrTy);
		for (Type *T : CS.getFunctionType()->params())
		NewArgs.push_back(T);
		PointerType *NewFT = PointerType::getUnqual(
		FunctionType::get(CS.getFunctionType()->getReturnType(), NewArgs,
		CS.getFunctionType()->isVarArg()));

		IRBuilder<> IRB(CS.getInstruction());
		std::vector<Value *> Args;
		Args.push_back(IRB.CreateBitCast(VCallSite.VTable, Int8PtrTy));
		for (unsigned I = 0; I != CS.getNumArgOperands(); ++I)
		Args.push_back(CS.getArgOperand(I));

		CallSite NewCS;
		if (CS.isCall())
		NewCS = IRB.CreateCall(IRB.CreateBitCast(JT, NewFT), Args);
		else
		NewCS = IRB.CreateInvoke(
		IRB.CreateBitCast(JT, NewFT),
		cast<InvokeInst>(CS.getInstruction())->getNormalDest(),
		cast<InvokeInst>(CS.getInstruction())->getUnwindDest(), Args);
		NewCS.setCallingConv(CS.getCallingConv());

		AttributeList Attrs = CS.getAttributes();
		std::vector<AttributeSet> NewArgAttrs;
		NewArgAttrs.push_back(AttributeSet::get(
		M.getContext(),
		ArrayRef<Attribute>{Attribute::get(M.getContext(), Attribute::Nest)}));
		for (unsigned I = 0; I + 2 < Attrs.getNumAttrSets(); ++I)
		NewArgAttrs.push_back(Attrs.getParamAttributes(I));
		NewCS.setAttributes(
		AttributeList::get(M.getContext(), Attrs.getFnAttributes(),
		Attrs.getRetAttributes(), NewArgAttrs));

		CS->replaceAllUsesWith(NewCS.getInstruction());
		CS->eraseFromParent();

		// This use is no longer unsafe.
		if (VCallSite.NumUnsafeUses)
		--*VCallSite.NumUnsafeUses;
		}
		// Don't mark as devirtualized because there may be callers compiled without
		// retpoline mitigation, which would mean that they are lowered to
		// llvm.type.test and therefore require an llvm.type.test resolution for the
		// type identifier.
		};
		Apply(SlotInfo.CSInfo);
		for (auto &P : SlotInfo.ConstCSInfo)
		Apply(P.second);
		}

bool DevirtModule::tryEvaluateFunctionsWithArgs(		bool DevirtModule::tryEvaluateFunctionsWithArgs(
MutableArrayRef<VirtualCallTarget> TargetsForSlot,		MutableArrayRef<VirtualCallTarget> TargetsForSlot,
ArrayRef<uint64_t> Args) {		ArrayRef<uint64_t> Args) {
// Evaluate each function and store the result in each target's RetVal		// Evaluate each function and store the result in each target's RetVal
// field.		// field.
for (VirtualCallTarget &Target : TargetsForSlot) {		for (VirtualCallTarget &Target : TargetsForSlot) {
if (Target.Fn->arg_size() != Args.size() + 1)		if (Target.Fn->arg_size() != Args.size() + 1)
return false;		return false;
▲ Show 20 Lines • Show All 136 Lines • ▼ Show 20 Lines	Value *Cmp =
B.CreateBitCast(Call.VTable, Int8PtrTy), UniqueMemberAddr);		B.CreateBitCast(Call.VTable, Int8PtrTy), UniqueMemberAddr);
Cmp = B.CreateZExt(Cmp, Call.CS->getType());		Cmp = B.CreateZExt(Cmp, Call.CS->getType());
Call.replaceAndErase("unique-ret-val", FnName, RemarksEnabled, OREGetter,		Call.replaceAndErase("unique-ret-val", FnName, RemarksEnabled, OREGetter,
Cmp);		Cmp);
}		}
CSInfo.markDevirt();		CSInfo.markDevirt();
}		}

		Constant *DevirtModule::getMemberAddr(const TypeMemberInfo *M) {
		Constant *C = ConstantExpr::getBitCast(M->Bits->GV, Int8PtrTy);
		return ConstantExpr::getGetElementPtr(Int8Ty, C,
		ConstantInt::get(Int64Ty, M->Offset));
		}

bool DevirtModule::tryUniqueRetValOpt(		bool DevirtModule::tryUniqueRetValOpt(
unsigned BitWidth, MutableArrayRef<VirtualCallTarget> TargetsForSlot,		unsigned BitWidth, MutableArrayRef<VirtualCallTarget> TargetsForSlot,
CallSiteInfo &CSInfo, WholeProgramDevirtResolution::ByArg *Res,		CallSiteInfo &CSInfo, WholeProgramDevirtResolution::ByArg *Res,
VTableSlot Slot, ArrayRef<uint64_t> Args) {		VTableSlot Slot, ArrayRef<uint64_t> Args) {
// IsOne controls whether we look for a 0 or a 1.		// IsOne controls whether we look for a 0 or a 1.
auto tryUniqueRetValOptFor = [&](bool IsOne) {		auto tryUniqueRetValOptFor = [&](bool IsOne) {
const TypeMemberInfo *UniqueMember = nullptr;		const TypeMemberInfo *UniqueMember = nullptr;
for (const VirtualCallTarget &Target : TargetsForSlot) {		for (const VirtualCallTarget &Target : TargetsForSlot) {
if (Target.RetVal == (IsOne ? 1 : 0)) {		if (Target.RetVal == (IsOne ? 1 : 0)) {
if (UniqueMember)		if (UniqueMember)
return false;		return false;
UniqueMember = Target.TM;		UniqueMember = Target.TM;
}		}
}		}

// We should have found a unique member or bailed out by now. We already		// We should have found a unique member or bailed out by now. We already
// checked for a uniform return value in tryUniformRetValOpt.		// checked for a uniform return value in tryUniformRetValOpt.
assert(UniqueMember);		assert(UniqueMember);

Constant *UniqueMemberAddr =		Constant *UniqueMemberAddr = getMemberAddr(UniqueMember);
ConstantExpr::getBitCast(UniqueMember->Bits->GV, Int8PtrTy);
UniqueMemberAddr = ConstantExpr::getGetElementPtr(
Int8Ty, UniqueMemberAddr,
ConstantInt::get(Int64Ty, UniqueMember->Offset));

if (CSInfo.isExported()) {		if (CSInfo.isExported()) {
Res->TheKind = WholeProgramDevirtResolution::ByArg::UniqueRetVal;		Res->TheKind = WholeProgramDevirtResolution::ByArg::UniqueRetVal;
Res->Info = IsOne;		Res->Info = IsOne;

exportGlobal(Slot, Args, "unique_member", UniqueMemberAddr);		exportGlobal(Slot, Args, "unique_member", UniqueMemberAddr);
}		}

// Replace each call with the comparison.		// Replace each call with the comparison.
▲ Show 20 Lines • Show All 370 Lines • ▼ Show 20 Lines	case WholeProgramDevirtResolution::ByArg::VirtualConstProp: {
ResByArg.Bit);		ResByArg.Bit);
applyVirtualConstProp(CSByConstantArg.second, "", Byte, Bit);		applyVirtualConstProp(CSByConstantArg.second, "", Byte, Bit);
break;		break;
}		}
default:		default:
break;		break;
}		}
}		}

		if (Res.TheKind == WholeProgramDevirtResolution::JumpTable) {
		auto *JT = M.getOrInsertFunction(getGlobalName(Slot, {}, "jumptable"),
		Type::getVoidTy(M.getContext()));
		bool IsExported = false;
		applyIcallJumpTable(SlotInfo, JT, IsExported);
		assert(!IsExported);
		}
}		}

void DevirtModule::removeRedundantTypeTests() {		void DevirtModule::removeRedundantTypeTests() {
auto True = ConstantInt::getTrue(M.getContext());		auto True = ConstantInt::getTrue(M.getContext());
for (auto &&U : NumUnsafeUsesForTypeTest) {		for (auto &&U : NumUnsafeUsesForTypeTest) {
if (U.second == 0) {		if (U.second == 0) {
U.first->replaceAllUsesWith(True);		U.first->replaceAllUsesWith(True);
U.first->eraseFromParent();		U.first->eraseFromParent();
▲ Show 20 Lines • Show All 53 Lines • ▼ Show 20 Lines	if (ExportSummary) {
for (auto &P : *ExportSummary) {		for (auto &P : *ExportSummary) {
for (auto &S : P.second.SummaryList) {		for (auto &S : P.second.SummaryList) {
auto *FS = dyn_cast<FunctionSummary>(S.get());		auto *FS = dyn_cast<FunctionSummary>(S.get());
if (!FS)		if (!FS)
continue;		continue;
// FIXME: Only add live functions.		// FIXME: Only add live functions.
for (FunctionSummary::VFuncId VF : FS->type_test_assume_vcalls()) {		for (FunctionSummary::VFuncId VF : FS->type_test_assume_vcalls()) {
for (Metadata *MD : MetadataByGUID[VF.GUID]) {		for (Metadata *MD : MetadataByGUID[VF.GUID]) {
CallSlots[{MD, VF.Offset}].CSInfo.SummaryHasTypeTestAssumeUsers =		CallSlots[{MD, VF.Offset}]
true;		.CSInfo.markSummaryHasTypeTestAssumeUsers();
}		}
}		}
for (FunctionSummary::VFuncId VF : FS->type_checked_load_vcalls()) {		for (FunctionSummary::VFuncId VF : FS->type_checked_load_vcalls()) {
for (Metadata *MD : MetadataByGUID[VF.GUID]) {		for (Metadata *MD : MetadataByGUID[VF.GUID]) {
CallSlots[{MD, VF.Offset}]		CallSlots[{MD, VF.Offset}].CSInfo.addSummaryTypeCheckedLoadUser(FS);
.CSInfo.SummaryTypeCheckedLoadUsers.push_back(FS);
}		}
}		}
for (const FunctionSummary::ConstVCall &VC :		for (const FunctionSummary::ConstVCall &VC :
FS->type_test_assume_const_vcalls()) {		FS->type_test_assume_const_vcalls()) {
for (Metadata *MD : MetadataByGUID[VC.VFunc.GUID]) {		for (Metadata *MD : MetadataByGUID[VC.VFunc.GUID]) {
CallSlots[{MD, VC.VFunc.Offset}]		CallSlots[{MD, VC.VFunc.Offset}]
.ConstCSInfo[VC.Args]		.ConstCSInfo[VC.Args]
.SummaryHasTypeTestAssumeUsers = true;		.markSummaryHasTypeTestAssumeUsers();
}		}
}		}
for (const FunctionSummary::ConstVCall &VC :		for (const FunctionSummary::ConstVCall &VC :
FS->type_checked_load_const_vcalls()) {		FS->type_checked_load_const_vcalls()) {
for (Metadata *MD : MetadataByGUID[VC.VFunc.GUID]) {		for (Metadata *MD : MetadataByGUID[VC.VFunc.GUID]) {
CallSlots[{MD, VC.VFunc.Offset}]		CallSlots[{MD, VC.VFunc.Offset}]
.ConstCSInfo[VC.Args]		.ConstCSInfo[VC.Args]
.SummaryTypeCheckedLoadUsers.push_back(FS);		.addSummaryTypeCheckedLoadUser(FS);
}		}
}		}
}		}
}		}
}		}

// For each (type, offset) pair:		// For each (type, offset) pair:
bool DidVirtualConstProp = false;		bool DidVirtualConstProp = false;
std::map<std::string, Function*> DevirtTargets;		std::map<std::string, Function*> DevirtTargets;
for (auto &S : CallSlots) {		for (auto &S : CallSlots) {
// Search each of the members of the type identifier for the virtual		// Search each of the members of the type identifier for the virtual
// function implementation at offset S.first.ByteOffset, and add to		// function implementation at offset S.first.ByteOffset, and add to
// TargetsForSlot.		// TargetsForSlot.
std::vector<VirtualCallTarget> TargetsForSlot;		std::vector<VirtualCallTarget> TargetsForSlot;
if (tryFindVirtualCallTargets(TargetsForSlot, TypeIdMap[S.first.TypeID],		if (tryFindVirtualCallTargets(TargetsForSlot, TypeIdMap[S.first.TypeID],
S.first.ByteOffset)) {		S.first.ByteOffset)) {
WholeProgramDevirtResolution *Res = nullptr;		WholeProgramDevirtResolution *Res = nullptr;
if (ExportSummary && isa<MDString>(S.first.TypeID))		if (ExportSummary && isa<MDString>(S.first.TypeID))
Res = &ExportSummary		Res = &ExportSummary
->getOrInsertTypeIdSummary(		->getOrInsertTypeIdSummary(
cast<MDString>(S.first.TypeID)->getString())		cast<MDString>(S.first.TypeID)->getString())
.WPDRes[S.first.ByteOffset];		.WPDRes[S.first.ByteOffset];

if (!trySingleImplDevirt(TargetsForSlot, S.second, Res) &&		if (!trySingleImplDevirt(TargetsForSlot, S.second, Res)) {
tryVirtualConstProp(TargetsForSlot, S.second, Res, S.first))		DidVirtualConstProp |=
DidVirtualConstProp = true;		tryVirtualConstProp(TargetsForSlot, S.second, Res, S.first);

		tryIcallJumpTable(TargetsForSlot, S.second, Res, S.first);
		}

// Collect functions devirtualized at least for one call site for stats.		// Collect functions devirtualized at least for one call site for stats.
if (RemarksEnabled)		if (RemarksEnabled)
for (const auto &T : TargetsForSlot)		for (const auto &T : TargetsForSlot)
if (T.WasDevirt)		if (T.WasDevirt)
DevirtTargets[T.Fn->getName()] = T.Fn;		DevirtTargets[T.Fn->getName()] = T.Fn;
}		}

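The reordered control flow in the hunk above can be modelled outside the pass. What follows is only a sketch under stated assumptions: `devirtualizeSlot` and `SlotOutcome` are hypothetical names, and the two boolean parameters stand in for the results of `trySingleImplDevirt` and `tryVirtualConstProp`, whose real calls take the slot data shown in the diff. It illustrates that, on the new side, virtual constant propagation and the jump-table attempt both run whenever single-implementation devirtualization fails, and that the jump-table attempt does not depend on the constant-propagation result.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the per-slot outcome; not an LLVM type.
struct SlotOutcome {
  bool DidVirtualConstProp = false;
  std::vector<std::string> Attempted; // strategies tried, in order
};

// Models the ordering on the right-hand side of the hunk. The booleans are
// assumed results of the real try* calls, which are not reproduced here.
SlotOutcome devirtualizeSlot(bool SingleImplSucceeds, bool ConstPropSucceeds) {
  SlotOutcome Out;
  Out.Attempted.push_back("single-impl");
  if (!SingleImplSucceeds) {
    // Constant propagation is attempted and its result is accumulated.
    Out.Attempted.push_back("virtual-const-prop");
    Out.DidVirtualConstProp |= ConstPropSucceeds;
    // The jump-table attempt follows unconditionally inside this branch.
    Out.Attempted.push_back("icall-jumptable");
  }
  return Out;
}
```

Calling `devirtualizeSlot(false, true)` records all three strategies, which corresponds to the combined `Kind: JumpTable` plus `VirtualConstProp` resolution in `import-vcp-jumptable.yaml`.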

llvm/test/Transforms/LowerTypeTests/icall-jumptable.ll

This file was added.

				; RUN: opt -S -lowertypetests < %s | FileCheck %s

				target datalayout = "e-p:64:64"
				target triple = "x86_64-unknown-linux"

				; CHECK: @0 = private constant { i32, [0 x i8], i32, [0 x i8], i32 } { i32 1, [0 x i8] zeroinitializer, i32 2, [0 x i8] zeroinitializer, i32 3 }
				@g1 = constant i32 1
				@g2 = constant i32 2, !type !0
				@g3 = constant i32 3, !type !0

				define void @f1() !type !1 {
				ret void
				}

				define void @f2() !type !1 {
				ret void
				}

				define void @f3() !type !1 {
				ret void
				}

				define void @f4() !type !1 {
				ret void
				}

				define void @f5() !type !1 {
				ret void
				}

				define void @f6() !type !1 {
				ret void
				}

				define void @f7() !type !1 {
				ret void
				}

				define void @f8() !type !1 {
				ret void
				}

				define void @f9() !type !1 {
				ret void
				}

				define void @f10() !type !1 {
				ret void
				}

				declare void @g1f()
				declare void @g2f()

				; CHECK: define void @jt2()
				define void @jt2() {
				; CHECK-NEXT: call void asm sideeffect "leaq ${0:c}+5(%rip), %r11\0Acmp %r11, %r10\0Ajb ${1:c}@plt\0Ajmp ${2:c}@plt\0A", "s,s,s"({ i32, [0 x i8], i32, [0 x i8], i32 }* @0, void ()* @g1f, void ()* @g2f)
				call void (...) @llvm.icall.jumptable(
				i32* @g1, void ()* @g1f,
				i8* getelementptr (i8, i8* bitcast (i32* @g2 to i8*), i64 1), void ()* @g2f
				)
				ret void
				}

				; CHECK: define void @jt3()
				define void @jt3() {
				; CHECK-NEXT: call void asm sideeffect "leaq ${0:c}+8(%rip), %r11\0Acmp %r11, %r10\0Ajb ${1:c}@plt\0Aje ${2:c}@plt\0Ajmp ${3:c}@plt\0A", "s,s,s,s"([10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f1, void ()* @f2, void ()* @f3)
				call void (...) @llvm.icall.jumptable(
				void ()* @f1, void ()* @f1,
				void ()* @f2, void ()* @f2,
				void ()* @f3, void ()* @f3
				)
				ret void
				}

				; CHECK: define void @jt7()
				define void @jt7() {
				; CHECK-NEXT: call void asm sideeffect "leaq ${0:c}+24(%rip), %r11\0Acmp %r11, %r10\0Ajb 0f\0Aje ${1:c}@plt\0Aleaq ${2:c}+40(%rip), %r11\0Acmp %r11, %r10\0Ajb ${3:c}@plt\0Aje ${4:c}@plt\0Ajmp ${5:c}@plt\0A0:\0Aleaq ${6:c}+8(%rip), %r11\0Acmp %r11, %r10\0Ajb ${7:c}@plt\0Aje ${8:c}@plt\0Ajmp ${9:c}@plt\0A", "s,s,s,s,s,s,s,s,s,s"([10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f4, [10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f5, void ()* @f6, void ()* @f7, [10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f1, void ()* @f2, void ()* @f3)
				call void (...) @llvm.icall.jumptable(
				void ()* @f1, void ()* @f1,
				void ()* @f2, void ()* @f2,
				void ()* @f3, void ()* @f3,
				void ()* @f4, void ()* @f4,
				void ()* @f5, void ()* @f5,
				void ()* @f6, void ()* @f6,
				void ()* @f7, void ()* @f7
				)
				ret void
				}

				; CHECK: define void @jt10()
				define void @jt10() {
				; CHECK-NEXT: call void asm sideeffect "leaq ${0:c}+40(%rip), %r11\0Acmp %r11, %r10\0Ajb 0f\0Aje ${1:c}@plt\0Aleaq ${2:c}+56(%rip), %r11\0Acmp %r11, %r10\0Ajb ${3:c}@plt\0Aje ${4:c}@plt\0Aleaq ${5:c}+72(%rip), %r11\0Acmp %r11, %r10\0Ajb ${6:c}@plt\0Ajmp ${7:c}@plt\0A0:\0Aleaq ${8:c}+8(%rip), %r11\0Acmp %r11, %r10\0Ajb ${9:c}@plt\0Aje ${10:c}@plt\0Aleaq ${11:c}+24(%rip), %r11\0Acmp %r11, %r10\0Ajb ${12:c}@plt\0Aje ${13:c}@plt\0Ajmp ${14:c}@plt\0A", "s,s,s,s,s,s,s,s,s,s,s,s,s,s,s"([10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f6, [10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f7, void ()* @f8, [10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f9, void ()* @f10, [10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f1, void ()* @f2, [10 x [8 x i8]]* bitcast (void ()* @.cfi.jumptable to [10 x [8 x i8]]*), void ()* @f3, void ()* @f4, void ()* @f5)
				call void (...) @llvm.icall.jumptable(
				void ()* @f1, void ()* @f1,
				void ()* @f2, void ()* @f2,
				void ()* @f3, void ()* @f3,
				void ()* @f4, void ()* @f4,
				void ()* @f5, void ()* @f5,
				void ()* @f6, void ()* @f6,
				void ()* @f7, void ()* @f7,
				void ()* @f8, void ()* @f8,
				void ()* @f9, void ()* @f9,
				void ()* @f10, void ()* @f10
				)
				ret void
				}

				define i1 @tt(i8* %ptr) {
				%p = call i1 @llvm.type.test(i8* %ptr, metadata !"typeid1")
				ret i1 %p
				}

				!0 = !{i32 0, !"typeid1"}
				!1 = !{i32 0, !"typeid2"}

				declare i1 @llvm.type.test(i8* %ptr, metadata %bitset) nounwind readnone
				declare void @llvm.icall.jumptable(...)
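
The expected assembly in the CHECK-NEXT lines of this test suggests how a funnel is laid out for different target counts. The sketch below is inferred from those expected strings only, not taken from the lowering code: `funnelDispatch` is a hypothetical name, the switch from a linear compare chain to a binary split at six targets is an assumption that happens to reproduce the `jt2`, `jt3`, `jt7` and `jt10` sequences, and a real funnel ends in tail jumps rather than returning an index.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a branch funnel. `addrs` holds the known vtable
// addresses in ascending order; `addr` is assumed to equal one of them.
// Returns the index of the selected target within `addrs`.
std::size_t funnelDispatch(const std::vector<std::uint64_t> &addrs,
                           std::uint64_t addr, std::size_t first,
                           std::size_t count) {
  if (count == 1)
    return first; // jmp target
  if (count == 2) // cmp second; jb first; jmp second
    return addr < addrs[first + 1] ? first : first + 1;
  if (count < 6) { // assumed threshold, inferred from the expected output
    // cmp second; jb first; je second; fall through to the remaining targets
    if (addr < addrs[first + 1])
      return first;
    if (addr == addrs[first + 1])
      return first + 1;
    return funnelDispatch(addrs, addr, first + 2, count - 2);
  }
  // cmp middle; jb to the lower half (label 0:); je middle; else upper half
  const std::size_t mid = count / 2;
  if (addr < addrs[first + mid])
    return funnelDispatch(addrs, addr, first, mid);
  if (addr == addrs[first + mid])
    return first + mid;
  return funnelDispatch(addrs, addr, first + mid + 1, count - mid - 1);
}
```

With ten addresses spaced eight bytes apart, the first comparison is against entry 5 (offset 40), which matches the `+40(%rip)` operand that opens the `jt10` sequence.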

llvm/test/Transforms/WholeProgramDevirt/Inputs/import-jumptable.yaml

This file was added.

				---
				TypeIdMap:
				typeid1:
				WPDRes:
				0:
				Kind: JumpTable
				typeid2:
				WPDRes:
				8:
				Kind: JumpTable
				...

llvm/test/Transforms/WholeProgramDevirt/Inputs/import-vcp-jumptable.yaml

This file was added.

				---
				TypeIdMap:
				typeid1:
				WPDRes:
				0:
				Kind: JumpTable
				ResByArg:
				1:
				Kind: VirtualConstProp
				Info: 0
				Byte: 42
				Bit: 0
				typeid2:
				WPDRes:
				8:
				Kind: JumpTable
				ResByArg:
				3:
				Kind: VirtualConstProp
				Info: 0
				Byte: 43
				Bit: 128
				...

llvm/test/Transforms/WholeProgramDevirt/import.ll

	; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s | FileCheck --check-prefixes=CHECK,SINGLE-IMPL %s			; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s | FileCheck --check-prefixes=CHECK,SINGLE-IMPL %s
	; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-uniform-ret-val.yaml < %s | FileCheck --check-prefixes=CHECK,UNIFORM-RET-VAL %s			; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-uniform-ret-val.yaml < %s | FileCheck --check-prefixes=CHECK,INDIR,UNIFORM-RET-VAL %s
	; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-unique-ret-val0.yaml < %s | FileCheck --check-prefixes=CHECK,UNIQUE-RET-VAL0 %s			; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-unique-ret-val0.yaml < %s | FileCheck --check-prefixes=CHECK,INDIR,UNIQUE-RET-VAL0 %s
	; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-unique-ret-val1.yaml < %s | FileCheck --check-prefixes=CHECK,UNIQUE-RET-VAL1 %s			; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-unique-ret-val1.yaml < %s | FileCheck --check-prefixes=CHECK,INDIR,UNIQUE-RET-VAL1 %s
	; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-vcp.yaml < %s | FileCheck --check-prefixes=CHECK,VCP,VCP-X86,VCP64 %s			; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-vcp.yaml < %s | FileCheck --check-prefixes=CHECK,VCP,VCP-X86,VCP64,INDIR %s
	; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-vcp.yaml -mtriple=i686-unknown-linux -data-layout=e-p:32:32 < %s | FileCheck --check-prefixes=CHECK,VCP,VCP-X86,VCP32 %s			; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-vcp.yaml -mtriple=i686-unknown-linux -data-layout=e-p:32:32 < %s | FileCheck --check-prefixes=CHECK,VCP,VCP-X86,VCP32 %s
	; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-vcp.yaml -mtriple=armv7-unknown-linux -data-layout=e-p:32:32 < %s | FileCheck --check-prefixes=CHECK,VCP,VCP-ARM %s			; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-vcp.yaml -mtriple=armv7-unknown-linux -data-layout=e-p:32:32 < %s | FileCheck --check-prefixes=CHECK,VCP,VCP-ARM %s
				; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-vcp-jumptable.yaml < %s | FileCheck --check-prefixes=CHECK,VCP,VCP-X86,VCP64,JUMPTABLE %s
				; RUN: opt -S -wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-jumptable.yaml < %s | FileCheck --check-prefixes=CHECK,JUMPTABLE,JUMPTABLE-NOVCP %s

	target datalayout = "e-p:64:64"			target datalayout = "e-p:64:64"
	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	; VCP-X86: @__typeid_typeid1_0_1_byte = external hidden global i8, !absolute_symbol !0			; VCP-X86: @__typeid_typeid1_0_1_byte = external hidden global i8, !absolute_symbol !0
	; VCP-X86: @__typeid_typeid1_0_1_bit = external hidden global i8, !absolute_symbol !1			; VCP-X86: @__typeid_typeid1_0_1_bit = external hidden global i8, !absolute_symbol !1
	; VCP-X86: @__typeid_typeid2_8_3_byte = external hidden global i8, !absolute_symbol !0			; VCP-X86: @__typeid_typeid2_8_3_byte = external hidden global i8, !absolute_symbol !0
	; VCP-X86: @__typeid_typeid2_8_3_bit = external hidden global i8, !absolute_symbol !1			; VCP-X86: @__typeid_typeid2_8_3_bit = external hidden global i8, !absolute_symbol !1

	; Test cases where the argument values are known and we can apply virtual			; Test cases where the argument values are known and we can apply virtual
	; constant propagation.			; constant propagation.

	; CHECK: define i32 @call1			; CHECK: define i32 @call1
	define i32 @call1(i8* %obj) {			define i32 @call1(i8* %obj) #0 {
	%vtableptr = bitcast i8* %obj to [3 x i8*]**			%vtableptr = bitcast i8* %obj to [3 x i8*]**
	%vtable = load [3 x i8*]*, [3 x i8*]** %vtableptr			%vtable = load [3 x i8*]*, [3 x i8*]** %vtableptr
	%vtablei8 = bitcast [3 x i8*]* %vtable to i8*			%vtablei8 = bitcast [3 x i8*]* %vtable to i8*
	%p = call i1 @llvm.type.test(i8* %vtablei8, metadata !"typeid1")			%p = call i1 @llvm.type.test(i8* %vtablei8, metadata !"typeid1")
	call void @llvm.assume(i1 %p)			call void @llvm.assume(i1 %p)
	%fptrptr = getelementptr [3 x i8*], [3 x i8*]* %vtable, i32 0, i32 0			%fptrptr = getelementptr [3 x i8*], [3 x i8*]* %vtable, i32 0, i32 0
	%fptr = load i8*, i8** %fptrptr			%fptr = load i8*, i8** %fptrptr
	%fptr_casted = bitcast i8* %fptr to i32 (i8*, i32)*			%fptr_casted = bitcast i8* %fptr to i32 (i8*, i32)*
				; CHECK: {{.*}} = bitcast {{.*}} to i8*
				; VCP: [[VT1:%.*]] = bitcast {{.*}} to i8*
	; SINGLE-IMPL: call i32 bitcast (void ()* @singleimpl1 to i32 (i8*, i32)*)			; SINGLE-IMPL: call i32 bitcast (void ()* @singleimpl1 to i32 (i8*, i32)*)
	%result = call i32 %fptr_casted(i8* %obj, i32 1)			%result = call i32 %fptr_casted(i8* %obj, i32 1)
	; UNIFORM-RET-VAL: ret i32 42			; UNIFORM-RET-VAL: ret i32 42
	; VCP: {{.*}} = bitcast {{.*}} to i8*
	; VCP: [[VT1:%.*]] = bitcast {{.*}} to i8*
	; VCP-X86: [[GEP1:%.*]] = getelementptr i8, i8* [[VT1]], i32 ptrtoint (i8* @__typeid_typeid1_0_1_byte to i32)			; VCP-X86: [[GEP1:%.*]] = getelementptr i8, i8* [[VT1]], i32 ptrtoint (i8* @__typeid_typeid1_0_1_byte to i32)
	; VCP-ARM: [[GEP1:%.*]] = getelementptr i8, i8* [[VT1]], i32 42			; VCP-ARM: [[GEP1:%.*]] = getelementptr i8, i8* [[VT1]], i32 42
	; VCP: [[BC1:%.*]] = bitcast i8* [[GEP1]] to i32*			; VCP: [[BC1:%.*]] = bitcast i8* [[GEP1]] to i32*
	; VCP: [[LOAD1:%.*]] = load i32, i32* [[BC1]]			; VCP: [[LOAD1:%.*]] = load i32, i32* [[BC1]]
	; VCP: ret i32 [[LOAD1]]			; VCP: ret i32 [[LOAD1]]
				; JUMPTABLE-NOVCP: [[VT1:%.*]] = bitcast {{.*}} to i8*
				; JUMPTABLE-NOVCP: call i32 bitcast (void ()* @__typeid_typeid1_0_jumptable to i32 (i8*, i8*, i32)*)(i8* nest [[VT1]], i8* %obj, i32 1)
	ret i32 %result			ret i32 %result
	}			}

	; Test cases where the argument values are unknown, so we cannot apply virtual			; Test cases where the argument values are unknown, so we cannot apply virtual
	; constant propagation.			; constant propagation.

	; CHECK: define i1 @call2			; CHECK: define i1 @call2
	define i1 @call2(i8* %obj) {			define i1 @call2(i8* %obj) #0 {
				; JUMPTABLE: [[VT1:%.*]] = bitcast {{.*}} to i8*
	%vtableptr = bitcast i8* %obj to [1 x i8*]**			%vtableptr = bitcast i8* %obj to [1 x i8*]**
	%vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr			%vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr
	%vtablei8 = bitcast [1 x i8*]* %vtable to i8*			%vtablei8 = bitcast [1 x i8*]* %vtable to i8*
	%pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtablei8, i32 8, metadata !"typeid2")			%pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtablei8, i32 8, metadata !"typeid2")
	%fptr = extractvalue {i8*, i1} %pair, 0			%fptr = extractvalue {i8*, i1} %pair, 0
	%p = extractvalue {i8*, i1} %pair, 1			%p = extractvalue {i8*, i1} %pair, 1
	; SINGLE-IMPL: br i1 true,			; SINGLE-IMPL: br i1 true,
	br i1 %p, label %cont, label %trap			br i1 %p, label %cont, label %trap

	cont:			cont:
	%fptr_casted = bitcast i8* %fptr to i1 (i8*, i32)*			%fptr_casted = bitcast i8* %fptr to i1 (i8*, i32)*
	; SINGLE-IMPL: call i1 bitcast (void ()* @singleimpl2 to i1 (i8*, i32)*)			; SINGLE-IMPL: call i1 bitcast (void ()* @singleimpl2 to i1 (i8*, i32)*)
	; UNIFORM-RET-VAL: call i1 %			; INDIR: call i1 %
	; UNIQUE-RET-VAL0: call i1 %			; JUMPTABLE: call i1 bitcast (void ()* @__typeid_typeid2_8_jumptable to i1 (i8*, i8*, i32)*)(i8* nest [[VT1]], i8* %obj, i32 undef)
	; UNIQUE-RET-VAL1: call i1 %
	%result = call i1 %fptr_casted(i8* %obj, i32 undef)			%result = call i1 %fptr_casted(i8* %obj, i32 undef)
	ret i1 %result			ret i1 %result

	trap:			trap:
	call void @llvm.trap()			call void @llvm.trap()
	unreachable			unreachable
	}			}

	; CHECK: define i1 @call3			; CHECK: define i1 @call3
	define i1 @call3(i8* %obj) {			define i1 @call3(i8* %obj) #0 {
	%vtableptr = bitcast i8* %obj to [1 x i8*]**			%vtableptr = bitcast i8* %obj to [1 x i8*]**
	%vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr			%vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr
	%vtablei8 = bitcast [1 x i8*]* %vtable to i8*			%vtablei8 = bitcast [1 x i8*]* %vtable to i8*
	%pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtablei8, i32 8, metadata !"typeid2")			%pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtablei8, i32 8, metadata !"typeid2")
	%fptr = extractvalue {i8*, i1} %pair, 0			%fptr = extractvalue {i8*, i1} %pair, 0
	%p = extractvalue {i8*, i1} %pair, 1			%p = extractvalue {i8*, i1} %pair, 1
	br i1 %p, label %cont, label %trap			br i1 %p, label %cont, label %trap

	cont:			cont:
	%fptr_casted = bitcast i8* %fptr to i1 (i8*, i32)*			%fptr_casted = bitcast i8* %fptr to i1 (i8*, i32)*
	%result = call i1 %fptr_casted(i8* %obj, i32 3)			%result = call i1 %fptr_casted(i8* %obj, i32 3)
	; UNIQUE-RET-VAL0: icmp ne i8* %vtablei8, @__typeid_typeid2_8_3_unique_member			; UNIQUE-RET-VAL0: icmp ne i8* %vtablei8, @__typeid_typeid2_8_3_unique_member
	; UNIQUE-RET-VAL1: icmp eq i8* %vtablei8, @__typeid_typeid2_8_3_unique_member			; UNIQUE-RET-VAL1: icmp eq i8* %vtablei8, @__typeid_typeid2_8_3_unique_member
	; VCP: [[VT2:%.*]] = bitcast {{.*}} to i8*			; VCP: [[VT2:%.*]] = bitcast {{.*}} to i8*
	; VCP-X86: [[GEP2:%.*]] = getelementptr i8, i8* [[VT2]], i32 ptrtoint (i8* @__typeid_typeid2_8_3_byte to i32)			; VCP-X86: [[GEP2:%.*]] = getelementptr i8, i8* [[VT2]], i32 ptrtoint (i8* @__typeid_typeid2_8_3_byte to i32)
	; VCP-ARM: [[GEP2:%.*]] = getelementptr i8, i8* [[VT2]], i32 43			; VCP-ARM: [[GEP2:%.*]] = getelementptr i8, i8* [[VT2]], i32 43
	; VCP: [[LOAD2:%.*]] = load i8, i8* [[GEP2]]			; VCP: [[LOAD2:%.*]] = load i8, i8* [[GEP2]]
	; VCP-X86: [[AND2:%.*]] = and i8 [[LOAD2]], ptrtoint (i8* @__typeid_typeid2_8_3_bit to i8)			; VCP-X86: [[AND2:%.*]] = and i8 [[LOAD2]], ptrtoint (i8* @__typeid_typeid2_8_3_bit to i8)
	; VCP-ARM: [[AND2:%.*]] = and i8 [[LOAD2]], -128			; VCP-ARM: [[AND2:%.*]] = and i8 [[LOAD2]], -128
	; VCP: [[ICMP2:%.*]] = icmp ne i8 [[AND2]], 0			; VCP: [[ICMP2:%.*]] = icmp ne i8 [[AND2]], 0
	; VCP: ret i1 [[ICMP2]]			; VCP: ret i1 [[ICMP2]]
				; JUMPTABLE-NOVCP: [[VT2:%.*]] = bitcast {{.*}} to i8*
				; JUMPTABLE-NOVCP: call i1 bitcast (void ()* @__typeid_typeid2_8_jumptable to i1 (i8*, i8*, i32)*)(i8* nest [[VT2]], i8* %obj, i32 3)
	ret i1 %result			ret i1 %result

	trap:			trap:
	call void @llvm.trap()			call void @llvm.trap()
	unreachable			unreachable
	}			}

	; SINGLE-IMPL-DAG: declare void @singleimpl1()			; SINGLE-IMPL-DAG: declare void @singleimpl1()
	; SINGLE-IMPL-DAG: declare void @singleimpl2()			; SINGLE-IMPL-DAG: declare void @singleimpl2()

	; VCP32: !0 = !{i32 -1, i32 -1}			; VCP32: !0 = !{i32 -1, i32 -1}
	; VCP64: !0 = !{i64 0, i64 4294967296}			; VCP64: !0 = !{i64 0, i64 4294967296}

	; VCP32: !1 = !{i32 0, i32 256}			; VCP32: !1 = !{i32 0, i32 256}
	; VCP64: !1 = !{i64 0, i64 256}			; VCP64: !1 = !{i64 0, i64 256}

	declare void @llvm.assume(i1)			declare void @llvm.assume(i1)
	declare void @llvm.trap()			declare void @llvm.trap()
	declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)			declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)
	declare i1 @llvm.type.test(i8*, metadata)			declare i1 @llvm.type.test(i8*, metadata)

				attributes #0 = { "target-features"="+retpoline" }

llvm/test/Transforms/WholeProgramDevirt/jumptable.ll

This file was added.

				; RUN: opt -S -wholeprogramdevirt %s | FileCheck --check-prefixes=CHECK,RETP %s
				; RUN: sed -e 's,+retpoline,-retpoline,g' %s | opt -S -wholeprogramdevirt | FileCheck --check-prefixes=CHECK,NORETP %s
				; RUN: opt -wholeprogramdevirt -wholeprogramdevirt-summary-action=export -wholeprogramdevirt-read-summary=%S/Inputs/export.yaml -wholeprogramdevirt-write-summary=%t -S -o - %s | FileCheck --check-prefixes=CHECK,RETP %s
				; RUN: FileCheck --check-prefix=SUMMARY %s < %t

				; SUMMARY: TypeIdMap:
				; SUMMARY-NEXT: typeid1:
				; SUMMARY-NEXT: TTRes:
				; SUMMARY-NEXT: Kind: Unsat
				; SUMMARY-NEXT: SizeM1BitWidth: 0
				; SUMMARY-NEXT: AlignLog2: 0
				; SUMMARY-NEXT: SizeM1: 0
				; SUMMARY-NEXT: BitMask: 0
				; SUMMARY-NEXT: InlineBits: 0
				; SUMMARY-NEXT: WPDRes:
				; SUMMARY-NEXT: 0:
				; SUMMARY-NEXT: Kind: JumpTable
				; SUMMARY-NEXT: SingleImplName: ''
				; SUMMARY-NEXT: ResByArg:
				; SUMMARY-NEXT: typeid2:
				; SUMMARY-NEXT: TTRes:
				; SUMMARY-NEXT: Kind: Unsat
				; SUMMARY-NEXT: SizeM1BitWidth: 0
				; SUMMARY-NEXT: AlignLog2: 0
				; SUMMARY-NEXT: SizeM1: 0
				; SUMMARY-NEXT: BitMask: 0
				; SUMMARY-NEXT: InlineBits: 0
				; SUMMARY-NEXT: WPDRes:
				; SUMMARY-NEXT: 0:
				; SUMMARY-NEXT: Kind: Indir
				; SUMMARY-NEXT: SingleImplName: ''
				; SUMMARY-NEXT: ResByArg:
				; SUMMARY-NEXT: typeid3:
				; SUMMARY-NEXT: TTRes:
				; SUMMARY-NEXT: Kind: Unsat
				; SUMMARY-NEXT: SizeM1BitWidth: 0
				; SUMMARY-NEXT: AlignLog2: 0
				; SUMMARY-NEXT: SizeM1: 0
				; SUMMARY-NEXT: BitMask: 0
				; SUMMARY-NEXT: InlineBits: 0
				; SUMMARY-NEXT: WPDRes:
				; SUMMARY-NEXT: 0:
				; SUMMARY-NEXT: Kind: JumpTable
				; SUMMARY-NEXT: SingleImplName: ''
				; SUMMARY-NEXT: ResByArg:

				target datalayout = "e-p:64:64"
				target triple = "x86_64-unknown-linux-gnu"

				@vt1_1 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf1_1 to i8*)], !type !0
				@vt1_2 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf1_2 to i8*)], !type !0

				declare i32 @vf1_1(i8* %this, i32 %arg)
				declare i32 @vf1_2(i8* %this, i32 %arg)

				@vt2_1 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_1 to i8*)], !type !1
				@vt2_2 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_2 to i8*)], !type !1
				@vt2_3 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_3 to i8*)], !type !1
				@vt2_4 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_4 to i8*)], !type !1
				@vt2_5 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_5 to i8*)], !type !1
				@vt2_6 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_6 to i8*)], !type !1
				@vt2_7 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_7 to i8*)], !type !1
				@vt2_8 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_8 to i8*)], !type !1
				@vt2_9 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_9 to i8*)], !type !1
				@vt2_10 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_10 to i8*)], !type !1
				@vt2_11 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf2_11 to i8*)], !type !1

				declare i32 @vf2_1(i8* %this, i32 %arg)
				declare i32 @vf2_2(i8* %this, i32 %arg)
				declare i32 @vf2_3(i8* %this, i32 %arg)
				declare i32 @vf2_4(i8* %this, i32 %arg)
				declare i32 @vf2_5(i8* %this, i32 %arg)
				declare i32 @vf2_6(i8* %this, i32 %arg)
				declare i32 @vf2_7(i8* %this, i32 %arg)
				declare i32 @vf2_8(i8* %this, i32 %arg)
				declare i32 @vf2_9(i8* %this, i32 %arg)
				declare i32 @vf2_10(i8* %this, i32 %arg)
				declare i32 @vf2_11(i8* %this, i32 %arg)

				@vt3_1 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf3_1 to i8*)], !type !2
				@vt3_2 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf3_2 to i8*)], !type !2

				declare i32 @vf3_1(i8* %this, i32 %arg)
				declare i32 @vf3_2(i8* %this, i32 %arg)

				@vt4_1 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf4_1 to i8*)], !type !3
				@vt4_2 = constant [1 x i8*] [i8* bitcast (i32 (i8*, i32)* @vf4_2 to i8*)], !type !3

				declare i32 @vf4_1(i8* %this, i32 %arg)
				declare i32 @vf4_2(i8* %this, i32 %arg)

				; CHECK: define i32 @fn1
				define i32 @fn1(i8* %obj) #0 {
				%vtableptr = bitcast i8* %obj to [1 x i8*]**
				%vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr
				%vtablei8 = bitcast [1 x i8*]* %vtable to i8*
				%p = call i1 @llvm.type.test(i8* %vtablei8, metadata !"typeid1")
				call void @llvm.assume(i1 %p)
				%fptrptr = getelementptr [1 x i8*], [1 x i8*]* %vtable, i32 0, i32 0
				%fptr = load i8*, i8** %fptrptr
				%fptr_casted = bitcast i8* %fptr to i32 (i8*, i32)*
				; RETP: {{.*}} = bitcast {{.*}} to i8*
				; RETP: [[VT1:%.*]] = bitcast {{.*}} to i8*
				; RETP: call i32 bitcast (void ()* @__typeid_typeid1_0_jumptable to i32 (i8*, i8*, i32)*)(i8* nest [[VT1]], i8* %obj, i32 1)
				%result = call i32 %fptr_casted(i8* %obj, i32 1)
				; NORETP: call i32 %
				ret i32 %result
				}

				; CHECK: define i32 @fn2
				define i32 @fn2(i8* %obj) #0 {
				%vtableptr = bitcast i8* %obj to [1 x i8*]**
				%vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr
				%vtablei8 = bitcast [1 x i8*]* %vtable to i8*
				%p = call i1 @llvm.type.test(i8* %vtablei8, metadata !"typeid2")
				call void @llvm.assume(i1 %p)
				%fptrptr = getelementptr [1 x i8*], [1 x i8*]* %vtable, i32 0, i32 0
				%fptr = load i8*, i8** %fptrptr
				%fptr_casted = bitcast i8* %fptr to i32 (i8*, i32)*
				; CHECK: call i32 %
				%result = call i32 %fptr_casted(i8* %obj, i32 1)
				ret i32 %result
				}

				; CHECK: define i32 @fn3
				define i32 @fn3(i8* %obj) #0 {
				%vtableptr = bitcast i8* %obj to [1 x i8*]**
				%vtable = load [1 x i8*]*, [1 x i8*]** %vtableptr
				%vtablei8 = bitcast [1 x i8*]* %vtable to i8*
				%p = call i1 @llvm.type.test(i8* %vtablei8, metadata !4)
				call void @llvm.assume(i1 %p)
				%fptrptr = getelementptr [1 x i8*], [1 x i8*]* %vtable, i32 0, i32 0
				%fptr = load i8*, i8** %fptrptr
				%fptr_casted = bitcast i8* %fptr to i32 (i8*, i32)*
				; RETP: call i32 bitcast (void ()* @jumptable to
				; NORETP: call i32 %
				%result = call i32 %fptr_casted(i8* %obj, i32 1)
				ret i32 %result
				}

				; CHECK: define internal void @jumptable()

				; CHECK: define hidden void @__typeid_typeid1_0_jumptable() [[A:#[0-9]+]]
				; CHECK-NEXT: call void (...) @llvm.icall.jumptable(i8* bitcast ([1 x i8*]* @vt1_1 to i8*), i32 (i8*, i32)* @vf1_1, i8* bitcast ([1 x i8*]* @vt1_2 to i8*), i32 (i8*, i32)* @vf1_2)

				declare i1 @llvm.type.test(i8*, metadata)
				declare void @llvm.assume(i1)

				!0 = !{i32 0, !"typeid1"}
				!1 = !{i32 0, !"typeid2"}
				!2 = !{i32 0, !"typeid3"}
				!3 = !{i32 0, !4}
				!4 = distinct !{}

				; CHECK: attributes [[A]] = { naked }

				attributes #0 = { "target-features"="+retpoline" }

Diff 135173
