This is an archive of the discontinued LLVM Phabricator instance.


Differential D19046

Introduce a "patchable-function" function attribute
ClosedPublic

Authored by sanjoy on Apr 12 2016, 7:38 PM.


Details

Reviewers

dberris
echristo
mehdi_amini
rnk

Commits

rGc0441c29df64: Introduce a "patchable-function" function attribute
rL266715: Introduce a "patchable-function" function attribute

Summary

The "patchable-function" attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
"prologue-short-redirect", but this can be expanded in the future.

Diff Detail

Event Timeline

sanjoy updated this revision to Diff 53513. Apr 12 2016, 7:38 PM

sanjoy retitled this revision to Introduce a "patchable-prologue" function attribute.

sanjoy updated this object.

sanjoy added reviewers: rnk, mehdi_amini, echristo.

sanjoy added a subscriber: llvm-commits.

Herald added a subscriber: mcrosier. Apr 12 2016, 7:38 PM

rnk added inline comments. Apr 13 2016, 11:07 AM

include/llvm/Target/TargetLowering.h
1805	This is about the prologue, so I would put this in TargetFrameLowering / X86FrameLowering, rather than growing the massive TargetLowering interface.
1807	This file is not consistent on this point, but this should be report_fatal_error, since we want to keep the message in release builds.
lib/Target/X86/X86ISelLowering.cpp
30642	I guess we are confident that modifying an instruction during its execution is not problematic.

sanjoy marked 2 inline comments as done. Apr 13 2016, 11:41 AM

sanjoy added inline comments.

lib/Target/X86/X86ISelLowering.cpp
30642	At this point I'm fairly sure that replacing an instruction with another instruction of the exact same size is okay in practice. What I'm less sure of is replacing an executing instruction with another one that is smaller (i.e. replace only the prefix of an instruction), which is what this patch does. Unfortunately, I can't think of a way to determine if the second assertion is correct or not except by running a lot of code compiled with `"patchable-prologue"="hotpatch-compact"` on machines with high core counts (the patch as is passes some basic sanity checks). If more thorough testing uncovers issues, then we'll deal with them as they come. Given what I just said, do you think it is a good idea to rename the attribute to `"experimental-hotpatch-compact"`?

Address @rnk's review

lgtm with the adjusted naming

include/llvm/Target/TargetFrameLowering.h
326 ↗	(On Diff #53601)	Why not "Kind" instead of "Flavor"? That's way more common across LLVM. Also, our enum naming convention would make this look like: `enum PatchablePrologueKind { PPF_HotpatchCompact, PPF_Unknown };` See http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
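
As a rough illustration of the enum spelling suggested above, the attribute's string value could be mapped to a kind as sketched below. This is a standalone sketch, not code from the patch: the helper `parsePatchablePrologueKind` is hypothetical, it uses `std::string` rather than LLVM's own types, and it assumes the attribute value is still `"hotpatch-compact"` as in this diff.

```cpp
#include <string>

// Enum spelled the way the review comment above suggests.
enum PatchablePrologueKind { PPF_HotpatchCompact, PPF_Unknown };

// Hypothetical helper (not part of the patch): map the attribute's string
// value to a kind. Anything unrecognized becomes PPF_Unknown so the caller
// can decide how to report it.
PatchablePrologueKind parsePatchablePrologueKind(const std::string &Value) {
  if (Value == "hotpatch-compact")
    return PPF_HotpatchCompact;
  return PPF_Unknown;
}
```

Returning an explicit "unknown" value keeps the decision about how to fail (assert, fatal error, or ignore) with the caller.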

This revision is now accepted and ready to land. Apr 13 2016, 1:10 PM

rnk added inline comments. Apr 13 2016, 1:15 PM

lib/Target/X86/X86ISelLowering.cpp
30642	Hit submit too soon... I actually think you'll be OK here with the 2 byte alignment that you already have. No icache fetch is going to be able to observe any tearing. If we discover problems, we can nop-pad before subs. I wouldn't add experimental here. All we need to guarantee is that there are two bytes to patch. Changing what we do for sub after the fact won't break any users.
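
To make the "two bytes, 2 byte aligned" argument concrete, here is a minimal sketch of the patching step as seen from the data side. It only models the write: the function entry is represented by a `std::atomic<std::uint16_t>` rather than real executable memory, and the names `encodeShortJmp` and `redirectEntry` are hypothetical. Whether instruction fetch observes such a store without tearing is exactly the question discussed in this thread, and the sketch does not settle it.

```cpp
#include <atomic>
#include <cstdint>

// Pack an x86 short jump (opcode 0xEB followed by a signed 8-bit
// displacement) into one 16-bit word, laid out for a little-endian store:
// the low byte is the opcode, the high byte is the displacement.
std::uint16_t encodeShortJmp(std::int8_t Rel8) {
  const std::uint16_t Opcode = 0xEB;
  const std::uint16_t Disp = static_cast<std::uint8_t>(Rel8);
  return static_cast<std::uint16_t>(Opcode | (Disp << 8));
}

// Model of the patch step: the first two bytes of the function are treated
// as one aligned 16-bit word and replaced with a single compare-and-swap.
// Returns false if the entry no longer holds the expected original bytes.
bool redirectEntry(std::atomic<std::uint16_t> &EntryWord,
                   std::uint16_t ExpectedOldBytes, std::int8_t Rel8) {
  return EntryWord.compare_exchange_strong(ExpectedOldBytes,
                                           encodeShortJmp(Rel8));
}
```

The compare-and-swap fails if another thread has already patched the entry, which a patching runtime would presumably want to detect.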

Rename "patchable-prologue" to "patchable-function" + what @rnk
suggested around enum names.

sanjoy added a reviewer: dberris. Apr 14 2016, 12:21 PM

echristo added inline comments. Apr 14 2016, 1:53 PM

docs/LangRef.rst
1408	Does this need to be a hard coded attribute? Why not something similar to the floating point ones while we're still working out things? Avoids needing to worry about bitcode reading/writing.
1415	Perhaps a different name for it? hotpatch-compact isn't particularly enlightening without the description. Is the "compact" because it only handles the small code model? It might be best to talk about the option in an architecture neutral way and then explain the particular implementation in a cpu specific section below for it.
lib/CodeGen/PatchableFunction.cpp
44–47 ↗	(On Diff #53768)	Can merge all of this.
lib/Target/X86/X86FrameLowering.cpp
2924 ↗	(On Diff #53768)	Interesting. Is the idea here to avoid too much code growth? I'm assuming the performance of a pile of nops isn't that bad. Also, are you just using the address of the symbol as the patchable address for the function?

sanjoy added inline comments. Apr 14 2016, 2:07 PM

docs/LangRef.rst
1408	> Does this need to be a hard coded attribute?
Are you objecting to specifically documenting this attribute in the language reference? I don't mind that at all, given that means less work for me. :)
> Avoids needing to worry about bitcode reading/writing.
If I understood you correctly, we don't have to worry about that here either, since this is a string attribute.
1415	"compact" as in "two bytes". I've tried not to mention any arch-specific details here (while avoiding making things vague). Can you be more specific about how I can make this description less arch specific?
lib/Target/X86/X86FrameLowering.cpp
2924 ↗	(On Diff #53768)	> Interesting. Is the idea here to avoid too much code growth? I'm assuming the performance of a pile of nops isn't that bad.
Yes, we want to avoid too much code growth -- we have to do this for every function. In an older scheme where we had a 5 byte nop in the function prologue unconditionally, we did see some performance impact on old amd64 chips.
> Also, are you just using the address of the symbol as the patchable address for the function?
Yes. To redirect control away from `foo`, we basically patch `&foo`.

PS. First review in LLVM, please be gentle? :)

docs/LangRef.rst
1415	My thought here is something that's recognisable. Consider things like:
- compact-redirect-prologue
- compact-rewrite-prologue
- prologue-short-redirect
- short-prologue-redirect
- short-prologue-rewrite
If you intend to use "hotpatch" as a namespace of sorts (if there will be more later), something like:
- hotpatch-short-prologue
- hotpatch-prologue-small
lib/Target/X86/X86FrameLowering.cpp
2928–2930 ↗	(On Diff #53768)	Have you considered inserting a pseudo instruction that gets translated instead when emitting the assembler?
lib/Target/X86/X86FrameLowering.h
207–209 ↗	(On Diff #53768)	Is this intended to only handle prologues? Consider making this a single entry point, naming it something like `makeFunctionPatchable(...)`. There may be other places where the patch-sleds could be inserted (before calls to functions, before entering loops, before returning, etc.) and it would be really great if this wasn't tied just to the prologue.

sanjoy added inline comments. Apr 14 2016, 11:00 PM

docs/LangRef.rst
1415	Thanks! I think I'll go with `"prologue-short-redirect"`.
lib/CodeGen/PatchableFunction.cpp
44–47 ↗	(On Diff #53768)	Did not quite understand what you meant here. :)
lib/Target/X86/X86FrameLowering.cpp
2928–2930 ↗	(On Diff #53768)	That's a good idea, let me give that a try.
lib/Target/X86/X86FrameLowering.h
207–209 ↗	(On Diff #53768)	That's a good point, will do.

I don't have anything to add to dberris's comments. One reply inline.

Changed to use a pseudo instruction PATCHABLE_OP, as per @dberris, the code looks a lot cleaner now!
Renamed "prologue-hotpatch-compact" to "prologue-short-redirect"

There is some cleanup we can do after this, that I'll do separately
once this lands:

- Simplify StackMapShadowTracker
- Split out EmitNops so that the assert using OnlyOneNop can instead live in its caller.

Herald added a subscriber: mehdi_amini. Apr 15 2016, 1:35 PM

Remove unnecessary callback

Guard asserts-only work under #ifndef NDEBUG

dberris added inline comments. Apr 18 2016, 6:29 PM

lib/Target/X86/X86MCInstLower.cpp
837–838 ↗	(On Diff #53949)	I'm not sure this assertion makes sense here. I would have thought this assert should have been done in the calling code, that it doesn't ask for a single nop in the first place?
948–949 ↗	(On Diff #53949)	So in this branch, MinSize != 2 or Opcode != X86::PUSH64r. Question: If you check instead whether the function where MI is included in had the correct type of patchable-function attribute, and see that the nops being added is less than 2, you can assert and say this is actually a bug in a higher implementation detail? i.e. this would be a bug in the insertion of this instruction. This way you wouldn't need to touch EmitNops.

I'm not very familiar with Differential but is there a way for you to update the summary to more accurately describe what the patch is doing now?

(Just about to update the description).

lib/Target/X86/X86MCInstLower.cpp
837–838 ↗	(On Diff #53949)	That's the first point under the "There is some cleanup we can do after this" note I sent in with this update. :) Basically, I'd rather not make this already large patch any larger if it can be helped. NFC cleanups like these are easy to do once the hard stuff has been reviewed and checked in, IMO.
948–949 ↗	(On Diff #53949)	> Question: If you check instead whether the function where MI is included in had the correct type of patchable-function attribute, and see that the nops being added is less than 2, you can assert and say this is actually a bug in a higher implementation detail? i.e. this would be a bug in the insertion of this instruction.
Do you mean something like: `assert(MinSize < 2 && !MF.getPatchableFnType() == "prologue-short-redirect");` I'd say that is an incorrect layering. It feels cleaner for MC to not have to know why the `PATCHABLE_OP` of a certain variety is present. It should just understand its end of the contract of how it needs to lower `PATCHABLE_OP`.

sanjoy retitled this revision from Introduce a "patchable-prologue" function attribute to Introduce a "patchable-function" function attribute. Apr 18 2016, 6:44 PM

sanjoy updated this object.

dberris added inline comments. Apr 18 2016, 6:47 PM

lib/Target/X86/X86MCInstLower.cpp
948–949 ↗	(On Diff #53949)	> I'd say that is an incorrect layering. It feels cleaner for MC to not have to know why the PATCHABLE_OP of a certain variety is present. It should just understand its end of the contract of how it needs to lower PATCHABLE_OP.
Consider the true branch of this if-statement though -- it already feels like it's bleeding details in based on what a short 'prologue-short-redirect' patchable function attribute is expecting. I'd say we're already breaking some layering guidelines here. :D

sanjoy added inline comments. Apr 18 2016, 7:20 PM

lib/Target/X86/X86MCInstLower.cpp
948–949 ↗	(On Diff #53949)	I don't think those two are the same things. One is saying "in this specific case I know can do better" (and you can extend the logic later by not just handling push'es but also returns, for instance), and the other is saying "this is the only case I support". I'm not opposed to having `PATCHABLE_OP` specifically only work with `"prologue-short-redirect"`, but then I'd rather have it not have a `minsize` operand at all. Do you think that will be cleaner? (I could go either way on this -- no strong preferences). Actually, now that I think about it, what I dislike about the current scheme (i.e. this patch) is that only the `minsize` == `2` case is tested, so from just a testing / code coverage POV, removing `minsize` sounds slightly better.

dberris added inline comments. Apr 18 2016, 7:28 PM

lib/Target/X86/X86MCInstLower.cpp
948–949 ↗	(On Diff #53949)	I think minsize still makes sense, but I'm thinking that there are really two inputs here:
- That there are instructions to be placed here with a given minimum size. Note that this could be not just a single instruction, or could be treated just as a placeholder.
- That the semantics of what kinds of instructions will be emitted would be based on the type of function patching that is supported. In this case you're implementing 'prologue-short-redirect' which expects certain kinds of operations to be valid where a `PATCHABLE_OP` appears.
Which is why I think looking at the attribute that defines the semantics of the `PATCHABLE_OP` still makes sense at this level. It allows us to make decisions of what's valid or not valid based on the attribute provided on the function. And I think this is the correct layer to make that decision, because this is where we're actually generating the instructions. Does that reasoning make sense?

sanjoy added inline comments. Apr 18 2016, 8:36 PM

lib/Target/X86/X86MCInstLower.cpp
948–949 ↗	(On Diff #53949)	> I think minsize still makes sense, but I'm thinking that there's two inputs here really:
> - That there are instructions to be placed here with a given minimum size. Note that this could be not just a single instruction, or could be treated just as a placeholder.
> - That the semantics of what kinds of instructions will be emitted would be based on the type of function patching is supported. In this case you're implementing 'prologue-short-redirect' which expects certain kinds of operations to be valid where a PATCHABLE_OP appears.
> Which is why I think looking at the attribute that defines the semantics of the PATCHABLE_OP still makes sense at this level. It allows us to make decisions of what's valid or not valid based on the attribute provided on the function. And I think this is the correct layer to make that decision, because this is where we're actually generating the instructions. Does that reasoning make sense?
I've been thinking about this as: an instance of `PATCHABLE_OP` is (should be) self contained with regards to what needs to be emitted when MC encounters one. Right now, as you said, emitting a `PATCHABLE_OP` involves ensuring two things:
1. There is one instruction of at least `MinSize` bytes at the place in the instruction stream the `PATCHABLE_OP` appeared at.
2. The set of instructions emitted as part of lowering the `PATCHABLE_OP` pseudo instruction is "equivalent" (i.e. has the same effect on the CPU state, modulo the delta by which the instruction pointer is advanced) to the instruction bundled with `PATCHABLE_OP`. (I could be more explicit about this in the comment over PATCHABLE_OP -- let me know if you think that will help)
This is all that a `PATCHABLE_OP` implies. Any optimizations that happen on top of (1) and (2) are strictly optional, and cannot tread beyond what is allowed by (1) and (2).
If I understand you correctly, you're saying `PATCHABLE_OP` is not self contained, and that what MC needs to do when it sees a `PATCHABLE_OP` depends on what attributes the containing function has.
I still don't think this is correct (assuming I haven't misrepresented your position): unless we gain something by coupling function attributes with `PATCHABLE_OP`, I'd rather have these de-coupled (i.e. have `PATCHABLE_OP` be the mechanism by which the `"patchable-function"` policy is implemented). For instance, a potential use case for `PATCHABLE_OP` is to make some call sites patchable, and one way to do that is via call site attributes. Without making `PATCHABLE_OP` self sufficient we'd have to resort to walking back to the relevant (IR level) call instruction (which may be difficult to locate), in addition to checking the function attributes. What if after this we want to add a third source for `PATCHABLE_OP` (an intrinsic, say)? The complexity of lowering `PATCHABLE_OP`, unless it is self sufficient, will scale linearly with the number of ways we can generate one.
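
The two-point contract described above can be modeled in a few lines. This is only a sketch under simplifying assumptions, not the lowering code from the patch: `Inst` and `lowerPatchableOp` are hypothetical names, an instruction is reduced to its encoded size, and the optional optimization mentioned in the thread (re-encoding the wrapped instruction in a longer form instead of adding a nop) is left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine instruction: only the encoded size and
// whether it is a nop matter for this model.
struct Inst {
  unsigned SizeInBytes;
  bool IsNop;
};

// Model of the contract: (1) the first emitted instruction is at least
// MinSize bytes, and (2) the emitted sequence has the same effect as the
// wrapped instruction. A nop changes no state, so prepending one keeps (2).
std::vector<Inst> lowerPatchableOp(Inst Wrapped, unsigned MinSize) {
  assert(MinSize > 0 && "a zero-byte patch point is not meaningful");
  if (Wrapped.SizeInBytes >= MinSize)
    return {Wrapped};                     // already large enough, emit as is
  return {Inst{MinSize, true}, Wrapped};  // one MinSize-byte nop, then the op
}
```

Note that the decision depends only on the operands of the pseudo instruction, which is the self-contained property argued for above.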

dberris accepted this revision. Apr 18 2016, 9:52 PM

dberris edited edge metadata.

dberris added inline comments.

lib/Target/X86/X86MCInstLower.cpp
948–949 ↗	(On Diff #53949)	I think I originally misunderstood the purpose of `PATCHABLE_OP` -- I had been thinking it was a standalone pseudo-instruction which would be reduced to the op as a parameter if patching was not enabled on the function (or due to some other consideration, like the size of the instruction that proceeds it). In my head (and in my current implementation for something similar echristo and I are working on) we just have a pure pseudo-instruction that expands in a context-sensitive manner. As implemented, I think it's fine for the purpose of handling this specific attribute. I suppose later implementations of a different attribute can dispatch to the correct behaviour (and re-use/extend `PATCHABLE_OP`) appropriately.

Closed by commit rL266715: Introduce a "patchable-function" function attribute (authored by sanjoy). Apr 18 2016, 10:30 PM

This revision was automatically updated to reflect the committed changes.

aaron.ballman mentioned this in D19909: [Attr] Add support for the `ms_hook_prologue` attribute. May 4 2016, 12:06 PM

MaskRay mentioned this in D72215: [AArch64] Add function attribute "patchable-function-entry" to add NOPs at function entry. Jan 7 2020, 4:33 PM

Revision Contents

Path                                      Size
docs/LangRef.rst                          24 lines
include/llvm/CodeGen/Passes.h             3 lines
include/llvm/InitializePasses.h           1 line
include/llvm/Target/TargetLowering.h      13 lines
lib/CodeGen/CMakeLists.txt                1 line
lib/CodeGen/CodeGen.cpp                   1 line
lib/CodeGen/Passes.cpp                    2 lines
lib/CodeGen/PatchablePrologues.cpp        56 lines
lib/Target/X86/X86ISelLowering.h          6 lines
lib/Target/X86/X86ISelLowering.cpp        57 lines
test/CodeGen/X86/patchable-prologue.ll    43 lines

Diff 53513

docs/LangRef.rst

@@ ``optnone``
     the function as well, so the function is never inlined into any caller.
     Only functions with the ``alwaysinline`` attribute are valid
     candidates for inlining into the body of this function.
 ``optsize``
     This attribute suggests that optimization passes and code generator
     passes make choices that keep the code size of this function low,
     and otherwise do optimizations specifically to reduce code size as
     long as they do not significantly impact runtime performance.
+``"patchable-prologue"``
+    This attribute tells the code generator that the prologue
+    generated for this function needs to follow a specific format that
+    makes it possible for a runtime function to patch over it later.
+    The exact effect of this attribute depends on the string value of
+    this attribute, for which there currently is one legal value:
+
+    * ``"hotpatch-compact"`` - This style of patchable prologues is
+      intended to support patching a function prologue to redirect
+      control away from the function in a thread safe manner. It
+      guarantees that the first instruction of the function will be
+      large enough to accommodate a short jump instruction, and will
+      be sufficiently aligned to allow being fully changed via an
+      atomic compare-and-swap instruction. While the first
+      requirement can be satisfied by inserting large enough NOP,
+      LLVM can and will try to re-purpose an existing instruction
+      (i.e. one that would have to be emitted anyway) as the
+      patchable instruction larger than a short jump.
+
+    ``"hotpatch-compact"`` is currently only supported on x86-64.
+
+    This attribute by itself does not imply restrictions on
+    inter-procedural optimizations. All of the semantic effects the
+    patching may have to be separately conveyed via the linkage type.
 ``readnone``
     On a function, this attribute indicates that the function computes its
     result (or decides to unwind an exception) based strictly on its arguments,
     without dereferencing any pointer arguments or otherwise accessing
     any mutable state (e.g. memory, control registers, etc) visible to
     caller functions. It does not write through any pointer arguments
     (including ``byval`` arguments) and never changes any state visible
     to callers. This means that it cannot unwind exceptions by calling

include/llvm/CodeGen/Passes.h

@@ /// MachineDominanaceFrontier - This pass is a machine dominators analysis pass.
   extern char &OptimizePHIsID;

   /// StackSlotColoring - This pass performs stack slot coloring.
   extern char &StackSlotColoringID;

   /// \brief This pass lays out funclets contiguously.
   extern char &FuncletLayoutID;

+  /// \brief This pass implements the "patchable-prologue" attribute.
+  extern char &PatchableProloguesID;
+
   /// createStackProtectorPass - This pass adds stack protectors to functions.
   ///
   FunctionPass *createStackProtectorPass(const TargetMachine *TM);

   /// createMachineVerifierPass - This pass verifies cenerated machine code
   /// instructions for correctness.
   ///
   FunctionPass *createMachineVerifierPass(const std::string& Banner);

include/llvm/InitializePasses.h

 void initializeLoopDistributePass(PassRegistry&);
 void initializeSjLjEHPreparePass(PassRegistry&);
 void initializeDemandedBitsPass(PassRegistry&);
 void initializeFuncletLayoutPass(PassRegistry &);
 void initializeLoopLoadEliminationPass(PassRegistry&);
 void initializeFunctionImportPassPass(PassRegistry &);
 void initializeLoopVersioningPassPass(PassRegistry &);
 void initializeWholeProgramDevirtPass(PassRegistry &);
+void initializePatchableProloguesPass(PassRegistry &);
 }

 #endif

include/llvm/Target/TargetLowering.h

@@ void setLibcallCallingConv(RTLIB::Libcall Call, CallingConv::ID CC) {
     LibcallCallingConvs[Call] = CC;
   }

   /// Get the CallingConv that should be used for the specified libcall.
   CallingConv::ID getLibcallCallingConv(RTLIB::Libcall Call) const {
     return LibcallCallingConvs[Call];
   }

+  //===--------------------------------------------------------------------===//
+  // LLVM support for hotpatching
+  //
+
+  enum PatchablePrologueFlavor { PPF_HOTPATCH_COMPACT, PPF_UNKNOWN };
+
+  /// Implement a specific variant of the "patchable-prologue" attribute.
+  virtual void
+  makeFunctionProloguePatchable(MachineFunction &MF,
+                                PatchablePrologueFlavor PPF) const {
+    llvm_unreachable("Not implemented for this target!");
+  }
+
 private:
   const TargetMachine &TM;

   /// Tells the code generator not to expand operations into sequences that use
   /// the select operations if possible.
   bool SelectIsExpensive;

   /// Tells the code generator that the target has multiple (allocatable)

lib/CodeGen/CMakeLists.txt

@@ add_llvm_library(LLVMCodeGen
   MachinePostDominators.cpp
   MachineRegionInfo.cpp
   MachineRegisterInfo.cpp
   MachineScheduler.cpp
   MachineSink.cpp
   MachineSSAUpdater.cpp
   MachineTraceMetrics.cpp
   MachineVerifier.cpp
+  PatchablePrologues.cpp
   MIRPrinter.cpp
   MIRPrintingPass.cpp
   OptimizePHIs.cpp
   ParallelCG.cpp
   Passes.cpp
   PeepholeOptimizer.cpp
   PHIElimination.cpp
   PHIEliminationUtils.cpp

lib/CodeGen/CodeGen.cpp

@@ void llvm::initializeCodeGen(PassRegistry &Registry) {
   initializeMachineFunctionPrinterPassPass(Registry);
   initializeMachineLICMPass(Registry);
   initializeMachineLoopInfoPass(Registry);
   initializeMachineModuleInfoPass(Registry);
   initializeMachinePostDominatorTreePass(Registry);
   initializeMachineSchedulerPass(Registry);
   initializeMachineSinkingPass(Registry);
   initializeMachineVerifierPassPass(Registry);
+  initializePatchableProloguesPass(Registry);
   initializeOptimizePHIsPass(Registry);
   initializePEIPass(Registry);
   initializePHIEliminationPass(Registry);
   initializePeepholeOptimizerPass(Registry);
   initializePostMachineSchedulerPass(Registry);
   initializePostRASchedulerPass(Registry);
   initializeProcessImplicitDefsPass(Registry);
   initializeRegisterCoalescerPass(Registry);

lib/CodeGen/Passes.cpp

@@ void TargetPassConfig::addMachinePasses() {

   addPreEmitPass();

   addPass(&FuncletLayoutID, false);

   addPass(&StackMapLivenessID, false);
   addPass(&LiveDebugValuesID, false);

+  addPass(&PatchableProloguesID, false);
+
   AddingMachinePasses = false;
 }

 /// Add passes that optimize machine instructions in SSA form.
 void TargetPassConfig::addMachineSSAOptimization() {
   // Pre-ra tail duplication.
   addPass(&EarlyTailDuplicateID);

lib/CodeGen/PatchablePrologues.cpp

This file was added.

				//===-- PatchablePrologues.cpp - Patchable prologues for LLVM -------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file edits function prologues in place to support the
				// "patchable-prologue" attribute.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/CodeGen/Passes.h"
				#include "llvm/CodeGen/Analysis.h"
				#include "llvm/CodeGen/MachineFunction.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				#include "llvm/Target/TargetLowering.h"
				#include "llvm/Target/TargetSubtargetInfo.h"

				using namespace llvm;

				namespace {
				struct PatchablePrologues : public MachineFunctionPass {
				static char ID; // Pass identification, replacement for typeid
				PatchablePrologues() : MachineFunctionPass(ID) {
				initializePatchableProloguesPass(*PassRegistry::getPassRegistry());
				}

				bool runOnMachineFunction(MachineFunction &F) override;
				MachineFunctionProperties getRequiredProperties() const override {
				return MachineFunctionProperties().set(
				MachineFunctionProperties::Property::AllVRegsAllocated);
				}
				};
				}

				bool PatchablePrologues::runOnMachineFunction(MachineFunction &MF) {
				if (!MF.getFunction()->hasFnAttribute("patchable-prologue"))
				return false;

				Attribute PatchAttr = MF.getFunction()->getFnAttribute("patchable-prologue");
				StringRef PatchType = PatchAttr.getValueAsString();

				assert(PatchType == "hotpatch-compact" && "Only possibility today!");
				(void)PatchType;

				auto *TLI = MF.getSubtarget().getTargetLowering();
				TLI->makeFunctionProloguePatchable(MF, TargetLowering::PPF_HOTPATCH_COMPACT);
				return true;
				}

				char PatchablePrologues::ID = 0;
				char &llvm::PatchableProloguesID = PatchablePrologues::ID;
				INITIALIZE_PASS(PatchablePrologues, "patchable-prologues", "", false, false)

lib/Target/X86/X86ISelLowering.h

SDValue getRsqrtEstimate(SDValue Operand, DAGCombinerInfo &DCI,
bool &UseOneConstNR) const override;		bool &UseOneConstNR) const override;

/// Use rcp* to speed up fdiv calculations.		/// Use rcp* to speed up fdiv calculations.
SDValue getRecipEstimate(SDValue Operand, DAGCombinerInfo &DCI,		SDValue getRecipEstimate(SDValue Operand, DAGCombinerInfo &DCI,
unsigned &RefinementSteps) const override;		unsigned &RefinementSteps) const override;

/// Reassociate floating point divisions into multiply by reciprocal.		/// Reassociate floating point divisions into multiply by reciprocal.
unsigned combineRepeatedFPDivisors() const override;		unsigned combineRepeatedFPDivisors() const override;

		/// Hook to make function prologues patchable, as dictated by the
		/// "patchable-prologue" attribute.
		void makeFunctionProloguePatchable(
		MachineFunction &MF,
		TargetLowering::PatchablePrologueFlavor PPF) const override;
};		};

namespace X86 {		namespace X86 {
FastISel *createFastISel(FunctionLoweringInfo &funcInfo,		FastISel *createFastISel(FunctionLoweringInfo &funcInfo,
const TargetLibraryInfo *libInfo);		const TargetLibraryInfo *libInfo);
} // end namespace X86		} // end namespace X86
} // end namespace llvm		} // end namespace llvm

#endif // LLVM_LIB_TARGET_X86_X86ISELLOWERING_H		#endif // LLVM_LIB_TARGET_X86_X86ISELLOWERING_H

lib/Target/X86/X86ISelLowering.cpp


for (const MCPhysReg *I = IStart; *I; ++I) {

// Insert the copy-back instructions right before the terminator.		// Insert the copy-back instructions right before the terminator.
for (auto *Exit : Exits)		for (auto *Exit : Exits)
BuildMI(*Exit, Exit->getFirstTerminator(), DebugLoc(),		BuildMI(*Exit, Exit->getFirstTerminator(), DebugLoc(),
TII->get(TargetOpcode::COPY), *I)		TII->get(TargetOpcode::COPY), *I)
.addReg(NewVR);		.addReg(NewVR);
}		}
}		}

		void X86TargetLowering::makeFunctionProloguePatchable(
		MachineFunction &MF, TargetLowering::PatchablePrologueFlavor PPF) const {
		assert(PPF == TargetLowering::PPF_HOTPATCH_COMPACT &&
		"Only one possibility!");
		assert(Subtarget.is64Bit() && "Unsupported!");

		auto &FirstMI = *MF.begin()->begin();
		const TargetInstrInfo *TII = Subtarget.getInstrInfo();

		// We need to ensure that FirstMI is at least two bytes long (a short jump can
		// be encoded in two bytes) and does not span a 16 byte boundary (to be
		// patchable by a CAS).

		// Programmatically getting the size of an x86 instruction at the MachineInstr
		// level is difficult. So we instead hard-code a list of instructions that
		// are commonly expected to appear as the first instruction in MF.
		switch (FirstMI.getOpcode()) {
		case X86::PUSH64r:
		switch (FirstMI.getOperand(0).getReg()) {
		case X86::RAX:
		case X86::RBX:
		case X86::RCX:
		case X86::RDX:
		case X86::RSP:
		case X86::RBP:
		case X86::RSI:
		case X86::RDI:
		// These push instructions are one byte long. "Fatten" them to take up
		// two bytes.
		FirstMI.setDesc(TII->get(X86::PUSH64rmr));
		break;

		default:
		// All other push instructions are at least two bytes long, nothing more
		// needs to be done here.
		break;
		}
		break;

		case X86::SUB64ri32:
		// We know this instruction takes more than two bytes.
		rnk: I guess we are confident that modifying an instruction during its execution is not problematic.
		sanjoy: At this point I'm fairly sure that replacing an instruction with another instruction of the exact same size is okay in practice. What I'm less sure of is replacing an executing instruction with another one that is smaller (i.e. replace only the prefix of an instruction), which is what this patch does. Unfortunately, I can't think of a way to determine if the second assertion is correct or not except by running a lot of code compiled with `"patchable-prologue"="hotpatch-compact"` on machines with high core counts (the patch as is passes some basic sanity checks). If more thorough testing uncovers issues, then we'll deal with them as they come. Given what I just said, do you think it is a good idea to rename the attribute to `"experimental-hotpatch-compact"`?
		rnk: Hit submit too soon... I actually think you'll be OK here with the 2 byte alignment that you already have. No icache fetch is going to be able to observe any tearing. If we discover problems, we can nop-pad before subs. I wouldn't add experimental here. All we need to guarantee is that there are two bytes to patch. Changing what we do for sub after the fact won't break any users.
		break;

		default:
		// We didn't recognize the instruction, so add a two byte nop just to be
		// safe.
		BuildMI(*MF.begin(), MF.begin()->begin(), DebugLoc(),
		TII->get(X86::XCHG16ar)).addReg(X86::AX);
		break;
		}

		// We cannot let the first instruction cross a 16 byte boundary. Given that
		// the largest x86 instruction is 15 bytes long, it is sufficient to align the
		// start of the function to 16 bytes.
		MF.ensureAlignment(4);
		}
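
The comments in makeFunctionProloguePatchable rely on two facts: an instruction of at most 15 bytes that starts at a 16-byte-aligned address cannot span a 16-byte boundary, and a short jump fits in the two bytes the hook reserves. The following is a minimal standalone sketch of both, not code from this patch. The names `crosses16` and `encodeShortJump` are hypothetical, and the sketch only computes the bytes a runtime would write; it does not perform the atomic store itself.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// True if the byte range [Addr, Addr + Len) spans a 16-byte boundary.
constexpr bool crosses16(std::uint64_t Addr, unsigned Len) {
  return Len != 0 && (Addr / 16) != ((Addr + Len - 1) / 16);
}

// Hypothetical helper: encodes `jmp rel8` (opcode 0xEB) placed at Entry and
// jumping to Target. The 8-bit displacement is relative to the end of the
// two-byte jump. Returns std::nullopt if Target is out of short-jump range.
std::optional<std::array<std::uint8_t, 2>>
encodeShortJump(std::uint64_t Entry, std::uint64_t Target) {
  const std::int64_t Disp = static_cast<std::int64_t>(Target) -
                            static_cast<std::int64_t>(Entry + 2);
  if (Disp < -128 || Disp > 127)
    return std::nullopt;
  return std::array<std::uint8_t, 2>{0xEB, static_cast<std::uint8_t>(Disp)};
}

// With the function entry aligned to 16 bytes, even a maximum-length
// (15-byte) first instruction stays within one 16-byte block.
static_assert(!crosses16(0x1000, 15), "aligned entry cannot cross");
// Without that alignment, a 3-byte instruction can straddle a boundary.
static_assert(crosses16(0x100E, 3), "unaligned entry may cross");
```

A runtime that wants to redirect a function would presumably publish the two returned bytes over the entry with a single atomic write; how that write is performed is outside what this diff shows.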

test/CodeGen/X86/patchable-prologue.ll

This file was added.

				; RUN: llc -filetype=obj -o - -mtriple=x86_64-apple-macosx < %s | llvm-objdump -triple x86_64-apple-macosx -disassemble -
				; RUN: llc -mtriple=x86_64-apple-macosx < %s | FileCheck %s --check-prefix=CHECK-ALIGN

				declare void @callee(i64*)

				define void @f0() "patchable-prologue"="hotpatch-compact" {
				; CHECK-LABEL: _f0:
				; CHECK-NEXT: 66 90 nop

				; CHECK-ALIGN: .p2align 4, 0x90
				; CHECK-ALIGN: _f0:

				ret void
				}

				define void @f1() "patchable-prologue"="hotpatch-compact" "no-frame-pointer-elim"="true" {
				; CHECK-LABEL: _f1
				; CHECK-NEXT: ff f5 pushq %rbp

				; CHECK-ALIGN: .p2align 4, 0x90
				; CHECK-ALIGN: _f1:
				ret void
				}

				define void @f2() "patchable-prologue"="hotpatch-compact" {
				; CHECK-LABEL: _f2
				; CHECK-NEXT: 48 81 ec a8 00 00 00 subq $168, %rsp

				; CHECK-ALIGN: .p2align 4, 0x90
				; CHECK-ALIGN: _f2:
				%ptr = alloca i64, i32 20
				call void @callee(i64* %ptr)
				ret void
				}

				define void @f3() "patchable-prologue"="hotpatch-compact" optsize {
				; CHECK-LABEL: _f3
				; CHECK-NEXT: 66 90 nop

				; CHECK-ALIGN: .p2align 4, 0x90
				; CHECK-ALIGN: _f3:
				ret void
				}
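
To read the byte patterns the test expects, it helps to split the second byte of `ff f5` and the third byte of `48 81 ec ...` into ModRM fields. The sketch below is illustrative only and is not part of the patch; `ModRM` and `decodeModRM` are hypothetical names. It shows that `ff f5` is the two-byte `FF /6` form of `pushq %rbp` (the one-byte form would be `55`), and that `81 /5` with a `48` REX.W prefix is `subq $imm32, %rsp`.

```cpp
#include <cstdint>

// The three fields of an x86 ModRM byte.
struct ModRM {
  unsigned Mod; // bits 7..6: addressing mode (3 = register direct)
  unsigned Reg; // bits 5..3: register number or opcode extension ("/digit")
  unsigned RM;  // bits 2..0: register or memory operand
};

constexpr ModRM decodeModRM(std::uint8_t Byte) {
  return ModRM{static_cast<unsigned>(Byte >> 6),
               static_cast<unsigned>((Byte >> 3) & 7),
               static_cast<unsigned>(Byte & 7)};
}

// ff f5: opcode FF with /6 is PUSH r/m64; rm = 5 in register-direct mode
// names rbp, so this is the "fattened" two-byte push.
static_assert(decodeModRM(0xF5).Mod == 3, "register direct");
static_assert(decodeModRM(0xF5).Reg == 6, "/6 selects PUSH");
static_assert(decodeModRM(0xF5).RM == 5, "rm 5 is rbp");

// 48 81 ec: REX.W prefix, opcode 81 with /5 is SUB r/m64, imm32; rm = 4
// in register-direct mode names rsp.
static_assert(decodeModRM(0xEC).Reg == 5, "/5 selects SUB");
static_assert(decodeModRM(0xEC).RM == 4, "rm 4 is rsp");
```

The remaining pattern, `66 90`, is the operand-size prefix followed by the one-byte `nop`, which is the two-byte nop the default case inserts.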
