The "patchable-function" attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
"prologue-short-redirect", but this can be expanded in the future.
Details
Diff Detail
Event Timeline
include/llvm/Target/TargetLowering.h | ||
---|---|---|
1805 | This is about the prologue, so I would put this in TargetFrameLowering / X86FrameLowering, rather than growing the massive TargetLowering interface. | |
1807 | This file is not consistent on this point, but this should be report_fatal_error, since we want to keep the message in release builds. | |
lib/Target/X86/X86ISelLowering.cpp | ||
30642 | I guess we are confident that modifying an instruction during its execution is not problematic. |
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
30642 | At this point I'm fairly sure that replacing an instruction with another instruction of the exact same size is okay in practice. What I'm less sure of is replacing an executing instruction with another one that is smaller (i.e. replace only the prefix of an instruction), which is what this patch does. Unfortunately, I can't think of a way to determine if the second assertion is correct or not except by running a lot of code compiled with "patchable-prologue"="hotpatch-compact" on machines with high core counts (the patch as is passes some basic sanity checks). If more thorough testing uncovers issues, then we'll deal with them as they come. Given what I just said, do you think it is a good idea to rename the attribute to "experimental-hotpatch-compact"? |
lgtm with the adjusted naming
include/llvm/Target/TargetFrameLowering.h | ||
---|---|---|
326 ↗ | (On Diff #53601) | Why not "Kind" instead of "Flavor"? That's way more common across LLVM. Also, our enum naming convention would make this look like: enum PatchablePrologueKind { PPF_HotpatchCompact, PPF_Unknown }; http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly |
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
30642 | Hit submit too soon... I actually think you'll be OK here with the 2 byte alignment that you already have. No icache fetch is going to be able to observe any tearing. If we discover problems, we can nop-pad before subs. I wouldn't add experimental here. All we need to guarantee is that there are two bytes to patch. Changing what we do for sub after the fact won't break any users. |
Rename "patchable-prologue" to ""patchable-function" + what @rnk
suggested around enum names.
docs/LangRef.rst | ||
---|---|---|
1408 | Does this need to be a hard coded attribute? Why not something similar to the floating point ones while we're still working out things? Avoids needing to worry about bitcode reading/writing. | |
1415 | Perhaps a different name for it? hotpatch-compact isn't particularly enlightening without the description. Is the "compact" because it only handles the small code model? It might be best to talk about the option in an architecture neutral way and then explain the particular implementation in a cpu specific section below for it. | |
lib/CodeGen/PatchableFunction.cpp | ||
44–47 ↗ | (On Diff #53768) | Can merge all of this. |
lib/Target/X86/X86FrameLowering.cpp | ||
2924 ↗ | (On Diff #53768) | Interesting. Is the idea here to avoid too much code growth? I'm assuming the performance of a pile of nops isn't that bad. Also, are you just using the address of the symbol as the patchable address for the function? |
docs/LangRef.rst | ||
---|---|---|
1408 |
Are you objecting to specifically documenting this attribute in the language reference? I don't mind that at all, given that means less work for me. :)
If I understood you correctly, we don't have to worry about that here either, since this is a string attribute. | |
1415 | "compact" as in "two bytes". I've tried to not mention any arch-specific details here (while avoid making things vague). Can you be more specific about how I can make this description less arch specific? | |
lib/Target/X86/X86FrameLowering.cpp | ||
2924 ↗ | (On Diff #53768) |
Yes, we want to avoid too much code growth -- we have to do this for every function. In an older scheme where we had a 5 byte nop in the function prologue unconditionally, we did see some performance impact on old amd64 chips.
Yes. To redirect control away from foo, we basically patch &foo. |
PS. First review in LLVM, please be gentle? :)
docs/LangRef.rst | ||
---|---|---|
1415 | My thought here is something that's recognisable. Consider things like:
If you intend to use "hotpatch" as a namespace of sorts (if there will be more later), something like:
| |
lib/Target/X86/X86FrameLowering.cpp | ||
2928–2930 ↗ | (On Diff #53768) | Have you considered inserting a pseudo instruction that gets translated instead when emitting the assembler? |
lib/Target/X86/X86FrameLowering.h | ||
207–209 ↗ | (On Diff #53768) | Is this intended to only handle prologues? Consider making this a single entry point, naming it something like makeFunctionPatchable(...). There may be other places where the patch-sleds could be inserted (before calls to functions, before entering loops, before returning, etc.) and it would be really great if this wasn't tied just to the prologue. |
docs/LangRef.rst | ||
---|---|---|
1415 | Thanks! I think I'll go with "prologue-short-redirect". | |
lib/CodeGen/PatchableFunction.cpp | ||
44–47 ↗ | (On Diff #53768) | Did not quite understand what you meant here. :) |
lib/Target/X86/X86FrameLowering.cpp | ||
2928–2930 ↗ | (On Diff #53768) | That's a good idea, let me give that a try. |
lib/Target/X86/X86FrameLowering.h | ||
207–209 ↗ | (On Diff #53768) | That's a good point, will do. |
- Changed to use a pseudo instruction PATCHABLE_OP, as per @dberris, the code looks a lot cleaner now!
- Renamed "prologue-hotpatch-compact" to "prologue-short-redirect"
There is some cleanup we can do after this, that I'll do separately
once this lands:
- Simplify StackMapShadowTracker
- Split out EmitNops so that the assert using OnlyOneNop can instead live in its caller.
lib/Target/X86/X86MCInstLower.cpp | ||
---|---|---|
837–838 ↗ | (On Diff #53949) | I'm not sure this assertion makes sense here. I would have thought this assert should have been done in the calling code, that it doesn't ask for a single nop in the first place? |
948–949 ↗ | (On Diff #53949) | So in this branch, MinSize != 2 or Opcode != X86::PUSH64r. Question: If you check instead whether the function where MI is included in had the correct type of patchable-function attribute, and see that the nops being added is less than 2, you can assert and say this is actually a bug in a higher implementation detail? i.e. this would be a bug in the insertion of this instruction. This way you wouldn't need to touch EmitNops. |
I'm not very familiar with Differential but is there a way for you to update the summary to more accurately describe what the patch is doing now?
(Just about to update the description).
lib/Target/X86/X86MCInstLower.cpp | ||
---|---|---|
837–838 ↗ | (On Diff #53949) | That's the first point under the "There is some cleanup we can do after this" note I sent in with this update. :) Basically, I'd rather not make this already large patch any larger if it can be helped. NFC cleanups like these are easy to do once the hard stuff has been reviewed and checked in, IMO. |
948–949 ↗ | (On Diff #53949) |
Do you mean something like: assert(MinSize < 2 && !MF.getPatchableFnType() == "prologue-short-redirect"); I'd say that is an incorrect layering. It feels cleaner for MC to not have to know why the PATCHABLE_OP of a certain variety is present. It should just understand its end of the contract of how it needs to lower PATCHABLE_OP. |
lib/Target/X86/X86MCInstLower.cpp | ||
---|---|---|
948–949 ↗ | (On Diff #53949) |
Consider the true branch of this if-statement though -- it already feels like it's already bleeding details in based on what a short 'prologue-short-redirect' patchable function attribute is already expecting. I'd say we're already breaking some layering guidelines here. :D |
lib/Target/X86/X86MCInstLower.cpp | ||
---|---|---|
948–949 ↗ | (On Diff #53949) | I don't think those two are the same things. One is saying "in this specific case I know can do better" (and you can extend the logic later by not just handling push'es but also returns, for instance), and the other is saying "this is the only case I support". I'm not opposed to having PATCHABLE_OP specifically only work with "prologue-short-redirect", but then I'd rather have it not have a minsize operand at all. Do you think that will be cleaner? (I could go either way on this -- no strong preferences). Actually, now that I think about it, what I dislike about the current scheme (i.e. this patch) is that only the minsize == 2 case is tested, so from just a testing / code coverage POV, removing minsize sounds slightly better. |
lib/Target/X86/X86MCInstLower.cpp | ||
---|---|---|
948–949 ↗ | (On Diff #53949) | I think minsize still makes sense, but I'm thinking that there's two inputs here really:
Which is why I think looking at the attribute that defines the semantics of the PATCHABLE_OP still makes sense at this level. It allows us to make decisions of what's valid or not valid based on the attribute provided on the function. And I think this is the correct layer to make that decision, because this is where we're actually generating the instructions. Does that reasoning make sense? |
lib/Target/X86/X86MCInstLower.cpp | ||
---|---|---|
948–949 ↗ | (On Diff #53949) |
I've been thinking about this as: an instance of PATCHABLE_OP is
(I could be more explicit about this in comment over PATCHABLE_OP -- This is all that a PATCHABLE_OP implies. Any optimizations that If I understand you correctly, you're saying PATCHABLE_OP is not
|
lib/Target/X86/X86MCInstLower.cpp | ||
---|---|---|
948–949 ↗ | (On Diff #53949) | I think I originally misunderstood the purpose of PATCHABLE_OP -- I had been thinking it was a standalone pseudo-instruction which would be reduced the the op as a parameter if patching was not enabled on the function (or due to some other consideration, like the size of the instruction that proceeds it). In my head (and my current implementation for something similar I and echristo are working on) we just have a pure pseudo-instruction that expands in a context-sensitive manner. As implemented, I think it's fine for the purpose of handling this specific attribute. I suppose later implementations of a different attribute can dispatch to the correct behaviour (and re-use/extend PATCHABLE_OP) appropriately. |
Does this need to be a hard coded attribute? Why not something similar to the floating point ones while we're still working out things? Avoids needing to worry about bitcode reading/writing.