This is an archive of the discontinued LLVM Phabricator instance.

Introduce a "patchable-function" function attribute
ClosedPublic

Authored by sanjoy on Apr 12 2016, 7:38 PM.

Details

Summary

The "patchable-function" attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
"prologue-short-redirect", but this can be expanded in the future.

Diff Detail

Event Timeline

sanjoy updated this revision to Diff 53513.Apr 12 2016, 7:38 PM
sanjoy retitled this revision from to Introduce a "patchable-prologue" function attribute.
sanjoy updated this object.
sanjoy added reviewers: rnk, mehdi_amini, echristo.
sanjoy added a subscriber: llvm-commits.
rnk added inline comments.Apr 13 2016, 11:07 AM
include/llvm/Target/TargetLowering.h
1805 ↗(On Diff #53513)

This is about the prologue, so I would put this in TargetFrameLowering / X86FrameLowering, rather than growing the massive TargetLowering interface.

1807 ↗(On Diff #53513)

This file is not consistent on this point, but this should be report_fatal_error, since we want to keep the message in release builds.

lib/Target/X86/X86ISelLowering.cpp
30642 ↗(On Diff #53513)

I guess we are confident that modifying an instruction during its execution is not problematic.

sanjoy marked 2 inline comments as done.Apr 13 2016, 11:41 AM
sanjoy added inline comments.
lib/Target/X86/X86ISelLowering.cpp
30642 ↗(On Diff #53513)

At this point I'm fairly sure that replacing an instruction with another instruction of the exact same size is okay in practice. What I'm less sure of is replacing an executing instruction with another one that is smaller (i.e. replace only the prefix of an instruction), which is what this patch does.

Unfortunately, I can't think of a way to determine if the second assertion is correct or not except by running a lot of code compiled with "patchable-prologue"="hotpatch-compact" on machines with high core counts (the patch as is passes some basic sanity checks). If more thorough testing uncovers issues, then we'll deal with them as they come.

Given what I just said, do you think it is a good idea to rename the attribute to "experimental-hotpatch-compact"?

sanjoy updated this revision to Diff 53601.Apr 13 2016, 11:41 AM
  • Address @rnk 's review
rnk accepted this revision.Apr 13 2016, 1:10 PM
rnk edited edge metadata.

lgtm with the adjusted naming

include/llvm/Target/TargetFrameLowering.h
326 ↗(On Diff #53601)

Why not "Kind" instead of "Flavor"? That's way more common across LLVM. Also, our enum naming convention would make this look like:

enum PatchablePrologueKind { PPF_HotpatchCompact, PPF_Unknown };

http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

This revision is now accepted and ready to land.Apr 13 2016, 1:10 PM
rnk added inline comments.Apr 13 2016, 1:15 PM
lib/Target/X86/X86ISelLowering.cpp
30642 ↗(On Diff #53513)

Hit submit too soon...

I actually think you'll be OK here with the 2 byte alignment that you already have. No icache fetch is going to be able to observe any tearing. If we discover problems, we can nop-pad before subs.

I wouldn't add experimental here. All we need to guarantee is that there are two bytes to patch. Changing what we do for sub after the fact won't break any users.

sanjoy updated this revision to Diff 53768.Apr 14 2016, 12:19 PM
sanjoy edited edge metadata.

Rename "patchable-prologue" to ""patchable-function" + what @rnk
suggested around enum names.

echristo added inline comments.Apr 14 2016, 1:53 PM
docs/LangRef.rst
1408

Does this need to be a hard coded attribute? Why not something similar to the floating point ones while we're still working out things? Avoids needing to worry about bitcode reading/writing.

1415

Perhaps a different name for it? hotpatch-compact isn't particularly enlightening without the description. Is the "compact" because it only handles the small code model? It might be best to talk about the option in an architecture neutral way and then explain the particular implementation in a cpu specific section below for it.

lib/CodeGen/PatchableFunction.cpp
45–48

Can merge all of this.

lib/Target/X86/X86FrameLowering.cpp
2924 ↗(On Diff #53768)

Interesting. Is the idea here to avoid too much code growth? I'm assuming the performance of a pile of nops isn't that bad.

Also, are you just using the address of the symbol as the patchable address for the function?

sanjoy added inline comments.Apr 14 2016, 2:07 PM
docs/LangRef.rst
1408

Does this need to be a hard coded attribute?

Are you objecting to specifically documenting this attribute in the language reference? I don't mind that at all, given that means less work for me. :)

Avoids needing to worry about bitcode reading/writing.

If I understood you correctly, we don't have to worry about that here either, since this is a string attribute.

1415

"compact" as in "two bytes". I've tried to not mention any arch-specific details here (while avoid making things vague). Can you be more specific about how I can make this description less arch specific?

lib/Target/X86/X86FrameLowering.cpp
2924 ↗(On Diff #53768)

Interesting. Is the idea here to avoid too much code growth? I'm assuming the performance of a pile of nops isn't that bad.

Yes, we want to avoid too much code growth -- we have to do this for every function. In an older scheme where we had a 5 byte nop in the function prologue unconditionally, we did see some performance impact on old amd64 chips.

Also, are you just using the address of the symbol as the patchable address for the function?

Yes. To redirect control away from foo, we basically patch &foo.

dberris edited edge metadata.Apr 14 2016, 9:23 PM

PS. First review in LLVM, please be gentle? :)

docs/LangRef.rst
1415

My thought here is something that's recognisable. Consider things like:

  • compact-redirect-prologue
  • compact-rewrite-prologue
  • prologue-short-redirect
  • short-prologue-redirect
  • short-prologue-rewrite

If you intend to use "hotpatch" as a namespace of sorts (if there will be more later), something like:

  • hotpatch-short-prologue
  • hotpatch-prologue-small
lib/Target/X86/X86FrameLowering.cpp
2928–2930 ↗(On Diff #53768)

Have you considered inserting a pseudo instruction that gets translated instead when emitting the assembler?

lib/Target/X86/X86FrameLowering.h
207–209 ↗(On Diff #53768)

Is this intended to only handle prologues? Consider making this a single entry point, naming it something like makeFunctionPatchable(...). There may be other places where the patch-sleds could be inserted (before calls to functions, before entering loops, before returning, etc.) and it would be really great if this wasn't tied just to the prologue.

sanjoy added inline comments.Apr 14 2016, 11:00 PM
docs/LangRef.rst
1415

Thanks! I think I'll go with "prologue-short-redirect".

lib/CodeGen/PatchableFunction.cpp
45–48

Did not quite understand what you meant here. :)

lib/Target/X86/X86FrameLowering.cpp
2928–2930 ↗(On Diff #53768)

That's a good idea, let me give that a try.

lib/Target/X86/X86FrameLowering.h
207–209 ↗(On Diff #53768)

That's a good point, will do.

echristo edited edge metadata.Apr 14 2016, 11:09 PM
echristo added a subscriber: echristo.

I don't have anything to add to dberris's comments. One reply inline.

sanjoy updated this revision to Diff 53941.Apr 15 2016, 1:35 PM
sanjoy edited edge metadata.
  • Changed to use a pseudo instruction PATCHABLE_OP, as per @dberris, the code looks a lot cleaner now!
  • Renamed "prologue-hotpatch-compact" to "prologue-short-redirect"

There is some cleanup we can do after this, that I'll do separately
once this lands:

  • Simplify StackMapShadowTracker
  • Split out EmitNops so that the assert using OnlyOneNop can instead live in its caller.
sanjoy updated this revision to Diff 53943.Apr 15 2016, 1:38 PM
  • Remove unnecessary callback
sanjoy updated this revision to Diff 53949.Apr 15 2016, 2:08 PM

Guard asserts-only work under #ifndef NDEBUG

dberris added inline comments.Apr 18 2016, 6:29 PM
lib/Target/X86/X86MCInstLower.cpp
837–838

I'm not sure this assertion makes sense here. I would have thought this assert should have been done in the calling code, that it doesn't ask for a single nop in the first place?

948–949

So in this branch, MinSize != 2 or Opcode != X86::PUSH64r.

Question: If you check instead whether the function where MI is included in had the correct type of patchable-function attribute, and see that the nops being added is less than 2, you can assert and say this is actually a bug in a higher implementation detail? i.e. this would be a bug in the insertion of this instruction.

This way you wouldn't need to touch EmitNops.

I'm not very familiar with Differential but is there a way for you to update the summary to more accurately describe what the patch is doing now?

(Just about to update the description).

lib/Target/X86/X86MCInstLower.cpp
837–838

That's the first point under the "There is some cleanup we can do after this" note I sent in with this update. :) Basically, I'd rather not make this already large patch any larger if it can be helped. NFC cleanups like these are easy to do once the hard stuff has been reviewed and checked in, IMO.

948–949

Question: If you check instead whether the function where MI is included in had the correct type of patchable-function attribute, and see that the nops being added is less than 2, you can assert and say this is actually a bug in a higher implementation detail? i.e. this would be a bug in the insertion of this instruction.

Do you mean something like:

assert(MinSize < 2 && !MF.getPatchableFnType() == "prologue-short-redirect");

I'd say that is an incorrect layering. It feels cleaner for MC to not have to know why the PATCHABLE_OP of a certain variety is present. It should just understand its end of the contract of how it needs to lower PATCHABLE_OP.

sanjoy retitled this revision from Introduce a "patchable-prologue" function attribute to Introduce a "patchable-function" function attribute.Apr 18 2016, 6:44 PM
sanjoy updated this object.
dberris added inline comments.Apr 18 2016, 6:47 PM
lib/Target/X86/X86MCInstLower.cpp
948–949

I'd say that is an incorrect layering. It feels cleaner for MC to not have to know why the PATCHABLE_OP of a certain variety is present. It should just understand its end of the contract of how it needs to lower PATCHABLE_OP.

Consider the true branch of this if-statement though -- it already feels like it's already bleeding details in based on what a short 'prologue-short-redirect' patchable function attribute is already expecting. I'd say we're already breaking some layering guidelines here. :D

sanjoy added inline comments.Apr 18 2016, 7:20 PM
lib/Target/X86/X86MCInstLower.cpp
948–949

I don't think those two are the same things. One is saying "in this specific case I know can do better" (and you can extend the logic later by not just handling push'es but also returns, for instance), and the other is saying "this is the only case I support".

I'm not opposed to having PATCHABLE_OP specifically only work with "prologue-short-redirect", but then I'd rather have it not have a minsize operand at all. Do you think that will be cleaner? (I could go either way on this -- no strong preferences).

Actually, now that I think about it, what I dislike about the current scheme (i.e. this patch) is that only the minsize == 2 case is tested, so from just a testing / code coverage POV, removing minsize sounds slightly better.

dberris added inline comments.Apr 18 2016, 7:28 PM
lib/Target/X86/X86MCInstLower.cpp
948–949

I think minsize still makes sense, but I'm thinking that there's two inputs here really:

  1. That there are instructions to be placed here with a given minimum size. Note that this could be not just a single instruction, or could be treated just as a placeholder.
  2. That the semantics of what kinds of instructions will be emitted would be based on the type of function patching is supported. In this case you're implementing 'prologue-short-redirect' which expects certain kinds of operations to be valid where a PATCHABLE_OP appears.

Which is why I think looking at the attribute that defines the semantics of the PATCHABLE_OP still makes sense at this level. It allows us to make decisions of what's valid or not valid based on the attribute provided on the function. And I think this is the correct layer to make that decision, because this is where we're actually generating the instructions.

Does that reasoning make sense?

sanjoy added inline comments.Apr 18 2016, 8:36 PM
lib/Target/X86/X86MCInstLower.cpp
948–949

I think minsize still makes sense, but I'm thinking that there's two
inputs here really:

That there are instructions to be placed here with a given minimum
size. Note that this could be not just a single instruction, or could
be treated just as a placeholder.

That the semantics of what kinds of instructions will be emitted would
be based on the type of function patching is supported. In this case
you're implementing 'prologue-short-redirect' which expects certain
kinds of operations to be valid where a PATCHABLE_OP appears.

Which is why I think looking at the attribute that defines the
semantics of the PATCHABLE_OP still makes sense at this level. It
allows us to make decisions of what's valid or not valid based on the
attribute provided on the function. And I think this is the correct

I've been thinking about this as: an instance of PATCHABLE_OP is
(should be) self contained with regards to what needs to be emitted
when MC encounters one. Right now, as you said, emitting a
PATCHABLE_OP involves ensuring two things:

  1. There is *one* instruction of at least MinSize bytes at the place in the instruction stream the PATCHABLE_OP appeared at.
  2. The set of instructions emitted as part of lowering the PATCHABLE_OP pseudo instruction is "equivalent" (i.e. has the same effect on the CPU state, modulo the delta by which the instruction pointer is advanced) to the instruction bundled with PATCHABLE_OP.

(I could be more explicit about this in comment over PATCHABLE_OP --
let me know if you think that will help)

This is all that a PATCHABLE_OP implies. Any optimizations that
happens on top of (1) and (2) are strictly optional, and cannot tread
beyond what is allowed by (1) and (2).

If I understand you correctly, you're saying PATCHABLE_OP is not
self contained, but interpreting MC needs to do when it sees a
PATCHABLE_OP depends on what attributes the containing function has.
I still don't think this is correct (assuming I haven't misrepresented
your position): unless we gain something by by coupling function
attributes with PATCHABLE_OP, I'd rather have these de-coupled
(i.e. have PATCHABLE_OP be the mechanism by which the
"patchable-function" policy is implemented). For instance, a
potential use case for PATCHABLE_OP is to make some call sites
patchable, and one way to do that is via call site attributes.
Without making PATCHABLE_OP self sufficient we'd have to resort to
walking back to the relevant (IR level) call instruction (which may be
difficult to locate), in addition to checking the function attributes.
What if after this we want to add a third source for PATCHABLE_OP
(an intrinsic, say)? The complexity of lowering PATCHABLE_OP,
unless it is self sufficient, will scale linearly with the number of
ways we can generate one.

layer to make that decision, because this is where we're actually
generating the instructions.

Does that reasoning make sense?

dberris accepted this revision.Apr 18 2016, 9:52 PM
dberris edited edge metadata.
dberris added inline comments.
lib/Target/X86/X86MCInstLower.cpp
948–949

I think I originally misunderstood the purpose of PATCHABLE_OP -- I had been thinking it was a standalone pseudo-instruction which would be reduced the the op as a parameter if patching was not enabled on the function (or due to some other consideration, like the size of the instruction that proceeds it). In my head (and my current implementation for something similar I and echristo are working on) we just have a pure pseudo-instruction that expands in a context-sensitive manner.

As implemented, I think it's fine for the purpose of handling this specific attribute. I suppose later implementations of a different attribute can dispatch to the correct behaviour (and re-use/extend PATCHABLE_OP) appropriately.

This revision was automatically updated to reflect the committed changes.