The InlineSite abstraction is currently a thin layer around CallSite, but later changes will have it multiplex between a CallSite and a Statepoint.
Details
Diff Detail
Event Timeline
Please ignore this change. I thought I had run the test suite after this, but I had not and a bunch of test fail with this applied. I'll put up an update and fixed change tomorrow.
Comments inline.
lib/Transforms/Utils/InlineFunction.cpp | ||
---|---|---|
66 | I'm not convinced of the need for a new abstraction here. What does this really add over having a function which takes a call site and extracts the underlying callee from a statepoint and the surface callee from a normal call? An getUnwrappedCallee(CallSite CS) function seems to serve the same purpose with much less complexity. Even if I accept the need for an extra abstraction, I *strongly* object to the duplication of the underlying fields. Add proxying access if needed, but please do not duplicate each field from the underlying call site. | |
119 | Placement wise, this should be below the existing InlineFunction code. | |
131 | I'm confused by this bit. It seems like possibly an unrelated refactoring? Or is this the statepoint related bits for gc.result? | |
1367 | Unintentional diff? |
lib/Transforms/Utils/InlineFunction.cpp | ||
---|---|---|
66 |
It is not just the call destination -- I will have to update all of the places where the Inliner accesses, say, paramHasAttr and change it to do something different if the call site is a statepoint. I suppose I could have a getUnwrappedXXX for each of those properties, but that's semantically what this class is. Plus by having a separate type I can verify that I've not accidentally used CS.getFoo() where it would be incorrect to do so.
Ok. | |
131 | I moved this part so to that I could RAUW gc_result easily in the following patch. I'll split this bit out into its own refactoring. | |
1367 | See previous comment. |
Make the InlineSite class a lot more lightweight. It is now a thin
wrapper around CallSite. A later change will have it multiplex over
Statepoint and CallSite.
Started looking and really thinking about this.
The more I think about it, the more I think that *if* we need to support inlining through statepoints, then CallSite should be able to wrap a statepoint just like it does an invoke. I don't think we want two abstractions here.
But the more I think about that, the more I think that the current statepoint IR form makes that really ... icky. But I don't know if there is realistically a better one because I've not sufficiently internalized the constraints the statepoints were designed under.
So prior to going further on this patch, I'm going to do two things to educate myself:
- Chat with Sanjoy (probably in IRC) so he can teach me more about statepoints and maybe point me at the right design discussions to just go read.
- Follow up on the statepoint inlining design discussion thread to ask for some more high-level details that shouldn't end up buried on this review.
Hi Chandler,
What is your current stance on this? Should I start making gc_statepoint calls / invokes managed by CallSite?
The only potential downside I see to making CallSites manage gc_statepoint is that it will make CallSite more complex than a simple dispatch over CallInst vs. InvokeInst. CallSite will now have to dispatch over the cross product of {CallInst, InvokeInst} and { isStatepoint, !isStatepoint }. There will be a slight performance penalty[1] and some bits of LLVM that depend on CallSite being uncomplicated may require updating. I don't mind doing all of this if we can all agree that CallSite managed statepoints are the right way to do this.
<hand-waving>
Here is how I think about what's going on, in case that is helpful:
The mental model I have for inlining through statepoints is that the inliner "sees through" some control and data flow in the gc_statepoint intrinsic and inlines two levels at once. It can then inline through / simplify the attached gc_relocate and gc_result calls as well.
For instance, an imaginary implementation of gc_statepoint that will allow the current (i.e. implemented in the current patch set) form of inlining looks like:
define i32 @gc_statepoint(func_ptr %f, ...args) { (arg0, arg1, ...) = unpack(...args) ;; can't be represented in LLVM IR %result = %f(arg0, arg1, ...) ret pack(%result) ;; can't be easily represented in LLVM IR }
with the axiom
gc_result(pack(%result)) == %result
The inliner then, semantically, first inlines through the gc_statepoint call, then recognizes that the call to %f is now direct, and then inlines through that call as well. It then simplifies the associated gc_(result|relocate) calls[2].
There are two places where this view of things break down:
- the "internals" of @gc_statepoint are not representable in LLVM IR, especially once we take into account deoptimization and gc state.
- even if (1) was somehow addressed, it won't be okay to inline only through @gc_statepoint and not subsequently inline through %f.
But semantically, this is one way to think about what's going on.
</hand-waving>
[1]: Most of this may be mitigated by s/PointerIntPair<InstrTy*, 1, bool>/PointerIntPair<InstrTy*, 2> in CallSite
[2]: Currently we only simplify gc_result calls, logic simplification for gc_relocate will come later)
I'm not convinced of the need for a new abstraction here. What does this really add over having a function which takes a call site and extracts the underlying callee from a statepoint and the surface callee from a normal call? An getUnwrappedCallee(CallSite CS) function seems to serve the same purpose with much less complexity.
Even if I accept the need for an extra abstraction, I *strongly* object to the duplication of the underlying fields. Add proxying access if needed, but please do not duplicate each field from the underlying call site.