This is an archive of the discontinued LLVM Phabricator instance.

Introduce @llvm.experimental.deoptimize
ClosedPublic

Authored by sanjoy on Feb 29 2016, 1:43 PM.

Details

Summary

This intrinsic, together with deoptimization operand bundles, allow
frontends to express transfer of control and frame-local state from
one (typically more specialized, hence faster) version of a function
into another (typically more generic, hence slower) version.

In languages with a fully integrated managed runtime this intrinsic can
be used to implement "uncommon trap" like functionality. In unmanaged
languages like C and C++, this intrinsic can be used to represent the
slow paths of specialized functions.

Diff Detail

Repository
rL LLVM

Event Timeline

sanjoy updated this revision to Diff 49412.Feb 29 2016, 1:43 PM
sanjoy retitled this revision from to Introduce @llvm.experimental.deoptimize.
sanjoy updated this object.
sanjoy added reviewers: atrick, chandlerc, rnk, reames.
sanjoy added subscribers: JosephTremoulet, maksfb, mjacob and 2 others.
chandlerc added inline comments.Feb 29 2016, 1:54 PM
docs/LangRef.rst
12147–12148 ↗(On Diff #49412)

Would it be more precise to say that it must be a 'musttail' call?

sanjoy added inline comments.Feb 29 2016, 2:19 PM
docs/LangRef.rst
12147–12148 ↗(On Diff #49412)

At the IR level, semantically, it does "look like" a musttail call,
but at the ABI level it is not a musttail call from LLVM's
perspective. Usually the caller frame will have state that is needed
to transition into the deopt continuation, and so it is not okay for
LLVM to lower this call as "pop frame and jump". This is why I'm
hesitant to specify that calls to experimental.deoptimize have to be
musttail. On the other hand, managed languages typically do have a
guarantee that once we've fully transitioned into the generic /
de-specialized function, the caller frame will have been fully popped;
so it is a lot like a musttail call from the "eventual stack growth"
perspective.

A second minor difference is that the restrictions are a little
stricter -- because the intrinsic is polymorphic on the return value,
there is no need to allow the optional bitcast (this part is fairly
inconsequential though).

So the summary is that I'm somewhat on the fence on making this a
musttail call (or not), and given that changing our mind later would
not involve a lot of interesting code-churn, I'd like to defer that
decision to a later point. However if either of the points in the
first paragraph strongly resonate with you please let me know.

reames edited edge metadata.Mar 4 2016, 11:56 AM

It looks like this is missing codegen support. For now, I'd suggest lowering to call to a known symbol. __llvm_deoptimize?

docs/LangRef.rst
12107 ↗(On Diff #49412)

Naming wise, I liked the term side exit much better.

12115 ↗(On Diff #49412)

Might be better to declare this as taking unspecified arguments which are interpreted by the runtime.

12124 ↗(On Diff #49412)

As written, this explanation is too generic. The intrinsic specifically only allows transfer of control in one direction - leaving the current code. It does not allow entries. Having a separate intrinsic which modelled OSR entry points might some day be useful, but that's a different problem.

12126 ↗(On Diff #49412)

Java, or JavaScript

12127 ↗(On Diff #49412)

or "side exit"

12143 ↗(On Diff #49412)

Might be good to say something about the interpretation of deopt continuation as being explicitly out of scope for LLVM. (i.e. go invoke random code which must...)

lib/Transforms/Utils/InlineFunction.cpp
1806 ↗(On Diff #49412)

Can't you just skip the entire region if the return type of the caller is the same as the callee? Actually, no, you need to remove them from the normal returns so that we early terminate the parent as well. Can you restructure/add comments to make both parts clear?

atrick edited edge metadata.Mar 4 2016, 11:59 AM

I was also confused by the missing codegen support. Otherwise looks fine to me. I'll let Philip decide when it's ready to accept.

sanjoy updated this object.Mar 7 2016, 3:53 PM
sanjoy edited edge metadata.
sanjoy updated this revision to Diff 50005.Mar 7 2016, 3:54 PM
sanjoy marked 6 inline comments as done.
sanjoy updated this object.
  • Add a lowering strategy to FastISel and SelectionDAG -- the intrinsic is lowered to __llvm_deoptimize.
  • The intrinsic now takes an argument of unspecified type. The contract of what the argument means and how __llvm_deoptimize interprets it is open-ended and up to the frontend.
sanjoy added inline comments.Mar 7 2016, 3:55 PM
docs/LangRef.rst
12107 ↗(On Diff #49412)

I'm probably going to rename @guard_on as @guarded_side_exit.

reames requested changes to this revision.Mar 9 2016, 12:10 PM
reames edited edge metadata.

My previous documentation comments have not been addressed.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5447 ↗(On Diff #50005)

possibly we should have experimental in the name?
__llvm_experimental_deoptimize?

lib/Transforms/Utils/InlineFunction.cpp
1814 ↗(On Diff #50005)

minor: the implicit bool conversion here is confusing. nullptr != X would be clearer. Alternatively, have the lambda explicitly return bool.

1829 ↗(On Diff #50005)

I'd intended in my previous comment to indicate that the intrinsic should take an unlimited number of untyped arguments. Not one untyped argument.

This revision now requires changes to proceed.Mar 9 2016, 12:10 PM

Re: doc comments -- I think did address all of them, can you point out the ones that you are not okay with in the current form?

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5447 ↗(On Diff #50005)

Good idea, will fix.

lib/Transforms/Utils/InlineFunction.cpp
1814 ↗(On Diff #50005)

SGTM, will fix shortly.

1829 ↗(On Diff #50005)

I thought about that -- having a varargs signature is not more general (since you can just create an aggregate type / alloca and pass that in), and did not feel like it was worth the complexity (e.g. we'd probably not want the call to be a varargs call at the ABI level).

reames accepted this revision.Mar 9 2016, 12:36 PM
reames edited edge metadata.

Re: doc comments -- I think did address all of them, can you point out the ones that you are not okay with in the current form?

You're correct. For some reason the doc changes didn't show up in the diff between versions (or I just miss them). All of the doc comments have been addressed satisfactorily.

With the current comments addressed, LGTM.

lib/Transforms/Utils/InlineFunction.cpp
1829 ↗(On Diff #50005)

Hm, I disagree with you, but let's separate this out. For this review, let's use a form with no arguments. We can extend it later to support the multiple argument form if desired. That can be done as a follow up change.

(I specifically don't want the one argument form going in because it *requires* one argument when a completely legitimate runtime might not need any.)

This revision is now accepted and ready to land.Mar 9 2016, 12:36 PM
sanjoy added a comment.Mar 9 2016, 1:47 PM

The lowering I'm doing here is basically incorrect. The lowering should be going through RS4GC / gc.statepoint (since that is the only way LLC can understand deopt state). There is also a missing assert in SelectionDAG and FastISel -- at this point they do not know how to lower operand bundles, and should assert if they do see a call with operand bundle (had this assert been there I'd not have misled myself).

Also, given that we'll go through RS4GC, it makes the "lower varargs to regular args" bit fairly easy.

Given that the lowering bits will be a little more complex now, what do you think about the following plan:

  • Rip out the lowering parts from this change
  • Change the spec to have deoptimize take varargs
  • Introduce lowering via RS4GC in a separate change

?

(I'll also add some assertions to SelectionDAG and FastISel to fail on calls with operand bundles, but that's not directly relevant here)

reames added a comment.Mar 9 2016, 2:57 PM

Sanjoy and I talked about this offline; let me summarize what we talked about.

I expressed discomfort with reusing RS4GC as Sanjoy suggested. While this would seem to be the path of least resistance, it seemed odd to me to tie deopt and GC so closely coupled together. We clearly need to reuse the backend representation (i.e. the STATEPOINT pseudo op) since we need the stack lowering and stack map entry pieces, but reusing the IR rewriting/lowering code seemed overly coupled.

After discussion, we settled on the following approach. Sanjoy is going to examine whether we could reuse the StatepointLowering.cpp code directly from SelectionDAGBuilder's intrinsic lowering code. This would imply that we don't need to wrap a call to @llvm.experimental.deoptimize in an statepoint to get it lowered to a STATEPOINT psuedo op. It also gives us the building block for a possible future change which removes the deopt arguments from the statepoint intrinsic entirely in favour of a deopt bundle attached to the call to the statepoint.

(Worth noting, we could have used a PATCHPOINT psuedo op and not a STATEPOINT one for the lowering of the deoptimize call. The difference is live-on-call/in-regs vs live-through-call/on-stack semantics for the deopt arguments. We decided to use the STATEPOINT pseudo op mostly because that's what we have more experience with. We may revisit this lowering detail in the future, but it wasn't worth the churn right now.)

p.s. I'm deferring to Sanjoy on whether he wants to do everything within this patch or separate the lowering into it's own patch. He'll make that decision and potentially land this patch without the lowering code or argument handling. I've already given an LGTM for that part and that still stands.

sanjoy updated this revision to Diff 50215.Mar 9 2016, 4:53 PM
sanjoy marked an inline comment as done.
sanjoy edited edge metadata.

This revision

  • Fixes the nits pointed out earlier
  • Removes lowering code (that will be addressed later, as Philip mentioned)
  • Changes the signature to varargs
  • Adds support for calls to @llvm.experimental.deoptimize in a function being invoke d, and relevant test cases

The last bit is a semantic change that hasn't been LGTM'ed or
discussed previously; hence this re-review.

ping!

The additional handling for inlining invokes to functions that contain deoptimize calls is fairly straightforward, it'd be great to get a quick re-LGTM on this.

rnk accepted this revision.Mar 11 2016, 10:50 AM
rnk edited edge metadata.

I looked at the inliner stuff, and it looks good to me.

docs/LangRef.rst
12147–12148 ↗(On Diff #50215)

I don't think it would be useful to mark these as musttail. You will end up fighting with the existing musttail logic, which still merges returns after a tail call into normal control flow.

lib/Transforms/Utils/InlineFunction.cpp
1841–1843 ↗(On Diff #50215)

Maybe do this outside the loop to save some lookups.

This revision was automatically updated to reflect the committed changes.
sanjoy marked an inline comment as done.