Page MenuHomePhabricator

Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC
ClosedPublic

Authored by sanjoy on Mar 11 2016, 3:08 PM.

Details

Summary

This is a step towards implementing "direct" lowering of calls and
invokes with deopt operand bundles into STATEPOINT nodes (as opposed to
having them mandatorily pass through RewriteStatepointsForGC, which is
the case today).

This change extracts out a SelectionDAGBuilder::LowerAsStatepoint
helper function that is able to lower a "statepoint like thing", and
uses it to lower gc.statepoint calls. This is an NFC now, but in a
later change we will use LowerAsStatepoint to directly lower calls and
invokes with operand bundles without going through an intermediate
gc.statepoint IR representation.

FYI: I expect SelectionDAGBuilder::StatepointInfo will evolve as I add
support for lowering non gc.statepoints, right now it is fairly tightly
coupled with an IR level gc.statepoint.

Diff Detail

Repository
rL LLVM

Event Timeline

sanjoy updated this revision to Diff 50487.Mar 11 2016, 3:08 PM
sanjoy retitled this revision from to Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC.
sanjoy updated this object.
sanjoy added reviewers: reames, pgavlin, JosephTremoulet.
sanjoy added a subscriber: llvm-commits.
reames edited edge metadata.Mar 11 2016, 6:38 PM

A couple of minor comments for now. I'm going to return to this Monday. FYI, the organization of this is really bugging me. Hopefully, I'll be able to explain why with a bit more thought. :)

lib/CodeGen/SelectionDAG/StatepointLowering.cpp
249 ↗(On Diff #50487)

minor: this looks like an unrelated NFCI change. Please separate and land to simplify review.

345 ↗(On Diff #50487)

This is going to be confusing named if reused. For a non-statepoint call with deopt, the GCResult instruction will just be the instruction itself.

375 ↗(On Diff #50487)

I think this is valid for a non-statepoint void call as well.

392 ↗(On Diff #50487)

again, nfci and separate.

580 ↗(On Diff #50487)

I think the only thing we're using GCArgs for is this loop. Might be better to just pass in the explicit set of spill slots.

sanjoy updated this revision to Diff 50513.Mar 11 2016, 7:18 PM
sanjoy edited edge metadata.
  • Factored out and landed the s/Value/GCRelocateInst/ bits
  • Add back some verification code I had accidentally removed

Hi Philip,

A couple of minor comments for now. I'm going to return to this Monday. FYI, the organization of this is really bugging me. Hopefully, I'll be able to explain why with a bit more thought. :)

If you're concerned about this not being general enough to support
deopt bundle lowering also (i.e. the coupling is too tight), then I
think you're right -- as I've mentioned in the commit message, I'm
fairly sure that I'll have to evolve the LowerAsStatepoint interface
somewhat to support both use cases. If you want, I can wait on
getting this reviewed till I'm done with the full implementation of
deopt bundle lowering (and get them reviewed at the same time, but as
different patches).

Or are you concerned about some other aspect of this patch?

The structure of the StatepointInfo struct strongly hints to me that we have a coupling of the API with the implementation. I understand this is an initial step, but I think it would be good to clean up some of the more messy aspects before moving on. I've tried to give a couple of specific suggestions below.

In general, I think we should be asking the question: why is StatepointInfo different from CallLoweringInfo? Merging them might be worth seriously considering. (I'd suggest NOT doing that in this patch, but keeping that goal in mind and merging them in a later step might be good.)

You might also look closely at the lowerInvokable function in SelectionDAGBuilder.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
739 ↗(On Diff #50513)

The fact we need to pass these in explicitly hints at a broken API. I'll scatter comments prefixed with gc-relocate: below with suggestions.

775 ↗(On Diff #50513)

These two fields hint at a problematic coupling of the interface. Why can't this simply be a list of arguments?

lib/CodeGen/SelectionDAG/StatepointLowering.cpp
345 ↗(On Diff #50513)

Could this block of code be lifted into the caller and the GCResult param removed? It seems unlikely we'll want this for the deopt only use.

528 ↗(On Diff #50513)

gc-relocate: this looks like it should really be an assertion in the visitGCRelocate code. Separate and commit?

596 ↗(On Diff #50513)

gc-relocate: It looks like the only thing we're using the relocates for here is to identify the derived pointer. Since we've already done that once, can this become a loop over the derived pointers? Separate and commit?

642 ↗(On Diff #50513)

gc-relocate: With all the other users removed, might this code and the startNewStatepoint call be lifted into the parent?

675 ↗(On Diff #50513)

Why do we need the IsStatepoint flag? Can't an empty flags convey the same information?

sanjoy marked an inline comment as done.Mar 14 2016, 6:47 PM
sanjoy added inline comments.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
775 ↗(On Diff #50513)

This is because SelectionDAGBuilder::lowerCallOperands takes a CallSite, unsigned ArgIdx, unsigned NumArgs. Passing in an ArrayRef<Value *> CallArgs is definitely cleaner, but we'll have to generalize lowerCallOperands before that happens.

Edit: I have an idea -- how about we stop using lowerCallOperands and use lowerInvokable instead? Then I can embed TargetLowering::CallLoweringInfo into StatepointInfo and reduce the conceptual duplication. In fact, the nesting could ultimately be: StatepointLoweringInfo contains DeoptLoweringInfo contains CallLoweringInfo. I'll try to code that up to see if it works.

lib/CodeGen/SelectionDAG/StatepointLowering.cpp
345 ↗(On Diff #50513)

Yes, I'll do that separately if you don't mind.

596 ↗(On Diff #50513)

It is also used below in if (Relocate->getParent() != StatepointInstr->getParent())

642 ↗(On Diff #50513)

We still need GCRelocates in lowerStatepointInfoMetaArgs, as noted earlier.

675 ↗(On Diff #50513)

Good point, I'll remove this.

sanjoy updated this revision to Diff 50789.Mar 15 2016, 6:44 PM

I moved some code around, and now have a separation between the
information needed to lower the call itself and the "meta" statepoint
information. Out of this I'll try to split out the
populateCallLoweringInfo (and land them if they look trivial).

However, there is one snag -- the Chain parameter of
CallLoweringInfo. With this new organization, we create the
CallLoweringInfo instance earlier than the spills, the CALL_SEQ
nodes end up anchoring to an earlier chain (since
populateCallLoweringInfo set CLI.Chain to DAG.getRoot()) and the
spill code gets DCE'd away.

In the current form of the patch, this is "fixed" by re-setting the
chain later, after we've emitted the spill code. But then the
CallLoweringInfo instance we create is invalid to use till we re-set
the chain which, which is ugly.

The Right(TM) fix to this is to (I think) pass in an explicit Chain
parameter to all the places that consume a CLI, since the Chain is not
part of the thing we're lowering, but a part of the location we want
to lower it at. However, that is a large (but simple) API change
touching all targets.

reames added inline comments.Mar 16 2016, 1:35 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
716 ↗(On Diff #50789)

Comments about what these do would help...

721 ↗(On Diff #50789)

I'm not sure lowerCallOperands is a useful abstraction once you've extracted populateCallLoweringInfo. Might be worth just inlining it at the two call sites.

sanjoy updated this revision to Diff 50872.Mar 16 2016, 2:22 PM

Rebase over http://reviews.llvm.org/rL263663

I think at this point we've reached the point of diminshing returns
for clean-ups before checkin. Philip - what do you think about
checking this in now, and working on it in tree?

reames accepted this revision.Mar 16 2016, 3:47 PM
reames edited edge metadata.

LGTM. I'm happy to see this evolve further in tree. Having the other user will also make it far more obvious where we've gotten the API wrong. :)

minor comments below. Can be done either before submission or as separate changes depending on your preference.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
752 ↗(On Diff #50872)

Why explicitly initial this and none of the other fields?

lib/CodeGen/SelectionDAG/StatepointLowering.cpp
437 ↗(On Diff #50872)

minor: I wouldn't bother changing the function name in this way.

511 ↗(On Diff #50872)

This comment still applies. A follow on commit is fine.

795 ↗(On Diff #50872)

Might be good to lift these two to the top of the function.

This revision is now accepted and ready to land.Mar 16 2016, 3:47 PM
sanjoy added inline comments.Mar 16 2016, 4:13 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
752 ↗(On Diff #50872)

Will fix -- the only ones with missing initializers right now are StatepointInstr and EHPadBB (the rest of the arguments are objects that should initialize themselves).

lib/CodeGen/SelectionDAG/StatepointLowering.cpp
437 ↗(On Diff #50872)

Will fix before checkin.

511 ↗(On Diff #50872)

Ah, sorry, I somehow missed seeing the comment earlier. I'll check in a follow-on change.

795 ↗(On Diff #50872)

Will fix before checkin.

This revision was automatically updated to reflect the committed changes.
sanjoy added inline comments.Mar 16 2016, 4:33 PM
lib/CodeGen/SelectionDAG/StatepointLowering.cpp
511 ↗(On Diff #50872)

Thinking about it, the current behavior seems somewhat problematic --
whether an incoming gc pointer gets the special "already a spill slot"
treatment depends on whether SelectionDAG lowers it to a
FrameIndexSDNode or not; and that in turn depends on whether the
llvm::Value is statically an llvm::AllocaInst or not. So if I
start with a gc.statepoint that has an alloca as a gc pointer,
something like LCSSA (say) could obscure the llvm::AllocaInst behind
a PHI, and getValue will give us a CopyFromReg, and this would
change behavior.

I think even in the best case we can't do what this bit of code is
trying to do (assuming I understand its intent correctly) -- you could
have a PHI of two different allocas flowing in, into the
gc.statepoint (this could have been created due to tail merging, for
instance) and then we won't even have a fixed frame index that we can
report.

The caveat to the above is that I've assumed that the special
treatment of alloca's is needed for correctness. If this is merely a
performance optimization, then the problem is easier.