This is an archive of the discontinued LLVM Phabricator instance.

[CGP] Be less conservative about tail-duplicating a ret to allow tail calls
ClosedPublic

Authored by mkuper on Sep 7 2016, 2:23 PM.

Details

Summary

CGP tail-duplicates rets into blocks that end with a call that feed the ret. This puts the call in tail position, potentially allowing the DAG builder to lower it as a tail call. To avoid the tail-duplication in cases where we won't form the tail call, CGP tries to predict whether this is going to be possible, and avoid doing it when lowering as a tail call will definitely fail. However, it was being too conservative by always throwing away calls to functions with a signext/zeroext attribute on the return type.

Instead, we can use the exact same logic the builder itself uses to determine whether the attributes work out.

Diff Detail

Repository
rL LLVM

Event Timeline

mkuper updated this revision to Diff 70595.Sep 7 2016, 2:23 PM
mkuper retitled this revision from to [CGP] Be less conservative about tail-duplicating a ret to allow tail calls.
mkuper updated this object.
mkuper added reviewers: hansw, rnk, iteratee.
mkuper added a subscriber: llvm-commits.
hans accepted this revision.Sep 7 2016, 2:45 PM
hans added a reviewer: hans.
hans added a subscriber: hans.

lgtm, nice!

This revision is now accepted and ready to land.Sep 7 2016, 2:46 PM
iteratee added inline comments.Sep 7 2016, 4:31 PM
lib/CodeGen/CodeGenPrepare.cpp
2008 ↗(On Diff #70595)

It's kind of awkward to pass this as an output parameter and not use it.
Would it be better to make it a pointer argument with a default value of nullptr?

mkuper added inline comments.Sep 7 2016, 4:50 PM
lib/CodeGen/CodeGenPrepare.cpp
2008 ↗(On Diff #70595)

I agree it's awkward, I'm just not sure the other way around (having a temporary in the callee, and then setting after a null check) is less awkward, overall. Same goes for other possible solutions (returning a pair, say).

In any case, I don't feel strongly about this, and I don't think the LLVM coding standards say anything concrete about this. So if you significantly prefer a pointer, let me know, and I'll change it.

iteratee added inline comments.Sep 7 2016, 5:03 PM
lib/CodeGen/CodeGenPrepare.cpp
2008 ↗(On Diff #70595)

How about this:
Use a local variable in attributesPermitTailCall.
Use make_scope_exit to set the output parameter on exit only if it isn't null.

mkuper added inline comments.Sep 7 2016, 5:16 PM
lib/CodeGen/CodeGenPrepare.cpp
2008 ↗(On Diff #70595)

That'll work too (I didn't even know we had make_scope_exit!), but it seems like overkill.

Another awkward option that doesn't involve make_scope_exit, as long as we're at it:

... bool *ADS = nullptr) {
  bool TempADS;
  bool &AllowDifferingSizes = ADS ? *ADS : TempADS;
...
}
iteratee added inline comments.Sep 7 2016, 5:18 PM
lib/CodeGen/CodeGenPrepare.cpp
2008 ↗(On Diff #70595)

Go with that. That's the least awkward of all of them.

This revision was automatically updated to reflect the committed changes.