When the target option GuaranteedTailCallOpt is specified, calls with the fastcc calling convention will be transformed into tail calls if they are in tail position. This diff adds a new calling convention, tailcc, currently supported only on X86, which behaves the same way as fastcc, except that the GuaranteedTailCallOpt flag does not need to enabled in order to enable tail call optimization.
I'd just like to note that this code is not well tested. A number of features and platforms have been added to the x86 prologue codepath that have not been tested in combination with guaranteed TCO, such as Win64 support. I've always assumed that users of this feature just like to live life dangerously.
Aside from that, the code looks correct, and I think this is a step towards simplifying things. As a long term goal, I think we should get rid of the GuaranteedTailCallOpt option and just ask people to use this convention. These are the conventions that work with this feature today:
- fastcc, replaced by tailcc
- GHC, we can probably make this always be callee cleanup and guarantee TCO for it in the same way
- HiPE, probably the same
- HHVM, unclear, I hope it's the same
- X86_RegCall, I'm not sure if it was intended for us to change the ABI of this convention when users pass -tailcallopt to llc. It seems like that could be a bug.
There are four cases this handles:
We have named booleans for case 2 and 3, but this code would benefit from a bool for case 4. I think you can call shouldGuaranteeTCO earlier and set up a bool with a name like GuaranteeTCO and check it in the rest of the cases throughout LowerCall.
A couple things I think could have testcases:
- Mismatched calling conventions should not tail call
- Interaction with musttail (see test/CodeGen/X86/musttail-fastcall.ll)
- What should we do with disable-tail-calls? (see test/CodeGen/X86/disable-tail-calls.ll)
Instead of copying the documentation from fastcc, it might be better to say this is equivalent (other than the tail call part)
It's not totally equivalent though... it's only equivalent to fastcc if you pass -tailcallopt. I agree with what you're proposing at a high level, but I'm not sure how to word it. Do you have a suggestion?
Maybe something like this?
"This calling convention ensures that calls in tail position will always be tail call optimized. This calling convention is equivalent to fastcc with an additional guarantee that tail calls will be produced whenever possible. Tail calls can only be optimized when this..."
I added tests for musttail and mixed calling conventions. But I'm a little unsure what the correct behavior should be here for disable-tail-calls. When you do a musttail call, disable-tail-calls is ignored. When you do a fastcc call with -tailcallopt, disable-tail-calls is not ignored. This seems a little inconsistent to me, but is beyond the scope of this revision. Which of these two behaviors should be implemented by tailcc, though? Currently it does the same thing as fastcc with -tailcallopt. I can add a test for that behavior, but if we want the other behavior, I need to know before I write that test because it affects the test's output.
Looking back in the discussion, I think it's fine to have it act the same way as fastcc.
If we want to get rid of GuaranteedTailCallOpt, I don't think it's a good idea to have tailcc disregard (or error out on) disable-tail-calls. We will still want to have the option to produce normal calls when someone asks for it.
I think it would be good to hear @rnk's thoughts on this too.