Page MenuHomePhabricator

Disable x86 tail call optimizations that jump through GOT

Authored by chh on May 15 2015, 1:45 PM.

Diff Detail


Event Timeline

chh updated this revision to Diff 25891.May 15 2015, 1:45 PM
chh retitled this revision from to Disable x86 tail call optimization under PIC mode.
chh updated this object.
chh edited the test plan for this revision. (Show Details)
chh set the repository for this revision to rL LLVM.
chh removed rL LLVM as the repository for this revision.May 15 2015, 1:48 PM
chh added a subscriber: Unknown Object (MLST).
chh added a reviewer: enh.May 15 2015, 5:06 PM
chh added a reviewer: void.May 18 2015, 3:32 PM
chh added a reviewer: rnk.May 20 2015, 4:41 PM
rnk added inline comments.May 22 2015, 11:23 AM
2759 ↗(On Diff #25891)

We can't do this, musttail guarantees that a tail call will occur. I think the old behavior of forcing relocations to be resolved eagerly was preferable. If we cannot honor musttail, we must report an error somehow.

chh updated this revision to Diff 26338.May 22 2015, 12:15 PM

Okay, I removed the change to isMustTailCall in the new diff.

rnk edited edge metadata.May 22 2015, 12:44 PM

You removed the musttail call change, but also removed the supporting code that arranges to jump through ECX, so I don't think this will work either.

I commented on the bug, I'm trying to understand why the GOT reference causes a problem. I don't yet see a codegen bug here.

chh added a comment.May 22 2015, 1:56 PM

The purpose of this change is to remove the jump (through ECX or not) completely since the use of tailcallee@GOT is incorrect under lazy dynamic linking.

davidxl added inline comments.
2929 ↗(On Diff #26338)

It is a valid optimization with eager binding, so it seems the behavior should be controlled by an option.

chh added inline comments.May 22 2015, 4:40 PM
2929 ↗(On Diff #26338)

David, I suppose you prefer the default to have tail call optimization.
Then, we don't need to change llvm.
We can use -fno-optimize-sibling-calls now to turn off tail call optimization.
Or we can add another -fno-optimize-tail-calls.

I would rather llvm for x86 to play safe as gcc, with no tail call optimization by default.
Then we could have another option like -foptimize-tail-calls.

On the other hand, I think I should not remove line 2928 to 2943,
for they might be needed for IsMustTail functions.

chh updated this revision to Diff 26363.May 22 2015, 5:02 PM
chh edited edge metadata.

Keep IsMustTail working as before.

I am still not convinced this is a compiler bug. See my comment in the bug.

joerg added a subscriber: joerg.May 26 2015, 9:56 AM

This code really should be under an option. As mentioned in the original PR, the origin of the problem is the misdesigned XFree86/Xorg driver interface. We should not pessimise everyone who doesn't create brain dead code. The executive summary of the X bug is:
(1) A module is loaded with a reference to foo. foo is not yet defined.
(2) Another module is lodead that defines foo. X requires the first module to call this.

Ignoring must-tailcall would break all languages that require the optimisation.

chh added a comment.May 26 2015, 1:52 PM

I read the new discussions in

  • If it is not considered an optimization bug, we need to change semantics of dlopen() with RTLD_LAZY.
  • A fix to have both tail call optimization and RTLD_LAZY seems infeasible, to setup %ebx before and restore %ebx after the call.
  • Both and Android hit this problem. I have spent weeks to trace down this problem in Android and hence my motivation to change llvm and save future debug time.
  • This problem has been worked around with a global -fno-optimize-sibling-calls flag. So it is not urgent to have a fix. But that flag also disables other valid cases such as non-PIC mode.

In my new diff3, I keep current isMustTail feature and disable isTailCall only under PIC mode.
That should not lose performance compared with gcc.

If we want to keep this optimization selectable under PIC mode,
then we need a new flag and the question is whether the optimization is on or off
under PIC mode by default. Considering the difficulty of debugging this problem
and current competition from gcc, I would rather the default to be off.


Changing the default may affect users who depend on the current behavior which has been the case for a couple of releases. Another possible solution is to emit a warning for cases like this so that debugging similar issues in the future will become easier.

rnk added a comment.May 27 2015, 11:15 AM

I'm coming around to the position that we should simply disable this
optimization for PIC calls to symbols with default visibility. The current
change disables all tail calls when generating PIC code, which is

We can add a flag later if someone asks, and call it

chh updated this revision to Diff 26645.May 27 2015, 4:04 PM
chh retitled this revision from Disable x86 tail call optimization under PIC mode to Disable x86 tail call optimizations that jump through GOT.
chh added a comment.May 27 2015, 4:10 PM

Please review my new diff 4. This should allow tail call optimizations for protected and hidden symbols.
AFAIK, jump to those symbols do not use GOT.
On the other hand, I found some oddity about "sin" and "sin2" symbols in one test case,
which made me to use the strange condition:

if (!(G && (.... || ...))) isTailCall = false;
This revision was automatically updated to reflect the committed changes.
rnk added a comment.May 28 2015, 2:04 PM

Thanks for the patch! I simplified the conditional a bit and tweaked the tests to try to cover some more corner cases that I could think of. During testing I found that this disabled tail calling of internal (think static in C) functions, so I added hasLocalLinkage() to the conditional. I committed the result as r238487.

chh added a comment.May 28 2015, 2:53 PM

Reid, thank you very much for the review, advices, and clean up of this patch.