Externally-defined functions with weak linkage should not be
tail-called on ARM or AArch64, as the AAELF spec requires normal calls
to undefined weak functions to be replaced with a NOP or jump to the
next instruction. The behaviour of branch instructions in this
situation (as used for tail calls) is implementation-defined, so we
cannot rely on the linker replacing the tail call with a return.
Details
- Reviewers
t.p.northover
Diff Detail
Event Timeline
Hi Oliver,
Is this an armlink work-around? The ABI looks suspiciously like it was written specifically to be compatible with some recalcitrant linker. It's even phrased as a platform-dependent issue (especially so on AArch32).
GNU's linker on Linux (both AArch32 & AArch64) seems to cope fine, for example. So I don't think we'd want to disable the optimisation there. Or anywhere without evidence of problems, really.
Cheers.
Tim.
I originally stated that "we cannot rely on the linker replacing the tail call with a return", but after further thought I now realise that it is impossible for the linker to replace a tail-call with a return, as it may need to insert a stack pointer adjustment, but it does not know how big the adjustment must be.
Both armlink and arm-none-eabi-ld replace branch instructions to undefined weak symbols with a NOP, so this is not just an armlink problem.
The AAELF spec only states this for "platforms that do not support dynamic pre-emption of symbols", so I have modified the patch to only prevent the tail call for bare-metal systems.
Hi Oliver,
I'm really sorry, I've realised I was confused yesterday. For some reason I'd convinced myself that the nop/"branch-to-next" was the correct behaviour and that's what I was checking for. I see more clearly what you mean now.
So going onto the patch itself, I think it's in the wrong place. The "tail call" instructions inserted at IR level are (usually) more of a hint than a strict requirement. The backends decide what really should and shouldn't be made a tail call, using the "isEligibleForTailCallOptimisation" in XYZISelLowering.cpp (at least in the AArch64 & ARM cases).
My confusion also affects where this is valid. It's almost certainly needed for Linux -static compilations too. Darwin platforms appear to be closest to the pre-emptable Linux situation.
Cheers.
Tim.
Moved this to the ARM and AArch64 backends, removed all changes to TailCallElim.
I have taken out the restriction based on target platform, as both bare-metal and Linux could need this.