This is an archive of the discontinued LLVM Phabricator instance.

[darwin/x86] Model cxx_fast_tlscc as not preserving r11
Needs ReviewPublic

Authored by thakis on Sep 1 2021, 4:50 PM.

Details

Summary

If a TLV stub is called from a dylib, dyld_stub_binder will clobber r11.

Fixes PR51703.

Diff Detail

Event Timeline

thakis created this revision.Sep 1 2021, 4:50 PM
thakis requested review of this revision.Sep 1 2021, 4:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2021, 4:50 PM
thakis edited the summary of this revision. (Show Details)Sep 1 2021, 4:50 PM

Hmm, I wonder if there's a compatibility problem for changing this; I hadn't realized this was so old. It dates to 2015 (r252837), in a commit by Manman.

thakis added a comment.Sep 7 2021, 8:22 AM

What's the next step here? Do you want to run some internal tests? Do you want me to run some tests? Use a different approach?

Sorry, I'm not entirely sure who owns this feature right now, I'm asking around.

Are there any owners to be found? :)

Whats the next step here?

thakis added a comment.Dec 1 2021, 4:30 PM

roughly monthly ping :)

thakis added a comment.Jan 6 2022, 6:51 AM

Should we just land this and see what (if any) breaks?

We do know that it fixes a miscompile of fairly basic code.

hans added a subscriber: hans.Jan 13 2022, 6:26 AM

Hmm, I think the potential problem here is that dropping R11 from CSR_64_TLS_Darwin does not only change the calling code (which is what we want, and seems like a safe change), but will also change the callee, i.e. the TLS wrapper functions wherever they're generated, to no longer preserve R11 (which they did before, although not when called through the dyld_stub_binder). That would cause a problem if the caller and callee are built by different compiler versions (e.g. an older program calling some newer (system?) library code). I don't know whether that could happen though?

(These registers are also tied up with "split CSRs" somehow, which was added by https://github.com/llvm/llvm-project/commit/abc7c1d1d2d44b4d7416c714a13239872d95a72e followed by AArch64-specific code in https://github.com/llvm/llvm-project/commit/cbe4f9417d1cbb0192173439501fedb782949bad and then x86_64 in https://github.com/llvm/llvm-project/commit/ed967f37520fec398aa419514de552e1a917c9ec
But I'm not sure how relevant that is..)

rjmccall, do you have any opinion on what (if anything) should happen here? At the moment, using TLV across dylibs on darwin is basically broken (see https://github.com/llvm/llvm-project/issues/51045).

This is a fairly simple fix. It carries some risk, but maybe not having it is worse?

Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2022, 6:51 AM
Herald added a subscriber: jsji. · View Herald Transcript

I think I can speak authoritatively at this point that the right way to understand this convention is:

  • Functions defined using this convention must promise to preserve all registers.
  • As a general rule, calls that are not direct calls within a linkage unit may invalidate linker scratch registers such as r11.
  • Therefore, calls to the function may rely on all non-linker-scratch registers as being preserved, but may only rely on linker scratch registers being preserved if the call is a direct call to a callee known to be within the current linkage unit.

I'll leave it to LLVM folks to decide how they want to model that.