This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Correct when Itanium ABI guard variables are set for non-block variables with static or thread storage duration.
ClosedPublic

Authored by tahonermann on Oct 13 2022, 2:20 PM.

Details

Summary

Previously, Itanium ABI guard variables were set after initialization was complete for non-block declared variables with static and thread storage duration. That resulted in initialization of such variables being restarted in cases where the variable was referenced while it was still under construction. Per C++20 [class.cdtor]p2, such references are permitted (though the value obtained by such an access is unspecified). The late initialization resulted in recursive reinitialization loops for cases like this:

template<typename T>
struct ct {
  struct mc {
    mc() { ct<T>::smf(); }
    void mf() const {}
  };
  thread_local static mc tlsdm;
  static void smf() { tlsdm.mf(); }
};
template<typename T>
thread_local typename ct<T>::mc ct<T>::tlsdm;
int main() {
  ct<int>::smf();
}

With this change, guard variables are set before initialization is started so as to avoid such reinitialization loops.

Fixes https://github.com/llvm/llvm-project/issues/57828

Diff Detail

Event Timeline

tahonermann created this revision.Oct 13 2022, 2:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 13 2022, 2:20 PM
tahonermann requested review of this revision.Oct 13 2022, 2:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 13 2022, 2:20 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

So this logically looks/sounds right to me (and I have no code concerns)... But I'm also notoriously bad at anything threading related. I'd love it if one of the other reviewers could sanity check this for me.

This revision is now accepted and ready to land.Oct 14 2022, 7:38 AM
rjmccall requested changes to this revision.Oct 14 2022, 9:30 AM

We can't set the flag if initialization is aborted by an exception, which is not ruled out by the use of thread-unsafe statics. So this is not a correct change.

More basically, I don't think "recursive calls try to re-initialize" is actually worse behavior than "recursive calls access an uninitialized object". And both seem to be allowed.

The real problem here is that everyone understands this flag to be a request to generate a cheap code sequence for testing initialization, but it's not actually possible to do the maximally cheap sequence and have it still be safe against recursive entry; you need to generate guards and diagnose the failure.

This revision now requires changes to proceed.Oct 14 2022, 9:30 AM

We can't set the flag if initialization is aborted by an exception, which is not ruled out by the use of thread-unsafe statics. So this is not a correct change.

@tahonermann, the case from https://github.com/llvm/llvm-project/issues/57828 is not for a block-scope variable, but the case in the patch description is...

rjmccall added a comment.EditedOct 14 2022, 9:40 AM

Sorry, I responded as if you were modifying the guards of thread-unsafe global variables, but you're actually modifying the guards of thread-local variables. I apologize for the confusion. However, my substantive points pretty much all still hold:

  • we have to generate code that behaves correctly in the presence of exceptions
  • providing access to an uninitialized variable is worse than recursively re-entering initialization; in either case, the program is incorrect, and the latter is much more likely to cause an immediate failure at runtime
  • the real fix requires a more elaborate code sequence to reliably generate a runtime failure

The major difference is that this is the standard behavior of the compiler and therefore not something we can just throw our hands up about.

Also, this needs Itanium discussion because it can affect interoperation.

the case from https://github.com/llvm/llvm-project/issues/57828 is not for a block-scope variable, but the case in the patch description is...

Thanks, Hubert. Yes, I found that the reported issue occurred for any case where thread safe guard variables are not required. I chose the block-scope variable example for the patch summary because I felt it better presented the issue.

Your question inspired me to do some additional testing though and I see both gcc and icc also exhibit the re-initialization behavior for that case, but not the case reported in https://github.com/llvm/llvm-project/issues/57828.

we have to generate code that behaves correctly in the presence of exceptions

Thank you, John. Indeed, I didn't consider exception handling. And I locally confirmed that the changes break cases like https://godbolt.org/z/M8Penaxa1.

providing access to an uninitialized variable is worse than recursively re-entering initialization; in either case, the program is incorrect, and the latter is much more likely to cause an immediate failure at runtime

My understanding is that, as long as the value of the uninitialized variable is not used, the program is not incorrect.

the real fix requires a more elaborate code sequence to reliably generate a runtime failure
Also, this needs Itanium discussion because it can affect interoperation.

I'll do some more analysis of how gcc and icc handle these scenarios and report back. If they appear to handle these cases right, perhaps Clang can be updated to match without the need for further ABI discussion.

the case from https://github.com/llvm/llvm-project/issues/57828 is not for a block-scope variable, but the case in the patch description is...

Thanks, Hubert. Yes, I found that the reported issue occurred for any case where thread safe guard variables are not required. I chose the block-scope variable example for the patch summary because I felt it better presented the issue.

Your question inspired me to do some additional testing though and I see both gcc and icc also exhibit the re-initialization behavior for that case, but not the case reported in https://github.com/llvm/llvm-project/issues/57828.

For the block-scope case, https://eel.is/c++draft/stmt.dcl#3: If control re-enters the declaration recursively while the variable is being initialized, the behavior is undefined.

the case from https://github.com/llvm/llvm-project/issues/57828 is not for a block-scope variable, but the case in the patch description is...

Thanks, Hubert. Yes, I found that the reported issue occurred for any case where thread safe guard variables are not required. I chose the block-scope variable example for the patch summary because I felt it better presented the issue.

Your question inspired me to do some additional testing though and I see both gcc and icc also exhibit the re-initialization behavior for that case, but not the case reported in https://github.com/llvm/llvm-project/issues/57828.

For the block-scope case, https://eel.is/c++draft/stmt.dcl#3: If control re-enters the declaration recursively while the variable is being initialized, the behavior is undefined.

Ah, right, so in principle we can just ignore this possibility. It would still be good to have some mode that dynamically diagnoses it, though, maybe under UBSan. We could probably get the Itanium ABI to agree to a standard guard value that means "currently being initialized" which implementations could use if they want to diagnose this case.

For non-block variables, this is a deferred initialization, as set out in [basic.start.dynamic]p7 (https://eel.is/c++draft/basic.start.dynamic#7). I don't think there's an analogous restriction about recursion, so only the general object lifetime rules apply, [basic.life]p7 (https://eel.is/c++draft/basic.life#7). A recursive use during initialization would be a use of the "original object" prior to the completion of initialization. Under the lifetime rules, the acceptability of that depends on what's being done, but generally, anything that accesses memory through it is either UB or yields an unspecified value. That does not permit us to do arbitrary things whenever we see a recursive reference, though, unless we actually do an analysis of that reference, which we are not doing here.

Note also that, for non-block variables, the exceptions concern does not arise because [basic.start.dynamic]p8 (https://eel.is/c++draft/basic.start.dynamic#8) says that exceptions during initialization trigger std::terminate.

I think that suggests that setting the guard to 1 prior to initialization is actually *required* for the deferred initialization of thread-locals, because otherwise we are interfering with the evaluation of uses of the name that are potentially valid prior to initialization, such as storing the address of the variable somewhere without accessing it. And if the use isn't valid and is one of the UB/unspecified cases, well, it's too bad we can't catch that immediately, but it's no worse than what would happen with static storage duration.

So if you alter this patch to only apply to non-block variables, I think we can move forward with it.

Please add a comment something like this:

The semantics of dynamic initialization of thread-local variables depends subtly on whether they are block-scope. The initialization of block-scope thread locals can be aborted with an exception and later retried (<cite>), and recursive entry to the initialization has undefined behavior (<cite>). For non-block thread locals, exceptions lead to termination (<cite>), and recursive references to the variables are governed only by the lifetime rules (<cite>), which means they’re perfectly fine as long as they avoid touching memory. As a result, we need to avoid marking block-scope variables as initialized until after initialization completes (or be prepared revert the mark after an exception), but we need to mark non-block-scope variables as initialized immediately so that non-memory-accessing uses will succeed.

tahonermann retitled this revision from [Clang] Set thread_local Itanium ABI guard variables before calling constructors. to [Clang] Correct when Itanium ABI guard variables are set for non-block variables with static or thread storage duration..
tahonermann edited the summary of this revision. (Show Details)

Please add a comment something like this:

Done. Thank you for the prior review comments. This is again ready for review.

@rjmccall, could you please take a look at the updated changes? I think I've addressed the issues you raised.

rjmccall accepted this revision.Nov 15 2022, 3:25 PM

Thanks, this looks great.

This revision is now accepted and ready to land.Nov 15 2022, 3:25 PM