This is an archive of the discontinued LLVM Phabricator instance.

Allow MS extension: support of constexpr with __declspec(dllimport).
Needs ReviewPublic

Authored by zahiraam on Oct 31 2022, 12:49 PM.

Details

Summary

Microsoft allows the support of ‘constexpr’ with ‘declspec(dllimport) starting
from VS2017 15.u update 4. See Constexpr doesn't support
declspec(dllimport)
in VS2017 15.8 - Visual Studio Feedback.

Let’s consider this example.
lib.cpp:
declspec(dllexport) int val=12;
declspec(dllexport)
int next(int n)
{

return n + 1;

}

app1.cpp:
#include <iostream>
extern int __declspec(dllimport) next(int n);
int main () {

extern int __declspec(dllimport) val;
constexpr int& val_ref = val;
int i = next(val_ref);
std::cout << "i: " << i << std::endl;
return i;

}

Compiling this will give the expected output.
$ cl /LD lib.cpp
$ cl /EHsc app1.cpp /link lib.lib
$ ./app1.exe
i: 13

The Intel compiler has the same behavior than MSVC.
Clang compiles this test case with this error:
error: constexpr variable 'val_ref' must be initialized by a constant

    expression
constexpr int& val_ref = val;
               ^         ~~~

1 error generated.

I think this should be fixed. This patch is doing that.

Now let’s look now at this example (dllimport at TU scope level):
app2.cpp:
#include <iostream>
extern int declspec(dllimport) next(int n);
extern int
declspec(dllimport) val;
constexpr int& val_ref = val;

int main () {

int i = next(val_ref);
std::cout << "i: " << i << std::endl;
return i;

}
Compiling this will result into an unresolved symbol:
$ cl /EHsc app2.cpp /link lib.lib
app2.obj : error LNK2001: unresolved external symbol "int val" (?val@@3HA)
app2.exe : fatal error LNK1120: 1 unresolved externals

ICL and clang generate the same error.

These are the symols generated for all 3 compilers:
ICL:
003 00000000 UNDEF notype External | imp_?next@@YAHH@Z (declspec(dllimport) int cdecl next(int))
004 00000000 UNDEF notype External |
imp_?val@@3HA (declspec(dllimport) int val)
00C 00000000 SECT4 notype External | ?val_ref@@3AEAHEA (int & val_ref)
00D 00000000 UNDEF notype External | ?val@@3HA (int val)
MSVC:
00A 00000000 UNDEF notype External |
imp_?next@@YAHH@Z (declspec(dllimport) int cdecl next(int))
015 00000000 SECT6 notype Static | ?val_ref@@3AEAHEA (int & val_ref)

The symbols generated by ICL seem to be what's expected. MSVC should be generating an "__imp_?val" symbol.

Clang with the change in this patch is generating the expected symbols:
010 00000000 UNDEF notype External | imp_?val@@3HA (declspec(dllimport) int val)
011 00000000 UNDEF notype External | imp_?next@@YAHH@Z (declspec(dllimport) int __cdecl next(int))
012 00000000 SECT5 notype External | ?val_ref@@3AEAHEA (int & val_ref)
013 00000000 UNDEF notype External | ?val@@3HA (int val)

Fixes issue https://github.com/llvm/llvm-project/issues/53182

Differential Revision: https://reviews.llvm.org/D137107

Diff Detail

Event Timeline

zahiraam created this revision.Oct 31 2022, 12:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 31 2022, 12:49 PM
zahiraam requested review of this revision.Oct 31 2022, 12:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 31 2022, 12:49 PM
rnk added a comment.Oct 31 2022, 1:03 PM

Unless I'm missing something, I think Clang's behavior here is preferable to MSVC's. MSVC produces code which will not link. Clang turns the linker error into a compiler error, which is generally easier for the user to understand. To my knowledge, it is still true that there is no COFF relocation which can statically initialize a global variable with the address of a dllimported symbol.

For the use case you are considering, how do things work out in practice? Is there some new import patching facility that I'm not aware of? Is the variable linked statically instead of dynamically, and then the linker relaxes the __imp references to direct references?

rnk added a subscriber: akhuang.Oct 31 2022, 1:10 PM

@akhuang, can you help out with this? The address of dllimport symbols not being constexpr is a big pain point for our users too, and I'd love to fix it:
https://github.com/protocolbuffers/protobuf/issues/10159

I am ready to be wrong here: if things have changed and my info is stale, we should definitely adapt.

Unless I'm missing something, I think Clang's behavior here is preferable to MSVC's. MSVC produces code which will not link. Clang turns the linker error into a compiler error, which is generally easier for the user to understand. To my knowledge, it is still true that there is no COFF relocation which can statically initialize a global variable with the address of a dllimported symbol.

For the use case you are considering, how do things work out in practice? Is there some new import patching facility that I'm not aware of? Is the variable linked statically instead of dynamically, and then the linker relaxes the __imp references to direct references?

I haven't used any import patch. I just used MSVC 2019 and the test compiles and links. Not sure how the variable is linked.
The symbols generated are:
009 00000000 UNDEF notype External | imp_?next@@YAHH@Z (declspec(dllimport) int cdecl next(int))
012 00000000 UNDEF notype External |
imp_?val@@3HA (__declspec(dllimport) int val)
and this is the asm:
main PROC
$LN3:

sub     rsp, 56                                 ; 00000038H

; Line 5

mov     rax, QWORD PTR __imp_?val@@3HA
mov     QWORD PTR val_ref$[rsp], rax

; Line 6

mov     rax, QWORD PTR val_ref$[rsp]
mov     ecx, DWORD PTR [rax]
call    QWORD PTR __imp_?next@@YAHH@Z
mov     DWORD PTR i$[rsp], eax

; Line 8

mov     eax, DWORD PTR i$[rsp]

; Line 9

add     rsp, 56                                 ; 00000038H
ret     0

main ENDP

Unless I'm missing something, I think Clang's behavior here is preferable to MSVC's. MSVC produces code which will not link. Clang turns the linker error into a compiler error, which is generally easier for the user to understand. To my knowledge, it is still true that there is no COFF relocation which can statically initialize a global variable with the address of a dllimported symbol.

For the use case you are considering, how do things work out in practice? Is there some new import patching facility that I'm not aware of? Is the variable linked statically instead of dynamically, and then the linker relaxes the __imp references to direct references?

The testcase that is shown here is peculiar in the sense that constexpr int& val_ref = val; is located within main(), so it isn't really constexpr (as far as I'd undestand it), but MSVC produces a load from __imp_val.

I am ready to be wrong here: if things have changed and my info is stale, we should definitely adapt.

I'm not aware of anything that would have changed here...

For the cases where this needs to be really statically initialized, the mingw pseudo relocations do fill that spot (which fakes it by initializing it very early on, from a linker-generated list of spots in the binary that needs patching). For non-constexpr cases in C++, Clang solves it by implicitly generating a static constructor which fetches it from the dllimported variable.

Unless I'm missing something, I think Clang's behavior here is preferable to MSVC's. MSVC produces code which will not link. Clang turns the linker error into a compiler error, which is generally easier for the user to understand. To my knowledge, it is still true that there is no COFF relocation which can statically initialize a global variable with the address of a dllimported symbol.

For the use case you are considering, how do things work out in practice? Is there some new import patching facility that I'm not aware of? Is the variable linked statically instead of dynamically, and then the linker relaxes the __imp references to direct references?

The testcase that is shown here is peculiar in the sense that constexpr int& val_ref = val; is located within main(), so it isn't really constexpr (as far as I'd undestand it), but MSVC produces a load from __imp_val.

I think I can be convinced that generating a compiler error instead of a link error for the first use case is better, but about this use case (I am not sure I understand why you say “it isn’t a real constexpr”):

extern int __declspec(dllimport) next(int n);
int main () {

extern int _declspec(dllimport) val;
constexpr int& val_ref = val;
int i = next(val_ref);
return i;

}

With MSVC with or without constexpr will lead to the same generated symbol

012 00000000 UNDEF notype External | imp_?val@@3HA (declspec(dllimport) int val)

and the same assembly (load __imp_?val) (is that why you made the comment above?). And the test case runs.

But that’s not the case for clang (without this patch).
With constexpr it’s failing with the error mentioned above.

Without constexpr the symbol generated is

0E 00000000 UNDEF notype External | imp_?val@@3HA (declspec(dllimport) int val)

and the assembly generates a ‘load __imp_? Val’. Same than MSVC, but the test case fails to run. @rnk Shouldn't this run?

I am ready to be wrong here: if things have changed and my info is stale, we should definitely adapt.

I'm not aware of anything that would have changed here...

For the cases where this needs to be really statically initialized, the mingw pseudo relocations do fill that spot (which fakes it by initializing it very early on, from a linker-generated list of spots in the binary that needs patching). For non-constexpr cases in C++, Clang solves it by implicitly generating a static constructor which fetches it from the dllimported variable.

and the same assembly (load __imp_?val) (is that why you made the comment above?). And the test case runs.

Yes, pretty much - the actual value of the address of &val isn't known at compile time and it cannot be produced in a way to initialize e.g. static memory, which I would believe that one can expect of constexpr. The runtime linker that loads DLLs can't fix it so that the address is available as a statically initialized data, but to use it it does require runtime code to fetch the address.

Without constexpr the symbol generated is

0E 00000000 UNDEF notype External | imp_?val@@3HA (declspec(dllimport) int val)

and the assembly generates a ‘load __imp_? Val’. Same than MSVC, but the test case fails to run. @rnk Shouldn't this run?

Can you give the full exact repro for this case, without constexpr, which you say fails to run - I tried to follow the instructions but for me it seems to work just fine, without constexpr.

and the same assembly (load __imp_?val) (is that why you made the comment above?). And the test case runs.

Yes, pretty much - the actual value of the address of &val isn't known at compile time and it cannot be produced in a way to initialize e.g. static memory, which I would believe that one can expect of constexpr. The runtime linker that loads DLLs can't fix it so that the address is available as a statically initialized data, but to use it it does require runtime code to fetch the address.

Without constexpr the symbol generated is

0E 00000000 UNDEF notype External | imp_?val@@3HA (declspec(dllimport) int val)

and the assembly generates a ‘load __imp_? Val’. Same than MSVC, but the test case fails to run. @rnk Shouldn't this run?

Can you give the full exact repro for this case, without constexpr, which you say fails to run - I tried to follow the instructions but for me it seems to work just fine, without constexpr.

Here is an even smaller test case.
lib.cpp:
__declspec(dllexport) int val=12;

t1.cpp:
int main () {

extern int _declspec(dllimport) val;
int& val_ref = val;
return val;

}
$ clang.exe -LD lib.cpp (this will create a.lib and a.exp)
$ clang.exe t1.cpp -la
$ ./a.exe
This will result in a popup message: "The application was unable to start correctly ..."

With MSVC
$ cl /LD lib.cpp (this will create lib.lib and lib.dll)
$ cl t1.cpp /link lib.lib
$ t1.exe
$ echo $?
12
$

Here is an even smaller test case.
lib.cpp:
__declspec(dllexport) int val=12;

t1.cpp:
int main () {

extern int _declspec(dllimport) val;
int& val_ref = val;
return val;

}
$ clang.exe -LD lib.cpp (this will create a.lib and a.exp)
$ clang.exe t1.cpp -la
$ ./a.exe
This will result in a popup message: "The application was unable to start correctly ..."

What does llvm-readobj --coff-imports a.exe show here?

What environment are you running this in? -LD is a cl/clang-cl style option and can't be passed to the gnu-style interface of clang.exe.

When I try to follow these examples in cmd.exe, I get the following:

>clang.exe -LD lib.cpp
   Creating library a.lib and object a.exp
LINK : fatal error LNK1561: entry point must be defined
clang: error: linker command failed with exit code 1561 (use -v to see invocation)

It looks to me like the way you're invoking the compiler here is creating the issue you're observing. This testcase works just fine for me when I build it like this:

>clang-cl -LD lib.cpp
>clang-cl t1.cpp -link lib.lib
>t1.exe
>echo %errorlevel%
12
rnk added a comment.Nov 3 2022, 4:15 PM
extern int __declspec(dllimport) next(int n);
int main () {
  extern int _declspec(dllimport) val;
  constexpr int& val_ref = val;
  int i = next(val_ref);
  return i;
}

@rnk Shouldn't this run?

Yes, I agree, this is a bug. Clang should compile this and reference __imp_next here. However, Clang should continue producing errors when a dllimport symbol is used to initialize a constexpr global variable, which was one of the other cases mentioned in the initial report.

zahiraam updated this revision to Diff 473693.Nov 7 2022, 8:44 AM
extern int __declspec(dllimport) next(int n);
int main () {
  extern int _declspec(dllimport) val;
  constexpr int& val_ref = val;
  int i = next(val_ref);
  return i;
}

@rnk Shouldn't this run?

Yes, I agree, this is a bug. Clang should compile this and reference __imp_next here. However, Clang should continue producing errors when a dllimport symbol is used to initialize a constexpr global variable, which was one of the other cases mentioned in the initial report.

@rnk would you mind looking at the fix I am proposing for this bug? Thanks.

Sorry for the delay, dev meeting. I think we should try to add some reviewers for IR gen to help, I don't have a lot of time to be responsive:
https://github.com/llvm/llvm-project/blob/main/clang/CodeOwners.rst#clang-llvm-ir-generation
+@efriedma @hans

clang/test/CodeGenCXX/PR19955.cpp
17

This seems concerning to me, are we sure this is the right change? It's important that varp points to the imported var, and not a local copy, which is what https://llvm.org/PR19955 deals with.

There are multiple notions of "constant" here:

  • Whether a value is invariant at runtime.
  • Whether a value is legal in a "constexpr" expression.
  • Whether we can emit a constant without emitting code that runs at startup to initialize it.

The third is generally a superset of the second; most targets have relocations that allow us emit the address of a global variable into another global variable. dllimport variables are an exception: even though the address is constant, we can't emit a relocation.

To make this work, what we need to do is:

  • Extend the notion of a "constant" in the AST/Sema to allow dllimport variables. (There should be code specifically checking for dllimport variables; we just need to drop that check.)
  • Teach IR generation (clang/lib/CodeGen) how to emit constants that don't have a relocation. This probably means a check that if we're generating the initializer for a global variable, and the initializer refers to the address of a dllimport variable, we instead emit a global constructor. (Maybe with a higher priority that normal global constructor, so the user can't accidentally observe an uninitialized variable? I forget if MSVC actually has constructor priorities...). Or I guess as an initial patch, we could just emit an error if we detect this case.
zahiraam updated this revision to Diff 478624.Nov 29 2022, 3:19 PM

There are multiple notions of "constant" here:

  • Whether a value is invariant at runtime.
  • Whether a value is legal in a "constexpr" expression.
  • Whether we can emit a constant without emitting code that runs at startup to initialize it.

The third is generally a superset of the second; most targets have relocations that allow us emit the address of a global variable into another global variable. dllimport variables are an exception: even though the address is constant, we can't emit a relocation.

To make this work, what we need to do is:

  • Extend the notion of a "constant" in the AST/Sema to allow dllimport variables. (There should be code specifically checking for dllimport variables; we just need to drop that check.)
  • Teach IR generation (clang/lib/CodeGen) how to emit constants that don't have a relocation. This probably means a check that if we're generating the initializer for a global variable, and the initializer refers to the address of a dllimport variable, we instead emit a global constructor. (Maybe with a higher priority that normal global constructor, so the user can't accidentally observe an uninitialized variable? I forget if MSVC actually has constructor priorities...). Or I guess as an initial patch, we could just emit an error if we detect this case.

@efriedma Thanks for taking the time to look at it and for the explanation. I made an attempt to make a few changes. Please let me know if this is moving in the right direction. Thanks.

That's roughly what I was thinking, yes.

The one missing piece is the code to modify the constructor priorities to make sure that "constant" variables get initialized first. (This ensures we honor the C++ rules for "constant initialization".)

clang/lib/Sema/SemaDecl.cpp
13896 ↗(On Diff #478624)

I don't understand why this diagnostic is necessary.

clang/test/CodeGenCXX/PR19955.cpp
10

We probably want to continue using "null" as the placeholder here, to avoid any confusion.

efriedma added inline comments.Nov 29 2022, 4:36 PM
clang/lib/CodeGen/CodeGenModule.cpp
4897

This check probably needs to be inside tryEmitForInitializer, so other callers of tryEmitForInitializer don't get confused.

zahiraam added inline comments.Nov 30 2022, 7:03 AM
clang/lib/Sema/SemaDecl.cpp
13896 ↗(On Diff #478624)

Now that the DLLImport variable can be a constant, HasConstInit is returning true (it was before returning false) and when the variable is global a diagnostic should be reported.

This:

extern int _declspec(dllimport) val;
constexpr int& val_ref = val;

should report a diagnostic, but this:

int foo() { 
   extern int _declspec(dllimport) val; 
   constexpr int& val_ref = val;
}

shouldn't report a diagnostic.

efriedma added inline comments.Nov 30 2022, 11:02 AM
clang/lib/Sema/SemaDecl.cpp
13896 ↗(On Diff #478624)

MSVC doesn't report a diagnostic for either of your examples?

Looking at this again, I just thought of something: in C mode, we probably don't want to generate global constructors, so we might want to continue to print an error in that case.

clang/lib/Sema/SemaDecl.cpp
13896 ↗(On Diff #478624)

Right, they shouldn't fail.

My perspective is that since we're able to give those the expected semantics, we should just do that. The only reason that "dllimport" is relevant at all is that there isn't a relocation for it, but our plan is to cover that up with a runtime constructor. So the user shouldn't need to know there's trickery behind the scenes unless they disassemble the binary.

zahiraam added inline comments.Nov 30 2022, 2:52 PM
clang/lib/Sema/SemaDecl.cpp
13896 ↗(On Diff #478624)

Got it! Thanks. Will remove the diagnostic here then.

rnk added inline comments.Nov 30 2022, 4:28 PM
clang/lib/Sema/SemaDecl.cpp
13896 ↗(On Diff #478624)

I worry that some of our users really do care quite a lot about constant initialization. I think emitting a special runtime initializer may be our best option (mingw does it through other means), but this is likely to be disruptive. Chrome has code which runs before CRT initialization to do some early sandboxing. ASan also competes to initialize shadow memory as early as possible (search compiler-rt for .CRT).

I worry that these users will think, "well, I've used [[constinit]] and constexpr, so this global should *definitely* be initialized during startup, and I won't have to worry about init order fiasco problems!", and then they will be surprised to find that this is not the case. Should we care? What can we do about this, realistically? Leaving behind this portability pothole of "dlimport symbols just aren't constexpr" seems worse. Our current behavior may be unduly influenced by my particular values: I think constant initialization is really important, I really care a lot about whether things are relocated or initialized at runtime.

I don't see any good solutions, so please carry on, but if you have any good ideas, please share. :)

efriedma added inline comments.Nov 30 2022, 8:50 PM
clang/lib/Sema/SemaDecl.cpp
13896 ↗(On Diff #478624)

For the obscure cases where the user actually cares about that sort of thing, we can make -Wglobal-constructors trigger, I guess? I'm not sure what else we could do that doesn't break the feature.


It might be worth trying to reach out to Microsoft to figure out a common design for the runtime initialization thing. MSVC appears to emit broken code in certain cases at the moment. For example, in the following, "z" is correctly initialized at runtime, but "z2" emits a broken direct reference.

_declspec(dllimport) int x;
constexpr int *y = &x;
int *z = &x;
int *z2 = y;
zahiraam added inline comments.Dec 1 2022, 8:36 AM
clang/lib/Sema/SemaDecl.cpp
13896 ↗(On Diff #478624)

Thank you for the feedback.

I think these are really corner cases, right? Then maybe it's an acceptable solution to use the -Wglobal-constructors? I suggest we proceed with these changes?

@efriedma the one thing that's missing is the priority in which these constructors are initialized. You said in an earlier comment that they should have higher priority than normal global constructors. What priority should be given to them? I see this PrioritizedCXXGlobalInits being filled in EmitCXXGlobalVarDeclInitFunc. Or should it be through AddGlobalCtor? And what priority should be given to them? Should it be hard coded (horrible!)? Or calculated using OrderGlobalInitsOrStermFinalizers?

efriedma added a subscriber: aeubanks.

I'm not really familiar with the way constructor priorities work on Windows works... see https://reviews.llvm.org/D131910 ? Adding @aeubanks as a reviewer.

rnk added a comment.Dec 1 2022, 3:11 PM

To recap, I'm in favor of implementing dllimport constexpr global initializers with a high priority dynamic initializer. I would suggest using init_priority of 201 to run right after the "compiler" initializers.

zahiraam updated this revision to Diff 479652.Dec 2 2022, 8:47 AM
zahiraam marked 2 inline comments as done.

(-Wglobal-constructors warning is still not implemented.)

clang/lib/CodeGen/CGDeclCXX.cpp
604

How is ConstInitAttr relevant here?

clang/test/Sema/dllimport.c
41 ↗(On Diff #479652)

Like I mentioned before, we might want to restrict this feature to C++.

zahiraam added inline comments.Dec 5 2022, 2:39 PM
clang/lib/CodeGen/CGDeclCXX.cpp
604

This change made (without the !(D->hasAttr<ConstInitAttr>()) made the LIT behavior of aix-static-init.cpp. The IR generated for
namespace test3 {

struct Test3 {
  constexpr Test3() {};
  ~Test3() {};
};

constinit Test3 t;

} // namespace test3

was different. I would have thought that the change we made for constexpr wouldn't affter constinit?

clang/test/Sema/dllimport.c
41 ↗(On Diff #479652)

Sorry I forgot about that. I will change that so that the diag is generated only for C++.

efriedma added inline comments.Dec 5 2022, 3:19 PM
clang/lib/CodeGen/CGDeclCXX.cpp
604

I think the significant bit there isn't the use of constinit; it's the non-trivial destructor. I think the priority modification should only affect constructors, not destructors. (Not sure how to make that work, at first glance.)

zahiraam updated this revision to Diff 480574.Dec 6 2022, 12:15 PM
zahiraam marked 3 inline comments as done.
zahiraam edited the summary of this revision. (Show Details)
clang/lib/CodeGen/CGDeclCXX.cpp
604

Let's see if this is an acceptable solution.

efriedma added inline comments.Dec 6 2022, 1:16 PM
clang/lib/CodeGen/CGDeclCXX.cpp
604

To fake constant initialization, we need to initialize the variable before anything else runs. But the rearranged prioritization isn't supposed to affect the destructor. From [basic.start.term]: "If an object is initialized statically, the object is destroyed in the same order as if the object was dynamically initialized."

What you're doing here isn't exactly implementing that. What you're doing here is delaying both the initialization and the destruction if the variable has a non-trivial destructor. We need to separate the two to get the behavior we want.

zahiraam added inline comments.Dec 7 2022, 10:05 AM
clang/lib/CodeGen/CGDeclCXX.cpp
604

Could we consider adding a field to EvaluatedStmt called "HasTrivialDestrutor" and only perform the prioritization change when
!D->HasTrivialDesctructor? Instead of using the test for D->hasConstantInitialization(). This seems to be englobing to many cases.

I considered returning null for HasConstantInitialization in case of var has a non-trivial destructor but that doesn't seem to work.

efriedma added inline comments.Dec 7 2022, 10:20 AM
clang/lib/CodeGen/CGDeclCXX.cpp
604

Once you separate out the initialization from the destruction, the rest should come together naturally, I think? I'm not sure what other cases could be caught by hasConstantInitialization().

zahiraam updated this revision to Diff 481029.Dec 7 2022, 1:08 PM
zahiraam added inline comments.Dec 7 2022, 1:10 PM
clang/lib/CodeGen/CGDeclCXX.cpp
604

Does this change accomplish this? Thanks.

zahiraam updated this revision to Diff 481271.Dec 8 2022, 7:27 AM
efriedma added inline comments.Dec 8 2022, 11:01 AM
clang/lib/CodeGen/CGDeclCXX.cpp
604

When a global variable is dynamically initialized, there are two parts to the initialization:

  • Constructing the variable (calling the constructor, or equivalent)
  • Registering the destructor with the runtime (atexit on Windows)

If an object is initialized statically, the two parts are separated: the variable is emitted already initialized by the compiler. But we still register the destructor at the same time, in the same way.

If a dllimport object is initialized statically, we need to make it appear to user code that the same thing happened. So we need to initialize the object early, but we need to register the destructor at the same time we would have otherwise registered it. To make this work, we need two separate initializer functions with different priorities.

zahiraam added inline comments.Dec 9 2022, 11:11 AM
clang/lib/CodeGen/CGDeclCXX.cpp
604

Thanks for the clarification. I am slowly getting it, I think.

In terms of IR, for example for this:

struct Test3 {
  constexpr Test3() {};
  ~Test3() {};
};

constinit Test3 t;

Are we looking for something like this (partial IR):

@t = global %struct.Test3 undef
@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 **201**, ptr @_GLOBAL__I_000201, ptr null }]
@llvm.global_dtors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 **65535**, ptr @_GLOBAL__D_a, ptr null }]
define internal void @__cxx_global_var_init() {
entry:
  call void @_ZN5Test3C1Ev(ptr noundef nonnull align 1 dereferenceable(1) @t)
  %0 = call i32 @atexit(ptr @__dtor_t) 
  ret void
}
define internal void @__finalize_t() {
 %0 = call i32 @unatexit(ptr @__dtor_t) 
 %needs_destruct = icmp eq i32 %0, 0
 br i1 %needs_destruct, label %destruct.call, label %destruct.end

destruct.call:  
  call void @__dtor_t()
.....}
}

Or

define internal void @__cxx_global_var_init()  {
entry:
  call void @_ZN5Test3C1Ev(ptr noundef nonnull align 1 dereferenceable(1) @t)
  ret void
}
define internal void @__finalize_t()  {
%0 = call i32 @atexit(ptr @__dtor_t) 
}
efriedma added inline comments.Dec 12 2022, 11:03 AM
clang/lib/CodeGen/CGDeclCXX.cpp
604

global_dtors is not at all relevant. (It's not something we ever use for C++ globals, and it doesn't have the right semantics for that anyway, and I don't think we even support it on Windows.) The sequence we currently generate looks something like this:

@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I__, ptr null }]

define internal void @_GLOBAL__sub_I__() #1 {
entry:
  store ptr @"?x@@3HA", ptr @"?y@@3UX@@A", align 8, !tbaa !4
  %0 = tail call i32 @atexit(ptr nonnull @"??__Fy@@YAXXZ") #2
  ret void
}

What we want is something like this:

@llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 201, ptr @ctor, ptr null }, { i32, ptr, ptr } { i32 65535, ptr @dtor, ptr null }]

define internal void @ctor() {
entry:
  store ptr @"?x@@3HA", ptr @"?y@@3UX@@A", align 8, !tbaa !4
  ret void
}

define internal void @dtor() {
entry:
  %0 = tail call i32 @atexit(ptr nonnull @"??__Fy@@YAXXZ") #2
  ret void
}
zahiraam updated this revision to Diff 483320.Dec 15 2022, 1:31 PM

This uploaded patch is a draft (LIT tests fails but are not updated since I am not sure about these changes) that I am part way confident about. There are still a lot of things that I need to look at, but would you mind looking at it to see if this is moving in the right direction?

Not sure exactly what code that generates in its current form, but that's roughly the right idea, yes.

Not sure exactly what code that generates in its current form, but that's roughly the right idea, yes.

Thanks.

This is the IR I am getting for this simple case. I think there is still something wrong with the call to the atexit. Still trying to find out where this is generated in the code.

// RUN: %clang_cc1 -no-enable-noundef-analysis x86_64-windows-msvc  \
// RUN: -fno-rtti -fno-threadsafe-statics -fms-extensions -emit-llvm -std=c++2a \
// RUN: -O0 -o - %s | FileCheck %s

// CHECK: @"?b@@3UB@@A" = dso_local global %struct.B undef, align 1
// CHECK: @llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 201, ptr @ctor, ptr null }, { i32, ptr, ptr } { i32 65535, ptr @dtor, ptr null }]

// CHECK: define internal void @ctor() #0 {
// CHECK: entry:
// CHECK:   %call = call x86_thiscallcc ptr @"??0B@@QAE@XZ"(ptr nonnull align 1 dereferenceable(1) @"?b@@3UB@@A")
// CHECK:   %0 = call i32 @atexit(ptr @"??__Fb@@YAXXZ") #2    **<== This shouldn't be called here, right?**
// CHECK:   ret void
// CHECK: }

// CHECK: define internal void @dtor() #0 {
// CHECK: entry:
// CHECK:   %0 = call i32 @atexit(ptr @"??__Fb@@YAXXZ.1") #2
// CHECK:   ret void
// CHECK: }

struct B {
  constexpr B() {}
  ~B() {};
};
constinit B b;

Right, the call to atexit from the constructor function shouldn't be there. I think you need to add a "skip emitting the destructor" flag to EmitCXXCtorInit

zahiraam updated this revision to Diff 484006.Dec 19 2022, 10:38 AM
efriedma added inline comments.Dec 19 2022, 10:54 AM
clang/lib/CodeGen/CodeGenModule.cpp
4812

Not sure what the D->getType()->isRecordType() check is doing here.

4907

I have no idea what this code is supposed to do.

5062

I think you want to use priority 201 whether or not there's a destructor.

zahiraam updated this revision to Diff 484062.Dec 19 2022, 1:50 PM
zahiraam marked 2 inline comments as done.
clang/lib/CodeGen/CodeGenModule.cpp
4907

I have removed from this the thread_local variables for being processed the same way. Maybe they should be included, not sure?

For the rest is to differentiate between these cases. I found the 2nd test case while doing some unit testing.

struct B {
constexpr B() {}
~B() {};
};
constinit B b;

and

struct A {
  int s;
  ~A();
};

struct D : A {};
D d1 = {};

I think the second test case is not supposed to EmitDeclInit in EmitCXXGlobalVarDeclInit right? But since now tryEmitForInitializer is returning an Initializer, then NeedsGlobalCtor = true;

Actually, when I removed this code, I have 2 LIT tests failing with the same crash.
WDYT?

5062

Is that what you mean?

efriedma added inline comments.Dec 19 2022, 2:28 PM
clang/lib/CodeGen/CodeGenModule.cpp
4907

I'm not sure how thread-local var init works on Windows off the top of my head; I'd need to check.

For the given testcases, neither one of those cases should be calling EmitDeclInit; we use constant initialization for both cases, so the global init func should just be registering the destructors. We shouldn't need to examine the initializer beyond checking whether tryEmitForInitializer succeeds.

Not sure what's crashing from your description.

zahiraam added inline comments.Dec 19 2022, 2:56 PM
clang/lib/CodeGen/CodeGenModule.cpp
4907

Removing this code will still result, in both cases to the call to EmitDeclInit, because the first call to EmitCXXCtorInit is called with PerformInit true.

Calling EmitCXXCtorInit with false would result into a ctor function with an empty body for the 1st test case (struct B):

define internal void @ctor() #0 {
entry:
  ret void
}

I would have thought that we should have something like this:

define internal void @ctor() #0 {
entry:
  %call = call x86_thiscallcc ptr @"??0B@@QAE@XZ"(ptr nonnull align 1 dereferenceable(1) @"?b@@3UB@@A")
  ret void
}

No?

efriedma added inline comments.Dec 19 2022, 3:20 PM
clang/lib/CodeGen/CodeGenModule.cpp
4907

because the first call to EmitCXXCtorInit is called with PerformInit true.

That sounds wrong; if tryEmitForInitializer succeeds, we shouldn't perform the init at all.

5062

I think it should look something more like this:

if (isStaticInit(D, getLangOpts()) {
  if (NeedsGlobalCtor)
    EmitCXXCtorInit(D, GV, true, 201, llvm::StringLiteral("ctor"), false);
  if (NeedsGlobalDtor)
    EmitCXXCtorInit(D, GV, false, 65535, llvm::StringLiteral("dtor"), true);
  DelayedCXXInitPosition[D] = ~0U;
} else {
  EmitCXXGlobalVarDeclInitFunc(D, GV, NeedsGlobalCtor);
}
zahiraam marked an inline comment as done.Dec 20 2022, 6:51 AM
zahiraam added inline comments.
clang/lib/CodeGen/CodeGenModule.cpp
5062

If you agree with the generated IR for this case, then I will start editing the LIT tests accordingly.

// CHECK: @"?b@@3UB@@A" = dso_local global %struct.B undef
// CHECK: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @dtor, ptr null }]

// CHECK: define internal void @dtor()
// CHECK: entry:
// CHECK:     %0 = call i32 @atexit(ptr @"??__Fb@@YAXXZ")
// CHECK:   ret void

// CHECk: define linkonce_odr dso_local x86_thiscallcc void @"??1B@@QAE@XZ"
// CHECK: entry:
// CHECK:  %this.addr = alloca ptr, align 4
// CHECK:  store ptr %this, ptr %this.addr, align 4
// CHECK:  %this1 = load ptr, ptr %this.addr, align 4
// CHECK:  ret void


// CHECK: define internal void @"??__Fb@@YAXXZ"() #0 {
// CHECK: entry:
// CHECK:   call x86_thiscallcc void @"??1B@@QAE@XZ"(ptr @"?b@@3UB@@A")
// CHECK:   ret void


struct B {
  constexpr B() {}
  ~B() {};
};
constinit B b;

The ctor with priority 201 is generated only when tryEmitForInitializer returns a nullptr (that's the only time when NeedsGlobalCtor is set to true). Correct?

zahiraam updated this revision to Diff 484258.Dec 20 2022, 6:56 AM
zahiraam added inline comments.
clang/lib/CodeGen/CodeGenModule.cpp
5067

Do you agree this should be done only when one of those flags is on?

zahiraam updated this revision to Diff 484278.Dec 20 2022, 8:05 AM
zahiraam added inline comments.
clang/test/CodeGenCXX/PR19955.cpp
20

@rnk was concerned about varp pointing to dllimport var. That does it I think.

efriedma added inline comments.Dec 20 2022, 11:47 AM
clang/include/clang/Basic/DiagnosticCommonKinds.td
254

Missing code to emit this warning?

clang/lib/CodeGen/CodeGenModule.cpp
5062

Right, we only need the priority 201 ctor if tryEmitForInitializer fails, but we need constant initialization anyway to meet the C++ language requirements.

Updated code seems much cleaner.

5067

Yes, that's fine; I wasn't really paying close attention to the exact code. Just wanted to make the point about the structure of the if statements, and code was the easiest way to explain it.

Maybe the outer if statement should actually be if (isStaticInit(D, getLangOpts()) && NeedsGlobalCtor) {.

On a related note, we might want to avoid the name "ctor", in case that accidentally conflicts with some user code; an "__"-prefixed name would be appropriate.

clang/test/CodeGenCXX/PR19955.cpp
16

It's easier to debug FileCheck failures if you use CHECK-LABEL for function definitions.

You probably want to use a different prefix that's used for both 32-bit and 64-bit for lines like this.

CHECK-64 isn't used by any FileCheck invocation.

zahiraam marked 3 inline comments as done.Dec 21 2022, 7:04 AM
zahiraam added inline comments.
clang/lib/CodeGen/CodeGenModule.cpp
5067

Maybe the outer if statement should actually be if (isStaticInit(D, getLangOpts()) && NeedsGlobalCtor) {

Not sure about that! There are cases where (isStaticInit(D, getLangOpts())) = true and NeedsGlobalCtor=false, but NeedsGlobalDtor=true. In which case a __dtor needs to be emitted, no?

Writing the condition as you are proposing would actually not get me into the body to emit the __dtor. Is that what we want?

efriedma added inline comments.Dec 21 2022, 11:47 AM
clang/lib/CodeGen/CodeGenModule.cpp
5067

EmitCXXGlobalVarDeclInitFunc should be able to handle that case.

Looking again, I'm a little concerned that in the isStaticInit() case, we're skipping a bunch of the logic in EmitCXXGlobalVarDeclInitFunc. EmitCXXCtorInit handles the basic cases correctly, but there are a lot of special cases in EmitCXXGlobalVarDeclInitFunc.

zahiraam updated this revision to Diff 484665.Dec 21 2022, 1:46 PM
zahiraam marked an inline comment as done.
zahiraam added inline comments.Dec 21 2022, 1:46 PM
clang/lib/CodeGen/CodeGenModule.cpp
5067

I have left the condition as it was to make sure no cases are left. What other cases are you thinking of?

zahiraam updated this revision to Diff 484814.Dec 22 2022, 6:24 AM
zahiraam updated this revision to Diff 484840.Dec 22 2022, 7:49 AM
efriedma added inline comments.Dec 24 2022, 1:07 PM
clang/lib/CodeGen/CGExprConstant.cpp
1926

I think this will trigger in cases we don't actually want it to; we only want to warn when we actually generate a global constructor, not when we end up falling back to generated code within a function. (For example, we sometimes constant-evaluate local variables.)

clang/lib/CodeGen/CodeGenModule.cpp
5067

EmitCXXGlobalVarDeclInitFunc has special cases for CUDA, OpenMP, thread-local variables, the InitSeg attribute, and inline variables.

zahiraam updated this revision to Diff 486274.Jan 4 2023, 7:05 AM
zahiraam marked 2 inline comments as done.
zahiraam added inline comments.Jan 4 2023, 7:09 AM
clang/lib/CodeGen/CodeGenModule.cpp
5067

Let me know if this works? I have added one additional LIT test. I referred to https://learn.microsoft.com/en-us/cpp/cpp/storage-classes-cpp?view=msvc-170#thread_local for thread_local.
For the inline variables, I couldn't find much information about initialization of inline variables. I tried to find a case that wouldn't pass by the new code and wouldn't pass by EmitCXXGlobalVarDeclInitFunc, but no success.

efriedma added inline comments.Jan 19 2023, 5:06 PM
clang/lib/CodeGen/CGDeclCXX.cpp
520

I don't really like the copy-pasted bits here.

clang/lib/CodeGen/CodeGenModule.cpp
5067

Inline variable example:

__declspec(dllimport) extern int x;
inline int *y = &x;
int **z = &y;

This should trigger your new codepath, I think?

clang/test/CodeGenCXX/constant-init.cpp
18

This doesn't look like your new codepath is triggering? If your code is working correctly, it should, I think?

zahiraam marked an inline comment as done.Jan 24 2023, 8:30 AM
zahiraam added inline comments.
clang/lib/CodeGen/CodeGenModule.cpp
5067

Yes. And the IR looks like this:

@"?y@@3PEAHEA" = linkonce_odr dso_local global ptr null, comdat, align 8
@"?z@@3PEAPEAHEA" = dso_local global ptr @"?y@@3PEAHEA", align 8
@"?x@@3HA" = external dllimport global i32, align 4
@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 201, ptr @__ctor, ptr null }]

; Function Attrs: noinline nounwind
define linkonce_odr dso_local void @__ctor() #0 comdat {
entry:
  store ptr @"?x@@3HA", ptr @"?y@@3PEAHEA", align 8
  ret void
}

I think this is correct.
Adding it to the LIT testing.

clang/test/CodeGenCXX/constant-init.cpp
18

That's right. In all cases where

llvm::Constant *Initializer = emitter->tryEmitForInitializer(*InitDecl);

is returning an initializer (with NeedsGlobalCtor=false and NeedsGlobalDtor=true) the control path is following the old path (no call to EmitCXXCtorInit), i.e. call to EmitCXXGlobalVarDeclInitFunc.

Unless we want to change the value of NeedsGlobaCtor in this case?

With:

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index f7add1350238..106ac355d356 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp

+++ b/clang/lib/CodeGen/CodeGenModule.cpp

@@ -4898,6 +4898,8 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D,
     // also don't need to register a destructor.
     if (getLangOpts().CPlusPlus && !NeedsGlobalDtor)
       DelayedCXXInitPosition.erase(D);
+      if (isStaticInit(D, getLangOpts()) && !NeedsGlobalCtor)
+        NeedsGlobalCtor = true;

 #ifndef NDEBUG
      CharUnits VarSize = getContext().getTypeSizeInChars(ASTTy) +

In which case for:

struct Test1 {
constexpr Test1(int) {}
~Test1() {}
};
inline constinit Test1 t2 = 2;

We would get this IR?

@llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 201, ptr @__ctor, ptr null }, { i32, ptr, ptr } { i32 65535, ptr @__dtor, ptr null }]

; Function Attrs: noinline nounwind
define linkonce_odr dso_local void @__ctor() #0 comdat {
entry:

%call = call noundef ptr @"??0Test1@@QEAA@H@Z"(ptr noundef nonnull align 1 dereferenceable(1) @"?t2@@3UTest1@@A", i32 noundef 2)
%0 = call i32 @atexit(ptr @"??__Ft2@@YAXXZ") #2
ret void

}

define internal void @"??__Ft2@@YAXXZ"() #0 {
entry:

call void @"??1Test1@@QEAA@XZ"(ptr @"?t2@@3UTest1@@A")
ret void

}

define linkonce_odr dso_local void @__dtor() #0 comdat {
entry:

%0 = call i32 @atexit(ptr @"??__Ft2@@YAXXZ.1") #2
ret void

}

zahiraam marked an inline comment as done.Jan 24 2023, 8:30 AM

Thanks. Please see my comments/questions.

zahiraam updated this revision to Diff 493411.Jan 30 2023, 1:56 PM
zahiraam updated this revision to Diff 493726.Jan 31 2023, 1:55 PM
zahiraam updated this revision to Diff 494051.Feb 1 2023, 1:12 PM

Would be nice to have feedback on this! So much work was put into it! @rnk, @efriedma

zahiraam updated this revision to Diff 502776.Mar 6 2023, 1:29 PM
efriedma added inline comments.Apr 27 2023, 3:27 PM
clang/lib/CodeGen/CGDeclCXX.cpp
520

It would be nice it we could to unify EmitCXXCtorInit with EmitCXXGlobalVarDeclInitFunc a bit more.

clang/lib/CodeGen/CodeGenModule.cpp
5067

Without your patch, the IR looks significantly different; the constructor function is named ??__Ey@@YAXXZ, and there's a reference to the variable in llvm.global_ctors. This is necessary to ensure we only construct the variable once.

In this particular case, I guess the harm of constructing the variable multiple times would be relatively minor, but it's still not a great idea. And we definitely can't emit a linkonce_odr function and just name it "__ctor".

clang/test/CodeGenCXX/constant-init.cpp
18

Reading this again, not sure what I was asking for here.

We should not be calling atexit() from a priority 201 constructor; we should always register it with the normal registration priority, so stuff is destroyed in the right order.

(I apologize it took me so long to get back to reviewing this.)

rnk edited reviewers, added: hans, efriedma; removed: majnemer, rnk, aeubanks.May 8 2023, 3:39 PM

Thanks for working on this, this is also an issue for our users. I've been out on leave. I replaced myself as a reviewer with Hans.