This is an archive of the discontinued LLVM Phabricator instance.

Add /Zc:DllexportInlines option to clang-cl
ClosedPublic

Authored by takuto.ikuta on Aug 28 2018, 1:37 AM.

Details

Summary

This CL adds /Zc:DllexportInlines flag to clang-cl.
When Zc:DllexportInlines- is specified, inline class member function is not exported if the function does not have local static variables.

By not exporting inline function, code for those functions are not generated and that reduces both compile time and obj size. Also this flag does not import inline functions from dllimported class if the function does not have local static variables.

On my 24C48T windows10 machine, build performance of chrome target in chromium repository is like below.
These stats are come with 'target_cpu="x86" enable_nacl = false is_component_build=true dcheck_always_on=true` build config and applied

Below stats were taken with this patch applied on https://github.com/llvm-project/llvm-project-20170507/commit/a05115cd4c57ff76b0f529e38118765b58ed7f2e

configbuild timespeedupbuild dir size
with patch, PCH on, debug1h10m0sx1.1335.6GB
without patch, PCH on, debug1h19m17s49.0GB
with patch, PCH off, debug1h15m45sx1.1633.7GB
without patch, PCH off, debug1h28m10s52.3GB
with patch, PCH on, release1h13m13sx1.2226.2GB
without patch, PCH on, release1h29m57s37.5GB
with patch, PCH off, release1h23m38sx1.3223.7GB
without patch, PCH off, release1h50m50s38.7GB

This patch reduced obj size and the number of exported symbols largely, that improved link time too.
e.g. link time stats of blink_core.dll become like below

cold disk cachewarm disk cache
with patch, PCH on, debug71s30s
without patch, PCH on, debug111s48s

This patch's implementation is based on Nico Weber's patch. I modified to support static local variable, added tests and took stats.

Bug: https://bugs.llvm.org/show_bug.cgi?id=33628

Diff Detail

Repository
rC Clang

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
takuto.ikuta edited the summary of this revision. (Show Details)Sep 7 2018, 5:52 AM

Did both your builds use PCH? It'd be interesting to see the difference without PCH too; the effect should be even larger.

Added stats of without PCH build.

The summary should probably reference https://bugs.llvm.org/show_bug.cgi?id=33628 and it needs to mention how it affects dllimport too.

Added to description, thanks!

Okay, after reading through the patch, it seems we're still marking class members dllexport, and then you selectively remove the attribute later. That does feel a little bit backward... Does -fvisibility-inlines-hidden also have the static local problem, or how does that flag handle it?

Ah, maybe I can get performance improvement just support fvisibility-inlines-hidden in clang-cl. Let me try.

I just support fvisibility-inlines-hidden in clang-cl and that looks to work intended.

takuto.ikuta edited the summary of this revision. (Show Details)Sep 7 2018, 6:12 AM
takuto.ikuta edited the summary of this revision. (Show Details)Sep 7 2018, 7:38 AM
takuto.ikuta edited the summary of this revision. (Show Details)Sep 7 2018, 3:47 PM
takuto.ikuta edited the summary of this revision. (Show Details)Sep 8 2018, 3:56 PM
takuto.ikuta edited the summary of this revision. (Show Details)Sep 9 2018, 10:28 PM
hans added inline comments.Sep 10 2018, 12:30 AM
clang/lib/Driver/ToolChains/Clang.cpp
5244 ↗(On Diff #164379)

Huh, does this actually affect whether functions get dllexported or not?

takuto.ikuta added inline comments.Sep 10 2018, 2:32 AM
clang/lib/Driver/ToolChains/Clang.cpp
5244 ↗(On Diff #164379)

Sorry, what you want to ask?

This will used to not add dllexport attr in L5690 of SemaDeclCXX.cpp.

hans added inline comments.Sep 10 2018, 5:49 AM
clang/lib/Driver/ToolChains/Clang.cpp
5244 ↗(On Diff #164379)

Oops, I didn't see that. I'm glad to see this is looking so simple :-)

Actually, I don't think we should the same flag name for this, since "hidden" is an ELF concept, not a COFF one, just that we should match the behaviour of the flag.

Hmm, or do people use -fvisibility-inlines-hidden on MinGW or something? Where does the hidden-dllimport.cpp file come from?

Also, is it the case that -fvisibility-inlines-hidden just ignores the problem of static local variables? If that's the case we can probably do it too, we just have to be sure, and document it eventually.

I'm trying to handle local static var correctly.

takuto.ikuta added inline comments.Sep 11 2018, 2:04 AM
clang/lib/Driver/ToolChains/Clang.cpp
5244 ↗(On Diff #164379)

I confirmed that -fvisibility-inlines-hidden treats local static var correctly in linux.
So I'm trying to export inline functions if it has local static variables.

takuto.ikuta edited the summary of this revision. (Show Details)Sep 11 2018, 2:06 AM
takuto.ikuta retitled this revision from Add /Zc:DllexportInlines option to clang-cl to [WIP] Add /Zc:DllexportInlines option to clang-cl.Sep 11 2018, 7:07 AM
rnk added inline comments.Sep 11 2018, 1:12 PM
clang/lib/Driver/ToolChains/Clang.cpp
5244 ↗(On Diff #164379)

This sounds like it would be really hard in general, since you can hide static locals almost anywhere:

struct Foo {
  static int foo() {
    return ([]() { static int x; return ++x; })();
  }
};

Can we reuse the RecursiveASTVisitor @hans added to check if we can emit dllimport inline functions as available_externally definitions? I think it will find exactly the circumstances we are looking for, i.e. export all the things that cannot be emitted inline in other DLLs.

takuto.ikuta added inline comments.Sep 12 2018, 8:29 AM
clang/lib/Driver/ToolChains/Clang.cpp
5244 ↗(On Diff #164379)

Actually, StmtVisitor added dll export attribute to local static variable in lambda function. And static variables seems exported.

But I found other issue in current implementation, some cxx method decls passed to emitting function before local static variable checker adds dll export attributes. In such case local static variables are not exported and link failure happens.

So let me try to use DLLImportFunctionVisitor in CodeGenModule::shouldEmitFunction for exported function instead of processing/checking dll attribute around SemaDeclCXX.

takuto.ikuta edited the summary of this revision. (Show Details)

Current implementation cannot build chrome when pch is enabled.
undefined symbol error happens during linking blink_modules.dll

takuto.ikuta retitled this revision from [WIP] Add /Zc:DllexportInlines option to clang-cl to Add /Zc:DllexportInlines option to clang-cl.
takuto.ikuta edited the summary of this revision. (Show Details)
takuto.ikuta edited the summary of this revision. (Show Details)

PTAL again.

I confirmed that current patch can link chrome and functions with local static variable are exported.

But current ToT clang was not improved well by this patch.
I guess there is some change recently making effect of this patch smaller. Or chromium has many inline functions with static local variable?

takuto.ikuta edited the summary of this revision. (Show Details)Sep 19 2018, 8:57 PM
takuto.ikuta edited the summary of this revision. (Show Details)

Remove unnecessary willHaveBody check condition

takuto.ikuta edited the summary of this revision. (Show Details)Sep 21 2018, 9:28 PM
takuto.ikuta edited the summary of this revision. (Show Details)Sep 21 2018, 10:07 PM
takuto.ikuta edited the summary of this revision. (Show Details)Sep 21 2018, 10:09 PM
takuto.ikuta edited the summary of this revision. (Show Details)Sep 21 2018, 10:12 PM
takuto.ikuta edited the summary of this revision. (Show Details)Sep 23 2018, 9:18 PM

Ping?

This patch reduced obj size largely, and I expect this makes distributed build (like goma) faster by reducing data transferred on network.

clang/lib/Driver/ToolChains/Clang.cpp
5244 ↗(On Diff #164379)

I think avoiding dll export is better to be done before we reached around CodeGenModule. Also removing dll attribute later made it difficult to pass tests.

hans added a comment.Sep 24 2018, 7:28 AM

Ping?

This patch reduced obj size largely, and I expect this makes distributed build (like goma) faster by reducing data transferred on network.

I'll try to look at it this week.

Have you confirmed on some Chromium file, e.g. stroke_opacity_custom.cc (go/stroke-opactity-custom) that the number of functions that we codegen decreases as expected? I'd expect this to save a lot of compile time.

Ping?

This patch reduced obj size largely, and I expect this makes distributed build (like goma) faster by reducing data transferred on network.

I'll try to look at it this week.

Have you confirmed on some Chromium file, e.g. stroke_opacity_custom.cc (go/stroke-opactity-custom) that the number of functions that we codegen decreases as expected? I'd expect this to save a lot of compile time.

Yes, stroke_opacity_custom.obj export 18 symbols, reduced from 781 in PCH build, compile time is reduced from 8.2 seconds to 6.7 seconds.

hans added a comment.Sep 26 2018, 6:33 AM

The static local stuff still makes me nervous. How much does Chromium break if we just ignore the problem? And how does -fvisibility-inlines-hidden handle the issue?

Also, Sema already has a check for static locals in C inline functions:

$ echo "inline void f() { static int x; }" | bin/clang -c -x c -
<stdin>:1:19: warning: non-constant static local variable in inline function may be different in different files [-Wstatic-local-in-inline]
inline void f() { static int x; }
                  ^

could we reuse that check somehow for this case with static locals in dllexport inline methods?

clang/lib/Sema/SemaDecl.cpp
12624 ↗(On Diff #166450)

Hmm, now we're adding an AST walk over all inline methods which probably slows us down a bit. Not sure I have any better ideas though.

In any case, ActOnFinishInlineFunctionDef needs a comment explaining why it's doing this.

clang/lib/Sema/SemaDeclCXX.cpp
5702 ↗(On Diff #166450)

What worries me is that now we have logic in two different places about inheriting the dll attribute.

The new place in ActOnFinishInlineFunctionDef doesn't have the same checks (checking for MS ABI and the template specialization kind) which means the logic for when to inherit is already a little out of sync...

Extract inline function export check to function

Update comment for Sema::ActOnFinishInlineFunctionDef

Thank you for review!

The static local stuff still makes me nervous. How much does Chromium break if we just ignore the problem? And how does -fvisibility-inlines-hidden handle the issue?

I'm not sure how much chromium will break, if we ignore static local variables. But ignoring static local var may cause some unhappy behavior and experience to developer.
I'd like to have check for local static variables as -fvisibility-inlines-hidden does.

-fvisibility-inlines-hidden checks static local around here.
https://github.com/llvm-project/llvm-project-20170507/blob/77b7ca42b0ea496373c5e5a9710b1fe11e1fba68/clang/lib/AST/Decl.cpp#L1265

Also, Sema already has a check for static locals in C inline functions:

$ echo "inline void f() { static int x; }" | bin/clang -c -x c -
<stdin>:1:19: warning: non-constant static local variable in inline function may be different in different files [-Wstatic-local-in-inline]
inline void f() { static int x; }
                  ^

could we reuse that check somehow for this case with static locals in dllexport inline methods?

I think we can do the same thing when we are given -fno-dllexport-inlines around here.
https://github.com/llvm-project/llvm-project-20170507/blob/77b7ca42b0ea496373c5e5a9710b1fe11e1fba68/clang/lib/Sema/SemaDecl.cpp#L6661

But I bit prefer to do export functions with static local variables. Because that is consistent with -fvisibility-inlines-hidden and we won't need more changes to chromium than the CLs in description.

clang/lib/Sema/SemaDeclCXX.cpp
5702 ↗(On Diff #166450)

Agree. Do you allow me to extract check to function and re-use that?

takuto.ikuta added inline comments.Oct 1 2018, 2:53 AM
clang/lib/Sema/SemaDecl.cpp
12624 ↗(On Diff #166450)

Added comment. I think typical code does not have static variables in inline function and this check is worth to be done for the performance improvement.

hans added a comment.Oct 5 2018, 1:06 AM

Ping?

Sorry, I know I'm being slow to look at this :-( I have not forgotten, it's just a lot of other things happening at the same time.

Ping?

Sorry, I know I'm being slow to look at this :-( I have not forgotten, it's just a lot of other things happening at the same time.

I see. Sorry for rushing you. I wait until you have time.

hans added inline comments.Oct 9 2018, 7:44 AM
clang/lib/Sema/SemaDeclCXX.cpp
5599 ↗(On Diff #167683)

Okay, breaking out this logic is a little better, but I still don't like that we now have split the "inherit dllexport attribute" in two places: one for non-inline functions and one for inline (even if they both call this function). It feels like it will be hard to maintain.

Here is another idea:

When we inherit the dllexport attribute to class members, if getLangOpts().DllExportInlines is false, we don't put dllexport on inline functions, but instead we put a new attribute "dllexportstaticlocals".

That attribute only has the effect that it makes static locals exported. We would check for it when computing the linkage of the static local, similar to how it works in hidden functions.

This has two benefits: it doesn't complicate the "inherit dllexport" code very much, and it avoids the need for a separate AST walk of the function.

What do you think?

address comment

Thank you for review!

clang/lib/Sema/SemaDeclCXX.cpp
5599 ↗(On Diff #167683)

I implemented your idea the way I understood.
Use new attribute to check static local var later.

Due to difference of treating between linkage and dll attribute, I inherit dll attribute in Sema/SemaDecl.cpp instead of AST/Decl.cpp

hans added a comment.Oct 10 2018, 6:59 AM

Thanks! I think this is getting close now.

clang/lib/Sema/SemaDecl.cpp
11976 ↗(On Diff #168979)

Why does this need to be a loop? I don't think FunctionDecl's can be nested?

11995 ↗(On Diff #168979)

Isn't it enough that the static local is exported, does the function itself need to be exported too?

clang/lib/Sema/SemaDeclCXX.cpp
5703 ↗(On Diff #168979)

I don't think checking for Microsoft ABI is necessary, just checking the DllExportInlines flag should be enough.

5705 ↗(On Diff #168979)

But I don't understand why the template stuff is checked here...

The way I imagined this, we'd only need to change the code when creating NewAttr below..
Something like

NewAttr = ClassAtr->clone()...
if (!getLandOpts().DllExportInlines() && Member is an inline method)
  NewAttr = our new dllimport/export static locals attribute

What do you think?

takuto.ikuta marked an inline comment as done.

address comment

Thank you for review! I updated the code.

clang/lib/Sema/SemaDecl.cpp
11976 ↗(On Diff #168979)

This is for static local var in lambda function.
static_x's ParentFunction does not become f().

class __declspec(dllexport) C {
  int f() {
    return ([]() { static int static_x; return ++static_x; })();
  }
};
11995 ↗(On Diff #168979)

Adding dllexport only to variable does not export variable when the function is not used.
But dllimport is not necessary for function, removed.

clang/lib/Sema/SemaDeclCXX.cpp
5705 ↗(On Diff #168979)

But I don't understand why the template stuff is checked here...

Templated inline function is not always inlined, it seems depending on optimization level.

I updated the code as you wrote in later part of comment.

hans added inline comments.Oct 11 2018, 6:58 AM
clang/lib/Sema/SemaDecl.cpp
11976 ↗(On Diff #168979)

I don't think the lambda (or its static local) should be exported in this case.

11995 ↗(On Diff #168979)

Hmm, I wonder if we could fix that instead? That is, export the variable in that case even if the function is not used?

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

I still don't understand why we need these checks for template instantiations. Why does it matter whether the functions get inlined or not?

takuto.ikuta added inline comments.Oct 11 2018, 8:04 AM
clang/lib/Sema/SemaDecl.cpp
11976 ↗(On Diff #168979)

If we don't export static_x, dllimport library cannot find static_x when linking?

11995 ↗(On Diff #168979)

I see. I'll investigate which code path emits static variables

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

When function of dllimported class is not inlined, such funtion needs to be dllexported from implementation library.

c.h

template<typename T>
class EXPORT C {
 public:
  void f() {}
};

cuser.cc

#include "c.h"

void cuser() {
  C<int> c;
  c.f();  // This function may not be inlined when EXPORT is __declspec(dllimport), so link may fail.
}

When cuser.cc and c.h are built to different library, cuser.cc needs to be able to see dllexported C::f() when C::f() is not inlined.
This is my understanding.

hans added inline comments.Oct 11 2018, 8:26 AM
clang/lib/Sema/SemaDecl.cpp
11976 ↗(On Diff #168979)

I believe even if C is marked dllimport, the lambda will not be dllimport. The inline definition will be used.

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

Your example doesn't use explicit instantiation definition or declaration, so doesn't apply here I think.

As long as the dllexport and dllimport attributes matches it's fine. It doesn't matter whether the function gets inlined or not, the only thing that matters is that if it's marked dllimport on the "consumer side", it must be dllexport when building the dll.

Address comment

clang/lib/Sema/SemaDecl.cpp
11976 ↗(On Diff #168979)

Do you say importing/implementation library should have different instance of static_x here?
Current clang does not generate such obj.

I think static_x should be dllimported. But without this loop, static_x cannot find FD of C::f() having DLLImport/ExportStaticLocalAttr and static_x won't be treated as imported/exported variable.

And if static_x is not exported, importing library and implementation library will not have the same instance of static_x when C::f() is inlined.

11995 ↗(On Diff #168979)

static local variable seems to be exported in
https://github.com/llvm-project/llvm-project-20170507/blob/f54514c50350743d3d1a3c5aa80cbe4bb449eb71/clang/lib/CodeGen/CGDecl.cpp#L378

But emitting static local var only by bypassing function emission seems difficult.
Let me leave as-is here.

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

Hmm, I changed the linkage for functions having DLLExport/ImportStaticLocal Attr.

takuto.ikuta retitled this revision from Add /Zc:DllexportInlines option to clang-cl to [WIP] Add /Zc:DllexportInlines option to clang-cl.Oct 14 2018, 10:08 PM
takuto.ikuta retitled this revision from [WIP] Add /Zc:DllexportInlines option to clang-cl to Add /Zc:DllexportInlines option to clang-cl.

Export function inside explicit template instantiation definition

Remove unnecessary attr creation

Fix linkage for inline function of explicit template instantiation declaration

remove comment out code

Hans, I addressed all your comments.
How do you think about current implementation?

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

I changed linkage in ASTContext so that inline function is emitted when it is necessary when we use fno-dllexport-inlines.

nhaehnle removed a subscriber: nhaehnle.Oct 16 2018, 4:57 AM

ping? Can I go forward in this way?

hans added a comment.Oct 23 2018, 4:45 AM

Hans, I addressed all your comments.
How do you think about current implementation?

Just a few questions, but I think it's pretty good.

clang/include/clang/Basic/Attr.td
2688 ↗(On Diff #169792)

The comment should explain what the attribute does, also the dllimport one below.

clang/lib/AST/ASTContext.cpp
9552 ↗(On Diff #169792)

Can you give an example for why this is needed?

clang/lib/Sema/SemaDecl.cpp
11976 ↗(On Diff #168979)

Look at what MSVC does:

>type a.cc
struct __declspec(dllexport) S {
  int f() {
    static int y;
    return ([]() { static int static_x; return ++static_x; })() + y;
  }
};

void use(S *s) { s->f(); }

>cl -c a.cc && dumpbin /directives a.obj
Microsoft (R) C/C++ Optimizing Compiler Version 19.12.25834 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

a.cc
Microsoft (R) COFF/PE Dumper Version 14.12.25834.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file a.obj

File Type: COFF OBJECT

   Linker Directives
   -----------------
   /DEFAULTLIB:LIBCMT
   /DEFAULTLIB:OLDNAMES
   /EXPORT:?f@S@@QAEHXZ
   /EXPORT:??4S@@QAEAAU0@ABU0@@Z
   /EXPORT:??4S@@QAEAAU0@$$QAU0@@Z
   /EXPORT:?y@?1??f@S@@QAEHXZ@4HA,DATA

  Summary

           8 .bss
          50 .chks64
          64 .debug$S
          A6 .drectve
          6A .text$mn

As you can see S::f and S::f::y is exported, but the lambda and its static local are not.

Oh, but I guess you're asking what if we don't dllexport S::f anymore because we're not dllexporting inline methods anymore... hmm, yes, then I do suppose it needs to be exported so maybe this is right after all.

takuto.ikuta marked an inline comment as done.

Added explanation comment for added attributes and rebased

clang/lib/AST/ASTContext.cpp
9552 ↗(On Diff #169792)

Sorry, this change does not need. Removed.

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

I revived explicit template instantiation check.
Found that this is necessary.

For explicit template instantiation, inheriting dll attribute from function for local static var is run before inheriting dll attribute from class for member functions. So I cannot detect local static var of inline function after class level dll attribute processing for explicit template instantiation.

hans added a comment.Oct 29 2018, 7:17 AM

I've been thinking more about your example with static locals in lambda and how this works with regular dllexport.

It got me thinking more about the problem of exposing inline functions from a library. For example:

lib.h:

#ifndef LIB_H
#define LIB_H

int foo();

struct __declspec(dllimport) S {
  int bar() { return foo(); }
};

#endif

lib.cc:

#include "lib.h"

int foo() { return 123; }

main.cc:

#include "lib.h"

int main() {
  S s;
  return s.bar();
}

Here, Clang will not emit the body of S::bar(), because it references the non-dllimport function foo(), and trying to referencing foo() outside the library would result in a link error. This is what the DLLImportFunctionVisitor is used for. For the same reason, MSVC will also not inline dllimport functions.

Now, with /Zc:dllexportInlines-, we no longer mark S::bar() dllimport, and so we do emit it, causing that link error. The same problem happens with -fvisibility-inlines-hidden: substitute the declspec above for __attribute__((visibility("default"))) above and try it:

$ g++ -fvisibility=hidden -fvisibility-inlines-hidden -fpic -shared -o lib.so lib.cc
$ g++ main.cc lib.so
/tmp/cc557J3i.o: In function `S::bar()':
main.cc:(.text._ZN1S3barEv[_ZN1S3barEv]+0xd): undefined reference to `foo()'

So this is a known issue with -fvisibility-inlines-hidden. This doesn't come up often in practice, but when it does the developer needs to deal with it.

However, the static local problem is much scarier, because that leads to run-time bugs:

lib.h:

#ifndef LIB_H
#define LIB_H

inline int foo() { static int x = 0; return x++; }

struct __attribute__((visibility("default"))) S {
  int bar() { return foo(); }
  int baz();
};

#endif

lib.cc:

#include "lib.h"

int S::baz() { return foo(); }

main.cc:

#include <stdio.h>
#include "lib.h"

int main() {
  S s;
  printf("s.bar(): %d\n", s.bar());
  printf("s.baz(): %d\n", s.baz());
  return 0;
}

If we build the program normally, we get the expected output:

$ g++ lib.cc main.cc && ./a.out
s.bar(): 0
s.baz(): 1

but building as a shared library shows the problem:

$ g++ -fvisibility=hidden -fvisibility-inlines-hidden -fpic -shared -o lib.so lib.cc
$ g++ main.cc lib.so
$ LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH ./a.out
s.bar(): 0
s.baz(): 0

Oops.

This does show that it's a pre-existing problem with the model of not exporting inline functions though. I don't think we need to solve this specifically for Windows, I think we should match what -fvisibility-inlines-hidden is doing.

And what about the lambdas?

lib.h:

#ifndef LIB_H
#define LIB_H

struct __attribute__((visibility("default"))) S {
  int bar() { return ([]() { static int x; return x++; })(); }
  int baz();
};

#endif

lib.cc:

#include "lib.h"

int S::baz() { return bar(); }
$ g++ -fvisibility=hidden -fvisibility-inlines-hidden -fpic -shared -o lib.so lib.cc && g++ main.cc lib.so && LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH ./a.out
s.bar(): 0
s.baz(): 1

Interesting, the lambda function and its static local are not hidden.

Either that's because they're applying the "static locals of hidden functions in visible decl contexts are visible" thing. Or alternatively, they never applied the hidden visibility to the lambda in the first place.

I'm not entirely sure what this means for the dllexport case though. Maybe the loop in the current patch is fine.

hans added inline comments.Oct 29 2018, 7:32 AM
clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

Oh I see, it's a static local problem..
Can you provide a concrete example that does not work without your check?
Maybe this is the right thing to do, but I would like to understand exactly what the problem is.

I've been thinking more about your example with static locals in lambda and how this works with regular dllexport.

It got me thinking more about the problem of exposing inline functions from a library. For example:

lib.h:

#ifndef LIB_H
#define LIB_H

int foo();

struct __declspec(dllimport) S {
  int bar() { return foo(); }
};

#endif

lib.cc:

#include "lib.h"

int foo() { return 123; }

main.cc:

#include "lib.h"

int main() {
  S s;
  return s.bar();
}

Here, Clang will not emit the body of S::bar(), because it references the non-dllimport function foo(), and trying to referencing foo() outside the library would result in a link error. This is what the DLLImportFunctionVisitor is used for. For the same reason, MSVC will also not inline dllimport functions.

Now, with /Zc:dllexportInlines-, we no longer mark S::bar() dllimport, and so we do emit it, causing that link error. The same problem happens with -fvisibility-inlines-hidden: substitute the declspec above for __attribute__((visibility("default"))) above and try it:

$ g++ -fvisibility=hidden -fvisibility-inlines-hidden -fpic -shared -o lib.so lib.cc
$ g++ main.cc lib.so
/tmp/cc557J3i.o: In function `S::bar()':
main.cc:(.text._ZN1S3barEv[_ZN1S3barEv]+0xd): undefined reference to `foo()'

So this is a known issue with -fvisibility-inlines-hidden. This doesn't come up often in practice, but when it does the developer needs to deal with it.

Yeah, that is the reason of few chromium code changes.
https://chromium-review.googlesource.com/c/chromium/src/+/1212379

However, the static local problem is much scarier, because that leads to run-time bugs:

lib.h:

#ifndef LIB_H
#define LIB_H

inline int foo() { static int x = 0; return x++; }

struct __attribute__((visibility("default"))) S {
  int bar() { return foo(); }
  int baz();
};

#endif

lib.cc:

#include "lib.h"

int S::baz() { return foo(); }

main.cc:

#include <stdio.h>
#include "lib.h"

int main() {
  S s;
  printf("s.bar(): %d\n", s.bar());
  printf("s.baz(): %d\n", s.baz());
  return 0;
}

If we build the program normally, we get the expected output:

$ g++ lib.cc main.cc && ./a.out
s.bar(): 0
s.baz(): 1

but building as a shared library shows the problem:

$ g++ -fvisibility=hidden -fvisibility-inlines-hidden -fpic -shared -o lib.so lib.cc
$ g++ main.cc lib.so
$ LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH ./a.out
s.bar(): 0
s.baz(): 0

Oops.

This does show that it's a pre-existing problem with the model of not exporting inline functions though. I don't think we need to solve this specifically for Windows, I think we should match what -fvisibility-inlines-hidden is doing.

Currently this CL doesn't take care of inline function that is not member of a class.

lib.h:

#ifndef LIB_H
#define LIB_H

struct __attribute__((visibility("default"))) S {
  int bar() {
    static int x = 0; return x++;
  }
  int baz();
};

#endif

lib.cc:

#include "lib.h"

int S::baz() {
  return bar();
}

Then, static local in inline function is treated correctly.

$ g++ -fvisibility=hidden -fvisibility-inlines-hidden -fpic -shared -o lib.so lib.cc
$ g++ main.cc lib.so
$ LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH ./a.out
s.bar(): 0
s.baz(): 1

This is the same behavior with /Zc:dllexportInlines-.

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

For the following code

template<typename T>
class M{
public:

  T Inclass() {
    static T static_x;
    ++static_x;
    return static_x;
  }
};

template class __declspec(dllexport) M<int>;

extern template class __declspec(dllimport) M<short>;

int f (){
  M<int> mi;
  M<short> ms;
  return mi.Inclass() + ms.Inclass();
}

llvm code without instantiation check become like below. Both inline functions of M<int> and M<short> is not dllimported/exported.

$"?Inclass@?$M@H@@QEAAHXZ" = comdat any

$"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA" = comdat any

@"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA" = linkonce_odr dso_local global i32 0, comdat, align 4

; Function Attrs: noinline nounwind optnone
define weak_odr dso_local i32 @"?Inclass@?$M@H@@QEAAHXZ"(%class.M* %this) #0 comdat align 2 {
entry:
  %this.addr = alloca %class.M*, align 8
  store %class.M* %this, %class.M** %this.addr, align 8
  %this1 = load %class.M*, %class.M** %this.addr, align 8
  %0 = load i32, i32* @"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA", align 4
  %inc = add nsw i32 %0, 1
  store i32 %inc, i32* @"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA", align 4
  %1 = load i32, i32* @"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA", align 4
  ret i32 %1
}

; Function Attrs: noinline nounwind optnone
define dso_local i32 @"?f@@YAHXZ"() #0 {
entry:
  %mi = alloca %class.M, align 1
  %ms = alloca %class.M.0, align 1
  %call = call i32 @"?Inclass@?$M@H@@QEAAHXZ"(%class.M* %mi)
  %call1 = call i16 @"?Inclass@?$M@F@@QEAAFXZ"(%class.M.0* %ms)
  %conv = sext i16 %call1 to i32
  %add = add nsw i32 %call, %conv
  ret i32 %add
}

declare dso_local i16 @"?Inclass@?$M@F@@QEAAFXZ"(%class.M.0*) #1

With the check, both functions are dllimported/exported and static local vars will be treated correctly.

$"??4?$M@H@@QEAAAEAV0@AEBV0@@Z" = comdat any

$"?Inclass@?$M@H@@QEAAHXZ" = comdat any

$"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA" = comdat any

@"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA" = linkonce_odr dso_local global i32 0, comdat, align 4

; Function Attrs: noinline nounwind optnone
define weak_odr dso_local dllexport dereferenceable(1) %class.M* @"??4?$M@H@@QEAAAEAV0@AEBV0@@Z"(%class.M* %this, %class.M* dereferenceable(1)) #0 comdat align 2 {
entry:
  %.addr = alloca %class.M*, align 8
  %this.addr = alloca %class.M*, align 8
  store %class.M* %0, %class.M** %.addr, align 8
  store %class.M* %this, %class.M** %this.addr, align 8
  %this1 = load %class.M*, %class.M** %this.addr, align 8
  ret %class.M* %this1
}

; Function Attrs: noinline nounwind optnone
define weak_odr dso_local dllexport i32 @"?Inclass@?$M@H@@QEAAHXZ"(%class.M* %this) #0 comdat align 2 {
entry:
  %this.addr = alloca %class.M*, align 8
  store %class.M* %this, %class.M** %this.addr, align 8
  %this1 = load %class.M*, %class.M** %this.addr, align 8
  %0 = load i32, i32* @"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA", align 4
  %inc = add nsw i32 %0, 1
  store i32 %inc, i32* @"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA", align 4
  %1 = load i32, i32* @"?static_x@?2??Inclass@?$M@H@@QEAAHXZ@4HA", align 4
  ret i32 %1
}

; Function Attrs: noinline nounwind optnone
define dso_local i32 @"?f@@YAHXZ"() #0 {
entry:
  %mi = alloca %class.M, align 1
  %ms = alloca %class.M.0, align 1
  %call = call i32 @"?Inclass@?$M@H@@QEAAHXZ"(%class.M* %mi)
  %call1 = call i16 @"?Inclass@?$M@F@@QEAAFXZ"(%class.M.0* %ms)
  %conv = sext i16 %call1 to i32
  %add = add nsw i32 %call, %conv
  ret i32 %add
}

declare dllimport i16 @"?Inclass@?$M@F@@QEAAFXZ"(%class.M.0*) #1
hans added a comment.Oct 30 2018, 2:30 AM
$ g++ -fvisibility=hidden -fvisibility-inlines-hidden -fpic -shared -o lib.so lib.cc
$ g++ main.cc lib.so
/tmp/cc557J3i.o: In function `S::bar()':
main.cc:(.text._ZN1S3barEv[_ZN1S3barEv]+0xd): undefined reference to `foo()'

So this is a known issue with -fvisibility-inlines-hidden. This doesn't come up often in practice, but when it does the developer needs to deal with it.

Yeah, that is the reason of few chromium code changes.
https://chromium-review.googlesource.com/c/chromium/src/+/1212379

Ah, thanks! I hadn't seen what the fixes would look like.

However, the static local problem is much scarier, because that leads to run-time bugs:

Currently this CL doesn't take care of inline function that is not member of a class.

lib.h:

#ifndef LIB_H
#define LIB_H

struct __attribute__((visibility("default"))) S {
  int bar() {
    static int x = 0; return x++;
  }
  int baz();
};

#endif

lib.cc:

#include "lib.h"

int S::baz() {
  return bar();
}

Then, static local in inline function is treated correctly.

I think it's possible to get the same problem with member functions, but that doesn't matter, it's an existing problem so we don't need to solve it, just be aware.

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

Thanks! This seems like a problem with the current code though, and hopefully we can fix it instead of working around it in your patch.

A shorter version of your example:

template <typename T> struct S {
  int foo() { static int x; return x++; }
};

template struct __declspec(dllexport) S<int>;

int f() {
  S<int> a;
  return a.foo();
}

Clang will dllexport S<int>::foo(), but not the static local. That's a bug.

I'll see if I can fix (tracking in https://bugs.llvm.org/show_bug.cgi?id=39496).

hans added inline comments.Oct 31 2018, 1:51 AM
clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

The fix for that was committed in r345699. I think we can now remove the checks for instantiation here.

Rebased to take r345699

Thank you for quick fix!

clang/lib/AST/ASTContext.cpp
9552 ↗(On Diff #169792)

Sorry, this change is necessary still.

Without this, definition of inline function in explicit template instantiation declaration is not be emitted, due to GVA_AvailableExternally linkage.
But we stop exporting definition of inline function in explicit template instantiation definition too.

So without this, definition of dllimported inline function of explicit template instantiation declaration won't be available.

clang/lib/Sema/SemaDeclCXX.cpp
5719 ↗(On Diff #169179)

Thanks! I removed this check again.

hans added inline comments.Oct 31 2018, 2:52 AM
clang/lib/AST/ASTContext.cpp
9552 ↗(On Diff #169792)

Can you provide a code example of why this is needed?

takuto.ikuta added inline comments.Oct 31 2018, 3:45 AM
clang/lib/AST/ASTContext.cpp
9552 ↗(On Diff #169792)

If we don't change function linkage, following code will be compiled like below with -fno-dllexport-inlines.

template<typename>
class M{
public:
  void foo() {}
};

template class __declspec(dllexport) M<int>;

extern template class __declspec(dllimport) M<short>;

void f (){
  M<int> mi;
  mi.foo();

  M<short> ms;
  ms.foo();
}
$"?foo@?$M@H@@QEAAXXZ" = comdat any

; Function Attrs: noinline nounwind optnone
define weak_odr dso_local void @"?foo@?$M@H@@QEAAXXZ"(%class.M* %this) #0 comdat align 2 {
entry:
  %this.addr = alloca %class.M*, align 8
  store %class.M* %this, %class.M** %this.addr, align 8
  %this1 = load %class.M*, %class.M** %this.addr, align 8
  ret void
}

; Function Attrs: noinline nounwind optnone
define dso_local void @"?f@@YAXXZ"() #0 {
entry:
  %mi = alloca %class.M, align 1
  %ms = alloca %class.M.0, align 1
  call void @"?foo@?$M@H@@QEAAXXZ"(%class.M* %mi)
  call void @"?foo@?$M@F@@QEAAXXZ"(%class.M.0* %ms)
  ret void
}

declare dso_local void @"?foo@?$M@F@@QEAAXXZ"(%class.M.0*) #1

M<short>::foo() is declared, but inline function is not dllexported (See M<int>::foo() is not dllexported). So this code cannot be linked because definition of M<short>::foo does not exist. If the function is properly inlined, we don't need to have this code. But I'm not sure why the function is not inlined.

hans added inline comments.Oct 31 2018, 6:30 AM
clang/lib/AST/ASTContext.cpp
9552 ↗(On Diff #169792)

Interesting. I wonder how -fvisibility-inlines-hidden handles this...

template <typename> struct S {
  void foo() {}
};

template struct S<int>;

void use() {
  S<int> s;
  s.foo();
}

$ g++ -fvisibility-inlines-hidden -c a.cc && objdump -t a.o | grep _ZN1SIiE3fooEv
0000000000000000 l    d  .text._ZN1SIiE3fooEv	0000000000000000 .text._ZN1SIiE3fooEv
0000000000000000  w    F .text._ZN1SIiE3fooEv	000000000000000b _ZN1SIiE3fooEv  <--- Not hidden.

(If I comment out the explicit instantiation definition above, foo() is hidden as expected.)

Okay, it seems that for explicit instantiation definitions, -fvisibility-inlines-hidden does not apply.

And thinking more about it, that makes sense.

I don't think we should change the linkage like this though, I think we should just not apply the /Zc:dllexportInlines- for explicit instantiation decls and definitions in checkClassLevelDLLAttribute(). I realize you had code to check for this before, but now we have a good and well understood reason.

export/import explicit template instantiation function

takuto.ikuta marked an inline comment as done.Oct 31 2018, 7:36 AM
takuto.ikuta added inline comments.
clang/lib/AST/ASTContext.cpp
9552 ↗(On Diff #169792)

Thank you for confirmation. I revived the check.

rnk added inline comments.Oct 31 2018, 2:46 PM
clang/lib/AST/ASTContext.cpp
9552 ↗(On Diff #169792)

I agree, this option should not affect explicit instantiations. Inline functions from explicit instantiations should still be exported+weak_odr, and imported+available_externally for extern declarations.

rnk added inline comments.Oct 31 2018, 2:48 PM
clang/include/clang/Basic/LangOptions.h
246 ↗(On Diff #171916)

We should define this in the LangOptions.def file.

takuto.ikuta marked 2 inline comments as done.

Added option to LangOptions.def

Thank you for review!

clang/include/clang/Basic/LangOptions.h
246 ↗(On Diff #171916)

Moved.

hans added a comment.Nov 1 2018, 1:24 AM

Thank you Takuto! I think the functionality looks great now: it's not too complicated :-)

I only have comments about the test.

clang/test/CodeGenCXX/dllexport-no-dllexport-inlines.cpp
3 ↗(On Diff #172090)

Instead of "DEFAULT", I think "CHECK" is the default prefix to use when using the same prefix for multiple invocations.

Also, instead of INLINE and NOINLINE (which sounds like it's about inlining), I'd suggest perhaps "EXPORTINLINES" and "NOEXPORTINLINES" or something like that.

6 ↗(On Diff #172090)

Instead of using -O0 here and above, consider using -O1 -disable-llvm-passes so we get available_externally functions.

9 ↗(On Diff #172090)

This change doesn't have any effect on non-member functions, so I don't think we need these tests.

42 ↗(On Diff #172090)

Can you move the checks down to where the variable/function is defined? That makes it easier to read the test I think.

49 ↗(On Diff #172090)

I think this test case should be first in the file, because it's really showing the core of this feature.

I would suggest just calling "ExportedClass" to make it simpler (if it was a template, then we'd put template in the name). Also, I suggest making it a struct instead so you don't have to write "public:".

51 ↗(On Diff #172090)

This check is only using part of the mangled name..
But I don't think the patch does anything special for this kind of function, so I'd just skip this test.

58 ↗(On Diff #172090)

This doesn't seem to be used for anything?

73 ↗(On Diff #172090)

Why foceinline? I think it should be just "inline". This goes for all use of forceinline in this file.

76 ↗(On Diff #172090)

Hmm, I would have thought we should export this function.

102 ↗(On Diff #172090)

I'm not sure that OutclassDefFunc() adds much value to the test. Also templateValue below. I think they can be removed.

122 ↗(On Diff #172090)

We should also have a test with implicit instantiation, and then the inline function should not be exported when using /Zc:dllexportInlines-.

132 ↗(On Diff #172090)

Again, I think we can drop the out-of-line function since this change should not affect those.

144 ↗(On Diff #172090)

This case is not interesting, since it has nothing to do with dllexport/import, I think it can be removed.

148 ↗(On Diff #172090)

Ditto.

165 ↗(On Diff #172090)

Again, make the classes "struct" and remove the "public:" lines :-)

166 ↗(On Diff #172090)

I don't see any test lines that reference this inner class. I guess it's not interesting?

172 ↗(On Diff #172090)

This check looks wrong. I think there should be a dllimport declaration for this function? (Or an available_externally definition if we use -O1?)

181 ↗(On Diff #172090)

Let's put this first in the class, since it's the "basic" case, and the static local is more of a special case.

hans added inline comments.Nov 1 2018, 7:37 AM
clang/include/clang/Driver/CLCompatOptions.td
336 ↗(On Diff #172090)

Just one more small thing I remembered: please add a test in tests/Driver/cl-options.c to make sure we translate this to the cc1 flag correctly.

takuto.ikuta marked 17 inline comments as done.

Simplify test and fix some behavior.

Hans, thank you for review! I addressed all your comment and fixed small behavior.

clang/test/CodeGenCXX/dllexport-no-dllexport-inlines.cpp
76 ↗(On Diff #172090)

I see.
Added MD->isThisDeclarationADefinition() check in Sema::checkClassLevelDLLAttribute
This will make change like https://chromium-review.googlesource.com/c/v8/v8/+/1186017 unnecessary.

122 ↗(On Diff #172090)

Added test and MD->isImplicitlyInstantiable() check for the test.

172 ↗(On Diff #172090)

I think it is OK not dllimporting this function as far as static variable is dllimported and definition of this function is emitted.

Added cl driver flag test

takuto.ikuta marked an inline comment as done.Nov 2 2018, 12:51 AM

I missed the comment about driver flag test.

hans added a comment.Nov 2 2018, 1:54 AM

Thanks! I don't think the new isThisDeclarationADefinition() and isImplicitlyInstantiable() checks are right. Besides that, I only have a few more comments about the test.

clang/lib/Sema/SemaDeclCXX.cpp
5715 ↗(On Diff #172317)

The isThisDeclarationADefinition() and isImplicitlyInstantiable() checks don't look right. Just checking isInlined() should be enough.

clang/test/CodeGenCXX/dllexport-no-dllexport-inlines.cpp
9 ↗(On Diff #172317)

Why this separate STATIC invocation? It looks just the same as the invocation above.

Oh I see, it's because the test is mixing -DAG and -NOT lines, which breaks things.

I have a suggestion for avoiding the -NOT lines below, and that means this separate invocation to check for statics (and the other one below) can be removed.

25 ↗(On Diff #172317)

Instead of a -NOT check, it would be better to "use" the function somewhere so that it's emitted, and check that it's not dllexport. That will make it easier to see the difference between NOEXPORTINLINE and EXPORTINLINE.

Also it avoids -NOT checks, which don't interact well with -DAG checks, and would let you remove the separate STATIC invocation.

44 ↗(On Diff #172317)

Having an inline function and not defining it like this is not so great, especially in a dllexport class. I don't think there's any need to change the behaviour here or test for it.

47 ↗(On Diff #172317)

But with -fno-dllexport-inlines, I don't think it should be exported.

It really doesn't matter if the body is inside or outside the class -- it's still an inline member function, and so the flag should apply.

53 ↗(On Diff #172317)

I'd suggest calling it "OutoflineDefFunc()" instead.

65 ↗(On Diff #172317)

I don't think there's any reason to dllexport the "user" function.

81 ↗(On Diff #172317)

I don't think this instantiation is different from the ​​template class TemplateExportedClass<B22>; one below, so I think we can skip it.

83 ↗(On Diff #172317)

There should also be a check to make sure it's exported also in the NOEXPORTINLINE case.

88 ↗(On Diff #172317)

Thanks, this is the implicit instantiation case. It would be good to have an inline function with a static local to make sure it does get exported even with the flag.

takuto.ikuta marked 10 inline comments as done.

Address comment

added a few more check

Thank you for quick review!

clang/lib/Sema/SemaDeclCXX.cpp
5715 ↗(On Diff #172317)

I see.

hans accepted this revision.Nov 2 2018, 3:37 AM

Assuming the test still passes when you put the EXPORTINLINE filecheck prefix back, looks good to me!

clang/test/CodeGenCXX/dllexport-no-dllexport-inlines.cpp
9 ↗(On Diff #172332)

Did the "EXPORTINLINE" check-prefix get lost here?

This revision is now accepted and ready to land.Nov 2 2018, 3:37 AM
takuto.ikuta marked an inline comment as done.

Fix check-prefix

takuto.ikuta added a comment.EditedNov 2 2018, 6:45 AM

Thank you! I'll submit this if other people do not have any comments.

clang/test/CodeGenCXX/dllexport-no-dllexport-inlines.cpp
9 ↗(On Diff #172332)

Good catch!

This revision was automatically updated to reflect the committed changes.