This is an archive of the discontinued LLVM Phabricator instance.

Mangling of anonymous unions
ClosedPublic

Authored by eastig on Nov 17 2014, 9:46 AM.

Details

Summary

Anonymous union issues:

A name in the following code is mangled incorrectly:

inline int f1 () {
  static union {
    int a;
    long int b;
  };

  static union {
    int c;
    double d;
  };

  return a+c;
}

int f2 (int a) {
  return a+f1();
}

The name of the second union is mangled as _ZZ2f1vE1c_0 instead of _ZZ2f1vE1c. According to the Itanium C++ ABI '5.1.6 Scope Encoding':
The discriminator is used only for the second and later occurrences of the same "top-level" name within a single function (since "unnamed types" are distinctly numbered, they never include a discriminator).
In the program above 'int c' is the first occurrence of the name 'c'. So no discriminator must be used.

Diff Detail

Repository
rL LLVM

Event Timeline

eastig updated this revision to Diff 16292.Nov 17 2014, 9:46 AM
eastig retitled this revision from to Mangling of local names: anonymous unions, names in dead code.
eastig updated this object.
eastig edited the test plan for this revision. (Show Details)
eastig added a subscriber: Unknown Object (MLST).Nov 18 2014, 7:55 AM
rnk added a subscriber: rnk.Nov 19 2014, 9:52 AM

I don't think we should change our behavior w.r.t. static locals of non-inline functions. This behavior should stay the same:

void f() {
  if (0) {
    static int a = 0;
  }
  static int a = 0; // _ZZ1fvE1a
}

inline void f() {
  if (0) {
    static int a = 0;
  }
  static int a = 0; // _ZZ1fvE1a_0
}

The reason is that in the non-inline case, we don't want to waste time during Sema building up tables on the ASTContext. The name is not ABI relevant, and either name is as good as the next.

I agree with your assessment of the static local union case, though. It's not clear to me how your fix works, but maybe -ast-dump isn't complete enough for me to see the IndirectFieldDecl that you're finding.

eastig added a comment.EditedNov 20 2014, 2:07 AM
In D6295#3, @rnk wrote:

I don't think we should change our behavior w.r.t. static locals of non-inline functions. This behavior should stay the same:

void f() {
  if (0) {
    static int a = 0;
  }
  static int a = 0; // _ZZ1fvE1a
}

inline void f() {
  if (0) {
    static int a = 0;
  }
  static int a = 0; // _ZZ1fvE1a_0
}

The reason is that in the non-inline case, we don't want to waste time during Sema building up tables on the ASTContext. The name is not ABI relevant, and either name is as good as the next.

I agree with your assessment of the static local union case, though. It's not clear to me how your fix works, but maybe -ast-dump isn't complete enough for me to see the IndirectFieldDecl that you're finding.

You mean that this will hurt performance, don't you?

The code that takes into account declarations in dead code was added in revision 185986 by an engineer from Apple as a fix of the issue rdar://problem/14204721. I do not have access to Apple's bug tracking system to see what was the issue.

The current issue is that different enumeration schemas are used by getNextDiscriminator. And they produce different names depending on the function specifier 'inline'.
For example:

f.h

inline int f(int v) {
  static union {
    int a;
    int b;
  };

  static union {
    union {
      union {
        int c;                                                                                                                                                                                                                                
      };
      float d;
    };
  };

  int r = a+c;

  a = v;
  c = v+1;

  return r;
}

m_a.cpp

#include "f.h"                                                                                                                                                                                                                                

int f_a(int a) {
  return a+f(a);
}

m_b.cpp

#include "f.h"                                                                                                                                                                                                                                

int f_b(int b) {
  return b+f(b);
}

main.cpp

extern int f_a(int);
extern int f_b(int);

int main(int argc, const char* argv[]) {
  int r1 = f_a(argc);
  int r2 = f_b(argc+1);
  return r1 + r2;                                                                                                                                                                                                                             
}

According to the C++ standard [dcl.fct.spec]: A static local variable in an extern inline function always refers to the same object.
This works when everything is compile by one compiler but it does not work when m_a.cpp is compiled with clang and other files are compiled with g++:

$ /tools/collections/clang/3.5.0/3/linux_3.2-ubuntu_12.04-x86_64/bin/clang++ -c -O3 m_a.cpp
$ /tools/collections/clang/3.5.0/3/linux_3.2-ubuntu_12.04-x86_64/bin/clang++ -c -O3 m_b.cpp
$ /tools/collections/clang/3.5.0/3/linux_3.2-ubuntu_12.04-x86_64/bin/clang++ -O3 main.cpp m_a.o m_b.o
$ ./a.out 
$ echo $?
6

$ g++ -c -O3 m_a.cpp
$ g++ -c -O3 m_b.cpp
$ g++ -O3 main.cpp m_b.o m_a.o
$ ./a.out 
$ echo $?
6

$ /tools/collections/clang/3.5.0/3/linux_3.2-ubuntu_12.04-x86_64/bin/clang++ -c -O3 m_a.cpp
$ g++ -O3 main.cpp m_b.o m_a.o
$ ./a.out 
$ echo $?
4
rnk added a comment.Nov 20 2014, 10:50 AM
In D6295#5, @eastig wrote:

You mean that this will hurt performance, don't you?

Yes, there is an (admittedly minor) performance cost to numbering static locals of non-inline functions when no numbering is required.

The code that takes into account declarations in dead code was added in revision 185986 by an engineer from Apple as a fix of the issue rdar://problem/14204721. I do not have access to Apple's bug tracking system to see what was the issue.

Yep, I'm familiar with the work Eli did here because I helped make the Microsoft ABI version of this work.

The current issue is that different enumeration schemas are used by getNextDiscriminator. And they produce different names depending on the function specifier 'inline'.

They do. This is by design. Can you elaborate on why this is a problem? The names of static locals in non-inline functions are not ABI-relevant.

According to the C++ standard [dcl.fct.spec]: A static local variable in an extern inline function always refers to the same object.

Note that the standard is explicitly talking about an "extern inline function" here.

This works when everything is compile by one compiler but it does not work when m_a.cpp is compiled with clang and other files are compiled with g++:

Yes, the example you provided is the static local anonymous union case, which I agree needs to be fixed. Can you reupload a new patch which is limited to handling static local anonymous unions in inline functions, and avoid changes to non-ABI relevant names?

Yes, the example you provided is the static local anonymous union case, which I agree needs to be fixed. Can you reupload a new patch which is limited to handling static local anonymous unions in inline functions, and avoid changes to non-ABI relevant names?

Yes. I have implemented such fix. I am testing it.

eastig updated this revision to Diff 16563.Nov 24 2014, 6:52 AM
eastig retitled this revision from Mangling of local names: anonymous unions, names in dead code to Mangling of anonymous unions.
eastig updated this object.
eastig edited the test plan for this revision. (Show Details)

Updated patch according to comments

rsmith added inline comments.Nov 25 2014, 2:28 PM
lib/AST/ItaniumMangle.cpp
3913–3925 ↗(On Diff #16563)

Why do you need to do this? Since this code is only reached for non-externally-visible entities, a single counter for the context (like we used to use) seems like it should be sufficient.

eastig updated this revision to Diff 16636.Nov 26 2014, 2:29 AM
eastig updated this object.
rsmith added inline comments.Nov 26 2014, 7:03 PM
lib/AST/ItaniumCXXABI.cpp
65–67 ↗(On Diff #16636)

How is it possible to get here with a VarDecl with no identifier, but where it's not an anonymous union? Please either add a comment explaining or change this to an assert/castAs.

eastig updated this revision to Diff 16676.Nov 27 2014, 3:16 AM

Cleaned up code:

  • no need of 'if' checks as VarDecl without an identifier should be an anonymous union declaration.
  • created function 'findAnonymousUnionVarDeclName'
eastig added a comment.Dec 1 2014, 5:35 AM
This comment was removed by eastig.
lib/AST/ItaniumMangle.cpp
3913–3925 ↗(On Diff #16563)

You are right. Reverted to the old code.

rsmith accepted this revision.Dec 11 2014, 4:08 PM
rsmith edited edge metadata.

LGTM

lib/AST/ItaniumCXXABI.cpp
50 ↗(On Diff #16676)

Can this happen?

This revision is now accepted and ready to land.Dec 11 2014, 4:08 PM
This revision was automatically updated to reflect the committed changes.