This is an archive of the discontinued LLVM Phabricator instance.

PR17829: Functions declared extern "C" with a name matching a mangled C++ function are allowed
ClosedPublic

Authored by andreybokhanko on Jul 17 2015, 7:26 AM.

Details

Summary

This patch fixes compiler crash described in the following thread: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/thread.html#43852. It also fixes incorrect behavior described in PR17829.

In essence, I inserted a check in GetAddrOfFunction that verifies that no two different declarations correspond to the same mangled name. Given that clang defers everything, this is the best suitable place I managed to find for the check. Also, the check makes sense only if one of declarations is a C++ method, as there are some kinds of non-C++ declarations that can correctly have the same mangled name (inline assembly inserts is an example).

Yours,

Andrey Bokhanko

Software Engineer
Intel Compiler Team
Intel

Diff Detail

Repository
rL LLVM

Event Timeline

andreybokhanko retitled this revision from to PR17829: Functions declared extern "C" with a name matching a mangled C++ function are allowed.
andreybokhanko updated this object.
andreybokhanko added reviewers: rjmccall, rsmith.
andreybokhanko added a subscriber: cfe-commits.
rsmith added inline comments.Jul 17 2015, 1:06 PM
lib/CodeGen/CodeGenModule.cpp
1674–1675 ↗(On Diff #30000)

Why do you need this check? The same mangling collisions can happen with any other kind of function declaration.

1681 ↗(On Diff #30000)

Any reason not to use an undef value of the appropriate type (Ty)? If you did that, you presumably wouldn't need special cases elsewhere.

rjmccall edited edge metadata.Jul 23 2015, 11:40 PM

I disagree with the conclusions in that thread (and wish you had CC'ed me; I unfortunately do not have time to keep up with cfe-dev). There is roughly zero chance that somebody is declaring an extern "C" symbol with a name that matches an Itanium mangled name with any intent other than to name that entity. We should not be rejecting this code or silently renaming one of the symbols. We should, of course, still diagnose attempts to *define* the same symbol multiple times.

The correct fix is to add bitcasts and/or replace globals as necessary so that the colliding references name the same LLVM object. We already have a type, it should be easy to bitcast to that.

Richard, John, thanks for the review. I'm currently on vacation, so can't respond now, but will do after getting back to work.

John, I will make sure in the future to CC you on all CodeGen-related issues, promise! :-)

Yours,
Andrey

Thanks. :)

When you do have time to work on this again, I've found that the best way to do this is to simply propagate a flag down to GetAddrOfFunction indicating whether you intend to define it. (One of these lessons that I keep meaning to back port from the Swift compiler...) The actions to take are then pretty easy:

  • If the global value doesn't already exist, or it already exists and has the right type + kind, you don't need to do anything special.
  • Otherwise, we have an existing global value with the wrong kind (e.g. is a GlobalVariable) or the wrong type:
    • If you're not trying to define it, just return a bitcast.
    • If you're trying to define it, and the existing value is not a definition, just claim the name for a new Function and queue up the existing one to be replaced with a bitcast. (Replacing it immediately can be dangerous because there could be outstanding references to it that aren't held in a value handle, e.g. if we're emitting the definition of something in the middle of emitting the definition of something else, which does happen sometimes.)
    • If you're trying to define it, and the existing value is a definition, diagnose the collision and return a new Function. You don't want to mess with the existing definition because it's possible that something is currently emitting code into it.

For these purposes, the existing value is a definition if it was created by a call to GetAddrOfFunction or GetAddrOfGlobalVar with a flag saying that this is for a definition. (Or if it was emitted by EmitAliasDefinition.) You have to be somewhat careful about just checking whether it's an LLVM definition because we might have created it for the purposes of defining it but not yet started that definition; this is particularly likely with global variables. (In fact, it might only be possible with them.)

andreybokhanko added a comment.EditedAug 11 2015, 5:56 AM

John,
(Others are welcome to chime in as well)

I started to implement your suggestion, but faced with problems.

Consider the following test case:

1: struct T {
2:   ~T() {}
3: };
4: 
5: extern "C" void _ZN1TD1Ev();
6: 
7: int main() {
8:   _ZN1TD1Ev();
9:   T t;
10: }

First time we visit "GetOrCreateLLVMFunction" for "_ZN1TD1Ev" mangled name is when we are dealing with the call at line 8. No global values are associated with "_ZN1TD1Ev" name yet, so we simply create a new llvm::Function and add it to the list of module's functions (ParentModule->getFunctionList().push_back(this); at Function.cpp:264).

Next time we visit "GetOrCreateLLVMFunction" for the same "_ZN1TD1Ev" name is when we are dealing with the implicit t's destructor call at the end of "main" routine (say, at line 10). We already have a global value associated with "_ZN1TD1Ev" name *and* we have a function with this name in module's function list. We can't create a new llvm::Function and add it to the functions list -- since module's machinery asserts that there is only one function with a given mangled name in the list; we can't remove old Function either -- since it is referred to in the call at line 8. The best we can do is to bit cast old llvm::Function (from line 8) to destructor's type and return it. But this generates destructor call for a wrong llvm::Function which means problems and further asserts down the road.

Any hints how to resolve this?

Yours,
Andrey

P.S.: I honestly believe my test above is a case of incorrect usage of mangled names (despite no definition for "ZN1TD1Ev" function), so in my opinion the way to go would be to issue a error inside "GetOrCreateLLVMFunction". This is different from my initial patch and still compiles fine the following example -- which, as I believe, you meant as a correct use of explicitly mangled names:

struct T {
  ~T() {}
};

extern "C" void _ZN1TD1Ev(T *this_p);

int main() {
  _ZN1TD1Ev(0);
  T t;
}

(My initial patch rejected even this code, which is probably too strict indeed.)

You only have one attempt to define the function here; I don't see the problem. Recall that I said to add a flag to getOrCreateLLVMFunction that says whether the caller intends to define the function. The rule should be that only callers that pass "true" should be allowed to assume that they'll get a normal llvm::Function back. Everybody needs to be prepared to receive a bitcast. Whenever you find yourself needing to replace an existing function, just queue it up to be replaced at the end of IRGen.

I don't think we need to fall over ourselves ensuring that these "aliased" uses properly mark functions as used or instantiate templates or anything.

John,

Thank you for the quick reply!

Let me make sure I understand what you said, using my test as an example (BTW, sorry if this is a dumb question -- I asked our local Clang experts, but no-one seems to be 100% sure what to do):

1: struct T {
2:   ~T() {}
3: };
4: 
5: extern "C" void _ZN1TD1Ev();
6: 
7: int main() {
8:   _ZN1TD1Ev();
9:   T t;
10: }

When we deal with the call at line N8, there are no Functions created yet, so nothing to bitcast. Thus, we create a Function and the following call:

call void @_ZN1TD1Ev()

When we deal with implicit destructor call at line N10, there is already a Function with "_ZN1TD1Ev" mangled name exists. Thus, we create a bitcast and the following call:

call void bitcast (void ()* @_ZN1TD1Ev to void (%struct.T*)*)(%struct.T* %t)

At the end of IRGen we should replace all references of old _ZN1TD1Ev (one with zero arguments) with new _ZN1TD1Ev (one with a single T* argument) -- *including* adding a new bitcast (as we replace a Function with different type) in all places in IR where we do the replacement.

Is my understanding correct?

Andrey

John,

Thank you for the quick reply!

Let me make sure I understand what you said, using my test as an example (BTW, sorry if this is a dumb question -- I asked our local Clang experts, but no-one seems to be 100% sure what to do):

1: struct T {
2:   ~T() {}
3: };
4: 
5: extern "C" void _ZN1TD1Ev();
6: 
7: int main() {
8:   _ZN1TD1Ev();
9:   T t;
10: }

When we deal with the call at line N8, there are no Functions created yet, so nothing to bitcast. Thus, we create a Function and the following call:

call void @_ZN1TD1Ev()

Yes.

When we deal with implicit destructor call at line N10, there is already a Function with "_ZN1TD1Ev" mangled name exists. Thus, we create a bitcast and the following call:

call void bitcast (void ()* @_ZN1TD1Ev to void (%struct.T*)*)(%struct.T* %t)

Yes.

At the end of IRGen we should replace all references of old _ZN1TD1Ev (one with zero arguments) with new _ZN1TD1Ev (one with a single T* argument) -- *including* adding a new bitcast (as we replace a Function with different type) in all places in IR where we do the replacement.

This is only necessary if you try to emit the definition at some point. In this case, you will, because emitting the reference to the destructor as part of the second call will require the destructor to be emitted, because it's inline.

Let me try to spell out the sequence more precisely.

  1. IRGen starts out with no Function.
  2. IRGen sees the declaration of the extern "C" function. It's just a declaration, not a definition, so IRGen doesn't need to do anything.
  3. IRGen sees the declaration of the destructor. It's a definition, but it's a deferrable definition, so IRGen doesn't need to do anything except record that it's got a deferred definition in DeferredDecls.
  4. IRGen emits a reference to the extern "C" function. This is a reference, not a definition, so it's fine for getOrCreateLLVMFunction to return an arbitrary Constant; it doesn't have to return a Function. But we don't have a Function yet, so we create one with the expected type for the extern "C" function. We also notice that we've got a deferred definition for this name, so we move that to the deferred-decls-to-emit queue.
  5. IRGen emits a reference to the destructor. This is a reference, not a definition, so any kind of Constant is fine. Now, we've got a Function, but it's got the wrong type, so we need to bitcast it. We've already enqueued the deferred definition, so that's fine.
  6. IRGen emits the deferred definition. We tell getOrCreateLLVMFunction that we're going to define the function, so getOrCreate has to return a Function with the right type; it's got an existing llvm::Function, but it's the wrong type, so it has to make a new llvm::Function. It creates a new Function with no name, takes the name of the existing Function (with takeName), replaces existing references with a bitcast to the new name, and queues up something for the end of emission to remove the replaced Function.
  7. IRGen reaches the end of emission and sees that it's got a Function to replace. It replaces the Function with a bitcast of the new function again (if there are any remaining uses) and then deletes it.

Make sense?

John,

Make sense?

Yes, absolutely! -- even for me. :-)

Thanks a 1000000!

Yours,
Andrey

andreybokhanko edited edge metadata.

John,

I implemented precisely what you described (or so I believe :-))

Patch is updated; please re-review.

This patch implements support for functions, but not variables yet -- the patch is big enough already, so variables will come next.

Note that the biggest change in CodeGenModule.cpp is just moving of several static functions to another part of the file (to make them accessible earlier).

Yours,
Andrey

This looks generally like what I'm looking for, thanks! Some comments.

lib/CodeGen/CodeGenModule.cpp
1129 ↗(On Diff #33083)

This is a pretty expensive extra check, and I think it only kicks in on invalid code where we've got multiple definitions of a function. Can we just eliminate it? It's not really a problem to emit the second function definition as long as we're not trying to emit it into the same llvm::Function.

1573 ↗(On Diff #33083)

Instead of moving this function in this patch, just add a forward declaration. If you want to move it, you can do that in a separate patch that only moves the function.

lib/CodeGen/CodeGenModule.h
349 ↗(On Diff #33083)

Missing a period in the comment.

354 ↗(On Diff #33083)

I don't think this is necessary. I don't believe we ever need to emit a (mangled) function for definition between starting to emit another function and adding its entry block, or between starting to emit a global variable and finishing its constant initializer. So you should be able to just check whether the existing llvm::GlobalValue is a definition.

I don't think avoiding emitting the diagnostic multiple times for different globals is necessary, or even a good idea.

All (but one) of John McCall's comments fixed.

andreybokhanko marked 3 inline comments as done.Aug 27 2015, 6:31 AM

John,

Thank you for the review!

All your comments but one are fixed. See below for details on the single one I didn't manage to get fixed.

Andrey

lib/CodeGen/CodeGenModule.h
354 ↗(On Diff #33316)

Checking that a GlobalDecl is not in ExplicitDefinitions yet is actually required to avoid printing multiple identical warnings.

In my example:

1: struct T {
2:   ~T() {}
3: };
4: 
5: extern "C" void _ZN1TD1Ev();
6: 
7: int main() {
8:   _ZN1TD1Ev();
9:   T t;
10: }

~T() is added to the list of deferred decls twice. Judging from this comment in "EmitDeferred" method:

// Check to see if we've already emitted this.  This is necessary
// for a couple of reasons: first, decls can end up in the
// deferred-decls queue multiple times, and second, decls can end
// up with definitions in unusual ways (e.g. by an extern inline
// function acquiring a strong function redefinition).  Just
// ignore these cases.

this is pretty normal ("decls can end up in the deferred-decls queue multiple times").

This means that we can call "GetOrCreateLLVMFunction"(..., /*IsForDefinition*/=true) for duplicated decls several times, which is fine in general, *but* will print the "duplicated mangled names" diagnostic multiple times as well -- unless we check that we already printed a warning on duplicated mangled names for given decl.

As for not emitting diagnostics for different globals -- this won't happen, as we will call "GetOrCreateLLVMFunction" at least once for each global with a definition, and thus, will print a warning for everyone.

I thought really hard (honestly!) on how to prevent duplicated diagnostics without usage of an additional set, but didn't found any solution. If you have any hints here, they would be much appreciated.

rjmccall added inline comments.Aug 27 2015, 11:37 AM
lib/CodeGen/CodeGenModule.h
354 ↗(On Diff #33316)

Okay, that's fine. Can you at least make sure you only add to the set when you emit the warning?

As a minor optimization, you can add it to the set and check whether it was already there in the same operation, so that this would look like:

if (LookupRepresentativeDecl(...) && DiagnosedConflictingDefinitions.insert(GD).second) {
  ...
}

Fixed last note from John McCall.

John, please re-review.

Yours,
Andrey

This looks great, thanks! One last comment; if you agree with me, go ahead and fix it and then commit.

include/clang/Basic/DiagnosticSemaKinds.td
2323 ↗(On Diff #33401)

I'm sorry to bring this up so late in the process, but is there a good reason for this to be a warning and not an error?

andreybokhanko added inline comments.Aug 28 2015, 5:50 AM
include/clang/Basic/DiagnosticSemaKinds.td
2323 ↗(On Diff #33401)

No good reasons at all -- I thought you want this to be a warning, but looks like I misinterpreted you.

Do you want me to make this an error?

(If yes, apart of changing warn_duplicate_mangled_name to err_duplicate_mangled_name everywhere, I will have to change the test -- let me know whatever you want to review these changes or not really interested and your "go ahead and fix it and then commit" still holds true)

Andrey

Yes, please make it an error. And the obvious test changes are fine. :)

This revision was automatically updated to reflect the committed changes.

Yes, please make it an error.

Done.

John, thank you for all your patience and explanations! -- I understand that this particular review and patch author required more than the usual measure. :-(

And the obvious test changes are fine. :)

I asked because after switching from warning to error, I had to introduce a new run line in the test -- effectively transforming it into two.

Andrey