Page MenuHomePhabricator

PR25910: clang allows two var definitions with the same mangled name
ClosedPublic

Authored by andreybokhanko on Dec 21 2015, 4:00 AM.

Details

Summary

This patch fixes incorrect behavior described in PR25910.

Specifically, when there are two definitions with the same mangled name:

extern "C" {

int _ZN2nm3abcE = 1;

}

namespace nm {

float abc = 2;

}

clang should issue an error, not just emit an arbitrary one of them, as it does now.

It is essentially the same stuff as in http://reviews.llvm.org/D11297, just for variables, not functions.

Yours,

Andrey

Software Engineer
Intel Compiler Team
Intel

Diff Detail

Event Timeline

andreybokhanko retitled this revision from to PR25910: clang allows two var definitions with the same mangled name.
andreybokhanko updated this object.
andreybokhanko added reviewers: rjmccall, tra.
andreybokhanko added a subscriber: cfe-commits.
andreybokhanko added inline comments.Dec 21 2015, 5:14 AM
lib/CodeGen/CodeGenModule.cpp
1235–1243

Artem, to address your concern (from http://reviews.llvm.org/rL254195#39417): if IsForDefinition equals to true, it is guaranteed that we get GlobalValue here. We specifically call GetAddrOfGlobal to create (or get) a global with required type, not cast from a declaration with a different type.

tra added inline comments.Jan 4 2016, 5:06 PM
lib/CodeGen/CodeGenModule.cpp
1235–1243

Empirical evidence suggests that it's possible to get non GlobalValue from GetAddrOfGlobal.
For instance, you may get llvm::UnaryConstantExpr which would be the case when GetAddrOfGlobal returns an addrspacecast which may happen during CUDA or OpenCL compilation.

If you want to reproduce the problem, apply my pending patch http://reviews.llvm.org/D15305 which has to defer emitting some variables. Run clang tests and you will see a lot of CUDA test cases crashing with assertion on this cast.

andreybokhanko marked an inline comment as done.

Fixed tra's note.

lib/CodeGen/CodeGenModule.cpp
1235–1243

You are right -- I missed this case. Fixed.

Patch updated.

tra edited edge metadata.Jan 5 2016, 10:41 AM

A better description of the problem would help. PR itself is somewhat short on details.
If I understand it correctly, the problem is that if we create multiple definitions with the same mangled name, clang does not always report it as an error and only emits one of those instances.

lib/CodeGen/CodeGenModule.cpp
1235–1236

Minor nit -- you could just do dyn_cast<...>(GetAddrOfGlobal()), considering that GVC is not used anywhere else.

1250

GetGlobalValue() may return nullptr and existing code did check for that.
I'd add an assert(GV).

2184–2188

This may be collapsed into a single if() now.

2197–2198

This can be moved above if (!MustBeEmitted()) -- no point calling MustBeEmitted() in this case if GV != nullptr.

lib/CodeGen/CodeGenModule.h
704

It would be good to document what IsForDefinition does.

1140

Ditto here.

1151

And for IsTentative, too.

andreybokhanko updated this object.
andreybokhanko edited edge metadata.
andreybokhanko marked 7 inline comments as done.

Fixed tra's notes. All the fixes are local, logic of the patch is not changed.

In D15686#319725, @tra wrote:

A better description of the problem would help. PR itself is somewhat short on details.

Sorry for being not clear. I expected my main reviewer to be @rjmccall, who knows exactly what the problem is.

If I understand it correctly, the problem is that if we create multiple definitions with the same mangled name, clang does not always report it as an error and only emits one of those instances.

Yes, exactly.

Patch updated; please re-review.

Yours,
Andrey

lib/CodeGen/CodeGenModule.h
1140

Done (in *.cpp file).

1151

Done (in*.cpp file).

rnk added a subscriber: rnk.Jan 12 2016, 3:34 PM

I thought we already addressed this issue with @rjmccall and decided that, if the user intentionally declares extern "C" variables with an _Z prefix, then we know they are intentionally attempting to name some C++ global variable.

I'd rather just warn on all extern "C" declarations starting with _Z, and let users keep the pieces when things break.

In D15686#325266, @rnk wrote:

I thought we already addressed this issue with @rjmccall and decided that, if the user intentionally declares extern "C" variables with an _Z prefix, then we know they are intentionally attempting to name some C++ global variable.

I'd rather just warn on all extern "C" declarations starting with _Z, and let users keep the pieces when things break.

That doesn't contradicts my patch at all. See the test -- where there is one declaration and one definition:

We expect no warnings here, as there is only declaration of _ZN2nm3abcE
global, no definitions.
extern "C" {

int _ZN2nm3abcE;

}

namespace nm {

float abc = 2;

}
// CHECK: @_ZN2nm3abcE = global float

nothing bad happens.

But if there are two definitions:

extern "C" {

int _ZN2nm3abcE = 1; // expected-note {{previous definition is here}}

}

namespace nm {

float abc = 2; // expected-error {{definition with same mangled name as another definition}}

}

an error is printed, as it should be.

Here is what John wrote himself (http://reviews.llvm.org/D11297#211351):

"I disagree with the conclusions in that thread (and wish you had CC'ed me; I unfortunately do not have time to keep up with cfe-dev). There is roughly zero chance that somebody is declaring an extern "C" symbol with a name that matches an Itanium mangled name with any intent other than to name that entity. We should not be rejecting this code or silently renaming one of the symbols. We should, of course, still diagnose attempts to *define* the same symbol multiple times."

John, may I ask you to take a look at this patch and confirm (or deny :-)) that it follows your understanding of how things should be?

Yours,
Andrey

rjmccall edited edge metadata.Jan 13 2016, 5:51 PM

Yes, this seems to be exactly what I wanted, thanks! LGTM.

This revision was automatically updated to reflect the committed changes.

@tra, @rnk, @rjmccall, thanks for the review!

Yours,
Andrey