Page MenuHomePhabricator

PR47663: Warn if an entity becomes weak after its first use.
AcceptedPublic

Authored by rsmith on Oct 11 2020, 2:08 PM.

Details

Summary

Such code will not work in general; we might have already used the
non-weakness, for example when constant-evaluating an address comparison
Thisor when generating code for the declaration.

Modern versions of GCC reject some such cases, but silently accept
others (based, it appears, on whether the weakness of the declaration
was already inspected). It doesn't seem feasible for us to exactly
follow their behavior, and it might be problematic to start rejecting
cases that GCC accepts and that work in practice, so only warn on this
rather than rejecting, at least for now.

Diff Detail

Unit TestsFailed

TimeTest
4,300 mswindows > Clang-Unit.DirectoryWatcher/_/DirectoryWatcherTests_exe::DirectoryWatcherTest.AddFiles
Note: Google Test filter = DirectoryWatcherTest.AddFiles [==========] Running 1 test from 1 test case.
270 mswindows > lld.ELF/invalid::symtab-sh-info.s
Script: -- : 'RUN: at line 4'; c:\ws\w32-1\llvm-project\premerge-checks\build\bin\yaml2obj.exe --docnum=1 C:\ws\w32-1\llvm-project\premerge-checks\lld\test\ELF\invalid\symtab-sh-info.s -o C:\ws\w32-1\llvm-project\premerge-checks\build\tools\lld\test\ELF\invalid\Output\symtab-sh-info.s.tmp.o

Event Timeline

rsmith created this revision.Oct 11 2020, 2:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 11 2020, 2:08 PM
rsmith requested review of this revision.Oct 11 2020, 2:08 PM

This seems like a good idea in the abstract; we'll need to figure out what the practical consequences are.

This seems like a good idea in the abstract; we'll need to figure out what the practical consequences are.

Looks like it triggers warnings in Abseil at least; there, the code looks like this:

// spinlock.h
void AbslInternalSpinLockDelay();
inline void lock(...) {
  AbslInternalSpinLockDelay();
}
// spinlock.cc
__attribute__((weak)) void AbslInternalSpinLockDelay() { ... }

... and adding the weak attribute to the declaration would change the meaning of the program (we want an external reference, not an extern_weak reference).

Perhaps we should only warn if the function is not defined in the same translation unit? If it is defined, then I think its address can never be null, and CodeGen will emit it with the correct linkage. We still have the problem that == comparisons against pointers to the function can incorrectly evaluate to false, but I think that's really a problem with aliases rather than with weakness. (C++ requires that void f(), g(); bool b = f == g; initializes b to false even though the only correct answer is that we don't know; we refuse to evaluate the comparison if either f or g is weak, but I think that's only really addressing the problem that they could both be null, not that they could have the same non-null address, since the latter problem isn't specific to weak declarations. I think the constant evaluator is being overly-conservative, and could evaluate f == g to false if f and g are both weak but at least one of them is defined locally, under the language-guaranteed assumption that distinct functions have different addresses. And perhaps we should have a way of declaring a function as potentially aliasing another function without actually defining the function as an alias, as a way to turn off that assumption independent of weakness?)

It does seem necessary to distinguish weak definitions from weak non-definitions, but that's completely reasonable — those two cases essentially act like totally different attributes that happen to be written with the same spelling. If we acknowledge that, I think that gives us a straightforward rule: the problem is adding weak to a non-definition after a previous non-weak non-definition.

I think the only reasonable solution for aliases is to assume they don't happen, given the language rules. We could add an attribute to allow, but I'm not sure anyone would use it: in practice I think it's extremely uncommon to define an external symbol as an alias of another external symbol. Maybe we should just document alongside the alias attribute that the behavior of equality comparisons involving aliased symbols may be inconsistent between translation units.

rsmith updated this revision to Diff 297767.Oct 12 2020, 10:19 PM
  • Don't warn if we see a weak definition for a declaration that's already

Patch looks good to me. I think we're running some internal tests; do you want to wait for those?

aaron.ballman accepted this revision.Oct 14 2020, 5:39 AM

LGTM aside from some very small nits (feel free to ignore any that don't make sense to you).

clang/lib/AST/DeclBase.cpp
622

This is just novel enough that I'd appreciate a comment about the declaration being in Attr.td so it's more obvious where to find this.

clang/lib/Sema/SemaDecl.cpp
6431

Can the pointers here all be marked const or does that cause issues?

6435–6436

Ah, it's too bad that Decl::specific_attrs() doesn't accept a pack of attributes...

18288–18302

Same request for const here as above.

This revision is now accepted and ready to land.Oct 14 2020, 5:39 AM

Patch looks good to me. I think we're running some internal tests; do you want to wait for those?

More testing and validation would be nice; I don't think this patch is urgent so I'm happy to wait.

In my internal testing, I found one more false positive: the protocol buffer compiler's support for weak imports results in code like this:

// generated .h file
extern T foo;
inline T *get() { return &foo; }
// generated .cc file
__attribute__((weak)) extern T foo;
// more uses of foo

This is intentional: the idea is that if someone includes the .h file and invokes the inline function, they get a strong link-time dependency on the symbol foo and need to link against that proto library. But if not, then there is no such dependency, and the library is optional. However, this functionality of the protocol buffer compiler is deprecated and being phased out (and suppressing the warning in this one case should be straightforward) so I'm not worried about it.

rsmith updated this revision to Diff 298268.Oct 14 2020, 5:40 PM
rsmith marked 3 inline comments as done.
rsmith marked an inline comment as done.Oct 14 2020, 5:40 PM
rsmith added inline comments.
clang/lib/Sema/SemaDecl.cpp
6435–6436

I took a brief look at fixing that, but it's not a completely natural extension because we would want to use a different value_type for the iterator if there's more than one kind under consideration. Probably not worth it for only one user.

18288–18302

Can't make this const; this function modifies the declaration by adding an attribute :) But I can make VD below const.

aaron.ballman added inline comments.Oct 19 2020, 7:05 AM
clang/lib/Sema/SemaDecl.cpp
6435–6436

Agreed, thank you for looking into it!

18288–18302

Thanks!