This is an archive of the discontinued LLVM Phabricator instance.

For a used declaration, mark any associated usings as referenced.
Needs ReviewPublic

Authored by CarlosAlbertoEnciso on Apr 27 2018, 7:37 AM.

Details

Summary

The following submission under review

https://reviews.llvm.org/D44826

introduced an option to warn on unused 'using declarations'.

But it can't detect cases like

namespace N {
  void foo();
  int var;
}

using N::foo;     <-- warning
using N::var;     <-- MISSING warning  

void A() {
  using N::foo;     <-- warning
  using N::var;     <-- warning
}

void B() {
  N::var = 1;
}

One of the reviewers (dblaikie) mentioned the possible cause for that limitation.

You mention a missed case in the description - where the target of a using decl is
used and that causes the unused-using not to warn about the using decl? That
seems like a big limitation. Does that mean the 'used' bit is being tracked in the
wrong place, on the target of the using decl instead of on the using decl itself?

This patch, set the 'Referenced' bit for any associated 'usings' for a referenced
declaration. The goal is to create additional information to help in the reduction
of the debug information size, which is affected by the extra generation of
'non-referenced' declarations.

Note:

To solve the debug information issue (size), there are 3 pieces of
work:

  • Set the 'referenced' bit for 'usings'. This patch.
  • Review the - Wunused-usings and -Wunused-local-typedefs implementation to take advantage of new 'Referenced' settings.
  • Do not generate debug information for the non-referenced using.

Thanks for your view on this issue and on the general approach.

Diff Detail

Event Timeline

Note:

The patch includes the entry points to set Objective-C and OpenMP declarations, but I do not
know anything about Objective-C or OpenMP. The functions just return without updating any
extra 'Used' bit.

Any suggestion on how to deal with those functions, is very much appreciated.

Thanks.

rsmith requested changes to this revision.Apr 27 2018, 4:07 PM

No, absolutely not. This would be insanely expensive, and marks entirely unrelated things "used".

The right thing to do here is to consistently track both the named declaration (the one found by name lookup, which might be a using-declaration) and the ultimately selected declaration (which might be the target of the using-declaration, or if that declaration is a template, a specialization of that template), and pass both to MarkAnyDeclUsed. Then we should mark the named declaration as Referenced and the selected declaration as Used. (The "unused using-declaration" warning should be based on the Referenced bit, since it's not meaningful to odr-use a using-declaration.)

This revision now requires changes to proceed.Apr 27 2018, 4:07 PM

No, absolutely not. This would be insanely expensive, and marks entirely unrelated things "used".

My apologies, reading this again it comes off a lot harsher than I'd intended.

No, absolutely not. This would be insanely expensive, and marks entirely unrelated things "used".

My apologies, reading this again it comes off a lot harsher than I'd intended.

I appreciate your apology and I understand your concerns. I do not have much knowledge on this area.

My initial intention was to traverse just the associated type for an odr-used declaration and mark them as used, as an indication that those types are "really needed" for that odr-declaration. The type chain is traversed just once.

The approach was intended as a general solution, not only for the original 'using declaration' issue.

I will have a look at your suggested approach.

Thanks very much for your feedback.

My initial approach was based on the following test case:

void Bar() {
  typedef int I_Ref;
  I_Ref var_bar;
}

void Foo() {
  typedef int I_Used;
  I_Used var_foo;
  var_foo = 2;
}

void Test() {
  Foo();
}

and the generated AST looks like:

|-FunctionDecl Bar 'void ()'
|   |-DeclStmt
|   | `-TypedefDecl referenced I_Ref 'int'
|   |-DeclStmt
|     `-VarDecl var_bar 'I_Ref':'int'

|-FunctionDecl used Foo 'void ()'
|   |-DeclStmt
|   | `-TypedefDecl referenced I_Used 'int'
|   |-DeclStmt
|   | `-VarDecl used var_foo 'I_Used':'int'

The typedef 'I_Ref' is marked as 'referenced' due to its association with 'var_bar'.
The typedef 'I_Used' is marked as 'referenced' despite that its association with 'var_foo' which is 'used'
Both 'typedefs' are marked as 'referenced'.

With the intended patch, the AST looks like:

|-FunctionDecl Bar 'void ()'
|   |-DeclStmt
|   | `-TypedefDecl referenced I_Ref 'int'
|   `-DeclStmt
|     `-VarDecl var_bar 'I_Ref':'int'

|-FunctionDecl used Foo 'void ()'
|   |-DeclStmt
|   | `-TypedefDecl used I_Used 'int'
|   |-DeclStmt
|   | `-VarDecl used var_foo 'I_Used':'int'

The typedef 'I_Ref' is marked as 'referenced'.
The typedef 'I_Used' is marked as 'used'.

My initial approach was based on the following test case:

void Bar() {
  typedef int I_Ref;
  I_Ref var_bar;
}

void Foo() {
  typedef int I_Used;
  I_Used var_foo;
  var_foo = 2;
}

void Test() {
  Foo();
}

[...]

The typedef 'I_Ref' is marked as 'referenced' due to its association with 'var_bar'.
The typedef 'I_Used' is marked as 'referenced' despite that its association with 'var_foo' which is 'used'
Both 'typedefs' are marked as 'referenced'.

This is already the correct outcome. Both typedefs are referenced; this is correct because they are named by the source code. Neither typedef is marked as "used"; this is correct because a typedef cannot be odr-used.

Let's look at your original example again (I've cleaned it up and extended it a bit, but hopefully this matches your intent):

namespace nsp {
  int var1, var2;
}

using nsp::var1;
using nsp::var2;

void bar() {
  nsp::var1 = 1;
  var2 = 1;
}

Now, what should happen here is:

  • The use of nsp::var1 in bar marks nsp::var1 as referenced (because it was named), and marks nsp::var1 as used (because it was odr-used by the assignment). The UsingShadowDecl for ::var1 is not marked as referenced (nor used), so we emit an unused-using-declaration warning.
  • The use of var2 in bar marks ::var2 as referenced (because it was named), and marks nsp::var2 as used (because it was odr-used by the assignment). The UsingShadowDecl for ::var2 is marked as referenced, so we do not warn on it.

This is already the correct outcome. Both typedefs are referenced; this is correct because they are named by the source code. Neither typedef is marked as "used"; this is correct because a typedef cannot be odr-used.

I appreciate very much your explanations on the typedef case.

My misunderstanding comes from my experience with a commercial C/C++ front-end, which has the concept of 'needed' and 'really needed' and those concepts seems different to the 'referenced' and 'used' in Clang. I was trying to match that concept:

needed -> referenced.
really needed -> used.

Let's look at your original example again (I've cleaned it up and extended it a bit, but hopefully this matches your intent):

namespace nsp {
  int var1, var2;
}

using nsp::var1;
using nsp::var2;

void bar() {
  nsp::var1 = 1;
  var2 = 1;
}

Now, what should happen here is:

  • The use of nsp::var1 in bar marks nsp::var1 as referenced (because it was named), and marks nsp::var1 as used (because it was odr-used by the assignment). The UsingShadowDecl for ::var1 is not marked as referenced (nor used), so we emit an unused-using-declaration warning.
  • The use of var2 in bar marks ::var2 as referenced (because it was named), and marks nsp::var2 as used (because it was odr-used by the assignment). The UsingShadowDecl for ::var2 is marked as referenced, so we do not warn on it.

Your extended example matches my original intent. That is a good starting point. Now I am clear on the expected behavior.

Thanks very much for your comments.

aprantl removed a subscriber: aprantl.May 2 2018, 9:26 AM
CarlosAlbertoEnciso retitled this revision from For an ODR declaration, set the 'Used' bit on its associated declarations. to For a referenced declaration, mark any associated usings as referenced..May 16 2018, 4:37 AM
CarlosAlbertoEnciso edited the summary of this revision. (Show Details)

Address the issues raised by the reviewer (rsmith).

CarlosAlbertoEnciso edited the summary of this revision. (Show Details)May 16 2018, 4:57 AM
CarlosAlbertoEnciso edited the summary of this revision. (Show Details)May 16 2018, 5:03 AM

Ping.

The review

https://reviews.llvm.org/D44826

is already approved and it is dependent on this patch being reviewed.

Is there anything I can add to this patch?

Thanks very much.

@rsmith anything else needed here?

Ping.

The review

https://reviews.llvm.org/D44826

is already approved and it is dependent on this patch being reviewed.

@rsmith Is there anything I can add to this patch?

Thanks

@dblaikie is @rsmith back from the standards meeting yet? I hate to be a pest but this is blocking other work Carlos has in progress.

Style comments.
The two new Sema methods (for namespaces and using references) are C++ specific, so SemaDeclCXX.cpp would seem like a more appropriate home for them.

lib/Sema/Sema.cpp
1879 ↗(On Diff #147050)

You could do this as an early return and reduce indentation in the rest of the method.

if (!ND || ND->isReferenced())
  return;
1880 ↗(On Diff #147050)

Initializing this to nullptr is redundant, as you set NA in the while-loop expression.

1891 ↗(On Diff #147050)

\brief is unnecessary, as we have auto-brief turned on.

1903 ↗(On Diff #147050)

This dyn_cast<> can be simply a cast<>.

1916 ↗(On Diff #147050)

You could sink the declaration of USD like so:

for (auto *DR : S->decls())
  if (auto *USD = dyn_cast<UsingShadowDecl>(DR))
    if (!USD->isReferenced() && USD->getTargetDecl() == D) {

Also I would put braces around the 'for' loop body, even though it is technically one statement.

1927 ↗(On Diff #147050)

You verified that his is a namespace earlier, so use cast<> not dyn_cast<>.

The right way to handle this is to pass both the ultimately-selected declaration and the declaration found by name lookup into the calls to MarkAnyDeclReferenced and friends. We should mark the selected declaration as Used (when appropriate), and mark the found declaration as Referenced.

We should not be trying to reconstruct which using declarations are in scope after the fact. Only declarations actually found by name lookup and then selected by the context performing the lookup should be marked referenced. (There may be cases where name lookup discards equivalent lookup results that are not redeclarations of one another, though, and to handle such cases properly, you may need to teach name lookup to track a list of found declarations rather than only one.)

lib/Sema/Sema.cpp
1884–1885 ↗(On Diff #147050)

The loop and recursion here -- and indeed this whole function -- seem unnecessary.

namespace A {} // #1 
namespace X {
  namespace B = A; // #2
}
namespace Y = X; // #3
namespace C = Y::B; // #4

Declaration #4 should mark #2 and #3 referenced. So the only 'unreferenced namespace alias' warning we should get here is for #4 -- and there is no need for marking #4 as referenced to affect #2 or #3.

1890–1891 ↗(On Diff #147050)

Why is this ever appropriate? I would think we should just mark the named declaration from the relevant lookup as referenced in all cases.

@probinson Thanks very much for your review. I will address them.

rsmith Thanks very much for your review. I will address them.

The right way to handle this is to pass both the ultimately-selected declaration and the declaration found by name lookup into the calls to MarkAnyDeclReferenced and friends. We should mark the selected declaration as Used (when appropriate), and mark the found declaration as Referenced.

We should not be trying to reconstruct which using declarations are in scope after the fact. Only declarations actually found by name lookup and then selected by the context performing the lookup should be marked referenced. (There may be cases where name lookup discards equivalent lookup results that are not redeclarations of one another, though, and to handle such cases properly, you may need to teach name lookup to track a list of found declarations rather than only one.)

Following your comments, I have replaced the scope traversal with LookupName in order to find the declarations.

But there are 2 specific cases:

  • using-directives

The LookupName function does not record the using-directives. With this patch, there is an extra option to store those directives in the lookup results, to be processed during the setting of the 'Referenced' bit.

  • namespace-alias

I am aware of your comment on the function that recursively traverse the namespace alias, but I can't see another way to do it. If you have a more specific idea, I am happy to explore it.

For both cases, may be LookupName function can have that functionality and store in the results any namespace-directives and namespace-alias associated with the given declaration.

lib/Sema/Sema.cpp
1879 ↗(On Diff #147050)

Corrected to

if (!ND || ND->isReferenced())
  return;
1880 ↗(On Diff #147050)

Removed the nullptr initialization.

1891 ↗(On Diff #147050)

Removed the \brief format.

1903 ↗(On Diff #147050)

Changed the dyn_cast<> to cast<>

1916 ↗(On Diff #147050)

Not available in the new patch.

1927 ↗(On Diff #147050)

Not available in the new patch.

Style comments.
The two new Sema methods (for namespaces and using references) are C++ specific, so SemaDeclCXX.cpp would seem like a more appropriate home for them.

Both functions have been moved to SemaDeclCXX.cpp.

CarlosAlbertoEnciso retitled this revision from For a referenced declaration, mark any associated usings as referenced. to For a used declaration, mark any associated usings as referenced..Jul 4 2018, 4:54 AM

Update the patch to address the comments from the reviewers.

lib/Sema/Sema.cpp
1884–1885 ↗(On Diff #147050)

The function and recursion is to be able to handle cases like:

namespace N {
  int var;
}

void Foo() {
  namespace NA = N;
  namespace NB = NA;
  NB::var = 4;
}

'var' is referenced and N, NA and NB should be marked as 'referenced' as well.

1890–1891 ↗(On Diff #147050)

My intention is to detect cases where the using statements do not affect other declarations.

namespace N {
  int var;
}

void Foo() {
  using N::var;    <- No Referenced
  N::var = 1;      <- Used
}

@rsmith Is there anything I can add to this patch?

The review

https://reviews.llvm.org/D44826

is already approved and it is dependent on this patch being reviewed.

Thanks very much

@rsmith and @probinson:

Is there anything I can add to this patch?

The review

https://reviews.llvm.org/D44826

is already approved and it is blocked by this patch being reviewed.

Thanks very much

A bunch of style comments. Maybe try clang-format-diff.

include/clang/Sema/SemaInternal.h
91

The comments on a nullptr parameter usually use the formal parameter name, not its type.

lib/Sema/SemaDeclCXX.cpp
15549

Space after the comma.

15554

Could this be a range-based for loop?

15566

else if

lib/Sema/SemaExpr.cpp
14460

Parameter comments usually use the formal parameter name, not its type.

15374

Parameter comments usually use the formal parameter name, not its type.

15396

Parameter comments usually use the formal parameter name, not its type.

lib/Sema/SemaLookup.cpp
196

Can this be a range-based for loop?

1064

Space after the comma.
UDirs has a different meaning elsewhere in this file, maybe UsingDirs instead?

1236

Space after the comma.

1277

Space after the comma.

CarlosAlbertoEnciso marked 11 inline comments as done.

Address review comments from @probinson

include/clang/Sema/SemaInternal.h
91

Replaced the type name with the formal parameter name.

lib/Sema/SemaDeclCXX.cpp
15554

Replaced by a range-based for loop.

lib/Sema/SemaExpr.cpp
14460

Replaced the type name with the formal parameter name.

15374

Replaced the type name with the formal parameter name.

15396

Replaced the type name with the formal parameter name.

lib/Sema/SemaLookup.cpp
98

As UDirs is used to reference an instance of UnqualUsingDirectiveSet, I have changed UDirs to UsingDirs.

196

Replaced with a range-based for loop.

1064

Preserved 'UDirs' in this function, but changed the 'UDirs' typedef into 'UsingDirs'.

A bunch of style comments. Maybe try clang-format-diff.

@probinson:

Thanks very much for your review.

@probinson: I tried clang-format-diff and the main format issues are with the test cases.

Do you want the test cases to follow the clang format?

@rsmith any further thoughts? We would really like to get this in before the LLVM 7.0 branch, currently scheduled for 1 August.

@probinson: I tried clang-format-diff and the main format issues are with the test cases.

Do you want the test cases to follow the clang format?

The lib and include files absolutely need to follow clang format. Tests should also, if doing so doesn't disturb their readability. If clang-format-diff is doing things like rearranging comments in the test files, it depends on whether it improves or reduces clarity. Up to you.

lib/Sema/SemaDeclCXX.cpp
15500

This should use /// to be picked up by doxygen.

Used clang-format-diff as indicated by @probinson:

  • The lib and include files follow the clang format.
  • The tests mostly follow the clang format. I removed some extra formatting which was introduced and disturb their readability.
CarlosAlbertoEnciso marked an inline comment as done.

A minor modification to allow the comments for MarkNamespaceAliasReferenced to be picked up by doxygen.

@rsmith & @probinson,

Is there anything I can add to this patch in order to submit it?
This patch is blocking the review https://reviews.llvm.org/D44826 which is already approved.

Thanks very much.

lib/Sema/SemaDeclCXX.cpp
15500

Uploaded a new patch that contains that change.

probinson accepted this revision.Jul 26 2018, 1:26 PM

Although I am far from expert in how Clang manages declarations, AFAICT this does what @rsmith requested, so LGTM.

rsmith added inline comments.Jul 26 2018, 1:30 PM
include/clang/Sema/Sema.h
4020–4022

Pass in the FoundDecl instead of a scope specifier and an expression.

lib/Sema/SemaDeclCXX.cpp
15505–15510

This is not the right approach. You should not iteratively mark namespaces as referenced here. Instead, when we parse the first namespace alias declaration in the chain, *it* should mark the subsequent one referenced immediately. So:

namespace A { int n; }
namespace B = A; // this declaration marks A referenced
namespace C = B; // this declaration marks B referenced
int k = C::n; // parsing the nested-name-specifier of this expression marks C referenced
15516–15571

You should not do any of this. You should keep track of how each name was actually originally found, and mark the FoundDecl as referenced. Do not attempt to redo the lookup like you do here. You will get it wrong, and in any case, redoing the lookup is wasteful.