Page MenuHomePhabricator

[clang][AST] Ignore template instantiations if not in AsIs mode
ClosedPublic

Authored by steveire on Jun 1 2020, 3:20 PM.

Details

Summary

IgnoreUnlessSpelledInSource mode should ignore these because they are
not written in the source. This matters for example when trying to
replace types or values which are templated. The new test in
TransformerTest.cpp in this commit demonstrates the problem.

In existing matcher code, users can write
unless(isInTemplateInstantiation()) or unless(isInstantiated()) (the
user must know which to use). The point of the
TK_IgnoreUnlessSpelledInSource mode is to allow the novice to avoid such
details. This patch changes the IgnoreUnlessSpelledInSource mode to
skip over implicit template instantiations.

This patch does not change the TK_AsIs mode. Adjust existing clang-tidy
matchers which explicitly desire to match implicit template
instantiations.

Note: An obvious attempt at an alternative implementation would simply
change the shouldVisitTemplateInstantiations() in ASTMatchFinder.cpp to
return something conditional on the operational TraversalKind. That
does not work because shouldVisitTemplateInstantiations() is called
before a possible top-level traverse() matcher changes the operational
TraversalKind.

The fact that RAV excludes these nodes by default is also justification
for this behavior in IgnoreUnlessSpelledInSource mode. It contains the
following comment:

By default, we do not traverse the instantiations of                   
class templates since they do not appear in the user code.

Diff Detail

Event Timeline

steveire created this revision.Jun 1 2020, 3:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2020, 3:20 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I think it makes sense to align the documentation and the definition of IgnoreUnlessSpelledInSource, I'm just not sure which component we should change -- the documentation or the implementation.

Making IgnoreUnlessSpelledInSource not match template instantiations is a nice and simple mental model. However, I'm not sure it is more useful in practice, and here's why. Processing non-instantiated templates is generally very difficult to do correctly: major parts of AST semantics are still unknown, and the parts that can be known are not reliably available. Clang does not provide guarantees about what parts of the template AST are type checked and what are not (I think there are only aspirations), so AST matchers that work in simple tests (where Clang decides to type check more expressions) could fail in more complex templates (where Clang decides to type check less). Generally, processing non-instantiated templates is difficult for the same reasons that make analyzing macro definitions difficult. Analyzing macro definitions is so difficult that tools generally don't do it at all and analyze expansions instead. Analyzing template definitions is more tractable (at least we have some AST nodes), but still, in practice, I'd generally recommend people to analyze template instantiations instead and if they need to perform a refactoring (like renaming a method), trying to cross-reference information across multiple template instantiations.

If IgnoreUnlessSpelledInSource is indeed for novice users (and not to be strictly interpreted as "it does what it says") we should think about whether it more useful to ignore instantiations or to match in instantiations.

For example, imagine someone who is trying to find usages of a certain function in their codebase by writing a matcher (for example so that they can rename it, or delete as dead code or whatever -- "find usages" is a very common first step). A matcher in IgnoreUnlessSpelledInSource mode that ignores instantiations could find no matches when in fact there are usages from templates. The user deletes the function, and in the best case they get a compilation error (and frustration because the AST matchers lied to them), or in the worst case due to the magic of overload resolution and ADL the code still compiles, but calls a different function that was in the overload set.

Thank you for bringing up this issue. I think it highlights an underlying problem -- editing templates is quite difficult -- that neither setting will address, as Dmitri expanded on above. Given the parallel to macros, I'd say your change is better than the status quo. Most clang tidies and other rewriting tools that I've encountered simply skip code in macro expansions, rather than reason about how to update the macro definition or whatnot. So, by that reasoning, we should skip template instantations.

My experience with clang-tidy has been that template instantiations are a double-edged sword. The instantiation is the only place at which you have sufficient information to perform many kinds of analyses, but it's also often not plausible to modify the templated code because it's not clear from the instantiation context how changes may impact other instantiations. The result is that most of the clang-tidy checks wind up punting on template instantiations similar to what they do for macros. Based on that experience, I think it makes sense to continue in the proposed direction.

Thank you for bringing up this issue. I think it highlights an underlying problem -- editing templates is quite difficult -- that neither setting will address, as Dmitri expanded on above. Given the parallel to macros, I'd say your change is better than the status quo. Most clang tidies and other rewriting tools that I've encountered simply skip code in macro expansions, rather than reason about how to update the macro definition or whatnot. So, by that reasoning, we should skip template instantations.

Yes, it's generally true for tools that given the choice of

  1. possibly making a change which is definitely incorrect (as is currently done by default and as demonstrated in the Transformer test case)
  2. not making a particular change meaning the overall change is incomplete (as in Dmitris response)

number (2) would always be way to go. I don't think Dmitris objection makes any sense.

My experience with clang-tidy has been that template instantiations are a double-edged sword. The instantiation is the only place at which you have sufficient information to perform many kinds of analyses, but it's also often not plausible to modify the templated code because it's not clear from the instantiation context how changes may impact other instantiations. The result is that most of the clang-tidy checks wind up punting on template instantiations similar to what they do for macros. Based on that experience, I think it makes sense to continue in the proposed direction.

Great. I also think that given that the default has just been changed to TK_IgnoreUnlessSpelledInSource, we should change this soon. The facts would still be the same if we do this at some future date, but it would be needlessly more disruptive.

If IgnoreUnlessSpelledInSource is indeed for novice users (and not to be strictly interpreted as "it does what it says") we should think about whether it more useful to ignore instantiations or to match in instantiations.

Tools should generally follow the Hippocratic: First do no harm.

As I demonstrated, simple tools currently accidentally make unintended and incorrect changes to code. That is wrong.

Do you have a standing objection to this being changed?

I can't really work on this if I know or assume that you're going to prevent it going in (ever or for a significant time which results in this being harder than it should be).

steveire edited the summary of this revision. (Show Details)Jun 3 2020, 4:16 PM
klimek added a comment.Jun 4 2020, 2:42 AM

Thank you for bringing up this issue. I think it highlights an underlying problem -- editing templates is quite difficult -- that neither setting will address, as Dmitri expanded on above. Given the parallel to macros, I'd say your change is better than the status quo. Most clang tidies and other rewriting tools that I've encountered simply skip code in macro expansions, rather than reason about how to update the macro definition or whatnot. So, by that reasoning, we should skip template instantations.

I've personally written quite a few tools that changed macro definitions. Do you have numbers on how many of the library team's tools generally want to match template instantiations?

klimek added a comment.Jun 4 2020, 2:53 AM

Without jumping into the discussion whether it should be the default, I think we should be able to control template instantiation visitation separately from other implicit nodes.
Having to put AsIs on a matcher every time you need to match template instantiations is a rather big change (suddenly you have to change all the matchers you've written so far).
I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see more data.
I feel more strongly about needing AsIs when I want to match template instantiations.

Without jumping into the discussion whether it should be the default, I think we should be able to control template instantiation visitation separately from other implicit nodes.

Hmm, IgnoreUnlessSpelledInSource is designed and named to fill that role. After this patch is in, I intend to make it also control whether the implicit node here is dumped/matchable: https://godbolt.org/z/V_KCFm

Having to put AsIs on a matcher every time you need to match template instantiations is a rather big change (suddenly you have to change all the matchers you've written so far).

I don't think that's true. You have to change the matchers you've written which deliberately match the instantiation, but you can also (optionally) simplify the others to remove unless(isInTemplateInstantiation()) from them. The third category of matchers are the ones where you haven't used unless(isInTemplateInstantiation()) even though you should have and you have a bug that you don't know about yet. This change fixes those ones.

I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see more data.
I feel more strongly about needing AsIs when I want to match template instantiations.

I feel strongly that the default should not change code in a known-wrong way, as the unit test demonstrates. It's not a novice-friendly default.

Hmm, IgnoreUnlessSpelledInSource is designed and named to fill that role.

In other words: to me this change is a bug fix.

Without jumping into the discussion whether it should be the default, I think we should be able to control template instantiation visitation separately from other implicit nodes.
Having to put AsIs on a matcher every time you need to match template instantiations is a rather big change (suddenly you have to change all the matchers you've written so far).

I think that's the intended meaning of AsIs though. Template instantiations are not source code the user wrote, they're source code the compiler stamped out from code the user wrote. I hope IgnoreUnlessSpelledInSource isn't a misnomer.

I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see more data.
I feel more strongly about needing AsIs when I want to match template instantiations.

FWIW, my experience in clang-tidy has been that template instantiations are ignored far more often than they're desired. In fact, instantiations tend to be a source of bugs for us because they're easy to forget about when writing matchers without keeping templates in mind. The times when template instantiations become important to *not* ignore within the checks is when the check is specific to template behavior, but that's a minority of the public checks thus far.

I don't think that's true. You have to change the matchers you've written which deliberately match the instantiation, but you can also (optionally) simplify the others to remove unless(isInTemplateInstantiation()) from them. The third category of matchers are the ones where you haven't used unless(isInTemplateInstantiation()) even though you should have and you have a bug that you don't know about yet. This change fixes those ones.

This seems to be an example of that third category: https://reviews.llvm.org/D81336

I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see more data.
I feel more strongly about needing AsIs when I want to match template instantiations.

I feel strongly that the default should not change code in a known-wrong way, as the unit test demonstrates. It's not a novice-friendly default.

Any more feelings on this?

steveire updated this revision to Diff 269081.Jun 7 2020, 3:33 PM

Port unit tests

steveire retitled this revision from WIP: Ignore template instantiations if not in AsIs mode to Ignore template instantiations if not in AsIs mode.Jun 7 2020, 3:34 PM
steveire edited the summary of this revision. (Show Details)

Yeah, I think this is consistent with the naming/doc of IgnoreUnlessSpelledInSource, and the concerns are more about the implications of making a less-powerful, more conservative version of matchers the default.
Which isn't really what *this* patch is about, but it's reasonable to be concerned at how much functionality is being "buried" beneath a pretty awkward API.

klimek added a comment.Jun 8 2020, 2:06 AM

Without jumping into the discussion whether it should be the default, I think we should be able to control template instantiation visitation separately from other implicit nodes.
Having to put AsIs on a matcher every time you need to match template instantiations is a rather big change (suddenly you have to change all the matchers you've written so far).

I think that's the intended meaning of AsIs though. Template instantiations are not source code the user wrote, they're source code the compiler stamped out from code the user wrote. I hope IgnoreUnlessSpelledInSource isn't a misnomer.

I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see more data.
I feel more strongly about needing AsIs when I want to match template instantiations.

FWIW, my experience in clang-tidy has been that template instantiations are ignored far more often than they're desired. In fact, instantiations tend to be a source of bugs for us because they're easy to forget about when writing matchers without keeping templates in mind. The times when template instantiations become important to *not* ignore within the checks is when the check is specific to template behavior, but that's a minority of the public checks thus far.

So I assume the idea is that this will work & be what we'll want people to write?
traverse(AsIs, decl(traverse(IgnoreImplicit, hasFoo)))

Without jumping into the discussion whether it should be the default, I think we should be able to control template instantiation visitation separately from other implicit nodes.
Having to put AsIs on a matcher every time you need to match template instantiations is a rather big change (suddenly you have to change all the matchers you've written so far).

I think that's the intended meaning of AsIs though. Template instantiations are not source code the user wrote, they're source code the compiler stamped out from code the user wrote. I hope IgnoreUnlessSpelledInSource isn't a misnomer.

I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see more data.
I feel more strongly about needing AsIs when I want to match template instantiations.

FWIW, my experience in clang-tidy has been that template instantiations are ignored far more often than they're desired. In fact, instantiations tend to be a source of bugs for us because they're easy to forget about when writing matchers without keeping templates in mind. The times when template instantiations become important to *not* ignore within the checks is when the check is specific to template behavior, but that's a minority of the public checks thus far.

So I assume the idea is that this will work & be what we'll want people to write?
traverse(AsIs, decl(traverse(IgnoreImplicit, hasFoo)))

I believe so, yes. It's explicit about which traversal mode you want any given set of matchers to match with, so it is more flexible at the expense of being more verbose. Alternatively, you could be in AsIs mode for all of it and explicitly ignore the implicit nodes you wish to ignore (which may be even more verbose with the same results, depending on the matchers involved).

Without jumping into the discussion whether it should be the default, I think we should be able to control template instantiation visitation separately from other implicit nodes.
Having to put AsIs on a matcher every time you need to match template instantiations is a rather big change (suddenly you have to change all the matchers you've written so far).

I think that's the intended meaning of AsIs though. Template instantiations are not source code the user wrote, they're source code the compiler stamped out from code the user wrote. I hope IgnoreUnlessSpelledInSource isn't a misnomer.

I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see more data.
I feel more strongly about needing AsIs when I want to match template instantiations.

FWIW, my experience in clang-tidy has been that template instantiations are ignored far more often than they're desired. In fact, instantiations tend to be a source of bugs for us because they're easy to forget about when writing matchers without keeping templates in mind. The times when template instantiations become important to *not* ignore within the checks is when the check is specific to template behavior, but that's a minority of the public checks thus far.

So I assume the idea is that this will work & be what we'll want people to write?
traverse(AsIs, decl(traverse(IgnoreImplicit, hasFoo)))

I believe so, yes. It's explicit about which traversal mode you want any given set of matchers to match with, so it is more flexible at the expense of being more verbose. Alternatively, you could be in AsIs mode for all of it and explicitly ignore the implicit nodes you wish to ignore (which may be even more verbose with the same results, depending on the matchers involved).

As far as I can tell, the people who had "standing objections" to fixing this bug have been convinced that it makes sense. At least that is implied.

The problem is they haven't explicitly reversed so progress is still blocked.

Please respond to either say you're no longer blocking this or say why you continue to block it.

There is no reason not to fix this.

steveire updated this revision to Diff 270472.Jun 12 2020, 11:38 AM
steveire edited the summary of this revision. (Show Details)

Update

Without jumping into the discussion whether it should be the default, I think we should be able to control template instantiation visitation separately from other implicit nodes.
Having to put AsIs on a matcher every time you need to match template instantiations is a rather big change (suddenly you have to change all the matchers you've written so far).

I think that's the intended meaning of AsIs though. Template instantiations are not source code the user wrote, they're source code the compiler stamped out from code the user wrote. I hope IgnoreUnlessSpelledInSource isn't a misnomer.

I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see more data.
I feel more strongly about needing AsIs when I want to match template instantiations.

FWIW, my experience in clang-tidy has been that template instantiations are ignored far more often than they're desired. In fact, instantiations tend to be a source of bugs for us because they're easy to forget about when writing matchers without keeping templates in mind. The times when template instantiations become important to *not* ignore within the checks is when the check is specific to template behavior, but that's a minority of the public checks thus far.

So I assume the idea is that this will work & be what we'll want people to write?
traverse(AsIs, decl(traverse(IgnoreImplicit, hasFoo)))

I believe so, yes. It's explicit about which traversal mode you want any given set of matchers to match with, so it is more flexible at the expense of being more verbose. Alternatively, you could be in AsIs mode for all of it and explicitly ignore the implicit nodes you wish to ignore (which may be even more verbose with the same results, depending on the matchers involved).

Support for clang-query is coming up, I assume? If that works & will work in clang-query that addresses my concern.

As far as I can tell, the people who had "standing objections" to fixing this bug have been convinced that it makes sense. At least that is implied.

The problem is they haven't explicitly reversed so progress is still blocked.

Please respond to either say you're no longer blocking this or say why you continue to block it.

There is no reason not to fix this.

Stephen, I want to call this out, as I think the way you write here is not OK:

  1. the scare quotes around "standing objections" reads like you're not respecting the opinions of others here; it's fine to be frustrated, it's fine to call out why you're frustrated (for example, reviews taking long time, or it being hard to find consensus), but it's not OK to use sarcasm to implicitly attack folks - the only result is that people are hurt
  2. if you want somebody to chime in, please say so; implicitly alluding to people you want to respond reads as an implicit accusation of folks trying to block you; nobody wants to block you; we're professionals trying to create a great product, and we often have different opinions what that means; that's part of the process, and in the end makes for a better product
  3. "There is no reason not to fix this" - asserting your believes is not an argument, and comes across as arrogant - the moment other people have objections there are reasons to object; whether those reasons in the end are more important than the reasons you cited for doing it is often a value call, and different people have different values

If you need to vent, feel free to write me (and only me) angry emails (if you preface with Vent: I also know the context, which helps :), or I'm also happy to add you to my work chat so you can vent at me in chat. I understand the need to vent - but it must not be done at the emotional expense of others on the project. In the end, behaving like this in code reviews will reduce the chance that people are willing to engage with you and spend time on reviewing your code, leading to even longer review times. This is a vicious cycle. It's in your hands to break it, but it will take time and patience.
As I've said before, I love what you're doing for the project, but ultimately, if you cannot engage constructively, we are an OSS project exactly so people who don't want to be part of the community can fork and do their own thing.
If you want to be part of the community (and I do sincerely hope that you do!), it's important to play by its rules; in the long run, it will make it a significantly better experience (and better product!) for everyone.

  1. the scare quotes around "standing objections" reads like you're not respecting the opinions of others here;

Hmm, this wasn't intended. I sometimes quote things if they are a particular term (ordinarily I would quote "particular term").

Nevertheless, I'll take on your message. I don't have more to say.

  1. the scare quotes around "standing objections" reads like you're not respecting the opinions of others here;

Hmm, this wasn't intended. I sometimes quote things if they are a particular term (ordinarily I would quote "particular term").

Nevertheless, I'll take on your message. I don't have more to say.

Thanks!

steveire added a comment.EditedJun 21 2020, 4:15 AM

Here's another example where it is not appropriate to transform a template instantiation:

http://clang-developers.42468.n3.nabble.com/Questions-discussions-about-cast-types-in-clang-td4068626.html

"If we call foo<int>, it can be a const_cast, but if we call foo<double>, it has to be a reinterpret_cast. In a situation where the user is writing library code, it won't only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are."

Just recording it as part of the motivation of this patch.

I reviewed the changes in the patch and they seem reasonable, but this patch is hard to have high confidence in because you need to audit all the places where the default behavior silently changed and no changes were made in the patch. I'm assuming that the code changes were made to cause existing tests to pass, but did you verify that changes weren't needed to checks that show no behavioral test changes to ensure that it wasn't because the tests have poor template test coverage?

clang-tools-extra/clang-tidy/bugprone/TooSmallLoopVariableCheck.cpp
62 ↗(On Diff #270517)

Do we still need this TK_AsIs with the below changes?

clang/lib/ASTMatchers/ASTMatchFinder.cpp
587

This should not be part of the public interface.

593

Nor should this, right?

martong removed a subscriber: martong.Jul 27 2020, 9:16 AM

Many of the changes which were part of a previous iteration of this change were related to the change of default behavior of matchers. As the default is no longer changed, those changes fell away.

lebedev.ri retitled this revision from Ignore template instantiations if not in AsIs mode to [clang][AST] Ignore template instantiations if not in AsIs mode.Oct 26 2020, 6:30 AM

Many of the changes which were part of a previous iteration of this change were related to the change of default behavior of matchers. As the default is no longer changed, those changes fell away.

Indeed! Thank you for continuing on this. Aside from a few small nits, I think this LGTM.

clang/include/clang/AST/ASTNodeTraverser.h
485–488

Any interest in using curly braces here for readability?

clang/lib/AST/ASTDumper.cpp
132–136

Any interest in using curly braces here for readability?

clang/lib/ASTMatchers/ASTMatchersInternal.cpp
287–289

You can combine these into a single if statement (and then elide the braces).

312

Same here.

aaron.ballman accepted this revision.Mon, Nov 2, 5:36 AM

Oops, forgot to click the important bit to actually accept the review. :-)

This revision is now accepted and ready to land.Mon, Nov 2, 5:36 AM
This revision was landed with ongoing or failed builds.Mon, Nov 2, 12:22 PM
This revision was automatically updated to reflect the committed changes.

This seems to make asan buildbots unhappy: http://lab.llvm.org:8011/#/builders/5/builds/744
I can't see anything immediately suspicious, but I don't know the matcher internals that well :-\