This is an archive of the discontinued LLVM Phabricator instance.

[LifetimeAnalysis] Support more STL idioms (template forward declaration and DependentNameType)
ClosedPublic

Authored by mgehre on Aug 13 2019, 3:35 PM.

Details

Summary

This fixes inference of gsl::Pointer on std::set::iterator with libstdc++ (the typedef for iterator
on the template is a DependentNameType - we can only put the gsl::Pointer attribute
on the underlaying record after instantiation)

inference of gsl::Pointer on std::vector::iterator with libc++ (the class was forward-declared,
we added the gsl::Pointer on the canonical decl (the forward decl), and later when the
template was instantiated, there was no attribute on the definition so it was not instantiated).

and a duplicate gsl::Pointer on some class with libstdc++ (we first added an attribute to
a incomplete instantiation, and then another was copied from the template definition
when the instantiation was completed).

Diff Detail

Repository
rL LLVM

Event Timeline

mgehre created this revision.Aug 13 2019, 3:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2019, 3:35 PM
Herald added a subscriber: Szelethus. · View Herald Transcript
gribozavr added inline comments.Aug 14 2019, 1:55 AM
clang/lib/Sema/SemaAttr.cpp
102 ↗(On Diff #214949)

I wonder why this is necessary. Sema should call inference on every CXXRecordDecl. Is it because of incorrect short-circuiting in inferGslPointerAttribute that I'm pointing out below?

143 ↗(On Diff #214949)

So... what's the contract for pointer and owner attributes, are they expected to be present on every declaration in the redeclaration chain, or is only one sufficient?

Seems like here we're adding the attribute only to the canonical decl of the iterator, which will lead to the same sort of issue that this patch is trying to fix.

200 ↗(On Diff #214949)

Should this code not do this short-circuit check? It is prone to going out of sync with the code in addGslOwnerPointerAttributeIfNotExisting, like it did just now.

If you want to check for attribute before doing the string search, you could pass the string set into addGslOwnerPointerAttributeIfNotExisting, and let that decide if it should infer the attribute or not.

clang/test/SemaCXX/attr-gsl-owner-pointer-std.cpp
95 ↗(On Diff #214949)

This test file is getting pretty long and subtle, with lots of things are being mixed into one file without isolation.

WDYT about refactoring it into a unit test that uses AST matchers to verify attribute presence?

See clang/unittests/ASTMatchers/ASTMatchersNodeTest.cpp for examples.

Each test case would look approximately like this:

EXPECT_TRUE(matches(
  "template class ...",
  classTemplateDecl(hasName("foo"), hasAttr(clang::attr::GslPointer)));

Each test case would be isolated from other tests, each test would have a name (and optionally a comment) that will make it obvious what exactly is being tested.

It would be also possible to verify things like "The iterator typedef is a DependentNameType" to ensure that we're testing exactly what we want to test.

mgehre marked 6 inline comments as done.Aug 14 2019, 3:19 PM
mgehre added inline comments.
clang/lib/Sema/SemaAttr.cpp
102 ↗(On Diff #214949)

The contract is: A CXXRecordDecl is a Pointer/Owner if and only if its canonical declaration has the Pointer/Owner attribute.

The analysis only checks this on concrete classes (excuse me if I'm using the wrong term here), i.e. non-template classes or instantiation of template classes.

During the inference, we might also put the attributes onto the templated declaration of a ClassTemplateDecl. The Pointer/Owner attributes here will not be used by the analysis.
We just put them there so clang will copy them to each instantiation, where they will be used by the analysis.

From what I understand, clang copies the attributes of the definition of the templated declaration to the instantiation.
In addition, when clang sees a redeclaration (e.g. of a template), it will also copy the attributes from the previous/canonical (?) declaration.

In the problematic case

// The canonical declaration of the iterator template is not its definition.
template <typename T>
class __unordered_multiset_iterator;

template <typename T>
class __unordered_multiset_iterator {
};

template <typename T>
class unordered_multiset {
public:
  using iterator = __unordered_multiset_iterator<T>;
};

static_assert(sizeof(unordered_multiset<int>::iterator), ""); // Force instantiation.

clang would (before this PR):

  1. Create a ClassTemplateDecl with an CXXRecordDecl for __unordered_multiset_iterator (this is the canonical declaration, but not a definition)
  2. Create a ClassTemplateDecl with an CXXRecordDecl for __unordered_multiset_iterator (this is not the canonical declaration, but its definition)
  3. While processing the using iterator line, add a PointerAttr to the canonical declaration (from point 1)
  4. While instantiating __unordered_multiset_iterator<int>, copy attributes from the template definition (from point 2), which are none

Result: The ClassTemplateSpecializationDecl for __unordered_multiset_iterator<int> would not have the PointerAttr.

On the other hand, a swapped example

template <typename T>
class __unordered_multiset_iterator;

template <typename T>
class unordered_multiset {
public:
  using iterator = __unordered_multiset_iterator<T>;
};

template <typename T>
class __unordered_multiset_iterator {
};

static_assert(sizeof(unordered_multiset<int>::iterator), ""); // Force instantiation.

would work because clang would (before this PR):

  1. Create a ClassTemplateDecl with an CXXRecordDecl for __unordered_multiset_iterator (this is the canonical declaration, but not a definition)
  2. While processing the using iterator line, add a PointerAttr to the canonical declaration (from point 1)
  3. Create a ClassTemplateDecl with an CXXRecordDecl for __unordered_multiset_iterator (this is not the canonical declaration, but its definition); copy all attributes from a previous declaration, i.e. the PointerAttr from point 1
  4. While instantiating __unordered_multiset_iterator<int>, copy PointerAttr from the template definition (from point 3)

So we cannot just put the PointerAttr on the definition of the class template because there might be none yet (like in the second example). We need to put it on the definition in case
it was already parsed.

200 ↗(On Diff #214949)

Yes, the hasAttr check here is an optimization to avoid the string search. I don't think I can move the string search into addGslOwnerPointerAttributeIfNotExisting, because that function is called
from other call sites that don't care about a set of names.

clang/test/SemaCXX/attr-gsl-owner-pointer-std.cpp
95 ↗(On Diff #214949)

I like the AST matcher approach! This test file is really hard to debug - I usually copy a test to its own file for debugging.
Would you be okay with deferring the testing change to the next PR?

gribozavr added inline comments.Aug 14 2019, 4:01 PM
clang/lib/Sema/SemaAttr.cpp
200 ↗(On Diff #214949)

Sorry I wasn't clear enough -- the issue that I noticed is that this check looks at the canonical decl, while addGslOwnerPointerAttributeIfNotExisting attaches the attribute to the decl itself -- that's what I meant by "out of sync".

mgehre marked 4 inline comments as done.Aug 16 2019, 1:35 PM

I now start to wonder if we should instead put the attribute on all declarations (instead of just the canonical). Later redeclarations automatically inherit the attributes.
This would make both the checking easier (just check any declaration, no need to obtain the canonical first), and remove the special
case for adding to the definition of templated record for ClassTemplateDecls.

clang/lib/Sema/SemaAttr.cpp
200 ↗(On Diff #214949)

I make sure that addGslOwnerPointerAttributeIfNotExisting is only called with the canonical decl as argument, which the caller usually has around. The only exception to this is when addGslOwnerPointerAttributeIfNotExisting calls itself to attach an attribute to the definition of templated class.

mgehre marked an inline comment as done.Aug 18 2019, 1:59 PM
gribozavr added inline comments.Aug 20 2019, 5:49 AM
clang/lib/Sema/SemaAttr.cpp
200 ↗(On Diff #214949)

I see. This is a tricky contract, and ".getCanonicalDecl()" at callsites feel somewhat random. I think it should be at least documented. However...

I now start to wonder if we should instead put the attribute on all declarations (instead of just the canonical). Later redeclarations automatically inherit the attributes.
This would make both the checking easier (just check any declaration, no need to obtain the canonical first), and remove the special
case for adding to the definition of templated record for ClassTemplateDecls.

I'd support that.

clang/test/SemaCXX/attr-gsl-owner-pointer-std.cpp
95 ↗(On Diff #214949)

Of course! Thanks!

xazax.hun added inline comments.Aug 20 2019, 11:20 AM
clang/lib/Sema/SemaAttr.cpp
94 ↗(On Diff #214949)

Doesn't the attribute have a CreateImplicit static method? If so, we could use that :)

mgehre updated this revision to Diff 216247.Aug 20 2019, 2:07 PM
mgehre marked 6 inline comments as done.
  • Put OwnerAttr/PointerAttr on all redeclarations
  • clang/lib/Sema/SemaAttr.cpp: Use Attribute::CreateImplicit as requested in review

I now add the attributes to all redeclarations to make the logic a bit easier to follow.

clang/lib/Sema/SemaAttr.cpp
94 ↗(On Diff #214949)

Nice, didn't know that!

clang/test/SemaCXX/attr-gsl-owner-pointer-std.cpp
95 ↗(On Diff #214949)

I added the some tests for this particular change to a new unittest and will migrate the rest of the tests later.

gribozavr accepted this revision.Aug 21 2019, 2:19 AM
gribozavr added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
4596 ↗(On Diff #216247)

Add it only if it does not already exist?

This revision is now accepted and ready to land.Aug 21 2019, 2:19 AM
xazax.hun added inline comments.Aug 21 2019, 9:20 AM
clang/lib/Sema/SemaInit.cpp
6627 ↗(On Diff #216247)

You can also remove the getCanonicalDecl call here :)

mgehre marked 3 inline comments as done.Aug 21 2019, 3:07 PM
mgehre added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
4596 ↗(On Diff #216247)

If a previous decl had that Attribute, then we would have inherited it, and the check above for an existing attribute would have led to an early return.

This revision was automatically updated to reflect the committed changes.
mgehre marked an inline comment as done.
Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2019, 3:10 PM
thakis added a subscriber: thakis.Aug 21 2019, 3:36 PM

Similar to

cfe/trunk/unittests/Sema/CMakeLists.txt
14

Is it necessary to use ASTMachers to test this? It'd be good if SemaTests wouldn't have to depend on ASTMatchers (for linking speed, as well for layering hygiene).

cfe/trunk/unittests/Sema/GslOwnerPointerInference.cpp
9

This weird relative include path is a hint that this isn't the intended use :)

mgehre marked an inline comment as done.Aug 21 2019, 3:57 PM
mgehre added inline comments.
cfe/trunk/unittests/Sema/GslOwnerPointerInference.cpp
9

True. But ...
I thought about copying the matches function from STMatchersTest.h, but together with all callees, that's many lines of code.
Alternatively, I could implement my own AST walking and property checking, but then the test would be much less readable.
As yet another alternative, I though about moving the GslOwnerPointerInference.cpp test itself to unittests/ASTMatchers,
but that would also be strange because it actually tests Sema functionality.

What would be your suggestion? (It's good that we discuss this now because I'm planning to extend this unit test further).

thakis added inline comments.Aug 21 2019, 4:25 PM
cfe/trunk/unittests/Sema/GslOwnerPointerInference.cpp
9

Do you need the full matches() function? With the current test, it's not even clear to me what exactly it tests since it looks very integration-testy and not very unit-testy. Maybe if you make the test narrower, you don't that much scaffolding? (Integration-type tests are supposed to be lit tests, and unit tests are supposed to test small details that are difficult to test via lit.)

And, unrelatedly, for libc++, shouldn't we just add the attribute to the libc++ source instead of trying to infer it in the compiler?

mgehre marked 2 inline comments as done.Aug 22 2019, 1:17 PM
mgehre added inline comments.
cfe/trunk/unittests/Sema/GslOwnerPointerInference.cpp
9

The purpose of the test is to verify that the ClassTemplateSpecialization gets the Owner attribute, independent of which of the redeclarations of the template had the attribute attached.
The second thing that I want to test here in the future is that the inference for std:: types leads to correct Owner/Pointer attributes on the ClassTemplateSpecialization.,
i.e. a type named std::vector should have OwnerAttr on each of its ClassTemplateSpecialization.

We currently do this in a lit test here https://reviews.llvm.org/differential/changeset/?ref=1574879 via checking the AST dump, but that is really hard to debug in cases of failures,
and not entirely clear that it checks exactly what we want. (E.g. we can check that a ClassTemplateSpecializationDecl line is followed by a PointerAttr line, but we don't actually know if that attribute hierarchically belongs to that decl or a another one).

So @gribozavr suggested to me to instead break this integration tests into smaller chunks using the ASTMatchers, which at least solves that problem.

mgehre marked an inline comment as done.Aug 22 2019, 1:21 PM

Yes, for libc++ we could add the annotations to its source code, and eventually this would be the cleaner solution.
Currently, that is neither sufficient (because one could use libstdc++, MSVC or an older libc++ version) nor necessary (the inference we need for libstdc++ / MSVC also works for libc++),
so it's currently low on our priority list.