Page MenuHomePhabricator

Teach some warnings to respect gsl::Pointer and gsl::Owner attributes
ClosedPublic

Authored by xazax.hun on Jul 5 2019, 10:50 AM.

Details

Summary

This patch extends some existing warnings to utilize the knowledge about the gsl::Pointer and gsl::Owner attributes.

The implicit assumption of this patch is that if a class annotated with gsl::Pointer is created from a gsl::Owner (through conversion operator or a constructor taking the owner as the first argument) the Pointer will point to the buffer of the Owner, so it will dangle after the lifetime of the Owner ends.

Diff Detail

Repository
rL LLVM

Event Timeline

xazax.hun created this revision.Jul 5 2019, 10:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2019, 10:50 AM
xazax.hun retitled this revision from Teach some warnings to respect gsd::Pointer and gsl::Owner attributes to Teach some warnings to respect gsl::Pointer and gsl::Owner attributes.Jul 5 2019, 10:51 AM
xazax.hun changed the repository for this revision from rC Clang to rG LLVM Github Monorepo.
gribozavr added inline comments.Jul 8 2019, 4:18 AM
clang/lib/Sema/SemaInit.cpp
6510 ↗(On Diff #208206)

What is this name supposed to mean? Initialization of a "lifetime pointer"? What's a "lifetime pointer"?

If you were imitating "LifetimeBoundCall", it is a reference to the attribute: https://clang.llvm.org/docs/AttributeReference.html#lifetimebound .

6587 ↗(On Diff #208206)

I feel like the code would be better off if we defined another function for handling gsl::Pointer semantics.

This function is for clang::lifetimebound, and mixing the logic is confusing IMO.

7049 ↗(On Diff #208206)

Sorry, I can't parse this: "... initializing lifetime Pointer ..."

clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
10 ↗(On Diff #208206)

= nullptr

16 ↗(On Diff #208206)

Can T and MyOwner be the same type?

It is confusing to have two -- for example, toOwner() returning T instead of MyOwner is confusing.

20 ↗(On Diff #208206)

"ReleaseAsMyPointer'

21 ↗(On Diff #208206)

"ReleaseAsRawPointer"

22 ↗(On Diff #208206)

This method is confusing -- is it a name that the warning is supposed to know about?

25 ↗(On Diff #208206)

It would be helpful to rename this test function and other functions below according to what they test.

f, g, g2, i, i2 -- unclear what the test is supposed to be about, and why tests are grouped like that.

33 ↗(On Diff #208206)

Why is the last line OK? Looks like a false negative to me -- the pointer on the heap is pointing to a local variable, just like in f, which produces a warning.

If it is an intentional false negative, please annotate it as such in comments.

38 ↗(On Diff #208206)

Is "release" a magic name that the warning understands?

xazax.hun marked 5 inline comments as done.Jul 8 2019, 8:14 AM
xazax.hun added inline comments.
clang/lib/Sema/SemaInit.cpp
6510 ↗(On Diff #208206)

When we were working on the lifetime analysis we needed to distinguish raw pointers from the user defined types that are categorized as pointers. We adopted this notion of Lifetime Pointer which stands for the user defined type that is annotated with gsl::Pointer.

In case this is confusing I could rename it GslPointerInit and GslTempOwner. What do you think?

7049 ↗(On Diff #208206)

We called gsl::Pointer annotated types lifetime pointers (to distinguish from raw pointers). Should we use gsl::Pointer in the comments too? Or do you have an alternative in mind?

clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
16 ↗(On Diff #208206)

I agree that T might not be a great name. The reason why we have two types because one can be converted to a Pointer using a conversion operator the other can be converted using a one argument non-explicit constructor. These two are generating different ASTs, thus I wanted to test both ways.

33 ↗(On Diff #208206)

Yeah, this is an intentional false negative as we do not have enough information in a statement local analysis to warn about this case. Will fix the comment thanks!

38 ↗(On Diff #208206)

Method calls are not handled yet, but the idea is to understand some STL specific methods, like the std::unique_ptr::release.

xazax.hun updated this revision to Diff 208439.Jul 8 2019, 9:20 AM
xazax.hun marked 12 inline comments as done.
  • Address review comments.
clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
22 ↗(On Diff #208206)

Not yet, see my other answer below.

xazax.hun updated this revision to Diff 208441.Jul 8 2019, 9:22 AM
  • Fix a typo.
gribozavr added inline comments.Jul 9 2019, 5:56 AM
clang/lib/Sema/SemaInit.cpp
6510 ↗(On Diff #208206)

My suggestion is to refer to a type that was annotated with gsl::Pointer as a "gsl::Pointer type" in prose, or "GslPointer" where you need an identifier.

So I agree about GslPointerInit.

As for the second one, it seems like GslOwnerTemporaryInit would be more fitting -- WDYT? It is an initialization of a temporary that has a gsl::Owner type.

6549 ↗(On Diff #208441)

Move closer to the point of usage?

6549 ↗(On Diff #208441)

re: name, pathOnlyContainsGslPointerInit or pathOnlyInitializesGslPointer

6564 ↗(On Diff #208441)

"isRecordWithAttr" ?

7072 ↗(On Diff #208441)

Please adjust the name.

7076 ↗(On Diff #208441)

Is it important that the whole path only contains gsl::Pointer nodes?

clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
16 ↗(On Diff #208206)

Please explain this distinction in the comments in the test. If you didn't tell me just now, I would never know.

Maybe it is also a good idea to split the MyPointer type then, let each owner have its own pointer. Right now having one pointer with two owners is confusing, it suggests that owners are somehow related, but they aren't.

xazax.hun marked 6 inline comments as done.Jul 9 2019, 9:05 AM
xazax.hun added inline comments.
clang/lib/Sema/SemaInit.cpp
7076 ↗(On Diff #208441)

We are trying to figure out what do we initialize a gsl::Pointer annotated class. The reason why we have multiple GslPointerInits in the path is that we usually can see a lot of intermediate temporary objects. Looking at the AST of the examples I have so far the other kind of paths does not seem to be relevant. I can imagine refining this condition later after examining a false negative but I think starting with the most conservative approach should be reasonable for warnings.

xazax.hun updated this revision to Diff 208709.Jul 9 2019, 9:06 AM
  • Address review comments.
gribozavr accepted this revision.Jul 10 2019, 1:27 AM

I'm not an expert in SemaInit code, but this change LGTM.

clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
8 ↗(On Diff #208709)

Not needed anymore.

25 ↗(On Diff #208709)

Would be nice if the second pair of types had clearly related names, like the first pair (MyOwner/MyPointer).

This revision is now accepted and ready to land.Jul 10 2019, 1:27 AM
xazax.hun marked an inline comment as done.Jul 11 2019, 12:08 PM

Thanks for the review! I will update the patch soon. As there is a dependency that is not accepted yet Richard (who authored the code I extended) might have some chance to take a look at this patch.

clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
25 ↗(On Diff #208709)

Would renaming MyPtrFromConv to PointerFromConv be enough to suggest the relationship?

gribozavr added inline comments.Jul 11 2019, 1:37 PM
clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
25 ↗(On Diff #208709)

Yes, that would be sufficient. However my best suggestion would be:

MyOwner => MyIntOwner
MyPointer => MyIntPointer

OwnerWithConv => MyLongOwnerWithConversion
MyPtrFromConv => MyLongPointerFromConversion

Also makes it clear which two types are related together.

xazax.hun updated this revision to Diff 211127.Jul 22 2019, 9:25 AM
xazax.hun marked 5 inline comments as done.
  • Address the rest of the review comments.
  • Fix a TODO and add some more tests.
  • Fix a false positive from previous change.
  • Fix a false positive case found by running over ~200 open source projects
gribozavr accepted this revision.Jul 29 2019, 5:52 AM
gribozavr added inline comments.
clang/lib/Sema/SemaInit.cpp
7095 ↗(On Diff #211580)

It is unclear to me why if (IsGslPtrInitWithGslTempOwner) is before if (!MTE), seems like that exclusion should apply to our case, too.

xazax.hun marked an inline comment as done.
  • A small refactoring based on an observation from a review comment.
xazax.hun marked an inline comment as done.
  • Actually move the code snippet in question.
clang/lib/Sema/SemaInit.cpp
7095 ↗(On Diff #211580)

The reason was that I always skipped MTEs and was looking for the expr that actually initialized MTEs. Relying on MTEs, however, turned out to simplify the code a bit. Updating the diff, thanks!

Thanks for the review!
Since I did some refactoring I will wait for an additional accept before committing.

gribozavr accepted this revision.Aug 6 2019, 4:36 AM
gribozavr added inline comments.
clang/lib/Sema/SemaInit.cpp
7031 ↗(On Diff #212205)

This change would be best committed separately (preferably with a test).

7046 ↗(On Diff #212205)

I think you can go back to llvm::c_find now?

7076 ↗(On Diff #212205)

"Owner"

7077 ↗(On Diff #212205)

Why is it a false positive? std::move left memory owned by localOwner in unspecified state.

xazax.hun marked an inline comment as done.Aug 6 2019, 8:43 AM
xazax.hun added inline comments.
clang/lib/Sema/SemaInit.cpp
7077 ↗(On Diff #212205)

I saw user code relying on the semantics of certain classes. E.g. they assume if a std::unique_ptr is moved the pointee is still in place, so it is safe to return a reference to the pointee. Do you think those cases should be diagnosed too?

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 6 2019, 12:13 PM
gribozavr added inline comments.Aug 6 2019, 12:49 PM
clang/lib/Sema/SemaInit.cpp
7077 ↗(On Diff #212205)

It is... debatable. It is not obvious whether the lifetime of the pointed-to memory has ended or not without more detailed lifetime annotations. I think it is fair to silence it, however, I think the comment should be updated to explain the situation in a more detailed way, since without context it looks like a use-after-move.

xazax.hun marked an inline comment as done.Aug 7 2019, 11:10 AM
xazax.hun added inline comments.
clang/lib/Sema/SemaInit.cpp
7077 ↗(On Diff #212205)

Do you think renaming localOwner to uniquePtr would be sufficient or do you want me to extend the text too?

Hi. I noticed in our builders that both of the warnings introduced in this patch are being diagnosed for pointers that don't use GSL at all. I'm attempting to make a small reproducer now.

Hi. I noticed in our builders that both of the warnings introduced in this patch are being diagnosed for pointers that don't use GSL at all. I'm attempting to make a small reproducer now.

Hi!

Clang now apply GSL annotations for some STL types automatically. So this is the expected behavior. Are any of those warnings unwanted/false positive with the newest Clang version? If so, please share the reproducers and I will either fix them ASAP or revert/turn off the warnings.

leonardchan added a comment.EditedAug 12 2019, 4:38 PM

Hi. I noticed in our builders that both of the warnings introduced in this patch are being diagnosed for pointers that don't use GSL at all. I'm attempting to make a small reproducer now.

Hi!

Clang now apply GSL annotations for some STL types automatically. So this is the expected behavior. Are any of those warnings unwanted/false positive with the newest Clang version? If so, please share the reproducers and I will either fix them ASAP or revert/turn off the warnings.

Thanks! I didn't know that they were applied to stl types. This could explain why the warning is raised in my case (std::string), but I don't think there's anything wrong in this example that would lead to a warning:

leonardchan@cp-snakewater:~/llvm-monorepo/llvm-build-3-master$ cat ~/misc/test.cpp
#include <string>

std::string MakeStr();
void other_func(const char *s);

void func(std::string s) {
  const char *x = MakeStr().c_str();
  other_func(x);
}
leonardchan@cp-snakewater:~/llvm-monorepo/llvm-build-3-master$ ~/llvm-monorepo/llvm-build-3-master/bin/clang++ -c ~/misc/test.cpp
/usr/local/google/home/leonardchan/misc/test.cpp:7:19: warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling]
  const char *x = MakeStr().c_str();
                  ^~~~~~~~~
1 warning generated.

This was made using a release build from ToT clang.

Hi. I noticed in our builders that both of the warnings introduced in this patch are being diagnosed for pointers that don't use GSL at all. I'm attempting to make a small reproducer now.

Hi!

Clang now apply GSL annotations for some STL types automatically. So this is the expected behavior. Are any of those warnings unwanted/false positive with the newest Clang version? If so, please share the reproducers and I will either fix them ASAP or revert/turn off the warnings.

Thanks! I didn't know that they were applied to stl types. This could explain why the warning is raised in my case (std::string), but I don't think there's anything wrong in this example that would lead to a warning:

leonardchan@cp-snakewater:~/llvm-monorepo/llvm-build-3-master$ cat ~/misc/test.cpp
#include <string>

std::string MakeStr();
void other_func(const char *s);

void func(std::string s) {
  const char *x = MakeStr().c_str();
  other_func(x);
}
leonardchan@cp-snakewater:~/llvm-monorepo/llvm-build-3-master$ ~/llvm-monorepo/llvm-build-3-master/bin/clang++ -c ~/misc/test.cpp
/usr/local/google/home/leonardchan/misc/test.cpp:7:19: warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling]
  const char *x = MakeStr().c_str();
                  ^~~~~~~~~
1 warning generated.

This was made using a release build from ToT clang.

Thanks for the repro! I think this is a true positive. MakeStr will create a temporary object, and c_str() will create a pointer that points into the buffer owned by the temporary object. At the end of the full expression the temporary object is destroyed and the result of c_str will dangle.

Hi. I noticed in our builders that both of the warnings introduced in this patch are being diagnosed for pointers that don't use GSL at all. I'm attempting to make a small reproducer now.

Hi!

Clang now apply GSL annotations for some STL types automatically. So this is the expected behavior. Are any of those warnings unwanted/false positive with the newest Clang version? If so, please share the reproducers and I will either fix them ASAP or revert/turn off the warnings.

Thanks! I didn't know that they were applied to stl types. This could explain why the warning is raised in my case (std::string), but I don't think there's anything wrong in this example that would lead to a warning:

leonardchan@cp-snakewater:~/llvm-monorepo/llvm-build-3-master$ cat ~/misc/test.cpp
#include <string>

std::string MakeStr();
void other_func(const char *s);

void func(std::string s) {
  const char *x = MakeStr().c_str();
  other_func(x);
}
leonardchan@cp-snakewater:~/llvm-monorepo/llvm-build-3-master$ ~/llvm-monorepo/llvm-build-3-master/bin/clang++ -c ~/misc/test.cpp
/usr/local/google/home/leonardchan/misc/test.cpp:7:19: warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling]
  const char *x = MakeStr().c_str();
                  ^~~~~~~~~
1 warning generated.

This was made using a release build from ToT clang.

Thanks for the repro! I think this is a true positive. MakeStr will create a temporary object, and c_str() will create a pointer that points into the buffer owned by the temporary object. At the end of the full expression the temporary object is destroyed and the result of c_str will dangle.

Ah you're right! I accidentally glossed over that. This means your patch is working as expected then. Thanks!