This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] support ignoring use-after-free checking with reference_counted attribute
Needs ReviewPublic

Authored by chrisdangelo on Nov 10 2021, 3:39 PM.

Details

Reviewers
NoQ
aaron.ballman
Summary

This change adds a new attribute "reference_counted". This attribute is intended to annotate struct declarations that are known to use a reference counting pattern before being freed.

The long term intention is that "reference_counted" may grow additional affordances for describing the expected retain and release conventions that can be wired up to train the RetainCountChecker.

The short term intention is that "reference_counted" will be used to silence use-after-free and double-free checks that are indicating false positives when the pointer is being monitored by a reference counting system.

This change is being uploaded privately, first for review by Artem Dergachev and then later with Aaron Ballman.

Diff Detail

Event Timeline

chrisdangelo requested review of this revision.Nov 10 2021, 3:39 PM
chrisdangelo created this revision.
chrisdangelo created this object with visibility "No One".
Herald added a project: Restricted Project. · View Herald TranscriptNov 10 2021, 3:39 PM
chrisdangelo retitled this revision from [analyzer] support ignoring use-after-free checking with reference_counted attribute to [wip] [analyzer] support ignoring use-after-free checking with reference_counted attribute.Nov 15 2021, 1:06 PM
chrisdangelo added a reviewer: NoQ.
chrisdangelo changed the visibility from "No One" to "Public (No Login Required)".Nov 15 2021, 2:16 PM
chrisdangelo added inline comments.Dec 1 2021, 12:07 PM
clang/include/clang/Basic/Attr.td
1699

I've discussed a bit with Devin Coughlin yesterday. Devin would like to be sure that we have appropriate warnings showing if this attribute is misused.

Previously, when I have been testing the use of this attribute, I mistakenly added the attribute annotation after "typedef" and before "struct". This was causing the attribute to not be discovered when the analyzer attempts to inspect the declaration. I don't recall if a compiler warning was being emitted.

This note is being written as a reminder to be evaluate what warning support is already included or adding support if necessary.

These changes allow the analyzer to silence an issue discovered by MallocChecker if the SymRef or Statement in question is of a struct that has been declared with the compiler attribute annotation "reference_counted".

In the previous iteration of this diff, SymRef alone was used to determine the declared type, and if it was "reference_counted". Previously, if the SymRef was found to be "reference_counted" an analyzer warning would not be issued.

In the current changes, the static analyzer is more lenient and robust. Now, when a MallocChecker issue is discovered, each node in the bug path is visited, and any use of the pointer in question is checked for its type. If the type is found to be "reference_counted" the bug is marked invalid and ultimately not delivered to the developer. In the current changes, a pointer in question is checked for its type both by using the SymRef and using the type information in the AST.

This change adds support for reference_counted attribute in pragma-attribute-supported-attributes-list.

This change edits comments describing function signatures for isReferenceCountedAttributed[...].

Hi @aaron.ballman,

It's nice to meet you, virtually.

I've been working with @NoQ on this change. I've now removed the [wip] prefix. When you have some time, I'd appreciate your feedback.

This change adds a new attribute "reference_counted". This attribute is intended to annotate struct declarations that are known to use a reference counting pattern before being freed.

The long term intention is that "reference_counted" may grow additional affordances for describing the expected retain and release conventions that can be wired up to train the static analyzer RetainCountChecker.

The short term intention, executed in these changes, is that "reference_counted" will be used to silence static analyzer use-after-free and double-free checks that are indicating false positives when the pointer is being monitored by a reference counting system.

This change does not currently enable warnings when the "reference_counted" attribute is written before the "struct" keyword. There may be other cases where the programmer may incorrectly use "reference_counted".

I've successfully run the tests locally via ninja check-clang-analysis and ninja check-clang.

I've successfully exercised these changes against a large C / C++ project and studied the output with @NoQ.

These changes are expected to be used in conjunction with additional work with ownership compiler attributes (https://reviews.llvm.org/D113530).

Thank you,
Chris

chrisdangelo retitled this revision from [wip] [analyzer] support ignoring use-after-free checking with reference_counted attribute to [analyzer] support ignoring use-after-free checking with reference_counted attribute.Dec 14 2021, 1:22 PM

I've successfully exercised these changes against a large C / C++ project and studied the output with @NoQ.

Could you please share the results to have look? How can I reproduce and evaluate the effect of this change?

NoQ added a comment.Dec 15 2021, 2:40 PM

Could you please share the results to have look? How can I reproduce and evaluate the effect of this change?

You'll need a project that uses a lot of manual reference counting, typically in C code (in our case it's XNU which is technically open-source but probably not very interesting). I've definitely seen such code in the wild, typically around various interpreters (eg., C APIs for python or javascript interpreters where the interpreted language's values are reference counted and garbage collected by the language so they need to be manually retained during interop) but we don't have much real data on those.

In such projects you'll see code like this:

struct S {
  int ref_count;
};

...

void release(S *s) {
  if (--s->refcount == 0) {
    free(s);
  }
}

If the static analyzer doesn't know the original reference count, it'll have to assume that the retain count will drop to zero every time such check is made. In particular, it'll assume that even if it did see a prior increment of the reference count (what if the original reference count was equal to -1?). Any subsequent use of the pointer will cause a use-after-free warning. In practice, however, the whole point of having reference counts is to be protected against such situations; reference count is always positive and typically large enough to avoid such use-after-free problems, so these are usually false positives when emitted by MallocChecker.

We have a separate checker, RetainCountChecker, that's specifically designed to deal with such code, to understand the underlying reference counting conventions. Currently RetainCountChecker supports 3 different hardcoded reference counting conventions. However even in just XNU we've seen a lot more separate conventions that are so many that they become impractical to hardcode.

The new attribute currently suppresses MallocChecker warnings but in the future we plan to extend it to actively enable RetainCountChecker to learn the appropriate convention and emit warnings that are more appropriate. The patch doesn't have any effect if the attribute isn't already present so you'll have to actively add it in order to observe any effect.

clang/include/clang/Basic/Attr.td
1700

Hmm, I think it's a good idea to provide some documentation.

Could you please share the results to have look? How can I reproduce and evaluate the effect of this change?

...

Ah, I see. Thank you for the detailed motivation, and future plans.

Hi @aaron.ballman,

It's nice to meet you, virtually.

Nice to meet you as well, and I'm very sorry that this review took so long for me to get to. It fell off my radar for a while. :-(

I've been working with @NoQ on this change. I've now removed the [wip] prefix. When you have some time, I'd appreciate your feedback.

This change adds a new attribute "reference_counted". This attribute is intended to annotate struct declarations that are known to use a reference counting pattern before being freed.

One thing to consider: analyzer_noreturn currently exists under the GNU spelling only. We may want to think about adding a [[]] spelling for Clang static analyzer attributes. I'm not certain whether we'd want that to be spelled [[clang::reference_counted]] or [[clang_analyzer::reference_counted]], etc. We've never really decided on a policy about analyzer-specific attributes before and we may want to consider what we'd like to do before/while adding this attribute.

The long term intention is that "reference_counted" may grow additional affordances for describing the expected retain and release conventions that can be wired up to train the static analyzer RetainCountChecker.

The short term intention, executed in these changes, is that "reference_counted" will be used to silence static analyzer use-after-free and double-free checks that are indicating false positives when the pointer is being monitored by a reference counting system.

Another possibility to consider is whether we want to introduce a generic suppression attribute to suppress static analyzer diagnostics in general, rather than having a per-use case attribute.

This change does not currently enable warnings when the "reference_counted" attribute is written before the "struct" keyword. There may be other cases where the programmer may incorrectly use "reference_counted".

I've successfully run the tests locally via ninja check-clang-analysis and ninja check-clang.

I've successfully exercised these changes against a large C / C++ project and studied the output with @NoQ.

That's great to hear! Can you speak to the results in more detail? (Did you have to use the attribute a large number of times? Did the false positive rate improve and if so, by how much? Were there any surprises you found? That sort of thing.)

I'd also be curious to know how this attribute interacts (if at all) with things like ARC (automatic reference counting) in ObjC and the NS retains-related attributes as those are also related to reference counting and also only impact the static analyzer. If we can avoid adding a new attribute (perhaps by adding a new attribute spelling to an existing attribute that covers the same needs), that'd be a really great thing.

These changes are expected to be used in conjunction with additional work with ownership compiler attributes (https://reviews.llvm.org/D113530).

Thank you,
Chris

clang/include/clang/Basic/Attr.td
1699

Agreed -- we're definitely missing test coverage for the attribute as it stands. We should have coverage testing applying it to the correct and incorrect subjects, diagnostics on being given arguments, tests that inheritance works, etc.

1700

+1, no new undocumented attributes please. (The docs also help us reviewer to ensure the behavior of the patch matches the intent.)

clang/test/Analysis/malloc-annotations.c
314

Please give all of the new test case functions a prototype (e.g., (void) instead of ()).

490

You should add a newline to the end of the file.