This is an archive of the discontinued LLVM Phabricator instance.

[attributes] [analyzer] Add handle related attributes
ClosedPublic

Authored by xazax.hun on Nov 19 2019, 3:30 PM.

Details

Summary

These are three new attributes that I plan to use in a Fuchsia specific check. I did not give them Fuchsia specific names as they might be useful for other APIs such as Posix as well. In case that is a concern I can rename them to be more specific to the check.

Diff Detail

Event Timeline

xazax.hun created this revision.Nov 19 2019, 3:30 PM
NoQ added inline comments.Nov 19 2019, 4:05 PM
clang/lib/Sema/SemaDeclAttr.cpp
7408

The next obvious step here would be to type-check the declaration to make sure that it's actually a handle (and emit a warning if it isn't).

xazax.hun marked an inline comment as done.Nov 19 2019, 4:17 PM
xazax.hun added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
7408

Yeah, I do agree, but I think this depends on the role of the attribute. Adding a type check would make this more restrictive in a sense other users who want to write for example a Posix API checker and want to reuse this attribute might not be able to do so without touching the type-check code. Which may or may not be good. I do not have any strong feelings about either of the directions.

NoQ added inline comments.Nov 19 2019, 4:46 PM
clang/lib/Sema/SemaDeclAttr.cpp
7408

So you envision this working on completely arbitrary types? Fair!

Maybe check that when AcquireHandleAttr is on a parameter, it's actually an out-parameter?

xazax.hun marked an inline comment as done.Nov 19 2019, 4:57 PM
xazax.hun added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
7408

Hmm, I think this makes a lot of sense! So basically, I would exclude primitive, non-pointer types that are taken by value.

This is also missing tests for all the Sema handling (that the attributes only appertain to the specified subjects, accept no arguments, etc).

clang/include/clang/Basic/Attr.td
3477

What about function-like interfaces such as lambdas, blocks, or other callable objects that are not a FunctionDecl?

3482

Should this have a subject limiting it to parameters?

3487

Should this have a subject limiting it to parameters?

clang/include/clang/Basic/AttrDocs.td
4703

What is a "handle"? I think some introduction docs are needed.

  • Check trivial cases when the acquire_handle attribute is not on an output parameter.
xazax.hun marked 3 inline comments as done.Nov 20 2019, 10:49 AM
xazax.hun added inline comments.
clang/include/clang/Basic/Attr.td
3477

Good point!

3482

My understanding was that the subject list is "inherited" from InheritableParamAttr, which already has it limited to parameters. Is this not the case?

clang/include/clang/Basic/AttrDocs.td
4703

Good point. Do you prefer to copy and paste the introduction to all attributes or is it enough to only have it for one of them?

aaron.ballman added inline comments.Nov 20 2019, 11:04 AM
clang/include/clang/Basic/Attr.td
3482

It doesn't seem to be the case looking at Attr.td:557 or thereabouts. Adding tests for the attributes should clarify whether the change is needed or not, too.

clang/include/clang/Basic/AttrDocs.td
4703

You can set up a documentation category to put all the related attributes under, and that category can have the introductory documentation. You can see an example of this with the calling convention attributes. There is a DocCatCallingConvs that is then the Category for all the other attributes.

xazax.hun updated this revision to Diff 230347.Nov 20 2019, 3:57 PM
xazax.hun marked 8 inline comments as done.
  • Address review comments.
aaron.ballman added inline comments.Nov 23 2019, 11:57 AM
clang/include/clang/Basic/Attr.td
3477

This get you partway there -- FunctionLike allows functions and function pointers, but I don't think it allows lambdas or blocks. You should add some test cases for all of these (including function pointers) to verify you're able to mark the entities you want and get the behavior you're after.

For instance, on function pointers and lambdas, this may impact the function *type* which is a good design question to be thinking about. What should happen with code like:

[[clang::acquire_handle]] void int open(const char *);
int (*fp)(const char *) = open;

If the attribute is a type attribute as opposed to a declaration attribute, then you can think about diagnostics in these sort of cases. However, if it is a type attribute, there is more work involved in hooking it up to the type system.

FWIW, this attribute smells like it should be a type attribute rather than a declaration attribute because I can imagine wanting to use this with function pointers and not losing the analysis capabilities (esp for things like callback functions).

3483

One test that you should add is whether this case is properly handled:

void (*fp)(int whatever [[clang::uses_handle]]);
[](int whatever [[clang::uses_handle]]){};

e.g., when it's on a parameter declaration of function pointer type or a lambda, does everything still behave as expected?

clang/include/clang/Basic/AttrDocs.td
4702

sockets, processes -> sockets, and processes

clang/lib/Sema/SemaDeclAttr.cpp
6606

const auto *

6607

I'm skeptical of this. I think a better check is if the type is a pointer or reference -- was there a reason you didn't go with that approach?

xazax.hun marked an inline comment as done.Nov 27 2019, 11:46 AM
xazax.hun added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
6607

The reason why I do not want to restrict this to pointers and references is that the users might have a custom wrapper type to deal with handles that might be passed by value. Or the user might pass an iterator by value which points to a handle. These are not extremely likely, but I did not want to diagnose those cases. What do you think?

xazax.hun marked 5 inline comments as done.
  • Address review comments.
xazax.hun added inline comments.Nov 27 2019, 12:11 PM
clang/include/clang/Basic/Attr.td
3477

I see your point. I have, however, some concerns making this part of the type system. I think it would be great to let the users add the annotations gradually without any diagnostic. For example:

void my_function_with_callback( int (*callback)(int __attribute__((handle_use)) );

...

my_function_with_callback(already_annotated);
my_function_with_callback(not_yet_annotated); // Do we intend to give a diagnostics here?

What do you think?

3477

After some digging it looks like I cannot apply any attributes to lambdas. For example, you cannot even add [[noreturn]] to a lambda, because the language only supports type attributes in that position.

aaron.ballman added inline comments.Nov 27 2019, 1:31 PM
clang/include/clang/Basic/Attr.td
3477

After some digging it looks like I cannot apply any attributes to lambdas.

That's what got me thinking about making this a type attribute instead. It would be a shame for you to not be able to mark lambdas.

I have, however, some concerns making this part of the type system.

Understandable -- type attributes are not much fun. ;-) I think your concern is justified, but you have the control to allow or disallow whatever you'd like in terms of the semantics of the type attribute. Personally, I would want that use case to give a diagnostic because that use may close the handle, for instance. Or it could be creating/releasing a handle instead of just using one, etc.

For instance, one idea (and it may not be a good one) is that you could diagnose any implicit conversion between the function pointer types, but still allow explicit conversion between them. This gives users a transition path -- they can explicitly add the type cast in the places they've not yet got the annotations added yet. They could even use a macro to perform the cast, giving them a handy marker of what code still needs to be fixed.

clang/lib/Sema/SemaDeclAttr.cpp
6607

I guess I was envisioning the value type being the issue, but I suppose you could have some sort of reference-counted object where the copy is not problematic. But this is still a bit of a strange constraint -- it disallows putting it on an integer parameter type, but not, say a float or a const char *?

Have you considered requiring the handle type to be annotated itself, so that you can verify the parameter is actually a handle as opposed to a mistake?

xazax.hun updated this revision to Diff 231784.Dec 2 2019, 2:50 PM
  • Convert to type attributes.

@aaron.ballman

Type attributes are definitely more flexible as in they can be applied in more contexts. These attributes, however, make no sense outside of a function declaration, or maybe a type alias. So I feel like we could argue on both sides. I made a minimal version of the attribute so it can potentially unblock dependent revisions (the static analyzer changes) and plan to add more diagnostics in follow-up patches. What do you think?

In the meantime, I realized lambdas are actually not that important. Lambdas are only usable in C++, and in C++ one should prefer to use move only RAII wrapper for handles. This reduces the problem into a use after move and we already have a check for that. These annotations are more important for C, where we do not have language features to mitigate these problems. The function pointer argument is still valid though.

@aaron.ballman

Type attributes are definitely more flexible as in they can be applied in more contexts. These attributes, however, make no sense outside of a function declaration, or maybe a type alias.

Function pointers are a primary case for what I was thinking of, and those are neither a function declaration nor a type alias.

So I feel like we could argue on both sides. I made a minimal version of the attribute so it can potentially unblock dependent revisions (the static analyzer changes) and plan to add more diagnostics in follow-up patches. What do you think?

Thank you! I think this is a more flexible way forward.

In the meantime, I realized lambdas are actually not that important. Lambdas are only usable in C++, and in C++ one should prefer to use move only RAII wrapper for handles.

Generally yes, but thinking of things like file descriptors (which are also handles), it seems like overkill to expect someone to wrap those in a separate type.

This reduces the problem into a use after move and we already have a check for that. These annotations are more important for C, where we do not have language features to mitigate these problems. The function pointer argument is still valid though.

Okay, that's a fair point.

clang/include/clang/Basic/Attr.td
3475

These probably should be DeclOrTypeAttr because they can be applied to a parameter (which is a declaration) or a type.

clang/include/clang/Basic/AttrDocs.td
4700

The docs should clarify when they say "function" that they mean the function type rather than the function declaration.

clang/lib/Sema/SemaType.cpp
7436

You should be able to pass in Attr directly instead of Attr.getAttrName()->getName(), I believe. Also, I'd prefer the non-function be put into the diagnostic text itself with a %select if we need to vary it.

clang/test/Sema/attr-handles.cpp
8

Are you planning to implement the TODOs as part of this patch?

xazax.hun updated this revision to Diff 232681.Dec 6 2019, 6:06 PM
xazax.hun marked 7 inline comments as done.
  • Make the annotation acceptable on both types and declarations.
  • Fix some TODOs.
  • Teach TypePrinter to print the annotations.
  • Address review comments.
clang/lib/Sema/SemaType.cpp
7436

Indeed. But only passing Attr would result in duplicated '. I'm reusing the message, so I ended up not changing this part.

aaron.ballman added inline comments.Dec 7 2019, 6:45 AM
clang/lib/Sema/SemaType.cpp
7444

Is this correct, or should it be if (!CurType->isFunctionType())?

clang/test/Sema/attr-handles.cpp
21–22

These diagnostics look incorrect to me -- I thought we wanted the attribute to apply to function types?

You should also add some test cases where the attribute is applied to a function pointer type.

23

Please add the newline to the end of the file.

xazax.hun updated this revision to Diff 232723.Dec 7 2019, 4:04 PM
xazax.hun marked 2 inline comments as done.
xazax.hun added inline comments.
clang/test/Sema/attr-handles.cpp
21–22

I think this one can be a bit confusing. We can apply this to the types of the parameter or the return type. I thought restricting to users to put it either on the parameter or the return type would make it more obvious what is the target of this attribute. In case this is not idiomatic, I can let the user to attach it to the function type.

aaron.ballman marked an inline comment as done.Dec 8 2019, 9:26 AM
aaron.ballman added inline comments.
clang/lib/Sema/SemaType.cpp
7444

I see now that this is only incorrect for the acquire attribute, but not the other two.

clang/test/Sema/attr-handles.cpp
4

Can you add some test cases showing that these work on a typedef declaration?

21–22

Ah, I see -- acquire can be on a function type, but use and release are on the parameter. You are missing some test cases:

int correct() [[clang::acquire_handle]]; // Should work, applies to the function type
int (*fp [[clang::acquire_handle]])(); // Should work, applies to the function type
xazax.hun marked an inline comment as done.Dec 8 2019, 9:47 AM
xazax.hun added inline comments.
clang/lib/Sema/SemaType.cpp
7444

I think we might consider it incorrect for the acquire as well. Because if we let the user put acquire on the function, it is ambiguous where to put the annotation. If we do not let them put on the function they are always forced to put on the return type to express this. But in case this kind of ambiguity is a feature and not a bug I can change the behavior.

xazax.hun updated this revision to Diff 232873.Dec 9 2019, 8:59 AM
xazax.hun marked 2 inline comments as done.
  • Make sure typedefs are supported.
aaron.ballman added inline comments.Dec 9 2019, 11:03 AM
clang/lib/Sema/SemaType.cpp
7444

I guess I was considering the semantics as being acquire on a function type applies to the non-void return value being the handle and release on a function type applies to the only parameter being passed to the function.

I don't think it makes sense to apply acquire or release semantics to a handle type itself because that's not really part of the type identity itself (it's not an "acquiring handle" or a "releasing handle" -- I would think it's *both* at the same time), so I think that's been throwing me for a loop with the latest round of patches here. What I was imagining was that you should mark any place a handle is created or released, which works almost-fine for parameters (because you can mark the parameter declarations with an attribute). It doesn't quite work for function pointers, however, because those don't have parameters. e.g., void (*fp)([[clang::acquire_handle]] int &) won't do what you expect, I believe. However, for functions which return the acquired handle, there is no good place to write the attribute except on the function itself (because, again, it's not a property of the return type, but of the specific function's return type). For those cases, I imagine you want the function *type* to be what codifies that a handle is returned, because then it will work with function pointers. For *using* a handle, I'm not certain what the benefit is to having an attribute on each use -- that seems likely for a user to forget to add an annotation somewhere and get incorrect behavior. I would normally suggest that the attribute be applied to the type declaration of the handle (e.g., through a typedef), but if you want this to work with handles coming from system headers (like file descriptors or sockets, etc) then will this approach work?

NoQ added a comment.EditedDec 9 2019, 7:31 PM

It still mildly worries me that the attributes aren't truly reusable, as the exact meaning of the attribute depends on the domain. In particular, every domain is expected to have different approaches to error handling, i.e. a function that creates or destroys a handle may fail to, respectively, create or destroy the handle, but instead indicate the failure in a domain-specific manner, eg. through magical return values or null handle or errno or whatever.

@aaron.ballman, do you think this is a problem? Should we rather go for an attribute name that's obviously domain-specific (eg., __attribute__((fuchsia_acquire_handle))), or is it ok to re-use attributes without re-using their exact meaning?

In D70469#1776523, @NoQ wrote:

It still mildly worries me that the attributes aren't truly reusable, as the exact meaning of the attribute depends on the domain. In particular, every domain is expected to have different approaches to error handling, i.e. a function that creates or destroys a handle may fail to, respectively, create or destroy the handle, but instead indicate the failure in a domain-specific manner, eg. through magical return values or null handle or errno or whatever.

@aaron.ballman, do you think this is a problem? Should we rather go for an attribute name that's obviously domain-specific (eg., __attribute__((fuchsia_acquire_handle))), or is it ok to re-use attributes without re-using their exact meaning?

That may be part of why I keep pushing back on this addition -- it seems like these are general purpose attributes that can be used to identify what a handle is and where a handle is obtained/released. Or these could be specific to a particular coding guideline's definition of a handle and associated semantics. If the goal is to only support a limited set of use cases, then I think something like [[clang::fucschia_acquire_handle]] makes more sense. If the goal is to provided general utilities for the static analyzer to reason about handles, then I think we want the more generalized names.

In D70469#1776523, @NoQ wrote:

It still mildly worries me that the attributes aren't truly reusable, as the exact meaning of the attribute depends on the domain. In particular, every domain is expected to have different approaches to error handling, i.e. a function that creates or destroys a handle may fail to, respectively, create or destroy the handle, but instead indicate the failure in a domain-specific manner, eg. through magical return values or null handle or errno or whatever.

@aaron.ballman, do you think this is a problem? Should we rather go for an attribute name that's obviously domain-specific (eg., __attribute__((fuchsia_acquire_handle))), or is it ok to re-use attributes without re-using their exact meaning?

That may be part of why I keep pushing back on this addition -- it seems like these are general purpose attributes that can be used to identify what a handle is and where a handle is obtained/released. Or these could be specific to a particular coding guideline's definition of a handle and associated semantics. If the goal is to only support a limited set of use cases, then I think something like [[clang::fucschia_acquire_handle]] makes more sense. If the goal is to provided general utilities for the static analyzer to reason about handles, then I think we want the more generalized names.

Basically, I think the exact semantics belong to the static analysis and the annotation is only there to supplement some information about the code that is not available in the type system. So in the sense I envisioned these attributes as a general way to encode knowledge about an API without relying on any semantics from any analysis. And the FuchsiaHandleChecker would be a first analysis to consume these handles. Inevitable, different analyses have to interpret these slightly differently when it comes to error handling and other corner cases. This is the result of APIs being fundamentally different. The question is, does this justify having a separate set of annotations for all handles, like posix_handle_acquire, fuchsia_handle_acquire? I do not have strong feelings and I am also OK with going a Fuchsia specific name. It is not that hard to add a new spelling without code repetition. Also, having separate spelling could provide some additional benefits, e.g. we could check of someone is trying to close a posix handle using a fuchsia API. I think both directions have some pros and cons the ultimate questions is what do we care about more. Assigning more specific semantics to annotations (as opposed to the analyses only) or reducing the annotation vocabulary.

NoQ added a comment.Dec 10 2019, 4:30 PM

What about __attribute__((acquire_handle("fuchsia")))?

In D70469#1778609, @NoQ wrote:

What about __attribute__((acquire_handle("fuchsia")))

I would by fine with this as well. Aaron?

xazax.hun marked an inline comment as done.Dec 11 2019, 8:41 AM
xazax.hun added inline comments.
clang/lib/Sema/SemaType.cpp
7444

Hmm. The problem is that a function might do multiple things with multiple handles, for example in Fuchsia there are functions that replacing a handle with another one. So I am afraid naively putting the attributes on the function type might end up being ambiguous sometimes. I totally agree with your argument about the annotation not being part of the type identity. If we do want to put this on the function type maybe we could do something similar to nonnull and enumerate parameter indices? What do you think?

The use annotation might be a bit misleading. The user does not necessarily need to annotate all uses.
Let's consider the following function and a call:

void f(int *handle);
close_handle(handle);
f(&handle); // What will this do?

The function f would be free to do anything, like store the address of the handle in a global data structure, overwrite the closed handle and so on. The purpose of the use annotation is to tell the analyzer that none of this will happen, so we can safely diagnose a use after release. So the original idea was to annotate those functions that are guaranteed to never close, overwrite etc the handle.
For unannotated functions the analyzer will be conservative and assume that the code is OK.

NoQ added inline comments.Dec 11 2019, 12:44 PM
clang/lib/Sema/SemaType.cpp
7444

I heard that type attributes are relatively scary and require a lot of work to implement, and nonnull/_Nonnull is definitely one of the historical examples of why they're scary. Like, you don't want to mess with the type system and find yourself explaining whether it's still the same type or a different type simply because it wears an attribute. Declaration attributes are much more straightforward in this sense.

In D70469#1778609, @NoQ wrote:

What about __attribute__((acquire_handle("fuchsia")))

I would by fine with this as well. Aaron?

I would also be fine with that. I don't have a strong opinion on whether to use a string literal or an identifier as the argument, but I slightly swing towards string literal so you could do something like "C++ Core Guideline" where there are spaces or punctuators.

clang/lib/Sema/SemaType.cpp
7444

I heard that type attributes are relatively scary and require a lot of work to implement

They're harder to implement than declaration attributes because there's more to think about, but that's not a reason to not use a type attribute when that's the correct semantics we want.

7444

The purpose of the use annotation is to tell the analyzer that none of this will happen, so we can safely diagnose a use after release. So the original idea was to annotate those functions that are guaranteed to never close, overwrite etc the handle.

Thank you for the explanation, that helps me understand better.

I think the default for "use" may be backward -- it seems to me that less functions will escape/close a handle than will operate on a handle. Do you have any data to back up my gut feeling? :-D I sort of wonder if we want the model to be: 1) mark a type as being a handle of a specific type (POSIX, fuschia, Win32, etc), 2a) mark a parameter declaration as creating a handle (the parameter type specifies the handle type), 2b) mark a function type as its return value creating a handle (the return type specifies the handle type), 3a) mark a parameter declaration as releasing a handle (the parameter type specifies the handle type), 3b) mark a function type as releasing a handle (all parameters of handle types are considered to be released), 4) mark a parameter declaration as possibly mutating the state of the handle (closing it, changing its identity, escaping it, etc).

Usage would be something like:

typedef int PosixHandle [[clang::handle_type("POSIX")]];

PosixHandle open(const char *) [[clang::acquire_handle]];
bool open2(const char *, [[clang::acquire_handle]] PosixHandle&);

void close(PosixHandle) [[clang::release_handle]];
void close2([[clang::release_handle]] PosixHandle);

void typical_use(PosixHandle);
void atypical_use([[clang::modifies_handle]] PosixHandle&);

// Maybe we want to allow these too?
int open3(const char *) [[clang::acquire_handle("POSIX")]];
void close3([[clang::release_handle("POSIX") int);
void atypical_use2([[clang::modifies_handle("POSIX") int&);

However, when I think about this, this smells an awful lot like some existing attributes we already support: capabilities. We currently use them for thread safety analysis, but there's no reason they shouldn't work out of the box for what you want (https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#basic-concepts-capabilities). The basic idea behind them is that you specify a capability like "handle" and then specify what operations grant that capability (acquire the handle) or revoke it (release the handle). It may be worth exploring whether we can reuse these annotations for what you're trying to accomplish.

xazax.hun marked an inline comment as done.Dec 16 2019, 9:35 AM
xazax.hun added inline comments.
clang/lib/Sema/SemaType.cpp
7444

While it is true that less function will escape/close handles, this is also a trade-off. If we assume all unknown functions will close/escape, the check will be less noisy with unannotated/patially annotated codebase. If we assume that unknown function will not escape/close, we will get more false positives unless the whole code base is annotated. I chose this to support gradual introduction of annotations.

Ok. So how the handle works on declarations it is pretty clear. If we put it on a parameter, it affects that parameter, if we put it on a function, it affects the return value. The question is, how to deal with types.

2b) mark a function type as its return value creating a handle (the return type specifies the handle type
3b) mark a function type as releasing a handle (all parameters of handle types are considered to be released)

This does not really work. In case of Fuchsia, none of the syscalls actually return handles. Handles are acquired using output parameters and there are syscalls that acquire multiple handles. There are also syscalls that replaces handles, i.e. whatever we do at the type level we really need the paramter granularity. Any simplification like if we annotate the function type as such and such will affect all the parameters will not work.

However, when I think about this, this smells an awful lot like some existing attributes we already support: capabilities.

I did look into capabilities, and while I am not an expert in this area I found the following problems:

  • As far as I understand capabilities also only work well with fully annotated code, and gradually adding annotations can result in lots of false positives.
  • As far as I understand, they do not have paramter level granularity, which is crucial for Fuchsia's APIs.

I did not want to generalize capability attributes as I would probably need to update all the related analysis as well.

  • Added string argument to specify the handle type.
  • Only AcquireHandleAttr can be a type attribute and it can only be applied to function types.
aaron.ballman added inline comments.Dec 20 2019, 10:46 AM
clang/include/clang/Basic/Attr.td
3475

I have no idea whether this is a problem or not (and I'm not certain if you know either) -- how bad is it that this is DeclOrTypeAttr in the case where it's an inherited parameter with the attribute? The other attributes use InheritableParamAttr, but the only thing they apply to is a parameter so that's easy. I don't know what should be done in this case, or if it's an issue.

I think a test case for this scenario would be:

void func([[clang::acquires_handle("foo") int *a);

void func(int *a) {
  // a should be marked as acquires_handle in the AST should you do an AST dump
}
clang/lib/Sema/SemaType.cpp
7638

Can you be sure to add a test where the attribute is written in the type position for a non-function type to ensure you get a reasonable diagnostic?

e.g., int [[clang::acquire_handle("")]] a;

aaron.ballman accepted this revision.Dec 20 2019, 11:05 AM

LGTM with the nits fixed, thank you!

clang/include/clang/Basic/Attr.td
3475

I verified that this is an issue, but it's not a new one. [[clang::lifetimebound]] has the same issue. You should add a FIXME comment here and a test case with a FIXME comment so we don't lose track of this, but it doesn't seem critical to fix right now.

This revision is now accepted and ready to land.Dec 20 2019, 11:05 AM
xazax.hun marked an inline comment as done.Dec 20 2019, 11:48 AM

Thanks for the review! :)

This revision was automatically updated to reflect the committed changes.
paulherman added inline comments.
clang/include/clang/Basic/AttrDocs.td
4724

Sorry for the drive-by comment, but I believe that the code snippets are out of date since the addition of the string param. Would you mind clarifying what the param should do -- is it a project name such as "fuchsia" or is it a name of the handle such as "my_awesome_pipe"? I'd be happy to take the fix, but not sure which way to go.

aaron.ballman added inline comments.Aug 23 2021, 5:22 AM
clang/include/clang/Basic/AttrDocs.td
4724

I understand it to be the name of the handle, not a project-specific name.