This is an archive of the discontinued LLVM Phabricator instance.

Introduce the callback attribute and emit !callback metadata
ClosedPublic

Authored by jdoerfert on Dec 8 2018, 9:08 PM.

Details

Summary
With commit rXXXXX (currently https://reviews.llvm.org/D54498), LLVM
gained the ability to apply existing optimizations on indirections
through callbacks. This is based on an abstraction that hides the
middle man as described in rXXXXX and the llvm::AbstractCallSite class.

This commit enables clang to emit !callback metadata that is
understood by LLVM. It does so in three different cases:
  1) For known broker functions declarations that are directly
     emitted, e.g., __kmpc_fork_call for the OpenMP pragma parallel.
  2) For known broker functions that are identified by their name and
     source location through the builtin mechanism, e.g.,
     pthread_create from the POSIX thread API.
  3) For user annotated functions that carry the "callback(idx, ...)"
     attribute. The attribute has to include the index of the callback
     callee and how the passed arguments can be identified (as many as
     the callback callee has).

For additional information, also consider the commit message and
discussion for the LLVM patch: https://reviews.llvm.org/D54498
NOTE: This is only committed after https://reviews.llvm.org/D54498 and the commit message will be modified accordingly.

Diff Detail

Event Timeline

jdoerfert created this revision.Dec 8 2018, 9:08 PM
aaron.ballman requested changes to this revision.Dec 10 2018, 12:18 PM

This is missing all of the Sema and SemaCXX tests. Should have tests for member functions, variadic functions, incorrect arguments, incorrect subjects, etc.

include/clang/Basic/Attr.td
1206

No new undocumented attributes, please.

aaron.ballman added inline comments.Dec 10 2018, 12:18 PM
include/clang/Basic/Attr.td
1205

Should this also apply to Objective-C methods?

Why should the user specify this attribute on the function as opposed to on the parameter? e.g.,

// Why this:
__attribute__((callback (1, 2, 3)))
void* broker0(void* (*callee)(void *), void *payload, int otherPayload) {
  return callee(payload);
}

// Instead of this:
void* broker0(void* (*callee)(void *) __attribute__((callback (2, 3))), void *payload, int otherPayload) {
  return callee(payload);
}

// Or this:
void* broker0(void* (*callee)(void *) __attribute__((callback (payload, otherPayload))), void *payload, int otherPayload) {
  return callee(payload);
}

I ask because these "use an index" attributes are really hard for users to use in practice. They have to account for 0-vs-1 based indexing, implicit this parameters, etc and if we can avoid that, it may be worth the effort.

This revision now requires changes to proceed.Dec 10 2018, 12:18 PM
jdoerfert marked 2 inline comments as done.Dec 10 2018, 1:47 PM

This is missing all of the Sema and SemaCXX tests. Should have tests for member functions, variadic functions, incorrect arguments, incorrect subjects, etc.

I will write more tests and update this revision.

include/clang/Basic/Attr.td
1205

Should this also apply to Objective-C methods?

I don't need it to and unless somebody does, I'd say no.

I ask because these "use an index" attributes are really hard for users to use in practice. They have to account for 0-vs-1 based indexing, implicit this parameters, etc and if we can avoid that, it may be worth the effort.

I was thinking that the function notation makes it clear that there is *only one callback per function* allowed right now. I don't expect many manual users of this feature until we improve the middle-end support, so it is unclear to me if this requirement needs to be removed as well.

Other than that, some thoughts:

  • I do not feel strongly about this.
  • The middle requirement seems not much better n the first, we would still need to deal with index numbers (callbacks without arguments are not really interesting for now).
  • The last encoding requires us to define a symbol for "unknown argument" (maybe _ or ?).
1206

Ok, I can write documentation similar to the commit message and the lang-ref documentation for the callback metadata.

Update according to initial feedback

aaron.ballman added inline comments.Dec 11 2018, 12:36 PM
include/clang/Basic/Attr.td
1205

I was thinking that the function notation makes it clear that there is *only one callback per function* allowed right now.

I don't see how that follows. Users may still try writing:

__attribute__((callback (1, 3, 4)))
__attribute__((callback (2, 3, 4)))
void broker0(void (*cb1)(void *, int), void (*cb2)(void *, int), void *payload, int otherPayload) {
  cb1(payload, otherPayload);
  cb2(payload, otherPayload);
}

and reasonably expect that to work (we should have this as a test case, and probably warn on it).

I'm not strongly opposed to the current way this is exposed to users, just wondering if we can find a better way to surface the feature.

The last encoding requires us to define a symbol for "unknown argument" (maybe _ or ?).

Ah, I wasn't aware that this was part of the feature, but the documentation you wrote helped to clarify for me. Personal preference is for ?, but any symbol will do (so long as we aren't hoping users can count commas, e.g., callback(,,,,frobble,,,foo)).

include/clang/Basic/AttrDocs.td
3711

I'd remove callee here in both places.

3714

and the following indicies are the forwarded arguments.

3716

The index '0' is used to represent a callback callee argument which is not present in the declared parameter list.

3718

attributes -> attribute

3719

, allow to make the connection -> , make the connection

3721

Switch to consistently using a single space after the period (as was done in the previous paragraph).

3723

undefinied -> undefined

3724

else than then actual callback -> anything other than the actual callback.

Thus, inspected -> Inspected,

3725

shall may be mentioned -> are listed

3730–3731

You should mention the circumstances under which we automatically recognize a callback so that users know when they're required to write the annotation vs allowing the compiler to infer it.

jdoerfert updated this revision to Diff 177771.Dec 11 2018, 1:20 PM
jdoerfert marked 11 inline comments as done.

Fix and improve documentation

Thanks for the feedback.

Once we decided on the style of the annotation, I will implement that and change the tests/documentation accordingly.

include/clang/Basic/Attr.td
1205

and reasonably expect that to work (we should have this as a test case, and probably warn on it).

We have a test case and we'll spit out an error. (Sema/attr-callback-broken.c line 21 & 22)

I'm not strongly opposed to the current way this is exposed to users, just wondering if we can find a better way to surface the feature.

I can change it to the inlined style if nobody disagrees:

void broker(int foo, void (*callback)(int, int, int, int) __attribute__((callback(foo, ?, bar, ?))), int bar);

As I mentioned, I don't have a strong opinion on this but I just don't want to change it back and forth :)

arichardson added inline comments.
include/clang/Basic/Builtins.def
942

qsort_r callback argument order is different on Linux, macOS and FreeBSD so those constants can't be hardcoded:
Linux:
void qsort_r(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *, void *), void *arg);
FreeBSD:
void qsort_r(void *base, size_t nmemb, size_t size, void *thunk, int (*compar)(void *, const void *, const void*));
macos:
void qsort_r(void *base, size_t nel, size_t width, void *thunk, int (*compar)(void *, const void *, const void *));

test/CodeGen/callback_qsort_r.c
2

This should use a linux triple otherwise the qsort_r declaration is wrong. Ideally it should also handle macos+FreeBSD with the inverted argument order.

jdoerfert marked 2 inline comments as done.Dec 13 2018, 7:05 AM
jdoerfert added inline comments.
include/clang/Basic/Builtins.def
942

Really good point. I'll remove qsort_r in the next version I guess. If we find another similar function we can extend the encoding to sth like: C:(<3,-1,-1,4>|<5,4,-1,-1>) which represents both possibilities. The Builtin::Context::performsCallback(...) will then need to check which is appropriate based on the type of the actual declaration.

rkruppe removed a subscriber: rkruppe.
rkruppe added a subscriber: rkruppe.
jdoerfert updated this revision to Diff 180710.Jan 8 2019, 11:40 AM
jdoerfert marked 3 inline comments as done.

Update encoding, rebase to HEAD

jdoerfert updated this revision to Diff 180971.Jan 9 2019, 5:43 PM

Allow to use names in the callback metadata

jdoerfert marked 3 inline comments as done.Jan 9 2019, 5:46 PM
jdoerfert added inline comments.
include/clang/Basic/Attr.td
1205

You can now use parameter names or indices in the callback attribute. The special identifiers "" (double underscore) and "this" can be used to specify an unknown (-1 in the index notation) and the implicit "this" (0 in the index notation) argument.

test/CodeGen/callback_qsort_r.c
2

I postponed qsort_r until we have another similar function or somebody want to implement the necessary functionality.

aaron.ballman added inline comments.Jan 11 2019, 7:01 AM
include/clang/Basic/Attr.td
1205

Nice! I like that approach.

include/clang/Basic/AttrDocs.td
3716

I'd use backticks around 0 and this rather than single and double quotes, to make it clear that these are syntax and not prose.

3720–3721

Grammatical issue here -- I think it should probably say "or one that is, but is potentially..."

3724

are -> is

include/clang/Basic/DiagnosticSemaKinds.td
2598

I think this should say 'callback' attribute (here and in the other diagnostics); WDYT?

2599–2600

This is not needed, see err_attribute_argument_out_of_bounds.

2606

I'd reword this to use "may not" instead of "shall not".

2609–2612

These are not needed, see err_attribute_too_many_arguments and err_attribute_too_few_arguments (or err_attribute_wrong_number_arguments).

lib/Sema/SemaDecl.cpp
13587–13588

You can combine these predicates into one if statement.

lib/Sema/SemaDeclAttr.cpp
3474

This doesn't match the documentation -- I assume you switched from ? to __ because __ at least parses as a valid identifier, whereas ? would require extra parsing support? If so, that's fine by me.

3476

This doesn't match the documentation either, but I'm less clear on why the double underscores are used.

3485

Identifiers don't match the usual naming conventions. Prefer ++U as well.

3486

Spurious newline

3495

You can pass in IdLoc->Ident and the diagnostic engine should properly print and quote it for you.

3501

Spurious newline.

3521

Newline

test/CodeGen/callback_annotated.c
19–20

Can you add a test like this:

void* broker2(void *(*callee)(void));

void* test(void) {
  return broker2(test);
}

__attribute__((callback (callee)))
void* broker2(void *(*callee)(void));

Basically, I'm trying to see whether we properly merge the declarations and note that the call to broker2() within test() really does have the callback attribute.

test/Sema/attr-callback-broken.c
6

Please run the tests through clang-format.

jdoerfert updated this revision to Diff 181420.Jan 11 2019, 8:04 PM
jdoerfert marked 21 inline comments as done.

Style changes, clang-format, documentation improvements

Updated according to your comments.

lib/Sema/SemaDeclAttr.cpp
3474

Yes, __, and __this where chosen because they work without lexer/parser changes and are in the implementation namespace. I forgot to update the documentation though. Will be fixed.

3476

If you use this, the lexer will generate the special "this" token. That one is checked explicitly to be only used inside of non-static class methods. If you have an idea how to avoid this check or make it consider uses in the attribute as OK, please let me know.

3485

OK.

Prefer ++U as well.

Out of curiosity, why?

3486

That was intentional but OK.

aaron.ballman added inline comments.Jan 14 2019, 7:04 AM
include/clang/Basic/AttrDocs.td
3718

The '?' here needs to change to be '__' instead.

include/clang/Basic/Builtins.def
96

Wrap to 80-col

lib/Sema/SemaDeclAttr.cpp
3476

Why do you want to avoid that check? It seems like that's a very good check to have -- what circumstances would you expect a user to use this in a something other than a non-static member function?

3485

Because the previous value of U is not needed (you're executing the instruction only for its side effects, not its value).

jdoerfert marked 2 inline comments as done.

Small fixes, allow "this" inside "callbacks"

Generalize the treatment of "kw_this" as "kw_ident"

aaron.ballman accepted this revision.Jan 15 2019, 12:53 PM

LGTM aside from some minor nits.

However, please hold off on committing this until *after* the 8.0 branch happens. While I don't expect there to be issues with this patch, it does introduce a pretty novel new way to interact with attributes and I want to ensure that the community has the opportunity to react to any major issues with this approach, and I'm not certain a month or so will be enough time for that. If you need this patch to go in *before* the branch for some reason, please mention it! I'm not strictly opposed to it going in sooner rather than later, but if there's not a pressing need for the functionality, I'd prefer to wait out of an abundance of caution.

include/clang/Basic/AttrDocs.td
3716

__this should be this now.

utils/TableGen/ClangAttrEmitter.cpp
2163

const Record *

2171–2172

The formatting looks off here.

This revision is now accepted and ready to land.Jan 15 2019, 12:53 PM
This revision was automatically updated to reflect the committed changes.