This is an archive of the discontinued LLVM Phabricator instance.

[Clang][Attr] Introduce the `assume` function attribute
ClosedPublic

Authored by jdoerfert on Nov 23 2020, 10:04 AM.

Details

Summary

The assume attribute is a way to provide additional, arbitrary
information to the optimizer. For now, assumptions are restricted to
strings which will be accumulated for a function and emitted as comma
separated string function attribute. The key of the LLVM-IR function
attribute is llvm.assume. Similar to llvm.assume and
__builtin_assume, the assume attribute provides a user defined
assumption to the compiler.

A follow up patch will introduce an LLVM-core API to query the
assumptions attached to a function. We also expect to add more options,
e.g., expression arguments, to the assume attribute later on.

The omp [begin] asssumes pragma will leverage this attribute and
expose the functionality in the absence of OpenMP.

Diff Detail

Event Timeline

jdoerfert created this revision.Nov 23 2020, 10:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2020, 10:04 AM
Herald added a subscriber: bollu. · View Herald Transcript
jdoerfert requested review of this revision.Nov 23 2020, 10:04 AM
jdoerfert updated this revision to Diff 307194.Nov 23 2020, 1:38 PM

Generalize check lines [NFC]

Thank you for your patience with the delays getting to your review! Along with the documentation question, one line of thinking I'm also not clear on is whether this should be a type-based attribute rather than a declaration-based attribute. For instance, if you want to write an assumption about the parameters to a function, those are not in scope for a declaration attribute but are for a type attribute. But also, it seems like this assumption information is something that would be useful to carry around as part of the function's identity. For instance, would users want to be able to make a call through a function pointer and have the same set of assumptions conveyed to the backend?

clang/include/clang/Basic/AttrDocs.td
3989

It's not clear to me how this string literal relates to program constructs being assumed. I think adding some example usage would be very instructive, but I also think the string literal needs more documentation. e.g.,

__attribute__((assume("the parameter 'i' will be nonnegative"))) void func(int i);

seems reasonable from the existing text but probably isn't what we're looking for. ;-)

If we're going to require the assumption to be a valid expression, then I'm not certain a string literal is doing us any favors given that there won't be any expression validity checking done on the string. Why not support an expression as the argument directly?

jdoerfert updated this revision to Diff 310724.Dec 9 2020, 5:12 PM

List known assumptions and warn if unknown.

I added the warning and typo recognition as discussed in IRC. We list known assumption strings in the documentation and SemaDeclAttr. As we add new ones we should include a short description or link to their meaning. The ones I added so far are OpenMP specific (dubbed omp_XXX) and the OpenMP standard is given as reference.

Thank you for your patience with the delays getting to your review! Along with the documentation question, one line of thinking I'm also not clear on is whether this should be a type-based attribute rather than a declaration-based attribute. For instance, if you want to write an assumption about the parameters to a function, those are not in scope for a declaration attribute but are for a type attribute.

I would like to see that, in our last proposal of something similar we suggested to add something like __attribute__((arg_attr(3, "nofree"))) which would translate to the nofree IR attribute for argument 3. I can also envision a type based assumption attribute but I feel a declaration one will be needed nevertheless. We'll also add expression assumptions as they are required by OpenMP and generally useful. The "payload" of the assume attribute will then be a union, similar to the AlignedAttribute.

But also, it seems like this assumption information is something that would be useful to carry around as part of the function's identity. For instance, would users want to be able to make a call through a function pointer and have the same set of assumptions conveyed to the backend?

Like other IR attributes, if you can go from the function pointer to a (set of) declaration(s) you can use the attributes from that declaration. I don't think this is any different than other function attributes, e.g., readnone in this aspect.

jdoerfert updated this revision to Diff 310727.Dec 9 2020, 5:25 PM

minor spelling improvements

Thank you for your patience with the delays getting to your review! Along with the documentation question, one line of thinking I'm also not clear on is whether this should be a type-based attribute rather than a declaration-based attribute. For instance, if you want to write an assumption about the parameters to a function, those are not in scope for a declaration attribute but are for a type attribute.

I would like to see that, in our last proposal of something similar we suggested to add something like __attribute__((arg_attr(3, "nofree"))) which would translate to the nofree IR attribute for argument 3. I can also envision a type based assumption attribute but I feel a declaration one will be needed nevertheless. We'll also add expression assumptions as they are required by OpenMP and generally useful. The "payload" of the assume attribute will then be a union, similar to the AlignedAttribute.

I want to make sure we're on the same page with this.

If the assumption is a unit of information that applies to a single parameter, then the attribute should be written on the parameter itself rather than using positional information. e.g., void func(int i [[clang::whatever]]); is preferable to void func(int i) [[clang::whatever(1)]]; By writing the attribute directly on the parameter, we remove confusion over things like "are parameter indexes one-based or zero-based?" and "does the implicit 'this' parameter count?".

If the assumption is unit of information that applies to the function but involves one or more parameters, then writing the attribute *might* makes sense as a type attribute. e.g., void func(int i, int j) [[clang::whatever(i < j)]];

It sounds to me like we may wind up with both kinds of situations, which is fine.

But also, it seems like this assumption information is something that would be useful to carry around as part of the function's identity. For instance, would users want to be able to make a call through a function pointer and have the same set of assumptions conveyed to the backend?

Like other IR attributes, if you can go from the function pointer to a (set of) declaration(s) you can use the attributes from that declaration. I don't think this is any different than other function attributes, e.g., readnone in this aspect.

But what about the behavior the user sees? e.g., should the user be alerted to this being a potentially bad thing?

void func(int *ip) [[clang::assume(ip != 0)]];
void other_func(int *ip); // No assumptions about ip
int *get_ip(); // May return null

...
void (*fp)(int *);

if (rand() % 10 == 0)
  fp = other_func;
else
  fp = func; // Should we warn about this?

fp(get_ip()); // This could do bad things if fp == func because get_ip() may return null
clang/include/clang/Basic/Attr.td
3532

Should this be supported on ObjCMethod declarations as well?

clang/include/clang/Basic/AttrDocs.td
3987

wrap to 80-col

3997
4007

Can you link to that standard here so users don't have to hunt for it?

clang/include/clang/Basic/DiagnosticSemaKinds.td
742–749

This changes behavior for users because -Wno-assume now needs to be spelled -Wno-assumption -- I think we may want to keep the diagnostics for the assumption attribute separate from the diagnostics for builtin assume.

For instance, I think the assumption attribute warnings will all wind up leading to an ignored attribute, so perhaps the new grouping should be under the existing ignored attributes group.

744

I'd say that this case should probably be worded unknown assumption string '%0'; attribute ignored

747

Similarly: unknown assumption string '%0' may be misspelled; attribute ignored, did you mean '%1'?

clang/include/clang/Sema/Sema.h
9949

I think this should probably return something so that the caller knows whether to drop the attribute or not?

clang/lib/Sema/SemaDeclAttr.cpp
1720

No need to check the count manually, the common attribute handler should do that for you already.

1723

Does it make sense to warn about an unknown assumption string but then keep the attribute anyway? This seems like a situation where I'd expect the attribute to be dropped as it appears to be meaningless.

Also, I think this should pass in the source location to the argument, not the attribute itself. You can get this information out of the call to checkStringLiteralArgumentAttr() (it's an optional parameter).

1728

The current approach is reasonable, but I wonder if there's a way we can force this list to be filled out by the backend pass authors such that the frontend can query the backend for the supported list?

jdoerfert updated this revision to Diff 310938.Dec 10 2020, 9:25 AM
jdoerfert marked 8 inline comments as done.

Addressed remarks

Dropping the attribute and forcing the attribute to be somehow registered makes various use cases impossible. All we would allow is an integrated compilation where the consumers are available and the FE is aware of them. This is hard for optional LLVM plugins, potentially impossible if you separate the FE and compilation steps, e.g., as you do in LTO builds. I don't see the benefit in dropping the string if it is unknown, the warning is giving the user all the information we have, namely that the FE cannot verify this is a known assumption. Keeping the string attributes doesn't cost us anything and there is no immediate benefit by dropping it (IMHO), it just makes certain compilation scenarios harder/impossible.

clang/include/clang/Basic/Attr.td
3532

Sure, why not I guess.

clang/lib/Sema/SemaDeclAttr.cpp
1720

Copied from the generic templated handler.

jdoerfert updated this revision to Diff 311010.Dec 10 2020, 1:31 PM

Allow to register assumption stings through llvm-core.

jdoerfert updated this revision to Diff 311060.Dec 10 2020, 4:25 PM

Move into LLVMCore and provide more helper to extracts assumptions

jdoerfert updated this revision to Diff 311070.Dec 10 2020, 4:56 PM

Remove accidental test change and ensure users create KnownAssumptionString objects

aaron.ballman added inline comments.Dec 15 2020, 5:49 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
742–749

I think we can ignore this comment now, but I do wonder whether we should split the diagnostic group somewhat. To me, the "may be misspelled" warning is going to always be a serious warning -- it means the user fat-fingered something and the attribute is likely to not do anything useful. But the non-misspelling warning seems like a less severe warning because I may be relying on an undocumented string from a pass. So I could imagine wanting to turn off all of the warnings for using an undocumented string from a pass but wanting to retain all of the misspelling warnings. However, you may be envisioning a different usage pattern for this than I am. WDYT?

744

In light of wanting to keep the attribute around, I think this should be unknown assumption string '%0'; attribute is potentially ignored or something along those lines. Basically, I think the diagnostic should focus on the fact that the user cannot predict what happens with this attribute.

747

Similar here. unknown assumption string '%0' may be misspelled; attribute is potentially ignored, did you mean '%1'?

clang/include/clang/Sema/Sema.h
9945

Should this be a static function in SemaDeclAttr.cpp rather than a public part of the Sema interface? I can't think of any other callers for this aside from handleAssumptionAttr().

clang/lib/Sema/SemaDeclAttr.cpp
1720

Ah, yeah, that one needs to do the test because it may be called for an attribute expecting a variadic number of arguments.

llvm/include/llvm/IR/Assumptions.h
16

Should probably fix the lint warning here.

jdoerfert updated this revision to Diff 311922.Dec 15 2020, 8:35 AM
jdoerfert marked 5 inline comments as done.

Addressed comments

clang/lib/Sema/SemaDeclAttr.cpp
1728

We do, kinda. A backend that wants to query an assumption from the llvm.assume function attribute needs to provide a KnownAssumptionStr which has a comment people might read.

aaron.ballman accepted this revision.Dec 15 2020, 8:52 AM

LGTM (found one very minor late nit in a cmake file though). Thank you for the discussion on this!

clang/lib/Sema/CMakeLists.txt
4

Might as well keep this list alphabetized.

This revision is now accepted and ready to land.Dec 15 2020, 8:52 AM
This revision was landed with ongoing or failed builds.Dec 15 2020, 2:52 PM
This revision was automatically updated to reflect the committed changes.