Page MenuHomePhabricator

[clang] Adding the Likelihood Attribute from C++2a
AbandonedPublic

Authored by Tyker on Mar 16 2019, 4:36 PM.

Details

Summary

attributes after an if like:

if (...) [[likely]]

are now applied on the if instead of the following statement.

i added the likely attribute in the necessary places Attr.td, AttrDocs.td.

i added a diagnostics kind for C++2a and for likely not used on an if statement.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
aaron.ballman added inline comments.Mar 17 2019, 12:38 PM
clang/include/clang/Basic/DiagnosticSemaKinds.td
8164

This should follow the pattern used by decl attributes:

def err_attribute_wrong_stmt_type : Error<
  "%0 attribute only applies to statements and labels">;

There may be a diagnostic used for [[fallthrough]] that could be combined with this diagnostic, since that attribute can only be applied to null statements.

Tyker marked an inline comment as done.Mar 17 2019, 12:55 PM

I added the sema testes

Tyker added a comment.EditedMar 17 2019, 2:23 PM

if likely/unlikely can be applied to any statement or label what should be generated when it is applied to a statement that don't have any conditional branch ? should it be ignored ?

This comment was removed by lebedev.ri.

if likely/unlikely can be applied to any statement or label what should be generated when it is applied to a statement that don't have any conditional branch ? should it be ignored without warning or error ?

Disregard previous [deleted] comment from me.
Yes, i believe it should simply be always accepted, for any isa<clang::Stmt>().
How to model that in codegen i'm not sure yet. Likely by affecting the nearest (up the AST) branch / switch / etc.

if likely/unlikely can be applied to any statement or label what should be generated when it is applied to a statement that don't have any conditional branch ? should it be ignored ?

That's the semantics we need to figure out. The standard talks about paths of execution including the attribute (http://eel.is/c++draft/dcl.attr.likelihood#2.sentence-1), so I imagine this means we should be supporting code like:

void func(bool some_condition) {
  if (some_condition) {
    [[likely]];
  } else {
    [[unlikely]];
  }
}

While that may seem a bit silly (because you can add the attribute to the compound-statement), there are situations where you cannot do that, but the feature still has use. Consider:

void func() {
  try {
    some_operation();
  } catch (some_exception &e) {
    [[likely]];
    recover_without_killing_everyone_on_the_elevator();
  }
}

In some safety-critical situations, you may not care how fast the normal operation works but still need the failure mode to be highly optimized to avoid loss of life.

We have other fun questions to figure out as well, such as whether to diagnose this code and what the semantics should be:

void func(bool foo) {
  if (foo) {
    [[likely]];
    [[unlikely]];
  }
}

void func2(bool foo) {
  if (foo) {
    [[likely]];
  } else {
    [[likely]];
  }
}

We don't have to solve it all up front, but having an idea as to what the backend is going to do with this information may help us design the appropriate frontend behavior where the standard gives us this much latitude. Do you know how this is expected to hook in to LLVM?

Tyker updated this revision to Diff 191115.EditedMar 18 2019, 9:48 AM

i think we are suppose to hook likely/unlikely on @llvm.expect for if, for, while, switch. but i have no idea how we could hook it if we need to support catch.

i added a revision with likely/unlikely and the correct semantic (i think).

i also added the required checks. i didn't find a way to detect when we are in a catch block. but it should perhaps be allowded.

gcc has implemented the likely/unlikely attribute.
the are represented as hot in the backend. and they have intersting semantic

//gcc implementation
if (i) [[likely]] { //warning
  return 0;
}

if (i) { [[likely]] // no warning
  return 0;
}
Tyker updated this revision to Diff 191408.Mar 19 2019, 4:08 PM

added diagnostics for contradictory attributes like for if:

if (...) [[likely]]
  return 1;
else [[likely]]
  return 2;

handled the codegen for If to generate builtin_expect but i probably didn't do it the canonical way. i am awaiting advice of how to do it right.

cpplearner added inline comments.
clang/include/clang/Basic/DiagnosticSemaKinds.td
8166

attribue => attribute

8172

attribue => attribute

clang/lib/Parse/ParseStmt.cpp
1329

Should this use ElseBranchAttr instead of ThenBranchAttr?

Tyker updated this revision to Diff 191862.Mar 22 2019, 5:30 AM

handled codegen for if, while, for and do/while, it will generate a @llvm.expect before the condition based on the attribute
i changed slithly the semantic

if (...) {  //error
[[likely]] ...
}

if (...) [[likely]]  { // ok
 ...
}

and added tests for AST, Semantic and codegen

Additionally I think that you need to:

  • add tests for templates
  • handle the ternary operator

Also, as per the coding guidelines, you need to fix:

  • spelling of variables, eg hint -> Hint.
  • run clang-format on your patch.
clang/include/clang/Basic/DiagnosticSemaKinds.td
8173

I don't think that this is right. I don't see this restriction in the specification. A warning should definitely be emitted when the attribute is ignored, but I believe that an error is inappropriate.

8179

This should be warn_contradictory_attribute, and I am not seeing tests for this.

clang/lib/CodeGen/CodeGenFunction.h
4018

Doc ?

clang/lib/Parse/ParseStmt.cpp
1314

I don't think that this should be done here. The natural place to me seems to be in ActOnIfStmt. Additionally I think that you should wrap this in a static helper function.

clang/lib/Sema/SemaStmtAttr.cpp
257

A comment indicating what this is doing ?

riccibruno added inline comments.Mar 22 2019, 6:40 AM
clang/lib/Sema/SemaStmtAttr.cpp
74

I think that you need to test for each of the above.

Hmm, it seems that an attribute is not allowed by the grammar in the expression or assignment-expression of a conditional-expression. Was that intended when [[likely]] was added ?

Adding Richard for design strategy discussion.

clang/include/clang/Basic/DiagnosticSemaKinds.td
8173

I don't think that this is right. I don't see this restriction in the specification. A warning should definitely be emitted when the attribute is ignored, but I believe that an error is inappropriate.

Agreed.

8174

This diagnostic isn't needed -- warn_duplicate_attribute_exact will cover it.

8176

This diagnostic is not needed -- err_attributes_are_not_compatible will cover it.

8179

This note also isn't needed. You should use note_conflicting_attribute instead.

clang/lib/Sema/SemaStmtAttr.cpp
64–75

This is incorrect -- nothing in the standard requires the attribute to appear after a branch statement. I'm not even 100% convinced that this should map directly to __builtin_expect in all cases (though it certainly would for conditional branches), because that doesn't help with things like catch handlers or labels.

clang/test/SemaCXX/cxx2a-likelihood-attr.cpp
67

Be sure to add the newline, please.

Hmm, it seems that an attribute is not allowed by the grammar in the expression or assignment-expression of a conditional-expression. Was that intended when [[likely]] was added ?

Attributes do not appertain to expressions, so yes, that was intentional when we added likely. (You can have an attribute applied to an expression statement, but that's at a level that doesn't seem too useful for these attributes.)

Hmm, it seems that an attribute is not allowed by the grammar in the expression or assignment-expression of a conditional-expression. Was that intended when [[likely]] was added ?

Attributes do not appertain to expressions, so yes, that was intentional when we added likely. (You can have an attribute applied to an expression statement, but that's at a level that doesn't seem too useful for these attributes.)

Good to know, thanks !

Tyker added a comment.EditedMar 22 2019, 8:40 AM

about the semantic issue.
with the standard's latitude on where we can put the attribute, the attribute can appear any where on the path of execution and must be matched with the condition it corresponds to. we need to do it for diagnostics purposes to detect conflicts and repetitions. and we need it again for the codegen. but where in the ast can we store this information ?
is adding information to conditional statements Node acceptable ?

about the semantic issue.
with the standard's latitude on where we can put the attribute, the attribute can appear any where on the path of execution and must be matched with the condition it corresponds to. we need to do it for diagnostics purposes to detect conflicts and repetitions. and we need it again for the codegen. but where in the ast can we store this information ?

I feel like this could be tracked in the StmtBits as well as via an attributed stmt node. We need to be able to handle cases like:

if (foo) [[likely]] {
  stmts...
}

if (bar) {
  stmts...
  [[likely]];
  stmts...
}

if (baz)
  [[likely]] stmt;

The attributed statement is required because we want ast pretty printing to also print out the [[likely]] attribute usage and AST matchers to appropriately find the attribute, etc, and we need to know which statement the attribute was attached to in order to get the positioning correct. However, for diagnostic purposes, we likely want to know information like "has this path already been attributed" and that can either be through some extra data on an AST node, or it could be an Analysis pass like we do for [[fallthrough]] support with a separate pass for CodeGen.

is adding information to conditional statements Node acceptable ?

You can generally add information to AST nodes, but pay close attention to things like adding bits to a bit-field that suddenly cause the object size to increase and see if there are ways to avoid that if possible (and if not, it's still fine).

It might make sense to work your way backwards from the backend to see what possible ways we can support this feature. __builtin_expect may make sense for if statements, but for case and default labels in a switch, we may want to play with the branch weights. For catch handlers, we may need to thread information through yet another way. Once we know what kinds of paths we can actually do something interesting with, we may spot a pattern of how to expose the information through the AST.

To expand on the above, if you need to add a few bits to a statement/expression you can use the bit-field classes in Stmt (for example: IfStmtBitfields). You will also need to update the serialization code in Serialization/ASTWriterStmt.cpp and Serialization/ASTReaderStmt.cpp.

Tyker updated this revision to Diff 192004.Mar 23 2019, 11:23 AM

i implemented the semantic the changes for if for, while and do while statement and the AST change to if. can you review it and tell me if it is fine so i implement the rest. i didn't update the test so they will fail.

Before any further review, could you please run clang-format on your patch (but not necessarily on the tests and not on the *.td files), wrap lines to 80 cols, and in general use complete sentences in comments (that is with proper punctuation and capitalization) ? To run clang-format on your patch you can use clang-format-diff. This is the command I use :

git diff -U0 --no-color --no-prefix --cached | clang-format-diff-7 -style=LLVM -i -v, with --cached replaced appropriately.

lebedev.ri added inline comments.Mar 23 2019, 2:48 PM
clang/include/clang/AST/Stmt.h
1826

lib/AST/TextNodeDumper.cpp will need to be taught to print it too
(and astdumper test coverage to show that)

clang/lib/Serialization/ASTReaderStmt.cpp
226

[de]serialization code will too need tests
(not just that it is written and read, but that it happens successfully, losslessly)

@lebedev.ri where are tests for AST serialization are located ? i didn't find them.

@lebedev.ri where are tests for AST serialization are located ? i didn't find them.

They live in clang\tests\PCH.

i implemented the semantic the changes for if for, while and do while statement and the AST change to if. can you review it and tell me if it is fine so i implement the rest. i didn't update the test so they will fail.

I think this may be moving closer to the correct direction now, but I'm still curious to hear if @rsmith has similar opinions.

clang/include/clang/Basic/DiagnosticSemaKinds.td
8173

Typo: associated

Should probably read "attribute %0 is not associated with a branch and is ignored"

Also, I'd rename the diagnostic to be warn_no_likelihood_attr_associated_branch

8174–8175

I'd rename the diagnostic to warn_conflicting_likelihood_attrs and reword the diagnostic to "attribute %0 conflicts with previous attribute %1"

clang/include/clang/Sema/Scope.h
167

Missing full stop at the end of the comment. You should check all the comments in the patch to be sure they are full sentences (start with a capital letter, end with punctuation, are grammatically correct, etc).

168

Rather than store an Attr here, would it make sense to instead store a LikelihoodAttr directly? That may clean up some casts in the code.

clang/include/clang/Sema/Sema.h
3838

You should check all of the identifiers introduced by the patch to ensure they meet our coding style requirements (https://llvm.org/docs/CodingStandards.html). thenStmt -> ThenStmt, etc.

clang/lib/Sema/SemaStmt.cpp
545

The condition here is incorrect -- it's not checking what kind of attributed statement is present to see if it's a likelihood statement.

However, what is the value in diagnosing this construct? It seems not-entirely-unreasonable for a user to write, for instance:

if (foo) {
  [[likely]];
  ...
} else [[unlikely]];
567

This is fragile -- you should do something more like:

if (const auto *LA = dyn_cast<LikelihoodAttr>(ThenAttr)) {
  Hint = LA->isLikely() ? ... : ...;
}

You'll want to add an accessor onto the LikelihoodAttr object in Attr.td to implement the isLikely() (and isUnlikely()) functions based on the attribute spelling. There are examples of this in Attr.td.

clang/lib/Sema/SemaStmtAttr.cpp
66–73

This doesn't seem to account for switch statements, like this, does it?

switch (foo) {
[[likely]] case 1: break;
[[unlikely]] case 2: break;
[[likely]] case 3: break;
[[unlikely]] default: break;
}

Note, it's perfectly reasonable for there to be repeated attributes here because some cases may be more likely than others.

Tyker updated this revision to Diff 192811.EditedMar 29 2019, 5:25 AM
Tyker retitled this revision from [clang] Adding the Likely Attribute from C++2a to AST to [clang] Adding the Likelihood Attribute from C++2a.

@aaron.ballman fixed based on feedback.
added semantic support for switch statment.

the modification in clang/lib/Analysis/CFG.cpp
is to fixe a false diagnostic from the fallthrough attribute in code like

switch (...) {
[[likely]] case 1:
        ...
        [[fallthrough]];
default:
        ...
}
Tyker marked 24 inline comments as done.Mar 29 2019, 1:10 PM
Tyker updated this revision to Diff 192979.Mar 30 2019, 9:23 AM
riccibruno added a subscriber: NoQ.Mar 30 2019, 1:04 PM

It seems that the tests are not present in this diff ? Also, again, could you please:

  1. Use clang-format, and
  2. Make sure that the comments are full sentences with appropriate punctuation, and
  3. Follow the style guide regarding for the names of variables and functions.
clang/include/clang/AST/Stmt.h
68

Can you make this a scoped enumeration to avoid injecting these names everywhere ? (+ add a comment describing what it is used for)

clang/include/clang/Sema/Scope.h
168

Perhaps BranchAttr -> BranchLikelihoodAttr ?

clang/lib/AST/Stmt.cpp
832

A small remark: there is no need to initialize it here since this will be done during deserialization. Not initializing it here has the advantage that forgetting the initialization during deserialization will (hopefully) cause an msan failure. The above bits are special since they are needed to correctly access the trailing objects.

clang/lib/Analysis/CFG.cpp
2212

I don't understand why this is needed. Can you explain it ? Also I think that someone familiar with this code should comment on this (maybe @NoQ ?)

clang/lib/CodeGen/CGStmt.cpp
705

I believe that the lowering is incorrect. I applied your patch and here (

) is the IR that clang generates (obtained with -O1 -S -emit-llvm -Xclang -disable-llvm-passes -g0) for this code:

bool f(bool i);
bool g(bool i);

bool h1(bool i) {
  if (i) [[likely]]
    return f(i);
  return g(i);
}

bool h2(bool i) {
  if (__builtin_expect(i, true))
    return f(i);
  return g(i);
}

In particular for the branch in h1 we have:

%tobool = trunc i8 %0 to i1
%expval = call i1 @llvm.expect.i1(i1 %tobool, i1 true)
br i1 %tobool, label %if.then, label %if.end

Note that %expval is not used. Compare this to the branch in h2:

%tobool = trunc i8 %0 to i1
%conv = zext i1 %tobool to i64
%expval = call i64 @llvm.expect.i64(i64 %conv, i64 1)
%tobool1 = icmp ne i64 %expval, 0
br i1 %tobool1, label %if.then, label %if.end

where the extra conversions are because of the signature of __builtin_expect.

clang/lib/Parse/ParseStmt.cpp
1262

Perhaps ThenAttr -> ThenLikelihoodAttr ?

1304

Perhaps ElseAttr -> ElseLikelihoodAttr ?

clang/lib/Sema/SemaStmt.cpp
543–546

I would appreciate some documentation above HandleIfStmtHint.

548

Do you have to use getSpelling() here ? Why not use isLikely() and isUnlikely() as below ?

clang/lib/Sema/SemaStmtAttr.cpp
63

Type not obvious -> no auto please.

Tyker marked 2 inline comments as done.Mar 31 2019, 6:27 AM
Tyker added inline comments.
clang/lib/Analysis/CFG.cpp
2212

the detail of why are complicated and i don't have them all in head but without this edit in cases like

switch (...) {
[[likely]] case 1:
        ...
        [[fallthrough]];
default:
        ...
}

the fallthrough attribute emitted a diagnostic because is wasn't handling attributed case statement. the edit i performed is probably not the optimal way to solve the issue as it only solves the issue for likelihood attribute. but i don't know any other attribute that can be applied on a case statement but if they were others they would probably have the same issue. but the code is quite hard to follow and i didn't wanted to break anything. so this is what i came up with.
i am going to look into it to find a better solution.

clang/lib/CodeGen/CGStmt.cpp
705

from reading the documentation it seemed to me that both were equivalent. but after further checking there aren't.

riccibruno added inline comments.Mar 31 2019, 6:59 AM
clang/lib/CodeGen/CGStmt.cpp
705

Looking at how the intrinsic is lowered in the LowerExpectIntrinsic pass (in handleBrSelExpect), it seems that for a branch or select the following patterns are supported:

// Handle non-optimized IR code like:
//   %expval = call i64 @llvm.expect.i64(i64 %conv1, i64 1)
//   %tobool = icmp ne i64 %expval, 0
//   br i1 %tobool, label %if.then, label %if.end
//
// Or the following simpler case:
//   %expval = call i1 @llvm.expect.i1(i1 %cmp, i1 1)
//   br i1 %expval, label %if.then, label %if.end
Tyker updated this revision to Diff 193015.Mar 31 2019, 8:20 AM

@riccibruno i fixed based on feedback everything except the CFG edit as i still need to analyse the situation.

added AST and CodeGen for For, While, Do and CXXFor.
added tests for Semantic, AST, PCH

Tyker marked an inline comment as done.Mar 31 2019, 8:25 AM
Tyker added inline comments.
clang/test/SemaCXX/cxx2a-likelihood-attr.cpp
71

just saw the //clang format off at the bottom ill remove it

Tyker marked an inline comment as done.Mar 31 2019, 8:41 AM
Tyker added inline comments.
clang/lib/Sema/SemaStmt.cpp
528

clang-format passed on the whole file i am fixing it

Tyker updated this revision to Diff 193016.Mar 31 2019, 8:45 AM
Tyker marked 11 inline comments as done and an inline comment as not done.Mar 31 2019, 8:49 AM
NoQ added inline comments.Mar 31 2019, 2:29 PM
clang/lib/Analysis/CFG.cpp
2212

The [[likely]] attribute should not affect the overall topology of the CFG. It might be a nice piece of metadata to add to a CFG edge or to a CFG terminator, but for most consumers of the CFG (various static analyses such as analysis-based warnings or the Static Analyzer) the attribute should have little to no effect - the tool would still need to explore both branches. I don't know how exactly the fallthrough warning operates, but i find it likely (no pun intended) that the fallthrough warning itself should be updated, not the CFG.

It is probable that for compiler warnings it'll only cause false negatives, which is not as bad as false positives, but i wouldn't rely on that. Additionally, false negatives in such rare scenarios will be very hard to notice later. So i'm highly in favor of aiming for the correct solution in this patch.

Tyker marked an inline comment as done.Apr 1 2019, 10:02 AM
Tyker added inline comments.
clang/lib/Analysis/CFG.cpp
2212

i think we all agree that the CFG structure shouldn't be affected by the presence or absence of the likely attribute. but in the current state(no changes to the CFG) it does for example.

the CFG were obtained without the edit in CFG.cpp but with the added likely attribute
using: clang -cc1 -analyze test.cpp -analyzer-checker=debug.DumpCFG

input:

int f(int i) {
        switch (i) {
        [[likely]] case 1:
                return 1;
        }
        return i;
}

outputs:

 [B5 (ENTRY)]
   Succs (1): B2

 [B1]
   1: i
   2: [B1.1] (ImplicitCastExpr, LValueToRValue, int)
   3: return [B1.2];
   Preds (2): B3 B2
   Succs (1): B0

 [B2]
   1: i
   2: [B2.1] (ImplicitCastExpr, LValueToRValue, int)
   T: switch [B2.2]
   Preds (1): B5
   Succs (2): B4 B1

 [B3]
   1:  [[likely]]case 1:
[B4.2]   Succs (1): B1

 [B4]
  case 1:
   1: 1
   2: return [B4.1];
   Preds (1): B2
   Succs (1): B0

 [B0 (EXIT)]
   Preds (2): B1 B4

and
input:

int f(int i) {
        switch (i) {
         case 1:
                return 1;
        }
        return i;
}

outputs:

[B4 (ENTRY)]
  Succs (1): B2

[B1]
  1: i
  2: [B1.1] (ImplicitCastExpr, LValueToRValue, int)
  3: return [B1.2];
  Preds (1): B2
  Succs (1): B0

[B2]
  1: i
  2: [B2.1] (ImplicitCastExpr, LValueToRValue, int)
  T: switch [B2.2]
  Preds (1): B4
  Succs (2): B3 B1

[B3]
 case 1:
  1: 1
  2: return [B3.1];
  Preds (1): B2
  Succs (1): B0

[B0 (EXIT)]
  Preds (2): B1 B3

i think think this is the underlying issue. the false diagnostic from fallthrough previously mentioned is a consequence of this

NoQ added inline comments.Apr 1 2019, 6:20 PM
clang/lib/Analysis/CFG.cpp
2212

Hmm, i see. I got it all wrong. Please forgive me!

You're just trying to make CFGBuilder support AttributedStmt correctly in general. And the logic you're writing says "support it as any other generic Stmt that doesn't have any control flow in it, unless it's a [[likely]]-attributed AttributedStmt, in which case jump straight to children".

Could you instead do this by implementing CFGBuilder::VisitAttributedStmt with that logic?

I'm also not sure this logic is actually specific to [[likely]]. Maybe we should unwrap all AttributedStmts similarly?

Tyker marked an inline comment as done.Apr 2 2019, 3:50 AM
Tyker added inline comments.
clang/lib/Analysis/CFG.cpp
2212

we shouldn't handle all AttributedStmt this way. for example fallthrough needs to be present in the CFG because the analysis based warning depends on it. but we probably can handle every AttributedStmt on a case statement this way or even every AttributedStmt that aren't applied on a NullStmt. but perhaps this should be in an other patch, this one is already quite big and this issue is more general the likely attribute.

Tyker updated this revision to Diff 193269.EditedApr 2 2019, 6:13 AM

fixed the CFG issue a proper way

Tyker updated this revision to Diff 193284.Apr 2 2019, 7:23 AM
NoQ added inline comments.Apr 2 2019, 5:32 PM
clang/lib/Analysis/CFG.cpp
2212

Thanks, this makes perfect sense to me!

Tyker marked 7 inline comments as done.Apr 3 2019, 12:26 PM
Tyker abandoned this revision.Tue, May 21, 8:44 AM