Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
There was a request in the linked bug for some code archaeology to see why this behavior exists in the first place. What were the results of that? I'm not opposed to the patch, but I would like to understand why it behaves the way it does.
I could imagine "confusing user intent" being a valid reason why someone might want this warning, so we may want to default-off this diagnostic (because the code is safe) but still provide users with a way to enable it.
clang/test/Sema/format-strings-enum-fixed-type.cpp | ||
---|---|---|
82–83 | This comment is now incorrect. |
Since printf is a variadic function, integral argument types are promoted to int. The warning code runs the matchesType check twice, once to check if the promoted type (int) is able to be printed with the format and once to check if the original type (char) is able to be printed with the format.
printf("%d", [char]) is caught by the first case.
printf("%hhd", [char]) is caught by the second case.
printf("%hd", [char]) still produces a warning because no exception has been made for that case.
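To make the two passes concrete, here is a minimal sketch of the logic as described above. It is a toy model, not Clang's code: `Ty`, `Spec`, `promote`, `warns`, and this free-standing `matchesType` are hypothetical names, and the real `matchesType` has more outcomes than match or no match.

```cpp
#include <cassert>

// Hypothetical stand-ins for the argument types involved; not Clang's AST types.
enum class Ty { Char, Short, Int };

// The three conversion specifications discussed in this thread.
enum class Spec { hhd, hd, d };

// Toy version of a single matchesType call: exact agreement only.
bool matchesType(Spec S, Ty T) {
  switch (S) {
  case Spec::hhd: return T == Ty::Char;
  case Spec::hd:  return T == Ty::Short;
  case Spec::d:   return T == Ty::Int;
  }
  return false;
}

// Default argument promotion for a variadic call: char and short become int.
Ty promote(Ty T) {
  return (T == Ty::Char || T == Ty::Short) ? Ty::Int : T;
}

// Returns true when the pre-patch logic described above would warn.
bool warns(Spec S, Ty Original) {
  if (matchesType(S, promote(Original)))
    return false; // first pass: the promoted type fits the format
  if (matchesType(S, Original))
    return false; // second pass: the original type fits the format
  return true;    // neither pass matched, e.g. "%hd" with a char
}
```

In this model `warns(Spec::hd, Ty::Char)` is the only one of the three cases that returns true, which matches the behavior listed above.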
That explains what the implementation does, but does not attempt to answer the question *why* things are the way they are.
I read https://bugs.llvm.org/show_bug.cgi?id=41467#c4 as
- any narrowing is always diagnosed
- promotion to wider than int is diagnosed
- passthrough is not diagnosed
- promotion to something smaller than int is diagnosed (the current case)
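One way to picture these four cases is a small classifier over integer widths. This is only my reading of the list, not anything from the bug or from Clang: the `Rank` values, the `Case` names, and the choice to treat "%d with a char" as passthrough are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical width ranks; only the ordering matters.
enum Rank { CharRank = 1, ShortRank = 2, IntRank = 3, LongRank = 4 };

enum class Case {
  Narrowing,             // format expects something narrower than the argument
  PromotionWiderThanInt, // format expects something wider than int
  Passthrough,           // format matches the argument or its promotion to int
  PromotionBelowInt      // format is wider than the argument but narrower than int
};

// Classifies a (format width, argument width) pair into one of the four cases.
Case classify(Rank Spec, Rank Arg) {
  if (Spec < Arg)
    return Case::Narrowing;
  if (Spec == Arg || Spec == IntRank)
    return Case::Passthrough; // assumption: "%d" with a char counts as passthrough
  if (Spec > IntRank)
    return Case::PromotionWiderThanInt;
  return Case::PromotionBelowInt; // e.g. "%hd" with a char, the case under review
}

// Per the list above, everything except passthrough is diagnosed before the patch.
bool diagnosedBeforePatch(Case C) { return C != Case::Passthrough; }
```

With this model, `classify(ShortRank, CharRank)` yields `Case::PromotionBelowInt`, the only category whose diagnosis is being questioned here.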
I can interpret it as: we already know that
therefore why are you first implicitly promoting to int and then implicitly truncating?
Did you mean to print the original value? Did you mean to print int?
That doesn't sound too outlandish to me.
This all makes sense as to how things work today, but I was more wondering why they worked that way in the first place. I'm especially interested to know whether this is diagnosed because it shows confusion of the user's intent, because that seems like a valuable behavior to retain (though perhaps it doesn't need to be default-on).
As far as I can tell, this case was just overlooked. The original commit adding this change (https://reviews.llvm.org/rG0208793e41018ac168412a3da8b2fba70aba9716) only allows chars to int and chars to chars. Another commit (https://reviews.llvm.org/rG74e82bd190017d59d5d78b07dedca5b06b4547da) ignores typing of chars. I did not see anything related to this particular case in previous commits.
Hmm, it looks like, at least from this review, someone thought the behavior was for demonstrating user intent: http://llvm.org/viewvc/llvm-project?view=revision&revision=157961.
I've convinced myself that -Wformat should disable that diagnostic by default, but there is utility in keeping it exposed through a different format warning flag. It seems like -Wformat-pedantic should still diagnose this case.
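For reference, a small sketch of why the code in question is safe at run time. The helper name `formatCharAsShort` is hypothetical. Under the split proposed above, the expectation would be that a plain -Wformat build accepts this call and a -Wformat-pedantic build still flags it, but that is the proposal, not something this snippet demonstrates.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a char through "%hd". The variadic call promotes the char to int,
// and "%hd" tells printf to convert that int to short before printing, so a
// small non-negative value comes out unchanged.
std::string formatCharAsShort(char C) {
  char Buf[16];
  int Written = std::snprintf(Buf, sizeof Buf, "%hd", C);
  assert(Written > 0 && Written < static_cast<int>(sizeof Buf));
  return std::string(Buf);
}
```

Calling `formatCharAsShort(65)` yields the string "65": the value round-trips through the promotion and the conversion to short without loss, which is the sense in which the mismatch is harmless.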
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8080–8083 | Match isn't used outside of this block later on, so I don't think you need *this* change. | |
8100–8108 | Just add a new variable // All further checking is done on the subexpression analyze_printf::ArgType::MatchKind Match2 = AT.matchesType(S.Context, ExprTy); if (Match2 == analyze_printf::ArgType::Match) return true; Pedantic |= Match2 == analyze_printf::ArgType::NoMatchPedantic; |
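Spelled out as a standalone sketch, the suggestion above might look like the following. This is a simplified model and not the code in SemaChecking.cpp: `checkArgument` and its parameters are hypothetical, the two `matchesType` results are passed in as plain values, and the three-value `MatchKind` enum is assumed from the description later in this thread.

```cpp
#include <cassert>

// Assumed shape of the enum; see clang/include/clang/AST/FormatString.h for the real one.
struct ArgType {
  enum MatchKind { NoMatch = 0, Match = 1, NoMatchPedantic };
};

// Returns true when no diagnostic is needed. Otherwise returns false and
// records in Pedantic whether the mismatch is a pedantic-only one.
bool checkArgument(ArgType::MatchKind PromotedResult,
                   ArgType::MatchKind OriginalResult, bool &Pedantic) {
  // The top-level result stays const and is never overwritten.
  const ArgType::MatchKind Match = PromotedResult;
  Pedantic = Match == ArgType::NoMatchPedantic;
  if (Match == ArgType::Match)
    return true;

  // All further checking is done on the subexpression (the unpromoted type),
  // using a separate variable instead of reassigning Match.
  const ArgType::MatchKind Match2 = OriginalResult;
  if (Match2 == ArgType::Match)
    return true;
  Pedantic |= Match2 == ArgType::NoMatchPedantic;
  return false;
}
```

Keeping `Match` const means a later reader can reuse it without wondering whether the second check changed it.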
clang/test/Sema/format-strings-enum-fixed-type.cpp | ||
---|---|---|
82–83 | Not quite what I had in mind. I would remove the // no-warning comments that were added and instead change the comment on line 82 to say "This is not correct, but it is safe. Only warned in pedantic mode because '%hd' shows intent." or something along those lines. | |
clang/test/Sema/format-strings.c | ||
---|---|---|
280–282 | I'd drop the no-warning comments here, or say "warning with -Wformat-pedantic only" if you think it adds value. |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8101 | Maybe leave the top level Match const and just create a new one? It may be surprising if someone goes to reuse Match below not noticing that it may be modified here. |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8100–8108 | Early return would simplify this still |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8105 | I don't think this needs to use |= true. If Pedantic was true, this is a noop. If it was false, this sets it to true. Either way the value is true, so I think it should just be Pedantic = true; The logic gets easier if you write this as: if (ImplicitMatch && ImplicitMatch != analyze_printf::ArgType::NoMatchPedantic) return true; Pedantic |= ImplicitMatch; |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8100–8108 | Could be ArgType::NoMatch and wouldn't display a warning |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8100–8108 | Wait, this should definitely be an else-if |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8100–8108 | Never mind, you are correct. I will remove the else case. |
Can you please explain why the snippet I posted inline does not work for you?
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8103–8105 | I do not understand this. |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8106 | At this point ImplicitMatch can only have the value analyze_printf::ArgType::NoMatchPedantic as ArgType::MatchKind is an enum with only 3 values (see clang/include/clang/AST/FormatString.h). Rather than have conditional/ternaries for each case, I think 2 conditionals are maybe simpler: if (ImplicitMatch == analyze_printf::ArgType::Match) return true; else if (ImplicitMatch == analyze_printf::ArgType::NoMatchPedantic) Pedantic = true; The |= is kind of more complicated than simple assignment. Then you don't need a block for if (ImplicitMatch). |
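As a sanity check, the compact form suggested earlier in the review and the two-conditional form suggested here can be compared over every input. The sketch below assumes the enum has exactly the three values mentioned above with `NoMatch` equal to zero, which the compact form relies on. The function names are hypothetical.

```cpp
#include <cassert>
#include <initializer_list>

// Assumed shape of the enum; the compact form below depends on NoMatch being 0.
struct ArgType {
  enum MatchKind { NoMatch = 0, Match = 1, NoMatchPedantic };
};

// Compact form: relies on enum truthiness and on |= with an enum operand.
bool compactForm(ArgType::MatchKind ImplicitMatch, bool &Pedantic) {
  if (ImplicitMatch && ImplicitMatch != ArgType::NoMatchPedantic)
    return true;
  Pedantic |= ImplicitMatch;
  return false;
}

// Explicit form: one comparison per enum value that matters.
bool explicitForm(ArgType::MatchKind ImplicitMatch, bool &Pedantic) {
  if (ImplicitMatch == ArgType::Match)
    return true;
  else if (ImplicitMatch == ArgType::NoMatchPedantic)
    Pedantic = true;
  return false;
}

// Tries every enum value with both starting states of Pedantic.
bool formsAgree() {
  for (ArgType::MatchKind K :
       {ArgType::NoMatch, ArgType::Match, ArgType::NoMatchPedantic})
    for (bool Start : {false, true}) {
      bool A = Start, B = Start;
      if (compactForm(K, A) != explicitForm(K, B) || A != B)
        return false;
    }
  return true;
}
```

Under those assumptions the two forms behave identically, so the choice between them is about readability. The explicit form also keeps working if a fourth enumerator is added later, whereas the compact form would silently treat any new nonzero value as a match.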
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
8101 | if (ImplicitMatch == analyze_printf::ArgType::NoMatchPedantic) should probably have stayed? |