This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Don't warn on printf('%hd', [char]) (PR41467)
ClosedPublic

Authored by Nathan-Huckleberry on Aug 13 2019, 4:11 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2019, 4:11 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
  • Add comment in test
nickdesaulniers accepted this revision.Aug 14 2019, 10:30 AM

LGTM thanks for the patch!

This revision is now accepted and ready to land.Aug 14 2019, 10:30 AM
lebedev.ri retitled this revision from Fix warning on printf('%hd', [char]) to [Sema] Don't warn on printf('%hd', [char]) (PR41467).Aug 14 2019, 10:54 AM
lebedev.ri added a reviewer: aaron.ballman.

There was a request in the linked bug for some code archaeology to see why this behavior exists in the first place. What were the results of that? I'm not opposed to the patch, but I would like to understand why it behaves the way it does.

I could imagine "confusing user intent" being a valid reason why someone might want this warning, so we may want to default-off this diagnostic (because the code is safe) but still provide users with a way to enable it.

clang/test/Sema/format-strings-enum-fixed-type.cpp
82 ↗(On Diff #214966)

This comment is now incorrect.

There was a request in the linked bug for some code archaeology to see why this behavior exists in the first place. What were the results of that? I'm not opposed to the patch, but I would like to understand why it behaves the way it does.

Since printf is a variadic function, integral argument types are promoted to int. The warning code runs the matchesType check twice, once to check if the promoted type (int) is able to be printed with the format and once to check if the original type (char) is able to be printed with the format.

printf("%d", [char]) is caught by the first case
printf("%hhd", [char]) is caught by the second case.

printf("%hd", [char]) is a warning because an exception has not been made for that case.

There was a request in the linked bug for some code archaeology to see why this behavior exists in the first place. What were the results of that? I'm not opposed to the patch, but I would like to understand why it behaves the way it does.

Since printf is a variadic function, integral argument types are promoted to int. The warning code runs the matchesType check twice, once to check if the promoted type (int) is able to be printed with the format and once to check if the original type (char) is able to be printed with the format.

printf("%d", [char]) is caught by the first case
printf("%hhd", [char]) is caught by the second case.

printf("%hd", [char]) is a warning because an exception has not been made for that case.

That explains what the implementation does, but does not attempt to answer the question *why* things are the way they are.

I read https://bugs.llvm.org/show_bug.cgi?id=41467#c4 as

  • any narrowing is always diagnosed
  • promotion to wider than int is diagnosed
  • passthrough is not diagnosed
  • promotion to something smaller than int is diagnosed (the current case)

I can interpret it as: we already know that

Since printf is a variadic function, integral argument types are promoted to int.

therefore why are you first implicitly promoting to int and then implicitly truncating?
Did you mean to print the original value? Did you mean to print int?

That doesn't sound too outlandish to me.

There was a request in the linked bug for some code archaeology to see why this behavior exists in the first place. What were the results of that? I'm not opposed to the patch, but I would like to understand why it behaves the way it does.

Since printf is a variadic function, integral argument types are promoted to int. The warning code runs the matchesType check twice, once to check if the promoted type (int) is able to be printed with the format and once to check if the original type (char) is able to be printed with the format.

printf("%d", [char]) is caught by the first case
printf("%hhd", [char]) is caught by the second case.

printf("%hd", [char]) is a warning because an exception has not been made for that case.

This all makes sense as to how things work today, but I was more wondering why they worked that way in the first place. I'm especially interested to know whether this is diagnosed because it shows confusion of the user's intent, because that seems like a valuable behavior to retain (though perhaps it doesn't need to be default-on).

lebedev.ri requested changes to this revision.Aug 15 2019, 12:09 PM

(just want to mark it as "unanswered questions")

This revision now requires changes to proceed.Aug 15 2019, 12:09 PM

As far as I can tell this case was just overlooked. The original commit adding this change https://reviews.llvm.org/rG0208793e41018ac168412a3da8b2fba70aba9716 only allows chars to int and chars to chars. Another commit ignores typing of chars https://reviews.llvm.org/rG74e82bd190017d59d5d78b07dedca5b06b4547da. I did not see anything related to this particular case in previous commits.

aaron.ballman requested changes to this revision.Aug 16 2019, 6:53 AM

As far as I can tell this case was just overlooked. The original commit adding this change https://reviews.llvm.org/rG0208793e41018ac168412a3da8b2fba70aba9716 only allows chars to int and chars to chars. Another commit ignores typing of chars https://reviews.llvm.org/rG74e82bd190017d59d5d78b07dedca5b06b4547da. I did not see anything related to this particular case in previous commits.

Hmm, it looks like, at least from this review, someone thought the behavior was for demonstrating user intent: http://llvm.org/viewvc/llvm-project?view=revision&revision=157961.

I've convinced myself that -Wformat should disable that diagnostic by default, but there is utility in keeping it exposed through a different format warning flag. It seems like -Wformat-pedantic should still diagnose this case.

  • Warn when -Wformat-pedantic is set
lebedev.ri added inline comments.Aug 22 2019, 1:57 PM
clang/lib/Sema/SemaChecking.cpp
8080–8083 ↗(On Diff #216702)

Match isn't used outside of this block later on, so i don't think you need *this* change.

8100–8107 ↗(On Diff #216702)

Just add a new variable

// All further checking is done on the subexpression
analyze_printf::ArgType::MatchKind Match2 = AT.matchesType(S.Context, ExprTy);
if (Match2 == analyze_printf::ArgType::Match)
  return true;
Pedantic |= Match2 == analyze_printf::ArgType::NoMatchPedantic;
aaron.ballman added inline comments.Aug 22 2019, 1:59 PM
clang/test/Sema/format-strings-enum-fixed-type.cpp
82 ↗(On Diff #214966)

Not quite what I had in mind. I would remove the // no-warning comments that were added and instead change the comment on line 82 to say This is not correct, but it is safe. Only warned in pedantic mode because '%hd' shows intent. or something along those lines.

clang/test/Sema/format-strings.c
280–281 ↗(On Diff #216702)

I'd drop the no-warning comments here, or say warning with -Wformat-pedantic only if you think it adds value.

clang/lib/Sema/SemaChecking.cpp
8101 ↗(On Diff #216702)

Maybe leave the top level Match const and just create a new one? It may be surprising if someone goes to reuse Match below not noticing that it may be modified here.

  • Add variable for implicit match and fix comments
lebedev.ri added inline comments.Aug 22 2019, 2:14 PM
clang/lib/Sema/SemaChecking.cpp
8100–8107 ↗(On Diff #216702)

Early return would simplify this still

aaron.ballman added inline comments.Aug 22 2019, 2:24 PM
clang/lib/Sema/SemaChecking.cpp
8106 ↗(On Diff #216713)

I don't think this needs to use |= true. If Pedantic was true, this is a noop. If it was false, this sets it to true. Either way the value is true, so I think it should just be Pedantic = true; The logic gets easier if you write this as:

if (ImplicitMatch && ImplicitMatch != analyze_printf::ArgType::NoMatchPedantic)
  return true;
Pedantic |= ImplicitMatch;
Nathan-Huckleberry marked an inline comment as done.Aug 22 2019, 3:52 PM
Nathan-Huckleberry added inline comments.
clang/lib/Sema/SemaChecking.cpp
8100–8107 ↗(On Diff #216702)

Could be ArgType::NoMatch and wouldn't display a warning

Nathan-Huckleberry marked an inline comment as done.Aug 22 2019, 3:54 PM
Nathan-Huckleberry added inline comments.
clang/lib/Sema/SemaChecking.cpp
8100–8107 ↗(On Diff #216702)

Wait this should definitely be an else-if

Nathan-Huckleberry marked an inline comment as done.Aug 22 2019, 3:55 PM
Nathan-Huckleberry added inline comments.
clang/lib/Sema/SemaChecking.cpp
8100–8107 ↗(On Diff #216702)

Nevermind you are correct. Will remove else case.

  • Remove else and |=
lebedev.ri requested changes to this revision.Aug 22 2019, 4:03 PM

Please can you explain why the snippet i posted in-line does not work for you?

clang/lib/Sema/SemaChecking.cpp
8105–8107 ↗(On Diff #216729)

I do not understand this.
Even if you set Pedantic = true, you will immediately return true without ever using that Pedantic.
This seems broken?

This revision now requires changes to proceed.Aug 22 2019, 4:03 PM
  • Fix broken logic from previous revision
clang/lib/Sema/SemaChecking.cpp
8107 ↗(On Diff #216730)

At this point ImplicitMatch can only have the value analyze_printf::ArgType::NoMatchPedantic as ArgType::MatchKind is an enum with only 3 values (see clang/include/clang/AST/FormatString.h).

Rather than have conditional/ternaries for each case, I think 2 conditionals are maybe simpler:

if (ImplicitMatch == analyze_printf::ArgType::Match)
  return true;
else if (ImplicitMatch == analyze_printf::ArgType::NoMatchPedantic)
  Pedantic = true;

The |= is kind of more complicated than simple assignment.

Then you don't need a block for if (ImplicitMatch).

nickdesaulniers requested changes to this revision.Aug 22 2019, 4:28 PM
This revision now requires changes to proceed.Aug 22 2019, 4:28 PM
  • Simplify logic for readability
nickdesaulniers accepted this revision.Aug 22 2019, 4:56 PM

Thanks for being responsive to all this code review! 💃🏽

Looks about right

clang/lib/Sema/SemaChecking.cpp
8105–8106 ↗(On Diff #216739)

No else after return.

aaron.ballman accepted this revision.Aug 23 2019, 5:17 AM

LGTM aside from the else after return nit.

lebedev.ri accepted this revision.Aug 23 2019, 5:35 AM

LGTM aside from the else after return nit.

This revision is now accepted and ready to land.Aug 23 2019, 5:35 AM
lebedev.ri requested changes to this revision.Aug 23 2019, 10:13 AM
lebedev.ri added inline comments.
clang/lib/Sema/SemaChecking.cpp
8106 ↗(On Diff #216884)

if (ImplicitMatch == analyze_printf::ArgType::NoMatchPedantic) should have stayed probably?

This revision now requires changes to proceed.Aug 23 2019, 10:13 AM
  • Add if without else
lebedev.ri accepted this revision.Aug 23 2019, 10:23 AM

LG, thank you.

This revision is now accepted and ready to land.Aug 23 2019, 10:23 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2019, 11:04 AM