Page MenuHomePhabricator

Make -Wstring-plus-int warns even if when the result is not out of bounds
ClosedPublic

Authored by ArnaudBienner on Dec 6 2018, 12:38 PM.

Diff Detail

Repository
rC Clang

Event Timeline

ArnaudBienner created this revision.Dec 6 2018, 12:38 PM

I found this warning to be really useful but was confused that it is not always triggered.
I initially discovered -Wstring-plus-char (and fixed some issues in the projects I work on where sometimes developers didn't realized the "+" wasn't doing to do concatenation because it was applied on char and char*).
Then I also activated -Werror=string-plus-int to reduce the risk of developers doing mistakes in our codebase.

Turns out that because we had some buggy code not catch by this warning for this reason: the result of the concatenation was not out of bounds, so the warning wasn't triggered.

I understand that point of view, but IMHO this warning should be activated even though, like for -Wstring-plus-char: it's easy to fix the warning by having a nicer, more readable code, and IMO code raising this warning is likely to indicate an issue in the code.
Having developers accidentally concatenating string and integer can happen (even more if when not concatenating literals directly).
But I've found a bug in our codebase even more tricky: when concatenating enum and literals char* strings:
We had a code like this (using Qt framework):

QString path = QStandardPaths::StandardLocation(QStandardPaths::DataLocation) + "/filename.ext";

The developer was trying to use QStandardPaths::standardLocations [1] static function (mind the lowercase s and the extra s at the end) which takes an enum of type QStandardPaths::StandardLocation. Similar name (so easy to do a typo) but very different meanings.
So instead of getting a string object and concatenating it with some literal strings, path was set to a truncated version of "/filename.ext".
This kind of bugs is hard to catch during code review process (and wasn't in my case).

Long story, but I think it's interesting to illustrate the need to have this warning applied to more cases.

[1]: http://doc.qt.io/qt-5/qstandardpaths.html#standardLocations

Nico: I've added you as reviewer since you originally implemented this warning (thanks for this BTW :) as said, it's really helpful), but feel free to delegate to someone else.

The usual process for adding / tweaking warnings is to build some large open-source project with the new / tweaked warning and note true and false positive rate. IIRC I did this on Chromium when I added the warning, and the out-of-bounds check prevented false positives without removing true positives. (...or maybe Richard asked for it during review? Did you try and find the review where this got added?)

That is, can you try building Chromium with this tweak and see what it does to true and false positive rate?

I found the commit: [1] which links to a related discussion [2] but there isn't much details. I wasn't sure if there was other discussions.
I will try to build Firefox with this change, and see how it goes (just need to find some time to do it).
I'll keep you posted.

[1]: https://github.com/llvm-mirror/clang/commit/1cb2d742eb6635aeab6132ee5f0b5781d39487d7
[2]: http://comments.gmane.org/gmane.comp.compilers.clang.scm/47203

So I built Firefox with my patched version of clang and the only place it found a warning was in ffmpeg [1], a case you already reported in [2] when you implemented that patch, so no new positive AFAICT.
Do you also want to try on Chromium and see how it goes? Or is Firefox enough? (it's a pretty big C++ codebase, so I would assume that's enough)

[1]: mozilla-central/media/ffvpx/libavutil/utils.c:73:42: warning: adding 'unsigned long' to a string does not append to the string [-Wstring-plus-int]
278:13.00 return LICENSE_PREFIX FFMPEG_LICENSE + sizeof(LICENSE_PREFIX) - 1;
[2]: http://comments.gmane.org/gmane.comp.compilers.clang.scm/47203

thakis accepted this revision.Dec 13 2018, 7:49 AM

Firefox is good enough.

Thanks for finding http://comments.gmane.org/gmane.comp.compilers.clang.scm/47203 -- this was from before we did reviews on phab, so I was able to find the review in my inbox. I can't find the full discussion online (we also moved list servers :-/ Fragments are at http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20120220.txt ). Here's the discussion between Richard and me about the behavior you're changing:

Richard: """

When building all of chromium and its many dependencies, the warning
did also find 3 false positives, but they all look alike: Ffmepg
contains three different functions that all look like this:

4 bugs and 3 false positives doesn't sound great to me. Have you considered
relaxing the warning in cases where the integral summand is a constant
expression and is in-bounds? Would this pattern have matched any of your
real bugs?
"""

Me: """

4 bugs and 3 false positives doesn't sound great to me.

True. All of the the 3 false positives are really the same snippet
though. (On the other hand, 3 of the 4 bugs are very similar as well.)
The bugs this did find are crash bugs, and they were in rarely run
error-logging code, so it does find interesting bugs. Maybe you guys
can run it on your codebase after it's in, and we can move it out of
-Wmost if you consider it too noisy in practice?

Have you considered
relaxing the warning in cases where the integral summand is a constant
expression and is in-bounds? Would this pattern have matched any of your
real bugs?

That's a good idea (for my bugs, the int was always a declrefexpr,
never a literal). The ffmpeg example roughly looks like this:

  #define A "foo"
  #define B "bar"
  consume(A B + sizeof(A) - 1);

The RHS is just "sizeof(A)" without the "- 1", but since A B has a
length of 6, this still makes this warning go away. I added this to
the patch. With this change, it's 4 bugs / 0 false positives.

Note that this suppresses the warning for most enums which start at 0.
Or did you mean to do this only for non-enum constant expressions?
"""

Richard: """

Note that this suppresses the warning for most enums which start at 0.
Or did you mean to do this only for non-enum constant expressions?

I could imagine code legitimately wanting to do something like this:

template</*...*/>
struct S {
  enum { Offset = /* ... */ };
  static const char *str = "some string" + Offset;
};

That said, I'm happy for us to warn on such code. Whatever you prefer seems fine to me; we can refine this later, if a need arises.
"""

Given that discussion, your results on Firefox, and that this is useful to you sound like it should be fine to land this. So let's do that and see what happens. If anyone out there has any feedback on this change, we're all ears :-)

This revision is now accepted and ready to land.Dec 13 2018, 7:49 AM
sylvestre.ledru edited the summary of this revision. (Show Details)Dec 13 2018, 8:06 AM
This revision was automatically updated to reflect the committed changes.

Thanks for the archeological work and finding the conversation related to the initial patch :)
Interesting that the last case you mentioned is exactly the bug I had in my project (though in your case this would have been intended).

Actually this has been failing for 8 hours. So reverted in r349117. Also reverted your attempt to update the test. It wasn't updating the right test: r349118

dyung added a subscriber: dyung.EditedDec 13 2018, 4:56 PM

Actually this has been failing for 8 hours. So reverted in r349117. Also reverted your attempt to update the test. It wasn't updating the right test: r349118

And I was just about to commit what I think is the work-around for this failure!

Like Adam pointed out, your fix in r349118 was not updating the correct test and caused an additional failure.

I was able to get the test passing by adding a flag to suppress the warning that you added here. I changed line 54 in test_diagnostics.py to be the following:

def test_diagnostic_range(self):
    tu = get_tu(source='void f() { int i = "a" + 1; }', flags=["-Wno-string-plus-int"])    # <-- This is line 54 that was changed.
    self.assertEqual(len(tu.diagnostics), 1)

If you apply that fix, the output should be as it originally was and the test passes.

Really sorry about breaking the build :( Thanks for reverting it.
Actually, I think we could fix that test by removing the " +1": AFAICT this test is checking that warnings are correctly emitted from clang, but not testing all the warnings. So the conversion from const char* to int is enough: no need to have an extra +1 at the end.
I will update my patch to update the test accordingly.

ArnaudBienner reopened this revision.Dec 18 2018, 1:46 AM

Reopening since this broke some tests.

This revision is now accepted and ready to land.Dec 18 2018, 1:46 AM

Update tests broken by this change

Fixed two tests broken by this change:

  • test_diagnostics.py: AFAICT we are not testing all warnings here, but just that warnings are emitted correctly. The "+1" didn't seem to be useful, since the warning triggered was about the const char* being converted to an int (and this warning still applies)
  • test/SemaCXX/constant-expression-cxx1y.cpp: is a false positive for -Wstring-plus-int so use the preferred, equivalent syntax

@thakis: do those additional changes look OK to you? Or do you want someone else to review those?

thakis accepted this revision.Dec 22 2018, 7:13 AM

Still looks fine to me.

Make -Wstring-plus-int warns even if when the result is not out of bounds

serge-sans-paille resigned from this revision.Jan 3 2019, 8:29 AM

Just wanted to review the change to the Python script to ensure it remains compatible with Python2 and Python3, which is indeed the case; so LGTM on that aspect.

This revision was automatically updated to reflect the committed changes.

OK, thanks Serge! :)

For the record, I updated the patch one last time after it was accepted to remove my change to constant-expression-cxx1y.cpp since someone else did the same change in a separate commit meanwhile.