This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] New checker for not null-terminated result caused by strlen(), size() or equal length
ClosedPublic

Authored by Charusso on Mar 29 2018, 8:46 AM.

Details

Summary

New checker called bugprone-not-null-terminated-result. This checker finds
function calls where it is possible to cause a not null-terminated result.
Usually the proper length of a string is strlen(src) + 1 or equal length
of this expression, because the null terminator needs an extra space.
Without the null terminator it can result in undefined behaviour when the
string is read.

The following and their respective wchar_t based functions are checked:

memcpy, memcpy_s, memchr, memmove, memmove_s, strerror_s,
strncmp, strxfrm

The following is a real-world example where the programmer forgot to
increase the passed third argument, which is size_t length.
That is why the length of the allocated memory is not enough to hold the
null terminator.

static char *stringCpy(const std::string &str) {
  char *result = reinterpret_cast<char *>(malloc(str.size()));
  memcpy(result, str.data(), str.size());
  return result;
}

In addition to issuing warnings, fix-it rewrites all the necessary code.
It also tries to adjust the capacity of the destination array:

static char *stringCpy(const std::string &str) {
  char *result = reinterpret_cast<char *>(malloc(str.size() + 1));
  strcpy(result, str.data());
  return result;
}

Note: It cannot guarantee to rewrite every of the path-sensitive memory
allocations.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Charusso updated this revision to Diff 143698.Apr 24 2018, 2:52 AM
Charusso marked an inline comment as done.

Changed CXX11 to C11 and made a little refactor.

Charusso marked an inline comment as done.Apr 24 2018, 2:52 AM
Charusso edited the summary of this revision. (Show Details)Apr 24 2018, 2:55 AM
aaron.ballman added inline comments.Apr 24 2018, 4:20 AM
clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
226 ↗(On Diff #140337)

I don't think you should assume the _s versions are available in C++, because they're not available everywhere. In fact, they're not available everywhere in C either (for instance, there is no Annex K implementation in glibc).

My recommendation would be to either gate this on the presence of a definition of the __STDC_LIB_EXT1__ macro and that the __STDC_WANT_LIB_EXT1__ macro is defined but not defined to 0, or to make it a configurable option. I have a slight preference for the config option because not all implementations implement all of Annex K but still have a reasonable _s implementation of the Annex K APIs (like Microsoft's implementation) so I can picture users wanting a way to opt-in even without the macros.

Charusso updated this revision to Diff 143825.EditedApr 24 2018, 3:50 PM
Charusso marked an inline comment as done.

Defined an option to switch off the _s suffixed functions usage.

Charusso edited the summary of this revision. (Show Details)Apr 24 2018, 3:57 PM

Shouldn't it catch in curl also this code?

urllen = strlen(url_clone);
....
memcpy(newest, url_clone, urllen);

Edit: if possible, report these bugs to project developers :)

Thanks for your idea @xbolva00, I will implement this feature, but currently I have problems with parens which cause ugly fix-its. After the review I will share the results with the devs.

Another idea if you want to implement it - check fopen.

FILE *f = fopen("file", "r"); read only
fputs("str", f);
we are writing -> boom, sigsegv or something like that.

Thanks for your sharing but I think I will move forward to Static Analyzer with my own projects.

xbolva00 resigned from this revision.May 16 2018, 5:56 AM
Charusso updated this revision to Diff 149116.May 30 2018, 8:19 AM

@xbolva00 idea implemented, doubled the checker's power. Now if the given argument is a DeclRefExpr this should handle it as it would be inlined as a simple CallExpr.

Results:
I have found 37 memcpy() and 2 memchr() errors in: curl, FFmpeg, git, memcached, OpenSSL, PostGreSQL, Redis, twin.

Example of this new DRE case:

Found memchr() errors:


All findings:

Thanks! Happy to see it is more powerful now :)

memcpy(crypt_buf, passwd, passwd_len); <--- warning
memcpy(crypt_buf + passwd_len, salt, salt_len);

This is a false warning since it appends strings using memcpy. But no idea what to do and if it is possible to avoid these false warnings.

Charusso updated this revision to Diff 149129.May 30 2018, 8:41 AM

Clang format.

memcpy(crypt_buf, passwd, passwd_len); <--- warning
memcpy(crypt_buf + passwd_len, salt, salt_len);

This is a false warning since it appends strings using memcpy. But no idea what to do and if it is possible to avoid these false warnings.

I have just tested it because of the malloc() function. I'm using CodeChecker and leaved the default settings, so IsSafeFunctionsAreAvailable = 1. Because of the malloc strncpy_s() cannot handle this case, but if the check would ran with IsSafeFunctionsAreAvailable = 0, it rewrites it to strncpy(crypt_buf, passwd, passwd_len + 1) which is a good transformation, as the official memcpy()'s result not null-terminated.

memcpy(crypt_buf, passwd, passwd_len); <--- warning
memcpy(crypt_buf + passwd_len, salt, salt_len);

This is a false warning since it appends strings using memcpy. But no idea what to do and if it is possible to avoid these false warnings.

I have just tested it because of the malloc() function. I'm using CodeChecker and leaved the default settings, so IsSafeFunctionsAreAvailable = 1. Because of the malloc strncpy_s() cannot handle this case, but if the check would ran with IsSafeFunctionsAreAvailable = 0, it rewrites it to strncpy(crypt_buf, passwd, passwd_len + 1) which is a good transformation, as the official memcpy()'s result not null-terminated.

Yeah, it is a valid recommendation.

I would like to see more negative tests.
What does it do if the len/size is a constant?
Variable that wasn't just assigned with strlen() return?

I would like to see more negative tests.
What does it do if the len/size is a constant?
Variable that wasn't just assigned with strlen() return?

Thanks for the comment! What do you mean by negative test?

I would like to see more negative tests.
What does it do if the len/size is a constant?
Variable that wasn't just assigned with strlen() return?

Thanks for the comment! What do you mean by negative test?

The tests where the check matches and issues a warning is a positive test.
The tests where the check does not match and/or does not issue a warning is a negative test.

I would like to see more negative tests.
What does it do if the len/size is a constant?
Variable that wasn't just assigned with strlen() return?

Thanks for the comment! What do you mean by negative test?

To prove we have no false warnings.

whisperity requested changes to this revision.Jun 3 2018, 7:43 AM

In general, make sure the documentation page renders well in a browser.

Mostly style and phrasing stuff inline:

clang-tidy/bugprone/NotNullTerminatedResultCheck.h
34–36 ↗(On Diff #149129)

More of a language or phrasing thing, but the singular/plural wording is anything but matched in this case: handler functions which are safer. What is a "target version" in this case? Shouldn't it be something like "target environment" or "target standard" or just simply "target"?

The variable name is also problematic. AreSafeFunctionsAvailable would be better.

docs/clang-tidy/checks/bugprone-not-null-terminated-result.rst
20–21 ↗(On Diff #149129)

badly? Also, perhaps a half sentence of explanation would be nice here:

need an extra space, thus the result is not null-terminated, which can result in undefined behaviour when the string is read.

39 ↗(On Diff #149129)

That existed.

105 ↗(On Diff #149129)

Why is this an integer, rather than a bool?

This revision now requires changes to proceed.Jun 3 2018, 7:43 AM
Charusso updated this revision to Diff 150643.Jun 9 2018, 4:36 PM
Charusso marked 4 inline comments as done.

memchr() revision: it is problematic if the second argument is '\0', there is no other case. Added a new type of matcher, to match function calls. More static functions and test cases and huge refactor.

Charusso added a comment.EditedJun 9 2018, 4:46 PM

@lebedev.ri there is all the false positive results from the last publicated result-dump:

  1. second result:
  2. all result:
  3. may false positive:

(Note: the two memchr() result were false positive in that post and there is no new result with the new matcher.)

Does the new test cases cover your advice?

docs/clang-tidy/checks/bugprone-not-null-terminated-result.rst
105 ↗(On Diff #149129)

This is how the other Tidy checkers are use their options, I do not know either.

Charusso marked an inline comment as done.Jun 9 2018, 4:46 PM
Charusso updated this revision to Diff 150644.Jun 9 2018, 4:49 PM

Clang format.

Charusso updated this revision to Diff 150645.Jun 9 2018, 5:10 PM

Fix some comment.

I would like to see more negative tests.
What does it do if the len/size is a constant?
Variable that wasn't just assigned with strlen() return?

Thanks for the comment! What do you mean by negative test?

The tests where the check matches and issues a warning is a positive test.
The tests where the check does not match and/or does not issue a warning is a negative test.

@lebedev.ri there is all the false positive results from the last publicated result-dump:

  1. second result:
  2. all result:
  3. may false positive:

(Note: the two memchr() result were false positive in that post and there is no new result with the new matcher.)

Does the new test cases cover your advice?

Not exactly.
I want to see tests.
I.e. they should be in test/clang-tidy/bugprone-not-null-terminated-result-*.
E.g. i did not find the following tests:

void test1(char* dst, char* src) {
  strcpy(dst, src, 10);
}
void test2(char* dst, char* src, int len) {
  strcpy(dst, src, len);
}
void test3(char* dst, char* src) {
  memcpy(dst, src, 10);
}
void test4(char* dst, char* src, int len) {
  memcpy(dst, src, len);
}
...
clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
377–381 ↗(On Diff #150645)

The fixits are incorrect.
Increment by int(1) can result in overflow, which is then extended to size_t
It should be + 1UL.
https://godbolt.org/g/4nQiek

test/clang-tidy/bugprone-not-null-terminated-result-in-initialization.c
158–180 ↗(On Diff #150645)

??
These look like false-negatives.
How do you make an assumption about dest buffer from src string length?

Charusso updated this revision to Diff 157916.Jul 29 2018, 11:29 PM

Excuse me for the huge patch, but what I can done wrongly, I did, so I have made tons of revision.

I would like to show the current direction of the checker. Currently it has too much overlapping function, so please don't go too deep into the core.

Major fixes in every function:

  • Equal length of strlen(src) is now handled.
  • It can read out the length of the destination buffer.
  • If the length increases in the passed argument, it also increases the destination buffer (if overflow is looks like to possible).

Major fixes in memcpy() and memmove():

  • It can decide which function is the best in performance and safety. (The last implementation was rely on C++11 version, which was a huge mistake.)
  • Handles cases where the destination is unsigned char or signed char, which cannot be passed to any string handler function. (It haven't checked the type so that made wrong fix-its.)

Minor fixes:

  • Removed all memory allocation matchers in general, but the current functions behave the same to increase the buffer length.
  • Now the test files' RUN command working well. (CheckOptions was wrong.)

Problematic:

  1. It allows all custom memory allocation function. Sometimes the read buffer size is not the size parameter.
  2. May it is not a good idea to heuristically read and increase buffer lengths.
  3. If the new function transformed from memcpy() to strncpy(), the checker adds the null terminator expression in the next line, like dest[length] = '\0', which is looks like a quite big injection.
Charusso marked 2 inline comments as done.Jul 29 2018, 11:37 PM
Charusso added inline comments.
clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
377–381 ↗(On Diff #150645)

Could you show me a true example where it causes a problem? I think it is too rare to rewrite all code as you mentioned.

test/clang-tidy/bugprone-not-null-terminated-result-in-initialization.c
158–180 ↗(On Diff #150645)

My idea here is you do not want to set your null terminator to anything else, because then the result is not null-terminated.

MTC added a subscriber: MTC.Jul 30 2018, 1:09 AM
whisperity added a subscriber: gsd.Jul 31 2018, 9:22 AM
Charusso updated this revision to Diff 162942.EditedAug 28 2018, 1:49 PM
Charusso marked an inline comment as done.
Charusso retitled this revision from [clang-tidy] New checker for not null-terminated result caused by strlen or wcslen to [clang-tidy] New checker for not null-terminated result caused by strlen(), size() or equal length.
Charusso edited the summary of this revision. (Show Details)
  • The checker by default search for __STDC_WANT_LIB_EXT1__ macro to determine if _s suffixed functions are available. This behaviour could be overridden by specifying AreSafeFunctionsAvailable, which is became a string (it was an integer, but it can be used in the same way).
  • If the destination array length is based on a custom malloc() the current approach only want to rewrite it if it is 100% sure the chosen argument is specifying the length.
  • If the checker works with size() function now it is only match for std::string (e.g. it matched to StringRefs).
  • Now it handles macro defined lengths properly.
  • Added an option to inject UL suffix.
Charusso marked 2 inline comments as done.Aug 28 2018, 1:50 PM

Whew, this is a big one. Generally looks good, although I would separate implementation detail functions a bit better, with perhaps more comments to move them apart a bit, it is really harsh to scroll through.

Could you please re-run the findings on the projects? There has been a lot of improvements to the checker.
I think we should review it as it is now, and do further enhancements later on, in subsequent patches, where the enhancement implementations themselves are revealed better in the diff.

Generally grammar / phrasing things that have caught my eye.

The rest of the code looks okay, bar my previous comment about better commenting / separating chunks of helper functions.

clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
74 ↗(On Diff #162942)

getLocEnd will be removed, see D50353

294 ↗(On Diff #162942)

getLocEnd and getLocStart used here too.

docs/clang-tidy/checks/bugprone-not-null-terminated-result.rst
42 ↗(On Diff #162942)

with -> of

46 ↗(On Diff #162942)

functions

47 ↗(On Diff #162942)

are, versions

(Perhaps as this is user-facing code a half sentence about what safe version means could also be put here.)

47 ↗(On Diff #162942)

Analogly -> Respectively, / In a similar fashion,

48 ↗(On Diff #162942)

``-related
handled

52 ↗(On Diff #162942)

What is "just char"? Is unsigned char or signed char not "just char"?

52–53 ↗(On Diff #162942)

"that means it cannot be any string handler function" This sentence doesn't make any sense.

54 ↗(On Diff #162942)

Will be, or can be? Simply allocating more memory won't magically make a null terminator happen, I think. (Although I can be wrong.)

59 ↗(On Diff #162942)

The destination by itself can't overflow as it is a properly allocated memory.

Perhaps you meant "If copy to the destination array cannot overflow"?

59 ↗(On Diff #162942)

... then the simple cpy()-like functions should be used as they are more efficient.

(is should be the)

62 ↗(On Diff #162942)

Same, mention that the copy/move/write overflows, not the array itself.

Also, I'd phrase it as such: and the option AreSafeFunctionsAvailable is 1.

63–64 ↗(On Diff #162942)

read?

could be the safe version, cpy_s().

66–67 ↗(On Diff #162942)

If the new function could be the safe version and C++ files are analysed, the length of the destination array can be omitted.

(the target environment is not C++, but Linux / Apple / PowerPC, etc.)

69 ↗(On Diff #162942)

Start this line with "Rewrite based on" too so the rhythm of the paragraphs are kept.

71 ↗(On Diff #162942)

What is an "equals length"?
Perhaps strlen(source) or source.length()?

72–73 ↗(On Diff #162942)

Same comment about the "is should be".

"because it is" can be rephrased as "as it is", and also mention why the safe version shouldn't be used.

95 ↗(On Diff #162942)

What is this third argument?

100–101 ↗(On Diff #162942)

Too many indirections here. Instead of specifying the position of the argument, name them: "length of the source string", etc.

103 ↗(On Diff #162942)

The third argument (whatever it actually is as opposed to a positional indirection) is incremented accordingly.

105 ↗(On Diff #162942)

same about "incremented" instead of "getting a +1 operation" (that sounds like a surgery).

Also, the function name's codeblock is not terminated, see the syntax highlight.

125 ↗(On Diff #162942)

A string which is can be used as an integer.

Okay, let's just bask in the fact that we are C++ people and not mention subtle type conversions from the user into the parser here...

125–132 ↗(On Diff #162942)

Generally if you can (RST syntax allows) please reformat this block to include individual cases bulleted.

126–127 ↗(On Diff #162942)

or not set empty string (default value), the checker relies on the __STDC_WANT_LIB_EXT1__ macro being defined.

128–130 ↗(On Diff #162942)

value

the target environment is considered to implements

"Character handler" -- did you mean "string transformation" or "string handling"?

131–132 ↗(On Diff #162942)

No, n, or 0 (zero), the ...

134 ↗(On Diff #162942)

If this is a user-facing thing, I would like to find a better name for this. Why and when are unsigned long increments needed?

whisperity requested changes to this revision.Sep 4 2018, 5:05 AM
This revision now requires changes to proceed.Sep 4 2018, 5:05 AM

D50353 has landed, so after a rebase this patch will not compile.

Charusso updated this revision to Diff 166802.EditedSep 24 2018, 9:08 PM
Charusso marked 28 inline comments as done.
  • Removed InjectUL option so the checker handles the special case if the capacity of a buffer is int.Max()

The results are accessible trough the index.html in each folder:

@aaron.ballman a friendly reminder.

I have checked the results, thank you for uploading them, they look solid to me, although I'm not exactly a developer for these projects, without full understanding of what and where allocates and true path-sensitive analysis and memory modelling, they look good. (E.g. one thing this check misses I think is when the allocator returns an explicitly zero-filled memory, because that way the write without the good size is still NUL-terminated... but this requires modelling we might just not be capable of, especially not in Clang-Tidy.)

With a bit of focused glancing, the check's code is also understandable, thanks. :)

aaron.ballman added inline comments.Sep 25 2018, 1:31 PM
clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
37 ↗(On Diff #166802)

Please drop the LHK_ prefix -- the enum class name is sufficient as a prefix.

44 ↗(On Diff #166802)

This seems like it should be a parameter threaded through the function calls from check() rather than a static data member.

55 ↗(On Diff #166802)

These functions don't really add much benefit given that they're one-liners where the implementation is shorter than the declaration.

75 ↗(On Diff #166802)

Length is could -> Length could

93 ↗(On Diff #166802)

it denoted as unknown -> it is denoted as unknown.

(Be sure to add the full stop at the end of the sentence.)

101 ↗(On Diff #166802)

prevents to increase -> prevents increasing

106 ↗(On Diff #166802)

Missing the full stop at the end of the sentence.

110 ↗(On Diff #166802)

that is not -> that it is not

Missing the full stop at the end of the sentence.

111 ↗(On Diff #166802)

isNoWrongLength should probably be named isNotWrongLength() or perhaps reverse the logic to isCorrectLength().

118 ↗(On Diff #166802)

EQ -> Equal, but perhaps "lenEquaSrcLen()" is more succinct?

131 ↗(On Diff #166802)

Reorder the words and add a full stop: Increase or decrease an integral expression by one.

137 ↗(On Diff #166802)

Similar: Increase or decrease the passed integral argument by one.

143 ↗(On Diff #166802)

Missing full stop.

196 ↗(On Diff #166802)

You can call IgnoreParenImpCasts() instead.

201 ↗(On Diff #166802)

You can remove the local const qualifications here (we don't typically use that for local unless the qualified type is a pointer or a reference).

Same elsewhere in the patch.

268 ↗(On Diff #166802)

case -> cases

331 ↗(On Diff #166802)

case -> cases

383 ↗(On Diff #166802)

You can use 0U here instead of an explicit cast.

387 ↗(On Diff #166802)

case -> cases

459–460 ↗(On Diff #166802)

This is insufficient. Just because the user *wants* the safe functions doesn't mean they're actually available. You also need to check that __STDC_LIB_EXT1__ is defined, as that tells you the implementation has Annex K functionality. If that's defined and the user also defines __STDC_WANT_LIB_EXT1__ before including C standard library headers, then they can access them.

I am betting our include fixer doesn't properly handle ensuring the correct macros are defined before including the headers, but I think that's acceptable for now.

514 ↗(On Diff #166802)

it is cannot -> it cannot

Also, add a full stop at the end of the sentence.

569–571 ↗(On Diff #166802)

This code is duplicated from above -- it'd be nice to refactor it into one place, perhaps.

595–596 ↗(On Diff #166802)

I think it's just as understandable, but more simple, to say "the length is too short to include the null terminator".

700 ↗(On Diff #166802)

Is there a way we can do this without storing duplicate information? This is already tracked by the Preprocessor instance, I believe, so if we had a way to query that rather than store our own list, that would be better.

715 ↗(On Diff #166802)

const auto *

733 ↗(On Diff #166802)

const auto *

843 ↗(On Diff #166802)

Missing full stop.

871–872 ↗(On Diff #166802)

Do not use auto when the type is not spelled out in the initialization.

879–883 ↗(On Diff #166802)

These should not be declared as auto -- can skip the assignment operator. Same suggestion applies elsewhere.

962 ↗(On Diff #166802)

Do not use auto here either.

983–984 ↗(On Diff #166802)

Don't use auto (elsewhere as well).

1038 ↗(On Diff #166802)

This should be done using a Twine.

clang-tidy/bugprone/NotNullTerminatedResultCheck.h
42–45 ↗(On Diff #166802)

These comments are out of date -- you don't need "Yes", just Y or y (e.g.) as the first character. So "Yolo" will also work. :-P

docs/clang-tidy/checks/bugprone-not-null-terminated-result.rst
9 ↗(On Diff #166802)

Drop "an".

whisperity added inline comments.
clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
1038 ↗(On Diff #166802)

Can you please give an example of how to well-approach this? We had issues with using Twines on the stack and then later on running into weird undefined behaviour. The documentation is not clear to a lot of our colleagues on where the Twine's usage barriers are... (Asking this not just for @Charusso but also for a few more colleagues, @baloghadamsoftware e.g.)

JonasToth added inline comments.
clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
1038 ↗(On Diff #166802)

naive approach:

std::string New = Twine("\n").concat(SpaceBeforeStmtStr).concat(exprToStr(..))...).str();

I think retrieving a SmallString is not supported. Please note, that Twine does support taking integer and floats directly without prior conversion.
You can use operator+ instead of concat too.

Did this help or did you have more specific issues?

aaron.ballman added inline comments.Sep 26 2018, 8:57 AM
clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
1038 ↗(On Diff #166802)

The important bit is -- don't try to store a Twine (locally or otherwise). It's fine as a parameter in a function, but otherwise you should try to use it only as part of an expression. e.g., (Twine("foo") + "bar" + baz).str() is a good pattern to get used to.

Charusso updated this revision to Diff 167597.Sep 29 2018, 6:38 AM
Charusso marked 37 inline comments as done.
Charusso edited the summary of this revision. (Show Details)

Thanks for the very in-depth review @aaron.ballman!

I have not found a way to obtain the instance of Preprocessor nicely, but it is working well. Changes:

  • Huge refactor .
  • Macro definition checking is rely on Preprocessor.
  • isUnjectUL() is now a function, not a static variable.
  • WantToUseSafeFunctions is the new integer specifying the AreSafeFunctionsAvailable variable without the override feature rely on the necessary macros has been defined.
  • Added a test case in bugprone-not-null-terminated-result-memcpy-before-safe.c if the user wants to use safe functions but it is unavailable.
  • Extra: the LOC is now a round number.
clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
1038 ↗(On Diff #166802)

+1 for @aaron.ballman's example, it is an unspoken standard.

Charusso marked an inline comment as done.Sep 29 2018, 6:39 AM
whisperity accepted this revision.Oct 12 2018, 3:15 AM

Have been looked at this far and wide too many times now. I say we roll this out and fix and improve later on, lest this only going into Clang 72. Results look promising, let's hope users start reporting good findings too.

Considering Tidy has no alpha group the name bugprone seems like a double-edged sword, as what is bugprone, the checker or what it finds? 😜

This revision is now accepted and ready to land.Oct 12 2018, 3:15 AM

@aaron.ballman Neither I nor @Charusso has commit rights, could you please commit this in our stead?

JonasToth added a comment.EditedOct 12 2018, 10:11 AM

The patch does not apply clean, could you please take a look at it?

  • was a non-issue from arc **

Am 12.10.2018 um 17:21 schrieb Whisperity via Phabricator:

whisperity added a comment.

@aaron.ballman Neither I nor @Charusso has commit rights, could you please commit this in our stead?

https://reviews.llvm.org/D45050

JonasToth closed this revision.Oct 12 2018, 10:25 AM

Thanks everyone for the contribution!

Unfortunatly this check does not compile on some ARM platforms. It seems that a matcher exceeds the recursion limit in template instantations.

http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/4405/steps/build%20stage%201/logs/stdio

I will revert the check for now as I did not find an easy fix now.

JonasToth reopened this revision.Oct 13 2018, 2:34 AM
This revision is now accepted and ready to land.Oct 13 2018, 2:34 AM
JonasToth requested changes to this revision.Oct 13 2018, 2:34 AM
This revision now requires changes to proceed.Oct 13 2018, 2:34 AM

Unfortunatly this check does not compile on some ARM platforms. It seems that a matcher exceeds the recursion limit in template instantations.

http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/4405/steps/build%20stage%201/logs/stdio

Well, I can do nothing, but ask for @klimek's help.

Manuel, could you review this error please?

I am in contact with the guy that actually discovered this breakage. It seems to be a weird and hard to reproduce error, but happens on x86 as well. So it is a compile/header something combination that results in the error. I am pretty sure that will take a while until we figured it out :/

Sorry for responding so late, but i thought i get a shot to look at it. Here is the mail from Douglas Yung who was so kind to isolate the breakage and find a reproducer.
If you have the chance please take a look at it.

Hi Jonas,

I have attached a bzipped preprocessed file that I can confirm will show the failure if built with clang++ version 3.7.1, specifically the version that shipped with ubuntu 14.04.5 LTS. Here is the full version information:

Ubuntu clang version 3.7.1-svn253742-1~exp1 (branches/release_37) (based on LLVM 3.7.1)

Target: x86_64-pc-linux-gnu
Thread model: posix

If you build it with “clang++ -std=c++11 NotNullTerminatedResultCheck.preproc.cpp” you should see the failure.

Hope this helps.
Douglas Yung

The file is in the attachement of this comment.

whisperity added a subscriber: dyung.

I have attached a bzipped preprocessed file that I can confirm will show the failure if built with clang++ version 3.7.1, specifically the version that shipped with ubuntu 14.04.5 LTS. Here is the full version information:

Ubuntu clang version 3.7.1-svn253742-1~exp1 (branches/release_37) (based on LLVM 3.7.1)

Target: x86_64-pc-linux-gnu
Thread model: posix

If you build it with “clang++ -std=c++11 NotNullTerminatedResultCheck.preproc.cpp” you should see the failure.

I have installed said Ubuntu in a virtual machine for testing this, but unfortunately only the following Clangs are available in the package manager for Trusty:

clang - C, C++ and Objective-C compiler (LLVM based)
clang-3.3 - C, C++ and Objective-C compiler (LLVM based)
clang-3.4 - C, C++ and Objective-C compiler (LLVM based)
clang-3.5 - C, C++ and Objective-C compiler (LLVM based)
clang-3.6 - C, C++ and Objective-C compiler (LLVM based)
clang-3.8 - C, C++ and Objective-C compiler (LLVM based)
clang-3.9 - C, C++ and Objective-C compiler (LLVM based)

(Where clang is just a synonym for clang-3.4.) There is no Clang 3.7 in the package upstream, it seems.


However, 16.04 LTS (Xenial) at the time of writing this comment has an clang-3.7 package, specifically this version:

Ubuntu clang version 3.7.1-2ubuntu2 (tags/RELEASE_371/final) (based on LLVM 3.7.1)
Target: x86_64-pc-linux-gnu
Thread model: posix

With this I can confirm I get a huge trace and the template depth overflow failure.

However, from the filename-line mappings of the preprocessed output, I can see that the type_traits header comes from /usr/include/c++/4.8/type_traits, which is a version 4.8 standard library, but installing the clang-3.7 package (through some transitivity in libasan and such) depended on gcc-5-base, upgrading it from the system-default 5.3.1 to 5.4.0.

Isn't this a discrepancy, relying on an older standard library than what is seemingly available on the system?

Interesting, did you try out gcc-4.8 and if it is reproducable with
that one? This version of gcc is still supported and if the problem
occurs with the libstdc++ shipped there, we need to consider it.

However, 16.04 LTS (Xenial) at the time of writing this comment has an clang-3.7 package, specifically this version:

Ubuntu clang version 3.7.1-2ubuntu2 (tags/RELEASE_371/final) (based on LLVM 3.7.1)
Target: x86_64-pc-linux-gnu
Thread model: posix

With this I can confirm I get a huge trace and the template depth overflow failure.

However, from the filename-line mappings of the preprocessed output, I can see that the type_traits header comes from /usr/include/c++/4.8/type_traits, which is a version 4.8 standard library, but installing the clang-3.7 package (through some transitivity in libasan and such) depended on gcc-5-base, upgrading it from the system-default 5.3.1 to 5.4.0.

Isn't this a discrepancy, relying on an older standard library than what is seemingly available on the system?

https://reviews.llvm.org/D45050

dyung added a comment.Oct 30 2018, 3:12 PM

I have installed said Ubuntu in a virtual machine for testing this, but unfortunately only the following Clangs are available in the package manager for Trusty:

clang - C, C++ and Objective-C compiler (LLVM based)
clang-3.3 - C, C++ and Objective-C compiler (LLVM based)
clang-3.4 - C, C++ and Objective-C compiler (LLVM based)
clang-3.5 - C, C++ and Objective-C compiler (LLVM based)
clang-3.6 - C, C++ and Objective-C compiler (LLVM based)
clang-3.8 - C, C++ and Objective-C compiler (LLVM based)
clang-3.9 - C, C++ and Objective-C compiler (LLVM based)

(Where clang is just a synonym for clang-3.4.) There is no Clang 3.7 in the package upstream, it seems.

Hi, I did not initially setup the machine that hit the failure, so I cannot say for certain where it got clang. I did notice though that the llvm.org releases page does seem to include a download link for clang 3.7.1 for ubuntu 14.04 (http://releases.llvm.org/3.7.1/clang+llvm-3.7.1-x86_64-linux-gnu-ubuntu-14.04.tar.xz).


However, 16.04 LTS (Xenial) at the time of writing this comment has an clang-3.7 package, specifically this version:

Ubuntu clang version 3.7.1-2ubuntu2 (tags/RELEASE_371/final) (based on LLVM 3.7.1)
Target: x86_64-pc-linux-gnu
Thread model: posix

With this I can confirm I get a huge trace and the template depth overflow failure.

However, from the filename-line mappings of the preprocessed output, I can see that the type_traits header comes from /usr/include/c++/4.8/type_traits, which is a version 4.8 standard library, but installing the clang-3.7 package (through some transitivity in libasan and such) depended on gcc-5-base, upgrading it from the system-default 5.3.1 to 5.4.0.

Isn't this a discrepancy, relying on an older standard library than what is seemingly available on the system?

Again, I'm not sure how the toolchain was installed on the system, and if it was installed by simply unzipping the tarball above (or similar) then I could see how dependencies could go unmet. Offhand I do not even know if gcc 5+ is installed on the machine, but I can check if you think it is important.

gerazo added a subscriber: gerazo.Nov 26 2018, 7:20 AM
Charusso updated this revision to Diff 220215.Sep 14 2019, 8:23 AM
Charusso edited the summary of this revision. (Show Details)
Charusso removed reviewers: hokein, ilya-biryukov, xbolva00, dyung.
Charusso set the repository for this revision to rCTE Clang Tools Extra.

After a while I try to make this patch arrive. I wanted to split it up to multiple patches, but everything tied together so I decided to fix false positives instead with improving the existing APIs. Please visit the diff of the test cases and the documentation to see the changes.

Here are some interesting findings:

bitcoin/src/leveldb/db/c.cc:
- char* result = reinterpret_cast<char*>(malloc(sizeof(char) * str.size()));
- memcpy(result, str.data(), sizeof(char) * str.size());
+ char* result = reinterpret_cast<char*>(malloc((sizeof(char) * str.size()) + 1));
+ strcpy(result, str.data());

ffmpeg/libavformat/avio.c:
- memmove(start, key+1, strlen(key));
+ memmove(start, key+1, strlen(key) + 1);

ffmpeg/libavformat/mpeg.c:
- memcpy(ext, !strncmp(ext, "IDX", 3) ? "SUB" : "sub", 3);
+ strcpy(ext, !strncmp(ext, "IDX", 3) ? "SUB" : "sub");

ffmpeg/libavformat/oggparseskeleton.c:
- strncmp(buf, "fishead", 8)
+ strncmp(buf, "fishead", 7)

sqlite/shell.c:
#define APND_MARK_PREFIX     "Start-Of-SQLite3-"
#define APND_MARK_PREFIX_SZ  17
unsigned char a[APND_MARK_SIZE];
- memcpy(a, APND_MARK_PREFIX, APND_MARK_PREFIX_SZ);
+ strcpy((char *)a, APND_MARK_PREFIX);
Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2019, 8:23 AM

Hm, I wanted to upload the new patch here to see the changes with the diff-mode, but it does not work, sorry.

A couple of drive-by comments.

clang-tools-extra/clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
23–34

Binding names are copied/read a lot (for memoization purposes, see BoundNodesMap) during matching. It makes sense to make them shorter. Ideally they should fit into the inline std::string buffer (15 characters in libc++ on x86-64, https://gcc.godbolt.org/z/oNBghM).

277–278

This conditional statement never affects the result of the function. Either there's a typo somewhere or this statement can be removed.

Charusso updated this revision to Diff 221096.EditedSep 20 2019, 12:58 PM
  • Fix.
  • Tiny refactor.
Charusso marked 3 inline comments as done.Sep 20 2019, 12:59 PM

Thanks!

clang-tools-extra/clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
23–34

Great explanation, thanks you!

Charusso marked an inline comment as done.Sep 20 2019, 12:59 PM
JonasToth resigned from this revision.Oct 11 2019, 3:47 AM
This revision is now accepted and ready to land.Oct 11 2019, 3:47 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Oct 13 2019, 3:49 AM

This fails on Windows: http://45.33.8.238/win/386/step_7.txt

Please take a look, and if it's not immediately clear how to fix, b please revert while you investigate.

This fails on Windows: http://45.33.8.238/win/386/step_7.txt

Please take a look, and if it's not immediately clear how to fix, b please revert while you investigate.

Oh, thanks! We define it like typedef __typeof(sizeof(int)) size_t;, so by rL374715 it needs to work.

xbolva00 added a comment.EditedOct 13 2019, 4:09 AM

typedef __SIZE_TYPE__ size_t;

typedef __SIZE_TYPE__ size_t;

Hm, yes, I have overestimated the uploaded solution:

$ clang/test grep -rn 'typedef __SIZE_TYPE__ size_t;' | wc -l
60
$ clang/test grep -rn 'typedef __typeof(sizeof(int)) size_t;' | wc -l
45

Thanks!

I think everything is working fine on the build-bot side. One tiny issue on Windows left where the getLength() could not obtain the length properly: rL374713, I will look into that.

Shouldn't it catch in curl also this code?

urllen = strlen(url_clone);
....
memcpy(newest, url_clone, urllen);

Edit: if possible, report these bugs to project developers :)

@xbolva00, now I answer it: it is out of the scope of the Tidy, sadly. I really wanted to make this work with tons of heuristics, which made that patch so enormous. I have not forgotten it, I will report the found issues some months later. Now I want to do the same patch with the help of the Analyzer, and research that, whether this type of matching is could be done more appropriately. I have already started that idea by D68725. Thanks again!

Thanks @JonasToth, you really wanted to see my first patch upstreamed, and also thanks everyone, your support really helped me to understand how to do reviews and how to Clang. I apologize for the size, I felt that the larger the better.