Diagnose the case when a negative value is assigned to an unsigned variable.
Unfortunately, there are no existing diagnostics for such assignments
that cause the loss of the sign, potentially resulting in nasty silent runtime failures.
Differential D10634
Loss of Sign Checker soumitra on Jun 22 2015, 10:21 PM. Authored by
Details
Diagnose the case when a negative value is assigned to an unsigned variable. Unfortunately, there are no existing diagnostics for such assignments
Diff Detail Event TimelineComment Actions I recommend a more specific name than LossOfSignChecker unless you want it to have more warnings later. Right now, a name such as LossOfSignInAssignmentChecker.cpp would be better imho. Sure there can be issues with loss of sign. If we can warn about mistakes that is awesome! However personally I think it seems noisy to warn about: unsigned char uc1 = -1; As I see it, it is quite clear that the developer wants to assign -1 to uc1 and that a cast is expected. On a normal machine the uc1 would get the value 255 and this is likely expected. Comment Actions I don't have immediate plans to add more warnings on this, but there definitely are many more cases where we can lose the sign and potentially cause runtime failures. That said, I can rename it as you suggest since I do not have immediate plans to address more cases.
Sure, but wouldn't it be better to expect an explicit cast if the developer really meant what he wrote? No, not really. However, I hit this issue when moving a large software from x86 to AArch64, where unfortunately, the 'sign' of a plain char changes, causing us to lose days of effort in narrowing down the issue. For example, the following representative program that works perfectly on x86 would fail silently on AArch64: 1 #include <stdio.h> 2 3 const char data[] = { -1 }; 4 5 int main() { 6 int value = data[0]; 7 if (value != -1) { 8 printf("failed! value = %d\n", value); 9 return 1; 10 } 11 return 0; 12 } Do you think narrowing down the checker to the above singular case would be more appropriate? I would still think that while the 'unsigned char' case you point out may be a grey area, an "integer" assignment with the wrong sign would certainly not be a good programming practice. Comment Actions ok. I guess more warnings can be added later. So a generic name for the class is not a bad idea. I'd recommend that you change the id in the td like: def LossOfSignChecker : Checker<"LossOfSignAssign">, And then also rename the testfile to LossOfSignAssign.c.
Well.. I am not sure what the Clang users thinks about this. Maybe they would accept changing their code. Personally I'd prefer: unsigned char = ~0U; It would be nice to see some actual warnings and see if there are possible mistakes.
ok. I can run it on some projects.. I have a testscript I run.
Nice example.
I would definitely feel better about this then. But I will try your current patch on some open source projects.. and then we will have some actual results to look at. Feel free to run your patch on some projects too. Comment Actions hmm.. I get stack dumps. Please review the code around line 151 in LossOfSignChecker.cpp To reproduce a dump on a linux box, you can use such commands: Stack dump: #0 0x46e3574 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/danielm/llvm/lib/Support/Unix/Signals.inc:437:0 Comment Actions I also get such stack dumps when I run 'make check-all' .. does everything work fine for you when you run all the tests? Comment Actions I had missed a check for initializer on a variable in my previous revision, resulting in a stack dump. Fixed. Comment Actions I get a compiler error now. Either I do something wrong, or you did not update the Clang source code. Let me know if I do something wrong.. /home/danielm/llvm/tools/clang/lib/StaticAnalyzer/Checkers/LossOfSignChecker.cpp:63:61: error: no matching function for call to ‘clang::ento::CheckerContext::emitReport(clang::ento::BugReport*)’ C.emitReport(new BugReport(*BT, BT->getDescription(), N)); Comment Actions Missed updating my working copy prior to creating the patch. Comment Actions
Thank you! The code is not unreachable as far as I see.. the function yylex() can be called from another source file for instance. The warning is correct if we assume that the global variables have their default initialisation values. That is probably not the case in the real code. For now I think the yylex warning can be ok. At least in the reduced code, we can't prove that rhs is never negative. If we want to fix these warnings later, I think that it must be solved in the Clang infrastructure. Then I would suggest we do that in a separate patch. I will also run this checker on some projects..
Comment Actions Incorporated review comments by Daniel. Comment Actions Here are some warnings.. could you please look at some of them... There are a few cases where RHS is a constant integer literal.. do you think there could be a bug in some of them? ftp://ftp.sunet.se/pub/Linux/distributions/Debian/debian/pool/main/a/abcm2ps/abcm2ps_7.8.9.orig.tar.gz ps_x = -1; ftp://ftp.sunet.se/pub/Linux/distributions/Debian/debian/pool/main/a/adolc/adolc_2.5.2.orig.tar.gz number = lastTayP1 - ADOLC_CURRENT_TAPE_INFOS.tayBuffer; ftp://ftp.sunet.se/pub/Linux/distributions/Debian/debian/pool/main/a/alsa-lib/alsa-lib_1.0.28.orig.tar.bz2 snd_pcm_uframes_t ptr1 = -2LL /* invalid value */, ptr2; ftp://ftp.sunet.se/pub/Linux/distributions/Debian/debian/pool/main/a/anubis/anubis_4.1.1+dfsg1.orig.tar.gz rc_yy_size_t new_size = (rc_yy_n_chars) + number_to_move + ((rc_yy_n_chars) >> 1); ftp://ftp.sunet.se/pub/Linux/distributions/Debian/debian/pool/main/a/atheme-services/atheme-services_7.0.7.orig.tar.bz2 clang-tidy atheme-services-7.0.7/libathemecore/reslib.c n = labellen(cp - 1); /* XXX */ clang-tidy atheme-services-7.0.7/libmowgli-2/src/libmowgli/dns/dns_evloop_reslib.c n = labellen(cp - 1); /* XXX */ clang-tidy avfs-1.0.1/src/sysdeps.c static avuid_t myuid = -1; ^ sysdeps.c:251:5: note: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior static avuid_t myuid = -1; ^ sysdeps.c:252:5: warning: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior [clang-analyzer-alpha.core.LossOfSignAssign] static avuid_t mygid = -1; ^ sysdeps.c:252:5: note: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior static avuid_t mygid = -1; ^ sysdeps.c:251:5: warning: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior [clang-analyzer-alpha.core.LossOfSignAssign] static avuid_t myuid = -1; ^ sysdeps.c:251:5: note: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior static avuid_t myuid = -1; ^ sysdeps.c:252:5: warning: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior [clang-analyzer-alpha.core.LossOfSignAssign] static avuid_t mygid = -1; ^ sysdeps.c:252:5: note: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior static avuid_t mygid = -1; ^ clang-tidy avfs-1.0.1/zlib/deflate.c n = read_buf(s->strm, s->window + s->strstart + s->lookahead, more);
Comment Actions Incorporated further review comments by Daniel. Investigating individual diagnostics pointed out by Daniel. Comment Actions Only a temporary assignment, before being assigned to "signed char" ps_func, and hence safe.
This one indeed looks suspicious, since "number" is subsequently used to compute "chunks", which is being used as a loop control guard initialized with 0.
This can be potentially problematic due to the subsequent equality check with "ptr2".
Here, "new_size" is the target size of a realloc, which can go haywire if the assigned value was indeed negative.
This is probably problematic, since a subsequent check "loading_gdbv >= 4" (in function "db_h_grp") will always be true, unless modified by an intervening assignment.
This looks safe to me.
This looks safe to me.
Looks like you missed mentioning the source.
My understanding was that the above should cause the subsequent comparison with '-1' to fail, but it doesn't. Promotion?
I thought I am only emitting a warning. Where is this "note" originating from? Can this duplicate diagnostic be suppressed?
Same as above.
Here, "n" is used to adjust the "lookahead", which is then compared with "MIN_MATCH". On assigning a negative value (in effect a huge positive value), the above comparison may fail. Comment Actions Here is a bigger log with more warnings. there are some cases where I don't think there should be a warning.. for instance some compound assignments: args.c:990:9: warning: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior [clang-analyzer-alpha.core.LossOfSignAssign] tmp /= 10; ^ read.c:4940:13: warning: assigning negative value to unsigned variable loses sign and may cause undesired runtime behavior [clang-analyzer-alpha.core.LossOfSignAssign] value >>= 7; ^ I am not an expert but it seems strange to me that clang seems to think that 10 and 7 are negative. Can you investigate this? Comment Actions Here's a short test case that shows the problem: 1 void safe_mul(unsigned long x) { 2 unsigned long multiplier = 1000000000000000000; 3 while (x > 0) { 4 multiplier /= 10; 5 } 6 } I may be wrong here, but I believe that the warning is due to an overflow during constant folding of the initializer, which causes the RHS to be treated as negative. Comment Actions
Thanks! I believe a fix for this problem would be a good thing. So that when this patch is added we don't get these false positives. It should be a separate patch. Comment Actions I have a couple of high level comments. Why do you have checkASTDecl (which is a syntactic check) in addition to the checkBind (which is a path-sensitive check)? Does checkASTDecl find more issues? can those be found using a path sensitive callback? I am leaning toward allowing explicit assignments to "-1", like in this case: "unsigned int j = -1". The tool is much more usable if there are few false positives. Some of the results look like false positives. For example, this one from https://drive.google.com/file/d/0BykPmWrCOxt2aGtRNTY4eXQ1OU0/view: namelen -= 9; /* deleted the string 'HIERARCH ' */ It's hard to know what they are if we are only looking at the line where the issue is reported without seeing the pull path. Do we know that the value can be negative on that path?
Comment Actions Yes, it seems that the global assignments can only be caught using the ASTDecl checker (probably because they do not have an associated "path"?) Please do let me know in case I am wrong in my understanding.
This is exactly what I started off with, albeit with a plain 'char' instead of 'unsigned int'. We were hitting a runtime issue while porting a large piece of software to AArch64 since the "signedness" of plain char changes across x86 and AArch64, and a negative value was used as a initializer.
I don't see the above in the link you posted. Can you please point me to the source so that I can look into this?
What I am trying to do is check if the RHS of an unsigned initialization is negative and if so, flag off that assignment.
Comment Actions I am also still skeptic about this. Ideally there should only be warnings when there is a mistake. In your example code you showed previously there were portability problems because the signedness changed. A warning for that is ok in my opinion. If the code says 'unsigned int j = -1;' then there is no such portability problem. Comment Actions Ah! Now I get it!! So, the case I have is assignment of a negative value to a 'plain char'. Is there a way I can check for a 'plain char' in the checker as against a generic 'unsigned' integral variable? |
I don't know see the connection against "ExprEngine".. so this comment looks confusing to me.. how about removing "a builtin check in ExprEngine"?