With the option '-analyzer-config stable-report-filename=true', instead of report-XXXXXX.html, generate report-<filename>-<function, method name>-<function position>-<id>.html. (id = i++ for several issues found in the same function/method)
Details
Diff Detail
Event Timeline
I'm not sure this is that much better—do we guarantee that diagnostics are always reported in the same order between two runs? I'm not sure we do. And there's also a race here if two analyzer processes are analyzing two files with the same name, or if they end up reporting issues in a common header.
I'd be okay with putting the file and declaration name in before the random string, though.
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
79 | Might as well be const from here all the way down. |
I agree, this patch is not perfect. I do not guarantee that the file name will remain the same....
However, the generated file is "source name + function/method name", in most of the cases, we won't have too many issues in the same function.
Maybe I could add the type of error in the filename. This will decrease the risk of collision (but not fix it completely)
I'd be okay with putting the file and declaration name in before the random string, though.
That would not fix my problem. I am running daily scan-build on LLVM/Clang (http://buildd-clang.debian.net/scan-build/) + Firefox but I cannot share the URL with others because they are not stable enough (because of the random aspect)...
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
79 | Right, I will fix that. |
If you want, I can keep the current behavior and instead add a option --no-random-URL or -no-random-filename-report. That won't impact scan-build and you will make me (and my users) happy :p
I think my point is that your URLs still won't be stable unless your build is non-parallel and the analyzer is deterministic, but that is a good use case.
If the analyzer is non-deterministic in the output order of bugs from a single TU, I think we want to fix that, but if the source is changing, you could introduce bugs in between other bugs and shift the numbers down. Maybe you could put the line number in as well?
I'm not quite sure how to stop multiple TUs from reporting the same bugs. For that you'd need to use some kind of issue hash, like CmpRuns.py does.
OK. I was expecting clang analyzer to process them in the same order of reading... Sorry! :)
I will add the line number to the file name.
It's still possible to have problems with this (if there are multiple issues occurring on the same line), but mostly I'd think you wouldn't want to use the line number as a stable URL. What if someone inserts a comment at the top of the file?
(This is why the hash in the plist output is based on "lines from the start of the function body", when possible. Definitely not perfect either, but slightly more resilient.)
I guess a lot of this depends on what you are using this for. Do you want report URLs to stay stable even when the source code changes? Do you want to try to deduplicate when different main files produce the same diagnostic? Do you want to be able to handle multiple reports on the same line?
(One way to handle the last is to include the checker name in the report somehow, possibly via hash. It's less likely that the same line will contain two issues reported by the same checker, though not impossible.)
Also, if you are not already, please check that scan-view continues to work as well as scan-build. I don't foresee any problems since you're keeping the "report-" prefix, but just in case.
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
281 | declName may contain non-valid URL characters. The name of an overloaded operator function is the operator. Objective-C methods probably don't include the parent class in the name, but they could: -[FooBarClass fooBar]. |
I'd think you wouldn't want to use the line number as a stable URL. What if someone inserts a comment at the top of the file?
That is why I didn't add the line at the first try (I added it because you asked :) or maybe you were talking about "lines from the start of the function body"
Do you want report URLs to stay stable even when the source code changes?
Yes, at least for a few weeks until a bug is fixed and the report disappears.
Do you want to try to deduplicate when different main files produce the same diagnostic? Do you want to be able to handle multiple reports on the same line?
It is not my current goals but maybe during the GSoC.
Also, if you are not already, please check that scan-view continues to work as well as scan-build. I don't foresee any problems since you're keeping the "report-" prefix, but just in case.
Yep, I did. It works fine!
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
281 | OK. I am not familiar with Objective-C. I will have a look on how improve the filename. |
I'd think you wouldn't want to use the line number as a stable URL. What if someone inserts a comment at the top of the file?
That is why I didn't add the line at the first try (I added it because you asked :) or maybe you were talking about "lines from the start of the function body"
Yeah, I'm not sure what to say about that. Sequentially numbering bugs is a problem anyway if someone else introduces a new bug. I guess you'll have to pick your poison.
I'd like to keep this new behavior off by default for now until either Ted or Anna decides that it's enough superior to the current behavior that it's worth changing, but I think it's good enough to take under a flag now, once you've decided which problem you're going to live with.
OK. I will implement these changes!
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
281 | Do you know if we have a method for this or should I implement it? |
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
281 | To do what, mangle an arbitrary string into an identifier, or URL-safe string? Sorry, I can't remember seeing one, though there's a lot in llvm/Support and llvm/ADT that I eventually discover. You could also just leave the special characters in and make sure they're escaped when scan-build references them by URL. |
existing -analyzer-option API. I cannot find a reference to this.
Do you mean using the same mechanism as -analyzer-eagerly-assume (for example) or using -analyzer-config ?
If you are talking about the first, where is it defined? Thanks
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
73 | _ removed. I will keep AnalyzerOpts if you don't mind, as consistency with PlistDiagnostics.cpp | |
133 | Same here. As consistency with PlistDiagnostics in which a std::string is used. I don't mind updating the two files if you prefer. |
Yes, sorry, -analyzer-config, not -analyzer-option.
(There's been a lot going on these past two weeks.)
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
133 | Yes, please. No need to incur extra random heap allocations, even if it doesn't make such a difference in this case. |
This is looking good! Thanks for going through these revisions. One more major change, to avoid stomping on the same file.
include/clang/StaticAnalyzer/Core/AnalyzerOptions.h | ||
---|---|---|
365–369 | Please note which is the default here. Also, you're missing a period on the first line. | |
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
132–133 | Funny indentation here. | |
289–301 | This is race-y: what if you have two analyzer instances running at once? You should use the F_Excl flag for llvm::sys::fs::openFileForWrite. Also, please put spaces around =. |
This is race-y: what if you have two analyzer instances running at once? You should use the F_Excl flag for llvm::sys::fs::openFileForWrite.
Are you sure this can happen ? I had the feeling that the analyzer is run a on a file as a whole and since I am writing a file with a file in the name, I cannot have collision.
If you confirm, I will fix that.
If the issue is in a header file, multiple implementation files might include that header file. Better safe than sorry, anyway.
No, just put the open() in the loop, and skip the exists() check. That's faster anyway—less disk access.
Uh, you should probably check that the error code is actually "the file already exists" rather than something completely unrelated, like "you don't have permission to write to this directory".
Sorry. Long week. I am doing dumb things :)
I updated the code to this:
EC = llvm::sys::fs::openFileForWrite(Model.str(), FD, llvm::sys::fs::OpenFlags::F_RW | llvm::sys::fs::F_Excl); if (EC && EC != std::errc::file_exists) { llvm::errs() << "warning: could not create file '" << Model.str() << "': " << EC.message() << '\n'; return; }
Work for me
Okay, this one looks good. Last few comments, but no need for another round of review. Thanks, Sylvestre!
lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp | ||
---|---|---|
294 | Inconsistent here about whether or not you want to say "OpenFlags". Also, remember the 80-column limit in LLVM. | |
lib/StaticAnalyzer/Core/PlistDiagnostics.cpp | ||
418 ↗ | (On Diff #10402) | Ah, that's why. In this case the std::string is better because it avoids a copy. In the long run we should be eliminating getAsString in favor of streaming to raw_ostreams, but that shouldn't block your patch. |
I implemented the 80 column change + backout of the smallstring change + remove of openflags.
Thanks again!
Please note which is the default here. Also, you're missing a period on the first line.