This change adds an option to detect all null dereferences for non-default address spaces, except for address spaces 256, 257 and 258. Those address spaces are special since null dereferences are not errors. All address spaces can be considered (except for 256, 257, and 258) by using -analyzer-config core.NullDereference:DetectAllNullDereferences=true. This option is false by default, retaining the original behavior. A LIT test was enhanced to cover this case, and the rst documentation was updated to describe this behavior.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Couls you please provide some context about why did we have this suppression in the first place?
Other than that, I'm on board with this, but we should really investigate the historical reasons why we had this.
If it turns out to be a valid usecase, but a rather special one we could still have an analyzer config option making this change optional.
@NoQ do you have any background on this?
@steakhal, good question. Digging into the change archaeology, I see this change was made by Anna Zaks in Jan 2016 - see commit header below.
There's no reference to a review with comments, and I don't know how to find those if they exist, perhaps @NoQ could help with that?
In any case, judging by the changes I had to make to the test cases, there were no regressions in the current test set, and at least our build tests ran ok - seems like this is no longer necessary. But I understand the need to approach this carefully.
Thanks
commit c9f16fe48c40e7683d0029b65090d0007414d7af
Author: Anna Zaks <ganna@apple.com>
Date:   Wed Jan 6 00:32:49 2016 +0000
    [analyzer] Don't report null dereferences on address_space annotated memory
    
    llvm-svn: 256885I looked up the history. I believe this refers to https://clang.llvm.org/docs/LanguageExtensions.html#memory-references-to-specified-segments:
Annotating a pointer with address space #256 causes it to be code generated relative to the X86 GS segment register, address space #257 causes it to be relative to the X86 FS segment, and address space #258 causes it to be relative to the X86 SS segment. Note that this is a very very low-level feature that should only be used if you know what you’re doing (for example in an OS kernel).
So basically ((void *__attribute__((address_space(256))) *)0) is a valid pointer that can be safely dereferenced.
I'm not aware of other situations of this kind. Probably this address space range can be hardcoded, and/or a run-time checker option can be provided to enable/disable this behavior.
Well, it seems like, on Windows, the FS register points to the Win32 Thread Environment Block - TEB (wiki). And at offset 0x0, resides the pointer to the Current Structured Exception Handling (SEH) frame, making it perfectly valid to load and dereference FS:[0]. Using FS and GS registers are frequently used for implementing anti-debug tricks in practice, but I think it's fairly safe to say that they are not our primary audience.
I'm not aware of other situations of this kind. Probably this address space range can be hardcoded, and/or a run-time checker option can be provided to enable/disable this behavior.
That being said, I think it might deserve a user-facing analyzer config option.
Ahhh, I see - thanks for the references and discussion. I'll craft a patch to comprehend the use case and user-facing option, and post for review. We'll see how it goes :)
| clang/docs/analyzer/checkers.rst | ||
|---|---|---|
| 69–83 | IMO this should be in a separate paragraph. It's such a niche use-case. | |
| clang/lib/StaticAnalyzer/Checkers/DereferenceChecker.cpp | ||
| 120 | Shouldn't we check for this target? | |
| 126–134 | Even though DetectAllNullDereferences=true, address_space(256) would be suppressed, which seems to be a contradiction to me. We should either pick a different name or implement a different semantic. Then name should encode that it's purely about address-space attributes. QualType Ty = E->getType();
if (!Ty.hasAddressSpace())
  return false;
if (SuppressAddressSpaces)
  return true;
// On X86 addr spaces ...., thus ignored.
return target == x86 && addressspace in {256,257,258}; | |
| 343 | dead-code? | |
No major issues left.
| clang/lib/StaticAnalyzer/Checkers/DereferenceChecker.cpp | ||
|---|---|---|
| 56 | If I'm correct, we tend to use the 'DefaultBool' instead of plain old bools for checker options. Could you please check if this is the case and consolidate this? | |
| 343 | This still looks dead. | |
| clang/test/Analysis/cast-value-notes.cpp | ||
| 39 | Nit: the default triple is the host/native triple, but you did actually pin it to x86. Hence it might not be the default on other platforms. | |
Thanks for the the comments. I'll make the changes and resubmit. Please let me know your preferences for the test case (default vs pinned x86 or same).
Best!
| clang/lib/StaticAnalyzer/Checkers/DereferenceChecker.cpp | ||
|---|---|---|
| 56 | I'll address. FYI, looks like I chose the red pill when I found an example to go by :/ Looks like there's opportunity for some cleanup NFC type of patches. I'll pick these up in a separate patch. 120 namespace {
121 class DeadStoresChecker : public Checker<check::ASTCodeBody> {
122 public:
123   bool ShowFixIts = false;
124   bool WarnForDeadNestedAssignments = true;
125 
126   void checkASTCodeBody(const Decl *D, AnalysisManager &Mgr,
127                         BugReporter &BR) const;
128 }; | |
| 343 | arg, missed that. I'll clean this up. | |
| clang/test/Analysis/cast-value-notes.cpp | ||
| 39 | Yeah, the result is the same (as far as what the test code would look like). Not sure what your preferences are here. Would you prefer to see separate cases (default and pinned x86), or perhaps the macro names renamed to something that means both (instead of implying default)? | |
The two small issues are addressed, I think the only outstanding issue is the question about the test case. Please let me know a preference and I'll implement accordingly. Best!
I believe all comments have been addressed. After discussion with @steakhal, we're leaning towards removing the DefaultBool class and refactoring in favor of the language native bool type - mainly because DefaultBool doesn't add anything. Minimal yet effective is usually better.
IMO this should be in a separate paragraph. It's such a niche use-case.