This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Fix clang_analyzer_getExtent for heap regions
AbandonedPublic

Authored by steakhal on Mar 31 2021, 6:02 AM.

Details

Summary

This patch fixes an interesting case with the clang_analyzer_getExtent analyzer debug intrinsic.

Previously, one could not query the extent for a heap-allocated object.
I'm resolving this issue, by querying the extent of the base region of the given region.
This way, we will be able to query the extent of a new/malloced region in tests.

This should not change any meaningful behavior inside the analyzer.

Diff Detail

Event Timeline

steakhal created this revision.Mar 31 2021, 6:02 AM
steakhal requested review of this revision.Mar 31 2021, 6:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2021, 6:02 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
vsavchenko accepted this revision.Mar 31 2021, 6:32 AM

Looking great! Thanks!

This revision is now accepted and ready to land.Mar 31 2021, 6:32 AM
vsavchenko added inline comments.Mar 31 2021, 6:33 AM
clang/test/Analysis/explain-svals.cpp
55–56

I guess this should go now, right?

martong added inline comments.Mar 31 2021, 7:14 AM
clang/lib/StaticAnalyzer/Checkers/ExprInspectionChecker.cpp
257

Wait a bit, I think it should consider the offset too. So, getDynamicSizeWithOffset should be used IMHO.

martong added inline comments.Mar 31 2021, 7:17 AM
clang/lib/StaticAnalyzer/Checkers/ExprInspectionChecker.cpp
257

Or we should have a clang_analyzer_getExtentOfBaseRegion to be precise.

Ah, I see you need nonloc::SymbolVal in your next patch, and getDynamicSizeWithOffset returns an Unknown if the extend is symbolic.

Anyway, I still feel misleading that clang_analyzer_getExtent does not handle the offset. Could we change getDynamicSizeWithOffset to return with a symbolic offset instead unknown without regression?

steakhal planned changes to this revision.Mar 31 2021, 7:39 AM

Ah, I see you need nonloc::SymbolVal in your next patch, and getDynamicSizeWithOffset returns an Unknown if the extend is symbolic.

Anyway, I still feel misleading that clang_analyzer_getExtent does not handle the offset. Could we change getDynamicSizeWithOffset to return with a symbolic offset instead unknown without regression?

I'm gonna investigate. Thank you all for the snappy review!

BTW I really don't like the naming of these. Both 'dynamic' and 'size' are somewhat overloaded. I very much prefer extent to any of these.

clang/lib/StaticAnalyzer/Checkers/ExprInspectionChecker.cpp
257

I was thinking exactly the same!
I would rather choose getDynamicSizeWithOffset() instead of creating another one.

clang/test/Analysis/explain-svals.cpp
55–56

uh I missed it :D

steakhal updated this revision to Diff 334470.Mar 31 2021, 9:50 AM
steakhal marked 3 inline comments as done.

Fix comments.
I could not manage to create an unknown extent, where the behavior would diverge.

This revision is now accepted and ready to land.Mar 31 2021, 9:50 AM
NoQ added a comment.Apr 1 2021, 8:39 PM

What about clang_analyzer_getExtent(&x[2]) then?

In D99658#2665730, @NoQ wrote:

What about clang_analyzer_getExtent(&x[2]) then?

I guess (extent of heap segment that starts at symbol of type 'int *' conjured at statement 'new int [ext]') - 8 is the value I expect - which is the size of the remaining part of the memory region from x+2 in bytes.
We can argue about the semantics of this debug intrinsic or the name of it though.

NoQ accepted this revision.Apr 6 2021, 10:07 AM

I mean, the extent of an ElementRegion is the size of a single element. The reason why our intrinsic isn't doing what you expect is because we represent the pointer with offset as ElementRegion regardless of whether operator [] was used. On the other hand, an SVal doesn't ever represent a region at all, it only points to its first byte, regardless of the structure of MemRegion inside it; for that reason clang_analyzer_getExtent() is impossible to implement correctly in our current model.

So i think both behaviors are incorrect but if you think it makes it easier to write tests then absolutely go for it!

steakhal abandoned this revision.Apr 7 2021, 4:06 AM
In D99658#2671747, @NoQ wrote:

I mean, the extent of an ElementRegion is the size of a single element. The reason why our intrinsic isn't doing what you expect is because we represent the pointer with offset as ElementRegion regardless of whether operator [] was used. On the other hand, an SVal doesn't ever represent a region at all, it only points to its first byte, regardless of the structure of MemRegion inside it; for that reason clang_analyzer_getExtent() is impossible to implement correctly in our current model.

So i think both behaviors are incorrect but if you think it makes it easier to write tests then absolutely go for it!

Oh, I got it. Hm, yes, both are wrong xD

TBH since the new dumpExtent debug functions 89d210fe1a7a1c6cbf926df0595b6f107bc491d5 landed, I no longer need this, thus I'm abandoning this change.
Thank you all for the review though.