This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][NFC] Refine CStringLength modeling API
Needs ReviewPublic

Authored by steakhal on Jul 30 2020, 12:56 PM.

Details

Summary

Refines the CStringLength modeling API to be pure.
Also removes the magic hypothetical parameter from the CStringChecker.

Now the getters and setters are behaving as expected.

Diff Detail

Event Timeline

steakhal created this revision.Jul 30 2020, 12:56 PM

It seems the Pre-merge clang-format went crazy.
I think just ignore the lint warnings for the unchanged lines.

Really I am still not totally familiar how the checkers work and if it is good to have these function names. I was thinking about any checker that needs a string length information. It could get the data from CStringChecker using a "get" function or test if there is data at all. In this case it may happen that this get function modifies state or computes the length at that point? And how do the set operation work, it only stores a value computed by the calling code? And the create operation then does the computation and set together? The functions look not symmetrical, the set and create takes a MemRegion but the get takes a SVal. The create function sets the length for the passed memory region but makes no tests on it like the set function (can we call the set function from the create?).

NoQ added inline comments.Aug 5 2020, 11:38 AM
clang/lib/StaticAnalyzer/Checkers/CStringChecker/CStringLength.h
42

Just pass by value then. It's already a "Ref".

50

This const does nothing.

steakhal updated this revision to Diff 286212.Aug 18 2020, 1:25 AM
steakhal marked 2 inline comments as done.
steakhal edited the summary of this revision. (Show Details)

Fixed Artem's inline comments:

  • cstring::getCStringLength now takes StateRef by value
  • cstring::dumpCStringLengths now takes by StateRef by non const value

Really I am still not totally familiar how the checkers work and if it is good to have these function names.

Yea, naming things is pretty hard. I'm open to have a less verbose one - especially since these API functions will live under some namespace.

[...] It could get the data from CStringChecker using a "get" function or test if there is data at all. In this case it may happen that this get function modifies state or computes the length at that point?

This patch makes sure that the get function will not modify the state (takes a const ProgramStateRef).
We don't really want to store the length of a StringLiteral, since that can be computed on demand.

And how do the set operation work, it only stores a value computed by the calling code?

It tries to avoid storing the length information. E.g. if a region refers to a string literal - we don't have to store the length. (This was the previous behavior, I don't want to change that.)
However, we don't handle ElementRegions (for now, but I'm working on it).

And the create operation then does the computation and set together?

CStringChecker needs a way to defer associating the length of the region.
I'm not sure why, but the magic Hypothetical parameter was introduced just for accomplishing that.
I'm not confident enough to refactor the function which exploits this. [1]

The functions look not symmetrical, the set and create takes a MemRegion but the get takes a SVal.

The current implementation checks if the SVal is a GotoLabel (which is not a MemRegion). I didn't want to change its behavior in this patch.
MemRegion parameter would fit much more nicely, just as you said.

The create function sets the length for the passed memory region

No, it just creates a symbol, representing a cstring length. That symbol is (strictly speaking) not yet associated with the memory region.
Furthermore, it applies implicit constraints like assuming that the cstring length is always less than the corresponding extent (It is not yet done, but I'm working on this too).

but makes no tests on it like the set function (can we call the set function from the create?).

We could call, but we need a mechanism to defer associating the length of a region.
CStringChecker relies on this behavior. See my paragraph about the Hypothetical parameter.


@balazske Unfortunately, I don't have satisfying answers to you, but any further improvements could be done in the future.