This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Replace StoreManager::CastRetrievedVal with SValBuilder::evalCast
ClosedPublic

Authored by ASDenysPetrov on Feb 4 2021, 4:11 PM.

Details

Summary

Move logic from CastRetrievedVal to evalCast and replace CastRetrievedVal with evalCast. Also move guts from SimpleSValBuilder::dispatchCast inside evalCast.
OriginalTy become a default param, because evalCast intends to substitute dispatchCast, evalCastFromNonLoc and evalCastFromLoc which don't have OriginalTy param. OriginalTy provides additional information for casting, which is useful for some cases and useless for others. If OriginalTy.isNull() is true, then the cast performs based on CastTy only. Now evalCast operates in two ways. It retains all previous behavior and take over dispatchCast behavior.

From this patch use evalCast instead of dispatchCast, evalCastFromNonLoc and evalCastFromLoc functions. dispatchCast redirects to evalCast.

This patch ideally shall not change any behavior. (Should we mark it as [NFC]?)

This is an intermediate revision for D89055.

Diff Detail

Event Timeline

ASDenysPetrov created this revision.Feb 4 2021, 4:11 PM
ASDenysPetrov requested review of this revision.Feb 4 2021, 4:11 PM

Please, consider removing D95799 from the parent revisions if that is not a hard requirement of this patch.
It would be awesome to land your patches on dealing with cast problems in the close future.
Dealing with floating-point values is a plus, but maybe not a necessity.

Please, consider removing D95799 from the parent revisions if that is not a hard requirement of this patch.

Actually D95799 is one of the steps to clean up CastRetrievedVal initially. It significantly reduce the next effort of replacing with evalCast. But if float cast fix blocks the patch stack, I can try to find a way to bypass it.

ASDenysPetrov updated this revision to Diff 322422.EditedFeb 9 2021, 9:18 AM

Updated. Unlinked parent revision D95799 from the stack. Made corresponding changes in this patch.
@steakhal I did it. The changes should be safer now.

Updated. Unlinked parent revision D95799 from the stack. Made corresponding changes in this patch.
@steakhal I did it. The changes should be safer now.

In the upcoming days, I'm gonna schedule this on our CI. We will see the results.

@steakhal

In the upcoming days, I'm gonna schedule this on our CI. We will see the results.

Thank you. That would be great!

I'm glad to see these patches, the SMT API will benefit greatly from them!

@ASDenysPetrov Could you please rebase this?
git apply does not seem to apply this patch on top of llvm/main.
If it depends on any parent revisions please document them as well.

This patch preserves all previous reports as expected.
You can check it by yourself at https://codechecker-demo.eastus.cloudapp.azure.com/Default/runs?run=D96090&items-per-page=50.

However, I still have some minor concerns inline.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
101–103

I'm not sure if this FIXME is still applicable.

I'm also confused about having two functions doing effectively the same thing.
SimpleSValBuilder::dispatchCast is a virtual function, which just invokes a non-virtual function SValBuilder::evalCast.

Why should it be virtual in the first place?

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
727–728

You are resolving exactly this FIXME in this patch if I'm correct.
Shouldn't you update this comment?

868–870

Eh, I don't like comments preceding a single-statement block.
It might be a personal preference though.

This patch preserves all previous reports as expected.

That's great results!

Could you please rebase this?

I'll update and rebase this patch soon.

If it depends on any parent revisions please document them as well.

It does. You can see this in the stack. Do I need to mention this somewhere else?

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
101–103

I want to make the patch smaller avoiding things that could be changed individually. We can remove dispatchCast in the next post-cleanup patch.

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
727–728

I am not actually sure that the function I've made is good enough and is the single one. =)

868–870

I am trying to highlight the redirection to another cast function. If you think it's redundant, I'll remove it.

Could you please rebase this?

I'll update and rebase this patch soon.

If it depends on any parent revisions please document them as well.

It does. You can see this in the stack. Do I need to mention this somewhere else?

No, I probably missed that. My bad.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
101–103

Seems reasonable to me.

ASDenysPetrov edited the summary of this revision. (Show Details)

Rebased revision on top of the main branch. Minor improvements.

ASDenysPetrov added a comment.EditedFeb 19 2021, 6:39 AM

@steakhal wrote:

This patch preserves all previous reports as expected.

If this patch is technically full-baked and does not bring any harm, can we approve it as well to move forward?

@steakhal wrote:

This patch preserves all previous reports as expected.

Even with or without Z3 refutation.

If this patch is technically full-baked and does not bring any harm, can we approve it as well to move forward?

I would like @NoQ or someone else to also review this.

@steakhal

I would like @NoQ or someone else to also review this.

Oh, I would be very pleased if our colleagues pay attention to my recent patches. But seems you are the only who helps me with this stuff 🙂

NoQ added inline comments.Mar 3 2021, 6:04 PM
clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
539–542

This is one of the most important APIs of SValBuilder, alongside evalBinOp(). The newly added behavior is definitely obscure and needs explaining. So i think you should move this comment into the header, turn it into a doxygen comment, and elaborate what "less accurately" actually means here.

NoQ added a comment.Mar 3 2021, 6:12 PM

So, like, I think it's a start. You introduced a single source of truth for casting SVals and it's good. But i'm worried about our API surface.

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
539–542

Like, are you sure that the new behavior makes sense at all for call sites that aren't ex-CastRetrievedVal? When, and why, would a regular user use this functionality?

Maybe instead of obfuscating this functionality as a null pointer we should add an explicit flag and give it an understandable name? Even if it's bool CastRetrievedValBackdoorHackDoNotUse = false it's still much better than "umm we kind of expect it to work if you pass null here but we don't really know how".

564

No, don't add default:. All possible cases are currently handled. It breaks compiler warning about not all cases handled in the switch when new enum cases are added. Compile-time warnings are better than run-time warnings.

Rebased on main.

@NoQ
Thanks, I could finally draw your attention :)

When, and why, would a regular user use this functionality?

IMO there is no reason to use evalCast without an original type. But I'm not sure that you can get an original type in every case.

About CastRetrievedValBackdoorHackDoNotUse I need a strong advise or some discussion. Look.
Initially evalCast deduces a new SVal using original QualType and cast QualType in case when we've got a state and expression and are able to get the original type. E.g.:

...

SVal OldV = Context.getSVal(Expression);
QualType OriginalTy = Expr->getType();
SVal NewV = SVB.evalCast(OldV, CastTy, Expr->getType());
...

But dispatchCast, evalCastFromNonLoc, evalCastFromLoc operate using cast QualType only without knowing the original one.

SVal OldV = Context.getSVal(Expression);
SVal NewV = SVB.evalCastFromLoc(OldV.castAs<Loc>, CastTy);

Actually many cases don't need to know an exact original type. Some cases can extract the type from SVal (SymbolVal, ConcreteInt) Other cases need it from outside (MemRegionVal, LocAsInteger).
dispatchCast, evalCastFromNonLoc, evalCastFromLoc have a lot of similar code which is already in evalCast. They also calls inside evalCast function. I decided to move their functionality into a new splitted version of evalCast to decrease complexity of comprehension and try to substitute them in the future with the single evalCast. But substitusion needs to deal something with absent original type parameter. Then I decided to add support for`evalCast` to work in both modes. Practically, new evaCast has additional checks and casts when the original type is not null.

Now evalCastFromNonLoc and evalCastFromLoc are used in SimpleSValBuilder functions only. I think it would be better to add an explaination to the doc like OriginalTy.isNull() is a CastRetrievedVal backdoor hack. Do not use it. instead of new flags. I'll make a better description, but at first I'd try to investigate how to pass an original type to evalCastFromNonLoc and evalCastFromLoc in SimpleSValBuilder functions. Or maybe some other solution.

In sum I'll try to handle this and provide a better solution.

Actually many cases don't need to know an exact original type. Some cases can extract the type from SVal (SymbolVal, ConcreteInt) Other cases need it from outside (MemRegionVal, LocAsInteger).
dispatchCast, evalCastFromNonLoc, evalCastFromLoc have a lot of similar code which is already in evalCast. They also calls inside evalCast function. I decided to move their functionality into a new splitted version of evalCast to decrease complexity of comprehension and try to substitute them in the future with the single evalCast. But substitusion needs to deal something with absent original type parameter. Then I decided to add support for`evalCast` to work in both modes. Practically, new evaCast has additional checks and casts when the original type is not null.

Seems reasonable to me.

Now evalCastFromNonLoc and evalCastFromLoc are used in SimpleSValBuilder functions only. I think it would be better to add an explaination to the doc like OriginalTy.isNull() is a CastRetrievedVal backdoor hack. Do not use it. instead of new flags. I'll make a better description, but at first I'd try to investigate how to pass an original type to evalCastFromNonLoc and evalCastFromLoc in SimpleSValBuilder functions. Or maybe some other solution.

What about not defaulting that parameter? That would better represent our expectation that it requires an original-type.
You could still pass a default constructed QualType at each callsite.

ASDenysPetrov edited the summary of this revision. (Show Details)
ASDenysPetrov added a comment.EditedMar 10 2021, 1:07 PM

You could still pass a default constructed QualType at each callsite.

We can try. At least it will be like this is a hack for the particular cases then, that will emphasis to pay attention to. Well, I'll present this version tomorrow.

NoQ added a comment.Mar 10 2021, 7:00 PM

Actually many cases don't need to know an exact original type.

In all cases we're supposed to have an original type, whether we need it or not. Simply because we're simulating a typed language. If we don't have it it's a bug.

Some cases can extract the type from SVal

Which is generally impossible because integral casts aren't modeled correctly. Which, again, is a bug. I don't know whether we'll have a need to pass the type separately when all other parts of the static analyzer become type-correct; I suspect that we won't need it. But one way or another, my point is, one or more of the following is true: "we have the ability to provide the type in all cases", "we don't need to provide the type in any of the cases". None of these ideal situations leave room for an API that receives the type optionally. For this reason I suggest to either always have the caller supply the original type (even if means extra work on the caller side) or never have it supply the original type.

ASDenysPetrov updated this revision to Diff 329982.EditedMar 11 2021, 9:03 AM

Updated.

In D96090#2618410, @NoQ wrote:

In all cases we're supposed to have an original type, whether we need it or not. Simply because we're simulating a typed language. If we don't have it it's a bug.

I agree. That's how evalCastFromNonLoc and evalCastFromLoc work now. Finally I want to replace them along with passing a correct original type. This is just one another step closer to this direction. I don't want patches be too complicated.

Some cases can extract the type from SVal

Which is generally impossible because integral casts aren't modeled correctly.

Maybe I'm missing smth, But how about this:

QualType T = V.castAs<nonloc::SymbolVal>().getSymbol()->getType();
QualType T = cast<SymbolicRegion>(V.castAs<loc::MemRegionVal>().getAsRegion()).getSymbol()->getType();
// Belowees are not exact types but well narrowed information which can be sufficient.
V.castAs<nonloc::ConcreteInt>() <-> OriginalTy->isIntegralOrEnumerationType() 
V.castAs<loc::ConcreteInt>() | V.castAs<loc::MemRegionVal>() | V.castAs<loc::GotoLabel>() <-> Loc::isLocType(CastTy)
V.castAs<nonloc::PointerToMemberData>() <-> OriginalTy->isMemberPointerType()  (I guess)
...

None of these ideal situations leave room for an API that receives the type optionally.

I've returned back OriginalTy param from a default to a regular one as @steakhal advised in D96090#2616666.

Just a ping for you, guys, to let me move in this direction.

steakhal accepted this revision.Mar 26 2021, 3:25 AM

I don't know. I think it's already way better than it was.
I think we can reiterate this later.

I want to get the llvm_unreachable("Unknown SVal kind") statements back. Besides those, I have no objections to this patch.
It preserves the old semantics, so it should be safe to land.
What do you think @NoQ?

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
565–566

You probably don't want to remove this. The same applies to similar ones.

This revision is now accepted and ready to land.Mar 26 2021, 3:25 AM

@steakhal thanks for the approval.

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
565–566

Yep. I'd keep it here, beacuse of compiler warnings of no-return function.

NoQ added a comment.Apr 8 2021, 12:46 AM

This looks amazing, thanks a lot.

I have one final question: can we already remove dispatchCast()&Co.? I don't see any other call sites anymore. Looks like it becomes dead code after your patch.

Which is fairly shocking and mind blowing that we've organically developed two independent implementations of casting, one for RegionStore and one for everything else. I'm super happy that we're cleaning this up.

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
565–566

compiler warnings of no-return function.

Aha ok, in this case it's much better to keep llvm_unreachable outside the switch, like here and unlike the one in evalCastKind functions. This way we're getting compile-time warnings for unhandled enum cases.

NoQ added a comment.Apr 8 2021, 10:53 AM

Which is fairly shocking and mind blowing that we've organically developed two independent implementations of casting, one for RegionStore and one for everything else.

Wait, no, nvm, please disregard this. It wasn't like this forever, i just happened to catch code in an intermediate state after D90157. Either way, it's definitely getting much better, and either way, i'm curious if dispatchCast can now be eliminated.

ASDenysPetrov added a comment.EditedApr 13 2021, 5:10 AM

@NoQ
Many thanks for your evaluation!

i'm curious if dispatchCast can now be eliminated.

Already approved a month ago D97277 :-) That's the next step.

Already approved a month ago D97277 :-) That's the next step.

In addition to the above:

clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
68

@NoQ
I've mentioned the need in elimination.