Page MenuHomePhabricator

[analyzer] Wrong type cast occures during pointer dereferencing after type punning
Needs ReviewPublic

Authored by ASDenysPetrov on Thu, Oct 8, 9:48 AM.

Details

Summary

During pointer dereferencing CastRetrievedVal uses wrong type from the Store after type punning. Namely, the pointer casts to another type and then assigns with a value of one more another type. It produces NonLoc value when Loc is expected.

Example for visibility:

void foo(char ***c, int *i) {
  *(unsigned**)c = (unsigned*)i; // type punning
  ***c; // uses 'unsigned**' from the Store instead of 'char***'
}

Fixes: https://bugs.llvm.org/show_bug.cgi?id=37503

Diff Detail

Event Timeline

ASDenysPetrov created this revision.Thu, Oct 8, 9:48 AM
ASDenysPetrov requested review of this revision.Thu, Oct 8, 9:48 AM
NoQ added a comment.EditedThu, Oct 8, 3:36 PM

This actually looks correct to me but i suggest a bit more investigation. Architecturally, it doesn't make sense to me that CastRetrievedVal() does anything beyond svalBuilder.dispatchCast(V, castTy). After all, there's only one way to cast a value from one type to another, and dispatchCast() was supposed to be the ultimate method to do so.

Can we collapse this function further towards this goal? Say, why not pump every region unconditionally through castRegion()? Does dispatchCast() use castRegion() internally - and if it does, why are there so many branches in this function?

@NoQ

Can we collapse this function further towards this goal? Say, why not pump every region unconditionally through castRegion()? Does dispatchCast() use castRegion() internally - and if it does, why are there so many branches in this function?

Thank you for raising such a good questions :)
Honestly, I spent a week debugging to undersand what's going on in the Store, I never had a chance to dig into it before. It was tough. I'll proceed this to try to make what you mentioned.

The thing I'm sure that the type of retrieved value should be the same as the type of which this value was declared? In other words we can unconditionally cast all the values, can't we? Is there any case that contradicts me?

ASDenysPetrov added a comment.EditedFri, Oct 9, 8:37 AM

@NoQ
BTW, what do you think we should do with D77062 and D88477 then?

If you don't mind I would also request some credit in the summary since I've spent some time on this as well.
Besides that I don't have much objection about this patch. It's defenately not my expertiese.
Good job coming up with the fix though, I had to focus on other things.

clang/test/Analysis/string.c
367–373

I would prefer slightly more readbale names.
func_strcpy_no_assertion() -> get_void_ptr()
ptr_strcpy_no_assertion -> type_punned_ptr
no-assertion -> no-crash
You could also hoist the char a as a function parameter to spare a line :)

ASDenysPetrov updated this revision to Diff 297579.EditedMon, Oct 12, 7:17 AM

Updat patch due to suggestions and fixed formating.

Proceeding work on improving solution by moving the logic inside SimpleSValBuilder::dispatchCast.

Updated. Removed a new test file, moved the test to an existing file instead.

@NoQ

Can we collapse this function further towards this goal? Say, why not pump every region unconditionally through castRegion()? Does dispatchCast() use castRegion() internally - and if it does, why are there so many branches in this function?

I've made some investigations. castRegion() does not use inside dispatchCast(), it belongs to StoreManager. Moving it to SimpleSValBuilder will be hard at least for now, since it is quite complex, but we can try. What do you think. Should I go further or we can apply this patch as is?

ASDenysPetrov updated this revision to Diff 298388.EditedThu, Oct 15, 8:10 AM

Updated. Moved castRegion() from StoreManager to SValBuilder.

Actually it wasn't so hard as I thought :)
Would it be better to make this moving in a separate patch as [NFC]?

NoQ added a comment.Sat, Oct 17, 5:22 AM

Aha, great, sounds like all casting can be made more correct, not just casting on store retrieve! Maybe it's worth celebrating through extra unittests on evalCast(). Thank you for looking into this.

The new code should obviously be restricted into evalCastFromLoc() because if it's a region it's a Loc.

You didn't need to move castRegion(): StoreManager is available in SValBuilder through this->StateMgr. I don't have a strong preference on where castRegion() should live; the original idea was that StoreManager should be allowed to have an opinion on this matter because different store managers would handle that differently; however, as of now there's only one store manager and there doesn't seem to be an interest in introducing more of them whereas the abstraction has already leaked all over the place anyway.

I still have questions about the extra if-statements that you've added. Shouldn't we do castRegion() unconditionally, given that we're trying to cast a region and that function presumably does exactly that? I just want to avoid these situations:

In D89055#2320363, @NoQ wrote:

dispatchCast() was supposed to be the ultimate method to do so

Ugh, sorry, no, that's evalCast(). Like evalBinOp() etc. My bad. Can we also use evalCast()?

ASDenysPetrov added a comment.EditedSat, Oct 17, 4:20 PM

@NoQ

The new code should obviously be restricted into evalCastFromLoc() because if it's a region it's a Loc.

The first I tryed was evalCastFromLoc(), but it turned out that SVal which binds to a pointer can be NonLoc as well through violation of pointing levels.
Look here:

void foo(int** p) { // here is a two-level pointer
   *(int*)p = 42; // pretend as a one-level pointer, dereference it and assign a number
   *p; // dereferencing once gives a `nonloc::ConcreteInt`, though it actually should be a one-level pointer aka Loc.
}

P.S. I can have miscomprehension of this, due to a lack of experience, but this is what I observed.

Shouldn't we do castRegion() unconditionally,

I've also tryed and got 10+ tests unpassed. Didn't dig deeper, just refused this idea.

Can we also use evalCast() ?

Do you mean to move castRegion() to evalCast() instead of dispatchCast(). I can investigate this.
Could you explain please what is a major difference between evalCast() and dispatchCast()?