This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Treat unions as structs.
ClosedPublic

Authored by merrymeerkat on Dec 27 2022, 8:18 AM.

Details

Summary

This is a straightfoward way to handle unions in dataflow analysis. Without this change, nullability verification crashes on files that contain unions.

Diff Detail

Event Timeline

merrymeerkat created this revision.Dec 27 2022, 8:18 AM
Herald added a project: Restricted Project. · View Herald Transcript
merrymeerkat requested review of this revision.Dec 27 2022, 8:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 27 2022, 8:18 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
ymandel added inline comments.Dec 27 2022, 8:47 AM
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

Why push this off to another patch?

merrymeerkat added inline comments.Dec 27 2022, 10:11 AM
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

good point! I was pushing it because this is just a quick fix to avoid a crash, and the current changes are sufficient for that

ymandel added inline comments.Dec 27 2022, 10:57 AM
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

SGTM. Please use FIXME instead of TODO, though. But, glad to see this is enough to avoid the crashing!

That said, can you expand on where the actual crash was happening? I'm concerned that its possible that the crash site should be more robust and that this patch is hiding that weakness.

Change TODO to FIXME

merrymeerkat added inline comments.Dec 27 2022, 11:42 AM
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

Done!

The crash was happening because of a null pointer cast in the builtin transfer function for CFGInitializers: https://github.com/llvm/llvm-project/blob/781eabeb40b8e47e3a46b0b927784e63f0aad9ab/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp#L316

Hmm so do you think it'd be helpful to add a null check in the file above?

ymandel added inline comments.Dec 27 2022, 12:00 PM
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

Thanks, that's quite helpful! Yes, I think that would be a better fix. It's a matter of perspective (and opinion) so feel free to push back but I'd say that the code you pointed to is buggy -- it assumes the this loc is populated, but *by design* it's not populated for unions (I didn't know that unions could ever have this so #TIL?). I think we're admitting that we have a class of initializers for which we knowingly don't create the this pointer and therefore should express that in the code.

Alternatively, you could argue that the complete class of this pointer generating code is structs and unions, so if we just make sure (like your fix) to generate a this pointer in both cases, we can assert it's non-nullness (via the cast) and be done.

Given that the framework in general is riddled with places where we don't know if a value or storage location or... is initialized and we have to check, I think that's the better (and consistent) approach. Moreover, given that we're not actually adding support for unions, just trying to avoid a crash, I think changing that code site better expresses that intent.

Still, it's a matter of opinion, so I'll leave it to you to decide. WDYT?

merrymeerkat added inline comments.Dec 27 2022, 12:26 PM
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

Thanks for the comment!

What you say makes sense. I guess we introduced the union initialization because it could also be useful in the future, but I don't know if it makes sense to add a feature that doesn't have any semantic use yet.

If the framework does these kinds of null checks in lots of other places, then I agree that it'd be good to have it here too for consistency. I'm leaning towards making the change you suggested.

merrymeerkat added inline comments.Dec 28 2022, 4:20 AM
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

Hi @ymandel! Apologies for the late reply.

To clarify, were you suggesting that I remove the support for union and add a null check at the site of crash, or keep the support and add the null check? I had assumed the former, but now I re-read your comment and I am not sure.

I thought more about this and I think I'd actually prefer to add support for unions and skip the null check. The support could be useful for users. Now, as for why I think we should skip the null check: in l. 226 of Analysis/FlowSensitive/DataflowEnvironment.cpp, we check for methods that are not static. These are the prerequisites for something having a "this" pointer*. So, since we initialise the ThisPointeeLoc there, any intializer we're processing in TypeErasedDataflowAnalysis::builtinTransferInitializer should already have a not-null "this" pointer, making the check unnecessary. What do you think?

*lambdas could also have a this pointer, but I will leave them to be the subject of another revision (I've added a FIXME for that). And, in any case, lambdas wouldn't be passed to builtinTransferInitializer, so the old crash is avoided anyway.

ymandel accepted this revision.Dec 28 2022, 4:50 AM
ymandel added inline comments.
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

Sounds good! I had meant the former. But, I think you make a compelling case for adding the support and skipping the null check.

And, in any case, lambdas wouldn't be passed to builtinTransferInitializer, so the old crash is avoided anyway.

Why is this? I'm not sure of the structure of lambda's constructors offhand, so I don't have a sense one way or another.

This revision is now accepted and ready to land.Dec 28 2022, 4:50 AM
merrymeerkat added inline comments.Dec 28 2022, 5:12 AM
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1551–1552

Thanks for the review!

Why is this? I'm not sure of the structure of lambda's constructors offhand, so I don't have a sense one way or another. `

builtinTransferInitializer takes in a CFGInitializer, i.e., the base/member initializer in a constructor initialization list. For example, in

struct S {
    int t;
    S(int i):t(i) {}
};

t(i) corresponds to a CFGInitializer.

Since (afaik) lambdas don't have these constructors, the input to builtinTransferInitializer shouldn't come from a lambda.

Since (afaik) lambdas don't have these constructors, the input to builtinTransferInitializer shouldn't come from a lambda.

I thought they might because they need to store captured variables. But, godbolt doesn't show any custom constructors so seems they don't. I have no idea how they handle captured variables but that's hardly the only mystery about lambdas. :)

another thought: please verify that createStorageLoction and createValue can correctly handle union types. Otherwise, you'll likely end up with other (surprising) crashes in the system. E.g.
https://github.com/llvm/llvm-project/blob/781eabeb40b8e47e3a46b0b927784e63f0aad9ab/clang/lib/Analysis/FlowSensitive/Transfer.cpp#L457
assumes that the base of a member expression is an AggregateStorageLocation, which means that in the case that this is such a base and it's a union type, you need to be sure that createStorageLocation has allocated an AggregateStorageLocation.

Unfortunately, we don't have written down anywhere (aside from the code itself) the invariants we require.

gribozavr2 accepted this revision.Dec 29 2022, 6:53 AM
merrymeerkat updated this revision to Diff 485626.EditedDec 29 2022, 10:58 AM

Addressing review comment. Add createValue functionality for unions.

another thought: please verify that createStorageLoction and createValue can correctly handle union types. Otherwise, you'll likely end up with other (surprising) crashes in the system. E.g.
https://github.com/llvm/llvm-project/blob/781eabeb40b8e47e3a46b0b927784e63f0aad9ab/clang/lib/Analysis/FlowSensitive/Transfer.cpp#L457
assumes that the base of a member expression is an AggregateStorageLocation, which means that in the case that this is such a base and it's a union type, you need to be sure that createStorageLocation has allocated an AggregateStorageLocation.

Unfortunately, we don't have written down anywhere (aside from the code itself) the invariants we require.

Good point!

Based on the assertions in TransferTest.UnionThisMember, we are creating storage locations for unions. So, the code you linked to should not crash (at least on unions as simple as that in the test).

We weren't creating values yet though. Turns out that was because createValue calls createValueUnlessSelfReferential, which creates values only for certain types, returning a nullptr otherwise: https://github.com/llvm/llvm-project/blob/8eb3698b940c4064b772f3ff5848d45f28523753/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp#L618-L699 . I've just changed this function's if-statement on structs and classes to accept union types too, and now we are creating values for unions. So, values should hopefully not be a problem either.

There was already another unit test relating to unions (TransferTest.AssignToUnionMember), but it was only partly implemented. Now that we initialize locations and values for union types, I implemented the rest of the test. (thank you @gribozavr2 for the help debugging!)

gribozavr2 accepted this revision.Dec 29 2022, 11:38 AM

another thought: please verify that createStorageLoction and createValue can correctly handle union types. Otherwise, you'll likely end up with other (surprising) crashes in the system. E.g.
https://github.com/llvm/llvm-project/blob/781eabeb40b8e47e3a46b0b927784e63f0aad9ab/clang/lib/Analysis/FlowSensitive/Transfer.cpp#L457
assumes that the base of a member expression is an AggregateStorageLocation, which means that in the case that this is such a base and it's a union type, you need to be sure that createStorageLocation has allocated an AggregateStorageLocation.

Unfortunately, we don't have written down anywhere (aside from the code itself) the invariants we require.

Good point!

Based on the assertions in TransferTest.UnionThisMember, we are creating storage locations for unions. So, the code you linked to should not crash (at least on unions as simple as that in the test).

We weren't creating values yet though. Turns out that was because createValue calls createValueUnlessSelfReferential, which creates values only for certain types, returning a nullptr otherwise: https://github.com/llvm/llvm-project/blob/8eb3698b940c4064b772f3ff5848d45f28523753/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp#L618-L699 . I've just changed this function's if-statement on structs and classes to accept union types too, and now we are creating values for unions. So, values should hopefully not be a problem either.

There was already another unit test relating to unions (TransferTest.AssignToUnionMember), but it was only partly implemented. Now that we initialize locations and values for union types, I implemented the rest of the test. (thank you @gribozavr2 for the help debugging!)

Sorry to go back-and-forth here, but I'm afraid this might be overly ambitious. With this change, unions are modeled exactly as structs (as far as I can tell), which is unsound. I think we need to think carefully before doing this (it might be fine, but it's a non trivial step) and we'd need to test thoroughly. Moreover, this goes well beyond simply fixing the bug you set out to fix.

I realize that I'm the one who raised the concern with these functions, so I should explain: I had in mind just that you verify they work correctly with the fix (while retaining existing behavior of not modeling unions), rather than go the extra step of allowing unions altogether. I think that it would be enough to modify createStorageLocation as you have (so that unions are guaranteed to map from an AggregateStorageLocation), but then leave createValue as is. The result will be that all union-typed expressions will map to AggregateStorageLocation, so member expressions will work correctly, but the fields will always map to nullptr, so the modeling will be sound.

However, this all just begs the question of why go down this path to begin with. If we can't properly model union fields, does it make sense to model a union-typed this pointer? Going back to your original argument, I think I missed a crucial point (and apologies that I didn't catch this sooner!):

I thought more about this and I think I'd actually prefer to add support for unions and skip the null check. The support could be useful for users. Now, as for why I think we should skip the null check: in l. 226 of Analysis/FlowSensitive/DataflowEnvironment.cpp, we check for methods that are not static. These are the prerequisites for something having a "this" pointer*. So, since we initialise the ThisPointeeLoc there, any intializer we're processing in TypeErasedDataflowAnalysis::builtinTransferInitializer should already have a not-null "this" pointer, making the check unnecessary. What do you think?

You're making the case here for why we should avoid the null check, given that we support unions. And you're logic is sound! But, it jumps over the hard question: when this is a union, do we want to initialise it all? Or, like other expressions we can't model, should we leave this null and then update the downstream code to avoid assuming that that this is non-null? (Or, a twist on that: should we avoid analysing the method at all when the object type is a union, since we're severely limited in what we can do. That would give us both benefits: we can still assume that the ThisPointeeLoc is never null, and avoid the bug.) I think we need to dig into this question first, and then the answer about the fix follows. I would assert that to fix the bug you're encountering, you're fastest solution is to add the null check. Then, if you want to come back later and extend the framework to support unions, we can do that (and correspondingly remove the null check), but its a longer and more difficult process.

Again, sorry for the confusion...

With this change, unions are modeled exactly as structs (as far as I can tell), which is unsound.

Modeling just the storage of the union (but not the value) is sound. The first revision of the patch was a targeted fix that allowed us to continue maintaining the invariant that this is always modeled with a non-null StorageLocation.

About modeling values of unions as structs (and members of unions as struct members), I can think of some unsoundness but nothing that affects what we are modeling now (e.g., pointers to two struct members should compare unequal, but pointers to two union members should compare equal). If the code writes to one union member, but reads from the other, code has UB. So it generally shouldn't be a problem, for the purpose of modeling, to model more storage in the union that a program free of union-related UB should never use. WDYT?

With this change, unions are modeled exactly as structs (as far as I can tell), which is unsound.

Modeling just the storage of the union (but not the value) is sound. The first revision of the patch was a targeted fix that allowed us to continue maintaining the invariant that this is always modeled with a non-null StorageLocation.

About modeling values of unions as structs (and members of unions as struct members), I can think of some unsoundness but nothing that affects what we are modeling now (e.g., pointers to two struct members should compare unequal, but pointers to two union members should compare equal). If the code writes to one union member, but reads from the other, code has UB. So it generally shouldn't be a problem, for the purpose of modeling, to model more storage in the union that a program free of union-related UB should never use. WDYT?

Good point about UB! So, we can drop the soundness concern. That leaves the "this is a big change" concern. I'm ok with it, but I'd prefer it wait for a thorough test (running over a large variety of TUs). That said, you two can decide how best to proceed -- I don't want to block your progress.

Extend documentation comment on storage locations with a FIXME about unions.

@ymandel - thank you for pointing all of this out! And no need to apologise at all :)

I agree that the changes made here are not ideal. But the alternative is also not ideal.

The question becomes: is it better to model unions in a way that is lacking, or to not model them at all?

To model unions properly, we would need to be able to set the storage location of all the members of a union to be the same. But storage location is tied to type, so we can't create the same storage location for members of different types. From what I can tell, correctly modelling unions would require us to do an overhaul of how storage locations are implemented.

Like Dmitri said, I think most of the issues with this simple approach to modelling unions would only be revealed with UB code. So, unless we want to compare the storage locations of pointers to different union members (which I don't think will be a very common use), this modelling should work all right for defined behaviour. I added a documentation comment regarding the lacking implementation of storage location.

Coming back to the question above. Yitzhak, you seem to prefer not modelling union at all rather than having this model that falls short. Dmitri seems to prefer a model that works on most cases than no model. I am a bit torn, but I think I'm leaning a bit on the side of having a model that falls short. I don't think we plan on doing the storage location overhaul anytime soon, and until then it'd be good to at least have some functionality.

Happy new year by the way!!

@ymandel - thank you for pointing all of this out! And no need to apologise at all :)

I agree that the changes made here are not ideal. But the alternative is also not ideal.

The question becomes: is it better to model unions in a way that is lacking, or to not model them at all?

To model unions properly, we would need to be able to set the storage location of all the members of a union to be the same. But storage location is tied to type, so we can't create the same storage location for members of different types. From what I can tell, correctly modelling unions would require us to do an overhaul of how storage locations are implemented.

Like Dmitri said, I think most of the issues with this simple approach to modelling unions would only be revealed with UB code. So, unless we want to compare the storage locations of pointers to different union members (which I don't think will be a very common use), this modelling should work all right for defined behaviour. I added a documentation comment regarding the lacking implementation of storage location.

Coming back to the question above. Yitzhak, you seem to prefer not modelling union at all rather than having this model that falls short. Dmitri seems to prefer a model that works on most cases than no model. I am a bit torn, but I think I'm leaning a bit on the side of having a model that falls short. I don't think we plan on doing the storage location overhaul anytime soon, and until then it'd be good to at least have some functionality.

Happy new year by the way!!

Happy new year! I should clarify: Dmitri convinced me that this form of modeling is ok for unions. My remaining concern is testing: that we've seen many things that seem ok on paper and then results in difficult-to-debug crashes when run over a large corpus. Some of those could come from the implementation itself (bugs or we reasoned incorrectly about how it works or there is UB in the code that breaks our assumptions, etc.); some of it can come about simply because we've now enabled analysis of a larger range of code paths. So, the prudent approach is to test a change like this thoroughly before committing it. In contrast, the change you need just to avoid the crash is quite small and limited in its scope.

That said, we don't know of any large users depending on this analysis, and we don't (yet) have any automated setup for running over a large corpus, so I'm ok going ahead to unblock you.

This revision was landed with ongoing or failed builds.Jan 3 2023, 10:36 AM
This revision was automatically updated to reflect the committed changes.

Thanks for clarifying! I've gone ahead and landed the change. At least this should reduce the number of false negatives we get (hopefully without introducing too much complexity).