This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Singleton pointer values for null pointers.
ClosedPublic

Authored by wyt on Jun 17 2022, 7:10 AM.

Details

Summary

When a nullptr is assigned to a pointer variable, it is wrapped in a ImplicitCastExpr with cast kind CK_NullTo(Member)Pointer. This patch assigns singleton pointer values representing null to these expressions.

For each pointee type, a singleton null PointerValue is created and stored in the NullPointerVals map of the DataflowAnalysisContext class. The pointee type is retrieved from the implicit cast expression, and used to initialise the PointeeLoc field of the PointerValue. The PointeeLoc created is not mapped to any Value, reflecting the absence of value indicated by null pointers.

Diff Detail

Event Timeline

wyt created this revision.Jun 17 2022, 7:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2022, 7:10 AM
wyt requested review of this revision.Jun 17 2022, 7:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2022, 7:10 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I am wondering about the future plans regarding how pointers are represented.
What will be the expected behavior when the analysis discovers that the pointer has a null value? E.g.:

if (p == nullptr)
{
  ....
}

Would we expect p in this case to have the same singleton value in the then block of the if statement?

clang/include/clang/Analysis/FlowSensitive/DataflowAnalysisContext.h
163

Since you always want this function to create a null pointer value, I think it would be less error prone to ask for the location instead of an arbitrary value. Currently, a confused caller could put a non-null value into a table.

166

I think getAsString is considered expensive. Could you use QualType directly as the key?

wyt updated this revision to Diff 439078.Jun 22 2022, 10:11 AM

Use QualType as key to singleton map, implement getOrCreate factory function for retrieving null pointer values.

wyt marked an inline comment as done.Jun 22 2022, 11:06 AM

@xazax.hun

Since you always want this function to create a null pointer value, I think it would be less error prone to ask for the location instead of an arbitrary value. Currently, a confused caller could put a non-null value into a table.

Replaced with a getOrCreate factory to encapsulate pointer value creation.

I think getAsString is considered expensive. Could you use QualType directly as the key?

Done.

I am wondering about the future plans regarding how pointers are represented.
What will be the expected behavior when the analysis discovers that the pointer has a null value? E.g.:

if (p == nullptr)
{
  ....
}

Would we expect p in this case to have the same singleton value in the then block of the if statement?

This is an interesting idea.
IIUC, for non-boolean values, the core framework currently does not dissect the equality expression - p == nullptr would be represented by a symbolic boolean, but p and nullptr would not be entangled together.
However, we are working on a pointer nullability analysis on top of the framework that should add information to interrelate two pointer values and their nullability information when we do a comparison between pointers.

clang/include/clang/Analysis/FlowSensitive/DataflowAnalysisContext.h
163

Since you always want this function to create a null pointer value, I think it would be less error prone to ask for the location instead of an arbitrary value. Currently, a confused caller could put a non-null value into a table.

166

I think getAsString is considered expensive. Could you use QualType directly as the key?

xazax.hun accepted this revision.Jun 22 2022, 4:38 PM
xazax.hun added inline comments.
clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
65

Should we have a PointerValue without PointeeLoc to represent null pointers?

This revision is now accepted and ready to land.Jun 22 2022, 4:38 PM
sgatev accepted this revision.Jun 23 2022, 12:05 AM
sgatev added inline comments.
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysisContext.h
156–158
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
318–320
321

Let's move this below getThisPointeeStorageLocation so that it's closer to related members. Same for the cpp file.

wyt updated this revision to Diff 439721.Jun 24 2022, 6:00 AM
wyt marked 5 inline comments as done.

Address comments.

wyt added inline comments.Jun 24 2022, 6:05 AM
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysisContext.h
311–314

@xazax.hun

Should we have a PointerValue without PointeeLoc to represent null pointers?

PointeeLoc is a reference field in the current PointerValue so it is required for now.
Creating an independent NullPointerValue class without PointeeLoc is something we are considering, but in this patch we will be using PointerValues with a PointeeLoc corresponding to the pointee type to represent null pointers.

gribozavr2 accepted this revision.Jun 27 2022, 3:48 AM
gribozavr2 added inline comments.
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysisContext.h
308–309
clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
59
wyt updated this revision to Diff 440162.Jun 27 2022, 4:11 AM
wyt marked 2 inline comments as done.

Address comments.

gribozavr2 accepted this revision.Jun 27 2022, 5:13 AM
This revision was automatically updated to reflect the committed changes.