This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
70 ms | x64 debian > LLVM.Bindings/Go::go.test |
Event Timeline
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h | ||
---|---|---|
34 | I am just wondering if this is the right level of abstraction. Maybe users do think of this in terms of skipping references? | |
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp | ||
95 | What is the model for expressions that evaluate to different locations in every loop iterations, e.g.: for(int *p = array; p != array + size; ++p) *p; // The dereference expression Here, having *p always evaluate to the same location is not correct, we'd probably need a way to widen this. | |
clang/lib/Analysis/FlowSensitive/Transfer.cpp | ||
46 | I think you probably also want to set the correct location for the whole expression. Or is that handled somewhere else? |
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h | ||
---|---|---|
34 | Absolutely! It makes sense to me to consider this (and other) alternatives so I added a FIXME as a reminder. | |
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp | ||
95 | In the current model, the storage location for p is stable and the value that is assigned to it changes. So, in one iteration the value assigned to the storage location for p is val1 which is a PointerValue that points to loc1 and in another iteration the value assigned to the storage location for p is val2 which is a PointerValue that points to loc2. The storage location for *p is also stable and the value that is assigned to it is a ReferenceValue that points to either loc1 or loc2. I implemented this and added a test. | |
clang/lib/Analysis/FlowSensitive/Transfer.cpp | ||
46 | You're right. I updated the code and added a test. |
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h | ||
---|---|---|
36 | I'm not sure that value categories are right here. The key is that skipping is optional, unlike value categories which are fixed. I think the problem is just that C++ references are weird (for lack of a technical term). A reference variable exists simultaneously in two different categories -- a pointer and the value it points to. That's what this is trying to capture. So, I'd recommend just changing the name to reflect that. WDYT of something like this? /// Indicates how to treat references, if encountered -- as a reference or the value referred to -- when retrieving /// storage locations or values. enum class RefTreatment { /// Consider the reference itself. AsSelf, /// Consider the referent. AsReferent, }; | |
111 | Why have a SkipPast argument here? I understand the concept for considering var-decls - it corresponds to the weirdness of lvalue-refs themselves. But StorageLocation is a lower-level (and simpler) concept, to which this doesn't seem to map. I'm fine with resolving indirections, but what is the value of only doing so conditionally? | |
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp | ||
242–247 | ||
clang/lib/Analysis/FlowSensitive/Transfer.cpp | ||
37 | maybe comment as to why this is necessary? I'd have thought that CFG's flattening of expressions would handle this (why, in general, we don't need to look deep into expressions). | |
135 | nit: might be easier to read if you invert this: const Expr *InitExpr = D.getInit(); if (InitExpr == nullptr) return; | |
144–145 | nit: else-if (one line)? |
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h | ||
---|---|---|
36 | Yeah, value categories are indeed fixed, I was thinking along the lines of treating expressions as if there was an LValueToRValue or RValueToLValue conversion. I think some of this problem might come from the ambiguity. E.g., when I want to query the analysis state about the value of *p, what do I want? Do I want the value for *p as an lvalue or *p as an rvalue? I would get a different answer in both cases. If I see *p in the source code it is fixed but when I want to query the analysis state it is not. On the other hand I see that SkipPast is used in the transfer functions, and I am wondering if that is justified. According to the language rules, I can never observe the actual value of the reference. I can modify the pointee, (ref = val) or create a pointer to the same value (&ref), but I cannot actually read or modify what is inside the reference. So I wonder if it might be less error prone if there would be no way in the transfer functions to observe the values of the references either. But I do not have a strong feeling about any of this at this point so feel free to leave everything as it is.
I am not sure how clear Referent is. For a non-native speaker Pointee might be clearer, although I do see how it can be confusing. | |
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp | ||
95 | In this case, does it really make sense to materialize this mapping? If every subexpression corresponds to a unique location, and always the same location, we could use the subexpression directly as a location. |
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h | ||
---|---|---|
36 | Thanks both! I agree with the comments. I prefer to keep it as is for now. I updated the FIXME with a pointer to this discussion so that we remember to get back to it at some point. | |
111 | You're right. It's not necessary here. Removed it. | |
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp | ||
95 | True, but a StorageLocation can also be assigned to a ValueDecl, not only to Expr. We also keep some additional information in the StorageLocation sub-classes so I don't think we can replace it directly. | |
clang/lib/Analysis/FlowSensitive/Transfer.cpp | ||
37 | Added a comment. The CFG does not contain ParenExpr as top-level statements in basic blocks, however sub-expressions can still be of that type. |
I am just wondering if this is the right level of abstraction. Maybe users do think of this in terms of skipping references?
Or would they think in terms of value categories? E.g., having getLValue vs getRValue. I think we can leave this as is for now, I just like the idea of exploring alternatives.