This patch introduces a fiaxable for pointer initializations of the type T *p = q so that p and q are added as nodes to the pointer assignment graph introduced in https://reviews.llvm.org/D145739
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
566 | It seems that PointerInitTag never gets to bind a node? It is used in the constructor to initialize the member PI, which is also never used. I suggest we either remove the member or bind the node with the tag. | |
clang/test/SemaCXX/warn-unsafe-buffer-usage-multi-decl-ptr-init.cpp | ||
28 | Did we miss to add expected-note for change type of 'p' ... or we cannot fix p, q or r or some reason (curious)? |
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
2015 | Sorry, this version of the patch was never meant to be reviewed. I uploaded purely to make sure I don't lose those changes. |
clang/test/SemaCXX/warn-unsafe-buffer-usage-multi-decl-ptr-init.cpp | ||
---|---|---|
28 | That's a great observation! You are right that the fixit for p implies a fixit for q due to the assignment statement int *p = q;. However, q is part of another fixable (int *r = q) that doesn't have a fixit. Since p and q need to be fixed either both together or none at all, we do not emit fixits for either (the method impossibleToFixForVar has the check that decides whether a fixit is impossible if any variable in the group doesn't have a fixit.) More generally, when a variable doesn't have a fixit, it either means that the operation is safe or that the operation isn't handled by our machine. Currently, we over-approximate and do not emit fixits. In this particular test case, we don't know how r will be fixed and therefore can't tell how it affects the fixit for q |
This is interesting, your new gadget is a single-variable gadget for the purposes of grouping and DRE claiming, but a multi-variable gadget for the purposes of strategy implications.
In particular it's never put into the bucket of fixables for the LHS variable.
This sounds scary but seems to be fine in this case. Indeed,
- If the LHS variable is safe (i.e. the strategy is wontfix even after variable grouping), we're not emitting any fixes for it so it doesn't matter what the bucket contains, so everything works well.
- In the future we may need to implement a .data() fix for the situation where the LHS is safe but RHS isn't. But we aren't actually fixing the LHS in this case, just the RHS, so buckets are still irrelevant.
- If the LHS variable is unsafe, the RHS variable is brought in through strategy implications, so we choose a span strategy for both variables and run impossibleToFixForVar() on each of them to confirm that the RHS can be fixed, so this scenario works well too.
- In particular, in the hypothetical situation that both strategies are span but the new gadget refuses to produce the fix (currently it always produces a trivial fix), fixits for the entire group will be correctly abandoned because the RHS is guaranteed to be part of the group.
So I think I'm ok with keeping such clever gadgets legal. Namely, it's ok for a fixable to not declare relation to a variable that participates in a strategy implication, as long as it's the source of the strategy implication. We should probably add an assert to verify that the gadget actually behaves this way - maybe put in the code that iterates over all strategy implications?
Another side effect of such implementation is that when the new gadget isn't found on a variable declaration statement, the declaration of the variable does not constitute an unclaimed use of that variable. The machine still believes that it fully understands the variable. This may or may not be a good thing; ultimately we may need a safeguard against declarations that *may* constitute a strategy implication (but aren't covered by any initialization fixables), so that to avoid suggesting low-quality fixits in such cases. But this isn't a regression; we already suggest such low-quality fixits and the patch only makes it better (https://godbolt.org/z/Mcn3Por5d). So we may want to implement that separately, but this doesn't necessarily need to follow the usual claiming process.
clang/test/SemaCXX/warn-unsafe-buffer-usage-multi-decl-ptr-init.cpp | ||
---|---|---|
28 | Maybe let's add a // FIXME to the test then? |
Other than me trying to understand the architectural implications here, I think the patch looks great and almost good to go!
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
564 | Am I reading this right that isInUnspecifiedUntypedContext() is here to avoid situations like void foo(int *p) P // Changing `q` to `span` is quite problematic! if (int *q = p) { // ... } } ? Don't we already avoid that when we decide on fixability of variables themselves (https://godbolt.org/z/vGK7Ezo8j)? I suspect it may be unnecessary. | |
564 | (typo, P => {) |
We've had an offline discussion about this and I think the right way to do that architecturally is to separate concerns. The new gadget doesn't really participate in fixit generation all that much, it's only there for the strategy implications. We struggle to implement getClaimedVarUseSites() in a way that makes sense, because it doesn't make sense to implement it in the first place. This isn't really a FixableGadget, this is something else, like we can call it StrategyGadget and make getStrategyImplications() a pure virtual method of StrategyGadget while getClaimedVarUseSites() remains in FixableGadget. Then for this gadget we can leave fixit generation to the code that fixes variable declarations.
Same with the assignment gadget in the previous patch; we can split it into a StrategyGadget that returns implication but doesn't claim variables, and a FixableGadget that fixes variable uses depending on the chosen strategy without participating in deciding the strategy. In this case such separation has a valuable effect of removing the UUC check from the StrategyGadget's matcher, so that strategy implications were always visible regardless of our ability to enumerate all possible contexts in which it becomes fixable.
It also plays nicely with the idea that FixableGadget should have always been a very thin abstraction layer on top of RewriteRule. At first we were like "But muh strategy implications!" but now it's quite clear that we shouldn't really be conflating them together.
We should address this separately though. It'll keep biting us, so we *should* address it soon-ish, but the current patch works well and I that shouldn't stop us from having progress, so LGTM!
clang/lib/Analysis/UnsafeBufferUsage.cpp | ||
---|---|---|
564 | I think they're both statement matchers, no need for an extra layer. |
Am I reading this right that isInUnspecifiedUntypedContext() is here to avoid situations like
?
Don't we already avoid that when we decide on fixability of variables themselves (https://godbolt.org/z/vGK7Ezo8j)? I suspect it may be unnecessary.