This is an archive of the discontinued LLVM Phabricator instance.

[-Wunsafe-buffer-usage][NFC] Refactor `getFixIts`---where fix-its are generated
ClosedPublic

Authored by ziqingluo-90 on Jul 31 2023, 4:22 PM.

Details

Diff Detail

Event Timeline

ziqingluo-90 created this revision.Jul 31 2023, 4:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 31 2023, 4:22 PM
ziqingluo-90 requested review of this revision.Jul 31 2023, 4:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 31 2023, 4:22 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
NoQ added a comment.Aug 1 2023, 4:27 PM

I really like this!

clang/lib/Analysis/UnsafeBufferUsage.cpp
2248
2313
2362

Hmm, do we suffer from a crash similar to D150386 in presence of blocks instead of lambdas?

I vaguely remember that there were subtle differences, but I also think our approach should probably be the same, so this fixme is probably correct.

2597–2598

A bit more idiomatic this way.

2598–2601

Ultimately we should accept ObjCMethodDecl here, and we should also accept BlockDecls that live in global scope (where they can't be checked together with the surrounding function because there's no surrounding function).

Of course parameter fixits are going to be quite different in these cases, but it shouldn't stop us from trying. Even today, even though we don't know how to fix ObjC method parameters, we can already encounter cases where fixing local variables is sufficient.

In other words, I think const FunctionDecl * is the right parameter type for eg. createFunctionOverloadsForParms() (like I suggested before), but probably not for the entire getFixits(),

NoQ added inline comments.Aug 1 2023, 5:28 PM
clang/lib/Analysis/UnsafeBufferUsage.cpp
2289

Architecturally speaking, I think I just realized something confusing about our code.

We already have variable groups well-defined at the Strategy phase, i.e. before we call getFixIts(), but then getFixIts() continues to reason about all variables collectively and indiscriminately. It continues to use entities such as the FixItsForVariable map which contain fixits for variables from *all* groups, not just the ones that are currently relevant. Then it re-introduces per-group data structures such as ParmsNeedFixMask on an ad-hoc basis, and it tries to compute them this way using the global, indiscriminate data structures.

I'm starting to suspect that the code would start making a lot more sense if we invoke getFixIts() separately for each variable group. So that each such invocation produced a single collective fixit for the group, or failed doing so.

This way we might be able to avoid sending steganographic messages through FixItsForVariable, but instead say directly "these are the variables that we're currently focusing on". It is the responsibility of the Strategy class to answer "should this variable be fixed?"; we shouldn't direct that question to any other data structures.

And if a group fails at any point, just early-return None and proceed directly to the next getFixIts() invocation for the next group. We don't need to separately record which individual variables have failed. In particular, eraseVarsForUnfixableGroupMates() would become a simple early return.

It probably also makes sense to store groups themselves inside the Strategy class. After all, fixing variables together is a form of strategy.

NoQ added inline comments.Aug 1 2023, 5:43 PM
clang/lib/Analysis/UnsafeBufferUsage.cpp
2289

(I don't think this needs to be addressed in the current patch, but this could help us untangle the code in general.)

t-rasmud added inline comments.Aug 3 2023, 12:53 PM
clang/lib/Analysis/UnsafeBufferUsage.cpp
2215–2234

// Erases variables in FixItsForVariable, if such a variable has an unfixable

ziqingluo-90 added inline comments.Aug 3 2023, 2:43 PM
clang/lib/Analysis/UnsafeBufferUsage.cpp
2289

This makes absolute sense! Each group is independent for fix-it generation. Moreover, when we support more strategy kinds, the constraint solving for a proper strategy will also be group-based.

ziqingluo-90 marked 6 inline comments as done.

Address comments.

NoQ accepted this revision.Aug 8 2023, 5:08 PM

I see you've started to address the big comment in D157441, so, LGTM, thanks a lot for splitting stuff up!

I have one tiny stylistic nitpick.

clang/lib/Analysis/UnsafeBufferUsage.cpp
2228–2229

(also indentation does a lot of heavy lifting here, maybe let's stash the any_of into a variable?)
(also we have using namespace llvm at the beginning of the file, so llvm:: might be redundant)

2289

the constraint solving for a proper strategy will also be group-based.

Hmm, my head-nomenclature (like head-canon but nomenclature) says that grouping is a sub-task of solving for strategy. I.e., we take "strategy constraints" of the form

'p' is any safe type => 'q' is any safe type

and solve these constraints by crawling through the implication graph.

But all the solving subtasks below grouping should probably indeed be separate!

This revision is now accepted and ready to land.Aug 8 2023, 5:08 PM
This revision was landed with ongoing or failed builds.Aug 18 2023, 5:44 PM
This revision was automatically updated to reflect the committed changes.