This is an archive of the discontinued LLVM Phabricator instance.

clang/lib/Analysis/UnsafeBufferUsage.cpp
2248
2313
2362	Hmm, do we suffer from a crash similar to D150386 in presence of blocks instead of lambdas? I vaguely remember that there were subtle differences, but I also think our approach should probably be the same, so this fixme is probably correct.
2597–2598	A bit more idiomatic this way.
2598–2601	Ultimately we should accept `ObjCMethodDecl` here, and we should also accept `BlockDecl`s that live in global scope (where they can't be checked together with the surrounding function because there's no surrounding function). Of course, parameter fixits are going to be quite different in these cases, but that shouldn't stop us from trying. Even today, although we don't know how to fix ObjC method parameters, we can already encounter cases where fixing local variables is sufficient. In other words, I think `const FunctionDecl *` is the right parameter type for e.g. `createFunctionOverloadsForParms()` (like I suggested before), but probably not for the entire `getFixIts()`.
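The two comments above (test-and-cast in one step, and keeping the narrow `FunctionDecl` type only where parameters are involved) can be illustrated with a small standalone sketch. Everything here is an assumption for illustration: the `Decl`, `FunctionDecl` and `BlockDecl` structs are toy stand-ins rather than clang's classes, `fixParams` and `collectFixes` are hypothetical names, and `dynamic_cast` plays the role that `dyn_cast` plays in LLVM code.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for clang's AST classes; these are NOT the real clang types.
struct Decl {
  virtual ~Decl() = default;
};
struct FunctionDecl : Decl {
  std::vector<std::string> Params;
};
struct BlockDecl : Decl {};

// Hypothetical helper that only makes sense for functions: it needs the
// parameter list, so it takes the narrow type.
std::vector<std::string> fixParams(const FunctionDecl &FD) {
  std::vector<std::string> Fixes;
  for (const std::string &P : FD.Params)
    Fixes.push_back("span<int> " + P);
  return Fixes;
}

// Hypothetical driver: accepts any Decl, always attempts the local-variable
// fixes, and narrows to FunctionDecl only for the parameter-specific part.
std::vector<std::string> collectFixes(const Decl &D,
                                      const std::vector<std::string> &Locals) {
  std::vector<std::string> Fixes;
  for (const std::string &L : Locals)
    Fixes.push_back("span<int> " + L);
  // Test and cast in one step, instead of a type test followed by a cast.
  if (const auto *FD = dynamic_cast<const FunctionDecl *>(&D)) {
    std::vector<std::string> ParamFixes = fixParams(*FD);
    Fixes.insert(Fixes.end(), ParamFixes.begin(), ParamFixes.end());
  }
  return Fixes;
}
```

With this shape, a non-function declaration still gets its local variables fixed; it merely skips the parameter step.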

NoQ added inline comments. Aug 1 2023, 5:28 PM

clang/lib/Analysis/UnsafeBufferUsage.cpp
2289	Architecturally speaking, I think I just realized something confusing about our code. We already have variable groups well-defined at the Strategy phase, i.e. before we call `getFixIts()`, but then `getFixIts()` continues to reason about all variables collectively and indiscriminately. It continues to use entities such as the `FixItsForVariable` map which contain fixits for variables from all groups, not just the ones that are currently relevant. Then it re-introduces per-group data structures such as `ParmsNeedFixMask` on an ad-hoc basis, and it tries to compute them this way using the global, indiscriminate data structures. I'm starting to suspect that the code would start making a lot more sense if we invoke `getFixIts()` separately for each variable group. So that each such invocation produced a single collective fixit for the group, or failed doing so. This way we might be able to avoid sending steganographic messages through `FixItsForVariable`, but instead say directly "these are the variables that we're currently focusing on". It is the responsibility of the `Strategy` class to answer "should this variable be fixed?"; we shouldn't direct that question to any other data structures. And if a group fails at any point, just early-return `None` and proceed directly to the next getFixIts() invocation for the next group. We don't need to separately record which individual variables have failed. In particular, `eraseVarsForUnfixableGroupMates()` would become a simple early return. It probably also makes sense to store groups themselves inside the `Strategy` class. After all, fixing variables together is a form of strategy.

NoQ added inline comments. Aug 1 2023, 5:43 PM

clang/lib/Analysis/UnsafeBufferUsage.cpp
2289	(I don't think this needs to be addressed in the current patch, but this could help us untangle the code in general.)
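A minimal sketch of the per-group shape proposed above, assuming toy types: `Var` and `FixItList` are plain-string stand-ins for `const VarDecl *` and clang's `FixItList`, and `fixOneVar`, `getFixItsForGroup` and `getFixItsForAllGroups` are hypothetical names, not functions from the patch. The point it tries to show is that a group either yields one collective fix or fails as a whole through an early return.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

using Var = std::string;                    // stand-in for `const VarDecl *`
using FixItList = std::vector<std::string>; // stand-in for clang's FixItList
using VarGroup = std::vector<Var>;

// Hypothetical per-variable step: returns std::nullopt when the variable
// cannot be fixed. Here, unfixable variables are simply absent from `Known`.
std::optional<FixItList> fixOneVar(const Var &V,
                                   const std::map<Var, FixItList> &Known) {
  auto It = Known.find(V);
  if (It == Known.end())
    return std::nullopt;
  return It->second;
}

// One invocation per group: either a single collective fix for the whole
// group, or std::nullopt as soon as any member fails.
std::optional<FixItList>
getFixItsForGroup(const VarGroup &Group,
                  const std::map<Var, FixItList> &Known) {
  FixItList Collective;
  for (const Var &V : Group) {
    std::optional<FixItList> Fixes = fixOneVar(V, Known);
    if (!Fixes)
      return std::nullopt; // early return replaces group-mate bookkeeping
    Collective.insert(Collective.end(), Fixes->begin(), Fixes->end());
  }
  return Collective;
}

// Driver: groups are independent, so a failed group is just skipped.
std::vector<FixItList>
getFixItsForAllGroups(const std::vector<VarGroup> &Groups,
                      const std::map<Var, FixItList> &Known) {
  std::vector<FixItList> Result;
  for (const VarGroup &G : Groups)
    if (std::optional<FixItList> Fix = getFixItsForGroup(G, Known))
      Result.push_back(*Fix);
  return Result;
}
```

In this arrangement no map has to record which individual variables failed; the failure of a group is visible only as a missing result.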

t-rasmud added inline comments. Aug 3 2023, 12:53 PM

clang/lib/Analysis/UnsafeBufferUsage.cpp
2215–2234	// Erases variables in `FixItsForVariable`, if such a variable has an unfixable

ziqingluo-90 added inline comments. Aug 3 2023, 2:43 PM

clang/lib/Analysis/UnsafeBufferUsage.cpp
2289	This makes absolute sense! Each group is independent for fix-it generation. Moreover, when we support more strategy kinds, the constraint solving for a proper strategy will also be group-based.

Address comments.

Harbormaster completed remote builds in B250469: Diff 547405. Aug 4 2023, 4:49 PM

I see you've started to address the big comment in D157441, so, LGTM, thanks a lot for splitting stuff up!

I have one tiny stylistic nitpick.

clang/lib/Analysis/UnsafeBufferUsage.cpp
2228–2229	(also indentation does a lot of heavy lifting here, maybe let's stash the `any_of` into a variable?) (also we have `using namespace llvm` at the beginning of the file, so `llvm::` might be redundant)
2289	"the constraint solving for a proper strategy will also be group-based." Hmm, my head-nomenclature (like head-canon but nomenclature) says that grouping is a sub-task of solving for strategy. I.e., we take "strategy constraints" of the form `'p' is any safe type => 'q' is any safe type` and solve these constraints by crawling through the implication graph. But all the solving subtasks below grouping should probably indeed be separate!
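As a rough illustration of "crawling through the implication graph", the sketch below treats each constraint `'p' is any safe type => 'q' is any safe type` as a directed edge and collects everything reachable from a starting variable. The names `Var`, `ImplicationGraph` and `solveFrom` are assumptions made for this example; the analysis itself keeps its grouping data in different structures.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using Var = std::string; // stand-in for `const VarDecl *`

// Edge P -> Q encodes "'P' is any safe type => 'Q' is any safe type".
using ImplicationGraph = std::map<Var, std::vector<Var>>;

// Crawls the implication graph from `Start` and returns every variable that
// must also get a safe type once `Start` does (including `Start` itself).
std::set<Var> solveFrom(const Var &Start, const ImplicationGraph &G) {
  std::set<Var> Visited;
  std::vector<Var> Worklist;
  Worklist.push_back(Start);
  while (!Worklist.empty()) {
    Var Cur = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(Cur).second)
      continue; // already handled
    auto It = G.find(Cur);
    if (It == G.end())
      continue; // no outgoing implications
    for (const Var &Next : It->second)
      Worklist.push_back(Next);
  }
  return Visited;
}
```

The returned set is what would have to be fixed together with the starting variable under these constraints, which is one way to read "grouping is a sub-task of solving for strategy".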

This revision is now accepted and ready to land. Aug 8 2023, 5:08 PM

This revision was landed with ongoing or failed builds. Aug 18 2023, 5:44 PM

Closed by commit rGacc8a33b257f: [-Wunsafe-buffer-usage][NFC] Refactor `getFixIts`---where fix-its are generated (authored by ziqingluo-90).

This revision was automatically updated to reflect the committed changes.

ziqingluo-90 added a commit: rGacc8a33b257f: [-Wunsafe-buffer-usage][NFC] Refactor `getFixIts`---where fix-its are generated.

Revision Contents

Path: clang/lib/Analysis/UnsafeBufferUsage.cpp
Size: 120 lines

Diff 551684

clang/lib/Analysis/UnsafeBufferUsage.cpp

(2,206 unchanged lines hidden. Context: return llvm::any_of(FixIts, [](const FixItHint &Hint) {)

      auto Range = Hint.RemoveRange;
      if (Range.getBegin().isMacroID() || Range.getEnd().isMacroID())
        // If the range (or the first token) is (part of) a macro expansion:
        return true;
      return false;
    });
  }
- static bool impossibleToFixForVar(const FixableGadgetSets &FixablesForAllVars,
-                                   const Strategy &S,
-                                   const VarDecl * Var) {
-   for (const auto &F : FixablesForAllVars.byVar.find(Var)->second) {
-     std::optional<FixItList> Fixits = F->getFixits(S);
-     if (!Fixits) {
-       return true;
+ // Erases variables in `FixItsForVariable`, if such a variable has an unfixable
+ // group mate. A variable `v` is unfixable iff `FixItsForVariable` does not
+ // contain `v`.
+ static void eraseVarsForUnfixableGroupMates(
+     std::map<const VarDecl *, FixItList> &FixItsForVariable,
+     const VariableGroupsManager &VarGrpMgr) {
+   // Variables will be removed from `FixItsForVariable`:
+   SmallVector<const VarDecl *, 8> ToErase;
+   for (auto [VD, Ignore] : FixItsForVariable) {
+     VarGrpRef Grp = VarGrpMgr.getGroupOfVar(VD);
+     if (llvm::any_of(Grp,
+                      [&FixItsForVariable](const VarDecl *GrpMember) -> bool {
+                        return FixItsForVariable.find(GrpMember) ==
+                               FixItsForVariable.end();

NoQ (Unsubmitted, Not Done):

  [&FixItsForVariable](const VarDecl *GrpMember) -> bool {
-   return FixItsForVariable.find(GrpMember) ==
-          FixItsForVariable.end();
+   return FixItsForVariable.count(GrpMember) == 0;
  })) {

(also indentation does a lot of heavy lifting here, maybe let's stash the `any_of` into a variable?)
(also we have `using namespace llvm` at the beginning of the file, so `llvm::` might be redundant)

+                      })) {
+       // At least one group member cannot be fixed, so we have to erase the
+       // whole group:
+       for (const VarDecl *Member : Grp)
+         ToErase.push_back(Member);

t-rasmud (Unsubmitted, Done):

// Erases variables in `FixItsForVariable`, if such a variable has an unfixable

      }
    }
-   return false;
+   for (auto *VarToErase : ToErase)
+     FixItsForVariable.erase(VarToErase);
  }
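The helper that ends here, combined with the reviewer's two suggestions (use `count()` and give the `any_of` result a name), can be sketched outside of clang as follows. This is only an approximation under stated assumptions: `Var`, `FixItList` and `VarGroup` are string-based stand-ins, `GroupOf` is a hypothetical replacement for `VarGrpMgr.getGroupOfVar`, and `std::any_of` is used in place of `llvm::any_of`.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

using Var = std::string;                    // stand-in for `const VarDecl *`
using FixItList = std::vector<std::string>; // stand-in for clang's FixItList
using VarGroup = std::vector<Var>;

// Removes from `FixItsForVariable` every variable whose group has at least
// one member without an entry (that is, an unfixable group mate).
// `GroupOf` is a hypothetical stand-in for `VarGrpMgr.getGroupOfVar`; it is
// assumed to contain an entry for every key of `FixItsForVariable`.
void eraseVarsForUnfixableGroupMates(
    std::map<Var, FixItList> &FixItsForVariable,
    const std::map<Var, VarGroup> &GroupOf) {
  std::vector<Var> ToErase;
  for (const auto &[V, Ignore] : FixItsForVariable) {
    const VarGroup &Grp = GroupOf.at(V);
    // Named result instead of a lambda nested inside the `if` condition.
    const bool HasUnfixableMate =
        std::any_of(Grp.begin(), Grp.end(), [&](const Var &Member) {
          return FixItsForVariable.count(Member) == 0;
        });
    if (HasUnfixableMate)
      ToErase.insert(ToErase.end(), Grp.begin(), Grp.end());
  }
  // Erase after the loop so the map is not mutated while being iterated.
  for (const Var &V : ToErase)
    FixItsForVariable.erase(V);
}
```

Collecting the victims first and erasing afterwards mirrors the `ToErase` vector in the patch and keeps the map iterators valid during the scan.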

  static std::map<const VarDecl *, FixItList>
  getFixIts(FixableGadgetSets &FixablesForAllVars, const Strategy &S,
            ASTContext &Ctx,
            /* The function decl under analysis */ const Decl *D,
            const DeclUseTracker &Tracker, UnsafeBufferUsageHandler &Handler,
            const VariableGroupsManager &VarGrpMgr) {
+   // `FixItsForVariable` will map each variable to a set of fix-its directly
+   // associated to the variable itself. Fix-its of distinct variables in

NoQ (Unsubmitted, Done):

  // `FixItsForVariable` will map each variable to a set of fix-its directly
- // associated to the variable itself. Fix-its of distict variables in
+ // associated to the variable itself. Fix-its of distinct variables in
  // `FixItsForVariable` are disjoint.

+   // `FixItsForVariable` are disjoint.
    std::map<const VarDecl *, FixItList> FixItsForVariable;
+   // Populate `FixItsForVariable` with fix-its directly associated with each
+   // variable. Fix-its directly associated to a variable 'v' are the ones
+   // produced by the `FixableGadget`s whose claimed variable is 'v'.
    for (const auto &[VD, Fixables] : FixablesForAllVars.byVar) {
      FixItsForVariable[VD] =
          fixVariable(VD, S.lookup(VD), D, Tracker, Ctx, Handler);
      // If we fail to produce Fix-It for the declaration we have to skip the
      // variable entirely.
      if (FixItsForVariable[VD].empty()) {
        FixItsForVariable.erase(VD);
        continue;
      }
-     bool ImpossibleToFix = false;
-     llvm::SmallVector<FixItHint, 16> FixItsForVD;
      for (const auto &F : Fixables) {
        std::optional<FixItList> Fixits = F->getFixits(S);
-       if (!Fixits) {
+       if (Fixits) {
+         FixItsForVariable[VD].insert(FixItsForVariable[VD].end(),
+                                      Fixits->begin(), Fixits->end());
+         continue;
+       }
  #ifndef NDEBUG
        Handler.addDebugNoteForVar(
            VD, F->getBaseStmt()->getBeginLoc(),
            ("gadget '" + F->getDebugName() + "' refused to produce a fix")
                .str());
  #endif
-         ImpossibleToFix = true;
-         break;
-       } else {
-         const FixItList CorrectFixes = Fixits.value();
-         FixItsForVD.insert(FixItsForVD.end(), CorrectFixes.begin(),
-                            CorrectFixes.end());
-       }
-     }
-     if (ImpossibleToFix) {
        FixItsForVariable.erase(VD);
-       continue;
-     }
-     {
-       const auto VarGroupForVD = VarGrpMgr.getGroupOfVar(VD);
-       for (const VarDecl * V : VarGroupForVD) {
-         if (V == VD) {
-           continue;
-         }
-         if (impossibleToFixForVar(FixablesForAllVars, S, V)) {
-           ImpossibleToFix = true;
        break;
      }
    }
-       if (ImpossibleToFix) {
-         FixItsForVariable.erase(VD);
-         for (const VarDecl * V : VarGroupForVD) {
-           FixItsForVariable.erase(V);
-         }
-         continue;
-       }
+   // `FixItsForVariable` now contains only variables that can be
+   // fixed. A variable can be fixed if its declaration and all Fixables
+   // associated to it can all be fixed.
+   // To further remove from `FixItsForVariable` variables whose group mates
+   // cannot be fixed...
+   eraseVarsForUnfixableGroupMates(FixItsForVariable, VarGrpMgr);

NoQ (Unsubmitted, Not Done):

Architecturally speaking, I think I just realized something confusing about our code.

We already have variable groups well-defined at the Strategy phase, i.e. before we call `getFixIts()`, but then `getFixIts()` continues to reason about all variables collectively and indiscriminately. It continues to use entities such as the `FixItsForVariable` map which contain fixits for variables from *all* groups, not just the ones that are currently relevant. Then it re-introduces per-group data structures such as `ParmsNeedFixMask` on an ad-hoc basis, and it tries to compute them this way using the global, indiscriminate data structures.

I'm starting to suspect that the code would start making a lot more sense if we invoke `getFixIts()` separately for each variable group, so that each such invocation produces a single collective fixit for the group, or fails to do so.

This way we might be able to avoid sending steganographic messages through `FixItsForVariable`, but instead say directly "these are the variables that we're currently focusing on". It is the responsibility of the `Strategy` class to answer "should this variable be fixed?"; we shouldn't direct that question to any other data structures.

And if a group fails at any point, just early-return `None` and proceed directly to the next `getFixIts()` invocation for the next group. We don't need to separately record which individual variables have failed. In particular, `eraseVarsForUnfixableGroupMates()` would become a simple early return.

It probably also makes sense to store groups themselves inside the `Strategy` class. After all, fixing variables together is a form of strategy.

NoQ (Unsubmitted, Not Done):

(I don't think this needs to be addressed in the current patch, but this could help us untangle the code in general.)

ziqingluo-90 (Author, Unsubmitted, Done):

This makes absolute sense! Each group is independent for fix-it generation. Moreover, when we support more strategy kinds, the constraint solving for a proper strategy will also be group-based.

NoQ (Unsubmitted, Not Done):

> the constraint solving for a proper strategy will also be group-based.

Hmm, my head-nomenclature (like head-canon but nomenclature) says that grouping is a sub-task of solving for strategy. I.e., we take "strategy constraints" of the form

  'p' is any safe type => 'q' is any safe type

and solve these constraints by crawling through the implication graph.

But all the solving subtasks below grouping should probably indeed be separate!

-     }
-     FixItsForVariable[VD].insert(FixItsForVariable[VD].end(),
-                                  FixItsForVD.begin(), FixItsForVD.end());
-     // Fix-it shall not overlap with macros or/and templates:
-     if (overlapWithMacro(FixItsForVariable[VD]) ||
-         clang::internal::anyConflict(FixItsForVariable[VD],
-                                      Ctx.getSourceManager())) {
-       FixItsForVariable.erase(VD);
-       continue;
-     }
-   }
+   // Now `FixItsForVariable` gets further reduced: a variable is in
+   // `FixItsForVariable` iff it can be fixed and all its group mates can be
+   // fixed.
    // The map that maps each variable `v` to fix-its for the whole group where
    // `v` is in:
    std::map<const VarDecl *, FixItList> FinalFixItsForVariable{
        FixItsForVariable};
    for (auto &[Var, Ignore] : FixItsForVariable) {
      const auto VarGroupForVD = VarGrpMgr.getGroupOfVar(Var);
      for (const VarDecl *GrpMate : VarGroupForVD) {
        if (Var == GrpMate)
          continue;
        if (FixItsForVariable.count(GrpMate))
          FinalFixItsForVariable[Var].insert(FinalFixItsForVariable[Var].end(),
                                             FixItsForVariable[GrpMate].begin(),
                                             FixItsForVariable[GrpMate].end());
      }
    }
+   // Fix-its that will be applied in one step shall NOT:
+   // 1. overlap with macros or/and templates; or
+   // 2. conflict with each other.

NoQ (Unsubmitted, Done):

  // 1. overlap with macros or/and templates; or
- // 2. conflicting each other.
+ // 2. conflict with each other.
  // Otherwise, the fix-its will be dropped.

+   // Otherwise, the fix-its will be dropped.
+   for (auto Iter = FinalFixItsForVariable.begin();
+        Iter != FinalFixItsForVariable.end();)
+     if (overlapWithMacro(Iter->second) ||
+         clang::internal::anyConflict(Iter->second, Ctx.getSourceManager())) {
+       Iter = FinalFixItsForVariable.erase(Iter);
+     } else
+       Iter++;
    return FinalFixItsForVariable;
  }
  static Strategy
  getNaiveStrategy(const llvm::SmallVectorImpl<const VarDecl *> &UnsafeVars) {
    Strategy S;
    for (const VarDecl *VD : UnsafeVars) {
      S.set(VD, Strategy::Kind::Span);
    }
    return S;
  }

(21 unchanged lines hidden. Context: void clang::checkUnsafeBufferUsage(const Decl *D,)

                                        bool EmitSuggestions) {
  #ifndef NDEBUG
    Handler.clearDebugNotes();
  #endif
    assert(D && D->getBody());
    // We do not want to visit a Lambda expression defined inside a method independently.
    // Instead, it should be visited along with the outer method.
+   // FIXME: do we want to do the same thing for `BlockDecl`s?

NoQ (Unsubmitted, Done):

Hmm, do we suffer from a crash similar to D150386 in presence of blocks instead of lambdas?

I vaguely remember that there were subtle differences, but I also think our approach should probably be the same, so this fixme is probably correct.

    if (const auto *fd = dyn_cast<CXXMethodDecl>(D)) {
      if (fd->getParent()->isLambda() && fd->getParent()->isLocalClass())
        return;
    }
    // Do not emit fixit suggestions for functions declared in an
    // extern "C" block.
    if (const auto *FD = dyn_cast<FunctionDecl>(D)) {

(218 unchanged lines hidden. Context: if (VisitedVars.find((*I).first) == VisitedVars.end()) {)

        ++I;
    }
    Strategy NaiveStrategy = getNaiveStrategy(UnsafeVars);
    VariableGroupsManagerImpl VarGrpMgr(Groups, VarGrpMap);
    FixItsForVariableGroup =
        getFixIts(FixablesForAllVars, NaiveStrategy, D->getASTContext(), D,
                  Tracker, Handler, VarGrpMgr);

NoQ (Unsubmitted, Done):

  VariableGroupsManagerImpl VarGrpMgr(Groups, VarGrpMap);
- if (isa<FunctionDecl>(D))
+ if (const auto *FD = dyn_cast<FunctionDecl>(D))
    // The only case where `D` is not a `FunctionDecl` is when `D` is a
    // `BlockDecl`. Let's NOT try to fix variables in blocks for now, because
    // those variables could be declared implicitly (captured variables) or in
    // enclosing scopes.
    FixItsForVariableGroup =
        getFixIts(FixablesForAllVars, NaiveStrategy, D->getASTContext(),
-                 cast<FunctionDecl>(D), Tracker, Handler, VarGrpMgr);
+                 FD, Tracker, Handler, VarGrpMgr);
  for (const auto &G : UnsafeOps.noVar) {

A bit more idiomatic this way.

    for (const auto &G : UnsafeOps.noVar) {
      Handler.handleUnsafeOperation(G->getBaseStmt(), /*IsRelatedToDecl=*/false);
    }

NoQ (Unsubmitted, Done):

Ultimately we should accept `ObjCMethodDecl` here, and we should also accept `BlockDecl`s that live in global scope (where they can't be checked together with the surrounding function because there's no surrounding function).

Of course, parameter fixits are going to be quite different in these cases, but that shouldn't stop us from trying. Even today, although we don't know how to fix ObjC method parameters, we can already encounter cases where fixing local variables is sufficient.

In other words, I think `const FunctionDecl *` is the right parameter type for e.g. `createFunctionOverloadsForParms()` (like I suggested before), but probably not for the entire `getFixIts()`.

    for (const auto &[VD, WarningGadgets] : UnsafeOps.byVar) {
      auto FixItsIt = FixItsForVariableGroup.find(VD);
      Handler.handleUnsafeVariableGroup(VD, VarGrpMgr,
                                        FixItsIt != FixItsForVariableGroup.end()
                                            ? std::move(FixItsIt->second)
                                            : FixItList{});
      for (const auto &G : WarningGadgets) {
        Handler.handleUnsafeOperation(G->getBaseStmt(), /*IsRelatedToDecl=*/true);
      }
