
Details

Reviewers

NoQ
jkorous
t-rasmud
ziqingluo-90

Commits

rG50d4a1f70e11: [-Wunsafe-buffer-usage] Safe-buffers re-architecture to introduce Fixable…

Diff Detail

Event Timeline

malavikasamak created this revision. Dec 14 2022, 2:57 PM

Herald added a reviewer: NoQ. Dec 14 2022, 2:57 PM

Herald added a project: Restricted Project.

malavikasamak requested review of this revision. Dec 14 2022, 2:57 PM

Harbormaster completed remote builds in B203226: Diff 483006. Dec 14 2022, 2:57 PM

malavikasamak added reviewers: jkorous, t-rasmud, ziqingluo-90. Dec 14 2022, 2:58 PM

Ok let's explain what's going on.

We're basically realizing that most unsafe patterns aren't fixable without additional context. The original design was like

"Suppose we see arr[n] in the code. This is both an indication that the code is unsafe, and an acknowledgement that this use of arr doesn't require fixing as all safe containers will act as a drop-in replacement."

This isn't correct though, because overloaded operator [] is very different from raw operator []. Namely, if the array has N elements, then &arr[N] is valid code when the operator is raw but undefined behavior when the operator is overloaded (which would be caught by -fsanitize=library as a hard crash). Therefore, even though arr[n] is an unsafe pattern, we cannot actually fix it until we discover more context, such as an implicit lvalue-to-rvalue conversion of the subscript result.
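To make the difference concrete, here is a minimal sketch of the one-past-the-end case. It assumes `std::array` as the safe container (the review mostly talks about `std::span`); the function names are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Raw array: forming the one-past-the-end address with &raw[3] is the idiom
// the comment above treats as valid, as long as nothing is read through it.
inline int sumRaw() {
  int raw[3] = {1, 2, 3};
  int *end = &raw[3]; // address only, never dereferenced
  int total = 0;
  for (int *p = raw; p != end; ++p)
    total += *p;
  return total;
}

// Container: &arr[3] would call the overloaded operator[] with an
// out-of-range index, which is undefined behavior, so the end pointer
// has to be spelled without going through operator[].
inline int sumContainer() {
  std::array<int, 3> arr = {1, 2, 3};
  int *end = arr.data() + arr.size();
  int total = 0;
  for (int *p = arr.data(); p != end; ++p)
    total += *p;
  return total;
}
```

Both functions compute the same sum, but only the second spelling survives a switch from a raw array to a container, which is why a bare arr[n] cannot be rewritten without knowing how the result is used.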

Well, technically, we can always "fix" expression arr[n] by replacing it with arr.data()[n]. We don't even need to include [n] in our context, we can replace any occurrence of arr with arr.data() and call it a day. Such a "fix", even though it works in every context, is however a "low-quality" fix because it defeats the purpose of safe buffer usage: the container will no longer be able to check bounds in this operation at runtime.

Of course, in some contexts such a "low-quality fix" may be the only possible fix.
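A rough sketch of why the data() form is "low-quality". `CheckedView` is a hypothetical bounds-checked container invented for this example, and throwing an exception stands in for the hard crash a hardened library would produce; none of these names come from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical bounds-checked view, standing in for a hardened container.
struct CheckedView {
  int *Ptr;
  std::size_t Len;

  int &operator[](std::size_t I) const {
    if (I >= Len)
      throw std::out_of_range("CheckedView index out of range");
    return Ptr[I];
  }
  int *data() const { return Ptr; }
};

// High-quality form: the container sees the index and can reject it.
inline int readChecked(const CheckedView &arr, std::size_t n) {
  return arr[n];
}

// Low-quality form: same result for a valid n, but the check is bypassed.
inline int readViaData(const CheckedView &arr, std::size_t n) {
  return arr.data()[n];
}

// Helper for the sketch: reports whether the checked read rejects n.
inline bool checkedReadThrows(const CheckedView &arr, std::size_t n) {
  try {
    (void)readChecked(arr, n);
  } catch (const std::out_of_range &) {
    return true;
  }
  return false;
}
```

For in-range indices the two reads agree. For an out-of-range index only the operator[] form gets a chance to stop the access; the data() form silently goes back to raw pointer semantics.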

Which naturally leads to the idea of fixable gadgets acting as "refinements" of each other: out of all gadgets that consume the given use of the variable we could pick the one that acknowledges the most context, because it provides the highest-quality fix. This could potentially save us some work, but it could also lead to sometimes emitting low-quality fixits when manual user intervention would have led to much better code. This would also maintain the uniformity between unsafe and safe gadgets: unsafe gadgets are still fixable gadgets, just most of them will be "refined" later.

Another solution could be to turn gadgets inside out by using ParentMap to discover the context as part of the fixit. Such a solution is prone to rapidly getting out of control when contexts of different variable use sites start overlapping (e.g. in expressions like ptr1 = ptr2).

So for now, in order to avoid overengineering, we stick to a very simple approach instead: keep unsafe gadgets completely separate and don't use them for fixit generation at all. Don't have gadgets refine each other, but only implement gadgets that capture *all* the necessary context. Now the only thing they have in common with safe gadgets is that both are represented as ASTMatcher-based patterns for which the code can be easily scanned. Therefore the getFixits() method gets moved to the safe gadget interface, and unsafe gadgets don't even need to worry about it. If an unsafe gadget turns out to be fixable without additional context, it will have to be duplicated as a safe gadget. But this is fine because that's like 1-2 extra classes in a hierarchy of ~30.
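The resulting split can be sketched roughly as follows. This is a simplified standalone model, not the code in the patch: `FixItList` is a list of strings rather than clang fix-it hints, a null `std::unique_ptr` stands in for an empty `Optional<FixItList>`, and `SubscriptWarning` and `PointerReadFixable` are hypothetical gadgets.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in for the real fix-it list type.
using FixItList = std::vector<std::string>;

class Gadget {
public:
  virtual ~Gadget() = default;
  virtual bool isWarningGadget() const = 0;
};

// Reports an unsafe operation; knows nothing about fixes.
class WarningGadget : public Gadget {
public:
  bool isWarningGadget() const final { return true; }
};

// Captures enough context to rewrite one use of a variable.
class FixableGadget : public Gadget {
public:
  bool isWarningGadget() const final { return false; }

  // nullptr means "no fix can be produced"; an empty list means
  // "nothing needs to change".
  virtual std::unique_ptr<FixItList> getFixits() const { return nullptr; }
};

// Hypothetical gadgets, for illustration only.
class SubscriptWarning : public WarningGadget {};

class PointerReadFixable : public FixableGadget {
public:
  std::unique_ptr<FixItList> getFixits() const override {
    return std::make_unique<FixItList>(FixItList{"arr -> arr.data()"});
  }
};
```

The point of the shape is that getFixits() lives only on the fixable side, so a warning gadget cannot accidentally be asked for a fix it has no context to produce.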

I did a superficial pass and the code LGTM (just nits).

Let me add one more thought to our motivation here.
@malavikasamak has pointed out that replacing arr with arr.data() when its type changes from T* to std::span<T> is not a universally correct fix. A counter-example is LHS of an assignment as arr.data() is not an LValue:

arr = nullptr;
=>
arr.data() = nullptr; // error: expression is not assignable

Which means that not only would gadgets that ignore relevant context be of lower quality but in some cases these would also be incorrect.

clang/lib/Analysis/UnsafeBufferUsage.cpp
579	This looks like a debug print in the parent commit. We should remove it from this patch.
619	(unrelated to this patch) In some follow-up patch we should rename `Pushed` to make the code easier to understand. IIUC it means "is there a variable declaration this warning relates to?".
651	Is this a left-over debug print?

This revision is now accepted and ready to land. Dec 15 2022, 5:36 PM

ziqingluo-90 added inline comments. Dec 15 2022, 5:39 PM

clang/lib/Analysis/UnsafeBufferUsage.cpp
200	Do we still keep the terms "safe gadget" and "unsafe gadget" for referring to something else, or should we forget about them from now on?
651	I think an assertion `assert(false && "This should not be entered" )` might be better. But why should this code be unreachable?

In D140062#4000094, @jkorous wrote:
Let me add one more thought to our motivation here.
@malavikasamak has pointed out that replacing arr with arr.data() when its type changes from T* to std::span<T> is not a universally correct fix. A counter-example is LHS of an assignment as arr.data() is not an LValue:
arr = nullptr;
=>
arr.data() = nullptr; // error: expression is not assignable
Which means that not only would gadgets that ignore relevant context be of lower quality but in some cases these would also be incorrect.

Yeah that's right, my bad, it's only a correct fix for ImplicitCastExpr<LValueToRValue>(arr). You cannot turn a std::span or std::array into a pointer lvalue, so it's fine that we don't have a low-quality fix for such cases.
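A small sketch of the rvalue/lvalue distinction discussed here. `IntSpan` is a hypothetical minimal stand-in for std::span<int>, used so the example builds without C++20; the helper names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for std::span<int>.
struct IntSpan {
  int *Ptr = nullptr;
  std::size_t Len = 0;
  int *data() const { return Ptr; } // returns a prvalue, not an lvalue
  std::size_t size() const { return Len; }
};

// Rvalue use: `int *p = arr;` can become `int *p = arr.data();`.
inline int *readAsPointer(const IntSpan &arr) { return arr.data(); }

// Lvalue use: `arr = nullptr;` cannot become `arr.data() = nullptr;`
// (error: expression is not assignable). The container itself has to be
// reassigned instead.
inline void reset(IntSpan &arr) { arr = IntSpan{}; }
```

So the data() rewrite is only available where the pointer value is being read; an assignment to the variable needs a different, context-aware fix.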

Looks great to me as well! I have a few nitpicks.

clang/lib/Analysis/UnsafeBufferUsage.cpp
159	I already nuked this function, so you probably need to rebase :(
185	`getBaseStmt()` can probably go to `WarningGadget` now?
190	Hmm so this function has to stay on the `Gadget` class because we want to group warnings by variables. Ok right.
200	No let's wipe them out!
641–642	In your case every gadget in the list is a warning gadget. So you can simply check whether the list is empty.

NoQ retitled this revision from [WIP] Safe-buffers re-architecture to introduce Fixable gadgets to [WIP][-Wunsafe-buffer-usage] Safe-buffers re-architecture to introduce Fixable gadgets. Dec 15 2022, 6:15 PM

NoQ mentioned this in D140179: [-Wunsafe-buffer-usage] Add unsafe buffer checking opt-out pragmas. Dec 15 2022, 6:57 PM

NoQ mentioned this in D138321: [-Wunsafe-buffer-usage] Ignore array subscript on literal zero. Dec 15 2022, 7:03 PM

ziqingluo-90 added a child revision: D139737: [-Wunsafe-buffer-usage] Initiate Fix-it generation for local variable declarations. Dec 19 2022, 5:30 PM

ziqingluo-90 removed a child revision: D139737: [-Wunsafe-buffer-usage] Initiate Fix-it generation for local variable declarations. Dec 19 2022, 5:32 PM

Addressed some comments. Need to rebase to Artem's changes.

Harbormaster completed remote builds in B204227: Diff 484365. Dec 20 2022, 1:03 PM

malavikasamak added inline comments. Dec 20 2022, 1:03 PM

clang/lib/Analysis/UnsafeBufferUsage.cpp
200	I will do a round and wipe these out from comments.
651	This is a left over debug message I added to test the re-architecture works as intended. Currently, it should not be reachable as we don't have any fixable gadgets available.

malavikasamak updated this revision to Diff 486420. Jan 4 2023, 4:02 PM

malavikasamak retitled this revision from [WIP][-Wunsafe-buffer-usage] Safe-buffers re-architecture to introduce Fixable gadgets to [-Wunsafe-buffer-usage] Safe-buffers re-architecture to introduce Fixable gadgets.

Harbormaster completed remote builds in B205793: Diff 486420. Jan 4 2023, 4:03 PM

malavikasamak added a parent revision: D139233: [-Wunsafe-buffer-usage] Add an unsafe gadget for pointer-arithmetic operations. Jan 4 2023, 4:04 PM

jkorous added inline comments. Jan 4 2023, 5:53 PM

clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def
30	Nit: Can we add the newline?
30	Please ignore - this is a stale comment.
clang/lib/Analysis/UnsafeBufferUsage.cpp
223	I think this is a bug.

malavikasamak added inline comments. Jan 4 2023, 5:58 PM

clang/lib/Analysis/UnsafeBufferUsage.cpp
223	Yes. Good catch. Thank you.

jkorous added inline comments. Jan 4 2023, 6:17 PM

clang/lib/Analysis/UnsafeBufferUsage.cpp
526

malavikasamak updated this revision to Diff 486622. Jan 5 2023, 10:24 AM

Harbormaster completed remote builds in B205946: Diff 486622. Jan 5 2023, 10:25 AM

malavikasamak updated this revision to Diff 486625. Jan 5 2023, 10:31 AM

malavikasamak marked an inline comment as done.

Harbormaster completed remote builds in B205949: Diff 486625. Jan 5 2023, 10:31 AM

ziqingluo-90 added a child revision: D139737: [-Wunsafe-buffer-usage] Initiate Fix-it generation for local variable declarations. Jan 5 2023, 1:09 PM

This revision was landed with ongoing or failed builds. Jan 6 2023, 11:45 AM

Closed by commit rG50d4a1f70e11: [-Wunsafe-buffer-usage] Safe-buffers re-architecture to introduce Fixable… (authored by malavikasamak).

This revision was automatically updated to reflect the committed changes.

malavikasamak added a commit: rG50d4a1f70e11: [-Wunsafe-buffer-usage] Safe-buffers re-architecture to introduce Fixable….

Herald added a project: Restricted Project. Jan 6 2023, 11:45 AM

Herald added a subscriber: cfe-commits.

ziqingluo-90 mentioned this in D139737: [-Wunsafe-buffer-usage] Initiate Fix-it generation for local variable declarations. Jan 9 2023, 12:24 PM

NoQ mentioned this in D138940: [-Wunsafe-buffer-usage] Introduce the `unsafe_buffer_usage` attribute. Jan 19 2023, 3:44 PM

Diff 484365

clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def

	//=- UnsafeBufferUsageGadgets.def - List of ways to use a buffer --- C++ --=//			//=- UnsafeBufferUsageGadgets.def - List of ways to use a buffer --- C++ --=//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef GADGET			#ifndef GADGET
	#define GADGET(name)			#define GADGET(name)
	#endif			#endif

	#ifndef UNSAFE_GADGET			#ifndef WARNING_GADGET
	#define UNSAFE_GADGET(name) GADGET(name)			#define WARNING_GADGET(name) GADGET(name)
	#endif			#endif

	#ifndef SAFE_GADGET			#ifndef FIXABLE_GADGET
	#define SAFE_GADGET(name) GADGET(name)			#define FIXABLE_GADGET(name) GADGET(name)
	#endif			#endif

	UNSAFE_GADGET(Increment)			WARNING_GADGET(Increment)
	UNSAFE_GADGET(Decrement)			WARNING_GADGET(Decrement)
	UNSAFE_GADGET(ArraySubscript)			WARNING_GADGET(ArraySubscript)
	UNSAFE_GADGET(UnsafeBufferUsageAttr)			WARNING_GADGET(UnsafeBufferUsageAttr)
	UNSAFE_GADGET(PointerArithmetic)			WARNING_GADGET(PointerArithmetic)

	#undef SAFE_GADGET			#undef FIXABLE_GADGET
	#undef UNSAFE_GADGET			#undef WARNING_GADGET
	#undef GADGET			#undef GADGET
				No newline at end of file
jkorous: Nit: Can we add the newline?
jkorous: Please ignore - this is a stale comment.
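
For readers unfamiliar with the .def pattern in this file, here is a minimal self-contained sketch of how such a registry can drive both an enum and a classification function. It uses a list macro instead of a separate include file so that it compiles on its own. `GADGET_LIST`, the `AS_*` helper macros and the `HypotheticalFixable` entry are assumptions made for illustration and are not part of the patch (which registers no fixable gadgets yet).

```cpp
#include <cassert>

#define GADGET_LIST(WARNING_GADGET, FIXABLE_GADGET) \
  WARNING_GADGET(Increment) \
  WARNING_GADGET(Decrement) \
  WARNING_GADGET(ArraySubscript) \
  WARNING_GADGET(UnsafeBufferUsageAttr) \
  WARNING_GADGET(PointerArithmetic) \
  FIXABLE_GADGET(HypotheticalFixable)

#define AS_ENUMERATOR(name) name,
#define AS_CASE(name) case Kind::name:
#define AS_NOTHING(name)

// Every registered gadget, warning or fixable, becomes one enumerator.
enum class Kind { GADGET_LIST(AS_ENUMERATOR, AS_ENUMERATOR) };

// Warning entries expand to case labels that fall through to `return true`;
// fixable entries expand to case labels that fall through to `return false`.
inline bool isWarningKind(Kind K) {
  switch (K) {
    GADGET_LIST(AS_CASE, AS_NOTHING)
    return true;
    GADGET_LIST(AS_NOTHING, AS_CASE)
    return false;
  }
  return false; // not reached for valid kinds
}
```

Adding a gadget then means adding one line to the list, and the enum and the switch stay in sync automatically.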

clang/lib/Analysis/UnsafeBufferUsage.cpp

//===- UnsafeBufferUsage.cpp - Replace pointers with modern C++ -----------===// //===- UnsafeBufferUsage.cpp - Replace pointers with modern C++ -----------===//

// //

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information. // See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#include "clang/Analysis/Analyses/UnsafeBufferUsage.h" #include "clang/Analysis/Analyses/UnsafeBufferUsage.h"

#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/RecursiveASTVisitor.h"

#include "llvm/ADT/SmallVector.h" #include "llvm/ADT/SmallVector.h"

#include <iostream>

using namespace llvm; using namespace llvm;

using namespace clang; using namespace clang;

using namespace ast_matchers; using namespace ast_matchers;

namespace clang::ast_matchers::internal { namespace clang::ast_matchers::internal {

// A `RecursiveASTVisitor` that traverses all descendants of a given node "n" // A `RecursiveASTVisitor` that traverses all descendants of a given node "n"

// except for those belonging to a different callable of "n". // except for those belonging to a different callable of "n".

[119 lines not shown]

} }

namespace { namespace {

/// Gadget is an individual operation in the code that may be of interest to /// Gadget is an individual operation in the code that may be of interest to

/// this analysis. Each (non-abstract) subclass corresponds to a specific /// this analysis. Each (non-abstract) subclass corresponds to a specific

/// rigid AST structure that constitutes an operation on a pointer-type object. /// rigid AST structure that constitutes an operation on a pointer-type object.

/// Discovery of a gadget in the code corresponds to claiming that we understand /// Discovery of a gadget in the code corresponds to claiming that we understand

/// what this part of code is doing well enough to potentially improve it. /// what this part of code is doing well enough to potentially improve it.

/// Gadgets can be unsafe (immediately deserving a warning) or safe (not /// Gadgets can be warning (immediately deserving a warning) or fixable (not

/// deserving a warning per se, but affecting our decision-making process /// always deserving a warning per se, but requires our attention to identify

/// nonetheless). /// it warrants a fixit).

class Gadget { class Gadget {

public: public:

enum class Kind { enum class Kind {

#define GADGET(x) x, #define GADGET(x) x,

#include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def" #include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def"

#undef GADGETS #undef GADGETS

}; };

/// Determine if a kind is a safe kind. Slower than calling isSafe(). /// Determine if a kind is a safe kind. Slower than calling isWarningGadget().

NoQ: I already nuked this function, so you probably need to rebase :(

static bool isSafeKind(Kind K) { static bool isWarningKind(Kind K) {

switch (K) { switch (K) {

#define UNSAFE_GADGET(x) \ #define WARNING_GADGET(x) \

case Kind::x: case Kind::x:

#include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def" #include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def"

#undef UNSAFE_GADGET #undef WARNING_GADGET

return false; return true;

#define SAFE_GADGET(x) \ #define FIXABLE_GADGET(x) \

case Kind::x: case Kind::x:

#include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def" #include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def"

#undef SAFE_GADGET #undef FIXABLE_GADGET

return true; return false;

} }

llvm_unreachable("Invalid gadget kind!"); llvm_unreachable("Invalid gadget kind!");

} }

/// Common type of ASTMatchers used for discovering gadgets. /// Common type of ASTMatchers used for discovering gadgets.

/// Useful for implementing the static matcher() methods /// Useful for implementing the static matcher() methods

/// that are expected from all non-abstract subclasses. /// that are expected from all non-abstract subclasses.

using Matcher = decltype(stmt()); using Matcher = decltype(stmt());

Gadget(Kind K) : K(K) {} Gadget(Kind K) : K(K) {}

Kind getKind() const { return K; } Kind getKind() const { return K; }

virtual bool isSafe() const = 0; virtual bool isWarningGadget() const = 0;

NoQ: `getBaseStmt()` can probably go to `WarningGadget` now?

virtual const Stmt *getBaseStmt() const = 0;

/// Returns the list of pointer-type variables on which this gadget performs /// Returns the list of pointer-type variables on which this gadget performs

/// its operation. Typically there's only one variable. This isn't a list /// its operation. Typically there's only one variable. This isn't a list

/// of all DeclRefExprs in the gadget's AST! /// of all DeclRefExprs in the gadget's AST!

virtual DeclUseList getClaimedVarUseSites() const = 0; virtual DeclUseList getClaimedVarUseSites() const = 0;

NoQ: Hmm so this function has to stay on the `Gadget` class because we want to group warnings by variables. Ok right.

/// Returns a fixit that would fix the current gadget according to

/// the current strategy. Returns None if the fix cannot be produced;

/// returns an empty list if no fixes are necessary.

virtual Optional<FixItList> getFixits(const Strategy &) const {

return None;

}

virtual ~Gadget() {} virtual ~Gadget() {}

private: private:

Kind K; Kind K;

}; };

using GadgetList = std::vector<std::unique_ptr<Gadget>>;

/// Unsafe gadgets correspond to unsafe code patterns that warrants /// Warning gadgets correspond to unsafe code patterns that warrants

ziqingluo-90: Do we still keep the terms "safe gadget" and "unsafe gadget" for referring to something else, or should we forget about them from now on?

NoQ: No let's wipe them out!

malavikasamak: I will do a round and wipe these out from comments.

/// an immediate warning. /// an immediate warning.

class UnsafeGadget : public Gadget { class WarningGadget : public Gadget {

public: public:

UnsafeGadget(Kind K) : Gadget(K) { WarningGadget(Kind K) : Gadget(K) {

assert(classof(this) && "Invalid unsafe gadget kind!"); assert(classof(this) && "Invalid unsafe gadget kind!");

} }

static bool classof(const Gadget *G) { return !isSafeKind(G->getKind()); } static bool classof(const Gadget *G) { return isWarningKind(G->getKind()); }

bool isSafe() const override { return false; } bool isWarningGadget() const override { return true; }

virtual const Stmt *getBaseStmt() const = 0;

}; };

/// Safe gadgets correspond to code patterns that aren't unsafe but need to be /// Fixable gadgets correspond to code patterns that aren't always unsafe but need to be

/// properly recognized in order to emit correct warnings and fixes over unsafe /// properly recognized in order to emit correct fixes. For example, if a raw pointer-type

/// gadgets. For example, if a raw pointer-type variable is replaced by /// variable is replaced by a safe C++ container, every use of such variable maust be

/// a safe C++ container, every use of such variable may need to be

/// carefully considered and possibly updated. /// carefully considered and possibly updated.

class SafeGadget : public Gadget { class FixableGadget : public Gadget {

public: public:

SafeGadget(Kind K) : Gadget(K) { FixableGadget(Kind K) : Gadget(K) {

assert(classof(this) && "Invalid safe gadget kind!"); assert(classof(this) && "Invalid safe gadget kind!");

} }

static bool classof(const Gadget *G) { return !G->isWarningGadget(); }

- bool isWarningGadget() const final { return true; }

+ bool isWarningGadget() const final { return false; }

/// Returns a fixit that would fix the current gadget according to

jkorous: I think this is a bug.

malavikasamak: Yes. Good catch. Thank you.

static bool classof(const Gadget *G) { return isSafeKind(G->getKind()); } static bool classof(const Gadget *G) { return !isWarningKind(G->getKind()); }

bool isSafe() const override { return true; } bool isWarningGadget() const override { return false; }

/// Returns a fixit that would fix the current gadget according to

/// the current strategy. Returns None if the fix cannot be produced;

/// returns an empty list if no fixes are necessary.

virtual Optional<FixItList> getFixits(const Strategy &) const {

return None;

}

}; };

using FixableGadgetList = std::vector<std::unique_ptr<FixableGadget>>;

using WarningGadgetList = std::vector<std::unique_ptr<WarningGadget>>;

/// An increment of a pointer-type value is unsafe as it may run the pointer /// An increment of a pointer-type value is unsafe as it may run the pointer

/// out of bounds. /// out of bounds.

class IncrementGadget : public UnsafeGadget { class IncrementGadget : public WarningGadget {

const UnaryOperator *Op; const UnaryOperator *Op;

public: public:

IncrementGadget(const MatchFinder::MatchResult &Result) IncrementGadget(const MatchFinder::MatchResult &Result)

: UnsafeGadget(Kind::Increment), : WarningGadget(Kind::Increment),

Op(Result.Nodes.getNodeAs<UnaryOperator>("op")) {} Op(Result.Nodes.getNodeAs<UnaryOperator>("op")) {}

static bool classof(const Gadget *G) { static bool classof(const Gadget *G) {

return G->getKind() == Kind::Increment; return G->getKind() == Kind::Increment;

} }

static Matcher matcher() { static Matcher matcher() {

return stmt(unaryOperator( return stmt(unaryOperator(

[11 lines not shown]

DeclUseList getClaimedVarUseSites() const override {

} }

return {}; return {};

} }

}; };

/// A decrement of a pointer-type value is unsafe as it may run the pointer /// A decrement of a pointer-type value is unsafe as it may run the pointer

/// out of bounds. /// out of bounds.

class DecrementGadget : public UnsafeGadget { class DecrementGadget : public WarningGadget {

const UnaryOperator *Op; const UnaryOperator *Op;

public: public:

DecrementGadget(const MatchFinder::MatchResult &Result) DecrementGadget(const MatchFinder::MatchResult &Result)

: UnsafeGadget(Kind::Decrement), : WarningGadget(Kind::Decrement),

Op(Result.Nodes.getNodeAs<UnaryOperator>("op")) {} Op(Result.Nodes.getNodeAs<UnaryOperator>("op")) {}

static bool classof(const Gadget *G) { static bool classof(const Gadget *G) {

return G->getKind() == Kind::Decrement; return G->getKind() == Kind::Decrement;

} }

static Matcher matcher() { static Matcher matcher() {

return stmt(unaryOperator( return stmt(unaryOperator(

[11 lines not shown]

DeclUseList getClaimedVarUseSites() const override {

} }

return {}; return {};

} }

}; };

/// Array subscript expressions on raw pointers as if they're arrays. Unsafe as /// Array subscript expressions on raw pointers as if they're arrays. Unsafe as

/// it doesn't have any bounds checks for the array. /// it doesn't have any bounds checks for the array.

class ArraySubscriptGadget : public UnsafeGadget { class ArraySubscriptGadget : public WarningGadget {

const ArraySubscriptExpr *ASE; const ArraySubscriptExpr *ASE;

public: public:

ArraySubscriptGadget(const MatchFinder::MatchResult &Result) ArraySubscriptGadget(const MatchFinder::MatchResult &Result)

: UnsafeGadget(Kind::ArraySubscript), : WarningGadget(Kind::ArraySubscript),

ASE(Result.Nodes.getNodeAs<ArraySubscriptExpr>("arraySubscr")) {} ASE(Result.Nodes.getNodeAs<ArraySubscriptExpr>("arraySubscr")) {}

static bool classof(const Gadget *G) { static bool classof(const Gadget *G) {

return G->getKind() == Kind::ArraySubscript; return G->getKind() == Kind::ArraySubscript;

} }

static Matcher matcher() { static Matcher matcher() {

return stmt( return stmt(

[11 lines not shown]

DeclUseList getClaimedVarUseSites() const override {

} }

return {}; return {};

} }

}; };

/// A call of a function or method that performs unchecked buffer operations /// A call of a function or method that performs unchecked buffer operations

/// over one of its pointer parameters. /// over one of its pointer parameters.

class UnsafeBufferUsageAttrGadget : public UnsafeGadget { class UnsafeBufferUsageAttrGadget : public WarningGadget {

const CallExpr *Op; const CallExpr *Op;

public: public:

UnsafeBufferUsageAttrGadget(const MatchFinder::MatchResult &Result) UnsafeBufferUsageAttrGadget(const MatchFinder::MatchResult &Result)

: UnsafeGadget(Kind::UnsafeBufferUsageAttr), : WarningGadget(Kind::UnsafeBufferUsageAttr),

Op(Result.Nodes.getNodeAs<CallExpr>("call_expr")) {} Op(Result.Nodes.getNodeAs<CallExpr>("call_expr")) {}

static bool classof(const Gadget *G) { static bool classof(const Gadget *G) {

return G->getKind() == Kind::UnsafeBufferUsageAttr; return G->getKind() == Kind::UnsafeBufferUsageAttr;

} }

static Matcher matcher() { static Matcher matcher() {

return stmt(callExpr(callee(functionDecl(hasAttr(attr::UnsafeBufferUsage)))) return stmt(callExpr(callee(functionDecl(hasAttr(attr::UnsafeBufferUsage))))

[9 lines not shown]

DeclUseList getClaimedVarUseSites() const override {

return {}; return {};

} }

}; };

/// A pointer arithmetic expression of one of the forms: /// A pointer arithmetic expression of one of the forms:

/// \code /// \code

/// \endcode /// \endcode

class PointerArithmeticGadget : public UnsafeGadget { class PointerArithmeticGadget : public WarningGadget {

const BinaryOperator *PA; // pointer arithmetic expression const BinaryOperator *PA; // pointer arithmetic expression

const Expr * Ptr; // the pointer expression in `PA` const Expr * Ptr; // the pointer expression in `PA`

public: public:

PointerArithmeticGadget(const MatchFinder::MatchResult &Result) PointerArithmeticGadget(const MatchFinder::MatchResult &Result)

: UnsafeGadget(Kind::PointerArithmetic), : WarningGadget(Kind::PointerArithmetic),

PA(Result.Nodes.getNodeAs<BinaryOperator>("ptrAdd")), PA(Result.Nodes.getNodeAs<BinaryOperator>("ptrAdd")),

Ptr(Result.Nodes.getNodeAs<Expr>("ptrAddPtr")) {} Ptr(Result.Nodes.getNodeAs<Expr>("ptrAddPtr")) {}

static bool classof(const Gadget *G) { static bool classof(const Gadget *G) {

return G->getKind() == Kind::PointerArithmetic; return G->getKind() == Kind::PointerArithmetic;

} }

static Matcher matcher() { static Matcher matcher() {

[113 lines not shown]

if (I == Map.end())

return Kind::Wontfix; return Kind::Wontfix;

return I->second; return I->second;

} }

}; };

} // namespace } // namespace

/// Scan the function and return a list of gadgets found with provided kits. /// Scan the function and return a list of gadgets found with provided kits.

static std::pair<GadgetList, DeclUseTracker> findGadgets(const Decl *D) { static std::tuple<FixableGadgetList, WarningGadgetList, DeclUseTracker> findGadgets(const Decl *D) {

struct GadgetFinderCallback : MatchFinder::MatchCallback { struct GadgetFinderCallback : MatchFinder::MatchCallback {

GadgetList Gadgets; FixableGadgetList FixableGadgets;

WarningGadgetList WarningGadgets;

DeclUseTracker Tracker; DeclUseTracker Tracker;

void run(const MatchFinder::MatchResult &Result) override { void run(const MatchFinder::MatchResult &Result) override {

if (const auto *DRE = Result.Nodes.getNodeAs<DeclRefExpr>("any_dre")) { if (const auto *DRE = Result.Nodes.getNodeAs<DeclRefExpr>("any_dre")) {

Tracker.discoverUse(DRE); Tracker.discoverUse(DRE);

} }

if (const auto *DS = Result.Nodes.getNodeAs<DeclStmt>("any_ds")) { if (const auto *DS = Result.Nodes.getNodeAs<DeclStmt>("any_ds")) {

Tracker.discoverDecl(DS); Tracker.discoverDecl(DS);

} }

// Figure out which matcher we've found, and call the appropriate // Figure out which matcher we've found, and call the appropriate

// subclass constructor. // subclass constructor.

// FIXME: Can we do this more logarithmically? // FIXME: Can we do this more logarithmically?

#define GADGET(x) \ #define FIXABLE_GADGET(x) \

if (Result.Nodes.getNodeAs<Stmt>(#x)) { \

jkorous:

if (Result.Nodes.getNodeAs<Stmt>(#name)) { \

- FixableGadget.push_back(std::make_unique<name ## Gadget>(Result)); \

+ FixableGadgets.push_back(std::make_unique<name ## Gadget>(Result)); \

NEXT; \

FixableGadgets.push_back(std::make_unique<x ## Gadget>(Result)); \

return; \

}

#include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def"

#undef FIXABLE_GADGET

#define WARNING_GADGET(x) \

if (Result.Nodes.getNodeAs<Stmt>(#x)) { \ if (Result.Nodes.getNodeAs<Stmt>(#x)) { \

Gadgets.push_back(std::make_unique<x ## Gadget>(Result)); \ WarningGadgets.push_back(std::make_unique<x ## Gadget>(Result)); \

return; \ return; \

} }

#include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def" #include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def"

#undef GADGET #undef WARNING_GADGET

} }

}; };

MatchFinder M; MatchFinder M;

GadgetFinderCallback CB; GadgetFinderCallback CB;

// clang-format off // clang-format off

M.addMatcher( M.addMatcher(

stmt(forEveryDescendant( stmt(forEveryDescendant(

stmt(anyOf( stmt(anyOf(

// Add Gadget::matcher() for every gadget in the registry. // Add Gadget::matcher() for every gadget in the registry.

#define GADGET(x) \ #define FIXABLE_GADGET(x) \

x ## Gadget::matcher().bind(#x), x ## Gadget::matcher().bind(#x),

#include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def" #include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def"

#undef GADGET #undef FIXABLE_GADGET

#define WARNING_GADGET(x) \

x ## Gadget::matcher().bind(#x),

#include "clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def"

#undef WARNING_GADGET

// In parallel, match all DeclRefExprs so that to find out // In parallel, match all DeclRefExprs so that to find out

// whether there are any uncovered by gadgets. // whether there are any uncovered by gadgets.

declRefExpr(hasPointerType(), to(varDecl())).bind("any_dre"), declRefExpr(hasPointerType(), to(varDecl())).bind("any_dre"),

// Also match DeclStmts because we'll need them when fixing // Also match DeclStmts because we'll need them when fixing

// their underlying VarDecls that otherwise don't have // their underlying VarDecls that otherwise don't have

// any backreferences to DeclStmts. // any backreferences to DeclStmts.

declStmt().bind("any_ds") declStmt().bind("any_ds")

)) ))

// FIXME: Idiomatically there should be a forCallable(equalsNode(D)) // FIXME: Idiomatically there should be a forCallable(equalsNode(D))

// here, to make sure that the statement actually belongs to the // here, to make sure that the statement actually belongs to the

// function and not to a nested function. However, forCallable uses // function and not to a nested function. However, forCallable uses

// ParentMap which can't be used before the AST is fully constructed. // ParentMap which can't be used before the AST is fully constructed.

// The original problem doesn't sound like it needs ParentMap though, // The original problem doesn't sound like it needs ParentMap though,

// maybe there's a more direct solution? // maybe there's a more direct solution?

)), )),

&CB &CB

); );

// clang-format on // clang-format on

M.match(*D->getBody(), D->getASTContext()); M.match(*D->getBody(), D->getASTContext());

// Gadgets "claim" variables they're responsible for. Once this loop finishes, // Gadgets "claim" variables they're responsible for. Once this loop finishes,

// the tracker will only track DREs that weren't claimed by any gadgets, // the tracker will only track DREs that weren't claimed by any gadgets,

// i.e. not understood by the analysis. // i.e. not understood by the analysis.

for (const auto &G : CB.Gadgets) { for (const auto &G : CB.FixableGadgets) {

for (const auto *DRE : G->getClaimedVarUseSites()) { for (const auto *DRE : G->getClaimedVarUseSites()) {

CB.Tracker.claimUse(DRE); CB.Tracker.claimUse(DRE);

} }

return {std::move(CB.Gadgets), std::move(CB.Tracker)}; return {std::move(CB.FixableGadgets), std::move(CB.WarningGadgets), std::move(CB.Tracker)};

} }

void clang::checkUnsafeBufferUsage(const Decl *D, void clang::checkUnsafeBufferUsage(const Decl *D,

UnsafeBufferUsageHandler &Handler) { UnsafeBufferUsageHandler &Handler) {

std::cout << "REACHED HERE \n";

jkorous (Unsubmitted, Not Done): This looks like a debug print in the parent commit. We should remove it from this patch.

assert(D && D->getBody()); assert(D && D->getBody());

SmallSet<const VarDecl *, 8> WarnedDecls; SmallSet<const VarDecl *, 8> WarnedDecls;

auto [Gadgets, Tracker] = findGadgets(D); auto [FixableGadgets, WarningGadgets, Tracker] = findGadgets(D);

DenseMap<const VarDecl *, std::pair<std::vector<const FixableGadget *>,

DenseMap<const VarDecl *, std::vector<const Gadget *>> Map; std::vector<const WarningGadget *>>> Map;

int count = 0;

// First, let's sort gadgets by variables. If some gadgets cover more than one // First, let's sort gadgets by variables. If some gadgets cover more than one

// variable, they'll appear more than once in the map. // variable, they'll appear more than once in the map.

for (const auto &G : Gadgets) { for (const auto &G : FixableGadgets) {

DeclUseList DREs = G->getClaimedVarUseSites();

// Populate the map.

for (const DeclRefExpr *DRE : DREs) {

if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {

Map[VD].first.push_back(G.get());

}

for (const auto &G : WarningGadgets) {

DeclUseList DREs = G->getClaimedVarUseSites(); DeclUseList DREs = G->getClaimedVarUseSites();

// Populate the map. // Populate the map.

bool Pushed = false; bool Pushed = false;

jkorous (Unsubmitted, Not Done): (unrelated to this patch) In some follow-up patch we should rename `Pushed` to make the code easier to understand. IIUC it means "is there a variable declaration this warning relates to?".

for (const DeclRefExpr *DRE : DREs) { for (const DeclRefExpr *DRE : DREs) {

if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) { if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {

Map[VD].push_back(G.get()); Map[VD].second.push_back(G.get());

Pushed = true; Pushed = true;

} }

bool b = !Pushed && !G->isSafe(); if (!Pushed) {

std::cout << "count" << count << " " << b <<"\n";

count++;

if (!Pushed && !G->isSafe()) {

// We won't return to this gadget later. Emit the warning right away. // We won't return to this gadget later. Emit the warning right away.

Handler.handleUnsafeOperation(G->getBaseStmt()); Handler.handleUnsafeOperation(G->getBaseStmt());

continue; continue;

} }

Strategy S; Strategy S;

for (const auto &Item : Map) { for (const auto &Item : Map) {

const VarDecl *VD = Item.first; const VarDecl *VD = Item.first;

const std::vector<const Gadget *> &VDGadgets = Item.second; const std::vector<const FixableGadget *> &VDFixableGadgets = Item.second.first;

const std::vector<const WarningGadget *> &VDWarningGadgets = Item.second.second;

// If the variable has no unsafe gadgets, skip it entirely. // If the variable has no warning gadgets, skip it entirely.

if (!any_of(VDGadgets, [](const Gadget *G) { return !G->isSafe(); })) if (VDWarningGadgets.empty())

NoQ (Unsubmitted, Done): In your case *every* gadget in the list is a warning gadget. So you can simply check whether the list is empty.
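The point above can be illustrated with a minimal sketch. It assumes a hypothetical stand-in `WarningGadget` struct rather than the real class, and the two helper functions are invented for illustration only. Once a list can only hold warning gadgets, scanning it for "some unsafe gadget" reduces to a predicate that is always true, so the scan agrees with a plain emptiness check.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the real WarningGadget class.
struct WarningGadget {
  int Id;
};

// Old-style check: scan the list for an unsafe gadget. When the list holds
// only warning gadgets, every element satisfies the predicate.
bool hasUnsafeGadgetByScan(const std::vector<const WarningGadget *> &Gadgets) {
  return std::any_of(Gadgets.begin(), Gadgets.end(),
                     [](const WarningGadget *) { return true; });
}

// Simplified check: a non-empty list already implies an unsafe gadget.
bool hasUnsafeGadgetByEmptiness(
    const std::vector<const WarningGadget *> &Gadgets) {
  return !Gadgets.empty();
}
```

Both helpers return the same answer for any input, which is why the `any_of` call over `isSafe()` can be dropped in favour of `empty()`.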

continue; continue;

Optional<FixItList> Fixes = None; Optional<FixItList> Fixes = None;

// Avoid suggesting fixes if not all uses of the variable are identified // Avoid suggesting fixes if not all uses of the variable are identified

// as known gadgets. // as known gadgets.

// FIXME: Support parameter variables as well. // FIXME: Support parameter variables as well.

if (!Tracker.hasUnclaimedUses(VD) && VD->isLocalVarDecl()) { if (!Tracker.hasUnclaimedUses(VD) && VD->isLocalVarDecl()) {

// Choose the appropriate strategy. FIXME: We should try different // Choose the appropriate strategy. FIXME: We should try different

jkorous (Unsubmitted, Done): Is this a left-over debug print?

ziqingluo-90 (Unsubmitted, Done): I think an assertion `assert(false && "This should not be entered")` might be better. But why should this code be unreachable?

malavikasamak (Author, Unsubmitted, Not Done): This is a leftover debug message I added to test that the re-architecture works as intended. Currently, it should not be reachable as we don't have any fixable gadgets available.

// strategies. // strategies.

S.set(VD, Strategy::Kind::Span); S.set(VD, Strategy::Kind::Span);

// Check if it works. // Check if it works.

// FIXME: This isn't sufficient (or even correct) when a gadget has // FIXME: This isn't sufficient (or even correct) when a gadget has

// already produced a fixit for a different variable i.e. it was mentioned // already produced a fixit for a different variable i.e. it was mentioned

// in the map twice (or more). In such case the correct thing to do is // in the map twice (or more). In such case the correct thing to do is

// to undo the previous fix first, and then if we can't produce the new // to undo the previous fix first, and then if we can't produce the new

// fix for both variables, revert to the old one. // fix for both variables, revert to the old one.

Fixes = FixItList{}; Fixes = FixItList{};

for (const Gadget *G : VDGadgets) { for (const FixableGadget *G : VDFixableGadgets) {

Optional<FixItList> F = G->getFixits(S); Optional<FixItList> F = G->getFixits(S);

if (!F) { if (!F) {

Fixes = None; Fixes = None;

break; break;

} }

for (auto &&Fixit: *F) for (auto &&Fixit: *F)

Fixes->push_back(std::move(Fixit)); Fixes->push_back(std::move(Fixit));

} }

if (Fixes) { if (Fixes) {

// If we reach this point, the strategy is applicable. // If we reach this point, the strategy is applicable.

Handler.handleFixableVariable(VD, std::move(*Fixes)); Handler.handleFixableVariable(VD, std::move(*Fixes));

} else { } else {

// The strategy has failed. Emit the warning without the fixit. // The strategy has failed. Emit the warning without the fixit.

S.set(VD, Strategy::Kind::Wontfix); S.set(VD, Strategy::Kind::Wontfix);

for (const Gadget *G : VDGadgets) { for (const WarningGadget *G : VDWarningGadgets) {

if (!G->isSafe()) {

Handler.handleUnsafeOperation(G->getBaseStmt()); Handler.handleUnsafeOperation(G->getBaseStmt());

} }

}
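As a rough, self-contained sketch of the per-variable flow in the new side of the diff (not the actual Clang code): gadgets are grouped by variable into a pair of lists, variables without warning gadgets are skipped, and a variable is reported as fixable only if all of its uses are claimed and every fixable gadget yields a fix. The `FixableGadget` and `WarningGadget` structs, the `Report` type, the `analyze` function, and the string-based modelling of variables and statements are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real gadget classes; variables and
// statements are modelled as plain strings.
struct FixableGadget {
  std::string Var;
  bool CanFix;
  // Returns a fix-it list, or std::nullopt when no fix is available.
  std::optional<std::vector<std::string>> getFixits() const {
    if (!CanFix)
      return std::nullopt;
    return std::vector<std::string>{"fix(" + Var + ")"};
  }
};

struct WarningGadget {
  std::string Var;
  std::string Stmt;
};

struct Report {
  std::vector<std::string> FixedVars;   // variables reported with fix-its
  std::vector<std::string> WarnedStmts; // statements warned without a fix
};

Report analyze(const std::vector<FixableGadget> &Fixables,
               const std::vector<WarningGadget> &Warnings,
               const std::set<std::string> &VarsWithUnclaimedUses) {
  using GadgetLists = std::pair<std::vector<const FixableGadget *>,
                                std::vector<const WarningGadget *>>;
  // Sort gadgets by variable: .first holds fixable, .second holds warning.
  std::map<std::string, GadgetLists> Map;
  for (const FixableGadget &G : Fixables)
    Map[G.Var].first.push_back(&G);
  for (const WarningGadget &G : Warnings)
    Map[G.Var].second.push_back(&G);

  Report R;
  for (const auto &[Var, Lists] : Map) {
    const auto &[VarFixables, VarWarnings] = Lists;
    // A variable with no warning gadgets is skipped entirely.
    if (VarWarnings.empty())
      continue;

    std::optional<std::vector<std::string>> Fixes;
    // Only attempt a fix when every use of the variable is claimed.
    if (VarsWithUnclaimedUses.count(Var) == 0) {
      Fixes.emplace();
      for (const FixableGadget *G : VarFixables) {
        std::optional<std::vector<std::string>> F = G->getFixits();
        if (!F) {
          Fixes.reset(); // one failing gadget invalidates the whole fix
          break;
        }
        Fixes->insert(Fixes->end(), F->begin(), F->end());
      }
    }

    if (Fixes) {
      R.FixedVars.push_back(Var);
    } else {
      // The strategy failed: warn on each unsafe statement without a fix-it.
      for (const WarningGadget *G : VarWarnings)
        R.WarnedStmts.push_back(G->Stmt);
    }
  }
  return R;
}
```

In this sketch a variable with warning gadgets but no fixable gadgets and no unclaimed uses ends up with an empty fix list and is still reported as fixable, which mirrors the control flow shown in the diff.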

This is an archive of the discontinued LLVM Phabricator instance.

[-Wunsafe-buffer-usage] Safe-buffers re-architecture to introduce Fixable gadgets
ClosedPublic


Revision Contents

Diff 484365

clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def

clang/lib/Analysis/UnsafeBufferUsage.cpp
