This is an archive of the discontinued LLVM Phabricator instance.

Differential D100967

[dfsan] Track origin at loads
ClosedPublic

Authored by stephan.yichao.zhao on Apr 21 2021, 9:27 AM.

Details

Reviewers

gbalats

Commits

rG7fdf27096558: [dfsan] Track origin at loads

Summary

The first version of origin tracking tracks only memory stores. Although
this is sufficient for understanding correct flows, it makes it hard to figure
out where an undefined value is read from. To find where an undefined value
is read, we still have to do a reverse binary search from the last store in
the chain, adding printing and logging at possible code paths. This is
quite inefficient.

Tracking memory load instructions can help in this case. The main issues with
tracking loads are performance and code size overheads.

With tracking only stores, the code size overhead is 38%,
memory overhead is 1x, and CPU overhead is 3x. In practice the number of loads
is much larger than the number of stores, so both code size and CPU overhead
increase. The first blocker is code size overhead: linking fails if we inline
the load tracking. The workaround is to use external function calls to
propagate metadata. This is also the workaround ASan uses. With load tracking
enabled, the CPU overhead is ~10x. This is a trade-off between debuggability
and performance, and will be used only when debugging cases where tracking
only stores is not enough.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

stephan.yichao.zhao created this revision. Apr 21 2021, 9:27 AM

Herald added a subscriber: hiraditya. Apr 21 2021, 9:27 AM

stephan.yichao.zhao requested review of this revision. Apr 21 2021, 9:27 AM

Herald added projects: Restricted Project, Restricted Project. Apr 21 2021, 9:27 AM

Herald added subscribers: llvm-commits, Restricted Project.

stephan.yichao.zhao edited the summary of this revision. Apr 21 2021, 9:28 AM

Harbormaster completed remote builds in B99890: Diff 339099. Apr 21 2021, 10:26 AM

gbalats added inline comments. Apr 21 2021, 11:13 AM

compiler-rt/test/dfsan/origin_track_ld.c
3
4	This is redundant as it's the default.
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
256–263	Using different integer values to encode the level of tracking is hard to understand without looking at this exact comment right here. Why can't we use an enum instead with descriptive names? E.g., enum OriginTrackingLevel { None, StoresOnly, LoadsAndStores }; https://llvm.org/docs/CommandLine.html#selecting-an-alternative-from-a-set-of-possibilities
645	You mean InstAlignment? By the way, for documenting specific arguments of functions, I think the special `\param` Doxygen syntax would be clearer. https://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments
648
711
747	What do you mean by this? The description doesn't indicate what the difference from `loadShadowOrigin` is. Maybe `loadShadowOriginSansLoadTracking`?
2443–2444	This function should be called only when tracking origins. Why not make that an assertion, instead of the if-stmt?
llvm/test/Instrumentation/DataFlowSanitizer/origin_track_load.ll
2–3

updated

llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
256–263	Thank you. This enum is helpful. It uses optimization levels as an example. If we consider our use case as a set of debug levels, it would work: (1) track the basic traces; (2) track more chains to get more details, but slowly. In the next step, maybe callsites can also be added to this option. I updated the comments to mention this. Changing this from int to enum will change existing test cases. If we want to use an enum, I would prefer doing so in a different CL. MSan's msan-track-origins is defined like dfsan-track-origins, with 0, 1 and 2 to control different levels. At least they are consistent for the time being.
645	Thank you,

gbalats added inline comments. Apr 21 2021, 2:45 PM

llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
256–263	Addressing this in separate change SGTM. Thanks!
645	I'm not entirely sure you can use `\param` inlined within the text. Or, if you have to add a separate line per parameter. Most cases I see do the former. Please check.

Harbormaster completed remote builds in B100090: Diff 339373. Apr 21 2021, 3:02 PM

updated

gbalats accepted this revision. Apr 21 2021, 5:08 PM

This revision is now accepted and ready to land. Apr 21 2021, 5:08 PM

Harbormaster completed remote builds in B100122: Diff 339426. Apr 21 2021, 5:42 PM

Closed by commit rG7fdf27096558: [dfsan] Track origin at loads (authored by Jianzhou Zhao <jianzhouzh@google.com>). Apr 22 2021, 9:26 AM

This revision was automatically updated to reflect the committed changes.

Jianzhou Zhao <jianzhouzh@google.com> added a commit: rG7fdf27096558: [dfsan] Track origin at loads.

Revision Contents

Path	Size
compiler-rt/lib/dfsan/dfsan.cpp	20 lines
compiler-rt/test/dfsan/origin_track_ld.c	31 lines
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp	82 lines
llvm/test/Instrumentation/DataFlowSanitizer/basic.ll	1 line
llvm/test/Instrumentation/DataFlowSanitizer/origin_track_load.ll	32 lines

Diff 339678

compiler-rt/lib/dfsan/dfsan.cpp

@@ 553 lines hidden @@ for (; size != 0; --size, ++labelp) {
      // the amount of real memory used by large programs.
      if (label == *labelp)
        continue;

      *labelp = label;
    }
  }

+ #define RET_CHAIN_ORIGIN(id) \
+   GET_CALLER_PC_BP_SP; \
+   (void)sp; \
+   GET_STORE_STACK_TRACE_PC_BP(pc, bp); \
+   return ChainOrigin(id, &stack);
+
  // Return a new origin chain with the previous ID id and the current stack
  // trace.
  extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_origin
  __dfsan_chain_origin(dfsan_origin id) {
-   GET_CALLER_PC_BP_SP;
-   (void)sp;
-   GET_STORE_STACK_TRACE_PC_BP(pc, bp);
-   return ChainOrigin(id, &stack);
+   RET_CHAIN_ORIGIN(id)
+ }
+
+ // Return a new origin chain with the previous ID id and the current stack
+ // trace if the label is tainted.
+ extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_origin
+ __dfsan_chain_origin_if_tainted(dfsan_label label, dfsan_origin id) {
+   if (!label)
+     return id;
+   RET_CHAIN_ORIGIN(id)
  }

  // Copy or move the origins of the len bytes from src to dst.
  extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_mem_origin_transfer(
      const void *dst, const void *src, uptr len) {
    if (src == dst)
      return;
    GET_CALLER_PC_BP;
@@ 432 lines hidden @@
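
To make the control flow of the new runtime entry point concrete, here is a minimal, self-contained sketch. It is not the compiler-rt code: the type aliases, the `g_prev` table, and the helpers `chainOrigin`, `chainOriginIfTainted`, and `chainLength` are hypothetical stand-ins, and the stack trace that the real `ChainOrigin` records is left out. It only models what the diff above shows: an untainted label returns the incoming origin unchanged, while a tainted label extends the chain by one node.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the runtime types; the real definitions live in
// compiler-rt and are not shown in this diff.
using dfsan_label = uint16_t;
using dfsan_origin = uint32_t;

// Toy origin table: g_prev[i] is the predecessor of chain node (i + 1).
// Origin id 0 is reserved here to mean "no origin".
static std::vector<dfsan_origin> g_prev;

// Mock of ChainOrigin: appends a node whose predecessor is `id` and returns
// the id of the new node. The real function also stores a stack trace.
static dfsan_origin chainOrigin(dfsan_origin id) {
  g_prev.push_back(id);
  return static_cast<dfsan_origin>(g_prev.size());
}

// Models __dfsan_chain_origin_if_tainted: a zero (untainted) label leaves the
// origin untouched, so loads of clean data add no chain node.
static dfsan_origin chainOriginIfTainted(dfsan_label label, dfsan_origin id) {
  if (!label)
    return id;
  return chainOrigin(id);
}

// Counts the nodes reachable by walking the chain backwards from `id`.
static int chainLength(dfsan_origin id) {
  int n = 0;
  while (id != 0) {
    assert(static_cast<std::size_t>(id) <= g_prev.size() && "unknown origin");
    ++n;
    id = g_prev[id - 1];
  }
  return n;
}
```

With these mocks, `chainOriginIfTainted(0, o)` returns `o` without adding a node, whereas `chainOriginIfTainted(8, o)` returns a fresh id whose predecessor is `o`.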

compiler-rt/test/dfsan/origin_track_ld.c

This file was added.

// RUN: %clang_dfsan -gmlt -mllvm -dfsan-track-origins=2 -mllvm -dfsan-fast-16-labels=true %s -o %t && \
// RUN: %run %t > %t.out 2>&1
// RUN: FileCheck %s < %t.out

gbalats · Unsubmitted · Done

  // RUN: %clang_dfsan -gmlt -mllvm -dfsan-track-origins=2 -mllvm -dfsan-fast-16-labels=true %s -o %t && \
- // RUN: %run %t >%t.out 2>&1
+ // RUN: %run %t > %t.out 2>&1
  // RUN: FileCheck %s --check-prefix=CHECK < %t.out

gbalats · Unsubmitted · Done

  // RUN: %run %t >%t.out 2>&1
- // RUN: FileCheck %s --check-prefix=CHECK < %t.out
+ // RUN: FileCheck %s < %t.out
  // REQUIRES: x86_64-target-arch

This is redundant as it's the default.

// REQUIRES: x86_64-target-arch

#include <sanitizer/dfsan_interface.h>

__attribute__((noinline)) uint64_t foo(uint64_t a, uint64_t b) { return a + b; }

int main(int argc, char *argv[]) {
  uint64_t a = 10;
  uint64_t b = 20;
  dfsan_set_label(8, &a, sizeof(a));
  uint64_t c = foo(a, b);
  dfsan_print_origin_trace(&c, NULL);
}

// CHECK: Taint value 0x8 {{.*}} origin tracking ()
// CHECK: Origin value: {{.*}}, Taint value was stored to memory at
// CHECK: #0 {{.*}} in main {{.*}}origin_track_ld.c:[[@LINE-6]]

// CHECK: Origin value: {{.*}}, Taint value was stored to memory at
// CHECK: #0 {{.*}} in dfs$foo {{.*}}origin_track_ld.c:[[@LINE-15]]
// CHECK: #1 {{.*}} in main {{.*}}origin_track_ld.c:[[@LINE-10]]

// CHECK: Origin value: {{.*}}, Taint value was stored to memory at
// CHECK: #0 {{.*}} in main {{.*}}origin_track_ld.c:[[@LINE-13]]

// CHECK: Origin value: {{.*}}, Taint value was created at
// CHECK: #0 {{.*}} in main {{.*}}origin_track_ld.c:[[@LINE-17]]

llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp

@@ 247 lines hidden @@
  // TODO: This default value follows MSan. DFSan may use a different value.
  static cl::opt<int> ClInstrumentWithCallThreshold(
      "dfsan-instrument-with-call-threshold",
      cl::desc("If the function being instrumented requires more than "
               "this number of origin stores, use callbacks instead of "
               "inline checks (-1 means never use callbacks)."),
      cl::Hidden, cl::init(3500));

  // Controls how to track origins.
  // * 0: do not track origins.
  // * 1: track origins at memory store operations.
- // * 2: TODO: track origins at memory store operations and callsites.
+ // * 2: track origins at memory load and store operations.
+ //      TODO: track callsites.
  static cl::opt<int> ClTrackOrigins("dfsan-track-origins",
                                     cl::desc("Track origins of labels"),
                                     cl::Hidden, cl::init(0));

gbalats · Unsubmitted · Not Done

Using different integer values to encode the level of tracking is hard to understand without looking at this exact comment right here. Why can't we use an enum instead with descriptive names?

E.g.,

enum OriginTrackingLevel {
  None,
  StoresOnly,
  LoadsAndStores
};

https://llvm.org/docs/CommandLine.html#selecting-an-alternative-from-a-set-of-possibilities

stephan.yichao.zhao · Author · Unsubmitted · Done

Thank you.

This enum is helpful. It uses optimization levels as an example.
If we consider our use case as a set of debug levels, it would work:

1. track the basic traces;
2. track more chains to get more details, but slowly.

In the next step, maybe callsites can also be added to this option. I updated the comments to mention this.

Changing this from int to enum will change existing test cases. If we want to use an enum, I would prefer doing so in a different CL.

MSan's msan-track-origins is defined like dfsan-track-origins, with 0, 1 and 2 to control different levels.
At least they are consistent for the time being.

gbalats · Unsubmitted · Done

Addressing this in a separate change SGTM. Thanks!
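
As an illustration of the reviewer's suggestion, the sketch below gives the three integer levels descriptive names without changing the command-line option itself. `OriginTrackingLevel` and its enumerators come from the comment above; `toOriginTrackingLevel`, `tracksStores`, and `tracksLoads` are hypothetical helpers. None of this is part of the patch, which keeps the `cl::opt<int>` form.

```cpp
#include <cassert>

// Names taken from the review comment; the numeric values mirror the
// documented meaning of -dfsan-track-origins=0/1/2.
enum class OriginTrackingLevel : int {
  None = 0,            // do not track origins
  StoresOnly = 1,      // track origins at memory store operations
  LoadsAndStores = 2,  // track origins at memory load and store operations
};

// Hypothetical helper: converts the raw integer flag value to the enum.
// Values outside 0..2 are treated as a programming error in this sketch.
inline OriginTrackingLevel toOriginTrackingLevel(int raw) {
  assert(raw >= 0 && raw <= 2 && "unsupported origin tracking level");
  return static_cast<OriginTrackingLevel>(raw);
}

// Hypothetical predicates that replace comparisons against magic numbers.
inline bool tracksStores(OriginTrackingLevel level) {
  return level != OriginTrackingLevel::None;
}

inline bool tracksLoads(OriginTrackingLevel level) {
  return level == OriginTrackingLevel::LoadsAndStores;
}
```

A call site could then read `tracksLoads(level)` where it would otherwise compare the option against the literal 2, and the existing tests that pass integer values would keep working because the flag's spelling is unchanged.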

  static StringRef getGlobalTypeString(const GlobalValue &G) {
    // Types of GlobalVariables are always pointer types.
    Type *GType = G.getValueType();
    // For now we support excluding struct types only.
    if (StructType *SGType = dyn_cast<StructType>(GType)) {
      if (!SGType->isLiteral())
        return SGType->getName();
@@ 177 lines hidden @@ class DataFlowSanitizer {
    FunctionType *DFSanUnimplementedFnTy;
    FunctionType *DFSanSetLabelFnTy;
    FunctionType *DFSanNonzeroLabelFnTy;
    FunctionType *DFSanVarargWrapperFnTy;
    FunctionType *DFSanCmpCallbackFnTy;
    FunctionType *DFSanLoadStoreCallbackFnTy;
    FunctionType *DFSanMemTransferCallbackFnTy;
    FunctionType *DFSanChainOriginFnTy;
+   FunctionType *DFSanChainOriginIfTaintedFnTy;
    FunctionType *DFSanMemOriginTransferFnTy;
    FunctionType *DFSanMaybeStoreOriginFnTy;
    FunctionCallee DFSanUnionFn;
    FunctionCallee DFSanCheckedUnionFn;
    FunctionCallee DFSanUnionLoadFn;
    FunctionCallee DFSanUnionLoadFastLabelsFn;
    FunctionCallee DFSanLoadLabelAndOriginFn;
    FunctionCallee DFSanUnimplementedFn;
    FunctionCallee DFSanSetLabelFn;
    FunctionCallee DFSanNonzeroLabelFn;
    FunctionCallee DFSanVarargWrapperFn;
    FunctionCallee DFSanLoadCallbackFn;
    FunctionCallee DFSanStoreCallbackFn;
    FunctionCallee DFSanMemTransferCallbackFn;
    FunctionCallee DFSanCmpCallbackFn;
    FunctionCallee DFSanChainOriginFn;
+   FunctionCallee DFSanChainOriginIfTaintedFn;
    FunctionCallee DFSanMemOriginTransferFn;
    FunctionCallee DFSanMaybeStoreOriginFn;
    SmallPtrSet<Value *, 16> DFSanRuntimeFunctions;
    MDNode *ColdCallWeights;
    MDNode *OriginStoreWeights;
    DFSanABIList ABIList;
    DenseMap<Value *, Function *> UnwrappedFnMap;
    AttrBuilder ReadOnlyNoneAttrs;

@@ 152 lines hidden @@ struct DFSanFunction {
    /// Generates IR to compute the union of the two given shadows, inserting it
    /// before Pos. The combined value is with primitive type.
    Value *combineShadows(Value *V1, Value *V2, Instruction *Pos);

    /// Combines the shadow values of V1 and V2, then converts the combined value
    /// with primitive type into a shadow value with the original type T.
    Value *combineShadowsThenConvert(Type *T, Value *V1, Value *V2,
                                     Instruction *Pos);
    Value *combineOperandShadows(Instruction *Inst);

-   std::pair<Value *, Value *> loadShadowOrigin(Value *ShadowAddr, uint64_t Size,
+   /// Generates IR to load shadow and origin corresponding to bytes [\p
+   /// Addr, \p Addr + \p Size), where addr has alignment \p

gbalats · Unsubmitted · Done

You mean InstAlignment? By the way, for documenting specific arguments of functions, I think the special \param Doxygen syntax would be clearer.

https://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments

stephan.yichao.zhao · Author · Unsubmitted · Done

Thank you.

gbalats · Unsubmitted · Done

I'm not entirely sure whether you can use \param inline within the text, or whether you have to add a separate line per parameter. Most cases I see do the former.
Please check.
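
For reference, the LLVM coding standards linked above distinguish the two commands: `\p` marks a parameter name mentioned inside running text, while `\param` starts a separate description of one parameter. The small sketch below shows both on a hypothetical helper; `rangeEnd` is invented for illustration and is not part of this patch.

```cpp
#include <cassert>
#include <cstdint>

/// Returns the exclusive end of the byte range [\p Addr, \p Addr + \p Size).
///
/// \param Addr start address of the range.
/// \param Size number of bytes in the range; must be nonzero.
/// \returns \p Addr + \p Size.
inline uint64_t rangeEnd(uint64_t Addr, uint64_t Size) {
  assert(Size != 0 && "empty range");
  return Addr + Size;
}
```

The first line uses `\p` three times inside one sentence, which matches how the doc comment in this patch refers to `Addr`, `Size`, and `InstAlignment`; the `\param` lines are the per-parameter form the reviewer asks about.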

+   /// InstAlignment, and take the union of each of those shadows. The returned
+   /// shadow always has primitive type.
+   ///

gbalats · Unsubmitted · Done

  /// those shadows. The returned shadow always has primitive type.
  ///
- /// When enabling tracking loads, the returned origin is a chain at the
+ /// When tracking loads is enabled, the returned origin is a chain at the
  /// current stack if the returned shadow shadow is tainted.

+   /// When tracking loads is enabled, the returned origin is a chain at the
+   /// current stack if the returned shadow is tainted.
+   std::pair<Value *, Value *> loadShadowOrigin(Value *Addr, uint64_t Size,
                                                 Align InstAlignment,
                                                 Instruction *Pos);
    void storePrimitiveShadowOrigin(Value *Addr, uint64_t Size,
                                    Align InstAlignment, Value *PrimitiveShadow,
                                    Value *Origin, Instruction *Pos);
    /// Applies PrimitiveShadow to all primitive subtypes of T, returning
    /// the expanded shadow value.
    ///
    /// EFP({T1,T2, ...}, PS) = {EFP(T1,PS),EFP(T2,PS),...}
    /// EFP([n x T], PS) = [n x EFP(T,PS)]

@@ 39 lines hidden @@ private:
    Align getOriginAlign(Align InstAlignment);

    /// Because 4 contiguous bytes share one 4-byte origin, the most accurate load
    /// is __dfsan_load_label_and_origin. This function returns the union of all
    /// labels and the origin of the first taint label. However this is an
    /// additional call with many instructions. To ensure common cases are fast,
    /// checks if it is possible to load labels and origins without using the
    /// callback function.
+   ///
+   /// When enabling tracking load instructions, we always use

gbalats · Unsubmitted · Not Done

  /// callback function.
  ///
- /// When enabling tracking load instructions, we always use
+ /// When tracking load instructions is enabled, we always use
  /// __dfsan_load_label_and_origin to reduce code size.

+   /// __dfsan_load_label_and_origin to reduce code size.
    bool useCallbackLoadLabelAndOrigin(uint64_t Size, Align InstAlignment);

    /// Returns a chain at the current stack with previous origin V.
    Value *updateOrigin(Value *V, IRBuilder<> &IRB);

+   /// Returns a chain at the current stack with previous origin V if Shadow is
+   /// tainted.
+   Value *updateOriginIfTainted(Value *Shadow, Value *Origin, IRBuilder<> &IRB);

    /// Creates an Intptr = Origin | Origin << 32 if Intptr's size is 64. Returns
    /// Origin otherwise.
    Value *originToIntptr(IRBuilder<> &IRB, Value *Origin);

    /// Stores Origin into the address range [StoreOriginAddr, StoreOriginAddr +
    /// Size).
    void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *StoreOriginAddr,
                     uint64_t StoreOriginSize, Align Alignment);

    /// Stores Origin in terms of its Shadow value.
    /// * Do not write origins for zero shadows because we do not trace origins
    ///   for untainted sinks.
    /// * Use __dfsan_maybe_store_origin if there are too many origin store
    ///   instrumentations.
    void storeOrigin(Instruction *Pos, Value *Addr, uint64_t Size, Value *Shadow,
                     Value *Origin, Value *StoreOriginAddr, Align InstAlignment);

    /// Convert a scalar value to an i1 by comparing with 0.
    Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &Name = "");

    bool shouldInstrumentWithCall();

+   /// Generates IR to load shadow and origin corresponding to bytes [\p
+   /// Addr, \p Addr + \p Size), where addr has alignment \p
+   /// InstAlignment, and take the union of each of those shadows. The returned
+   /// shadow always has primitive type.

gbalats · Unsubmitted · Done

What do you mean by this? The description doesn't indicate what the difference from loadShadowOrigin is. Maybe loadShadowOriginSansLoadTracking?

+   std::pair<Value *, Value *>
+   loadShadowOriginSansLoadTracking(Value *Addr, uint64_t Size,
+                                    Align InstAlignment, Instruction *Pos);
    int NumOriginStores = 0;
  };

  class DFSanVisitor : public InstVisitor<DFSanVisitor> {
  public:
    DFSanFunction &DFSF;

    DFSanVisitor(DFSanFunction &DFSF) : DFSF(DFSF) {}
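
The doc comment on `originToIntptr` in the hunk above states a small formula, `Origin | Origin << 32`, that is easy to check on plain integers. The sketch below is a model only: the real function emits IR through an `IRBuilder`, whereas `originToIntptrModel` and its `IntptrBits` parameter are hypothetical and operate on ordinary values. Since the comments above say that 4 contiguous bytes share one 4-byte origin, the duplication presumably lets one pointer-sized store fill two adjacent origin slots; that reading is an assumption, not something the diff states.

```cpp
#include <cassert>
#include <cstdint>

// Plain-integer model of the documented formula. IntptrBits stands in for the
// width of the target's pointer-sized integer type (an assumption of this
// sketch; only 32 and 64 are handled).
inline uint64_t originToIntptrModel(uint32_t Origin, unsigned IntptrBits) {
  assert((IntptrBits == 32 || IntptrBits == 64) && "unsupported width");
  if (IntptrBits == 64) {
    // Widen first so the shift happens in 64 bits, then place one copy of the
    // origin in each 32-bit half.
    const uint64_t Wide = Origin;
    return Wide | (Wide << 32);
  }
  // On a 32-bit target the origin is returned unchanged.
  return Origin;
}
```

For a 64-bit width, an origin of `0x12345678` becomes `0x1234567812345678`; for a 32-bit width it stays `0x12345678`.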

@@ 372 lines hidden @@ DFSanNonzeroLabelFnTy =
        FunctionType::get(Type::getVoidTy(*Ctx), None, /*isVarArg=*/false);
    DFSanVarargWrapperFnTy = FunctionType::get(
        Type::getVoidTy(*Ctx), Type::getInt8PtrTy(*Ctx), /*isVarArg=*/false);
    DFSanCmpCallbackFnTy =
        FunctionType::get(Type::getVoidTy(*Ctx), PrimitiveShadowTy,
                          /*isVarArg=*/false);
    DFSanChainOriginFnTy =
        FunctionType::get(OriginTy, OriginTy, /*isVarArg=*/false);
+   Type *DFSanChainOriginIfTaintedArgs[2] = {PrimitiveShadowTy, OriginTy};
+   DFSanChainOriginIfTaintedFnTy = FunctionType::get(
+       OriginTy, DFSanChainOriginIfTaintedArgs, /*isVarArg=*/false);
    Type *DFSanMaybeStoreOriginArgs[4] = {IntegerType::get(*Ctx, ShadowWidthBits),
                                          Int8Ptr, IntptrTy, OriginTy};
    DFSanMaybeStoreOriginFnTy = FunctionType::get(
        Type::getVoidTy(*Ctx), DFSanMaybeStoreOriginArgs, /*isVarArg=*/false);
    Type *DFSanMemOriginTransferArgs[3] = {Int8Ptr, Int8Ptr, IntptrTy};
    DFSanMemOriginTransferFnTy = FunctionType::get(
        Type::getVoidTy(*Ctx), DFSanMemOriginTransferArgs, /*isVarArg=*/false);
    Type *DFSanLoadStoreCallbackArgs[2] = {PrimitiveShadowTy, Int8Ptr};

@@ 217 lines hidden @@ void DataFlowSanitizer::initializeRuntimeFunctions(Module &M) {
    {
      AttributeList AL;
      AL = AL.addParamAttribute(M.getContext(), 0, Attribute::ZExt);
      AL = AL.addAttribute(M.getContext(), AttributeList::ReturnIndex,
                           Attribute::ZExt);
      DFSanChainOriginFn = Mod->getOrInsertFunction("__dfsan_chain_origin",
                                                    DFSanChainOriginFnTy, AL);
    }
+   {
+     AttributeList AL;
+     AL = AL.addParamAttribute(M.getContext(), 0, Attribute::ZExt);
+     AL = AL.addParamAttribute(M.getContext(), 1, Attribute::ZExt);
+     AL = AL.addAttribute(M.getContext(), AttributeList::ReturnIndex,
+                          Attribute::ZExt);
+     DFSanChainOriginIfTaintedFn = Mod->getOrInsertFunction(
+         "__dfsan_chain_origin_if_tainted", DFSanChainOriginIfTaintedFnTy, AL);
+   }
    DFSanMemOriginTransferFn = Mod->getOrInsertFunction(
        "__dfsan_mem_origin_transfer", DFSanMemOriginTransferFnTy);

    {
      AttributeList AL;
      AL = AL.addParamAttribute(M.getContext(), 0, Attribute::ZExt);
      AL = AL.addParamAttribute(M.getContext(), 3, Attribute::ZExt);
      DFSanMaybeStoreOriginFn = Mod->getOrInsertFunction(
@@ 23 lines hidden @@ DFSanRuntimeFunctions.insert(
        DFSanStoreCallbackFn.getCallee()->stripPointerCasts());