This is an archive of the discontinued LLVM Phabricator instance.

Differential D20483

[esan] EfficiencySanitizer working set tool fastpath
Closed, Public

Authored by bruening on May 20 2016, 1:17 PM.

Details

Reviewers

aizatsky

Commits

rG5662b939856d: [esan|wset] EfficiencySanitizer working set tool fastpath
rL270640: [esan|wset] EfficiencySanitizer working set tool fastpath

Summary

Adds fastpath instrumentation for esan's working set tool. The
instrumentation for aligned loads and stores consists of inlined writes
to shadow memory bits for each accessed cache line.

Adds a basic test for this instrumentation.
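
As a rough illustration of the "shadow memory per cache line" idea, the sketch below applies the mapping stated in the `appToShadow` routine of this revision's diff (`Shadow = ((App & Mask) + Offs) >> Scale`) to plain integers. The constants are copied from the diff; the helper `sameShadowByte` is hypothetical and only shows that addresses within one 64-byte line map to the same shadow byte at scale 6. It is not the instrumentation itself, which emits IR.

```cpp
#include <cassert>
#include <cstdint>

// Constants copied from the diff in this revision (EfficiencySanitizer.cpp).
constexpr uint64_t ShadowMask = 0x00000fffffffffffULL;
constexpr uint64_t ShadowOffs[3] = { // Indexed by scale
    0x0000130000000000ULL,
    0x0000220000000000ULL,
    0x0000440000000000ULL,
};

// Shadow = ((App & Mask) + Offs) >> Scale, computed on integers rather
// than by building IR.
uint64_t appToShadow(uint64_t App, int Scale) {
  assert(Scale >= 0 && Scale < 16);
  uint64_t Offs = Scale <= 2 ? ShadowOffs[Scale] : (ShadowOffs[0] << Scale);
  return ((App & ShadowMask) + Offs) >> Scale;
}

// Hypothetical helper: do two application addresses share one shadow byte
// under the working set tool's 64B:1B scale (>> 6)?
bool sameShadowByte(uint64_t A, uint64_t B) {
  return appToShadow(A, 6) == appToShadow(B, 6);
}
```

With scale 6, `sameShadowByte(0x1000, 0x103f)` holds because both addresses lie in the same 64-byte line, while `0x1040` starts the next line and maps to the next shadow byte.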

Event Timeline

bruening updated this revision to Diff 57976. May 20 2016, 1:17 PM

bruening retitled this revision to [esan] EfficiencySanitizer working set tool fastpath.

bruening updated this object.

bruening added a reviewer: aizatsky.

bruening added subscribers: llvm-commits, eugenis, kcc and 2 others.

aizatsky added inline comments. May 20 2016, 1:53 PM

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
71	uh-oh. These are really magical constants copied. Unfortunately there's no proper way to share code between runtime / llvm. I talked with our cube and the idea is to introduce a separate header (e.g. esan_compile_interface.h) and put all shareable constants there. At least that would make it clear what you are sharing. At best - you can manually synchronize the file by copying. WDYT?
413	Document what code are you generating - it is hard to read IR building.

bruening added inline comments. May 20 2016, 2:58 PM

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
71	Yes, this lack of ability to share is something I was surprised at in the other sanitizers. I talked to kcc about it early on when starting this project and it sounded like copying a handful of constants was something everyone was willing to live with and that they did not plan to push for any better approach, so I followed suit here. Sure, a separate header sounds good. Is there a plan to do this for the other sanitizers? I assume the header would go into include/llvm/Transforms/ and should probably follow the CamelCase naming.

aizatsky added inline comments. May 20 2016, 3:06 PM

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
71	> Sure, a separate header sounds good. Is there a plan to do this for the other sanitizers?
	Everyone agrees that the way it is now is very painful and we need to try this. If a separate header works for you - we would definitely steer others in the same direction.
	> I assume the header would go into include/llvm/Transforms/ and should probably follow the CamelCase naming.
	Yes. Even if you don't decide to copy it to runtime, it would clearly document all inter-dependencies and will make updating easier.

I would prefer having a working tool before adding the fast-paths.

Thanks for working on this,

Filipe

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
70	Is the `4` on purpose, with three constants? Either way, I'd prefer to have no bound on the declaration and have an explicit initialization (with the last one as `0`), to explicitly show it's on purpose (or keep the size and add the extra `0`. If we had a lot of zeros at the end, then I'd be more ok with having the bounds + "hidden" zeros.
70	I'd prefer to have the tool name (or some abbreviation if it's too long) in the shadow scales/offsets, if it's expected to change between tools. If it's the same for both the tools we have now (cache frag + working set), then I'm ok with leaving it as is (+ small comment saying it's those values for both those tools) for now.
71	> Yes. Even if you don't decide to copy it to runtime, it would clearly document all inter-dependencies and will make updating easier.
	I don't like that half-way thing. Please, if you're making it a header signalizing that the stuff there is shared, then do it on both places, not just one.
108	`kDefaultCacheFragToolShadowScale` or something.
215	I'd prefer to construct the object in an invalid state and have `doInitialization` always set `ShadowScale`, etc. That way we'd only need to look at this if we're experimenting with different shadow scales, etc. Instead of having to track down the initial setting for the default tool, or the other tools. We'd have much better code locality this way. P.S: `s/6/kDefaultWorkingSetToolShadowScale/` or something.
322	I'm not sure this is appropriate. Shouldn't you be setting `Alignment = 1` if it's 0 (which guarantees no alignment)? Can loads/stores ever have an alignment that is 0? If they can (and do), then setting it here isn't too bad. But otherwise, I'd have an `assert(Alignment != 0)` here, and just set `Alignment` to 1 above, where you're setting it to 0, instead of setting it twice.
422	Should you have a debug printf or some statistics on skipped addresses?
437	Don't you need an atomic load? At the very least, if the original instruction was atomic?
446	Same for atomic store.
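
The question on line 70 about a bound of `4` with three initializers comes down to C++ aggregate initialization: elements without an initializer are zero-initialized. A minimal sketch of the two spellings being compared follows; the array names and values are placeholders, not the ones in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Declared bound of 4 with three initializers: the fourth element is
// zero-initialized without appearing in the source ("hidden" zero).
static const uint64_t HiddenZero[4] = {0x13, 0x22, 0x44};

// No declared bound, trailing zero spelled out, as the reviewer suggests.
static const uint64_t ExplicitZero[] = {0x13, 0x22, 0x44, 0};

// Both spellings produce the same four elements.
bool sameContents() {
  static_assert(sizeof(HiddenZero) == sizeof(ExplicitZero),
                "both arrays hold four elements");
  for (std::size_t I = 0; I < sizeof(HiddenZero) / sizeof(HiddenZero[0]); ++I)
    if (HiddenZero[I] != ExplicitZero[I])
      return false;
  return true;
}
```

The second form makes the extra element visible at the declaration, which is the readability point of the review comment.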

bruening added inline comments. May 23 2016, 9:10 AM

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
71	Thinking more about this, I'm not sure it makes sense to treat just these two shadow constants as different from all of the other shared interface constants such as all of the runtime library function names: static const char *const EsanInitName = "__esan_init"; This includes the load and store handler names ("__esan_aligned_load8", e.g.), which are generated in EfficiencySanitizer.cpp. The cleanest thing would be to use macros to turn esan_init into a string or a var, and include that header in both places. We'd have to remove the loop that generates esan_aligned_loadX and hardcode each one. We'd put the ToolType enum in there too. However, compiler-rt is not supposed to directly include an llvm header, right? If it's not allowed, and we'd rather have the loop that generates those names, and we already have a nice compiler-rt/lib/esan/esan_interface_internal.h header for the runtime, and nobody else is going to include a new llvm header, perhaps simply clearly labeling the interface (functions, tooltype, shadow consts) in a section at the top of EfficiencySanitizer.cpp is the best option.
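
The "macros to turn esan_init into a string or a var" idea could look roughly like the sketch below. Everything here is hypothetical: the macro names `ESAN_INTERFACE_NAMES` and `ESAN_AS_STRING` and the array `EsanInterfaceNames` are made up for illustration, and only the two function names quoted in the comment above are listed. It shows the string side only; a runtime-side expansion into declarations would need the real signatures.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical shared list of interface names (an X-macro). In the scheme
// discussed above, this list would live in one header included by both the
// instrumentation pass and the runtime.
#define ESAN_INTERFACE_NAMES(X)                                                \
  X(__esan_init)                                                               \
  X(__esan_aligned_load8)

// Instrumentation side: expand each entry into a string literal.
static const char *const EsanInterfaceNames[] = {
#define ESAN_AS_STRING(Name) #Name,
    ESAN_INTERFACE_NAMES(ESAN_AS_STRING)
#undef ESAN_AS_STRING
};
```

As the comment notes, this approach means spelling out every `__esan_aligned_loadX` entry by hand instead of generating the names in a loop.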

zhaoqin added inline comments. May 23 2016, 9:27 AM

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
71	IMHO, there are two possible benefits of adding a header file here:
	1. avoid duplication
	2. clear separation/documentation about inter-dependencies between runtime.
	However, since we do not include an llvm header, we cannot avoid duplication. And there is only one EfficiencySanitizer.cpp, it looks to me that a clear section about the duplicated part is easier to track than a separate header file.

I think the right way to share the constants between instrumentation and the runtime library is to put them in an LLVM header, install that to the build directory (under include/llvm) and locate that through llvm-config under if (COMPILER_RT_STANDALONE_BUILD).

In D20483#436873, @eugenis wrote:

> I think the right way to share the constants between instrumentation and the runtime library is to put them in an LLVM header, install that to the build directory (under include/llvm) and locate that through llvm-config under if (COMPILER_RT_STANDALONE_BUILD).

Given that we are now talking about refactoring existing code, let's move that to a separate CL. Perhaps if someone has ideas for precisely how to do it and knows more about the setup you could lead the way by refactoring one of the other sanitizers -- I'm not sure what the COMPILER_RT_STANDALONE_BUILD supports (comments in the code make it sound unsupported; does it support building with an installed clang rather than a local build dir and in that case won't llvm-config not work unless we install our header which seems overkill; if it really requires a local llvm build dir in what sense is it standalone; etc.)

bruening marked 3 inline comments as done. May 23 2016, 1:11 PM

bruening added inline comments.

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
70	These do not change: our general shadow mapping handles all scales we will want.
322	Are you sure that 0 means no alignment? Is that documented somewhere? I looked and I could not find a definitive answer to what 0 means. The alignment of a brand-new LoadInst is 0. Tsan and asan treat alignment of 0 as type-aligned with checks like this:
	if (Alignment == 0 || Alignment >= 8 || (Alignment % (TypeSize / 8)) == 0)
	Additionally, looking around I see code like this:
	lib/CodeGen/SelectionDAG/SelectionDAG.cpp: if (Alignment == 0)
	lib/CodeGen/SelectionDAG/SelectionDAG.cpp: Alignment = getDataLayout().getPrefTypeAlignment(C->getType());
	Which while it's not specifically about loads or stores, combined with the behavior of the other sanitizers, implies that 0 means the default for the platform. If someone has a definitive answer to what 0 means, please update the docs and all the existing sanitizers.
422	See NumFastpaths.
437	We are explicitly fine with races accessing our shadow memory. This is documented in the shadow interface. I will add another comment here.
446	Ditto.
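
To make the alignment-0 discussion on line 322 concrete, the sketch below wraps the tsan/asan-style expression quoted above in a standalone predicate. The function name is hypothetical and the code only reproduces that quoted check; it is not the esan implementation, which (per the diff) converts 0 to `DL.getPrefTypeAlignment(OrigTy)` instead.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the expression quoted from the other sanitizers. An alignment of 0
// is treated as "type-aligned". TypeSizeBits is the access size in bits.
// The function name is hypothetical.
bool treatedAsTypeAligned(uint32_t Alignment, uint32_t TypeSizeBits) {
  assert(TypeSizeBits >= 8 && "sub-byte sizes would divide by zero below");
  return Alignment == 0 || Alignment >= 8 ||
         (Alignment % (TypeSizeBits / 8)) == 0;
}
```

For a 32-bit access, alignments of 0 and 4 pass the check while an alignment of 2 does not, which is the behavior bruening cites as evidence that 0 is read as the platform default rather than "unaligned".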

Address reviewer comments.

aizatsky added inline comments. May 24 2016, 2:13 PM

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
420	I'm sorry but I (and people in my cube) don't really understand what "straddle a cache line" mean. Could you rewrite the comment to explain the reasoning better? What is the plan for TypeSize != 8? Slow path?
422	NumFastpaths is not incremented though.

bruening marked an inline comment as done. May 24 2016, 3:04 PM

bruening added inline comments.

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
422	It is incremented by the caller based on the return value. The number of slowpaths is also available by computing (NumInstrumentedStores + NumInstrumentedLoads) - NumFastpaths.

Clarifies cache line check comment.

aizatsky added inline comments. May 24 2016, 3:20 PM

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
419	This comment doesn't seem to correspond to the code. Isn't it "Bail to the slowpath for unaligned access"? For multiple cache lines, don't you want to compare TypeSize with cache line size?
425	This sentence is immediately contradicted by the next. "each cache line" vs "we've already ruled out".

bruening added inline comments. May 24 2016, 3:28 PM

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp
419	A memory access that is aligned to its size is guaranteed to only touch one cache line. (So long as it's <= 64 bytes which seems a safe assumption as memcpy, etc. intrinsics are handled elsewhere: this is a single load or store.)
425	The first sentence is talking about the tool's instrumentation in general. I will update it to remove any confusion.
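
The claim on line 419 (an access aligned to its own size touches only one cache line, provided the size is at most 64 bytes) can be checked by brute force. The sketch below assumes a 64-byte cache line and power-of-two access sizes; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed cache line size, matching the 64B:1B working set shadow scale.
constexpr uint64_t CacheLineSize = 64;

// True if the byte range [Addr, Addr + Size) lies within one cache line.
bool touchesOneCacheLine(uint64_t Addr, uint64_t Size) {
  assert(Size > 0);
  return Addr / CacheLineSize == (Addr + Size - 1) / CacheLineSize;
}

// Checks every power-of-two size up to the line size at every size-aligned
// address within two consecutive cache lines.
bool alignedAccessesStayInLine() {
  for (uint64_t Size = 1; Size <= CacheLineSize; Size *= 2)
    for (uint64_t Addr = 0; Addr < 2 * CacheLineSize; Addr += Size)
      if (!touchesOneCacheLine(Addr, Size))
        return false;
  return true;
}
```

An 8-byte access at address 56 stays inside the first line, whereas the same access at address 60 (not 8-byte aligned) spills into the second, which is the case the slowpath has to handle.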

Expand comments around single vs multiple cache lines.

aizatsky accepted this revision. May 24 2016, 3:38 PM

aizatsky edited edge metadata.

This revision is now accepted and ready to land. May 24 2016, 3:38 PM

Closed by commit rL270640: [esan|wset] EfficiencySanitizer working set tool fastpath (authored by bruening). May 24 2016, 5:23 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
include/llvm/Transforms/Instrumentation.h	1 line
lib/Transforms/Instrumentation/EfficiencySanitizer.cpp	104 lines
test/Instrumentation/EfficiencySanitizer/working_set_basic.ll	164 lines

Diff 58339

include/llvm/Transforms/Instrumentation.h

[... 117 lines hidden ...]	ModulePass *createDataFlowSanitizerPass(
void *(*getArgTLS)() = nullptr, void *(*getRetValTLS)() = nullptr);		void *(*getArgTLS)() = nullptr, void *(*getRetValTLS)() = nullptr);

// Options for EfficiencySanitizer sub-tools.		// Options for EfficiencySanitizer sub-tools.
struct EfficiencySanitizerOptions {		struct EfficiencySanitizerOptions {
EfficiencySanitizerOptions() : ToolType(ESAN_None) {}		EfficiencySanitizerOptions() : ToolType(ESAN_None) {}
enum Type {		enum Type {
ESAN_None = 0,		ESAN_None = 0,
ESAN_CacheFrag,		ESAN_CacheFrag,
		ESAN_WorkingSet,
} ToolType;		} ToolType;
};		};

// Insert EfficiencySanitizer instrumentation.		// Insert EfficiencySanitizer instrumentation.
ModulePass *createEfficiencySanitizerPass(		ModulePass *createEfficiencySanitizerPass(
const EfficiencySanitizerOptions &Options = EfficiencySanitizerOptions());		const EfficiencySanitizerOptions &Options = EfficiencySanitizerOptions());

// Options for sanitizer coverage instrumentation.		// Options for sanitizer coverage instrumentation.
[... 55 lines hidden ...]

lib/Transforms/Instrumentation/EfficiencySanitizer.cpp

[... 36 lines hidden ...]

#define DEBUG_TYPE "esan"		#define DEBUG_TYPE "esan"

// The tool type must be just one of these ClTool* options, as the tools		// The tool type must be just one of these ClTool* options, as the tools
// cannot be combined due to shadow memory constraints.		// cannot be combined due to shadow memory constraints.
static cl::opt<bool>		static cl::opt<bool>
ClToolCacheFrag("esan-cache-frag", cl::init(false),		ClToolCacheFrag("esan-cache-frag", cl::init(false),
cl::desc("Detect data cache fragmentation"), cl::Hidden);		cl::desc("Detect data cache fragmentation"), cl::Hidden);
		static cl::opt<bool>
		ClToolWorkingSet("esan-working-set", cl::init(false),
		cl::desc("Measure the working set size"), cl::Hidden);
// Each new tool will get its own opt flag here.		// Each new tool will get its own opt flag here.
// These are converted to EfficiencySanitizerOptions for use		// These are converted to EfficiencySanitizerOptions for use
// in the code.		// in the code.

static cl::opt<bool> ClInstrumentLoadsAndStores(		static cl::opt<bool> ClInstrumentLoadsAndStores(
"esan-instrument-loads-and-stores", cl::init(true),		"esan-instrument-loads-and-stores", cl::init(true),
cl::desc("Instrument loads and stores"), cl::Hidden);		cl::desc("Instrument loads and stores"), cl::Hidden);
static cl::opt<bool> ClInstrumentMemIntrinsics(		static cl::opt<bool> ClInstrumentMemIntrinsics(
"esan-instrument-memintrinsics", cl::init(true),		"esan-instrument-memintrinsics", cl::init(true),
cl::desc("Instrument memintrinsics (memset/memcpy/memmove)"), cl::Hidden);		cl::desc("Instrument memintrinsics (memset/memcpy/memmove)"), cl::Hidden);

STATISTIC(NumInstrumentedLoads, "Number of instrumented loads");		STATISTIC(NumInstrumentedLoads, "Number of instrumented loads");
STATISTIC(NumInstrumentedStores, "Number of instrumented stores");		STATISTIC(NumInstrumentedStores, "Number of instrumented stores");
STATISTIC(NumFastpaths, "Number of instrumented fastpaths");		STATISTIC(NumFastpaths, "Number of instrumented fastpaths");
STATISTIC(NumAccessesWithIrregularSize,		STATISTIC(NumAccessesWithIrregularSize,
"Number of accesses with a size outside our targeted callout sizes");		"Number of accesses with a size outside our targeted callout sizes");

static const char *const EsanModuleCtorName = "esan.module_ctor";		static const char *const EsanModuleCtorName = "esan.module_ctor";
static const char *const EsanInitName = "__esan_init";		static const char *const EsanInitName = "__esan_init";

		// We must keep these Shadow* constants consistent with the esan runtime.
		// FIXME: Try to place these shadow constants, the names of the __esan_*
		// interface functions, and the ToolType enum into a header shared between
		// llvm and compiler-rt.
		static const uint64_t ShadowMask = 0x00000fffffffffffull;
		static const uint64_t ShadowOffs[3] = { // Indexed by scale
		0x0000130000000000ull,
		0x0000220000000000ull,
		0x0000440000000000ull,
		};
		// This array is indexed by the ToolType enum.
		static const int ShadowScale[] = {
		0, // ESAN_None.
		2, // ESAN_CacheFrag: 4B:1B, so 4 to 1 == >>2.
		6, // ESAN_WorkingSet: 64B:1B, so 64 to 1 == >>6.
		};

namespace {		namespace {

static EfficiencySanitizerOptions		static EfficiencySanitizerOptions
OverrideOptionsFromCL(EfficiencySanitizerOptions Options) {		OverrideOptionsFromCL(EfficiencySanitizerOptions Options) {
if (ClToolCacheFrag)		if (ClToolCacheFrag)
Options.ToolType = EfficiencySanitizerOptions::ESAN_CacheFrag;		Options.ToolType = EfficiencySanitizerOptions::ESAN_CacheFrag;
		else if (ClToolWorkingSet)
		Options.ToolType = EfficiencySanitizerOptions::ESAN_WorkingSet;

// Direct opt invocation with no params will have the default ESAN_None.		// Direct opt invocation with no params will have the default ESAN_None.
// We run the default tool in that case.		// We run the default tool in that case.
if (Options.ToolType == EfficiencySanitizerOptions::ESAN_None)		if (Options.ToolType == EfficiencySanitizerOptions::ESAN_None)
Options.ToolType = EfficiencySanitizerOptions::ESAN_CacheFrag;		Options.ToolType = EfficiencySanitizerOptions::ESAN_CacheFrag;

return Options;		return Options;
}		}

/// EfficiencySanitizer: instrument each module to find performance issues.		/// EfficiencySanitizer: instrument each module to find performance issues.
class EfficiencySanitizer : public ModulePass {		class EfficiencySanitizer : public ModulePass {
public:		public:
EfficiencySanitizer(		EfficiencySanitizer(
const EfficiencySanitizerOptions &Opts = EfficiencySanitizerOptions())		const EfficiencySanitizerOptions &Opts = EfficiencySanitizerOptions())
: ModulePass(ID), Options(OverrideOptionsFromCL(Opts)) {}		: ModulePass(ID), Options(OverrideOptionsFromCL(Opts)) {}
const char *getPassName() const override;		const char *getPassName() const override;
bool runOnModule(Module &M) override;		bool runOnModule(Module &M) override;
static char ID;		static char ID;

private:		private:
bool initOnModule(Module &M);		bool initOnModule(Module &M);
void initializeCallbacks(Module &M);		void initializeCallbacks(Module &M);
bool runOnFunction(Function &F, Module &M);		bool runOnFunction(Function &F, Module &M);
bool instrumentLoadOrStore(Instruction *I, const DataLayout &DL);		bool instrumentLoadOrStore(Instruction *I, const DataLayout &DL);
bool instrumentMemIntrinsic(MemIntrinsic *MI);		bool instrumentMemIntrinsic(MemIntrinsic *MI);
bool shouldIgnoreMemoryAccess(Instruction *I);		bool shouldIgnoreMemoryAccess(Instruction *I);
int getMemoryAccessFuncIndex(Value *Addr, const DataLayout &DL);		int getMemoryAccessFuncIndex(Value *Addr, const DataLayout &DL);
		Value *appToShadow(Value *Shadow, IRBuilder<> &IRB);
bool instrumentFastpath(Instruction *I, const DataLayout &DL, bool IsStore,		bool instrumentFastpath(Instruction *I, const DataLayout &DL, bool IsStore,
Value *Addr, unsigned Alignment);		Value *Addr, unsigned Alignment);
// Each tool has its own fastpath routine:		// Each tool has its own fastpath routine:
bool instrumentFastpathCacheFrag(Instruction *I, const DataLayout &DL,		bool instrumentFastpathCacheFrag(Instruction *I, const DataLayout &DL,
Value *Addr, unsigned Alignment);		Value *Addr, unsigned Alignment);
		bool instrumentFastpathWorkingSet(Instruction *I, const DataLayout &DL,
		Value *Addr, unsigned Alignment);

EfficiencySanitizerOptions Options;		EfficiencySanitizerOptions Options;
LLVMContext *Ctx;		LLVMContext *Ctx;
Type *IntptrTy;		Type *IntptrTy;
// Our slowpath involves callouts to the runtime library.		// Our slowpath involves callouts to the runtime library.
// Access sizes are powers of two: 1, 2, 4, 8, 16.		// Access sizes are powers of two: 1, 2, 4, 8, 16.
static const size_t NumberOfAccessSizes = 5;		static const size_t NumberOfAccessSizes = 5;
Function *EsanAlignedLoad[NumberOfAccessSizes];		Function *EsanAlignedLoad[NumberOfAccessSizes];
[... 71 lines hidden ...]	bool EfficiencySanitizer::initOnModule(Module &M) {
std::tie(EsanCtorFunction, std::ignore) = createSanitizerCtorAndInitFunctions(		std::tie(EsanCtorFunction, std::ignore) = createSanitizerCtorAndInitFunctions(
M, EsanModuleCtorName, EsanInitName, /*InitArgTypes=*/{OrdTy},		M, EsanModuleCtorName, EsanInitName, /*InitArgTypes=*/{OrdTy},
/*InitArgs=*/{		/*InitArgs=*/{
ConstantInt::get(OrdTy, static_cast<int>(Options.ToolType))});		ConstantInt::get(OrdTy, static_cast<int>(Options.ToolType))});

appendToGlobalCtors(M, EsanCtorFunction, 0);		appendToGlobalCtors(M, EsanCtorFunction, 0);

return true;		return true;
}		}

		Value *EfficiencySanitizer::appToShadow(Value *Shadow, IRBuilder<> &IRB) {
		// Shadow = ((App & Mask) + Offs) >> Scale
		Shadow = IRB.CreateAnd(Shadow, ConstantInt::get(IntptrTy, ShadowMask));
		uint64_t Offs;
		int Scale = ShadowScale[Options.ToolType];
		if (Scale <= 2)
		Offs = ShadowOffs[Scale];
		else
		Offs = ShadowOffs[0] << Scale;
		Shadow = IRB.CreateAdd(Shadow, ConstantInt::get(IntptrTy, Offs));
		if (Scale > 0)
		Shadow = IRB.CreateLShr(Shadow, Scale);
		return Shadow;
		}

bool EfficiencySanitizer::shouldIgnoreMemoryAccess(Instruction *I) {		bool EfficiencySanitizer::shouldIgnoreMemoryAccess(Instruction *I) {
if (Options.ToolType == EfficiencySanitizerOptions::ESAN_CacheFrag) {		if (Options.ToolType == EfficiencySanitizerOptions::ESAN_CacheFrag) {
// We'd like to know about cache fragmentation in vtable accesses and		// We'd like to know about cache fragmentation in vtable accesses and
// constant data references, so we do not currently ignore anything.		// constant data references, so we do not currently ignore anything.
return false;		return false;
		} else if (Options.ToolType == EfficiencySanitizerOptions::ESAN_WorkingSet) {
		// TODO: the instrumentation disturbs the data layout on the stack, so we
		// may want to add an option to ignore stack references (if we can
		// distinguish them) to reduce overhead.
}		}
// TODO(bruening): future tools will be returning true for some cases.		// TODO(bruening): future tools will be returning true for some cases.
return false;		return false;
}		}

bool EfficiencySanitizer::runOnModule(Module &M) {		bool EfficiencySanitizer::runOnModule(Module &M) {
bool Res = initOnModule(M);		bool Res = initOnModule(M);
initializeCallbacks(M);		initializeCallbacks(M);
[... 62 lines hidden ...]	if (LoadInst *Load = dyn_cast<LoadInst>(I)) {
Alignment = 0;		Alignment = 0;
Addr = Xchg->getPointerOperand();		Addr = Xchg->getPointerOperand();
} else		} else
llvm_unreachable("Unsupported mem access type");		llvm_unreachable("Unsupported mem access type");

Type *OrigTy = cast<PointerType>(Addr->getType())->getElementType();		Type *OrigTy = cast<PointerType>(Addr->getType())->getElementType();
const uint32_t TypeSizeBytes = DL.getTypeStoreSizeInBits(OrigTy) / 8;		const uint32_t TypeSizeBytes = DL.getTypeStoreSizeInBits(OrigTy) / 8;
Value *OnAccessFunc = nullptr;		Value *OnAccessFunc = nullptr;

		// Convert 0 to the default alignment.
		if (Alignment == 0)
		Alignment = DL.getPrefTypeAlignment(OrigTy);

if (IsStore)		if (IsStore)
NumInstrumentedStores++;		NumInstrumentedStores++;
else		else
NumInstrumentedLoads++;		NumInstrumentedLoads++;
int Idx = getMemoryAccessFuncIndex(Addr, DL);		int Idx = getMemoryAccessFuncIndex(Addr, DL);
if (Idx < 0) {		if (Idx < 0) {
OnAccessFunc = IsStore ? EsanUnalignedStoreN : EsanUnalignedLoadN;		OnAccessFunc = IsStore ? EsanUnalignedStoreN : EsanUnalignedLoadN;
IRB.CreateCall(OnAccessFunc,		IRB.CreateCall(OnAccessFunc,
[... 59 lines hidden ...]	int EfficiencySanitizer::getMemoryAccessFuncIndex(Value *Addr,
return Idx;		return Idx;
}		}

bool EfficiencySanitizer::instrumentFastpath(Instruction *I,		bool EfficiencySanitizer::instrumentFastpath(Instruction *I,
const DataLayout &DL, bool IsStore,		const DataLayout &DL, bool IsStore,
Value *Addr, unsigned Alignment) {		Value *Addr, unsigned Alignment) {
if (Options.ToolType == EfficiencySanitizerOptions::ESAN_CacheFrag) {		if (Options.ToolType == EfficiencySanitizerOptions::ESAN_CacheFrag) {
return instrumentFastpathCacheFrag(I, DL, Addr, Alignment);		return instrumentFastpathCacheFrag(I, DL, Addr, Alignment);
		} else if (Options.ToolType == EfficiencySanitizerOptions::ESAN_WorkingSet) {
		return instrumentFastpathWorkingSet(I, DL, Addr, Alignment);
}		}
return false;		return false;
}		}

bool EfficiencySanitizer::instrumentFastpathCacheFrag(Instruction *I,		bool EfficiencySanitizer::instrumentFastpathCacheFrag(Instruction *I,
const DataLayout &DL,		const DataLayout &DL,
Value *Addr,		Value *Addr,
unsigned Alignment) {		unsigned Alignment) {
// TODO(bruening): implement a fastpath for aligned accesses		// TODO(bruening): implement a fastpath for aligned accesses
return false;		return false;
}		}

		bool EfficiencySanitizer::instrumentFastpathWorkingSet(
		aizatsky [Done]: Document what code are you generating - it is hard to read IR building.
		Instruction *I, const DataLayout &DL, Value *Addr, unsigned Alignment) {
		assert(ShadowScale[Options.ToolType] == 6); // The code below assumes this
		IRBuilder<> IRB(I);
		Type *OrigTy = cast<PointerType>(Addr->getType())->getElementType();
		const uint32_t TypeSize = DL.getTypeStoreSizeInBits(OrigTy);
		// Bail to the slowpath if the access might touch multiple cache lines.
		aizatsky [Not Done]: This comment doesn't seem to correspond to the code. Isn't it "Bail to the slowpath for unaligned access"? For multiple cache lines, don't you want to compare TypeSize with cache line size?
		bruening (Author) [Not Done]: A memory access that is aligned to its size is guaranteed to only touch one cache line. (So long as it's <= 64 bytes which seems a safe assumption as memcpy, etc. intrinsics are handled elsewhere: this is a single load or store.)
		// An access aligned to its size is guaranteed to be intra-cache-line.
		aizatsky [Done]: I'm sorry but I (and people in my cube) don't really understand what "straddle a cache line" mean. Could you rewrite the comment to explain the reasoning better? What is the plan for TypeSize != 8? Slow path?
		// getMemoryAccessFuncIndex has already ruled out a size larger than 16
		// and thus larger than a cache line for platforms this tool targets
		filcab [Not Done]: Should you have a debug printf or some statistics on skipped addresses?
		bruening (Author) [Not Done]: See NumFastpaths.
		aizatsky [Not Done]: NumFastpaths is not incremented though.
		bruening (Author) [Not Done]: It is incremented by the caller based on the return value. The number of slowpaths is also available by computing (NumInstrumentedStores + NumInstrumentedLoads) - NumFastpaths.
		// (and our shadow memory setup assumes 64-byte cache lines).
		assert(TypeSize <= 64);
		if (!(TypeSize == 8 ||
		aizatsky [Not Done]: This sentence is immediately contradicted by the next. "each cache line" vs "we've already ruled out".
		bruening (Author) [Not Done]: The first sentence is talking about the tool's instrumentation in general. I will update it to remove any confusion.
		(Alignment % (TypeSize / 8)) == 0))
		return false;

		// We inline instrumentation to set the corresponding shadow bits for
		// each cache line touched by the application. Here we handle a single
		// load or store where we've already ruled out the possibility that it
		// might touch more than one cache line and thus we simply update the
		// shadow memory for a single cache line.
		// Our shadow memory model is fine with races when manipulating shadow values.
		// We generate the following code:
		//
		// const char BitMask = 0x81;
		filcab [Not Done]: Don't you need an atomic load? At the very least, if the original instruction was atomic?
		bruening (Author) [Not Done]: We are explicitly fine with races accessing our shadow memory. This is documented in the shadow interface. I will add another comment here.
		// char *ShadowAddr = appToShadow(AppAddr);
		// if ((*ShadowAddr & BitMask) != BitMask)
		// *ShadowAddr |= BitMask;
		//
		Value *AddrPtr = IRB.CreatePointerCast(Addr, IntptrTy);
		Value *ShadowPtr = appToShadow(AddrPtr, IRB);
		Type *ShadowTy = IntegerType::get(Ctx, 8U);
		Type *ShadowPtrTy = PointerType::get(ShadowTy, 0);
		// The bottom bit is used for the current sampling period's working set.
		filcab [Not Done]: Same for atomic store.
		bruening (Author) [Not Done]: Ditto.
		// The top bit is used for the total working set. We set both on each
		// memory access, if they are not already set.
		Value *ValueMask = ConstantInt::get(ShadowTy, 0x81); // 10000001B

		Value *OldValue = IRB.CreateLoad(IRB.CreateIntToPtr(ShadowPtr, ShadowPtrTy));
		// The AND and CMP will be turned into a TEST instruction by the compiler.
		Value *Cmp = IRB.CreateICmpNE(IRB.CreateAnd(OldValue, ValueMask), ValueMask);
		TerminatorInst *CmpTerm = SplitBlockAndInsertIfThen(Cmp, I, false);
		// FIXME: do I need to call SetCurrentDebugLocation?
		IRB.SetInsertPoint(CmpTerm);
		// We use OR to set the shadow bits to avoid corrupting the middle 6 bits,
		// which are used by the runtime library.
		Value *NewVal = IRB.CreateOr(OldValue, ValueMask);
		IRB.CreateStore(NewVal, IRB.CreateIntToPtr(ShadowPtr, ShadowPtrTy));
		IRB.SetInsertPoint(I);

		return true;
		}
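
For readers who find the IRBuilder calls hard to follow, the inlined fastpath can be modeled in plain C++. This is only a sketch under stated assumptions: the mask and offset are read off the CHECK lines of working_set_basic.ll in this diff, and the names appToShadowAddr, needsStore, and newShadowValue are hypothetical stand-ins rather than functions from the pass or the runtime. The shadow byte is passed by value instead of being loaded from real shadow memory.

```cpp
#include <cassert>
#include <cstdint>

// Constants taken from the CHECK lines in working_set_basic.ll.
constexpr uint64_t kAppMask = 17592186044415ULL;        // 0x00000fffffffffff
constexpr uint64_t kShadowOffset = 1337006139375616ULL; // 0x0004c00000000000
constexpr unsigned kShadowScale = 6; // one shadow byte per 64-byte cache line
constexpr uint8_t kBitMask = 0x81;   // bit 0: current sample, bit 7: total

// Hypothetical model of the address computation emitted before the shadow
// load: and, add, lshr.
constexpr uint64_t appToShadowAddr(uint64_t AppAddr) {
  return ((AppAddr & kAppMask) + kShadowOffset) >> kShadowScale;
}

// Models the AND + ICMP NE: a store is needed only if either bit is clear.
constexpr bool needsStore(uint8_t Old) {
  return (Old & kBitMask) != kBitMask;
}

// Models the guarded OR: the middle six bits are left untouched.
constexpr uint8_t newShadowValue(uint8_t Old) {
  return needsStore(Old) ? static_cast<uint8_t>(Old | kBitMask) : Old;
}
```

Because the offset has its low six bits clear, every application address within one 64-byte line maps to the same shadow byte, which is why a single load, test, and conditional store suffice for an access known to stay inside one line.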

test/Instrumentation/EfficiencySanitizer/working_set_basic.ll

This file was added.

				; Test basic EfficiencySanitizer working set instrumentation.
				;
				; RUN: opt < %s -esan -esan-working-set -S | FileCheck %s

				;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
				; Intra-cache-line

				define i8 @aligned1(i8* %a) {
				entry:
				%tmp1 = load i8, i8* %a, align 1
				ret i8 %tmp1
				; CHECK: @llvm.global_ctors = {{.*}}@esan.module_ctor
				; CHECK: %0 = ptrtoint i8* %a to i64
				; CHECK-NEXT: %1 = and i64 %0, 17592186044415
				; CHECK-NEXT: %2 = add i64 %1, 1337006139375616
				; CHECK-NEXT: %3 = lshr i64 %2, 6
				; CHECK-NEXT: %4 = inttoptr i64 %3 to i8*
				; CHECK-NEXT: %5 = load i8, i8* %4
				; CHECK-NEXT: %6 = and i8 %5, -127
				; CHECK-NEXT: %7 = icmp ne i8 %6, -127
				; CHECK-NEXT: br i1 %7, label %8, label %11
				; CHECK: %9 = or i8 %5, -127
				; CHECK-NEXT: %10 = inttoptr i64 %3 to i8*
				; CHECK-NEXT: store i8 %9, i8* %10
				; CHECK-NEXT: br label %11
				; CHECK: %tmp1 = load i8, i8* %a, align 1
				; CHECK-NEXT: ret i8 %tmp1
				}

				define i16 @aligned2(i16* %a) {
				entry:
				%tmp1 = load i16, i16* %a, align 2
				ret i16 %tmp1
				; CHECK: %0 = ptrtoint i16* %a to i64
				; CHECK-NEXT: %1 = and i64 %0, 17592186044415
				; CHECK-NEXT: %2 = add i64 %1, 1337006139375616
				; CHECK-NEXT: %3 = lshr i64 %2, 6
				; CHECK-NEXT: %4 = inttoptr i64 %3 to i8*
				; CHECK-NEXT: %5 = load i8, i8* %4
				; CHECK-NEXT: %6 = and i8 %5, -127
				; CHECK-NEXT: %7 = icmp ne i8 %6, -127
				; CHECK-NEXT: br i1 %7, label %8, label %11
				; CHECK: %9 = or i8 %5, -127
				; CHECK-NEXT: %10 = inttoptr i64 %3 to i8*
				; CHECK-NEXT: store i8 %9, i8* %10
				; CHECK-NEXT: br label %11
				; CHECK: %tmp1 = load i16, i16* %a, align 2
				; CHECK-NEXT: ret i16 %tmp1
				}

				define i32 @aligned4(i32* %a) {
				entry:
				%tmp1 = load i32, i32* %a, align 4
				ret i32 %tmp1
				; CHECK: %0 = ptrtoint i32* %a to i64
				; CHECK-NEXT: %1 = and i64 %0, 17592186044415
				; CHECK-NEXT: %2 = add i64 %1, 1337006139375616
				; CHECK-NEXT: %3 = lshr i64 %2, 6
				; CHECK-NEXT: %4 = inttoptr i64 %3 to i8*
				; CHECK-NEXT: %5 = load i8, i8* %4
				; CHECK-NEXT: %6 = and i8 %5, -127
				; CHECK-NEXT: %7 = icmp ne i8 %6, -127
				; CHECK-NEXT: br i1 %7, label %8, label %11
				; CHECK: %9 = or i8 %5, -127
				; CHECK-NEXT: %10 = inttoptr i64 %3 to i8*
				; CHECK-NEXT: store i8 %9, i8* %10
				; CHECK-NEXT: br label %11
				; CHECK: %tmp1 = load i32, i32* %a, align 4
				; CHECK-NEXT: ret i32 %tmp1
				}

				define i64 @aligned8(i64* %a) {
				entry:
				%tmp1 = load i64, i64* %a, align 8
				ret i64 %tmp1
				; CHECK: %0 = ptrtoint i64* %a to i64
				; CHECK-NEXT: %1 = and i64 %0, 17592186044415
				; CHECK-NEXT: %2 = add i64 %1, 1337006139375616
				; CHECK-NEXT: %3 = lshr i64 %2, 6
				; CHECK-NEXT: %4 = inttoptr i64 %3 to i8*
				; CHECK-NEXT: %5 = load i8, i8* %4
				; CHECK-NEXT: %6 = and i8 %5, -127
				; CHECK-NEXT: %7 = icmp ne i8 %6, -127
				; CHECK-NEXT: br i1 %7, label %8, label %11
				; CHECK: %9 = or i8 %5, -127
				; CHECK-NEXT: %10 = inttoptr i64 %3 to i8*
				; CHECK-NEXT: store i8 %9, i8* %10
				; CHECK-NEXT: br label %11
				; CHECK: %tmp1 = load i64, i64* %a, align 8
				; CHECK-NEXT: ret i64 %tmp1
				}

				;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
				; Not guaranteed to be intra-cache-line

				define i16 @unaligned2(i16* %a) {
				entry:
				%tmp1 = load i16, i16* %a, align 1
				ret i16 %tmp1
				; CHECK: %0 = bitcast i16* %a to i8*
				; CHECK-NEXT: call void @__esan_unaligned_load2(i8* %0)
				; CHECK-NEXT: %tmp1 = load i16, i16* %a, align 1
				; CHECK-NEXT: ret i16 %tmp1
				}

				define i32 @unaligned4(i32* %a) {
				entry:
				%tmp1 = load i32, i32* %a, align 2
				ret i32 %tmp1
				; CHECK: %0 = bitcast i32* %a to i8*
				; CHECK-NEXT: call void @__esan_unaligned_load4(i8* %0)
				; CHECK-NEXT: %tmp1 = load i32, i32* %a, align 2
				; CHECK-NEXT: ret i32 %tmp1
				}

				define i64 @unaligned8(i64* %a) {
				entry:
				%tmp1 = load i64, i64* %a, align 4
				ret i64 %tmp1
				; CHECK: %0 = bitcast i64* %a to i8*
				; CHECK-NEXT: call void @__esan_unaligned_load8(i8* %0)
				; CHECK-NEXT: %tmp1 = load i64, i64* %a, align 4
				; CHECK-NEXT: ret i64 %tmp1
				}

				;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
				; Ensure that esan converts intrinsics to calls:

				declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1)
				declare void @llvm.memmove.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1)
				declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1)

				define void @memCpyTest(i8* nocapture %x, i8* nocapture %y) {
				entry:
				tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %x, i8* %y, i64 16, i32 4, i1 false)
				ret void
				; CHECK: define void @memCpyTest
				; CHECK: call i8* @memcpy
				; CHECK: ret void
				}

				define void @memMoveTest(i8* nocapture %x, i8* nocapture %y) {
				entry:
				tail call void @llvm.memmove.p0i8.p0i8.i64(i8* %x, i8* %y, i64 16, i32 4, i1 false)
				ret void
				; CHECK: define void @memMoveTest
				; CHECK: call i8* @memmove
				; CHECK: ret void
				}

				define void @memSetTest(i8* nocapture %x) {
				entry:
				tail call void @llvm.memset.p0i8.i64(i8* %x, i8 77, i64 16, i32 4, i1 false)
				ret void
				; CHECK: define void @memSetTest
				; CHECK: call i8* @memset
				; CHECK: ret void
				}

				;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
				; Top-level:

				; CHECK: define internal void @esan.module_ctor()
				; CHECK: call void @__esan_init(i32 2)
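
The split between the aligned and unaligned load tests above follows from the bail-out condition in instrumentFastpathWorkingSet. The following sketch restates that condition and checks by brute force that a size-aligned access never crosses a 64-byte line. The helper names takesFastpath, crossesLine, and sizeAlignedStaysInLine are hypothetical and exist only for this illustration; the 64-byte line size is the assumption stated in the diff's comments.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the condition in the diff: TypeSizeBits is the store size in bits,
// Alignment is in bytes. Returns true when the inlined fastpath is used.
constexpr bool takesFastpath(uint32_t TypeSizeBits, unsigned Alignment) {
  return TypeSizeBits == 8 || (Alignment % (TypeSizeBits / 8)) == 0;
}

// True if the first and last byte of the access fall in different
// 64-byte cache lines.
constexpr bool crossesLine(uint64_t Addr, unsigned SizeBytes) {
  return (Addr >> 6) != ((Addr + SizeBytes - 1) >> 6);
}

// Walks every address that is a multiple of SizeBytes across four cache
// lines and reports whether all such accesses stay within one line.
inline bool sizeAlignedStaysInLine(unsigned SizeBytes) {
  for (uint64_t Addr = 0; Addr < 4 * 64; Addr += SizeBytes)
    if (crossesLine(Addr, SizeBytes))
      return false;
  return true;
}
```

With these definitions, the aligned1, aligned2, aligned4, and aligned8 cases satisfy takesFastpath, while unaligned2 (align 1), unaligned4 (align 2), and unaligned8 (align 4) do not and therefore fall back to the __esan_unaligned_loadN calls.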