This is an archive of the discontinued LLVM Phabricator instance.

Differential D101584

[dfsan] Fix origin tracking for fast8
Closed, Public

Authored by gbalats on Apr 29 2021, 3:59 PM.

Details

Reviewers

stephan.yichao.zhao

Commits

rGa45fd436aef4: [dfsan] Fix origin tracking for fast8

Summary

The problem is the following. With fast8, we broke an important invariant when loading shadows.
A wide shadow of 64 bits used to correspond to 4 application bytes with fast16; so, generating a single load was okay since those 4 application bytes would share a single origin.
Now, using fast8, a wide shadow of 64 bits corresponds to 8 application bytes that should be backed by 2 origins (but we kept generating just one).

Let’s say our wide shadow is 64-bit and consists of the following: 0xABCDEFGH, where each letter stands for one 8-bit shadow. To check whether we need the second origin value, we could do the following in the 64-bit wide shadow case:

- bitwise shift the wide shadow left by 32 bits (yielding 0xEFGH0000)
- push the result along with the first origin load to the shadow/origin vectors
- load the second 32-bit origin of the 64-bit wide shadow
- push the wide shadow along with the second origin to the shadow/origin vectors.

The combineOrigins would then select the second origin if the wide shadow is of the form 0xABCD0000.
The tests illustrate how this change affects the generated bitcode.
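
The selection described in the summary can be sketched outside LLVM. This is a simplified model, not the pass's code: `selectOriginFast8` and its parameters are hypothetical names, and it assumes a little-endian layout in which the first origin backs the four low shadow bytes. It mirrors the `shl`/`icmp ne`/`select` sequence that the CHECK8 lines of the test expect.

```cpp
#include <cstdint>

// Simplified model (hypothetical helper, not part of DataFlowSanitizer.cpp):
// one 64-bit wide shadow holds eight 8-bit labels and is backed by two
// 32-bit origins. FirstOrigin covers the four low shadow bytes and
// SecondOrigin the four high ones (little-endian layout assumed).
uint32_t selectOriginFast8(uint64_t WideShadow, uint32_t FirstOrigin,
                           uint32_t SecondOrigin) {
  // Shifting left by 32 discards the high half, so the result is non-zero
  // exactly when one of the four low shadow bytes carries a label.
  uint64_t WideShadowLo = WideShadow << 32;
  // Prefer the first origin when the low half is tainted; otherwise fall
  // back to the second origin.
  return WideShadowLo != 0 ? FirstOrigin : SecondOrigin;
}
```

With a label only in the high half, the model returns the second origin, which is the case the original single-origin load got wrong.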

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

gbalats created this revision. Apr 29 2021, 3:59 PM

Herald added a subscriber: hiraditya. Apr 29 2021, 3:59 PM

gbalats requested review of this revision. Apr 29 2021, 3:59 PM

Herald added a subscriber: llvm-commits. Apr 29 2021, 3:59 PM

Harbormaster completed remote builds in B101766: Diff 341689. Apr 29 2021, 7:04 PM

stephan.yichao.zhao added inline comments. Apr 29 2021, 9:46 PM

llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
505	-> loadNextOriginAndIncAddr or a similar name? So a reader can follow the main code w/o looking into the method.
2150	The LangRef (https://releases.llvm.org/2.6/docs/LangRef.html#i_shl) requires both operands to have the same type. I guess on a 64-bit system this is fine, but giving this one the type WideShadowTy would make them always consistent.
2195	If we lift this origin load out of the if-else, this if-else could be shared with the if-else before the for-loop, so the comments and the assertion `assert(BytesPerWideShadow == 8);` will also be shared: `FirstOrigin = DFS.loadNextOrigin(Pos, OriginAlign, &OriginAddr); if () { ... Origins.push_back(FirstOrigin) } else { ... Origins.push_back(FirstOrigin) }`
llvm/test/Instrumentation/DataFlowSanitizer/origin_load.ll
258	If WIDE_SHADOW_LO were named WIDE_SHADOW_1, WIDE_SHADOW were named WIDE_SHADOW_12, and ORIGIN were named ORIGIN1, and if we did not reuse ORIGIN but gave each new assignment its own name (e.g., ORIGIN23 when it is a select between ORIGIN2 and ORIGIN3), the code would be easier to read.

Address reviewer comments.

Add lambda to remove code duplication and description for loadOrigin.
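
How a single lambda can serve both the first wide shadow and the ones loaded in the loop can be sketched with plain vectors. This is a hypothetical model for the fast8 case only: `PairsSketch`, `collectFast8`, and `OriginSlots` are illustrative names, and integers stand in for the IR values the pass actually manipulates. It assumes two origin slots per wide shadow.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the parallel shadow/origin vectors.
struct PairsSketch {
  std::vector<uint64_t> Shadows;
  std::vector<uint32_t> Origins;
};

// Models the fast8 path: every 64-bit wide shadow is backed by two
// consecutive 32-bit origin slots.
PairsSketch collectFast8(const std::vector<uint64_t> &WideShadows,
                         const std::vector<uint32_t> &OriginSlots) {
  PairsSketch Out;
  std::size_t Slot = 0; // index of the most recently loaded origin slot

  // Shared by the first wide shadow and by each later one.
  auto Append = [&](uint64_t WideShadow, uint32_t Origin) {
    // Whole wide shadow paired with the next (second) origin.
    Out.Shadows.push_back(WideShadow);
    Out.Origins.push_back(OriginSlots[++Slot]);
    // Low half only (left shift by 32) paired with the origin passed in.
    Out.Shadows.push_back(WideShadow << 32);
    Out.Origins.push_back(Origin);
  };

  Append(WideShadows[0], OriginSlots[0]);
  for (std::size_t I = 1; I < WideShadows.size(); ++I) {
    uint32_t NextOrigin = OriginSlots[++Slot];
    Append(WideShadows[I], NextOrigin);
  }
  return Out;
}
```

For two wide shadows and four origin slots, the resulting origin order is second, first, fourth, third, which matches the order of the `select` operands in the CHECK8 lines for `load128`.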

gbalats marked an inline comment as done. Apr 30 2021, 2:47 PM

gbalats added inline comments.

llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
505	The "Next" part of the name alludes to a pointer advance (how can you load the next thing without advancing the pointer?). Also, the `OriginAddr` param is passed by pointer, indicating that it will be updated. However, I did add some documentation to make it clearer.
2150	Done. Thanks for catching this!
2195	Done. Extracted to a lambda.
llvm/test/Instrumentation/DataFlowSanitizer/origin_load.ll
258	I think if we use WIDE_SHADOW_1 instead of _LO it will be difficult to differentiate between the different wide shadows and their left-shifted counterparts. The naming pattern I used is adding `_LO` suffix to indicate that `WIDE_SHADOW(N)_LO` is the left-shifted version of `WIDE_SHADOW(N)`. Overwriting `ORIGIN` is essential to keep the test smaller by reusing the same expression later on (see lines 312, 316). We need a way to state that, whatever path has been taken (fast8 or fast16, combine load ptr or not), the `ORIGIN` will refer to the name of the final origin that we have computed. Otherwise, we'd have to replicate those lines as well even though they are essentially the same. However, I can partially address this (remove some of the overwriting) when removing fast16.
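
The advance-then-load behaviour discussed in the reply for line 505 can be pictured with a plain C++ analogue. This is only an illustration: `loadNextOriginSketch` is a hypothetical function operating on ordinary memory, whereas the real helper emits a GEP and an aligned load through `IRBuilder`.

```cpp
#include <cstdint>

// Hypothetical plain-C++ analogue of the helper: moves *OriginAddr forward
// by one 32-bit slot, then returns the value stored in the new slot.
uint32_t loadNextOriginSketch(const uint32_t **OriginAddr) {
  *OriginAddr += 1;    // mirrors the GEP by one origin-sized element
  return **OriginAddr; // mirrors the load from the updated address
}
```

Because the address is passed by pointer, the caller sees the advanced position after each call, so repeated calls walk through consecutive origin slots.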

stephan.yichao.zhao accepted this revision. Apr 30 2021, 2:52 PM

stephan.yichao.zhao added inline comments.

llvm/test/Instrumentation/DataFlowSanitizer/origin_load.ll
258	Got it. I missed that the pattern needs to be shared. Yes. Please see if this can be simplified after fast16 is removed.

This revision is now accepted and ready to land. Apr 30 2021, 2:52 PM

Pulling latest changes

This revision was landed with ongoing or failed builds. Apr 30 2021, 3:58 PM

Closed by commit rGa45fd436aef4: [dfsan] Fix origin tracking for fast8 (authored by gbalats).

This revision was automatically updated to reflect the committed changes.

gbalats added a commit: rGa45fd436aef4: [dfsan] Fix origin tracking for fast8.

Harbormaster completed remote builds in B102026: Diff 342049. Apr 30 2021, 4:51 PM

Harbormaster completed remote builds in B102030: Diff 342056. Apr 30 2021, 5:00 PM

Revision Contents

Path	Size
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp	54 lines
llvm/test/Instrumentation/DataFlowSanitizer/origin_load.ll	23 lines

Diff 342076

llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp

[...]	class DataFlowSanitizer {
void addGlobalNamePrefix(GlobalValue *GV);		void addGlobalNamePrefix(GlobalValue *GV);
Function *buildWrapperFunction(Function *F, StringRef NewFName,		Function *buildWrapperFunction(Function *F, StringRef NewFName,
GlobalValue::LinkageTypes NewFLink,		GlobalValue::LinkageTypes NewFLink,
FunctionType *NewFT);		FunctionType *NewFT);
Constant *getOrBuildTrampolineFunction(FunctionType *FT, StringRef FName);		Constant *getOrBuildTrampolineFunction(FunctionType *FT, StringRef FName);
void initializeCallbackFunctions(Module &M);		void initializeCallbackFunctions(Module &M);
void initializeRuntimeFunctions(Module &M);		void initializeRuntimeFunctions(Module &M);
void injectMetadataGlobals(Module &M);		void injectMetadataGlobals(Module &M);

bool init(Module &M);		bool init(Module &M);

		/// Advances \p OriginAddr to point to the next 32-bit origin and then loads
		/// from it. Returns the origin's loaded value.
		Value *loadNextOrigin(Instruction *Pos, Align OriginAlign,
		Value **OriginAddr);

/// Returns whether fast8 or fast16 mode has been specified.		/// Returns whether fast8 or fast16 mode has been specified.
bool hasFastLabelsEnabled();		bool hasFastLabelsEnabled();

/// Returns whether the given load byte size is amenable to inlined		/// Returns whether the given load byte size is amenable to inlined
/// optimization patterns.		/// optimization patterns.
bool hasLoadSizeForFastPath(uint64_t Size);		bool hasLoadSizeForFastPath(uint64_t Size);

/// Returns whether the pass tracks origins. Support only fast16 mode in TLS		/// Returns whether the pass tracks origins. Support only fast16 mode in TLS
[...]	bool DFSanFunction::useCallbackLoadLabelAndOrigin(uint64_t Size,
// This should ensure that common cases run efficiently.		// This should ensure that common cases run efficiently.
if (Size <= 2)		if (Size <= 2)
return false;		return false;

const Align Alignment = llvm::assumeAligned(InstAlignment.value());		const Align Alignment = llvm::assumeAligned(InstAlignment.value());
return Alignment < MinOriginAlignment || !DFS.hasLoadSizeForFastPath(Size);		return Alignment < MinOriginAlignment || !DFS.hasLoadSizeForFastPath(Size);
}		}

		Value *DataFlowSanitizer::loadNextOrigin(Instruction *Pos, Align OriginAlign,
		Value **OriginAddr) {
		IRBuilder<> IRB(Pos);
		*OriginAddr =
		IRB.CreateGEP(OriginTy, *OriginAddr, ConstantInt::get(IntptrTy, 1));
		return IRB.CreateAlignedLoad(OriginTy, *OriginAddr, OriginAlign);
		}

std::pair<Value *, Value *> DFSanFunction::loadFast16ShadowFast(		std::pair<Value *, Value *> DFSanFunction::loadFast16ShadowFast(
Value *ShadowAddr, Value *OriginAddr, uint64_t Size, Align ShadowAlign,		Value *ShadowAddr, Value *OriginAddr, uint64_t Size, Align ShadowAlign,
Align OriginAlign, Value *FirstOrigin, Instruction *Pos) {		Align OriginAlign, Value *FirstOrigin, Instruction *Pos) {
const bool ShouldTrackOrigins = DFS.shouldTrackOrigins();		const bool ShouldTrackOrigins = DFS.shouldTrackOrigins();
const uint64_t ShadowSize = Size * DFS.ShadowWidthBytes;		const uint64_t ShadowSize = Size * DFS.ShadowWidthBytes;

assert(Size >= 4 && "Not large enough load size for fast path!");		assert(Size >= 4 && "Not large enough load size for fast path!");

[...]	std::pair<Value *, Value *> DFSanFunction::loadFast16ShadowFast(
Type *WideShadowTy =		Type *WideShadowTy =
ShadowSize == 4 ? Type::getInt32Ty(DFS.Ctx) : Type::getInt64Ty(DFS.Ctx);		ShadowSize == 4 ? Type::getInt32Ty(DFS.Ctx) : Type::getInt64Ty(DFS.Ctx);

IRBuilder<> IRB(Pos);		IRBuilder<> IRB(Pos);
Value *WideAddr = IRB.CreateBitCast(ShadowAddr, WideShadowTy->getPointerTo());		Value *WideAddr = IRB.CreateBitCast(ShadowAddr, WideShadowTy->getPointerTo());
Value *CombinedWideShadow =		Value *CombinedWideShadow =
IRB.CreateAlignedLoad(WideShadowTy, WideAddr, ShadowAlign);		IRB.CreateAlignedLoad(WideShadowTy, WideAddr, ShadowAlign);

if (ShouldTrackOrigins) {		unsigned WideShadowBitWidth = WideShadowTy->getIntegerBitWidth();
Shadows.push_back(CombinedWideShadow);		const uint64_t BytesPerWideShadow = WideShadowBitWidth / DFS.ShadowWidthBits;
Origins.push_back(FirstOrigin);
		auto AppendWideShadowAndOrigin = [&](Value *WideShadow, Value *Origin) {
		if (BytesPerWideShadow > 4) {
		assert(BytesPerWideShadow == 8);
		// The wide shadow relates to two origin pointers: one for the first four
		// application bytes, and one for the latest four. We use a left shift to
		// get just the shadow bytes that correspond to the first origin pointer,
		// and then the entire shadow for the second origin pointer (which will be
		// chosen by combineOrigins() iff the least-significant half of the wide
		// shadow was empty but the other half was not).
		Value *WideShadowLo = IRB.CreateShl(
		WideShadow, ConstantInt::get(WideShadowTy, WideShadowBitWidth / 2));
		Shadows.push_back(WideShadow);
		Origins.push_back(DFS.loadNextOrigin(Pos, OriginAlign, &OriginAddr));

		Shadows.push_back(WideShadowLo);
		Origins.push_back(Origin);
		} else {
		Shadows.push_back(WideShadow);
		Origins.push_back(Origin);
}		}
		};

		if (ShouldTrackOrigins)
		AppendWideShadowAndOrigin(CombinedWideShadow, FirstOrigin);

// First OR all the WideShadows (i.e., 64bit or 32bit shadow chunks) linearly;		// First OR all the WideShadows (i.e., 64bit or 32bit shadow chunks) linearly;
// then OR individual shadows within the combined WideShadow by binary ORing.		// then OR individual shadows within the combined WideShadow by binary ORing.
// This is fewer instructions than ORing shadows individually, since it		// This is fewer instructions than ORing shadows individually, since it
// needs logN shift/or instructions (N being the bytes of the combined wide		// needs logN shift/or instructions (N being the bytes of the combined wide
// shadow).		// shadow).
unsigned WideShadowBitWidth = WideShadowTy->getIntegerBitWidth();
const uint64_t BytesPerWideShadow = WideShadowBitWidth / DFS.ShadowWidthBits;

for (uint64_t ByteOfs = BytesPerWideShadow; ByteOfs < Size;		for (uint64_t ByteOfs = BytesPerWideShadow; ByteOfs < Size;
ByteOfs += BytesPerWideShadow) {		ByteOfs += BytesPerWideShadow) {
WideAddr = IRB.CreateGEP(WideShadowTy, WideAddr,		WideAddr = IRB.CreateGEP(WideShadowTy, WideAddr,
ConstantInt::get(DFS.IntptrTy, 1));		ConstantInt::get(DFS.IntptrTy, 1));
Value *NextWideShadow =		Value *NextWideShadow =
IRB.CreateAlignedLoad(WideShadowTy, WideAddr, ShadowAlign);		IRB.CreateAlignedLoad(WideShadowTy, WideAddr, ShadowAlign);
CombinedWideShadow = IRB.CreateOr(CombinedWideShadow, NextWideShadow);		CombinedWideShadow = IRB.CreateOr(CombinedWideShadow, NextWideShadow);
if (ShouldTrackOrigins) {		if (ShouldTrackOrigins) {
Shadows.push_back(NextWideShadow);		Value *NextOrigin = DFS.loadNextOrigin(Pos, OriginAlign, &OriginAddr);
OriginAddr = IRB.CreateGEP(DFS.OriginTy, OriginAddr,		AppendWideShadowAndOrigin(NextWideShadow, NextOrigin);
ConstantInt::get(DFS.IntptrTy, 1));
Origins.push_back(
IRB.CreateAlignedLoad(DFS.OriginTy, OriginAddr, OriginAlign));
}		}
}		}
for (unsigned Width = WideShadowBitWidth / 2; Width >= DFS.ShadowWidthBits;		for (unsigned Width = WideShadowBitWidth / 2; Width >= DFS.ShadowWidthBits;
Width >>= 1) {		Width >>= 1) {
Value *ShrShadow = IRB.CreateLShr(CombinedWideShadow, Width);		Value *ShrShadow = IRB.CreateLShr(CombinedWideShadow, Width);
CombinedWideShadow = IRB.CreateOr(CombinedWideShadow, ShrShadow);		CombinedWideShadow = IRB.CreateOr(CombinedWideShadow, ShrShadow);
}		}
return {IRB.CreateTrunc(CombinedWideShadow, DFS.PrimitiveShadowTy),		return {IRB.CreateTrunc(CombinedWideShadow, DFS.PrimitiveShadowTy),
ShouldTrackOrigins		ShouldTrackOrigins
? combineOrigins(Shadows, Origins, Pos,		? combineOrigins(Shadows, Origins, Pos,
ConstantInt::getSigned(IRB.getInt64Ty(), 0))		ConstantInt::getSigned(IRB.getInt64Ty(), 0))
: DFS.ZeroOrigin};		: DFS.ZeroOrigin};
}		}

Value *DFSanFunction::loadLegacyShadowFast(Value *ShadowAddr, uint64_t Size,		Value *DFSanFunction::loadLegacyShadowFast(Value *ShadowAddr, uint64_t Size,
Align ShadowAlign,		Align ShadowAlign,
Instruction *Pos) {		Instruction *Pos) {
// Fast path for the common case where each byte has identical shadow: load		// Fast path for the common case where each byte has identical shadow: load
// shadow 64 (or 32) bits at a time, fall out to a __dfsan_union_load call if		// shadow 64 (or 32) bits at a time, fall out to a __dfsan_union_load call if
// any shadow is non-equal.		// any shadow is non-equal.
[...]

llvm/test/Instrumentation/DataFlowSanitizer/origin_load.ll

[...]	define i64 @load64(i64* %p) {
; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK16-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 16		; CHECK16-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 16
; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK16-NEXT: %[[#SHADOW:]] = trunc i64 %[[#WIDE_SHADOW]] to i[[#SBITS]]		; CHECK16-NEXT: %[[#SHADOW:]] = trunc i64 %[[#WIDE_SHADOW]] to i[[#SBITS]]
; CHECK16-NEXT: %[[#SHADOW_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW2]], 0		; CHECK16-NEXT: %[[#SHADOW_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW2]], 0
; CHECK16-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW_NZ]], i32 %[[#ORIGIN2]], i32 %[[#ORIGIN]]		; CHECK16-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW_NZ]], i32 %[[#ORIGIN2]], i32 %[[#ORIGIN]]

; COMM: On fast8, no need to OR the wide shadow but one more shift is needed.		; COMM: On fast8, no need to OR the wide shadow but one more shift is needed.
		; CHECK8-NEXT: %[[#WIDE_SHADOW_LO:]] = shl i64 %[[#WIDE_SHADOW]], 32
		; CHECK8-NEXT: %[[#ORIGIN_PTR2:]] = getelementptr i32, i32* %[[#ORIGIN_PTR]], i64 1
		; CHECK8-NEXT: %[[#ORIGIN2:]] = load i32, i32* %[[#ORIGIN_PTR2]], align 8
; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 32		; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 32
; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 16		; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 16
; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 8		; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 8
; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK8-NEXT: %[[#SHADOW:]] = trunc i64 %[[#WIDE_SHADOW]] to i[[#SBITS]]		; CHECK8-NEXT: %[[#SHADOW:]] = trunc i64 %[[#WIDE_SHADOW]] to i[[#SBITS]]
		; CHECK8-NEXT: %[[#SHADOW_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW_LO]], 0
		; CHECK8-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW_NZ]], i32 %[[#ORIGIN]], i32 %[[#ORIGIN2]]

; COMBINE_LOAD_PTR-NEXT: %[[#SHADOW:]] = or i[[#SBITS]] %[[#SHADOW]], %[[#PS]]		; COMBINE_LOAD_PTR-NEXT: %[[#SHADOW:]] = or i[[#SBITS]] %[[#SHADOW]], %[[#PS]]
; COMBINE_LOAD_PTR-NEXT: %[[#NZ:]] = icmp ne i[[#SBITS]] %[[#PS]], 0		; COMBINE_LOAD_PTR-NEXT: %[[#NZ:]] = icmp ne i[[#SBITS]] %[[#PS]], 0
; COMBINE_LOAD_PTR-NEXT: %[[#ORIGIN:]] = select i1 %[[#NZ]], i32 %[[#PO]], i32 %[[#ORIGIN]]		; COMBINE_LOAD_PTR-NEXT: %[[#ORIGIN:]] = select i1 %[[#NZ]], i32 %[[#PO]], i32 %[[#ORIGIN]]

; CHECK-NEXT: %a = load i64, i64* %p, align 8		; CHECK-NEXT: %a = load i64, i64* %p, align 8
; CHECK-NEXT: store i[[#SBITS]] %[[#SHADOW]], i[[#SBITS]]* bitcast ([100 x i64]* @__dfsan_retval_tls to i[[#SBITS]]*), align [[ALIGN]]		; CHECK-NEXT: store i[[#SBITS]] %[[#SHADOW]], i[[#SBITS]]* bitcast ([100 x i64]* @__dfsan_retval_tls to i[[#SBITS]]*), align [[ALIGN]]
; CHECK-NEXT: store i32 %[[#ORIGIN]], i32* @__dfsan_retval_origin_tls, align 4		; CHECK-NEXT: store i32 %[[#ORIGIN]], i32* @__dfsan_retval_origin_tls, align 4
[...]	define i128 @load128(i128* %p) {
; CHECK-NEXT: %[[#SHADOW_ADDR:INTP+1]] = and i64 %[[#INTP]], [[#MASK]]		; CHECK-NEXT: %[[#SHADOW_ADDR:INTP+1]] = and i64 %[[#INTP]], [[#MASK]]
; CHECK16-NEXT: %[[#SHADOW_ADDR:]] = mul i64 %[[#SHADOW_ADDR]], 2		; CHECK16-NEXT: %[[#SHADOW_ADDR:]] = mul i64 %[[#SHADOW_ADDR]], 2
; CHECK-NEXT: %[[#SHADOW_PTR:]] = inttoptr i64 %[[#SHADOW_ADDR]] to i[[#SBITS]]*		; CHECK-NEXT: %[[#SHADOW_PTR:]] = inttoptr i64 %[[#SHADOW_ADDR]] to i[[#SBITS]]*
; CHECK-NEXT: %[[#ORIGIN_ADDR:]] = add i64 %[[#INTP+1]], [[#ORIGIN_MASK]]		; CHECK-NEXT: %[[#ORIGIN_ADDR:]] = add i64 %[[#INTP+1]], [[#ORIGIN_MASK]]
; CHECK-NEXT: %[[#ORIGIN_PTR:]] = inttoptr i64 %[[#ORIGIN_ADDR]] to i32*		; CHECK-NEXT: %[[#ORIGIN_PTR:]] = inttoptr i64 %[[#ORIGIN_ADDR]] to i32*
; CHECK-NEXT: %[[#ORIGIN:]] = load i32, i32* %[[#ORIGIN_PTR]], align 8		; CHECK-NEXT: %[[#ORIGIN:]] = load i32, i32* %[[#ORIGIN_PTR]], align 8
; CHECK-NEXT: %[[#WIDE_SHADOW_PTR:]] = bitcast i[[#SBITS]]* %[[#SHADOW_PTR]] to i64*		; CHECK-NEXT: %[[#WIDE_SHADOW_PTR:]] = bitcast i[[#SBITS]]* %[[#SHADOW_PTR]] to i64*
; CHECK-NEXT: %[[#WIDE_SHADOW:]] = load i64, i64* %[[#WIDE_SHADOW_PTR]], align [[#SBYTES]]		; CHECK-NEXT: %[[#WIDE_SHADOW:]] = load i64, i64* %[[#WIDE_SHADOW_PTR]], align [[#SBYTES]]
		; CHECK8-NEXT: %[[#WIDE_SHADOW_LO:]] = shl i64 %[[#WIDE_SHADOW]], 32
		; CHECK8-NEXT: %[[#ORIGIN_PTR2:]] = getelementptr i32, i32* %[[#ORIGIN_PTR]], i64 1
		; CHECK8-NEXT: %[[#ORIGIN2:]] = load i32, i32* %[[#ORIGIN_PTR2]], align 8
; CHECK-NEXT: %[[#WIDE_SHADOW_PTR2:]] = getelementptr i64, i64* %[[#WIDE_SHADOW_PTR]], i64 1		; CHECK-NEXT: %[[#WIDE_SHADOW_PTR2:]] = getelementptr i64, i64* %[[#WIDE_SHADOW_PTR]], i64 1
; CHECK-NEXT: %[[#WIDE_SHADOW2:]] = load i64, i64* %[[#WIDE_SHADOW_PTR2]], align [[#SBYTES]]		; CHECK-NEXT: %[[#WIDE_SHADOW2:]] = load i64, i64* %[[#WIDE_SHADOW_PTR2]], align [[#SBYTES]]
; CHECK-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW2]]		; CHECK-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW2]]
; CHECK-NEXT: %[[#ORIGIN_PTR2:]] = getelementptr i32, i32* %[[#ORIGIN_PTR]], i64 1
; CHECK-NEXT: %[[#ORIGIN2:]] = load i32, i32* %[[#ORIGIN_PTR2]], align 8

; COMM: On fast16, we need to OR 4x64bits for the wide shadow, before ORing its bytes.		; COMM: On fast16, we need to OR 4x64bits for the wide shadow, before ORing its bytes.
		; CHECK16-NEXT: %[[#ORIGIN_PTR2:]] = getelementptr i32, i32* %[[#ORIGIN_PTR]], i64 1
		; CHECK16-NEXT: %[[#ORIGIN2:]] = load i32, i32* %[[#ORIGIN_PTR2]], align 8
; CHECK16-NEXT: %[[#WIDE_SHADOW_PTR3:]] = getelementptr i64, i64* %[[#WIDE_SHADOW_PTR2]], i64 1		; CHECK16-NEXT: %[[#WIDE_SHADOW_PTR3:]] = getelementptr i64, i64* %[[#WIDE_SHADOW_PTR2]], i64 1
; CHECK16-NEXT: %[[#WIDE_SHADOW3:]] = load i64, i64* %[[#WIDE_SHADOW_PTR3]], align [[#SBYTES]]		; CHECK16-NEXT: %[[#WIDE_SHADOW3:]] = load i64, i64* %[[#WIDE_SHADOW_PTR3]], align [[#SBYTES]]
; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW3]]		; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW3]]
; CHECK16-NEXT: %[[#ORIGIN_PTR3:]] = getelementptr i32, i32* %[[#ORIGIN_PTR2]], i64 1		; CHECK16-NEXT: %[[#ORIGIN_PTR3:]] = getelementptr i32, i32* %[[#ORIGIN_PTR2]], i64 1
; CHECK16-NEXT: %[[#ORIGIN3:]] = load i32, i32* %[[#ORIGIN_PTR3]], align 8		; CHECK16-NEXT: %[[#ORIGIN3:]] = load i32, i32* %[[#ORIGIN_PTR3]], align 8
; CHECK16-NEXT: %[[#WIDE_SHADOW_PTR4:]] = getelementptr i64, i64* %[[#WIDE_SHADOW_PTR3]], i64 1		; CHECK16-NEXT: %[[#WIDE_SHADOW_PTR4:]] = getelementptr i64, i64* %[[#WIDE_SHADOW_PTR3]], i64 1
; CHECK16-NEXT: %[[#WIDE_SHADOW4:]] = load i64, i64* %[[#WIDE_SHADOW_PTR4]], align [[#SBYTES]]		; CHECK16-NEXT: %[[#WIDE_SHADOW4:]] = load i64, i64* %[[#WIDE_SHADOW_PTR4]], align [[#SBYTES]]
; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW4]]		; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW4]]
; CHECK16-NEXT: %[[#ORIGIN_PTR4:]] = getelementptr i32, i32* %[[#ORIGIN_PTR3]], i64 1		; CHECK16-NEXT: %[[#ORIGIN_PTR4:]] = getelementptr i32, i32* %[[#ORIGIN_PTR3]], i64 1
; CHECK16-NEXT: %[[#ORIGIN4:]] = load i32, i32* %[[#ORIGIN_PTR4]], align 8		; CHECK16-NEXT: %[[#ORIGIN4:]] = load i32, i32* %[[#ORIGIN_PTR4]], align 8
; CHECK16-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 32		; CHECK16-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 32
; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK16-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 16		; CHECK16-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 16
; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK16-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK16-NEXT: %[[#SHADOW:]] = trunc i64 %[[#WIDE_SHADOW]] to i[[#SBITS]]		; CHECK16-NEXT: %[[#SHADOW:]] = trunc i64 %[[#WIDE_SHADOW]] to i[[#SBITS]]
; CHECK16-NEXT: %[[#SHADOW2_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW2]], 0		; CHECK16-NEXT: %[[#SHADOW2_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW2]], 0
; CHECK16-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW2_NZ]], i32 %[[#ORIGIN2]], i32 %[[#ORIGIN]]		; CHECK16-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW2_NZ]], i32 %[[#ORIGIN2]], i32 %[[#ORIGIN]]
; CHECK16-NEXT: %[[#SHADOW3_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW3]], 0		; CHECK16-NEXT: %[[#SHADOW3_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW3]], 0
; CHECK16-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW3_NZ]], i32 %[[#ORIGIN3]], i32 %[[#ORIGIN]]		; CHECK16-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW3_NZ]], i32 %[[#ORIGIN3]], i32 %[[#ORIGIN]]
; CHECK16-NEXT: %[[#SHADOW4_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW4]], 0		; CHECK16-NEXT: %[[#SHADOW4_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW4]], 0
; CHECK16-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW4_NZ]], i32 %[[#ORIGIN4]], i32 %[[#ORIGIN]]		; CHECK16-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW4_NZ]], i32 %[[#ORIGIN4]], i32 %[[#ORIGIN]]

; COMM: On fast8, we need to OR 2x64bits for the wide shadow, before ORing its bytes (one more shift).		; COMM: On fast8, we need to OR 2x64bits for the wide shadow, before ORing its bytes (one more shift).
		; CHECK8-NEXT: %[[#ORIGIN_PTR3:]] = getelementptr i32, i32* %[[#ORIGIN_PTR2]], i64 1
		; CHECK8-NEXT: %[[#ORIGIN3:]] = load i32, i32* %[[#ORIGIN_PTR3]], align 8
		; CHECK8-NEXT: %[[#WIDE_SHADOW2_LO:]] = shl i64 %[[#WIDE_SHADOW2]], 32
		; CHECK8-NEXT: %[[#ORIGIN_PTR4:]] = getelementptr i32, i32* %[[#ORIGIN_PTR3]], i64 1
		; CHECK8-NEXT: %[[#ORIGIN4:]] = load i32, i32* %[[#ORIGIN_PTR4]], align 8
; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 32		; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 32
; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 16		; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 16
; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 8		; CHECK8-NEXT: %[[#WIDE_SHADOW_SHIFTED:]] = lshr i64 %[[#WIDE_SHADOW]], 8
; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]		; CHECK8-NEXT: %[[#WIDE_SHADOW:]] = or i64 %[[#WIDE_SHADOW]], %[[#WIDE_SHADOW_SHIFTED]]
; CHECK8-NEXT: %[[#SHADOW:]] = trunc i64 %[[#WIDE_SHADOW]] to i[[#SBITS]]		; CHECK8-NEXT: %[[#SHADOW:]] = trunc i64 %[[#WIDE_SHADOW]] to i[[#SBITS]]
		; CHECK8-NEXT: %[[#SHADOW_LO_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW_LO]], 0
		; CHECK8-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW_LO_NZ]], i32 %[[#ORIGIN]], i32 %[[#ORIGIN2]]
; CHECK8-NEXT: %[[#SHADOW2_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW2]], 0		; CHECK8-NEXT: %[[#SHADOW2_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW2]], 0
; CHECK8-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW2_NZ]], i32 %[[#ORIGIN2]], i32 %[[#ORIGIN]]		; CHECK8-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW2_NZ]], i32 %[[#ORIGIN4]], i32 %[[#ORIGIN]]
		; CHECK8-NEXT: %[[#SHADOW2_LO_NZ:]] = icmp ne i64 %[[#WIDE_SHADOW2_LO]], 0
		; CHECK8-NEXT: %[[#ORIGIN:]] = select i1 %[[#SHADOW2_LO_NZ]], i32 %[[#ORIGIN3]], i32 %[[#ORIGIN]]

; COMBINE_LOAD_PTR-NEXT: %[[#SHADOW:]] = or i[[#SBITS]] %[[#SHADOW]], %[[#PS]]		; COMBINE_LOAD_PTR-NEXT: %[[#SHADOW:]] = or i[[#SBITS]] %[[#SHADOW]], %[[#PS]]
; COMBINE_LOAD_PTR-NEXT: %[[#NZ:]] = icmp ne i[[#SBITS]] %[[#PS]], 0		; COMBINE_LOAD_PTR-NEXT: %[[#NZ:]] = icmp ne i[[#SBITS]] %[[#PS]], 0
; COMBINE_LOAD_PTR-NEXT: %[[#ORIGIN:]] = select i1 %[[#NZ]], i32 %[[#PO]], i32 %[[#ORIGIN]]		; COMBINE_LOAD_PTR-NEXT: %[[#ORIGIN:]] = select i1 %[[#NZ]], i32 %[[#PO]], i32 %[[#ORIGIN]]

; CHECK-NEXT: %a = load i128, i128* %p, align 8		; CHECK-NEXT: %a = load i128, i128* %p, align 8
; CHECK-NEXT: store i[[#SBITS]] %[[#SHADOW]], i[[#SBITS]]* bitcast ([100 x i64]* @__dfsan_retval_tls to i[[#SBITS]]*), align [[ALIGN]]		; CHECK-NEXT: store i[[#SBITS]] %[[#SHADOW]], i[[#SBITS]]* bitcast ([100 x i64]* @__dfsan_retval_tls to i[[#SBITS]]*), align [[ALIGN]]
; CHECK-NEXT: store i32 %[[#ORIGIN]], i32* @__dfsan_retval_origin_tls, align 4		; CHECK-NEXT: store i32 %[[#ORIGIN]], i32* @__dfsan_retval_origin_tls, align 4
[...]