This is an archive of the discontinued LLVM Phabricator instance.

compiler-rt/lib/dfsan/dfsan.cpp
24	We still need dfsan_flags.h.
543	The start + size implementation seems cleaner to me. Can we use that for `WriteShadowWithSize` and make `WriteShadowInRange` call that instead?
626	This might be redundant since on Linux releasing the memory zeroes it out on the next load.
637	Should we just call `ReleaseOrigins`?
670–671	What's the reason to change this from `ReleaseMemoryPagesToOS`?
712–713	Does this do anything since we already called `MmapFixedSuperNoReserve`?
721	Can we simplify by moving all this logic into `WriteShadowWithSize`? It's confusing to have multiple layers of functions to set the shadow, with some repeated checks in each layer.
967	What's going on here? `intercept_tld_get_addr` appears unused, so why do we need to override it?
1082	`dfsan_init` is now unused. Why do we need this?
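The "start + size" versus "begin/end" question raised for the shadow writers can be illustrated with a small stand-alone sketch. This is not the patch's code: `Label`, and the two functions below, operate on an ordinary array rather than real shadow memory, and `count` is assumed to be a number of labels rather than a number of bytes. The zero-label path also shows the "skip the store if the slot is already zero" idea described in the diff's comment about shared zero pages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using Label = std::uint16_t;  // stand-in for dfsan_label (16-bit in this revision)

// Base function: writes `count` labels starting at `shadow`.
// For a zero label, slots that are already zero are skipped so that
// untouched zero-filled pages are not dirtied by a redundant store.
// Returns the number of stores performed (for illustration only).
static std::size_t WriteShadowWithSize(Label label, Label *shadow,
                                       std::size_t count) {
  std::size_t stores = 0;
  for (std::size_t i = 0; i < count; ++i) {
    if (label == 0 && shadow[i] == 0)
      continue;
    shadow[i] = label;
    ++stores;
  }
  return stores;
}

// Range wrapper: [beg, end) is given as byte addresses, so the byte length
// must be a multiple of sizeof(Label) before it is converted to a count.
static std::size_t WriteShadowInRange(Label label, std::uintptr_t beg,
                                      std::uintptr_t end) {
  assert(beg <= end);
  assert((end - beg) % sizeof(Label) == 0);
  return WriteShadowWithSize(label, reinterpret_cast<Label *>(beg),
                             (end - beg) / sizeof(Label));
}
```

With this layering the divisibility check lives in one place (the range wrapper), which is the trade-off the reply to this comment weighs.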
compiler-rt/lib/dfsan/dfsan.h
24	This is unused.
70	Why int instead of bool?
71	Since we currently only call `dfsan_init` from preinit_array, which happens once in a single thread, do we actually need to guard against multiple initialization?
122	This looks unused.
compiler-rt/lib/dfsan/dfsan_allocator.cpp
32	Maybe we don't need this. We release (zero) shadow on unmap, so it should still be zero if we mmap it again.
43	This appears unused except for `kSpaceBeg` below. The name sounds like the max size of the heap, rather than the start address of it.

stephan.yichao.zhao marked 4 inline comments as done.May 3 2021, 4:28 PM

stephan.yichao.zhao added inline comments.

compiler-rt/lib/dfsan/dfsan.cpp
24	dfsan_flags.h is included in dfsan.h
543	If we use "WriteShadowWithSize(dfsan_label label, uptr beg_shadow_addr, uptr size)" as the base function, the size should preferably be the number of labels. If not, we may have to assert size % sizeof(dfsan_label) == 0. But if size is the # of labels, we would have WriteShadowInRange(dfsan_label label, uptr beg_shadow_addr, uptr end_shadow_addr) { assert(beg_shadow_addr <= end_shadow_addr); assert((end_shadow_addr - beg_shadow_addr) % 2 == 0); WriteShadowWithSize(label, (end_shadow_addr - beg_shadow_addr) / 2); } I feel that after the fast16label -> fast8label change, it will be possible to do this because then #bytes == #labels.
626	This mainly follows what MSan does when the allocator unmaps memory: https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/msan/msan_allocator.cpp#L33 dfsan_set_label only does mmap; at the munmap event, we call ReleaseMemoryPagesToOS explicitly.
637	ReleaseOrigins does not call ReleaseMemoryPagesToOS.
670–671	This mainly follows MSan's approach. Although MSan does not have a similar function to release origins, it has SetShadow (https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/msan/msan_poisoning.cpp#L216) to zero out shadow, and it calls MmapFixedSuperNoReserve w/o ReleaseMemoryPagesToOS. To be consistent with this, ReleaseOrigins replaces ReleaseMemoryPagesToOS with MmapFixedSuperNoReserve.
712–713	This should have been removed to be consistent with https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/msan/msan_poisoning.cpp#L216
721	WriteShadowInRange works for both 0 and non-zero labels. Refactored this code into ReleaseOrClearShadows.
967	This will be used by D101204 when interceptors are defined.
1082	This will be used by D101204.
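The shape of `ReleaseOrClearShadows` mentioned above (zero the unaligned head and tail by writing, and remap the whole pages in between) comes down to some rounding arithmetic. The sketch below only computes the three sub-ranges. `Split`, `SplitForRelease`, and the local `RoundUpTo`/`RoundDownTo` helpers are illustrative stand-ins written for this note, not the runtime's functions, and the page size is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstdint>

using uptr = std::uintptr_t;

// Local helpers; `boundary` must be a power of two.
static uptr RoundUpTo(uptr x, uptr boundary) {
  return (x + boundary - 1) & ~(boundary - 1);
}
static uptr RoundDownTo(uptr x, uptr boundary) { return x & ~(boundary - 1); }

struct Split {
  bool has_pages;            // true if a whole page lies inside [beg, end)
  uptr head_beg, head_end;   // unaligned prefix, cleared by writing zeros
  uptr page_beg, page_end;   // whole pages, candidates for remapping
  uptr tail_beg, tail_end;   // unaligned suffix, cleared by writing zeros
};

static Split SplitForRelease(uptr beg, uptr end, uptr page_size) {
  const uptr beg_aligned = RoundUpTo(beg, page_size);
  const uptr end_aligned = RoundDownTo(end, page_size);
  if (beg_aligned >= end_aligned) {
    // No whole page in range: everything is handled by plain writes.
    return {false, beg, end, 0, 0, end, end};
  }
  return {true, beg, beg_aligned, beg_aligned, end_aligned, end_aligned, end};
}
```

For example, with 4 KiB pages the range [0x1800, 0x4400) splits into a head [0x1800, 0x2000), whole pages [0x2000, 0x4000), and a tail [0x4000, 0x4400).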
compiler-rt/lib/dfsan/dfsan.h
24	This would be used by those new/delete injections dfsan_new_delete.cpp in D101204.
71	I found dfsan_init_is_running is mainly to ensure interceptors use real calls w/o wrappers before dfsan_init is done. See dfsan_interceptors.cpp from D101204. Otherwise mmap called by dfsan_init can be intercepted too.
122	This will be called by dfsan_interceptors.cpp in D101204. It makes sure that when interceptors are called, dfsan_init (shadow/origin allocation) has already been done, because wrapper code runs. I also feel like dfsan_init/dfsan_is_running/dfsan_inited are not straightforward. But from MSan's code blame, they seem useful in some corner cases. For example, calloc, realloc and malloc can be called by DL before preinit_array runs, etc.
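The role of the two flags described in these replies can be sketched as follows. This is a hypothetical, stand-alone model and not the code from dfsan_interceptors.cpp or D101204: the namespace `sketch`, `InterceptedMalloc`, `Init`, and the two counters are invented for illustration. It shows a wrapper that falls back to the plain call while initialization is running, and that triggers initialization itself if it is entered first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace sketch {

bool inited = false;           // plays the role of dfsan_inited
bool init_is_running = false;  // plays the role of dfsan_init_is_running
int wrapped_calls = 0;         // calls that took the instrumented path
int raw_calls = 0;             // calls that bypassed the wrapper logic

void Init();

void *InterceptedMalloc(std::size_t size) {
  if (init_is_running) {
    // Initialization itself allocated: use the plain call, no wrapper logic.
    ++raw_calls;
    return std::malloc(size);
  }
  if (!inited)
    Init();  // An early caller arrived before initialization ran.
  ++wrapped_calls;
  return std::malloc(size);
}

void Init() {
  if (inited || init_is_running)
    return;
  init_is_running = true;
  void *scratch = InterceptedMalloc(16);  // allocation made during init
  std::free(scratch);
  init_is_running = false;
  inited = true;
}

}  // namespace sketch
```

A first call to `sketch::InterceptedMalloc` therefore runs `Init` once, and the allocation made inside `Init` is counted as a raw call rather than a wrapped one.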
compiler-rt/lib/dfsan/dfsan_allocator.cpp
32	Thank you. Removed.
43	inlined the constant.

update

Harbormaster completed remote builds in B102415: Diff 342585.May 3 2021, 6:04 PM

morehouse added inline comments.May 4 2021, 11:17 AM

compiler-rt/lib/dfsan/dfsan.cpp
24	Yes. It's just a nit, but I usually try to explicitly include the things used in each file, so that we can change header files without touching all the places where they're included.
546–553
637	Shouldn't mmap(MAP_NORESERVE) have a similar effect on resident memory to madvise(MADV_DONTNEED)? Besides, we already have `ReleaseOrClearShadow` and `ReleaseOrigins`. Why do we need to reimplement that functionality in this function?
670–671	Don't they have a similar effect? It's confusing to use two different ways of releasing memory without at least a comment explaining why.
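The "similar effect" in question can be checked with a small Linux-only sketch. It compares `madvise(MADV_DONTNEED)` with a fixed anonymous remap on private anonymous memory. The helper names are invented for this note, and the remap is only assumed to resemble what `MmapFixedSuperNoReserve` does, since its implementation is not shown in this review.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Returns true if every byte in [p, p + n) reads as zero.
static bool AllZero(const unsigned char *p, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    if (p[i] != 0)
      return false;
  return true;
}

// Maps `n` bytes of private anonymous memory and fills it with 0xAB.
static unsigned char *MapDirty(std::size_t n) {
  void *p = mmap(nullptr, n, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  if (p == MAP_FAILED)
    return nullptr;
  unsigned char *bytes = static_cast<unsigned char *>(p);
  for (std::size_t i = 0; i < n; ++i)
    bytes[i] = 0xAB;
  return bytes;
}

// Option 1: keep the mapping and drop its pages.
static bool ZeroByMadvise(unsigned char *p, std::size_t n) {
  return madvise(p, n, MADV_DONTNEED) == 0;
}

// Option 2: replace the mapping in place with a fresh anonymous one.
static bool ZeroByRemap(unsigned char *p, std::size_t n) {
  void *q = mmap(p, n, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED, -1, 0);
  return q == static_cast<void *>(p);
}
```

On Linux both options leave the range reading as zero on the next access, which is the behaviour the question relies on. Other platforms may treat `MADV_DONTNEED` differently.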
compiler-rt/lib/dfsan/dfsan.h
24	In that case, maybe we should define it in the file it will be used in, instead of here.

stephan.yichao.zhao marked 5 inline comments as done.May 4 2021, 2:07 PM

stephan.yichao.zhao added inline comments.

compiler-rt/lib/dfsan/dfsan.cpp
24	moved #include "dfsan/dfsan_flags.h" to each cc file.
546–553	Thank you.
637	removed madvise(MADV_DONTNEED), and replaced dfsan_release_meta_memory by dfsan_set_label(0).
670–671	removed those madvise calls.
compiler-rt/lib/dfsan/dfsan.h
24	moved to dfsan_new_delete.cpp

update

stephan.yichao.zhao mentioned this in D101857: [dfsan] move dfsan_flags.h to cc files.May 4 2021, 2:20 PM

Harbormaster completed remote builds in B102606: Diff 342861.May 4 2021, 3:15 PM

morehouse accepted this revision.May 4 2021, 3:46 PM

morehouse added inline comments.

compiler-rt/lib/dfsan/dfsan_allocator.cpp
32	Looks like you added it back. I think we only need to set 0 label on unmap.

This revision is now accepted and ready to land.May 4 2021, 3:46 PM

Jianzhou Zhao <jianzhouzh@google.com> mentioned this in rG36cec26b3857: [dfsan] move dfsan_flags.h to cc files.May 4 2021, 3:54 PM

stephan.yichao.zhao added inline comments.May 4 2021, 4:15 PM

compiler-rt/lib/dfsan/dfsan_allocator.cpp
32	I found those mmap interceptors in dfsan_interceptors.cpp also do dfsan_set_label(0). https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/dfsan/dfsan_interceptors.cpp#L41 For the same reason they are not needed either. In the next change, I will remove all mmap related dfsan_set_label(0) together.

update

This revision was landed with ongoing or failed builds.May 4 2021, 5:52 PM

Closed by commit rG1fb612d060e7: [dfsan] Add a DFSan allocator (authored by Jianzhou Zhao <jianzhouzh@google.com>). · Explain Why

This revision was automatically updated to reflect the committed changes.

Jianzhou Zhao <jianzhouzh@google.com> added a commit: rG1fb612d060e7: [dfsan] Add a DFSan allocator.

stephan.yichao.zhao added inline comments.May 4 2021, 6:18 PM

compiler-rt/lib/dfsan/dfsan_allocator.cpp
32	Thought about the problem a bit more. I think our discussion is based on an assumption that every munmap will be precisely intercepted. This may not be true: if some code calls munmap by system call number directly, like this sanitizer allocator does, there is no way to intercept that kind of munmap. So in general, we may not be able to remove intercepting those mmap and mmap64. But for this OnMap, it should be safe because its corresponding OnUnmap is injected by our code. So in the following change, I will only remove dfsan_set_label(0) for OnMap.

Harbormaster completed remote builds in B102654: Diff 342922.May 4 2021, 6:24 PM

Jianzhou Zhao <jianzhouzh@google.com> mentioned this in rG79debe8d7b58: [dfsan] Turn off all dfsan test cases on non x86_64 OSs.May 4 2021, 10:33 PM

morehouse added inline comments.May 5 2021, 8:37 AM

compiler-rt/lib/dfsan/dfsan_allocator.cpp
32	> Thought about the problem a bit more. I think our discussion is based on an assumption that every munmap will be precisely intercepted. This may not be true: if some code calls munmap by system call number directly, like this sanitizer allocator does, there is no way to intercept that kind of munmap. So in general, we may not be able to remove intercepting those mmap and mmap64.

Ah yes, good point.

> But for this OnMap, it should be safe because its corresponding OnUnmap is injected by our code.

Well, I'm not 100% sure it's safe. Something else could do `syscall(SYS_mmap)`, then taint the memory, then `syscall(SYS_munmap)`. Then we could end up mapping that same region for our heap and calling this `OnMap` callback.

> So in the following change, I will only remove dfsan_set_label(0) for OnMap.
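The scenario in this reply, mapping and unmapping by system call number so that wrappers around the libc `mmap`/`munmap` symbols never see the calls, looks roughly like this. The sketch assumes Linux on x86_64, where `SYS_mmap` takes the six classic `mmap` arguments. `RawMmap` and `RawMunmap` are names invented for this note.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

// Maps `n` bytes of anonymous memory by system call number. A wrapper
// installed around the libc symbol `mmap` would not observe this call.
static void *RawMmap(std::size_t n) {
  long r = syscall(SYS_mmap, static_cast<void *>(nullptr), n,
                   static_cast<long>(PROT_READ | PROT_WRITE),
                   static_cast<long>(MAP_PRIVATE | MAP_ANONYMOUS), -1L, 0L);
  // On failure syscall() returns -1, which is the same bit pattern as
  // MAP_FAILED, so callers can compare against MAP_FAILED as usual.
  return reinterpret_cast<void *>(r);
}

// Unmaps by system call number, likewise invisible to a libc-level wrapper.
static int RawMunmap(void *p, std::size_t n) {
  return static_cast<int>(syscall(SYS_munmap, p, n));
}
```

Memory obtained and released this way could later be handed out again by the kernel for the sanitizer's own heap without any unmap hook having run in between, which is why the reply hesitates to drop the clearing in `OnMap`.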

stephan.yichao.zhao mentioned this in D101204: [dfsan] Use the sanitizer allocator to reduce memory cost.May 5 2021, 9:18 AM


Diff 342585

compiler-rt/lib/dfsan/CMakeLists.txt

	include_directories(..)			include_directories(..)

	# Runtime library sources and build flags.			# Runtime library sources and build flags.
	set(DFSAN_RTL_SOURCES			set(DFSAN_RTL_SOURCES
	dfsan.cpp			dfsan.cpp
				dfsan_allocator.cpp
	dfsan_chained_origin_depot.cpp			dfsan_chained_origin_depot.cpp
	dfsan_custom.cpp			dfsan_custom.cpp
	dfsan_interceptors.cpp			dfsan_interceptors.cpp
	dfsan_thread.cpp			dfsan_thread.cpp
	)			)

	set(DFSAN_RTL_HEADERS			set(DFSAN_RTL_HEADERS
	dfsan.h			dfsan.h
				dfsan_allocator.h
	dfsan_chained_origin_depot.h			dfsan_chained_origin_depot.h
	dfsan_flags.inc			dfsan_flags.inc
	dfsan_flags.h			dfsan_flags.h
	dfsan_platform.h			dfsan_platform.h
	dfsan_thread.h			dfsan_thread.h
	)			)

	set(DFSAN_COMMON_CFLAGS ${SANITIZER_COMMON_CFLAGS})			set(DFSAN_COMMON_CFLAGS ${SANITIZER_COMMON_CFLAGS})

compiler-rt/lib/dfsan/dfsan.h

#ifndef DFSAN_H		#ifndef DFSAN_H
#define DFSAN_H		#define DFSAN_H

#include "sanitizer_common/sanitizer_internal_defs.h"		#include "sanitizer_common/sanitizer_internal_defs.h"

#include "dfsan_flags.h"		#include "dfsan_flags.h"
#include "dfsan_platform.h"		#include "dfsan_platform.h"

		#ifndef DFSAN_REPLACE_OPERATORS_NEW_AND_DELETE
		#define DFSAN_REPLACE_OPERATORS_NEW_AND_DELETE 1
		Lint: Pre-merge checks: clang-format: please reformat the code: -#define DFSAN_REPLACE_OPERATORS_NEW_AND_DELETE 1 +# define DFSAN_REPLACE_OPERATORS_NEW_AND_DELETE 1
		#endif
		morehouse: This is unused.
		stephan.yichao.zhao: This would be used by those new/delete injections dfsan_new_delete.cpp in D101204.
		morehouse: In that case, maybe we should define it in the file it will be used in, instead of here.
		stephan.yichao.zhao: moved to dfsan_new_delete.cpp

using __sanitizer::u16;		using __sanitizer::u16;
using __sanitizer::u32;		using __sanitizer::u32;
using __sanitizer::uptr;		using __sanitizer::uptr;

// Copy declarations from public sanitizer/dfsan_interface.h header here.		// Copy declarations from public sanitizer/dfsan_interface.h header here.
typedef u16 dfsan_label;		typedef u16 dfsan_label;
typedef u32 dfsan_origin;		typedef u32 dfsan_origin;


template <typename T>		template <typename T>
void dfsan_set_label(dfsan_label label, T &data) { // NOLINT		void dfsan_set_label(dfsan_label label, T &data) { // NOLINT
dfsan_set_label(label, (void *)&data, sizeof(T));		dfsan_set_label(label, (void *)&data, sizeof(T));
}		}

namespace __dfsan {		namespace __dfsan {

		extern bool dfsan_inited;
		morehouse: Why int instead of bool?
		extern bool dfsan_init_is_running;
		morehouse: Since we currently only call `dfsan_init` from preinit_array, which happens once in a single thread, do we actually need to guard against multiple initialization?
		stephan.yichao.zhao: I found dfsan_init_is_running is mainly to ensure interceptors use real calls w/o wrappers before dfsan_init is done. See dfsan_interceptors.cpp from D101204. Otherwise mmap called by dfsan_init can be intercepted too.

void InitializeInterceptors();		void InitializeInterceptors();

inline dfsan_label *shadow_for(void *ptr) {		inline dfsan_label *shadow_for(void *ptr) {
return (dfsan_label *) ((((uptr) ptr) & ShadowMask()) << 1);		return (dfsan_label *) ((((uptr) ptr) & ShadowMask()) << 1);
}		}

inline const dfsan_label *shadow_for(const void *ptr) {		inline const dfsan_label *shadow_for(const void *ptr) {
return shadow_for(const_cast<void *>(ptr));		return shadow_for(const_cast<void *>(ptr));
inline bool is_shadow_addr_valid(uptr shadow_addr) {
return (uptr)shadow_addr >= ShadowAddr() && (uptr)shadow_addr < OriginAddr();		return (uptr)shadow_addr >= ShadowAddr() && (uptr)shadow_addr < OriginAddr();
}		}

inline bool has_valid_shadow_addr(const void *ptr) {		inline bool has_valid_shadow_addr(const void *ptr) {
const dfsan_label *ptr_s = shadow_for(ptr);		const dfsan_label *ptr_s = shadow_for(ptr);
return is_shadow_addr_valid((uptr)ptr_s);		return is_shadow_addr_valid((uptr)ptr_s);
}		}

		void dfsan_copy_memory(void *dst, const void *src, uptr size);
		void dfsan_release_meta_memory(const void *addr, uptr size);

		void dfsan_allocator_init();
		void dfsan_deallocate(void *ptr);

		void *dfsan_malloc(uptr size);
		void *dfsan_calloc(uptr nmemb, uptr size);
		void *dfsan_realloc(void *ptr, uptr size);
		void *dfsan_reallocarray(void *ptr, uptr nmemb, uptr size);
		void *dfsan_valloc(uptr size);
		void *dfsan_pvalloc(uptr size);
		void *dfsan_aligned_alloc(uptr alignment, uptr size);
		void *dfsan_memalign(uptr alignment, uptr size);
		int dfsan_posix_memalign(void **memptr, uptr alignment, uptr size);

		void dfsan_init();
		morehouse: This looks unused.
		stephan.yichao.zhao: This will be called by dfsan_interceptors.cpp in D101204. It makes sure that when interceptors are called, dfsan_init (shadow/origin allocation) has already been done, because wrapper code runs. I also feel like dfsan_init/dfsan_is_running/dfsan_inited are not straightforward. But from MSan's code blame, they seem useful in some corner cases. For example, calloc, realloc and malloc can be called by DL before preinit_array runs, etc.

} // namespace __dfsan		} // namespace __dfsan

#endif // DFSAN_H		#endif // DFSAN_H
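
The `shadow_for` helper in this header maps an application address to the address of its label: mask off the high bits, then shift left by one because each application byte has a 2-byte label in this revision. A minimal sketch of that arithmetic follows. The mask value is an assumption made for illustration (the real one comes from `ShadowMask()` in dfsan_platform.h, which is not shown in this review), and `ShadowAddrFor` and `kAssumedShadowMask` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

using uptr = std::uintptr_t;
static_assert(sizeof(uptr) == 8, "sketch assumes a 64-bit target");

// ASSUMPTION: illustrative mask only; the runtime takes it from ShadowMask().
constexpr uptr kAssumedShadowMask = ~static_cast<uptr>(0x700000000000);

// Same arithmetic as shadow_for(), expressed on integers instead of pointers.
constexpr uptr ShadowAddrFor(uptr app_addr) {
  return (app_addr & kAssumedShadowMask) << 1;
}
```

Because of the shift, consecutive application bytes map to shadow addresses two bytes apart, which is why byte counts and label counts differ while labels are 16-bit.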

compiler-rt/lib/dfsan/dfsan.cpp


// The public interface is defined in include/sanitizer/dfsan_interface.h whose // The public interface is defined in include/sanitizer/dfsan_interface.h whose

// functions are prefixed dfsan_ while the compiler interface functions are // functions are prefixed dfsan_ while the compiler interface functions are

// prefixed __dfsan_. // prefixed __dfsan_.

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#include "dfsan/dfsan.h" #include "dfsan/dfsan.h"

#include "dfsan/dfsan_chained_origin_depot.h" #include "dfsan/dfsan_chained_origin_depot.h"

#include "dfsan/dfsan_flags.h"

morehouse: We still need dfsan_flags.h.

stephan.yichao.zhao: dfsan_flags.h is included in dfsan.h

morehouse: Yes. It's just a nit, but I usually try to explicitly include the things used in each file, so that we can change header files without touching all the places where they're included.

stephan.yichao.zhao: moved #include "dfsan/dfsan_flags.h" to each cc file.

#include "dfsan/dfsan_origin.h" #include "dfsan/dfsan_origin.h"

#include "dfsan/dfsan_thread.h" #include "dfsan/dfsan_thread.h"

#include "sanitizer_common/sanitizer_atomic.h" #include "sanitizer_common/sanitizer_atomic.h"

#include "sanitizer_common/sanitizer_common.h" #include "sanitizer_common/sanitizer_common.h"

#include "sanitizer_common/sanitizer_file.h" #include "sanitizer_common/sanitizer_file.h"

#include "sanitizer_common/sanitizer_flag_parser.h" #include "sanitizer_common/sanitizer_flag_parser.h"

#include "sanitizer_common/sanitizer_flags.h" #include "sanitizer_common/sanitizer_flags.h"

#include "sanitizer_common/sanitizer_internal_defs.h" #include "sanitizer_common/sanitizer_internal_defs.h"

if (*(u64 *)addr == origin64)

continue; continue;

*(u64 *)addr = origin64; *(u64 *)addr = origin64;

} }

if (end & 7ULL) if (end & 7ULL)

if (*(u32 *)(end - kOriginAlign) != origin) if (*(u32 *)(end - kOriginAlign) != origin)

*(u32 *)(end - kOriginAlign) = origin; *(u32 *)(end - kOriginAlign) = origin;

} }

static void WriteShadowIfDifferent(dfsan_label label, uptr shadow_addr, static void WriteShadowInRange(dfsan_label label, uptr beg_shadow_addr,

uptr size) { uptr end_shadow_addr) {

morehouse: The start + size implementation seems cleaner to me. Can we use that for `WriteShadowWithSize` and make `WriteShadowInRange` call that instead?

stephan.yichao.zhao: If we use "WriteShadowWithSize(dfsan_label label, uptr beg_shadow_addr, uptr size)" as the base function,
the size should preferably be the number of labels. If not, we may have to assert size % sizeof(dfsan_label) == 0.

But if size is the # of labels, we would have

WriteShadowInRange(dfsan_label label, uptr beg_shadow_addr,
                               uptr end_shadow_addr)  {
  assert(beg_shadow_addr <= end_shadow_addr);
  assert((end_shadow_addr - beg_shadow_addr) % 2 == 0);
  WriteShadowWithSize(label, (end_shadow_addr - beg_shadow_addr) / 2);
}

I feel that after the fast16label -> fast8label change, it will be possible to do this because then #bytes == #labels.

dfsan_label *labelp = (dfsan_label *)shadow_addr; // TODO: After changing dfsan_label to 8bit, use internal_memset when label

for (; size != 0; --size, ++labelp) { // is not 0.

if (label) {

dfsan_label *labelp = (dfsan_label *)beg_shadow_addr;

for (; (uptr)labelp < end_shadow_addr; ++labelp) *labelp = label;

return;

}

dfsan_label *labelp = (dfsan_label *)beg_shadow_addr;

for (; (uptr)labelp < end_shadow_addr; ++labelp) {

morehouse:

// is not 0.
- if (label) {
- dfsan_label *labelp = (dfsan_label *)beg_shadow_addr;
+ dfsan_label *labelp = (dfsan_label *)beg_shadow_addr;
+ if (label) {
for (; (uptr)labelp < end_shadow_addr; ++labelp) *labelp = label;
return;
}
- dfsan_label *labelp = (dfsan_label *)beg_shadow_addr;
for (; (uptr)labelp < end_shadow_addr; ++labelp) {
// Don't write the label if it is already the value we need it to be.

stephan.yichao.zhao: Thank you.

// Don't write the label if it is already the value we need it to be. // Don't write the label if it is already the value we need it to be.

// In a program where most addresses are not labeled, it is common that // In a program where most addresses are not labeled, it is common that

// a page of shadow memory is entirely zeroed. The Linux copy-on-write // a page of shadow memory is entirely zeroed. The Linux copy-on-write

// implementation will share all of the zeroed pages, making a copy of a // implementation will share all of the zeroed pages, making a copy of a

// page when any value is written. The un-sharing will happen even if // page when any value is written. The un-sharing will happen even if

// the value written does not change the value in memory. Avoiding the // the value written does not change the value in memory. Avoiding the

// the amount of real memory used by large programs. // the amount of real memory used by large programs.

if (label == *labelp) if (!*labelp)

continue; continue;

*labelp = label; *labelp = 0;

} }

static void WriteShadowWithSize(dfsan_label label, uptr shadow_addr,

uptr size) {

WriteShadowInRange(label, shadow_addr, shadow_addr + size * sizeof(label));

}

#define RET_CHAIN_ORIGIN(id) \ #define RET_CHAIN_ORIGIN(id) \

GET_CALLER_PC_BP_SP; \ GET_CALLER_PC_BP_SP; \

(void)sp; \ (void)sp; \

GET_STORE_STACK_TRACE_PC_BP(pc, bp); \ GET_STORE_STACK_TRACE_PC_BP(pc, bp); \

return ChainOrigin(id, &stack); return ChainOrigin(id, &stack);

// Return a new origin chain with the previous ID id and the current stack // Return a new origin chain with the previous ID id and the current stack

// trace. // trace.


} }

SANITIZER_INTERFACE_ATTRIBUTE void dfsan_mem_origin_transfer(const void *dst, SANITIZER_INTERFACE_ATTRIBUTE void dfsan_mem_origin_transfer(const void *dst,

const void *src, const void *src,

uptr len) { uptr len) {

__dfsan_mem_origin_transfer(dst, src, len); __dfsan_mem_origin_transfer(dst, src, len);

} }

namespace __dfsan {

bool dfsan_inited = false;

bool dfsan_init_is_running = false;

void dfsan_copy_memory(void *dst, const void *src, uptr size) {

internal_memcpy(dst, src, size);

internal_memcpy((void *)shadow_for(dst), (const void *)shadow_for(src),

size * sizeof(dfsan_label));

if (__dfsan_get_track_origins())

dfsan_mem_origin_transfer(dst, src, size);

}

void dfsan_release_meta_memory(const void *addr, uptr size) {

dfsan_set_label(0, const_cast<void *>(addr), size);

morehouse: This might be redundant since on Linux releasing the memory zeroes it out on the next load.

stephan.yichao.zhao: This mainly follows what MSan does when the allocator unmaps memory:

https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/msan/msan_allocator.cpp#L33

dfsan_set_label only does mmap; at the munmap event, we call ReleaseMemoryPagesToOS explicitly.

// We are about to unmap a chunk of user memory.

// Mark the corresponding shadow memory as not needed.

const uptr beg_shadow_addr = (uptr)__dfsan::shadow_for(addr);

const void *end_addr = (void *)((uptr)addr + size);

const uptr end_shadow_addr = (uptr)__dfsan::shadow_for(end_addr);

ReleaseMemoryPagesToOS(beg_shadow_addr, end_shadow_addr);

if (__dfsan_get_track_origins()) {

const uptr beg_origin_addr = (uptr)__dfsan::origin_for(addr);

const uptr end_origin_addr = (uptr)__dfsan::origin_for(end_addr);

ReleaseMemoryPagesToOS(beg_origin_addr, end_origin_addr);

morehouse: Should we just call `ReleaseOrigins`?

stephan.yichao.zhao: ReleaseOrigins does not call ReleaseMemoryPagesToOS.

morehouse: Shouldn't mmap(MAP_NORESERVE) have a similar effect on resident memory to madvise(MADV_DONTNEED)?

Besides, we already have `ReleaseOrClearShadow` and `ReleaseOrigins`. Why do we need to reimplement that functionality in this function?

stephan.yichao.zhao: removed madvise(MADV_DONTNEED), and replaced dfsan_release_meta_memory by dfsan_set_label(0).

}

} // namespace __dfsan

// If the label s is tainted, set the size bytes from the address p to be a new // If the label s is tainted, set the size bytes from the address p to be a new

// origin chain with the previous ID o and the current stack trace. This is // origin chain with the previous ID o and the current stack trace. This is

// used by instrumentation to reduce code size when too much code is inserted. // used by instrumentation to reduce code size when too much code is inserted.

extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_maybe_store_origin( extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_maybe_store_origin(

u16 s, void *p, uptr size, dfsan_origin o) { u16 s, void *p, uptr size, dfsan_origin o) {

if (UNLIKELY(s)) { if (UNLIKELY(s)) {

GET_CALLER_PC_BP_SP; GET_CALLER_PC_BP_SP;

(void)sp; (void)sp;

GET_STORE_STACK_TRACE_PC_BP(pc, bp); GET_STORE_STACK_TRACE_PC_BP(pc, bp);

SetOrigin(p, size, ChainOrigin(o, &stack)); SetOrigin(p, size, ChainOrigin(o, &stack));

} }

// Releases the pages within the origin address range, and sets the origin // Releases the pages within the origin address range.

// addresses not on the pages to be 0. static void ReleaseOrigins(void *addr, uptr size) {

static void ReleaseOrClearOrigins(void *addr, uptr size) {

const uptr beg_origin_addr = (uptr)__dfsan::origin_for(addr); const uptr beg_origin_addr = (uptr)__dfsan::origin_for(addr);

const void *end_addr = (void *)((uptr)addr + size); const void *end_addr = (void *)((uptr)addr + size);

const uptr end_origin_addr = (uptr)__dfsan::origin_for(end_addr); const uptr end_origin_addr = (uptr)__dfsan::origin_for(end_addr);

if (end_origin_addr - beg_origin_addr <

common_flags()->clear_shadow_mmap_threshold)

return;

const uptr page_size = GetPageSizeCached(); const uptr page_size = GetPageSizeCached();

const uptr beg_aligned = RoundUpTo(beg_origin_addr, page_size); const uptr beg_aligned = RoundUpTo(beg_origin_addr, page_size);

const uptr end_aligned = RoundDownTo(end_origin_addr, page_size); const uptr end_aligned = RoundDownTo(end_origin_addr, page_size);

// dfsan_set_label can be called from the following cases if (!MmapFixedSuperNoReserve(beg_aligned, end_aligned - beg_aligned))

// 1) mapped ranges by new/delete and malloc/free. This case has origin memory Die();

morehouse: What's the reason to change this from `ReleaseMemoryPagesToOS`?

stephan.yichao.zhao: This mainly follows MSan's approach.

Although MSan does not have a similar function to release origins, it has SetShadow

https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/msan/msan_poisoning.cpp#L216

to zero out shadow, and it calls MmapFixedSuperNoReserve w/o ReleaseMemoryPagesToOS.

To be consistent with this, ReleaseOrigins replaces ReleaseMemoryPagesToOS with MmapFixedSuperNoReserve.

morehouse: Don't they have a similar effect? It's confusing to use two different ways of releasing memory without at least a comment explaining why.

stephan.yichao.zhao: removed those madvise calls.

// size > 50k, and happens less frequently.

// 2) zero-filling internal data structures by utility libraries. This case

// has origin memory size < 16k, and happens more often.

// Set kNumPagesThreshold to be 4 to avoid releasing small pages.

const int kNumPagesThreshold = 4;

if (beg_aligned + kNumPagesThreshold * page_size >= end_aligned)

return;

ReleaseMemoryPagesToOS(beg_aligned, end_aligned);

} }

void SetShadow(dfsan_label label, void *addr, uptr size, dfsan_origin origin) { // Releases the pages within the shadow address range, and sets

// the shadow addresses not on the pages to be 0.

static void ReleaseOrClearShadows(void *addr, uptr size) {

const uptr beg_shadow_addr = (uptr)__dfsan::shadow_for(addr); const uptr beg_shadow_addr = (uptr)__dfsan::shadow_for(addr);

const void *end_addr = (void *)((uptr)addr + size);

const uptr end_shadow_addr = (uptr)__dfsan::shadow_for(end_addr);

if (end_shadow_addr - beg_shadow_addr <

common_flags()->clear_shadow_mmap_threshold)

return WriteShadowWithSize(0, beg_shadow_addr, size);

const uptr page_size = GetPageSizeCached();

const uptr beg_aligned = RoundUpTo(beg_shadow_addr, page_size);

const uptr end_aligned = RoundDownTo(end_shadow_addr, page_size);

if (beg_aligned >= end_aligned) {

WriteShadowWithSize(0, beg_shadow_addr, size);

} else {

if (beg_aligned != beg_shadow_addr)

WriteShadowInRange(0, beg_shadow_addr, beg_aligned);

if (end_aligned != end_shadow_addr)

WriteShadowInRange(0, end_aligned, end_shadow_addr);

if (!MmapFixedSuperNoReserve(beg_aligned, end_aligned - beg_aligned))

Die();

}

void SetShadow(dfsan_label label, void *addr, uptr size, dfsan_origin origin) {

if (0 != label) { if (0 != label) {

WriteShadowIfDifferent(label, beg_shadow_addr, size); const uptr beg_shadow_addr = (uptr)__dfsan::shadow_for(addr);

WriteShadowWithSize(label, beg_shadow_addr, size);

if (__dfsan_get_track_origins()) if (__dfsan_get_track_origins())

SetOrigin(addr, size, origin); SetOrigin(addr, size, origin);

return; return;

} }

if (__dfsan_get_track_origins()) if (__dfsan_get_track_origins())

ReleaseOrClearOrigins(addr, size); ReleaseOrigins(addr, size);

// If label is 0, releases the pages within the shadow address range, and sets

// the shadow addresses not on the pages to be 0.

const void *end_addr = (void *)((uptr)addr + size);

const uptr end_shadow_addr = (uptr)__dfsan::shadow_for(end_addr);

const uptr page_size = GetPageSizeCached();

const uptr beg_aligned = RoundUpTo(beg_shadow_addr, page_size);

const uptr end_aligned = RoundDownTo(end_shadow_addr, page_size);

// dfsan_set_label can be called from the following cases ReleaseOrClearShadows(addr, size);

morehouse: Does this do anything since we already called `MmapFixedSuperNoReserve`?

stephan.yichao.zhao: This should have been removed to be consistent with
https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/msan/msan_poisoning.cpp#L216

// 1) mapped ranges by new/delete and malloc/free. This case has shadow memory

// size > 100k, and happens less frequently.

// 2) zero-filling internal data structures by utility libraries. This case

// has shadow memory size < 32k, and happens more often.

// Set kNumPagesThreshold to be 8 to avoid releasing small pages.

const int kNumPagesThreshold = 8;

if (beg_aligned + kNumPagesThreshold * page_size >= end_aligned)

return WriteShadowIfDifferent(label, beg_shadow_addr, size);

WriteShadowIfDifferent(label, beg_shadow_addr, beg_aligned - beg_shadow_addr);

ReleaseMemoryPagesToOS(beg_aligned, end_aligned);

WriteShadowIfDifferent(label, end_aligned, end_shadow_addr - end_aligned);

} }
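The zero-label path in the old `SetShadow` body above splits the shadow range into an unaligned head, a run of whole pages, and an unaligned tail, and releases pages only when that run is longer than `kNumPagesThreshold`. A minimal sketch of that decision follows. `ShadowPlan` and `PlanZeroLabel` are hypothetical names that do not appear in the patch, and the two rounding helpers are local stand-ins for the sanitizer_common functions, assuming a power-of-two page size.

```cpp
#include <cassert>
#include <cstdint>

using uptr = std::uintptr_t;

// Local stand-ins for the sanitizer_common helpers of the same name.
// Both assume `boundary` is a power of two.
constexpr uptr RoundUpTo(uptr x, uptr boundary) {
  return (x + boundary - 1) & ~(boundary - 1);
}
constexpr uptr RoundDownTo(uptr x, uptr boundary) {
  return x & ~(boundary - 1);
}

// Hypothetical summary of what the zero-label path would do to a range.
struct ShadowPlan {
  bool release_pages;   // false: the whole range is written with 0 in place
  uptr head_bytes;      // bytes written before the first whole page
                        // (the entire range when release_pages is false)
  uptr released_bytes;  // bytes in whole pages handed back to the OS
  uptr tail_bytes;      // bytes written after the last whole page
};

ShadowPlan PlanZeroLabel(uptr beg_shadow, uptr end_shadow, uptr page_size) {
  assert(beg_shadow <= end_shadow);
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  const uptr kNumPagesThreshold = 8;  // same value as in the diff
  const uptr beg_aligned = RoundUpTo(beg_shadow, page_size);
  const uptr end_aligned = RoundDownTo(end_shadow, page_size);
  // Too few whole pages: releasing is not worth it, so clear in place.
  if (beg_aligned + kNumPagesThreshold * page_size >= end_aligned)
    return {false, end_shadow - beg_shadow, 0, 0};
  return {true, beg_aligned - beg_shadow, end_aligned - beg_aligned,
          end_shadow - end_aligned};
}
```

With 4096-byte pages, a range that starts 100 bytes into a page and spans 20 pages gives a 3996-byte head, 19 released pages, and a 100-byte tail. A range of only two pages stays under the threshold and is cleared in place.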

extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_set_label( extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_set_label(

dfsan_label label, dfsan_origin origin, void *addr, uptr size) { dfsan_label label, dfsan_origin origin, void *addr, uptr size) {

SetShadow(label, addr, size, origin); SetShadow(label, addr, size, origin);

} }

SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_INTERFACE_ATTRIBUTE

morehouse (Not Done): Can we simplify by moving all this logic into `WriteShadowWithSize`? It's confusing to have multiple layers of functions to set the shadow, with some repeated checks in each layer.

stephan.yichao.zhao (Author, Done): `WriteShadowInRange` works for both 0 and non-zero labels. Refactored this code into `ReleaseOrClearShadows`.

void dfsan_set_label(dfsan_label label, void *addr, uptr size) { void dfsan_set_label(dfsan_label label, void *addr, uptr size) {

dfsan_origin init_origin = 0; dfsan_origin init_origin = 0;

if (label && __dfsan_get_track_origins()) { if (label && __dfsan_get_track_origins()) {

GET_CALLER_PC_BP; GET_CALLER_PC_BP;

GET_STORE_STACK_TRACE_PC_BP(pc, bp); GET_STORE_STACK_TRACE_PC_BP(pc, bp);

init_origin = ChainOrigin(0, &stack, true); init_origin = ChainOrigin(0, &stack, true);

} }

SetShadow(label, addr, size, init_origin); SetShadow(label, addr, size, init_origin);

[225 lines hidden]

#define DFSAN_FLAG(Type, Name, DefaultValue, Description) \ #define DFSAN_FLAG(Type, Name, DefaultValue, Description) \

RegisterFlag(parser, #Name, Description, &f->Name); RegisterFlag(parser, #Name, Description, &f->Name);

#include "dfsan_flags.inc" #include "dfsan_flags.inc"

#undef DFSAN_FLAG #undef DFSAN_FLAG

} }

static void InitializeFlags() { static void InitializeFlags() {

SetCommonFlagsDefaults(); SetCommonFlagsDefaults();

{

CommonFlags cf;

cf.CopyFrom(*common_flags());

cf.intercept_tls_get_addr = true;

OverrideCommonFlags(cf);

morehouse (Not Done): What's going on here? `intercept_tls_get_addr` appears unused, so why do we need to override it?

stephan.yichao.zhao (Author, Done): This will be used by D101204 when interceptors are defined.

}

flags().SetDefaults(); flags().SetDefaults();

FlagParser parser; FlagParser parser;

RegisterCommonFlags(&parser); RegisterCommonFlags(&parser);

RegisterDfsanFlags(&parser, &flags()); RegisterDfsanFlags(&parser, &flags());

parser.ParseStringFromEnv("DFSAN_OPTIONS"); parser.ParseStringFromEnv("DFSAN_OPTIONS");

InitializeCommonFlags(); InitializeCommonFlags();

if (Verbosity()) ReportUnrecognizedFlags(); if (Verbosity()) ReportUnrecognizedFlags();

[49 lines hidden] static void dfsan_fini() {

} }

extern "C" void dfsan_flush() { extern "C" void dfsan_flush() {

if (!MmapFixedSuperNoReserve(ShadowAddr(), UnusedAddr() - ShadowAddr())) if (!MmapFixedSuperNoReserve(ShadowAddr(), UnusedAddr() - ShadowAddr()))

Die(); Die();

} }

static void dfsan_init(int argc, char **argv, char **envp) { static void DFsanInit(int argc, char **argv, char **envp) {

CHECK(!dfsan_init_is_running);

if (dfsan_inited)

return;

dfsan_init_is_running = true;

SanitizerToolName = "DataflowSanitizer";

InitializeFlags(); InitializeFlags();

::InitializePlatformEarly(); ::InitializePlatformEarly();

dfsan_flush(); dfsan_flush();

if (common_flags()->use_madv_dontdump) if (common_flags()->use_madv_dontdump)

DontDumpShadowMemory(ShadowAddr(), UnusedAddr() - ShadowAddr()); DontDumpShadowMemory(ShadowAddr(), UnusedAddr() - ShadowAddr());

// Protect the region of memory we don't use, to preserve the one-to-one // Protect the region of memory we don't use, to preserve the one-to-one

// mapping from application to shadow memory. But if ASLR is disabled, Linux // mapping from application to shadow memory. But if ASLR is disabled, Linux

// will load our executable in the middle of our unused region. This mostly // will load our executable in the middle of our unused region. This mostly

// works so long as the program doesn't use too much memory. We support this // works so long as the program doesn't use too much memory. We support this

// case by disabling memory protection when ASLR is disabled. // case by disabling memory protection when ASLR is disabled.

uptr init_addr = (uptr)&dfsan_init; uptr init_addr = (uptr)&DFsanInit;

if (!(init_addr >= UnusedAddr() && init_addr < AppAddr())) if (!(init_addr >= UnusedAddr() && init_addr < AppAddr()))

MmapFixedNoAccess(UnusedAddr(), AppAddr() - UnusedAddr()); MmapFixedNoAccess(UnusedAddr(), AppAddr() - UnusedAddr());

InitializeInterceptors(); InitializeInterceptors();

// Register the fini callback to run when the program terminates successfully // Register the fini callback to run when the program terminates successfully

// or it is killed by the runtime. // or it is killed by the runtime.

Atexit(dfsan_fini); Atexit(dfsan_fini);

AddDieCallback(dfsan_fini); AddDieCallback(dfsan_fini);

// Set up threads // Set up threads

DFsanTSDInit(DFsanTSDDtor); DFsanTSDInit(DFsanTSDDtor);

dfsan_allocator_init();

DFsanThread *main_thread = DFsanThread::Create(nullptr, nullptr, nullptr); DFsanThread *main_thread = DFsanThread::Create(nullptr, nullptr, nullptr);

SetCurrentThread(main_thread); SetCurrentThread(main_thread);

main_thread->ThreadStart(); main_thread->ThreadStart();

__dfsan_label_info[kInitializingLabel].desc = "<init label>"; __dfsan_label_info[kInitializingLabel].desc = "<init label>";

dfsan_init_is_running = false;

dfsan_inited = true;

} }

namespace __dfsan {

void dfsan_init() { DFsanInit(0, nullptr, nullptr); }

morehouse (Not Done): `dfsan_init` is now unused. Why do we need this?

stephan.yichao.zhao (Author, Done): This will be used by D101204.

} // namespace __dfsan

#if SANITIZER_CAN_USE_PREINIT_ARRAY #if SANITIZER_CAN_USE_PREINIT_ARRAY

__attribute__((section(".preinit_array"), used)) __attribute__((section(".preinit_array"),

static void (*dfsan_init_ptr)(int, char **, char **) = dfsan_init; used)) static void (*dfsan_init_ptr)(int, char **,

char **) = DFsanInit;

#endif #endif

compiler-rt/lib/dfsan/dfsan_allocator.h

This file was added.

				//===-- dfsan_allocator.h ---------------------------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is a part of DataflowSanitizer.
				//
				//===----------------------------------------------------------------------===//

				#ifndef DFSAN_ALLOCATOR_H
				#define DFSAN_ALLOCATOR_H

				#include "sanitizer_common/sanitizer_common.h"

				namespace __dfsan {

				struct DFsanThreadLocalMallocStorage {
				ALIGNED(8) uptr allocator_cache[96 * (512 * 8 + 16)]; // Opaque.
				void CommitBack();

				private:
				// These objects are allocated via mmap() and are zero-initialized.
				DFsanThreadLocalMallocStorage() {}
				};

				} // namespace __dfsan
				#endif // DFSAN_ALLOCATOR_H

compiler-rt/lib/dfsan/dfsan_allocator.cpp

This file was added.

				//===-- dfsan_allocator.cpp -------------------------- --------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is a part of DataflowSanitizer.
				//
				// DataflowSanitizer allocator.
				//===----------------------------------------------------------------------===//

				#include "dfsan_allocator.h"

				#include "dfsan.h"
				#include "dfsan_thread.h"
				#include "sanitizer_common/sanitizer_allocator.h"
				#include "sanitizer_common/sanitizer_allocator_checks.h"
				#include "sanitizer_common/sanitizer_allocator_interface.h"
				#include "sanitizer_common/sanitizer_allocator_report.h"
				#include "sanitizer_common/sanitizer_errno.h"

				namespace __dfsan {

				struct Metadata {
				uptr requested_size;
				};

				struct DFsanMapUnmapCallback {
				void OnMap(uptr p, uptr size) const {}
				void OnUnmap(uptr p, uptr size) const { dfsan_set_label(0, (void *)p, size); }
				morehouse (Done): Maybe we don't need this. We release (zero) shadow on unmap, so it should still be zero if we mmap it again.

				stephan.yichao.zhao (Author, Done): Thank you. Removed.

				morehouse (Not Done): Looks like you added it back. I think we only need to set 0 label on unmap.

				stephan.yichao.zhao (Author, Done): I found that the mmap interceptors in dfsan_interceptors.cpp also do dfsan_set_label(0): https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/dfsan/dfsan_interceptors.cpp#L41
				For the same reason they are not needed either. In the next change, I will remove all mmap-related dfsan_set_label(0) calls together.

				stephan.yichao.zhao (Author, Done): Thought about the problem a bit more.
				I think our discussion is based on an assumption that every munmap will be precisely intercepted. This may not be true: if some code calls munmap by system call number directly, as this sanitizer allocator does, there is no way to intercept that kind of munmap. So in general, we may not be able to remove the interception of mmap and mmap64.
				But for this OnMap, it should be safe because its corresponding OnUnmap is injected by our code. So in the following change, I will only remove dfsan_set_label(0) for OnMap.

				morehouse (Not Done):
				> Thought about the problem a bit more.
				>
				> I think our discussion is based on an assumption that every munmap will be precisely intercepted. This may not be true: if some code calls munmap by system call number directly, as this sanitizer allocator does, there is no way to intercept that kind of munmap. So in general, we may not be able to remove the interception of mmap and mmap64.
				Ah yes, good point.
				> But for this OnMap, it should be safe because its corresponding OnUnmap is injected by our code.
				Well, I'm not 100% sure it's safe. Something else could do `syscall(SYS_mmap)`, then taint the memory, then `syscall(SYS_munmap)`. Then we could end up mapping that same region for our heap and calling this `OnMap` callback.
				> So in the following change, I will only remove dfsan_set_label(0) for OnMap.
				};
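The concern raised in the thread above is that a region unmapped through a raw system call keeps its stale shadow, so a later `OnMap` over the same addresses would see old labels unless it clears them. The sketch below is only a toy model of that reasoning: `ToyShadow` and `LabelAfterReuse` are hypothetical names, the shadow is an ordinary `std::map` rather than DFSan's real shadow memory, and no actual mmap or munmap call is made.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using uptr = std::uintptr_t;
using label_t = std::uint8_t;

// Toy shadow: labels keyed by application address. A missing key means label 0.
class ToyShadow {
 public:
  void Taint(uptr addr, label_t label) { labels_[addr] = label; }

  label_t Get(uptr addr) const {
    auto it = labels_.find(addr);
    return it == labels_.end() ? 0 : it->second;
  }

  // Models dfsan_set_label(0, beg, size): drop every label in [beg, beg + size).
  void ClearRange(uptr beg, uptr size) {
    labels_.erase(labels_.lower_bound(beg), labels_.lower_bound(beg + size));
  }

 private:
  std::map<uptr, label_t> labels_;
};

// Returns the label seen at the start of a region after it was tainted,
// unmapped, and then reused by the allocator.
label_t LabelAfterReuse(bool unmap_intercepted, bool clear_on_map) {
  ToyShadow shadow;
  const uptr region = 0x10000;
  const uptr size = 0x1000;

  shadow.Taint(region, 1);  // some code maps the region and taints it

  // An intercepted munmap lets the runtime zero the shadow. A munmap issued
  // by raw system call number bypasses the runtime and leaves it untouched.
  if (unmap_intercepted)
    shadow.ClearRange(region, size);

  // The allocator later maps the same region; OnMap may or may not clear it.
  if (clear_on_map)
    shadow.ClearRange(region, size);

  return shadow.Get(region);
}
```

In this model the only combination that leaves a nonzero label behind is an unintercepted unmap followed by an `OnMap` that does not clear, which is the scenario morehouse describes.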

				static const uptr kMaxAllowedMallocSize = 8UL << 30;

				struct AP64 { // Allocator64 parameters. Deliberately using a short name.
				// TODO: DFSan assumes application memory starts from 0x700000008000. For
				// unknown reason, the sanitizer allocator does not support any start address
				// between 0x701000000000 and 0x700000008000. After switching to fast8labels
				// mode, DFSan memory layout will be changed to the same to MSan's. Then we
				// set the start address to 0x700000000000 as MSan.
				static const uptr kSpaceBeg = 0x701000000000ULL;
				morehouse (Done): 1) This appears unused except for `kSpaceBeg` below. 2) The name sounds like the max size of the heap, rather than the start address of it.

				stephan.yichao.zhao (Author, Done): Inlined the constant.
				static const uptr kSpaceSize = 0x40000000000; // 4T.
				static const uptr kMetadataSize = sizeof(Metadata);
				typedef DefaultSizeClassMap SizeClassMap;
				typedef DFsanMapUnmapCallback MapUnmapCallback;
				static const uptr kFlags = 0;
				using AddressSpaceView = LocalAddressSpaceView;
				};

				typedef SizeClassAllocator64<AP64> PrimaryAllocator;

				typedef CombinedAllocator<PrimaryAllocator> Allocator;
				typedef Allocator::AllocatorCache AllocatorCache;

				static Allocator allocator;
				static AllocatorCache fallback_allocator_cache;
				static StaticSpinMutex fallback_mutex;

				static uptr max_malloc_size;

				void dfsan_allocator_init() {
				SetAllocatorMayReturnNull(common_flags()->allocator_may_return_null);
				allocator.Init(common_flags()->allocator_release_to_os_interval_ms);
				if (common_flags()->max_allocation_size_mb)
				max_malloc_size = Min(common_flags()->max_allocation_size_mb << 20,
				kMaxAllowedMallocSize);
				else
				max_malloc_size = kMaxAllowedMallocSize;
				}

				AllocatorCache *GetAllocatorCache(DFsanThreadLocalMallocStorage *ms) {
				CHECK(ms);
				CHECK_LE(sizeof(AllocatorCache), sizeof(ms->allocator_cache));
				return reinterpret_cast<AllocatorCache *>(ms->allocator_cache);
				}

				void DFsanThreadLocalMallocStorage::CommitBack() {
				allocator.SwallowCache(GetAllocatorCache(this));
				}

				static void *DFsanAllocate(uptr size, uptr alignment, bool zeroise) {
				if (size > max_malloc_size) {
				if (AllocatorMayReturnNull()) {
				Report("WARNING: DataflowSanitizer failed to allocate 0x%zx bytes\n",
				size);
				return nullptr;
				}
				BufferedStackTrace stack;
				ReportAllocationSizeTooBig(size, max_malloc_size, &stack);
				}
				DFsanThread *t = GetCurrentThread();
				void *allocated;
				if (t) {
				AllocatorCache *cache = GetAllocatorCache(&t->malloc_storage());
				allocated = allocator.Allocate(cache, size, alignment);
				} else {
				SpinMutexLock l(&fallback_mutex);
				AllocatorCache *cache = &fallback_allocator_cache;
				allocated = allocator.Allocate(cache, size, alignment);
				}
				if (UNLIKELY(!allocated)) {
				SetAllocatorOutOfMemory();
				if (AllocatorMayReturnNull())
				return nullptr;
				BufferedStackTrace stack;
				ReportOutOfMemory(size, &stack);
				}
				Metadata *meta =
				reinterpret_cast<Metadata *>(allocator.GetMetaData(allocated));
				meta->requested_size = size;
				if (zeroise) {
				internal_memset(allocated, 0, size);
				dfsan_set_label(0, allocated, size);
				} else if (flags().zero_in_malloc) {
				dfsan_set_label(0, allocated, size);
				}
				return allocated;
				}

				void dfsan_deallocate(void *p) {
				CHECK(p);
				Metadata *meta = reinterpret_cast<Metadata *>(allocator.GetMetaData(p));
				uptr size = meta->requested_size;
				meta->requested_size = 0;
				if (flags().zero_in_free)
				dfsan_set_label(0, p, size);
				DFsanThread *t = GetCurrentThread();
				if (t) {
				AllocatorCache *cache = GetAllocatorCache(&t->malloc_storage());
				allocator.Deallocate(cache, p);
				} else {
				SpinMutexLock l(&fallback_mutex);
				AllocatorCache *cache = &fallback_allocator_cache;
				allocator.Deallocate(cache, p);
				}
				}

				void *DFsanReallocate(void *old_p, uptr new_size, uptr alignment) {
				Metadata *meta = reinterpret_cast<Metadata *>(allocator.GetMetaData(old_p));
				uptr old_size = meta->requested_size;
				uptr actually_allocated_size = allocator.GetActuallyAllocatedSize(old_p);
				if (new_size <= actually_allocated_size) {
				// We are not reallocating here.
				meta->requested_size = new_size;
				if (new_size > old_size && flags().zero_in_malloc)
				dfsan_set_label(0, (char *)old_p + old_size, new_size - old_size);
				return old_p;
				}
				uptr memcpy_size = Min(new_size, old_size);
				void *new_p = DFsanAllocate(new_size, alignment, false /*zeroise*/);
				if (new_p) {
				dfsan_copy_memory(new_p, old_p, memcpy_size);
				dfsan_deallocate(old_p);
				}
				return new_p;
				}

				void *DFsanCalloc(uptr nmemb, uptr size) {
				if (UNLIKELY(CheckForCallocOverflow(size, nmemb))) {
				if (AllocatorMayReturnNull())
				return nullptr;
				BufferedStackTrace stack;
				ReportCallocOverflow(nmemb, size, &stack);
				}
				return DFsanAllocate(nmemb * size, sizeof(u64), true /*zeroise*/);
				}

				static uptr AllocationSize(const void *p) {
				if (!p)
				return 0;
				const void *beg = allocator.GetBlockBegin(p);
				if (beg != p)
				return 0;
				Metadata *b = (Metadata *)allocator.GetMetaData(p);
				return b->requested_size;
				}

				void *dfsan_malloc(uptr size) {
				return SetErrnoOnNull(DFsanAllocate(size, sizeof(u64), false /*zeroise*/));
				}

				void *dfsan_calloc(uptr nmemb, uptr size) {
				return SetErrnoOnNull(DFsanCalloc(nmemb, size));
				}

				void *dfsan_realloc(void *ptr, uptr size) {
				if (!ptr)
				return SetErrnoOnNull(DFsanAllocate(size, sizeof(u64), false /*zeroise*/));
				if (size == 0) {
				dfsan_deallocate(ptr);
				return nullptr;
				}
				return SetErrnoOnNull(DFsanReallocate(ptr, size, sizeof(u64)));
				}

				void *dfsan_reallocarray(void *ptr, uptr nmemb, uptr size) {
				if (UNLIKELY(CheckForCallocOverflow(size, nmemb))) {
				errno = errno_ENOMEM;
				if (AllocatorMayReturnNull())
				return nullptr;
				BufferedStackTrace stack;
				ReportReallocArrayOverflow(nmemb, size, &stack);
				}
				return dfsan_realloc(ptr, nmemb * size);
				}

				void *dfsan_valloc(uptr size) {
				return SetErrnoOnNull(
				DFsanAllocate(size, GetPageSizeCached(), false /*zeroise*/));
				}

				void *dfsan_pvalloc(uptr size) {
				uptr PageSize = GetPageSizeCached();
				if (UNLIKELY(CheckForPvallocOverflow(size, PageSize))) {
				errno = errno_ENOMEM;
				if (AllocatorMayReturnNull())
				return nullptr;
				BufferedStackTrace stack;
				ReportPvallocOverflow(size, &stack);
				}
				// pvalloc(0) should allocate one page.
				size = size ? RoundUpTo(size, PageSize) : PageSize;
				return SetErrnoOnNull(DFsanAllocate(size, PageSize, false /*zeroise*/));
				}

				void *dfsan_aligned_alloc(uptr alignment, uptr size) {
				if (UNLIKELY(!CheckAlignedAllocAlignmentAndSize(alignment, size))) {
				errno = errno_EINVAL;
				if (AllocatorMayReturnNull())
				return nullptr;
				BufferedStackTrace stack;
				ReportInvalidAlignedAllocAlignment(size, alignment, &stack);
				}
				return SetErrnoOnNull(DFsanAllocate(size, alignment, false /*zeroise*/));
				}

				void *dfsan_memalign(uptr alignment, uptr size) {
				if (UNLIKELY(!IsPowerOfTwo(alignment))) {
				errno = errno_EINVAL;
				if (AllocatorMayReturnNull())
				return nullptr;
				BufferedStackTrace stack;
				ReportInvalidAllocationAlignment(alignment, &stack);
				}
				return SetErrnoOnNull(DFsanAllocate(size, alignment, false /*zeroise*/));
				}

				int dfsan_posix_memalign(void **memptr, uptr alignment, uptr size) {
				if (UNLIKELY(!CheckPosixMemalignAlignment(alignment))) {
				if (AllocatorMayReturnNull())
				return errno_EINVAL;
				BufferedStackTrace stack;
				ReportInvalidPosixMemalignAlignment(alignment, &stack);
				}
				void *ptr = DFsanAllocate(size, alignment, false /*zeroise*/);
				if (UNLIKELY(!ptr))
				// OOM error is already taken care of by DFsanAllocate.
				return errno_ENOMEM;
				CHECK(IsAligned((uptr)ptr, alignment));
				*memptr = ptr;
				return 0;
				}

				} // namespace __dfsan

				using namespace __dfsan;

				uptr __sanitizer_get_current_allocated_bytes() {
				uptr stats[AllocatorStatCount];
				allocator.GetStats(stats);
				return stats[AllocatorStatAllocated];
				}

				uptr __sanitizer_get_heap_size() {
				uptr stats[AllocatorStatCount];
				allocator.GetStats(stats);
				return stats[AllocatorStatMapped];
				}

				uptr __sanitizer_get_free_bytes() { return 1; }

				uptr __sanitizer_get_unmapped_bytes() { return 1; }

				uptr __sanitizer_get_estimated_allocated_size(uptr size) { return size; }

				int __sanitizer_get_ownership(const void *p) { return AllocationSize(p) != 0; }

				uptr __sanitizer_get_allocated_size(const void *p) { return AllocationSize(p); }

compiler-rt/lib/dfsan/dfsan_flags.inc

	//===-- dfsan_flags.inc -----------------------------------------*- C++ -*-===//			//===-- dfsan_flags.inc -----------------------------------------*- C++ -*-===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// DFSan runtime flags.			// DFSan runtime flags.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	#ifndef DFSAN_FLAG			#ifndef DFSAN_FLAG
	# error "Define DFSAN_FLAG prior to including this file!"			# error "Define DFSAN_FLAG prior to including this file!"
				Lint (pre-merge checks): clang-tidy: error: "Define DFSAN_FLAG prior to including this file!" [clang-diagnostic-error]
	#endif			#endif

	// DFSAN_FLAG(Type, Name, DefaultValue, Description)			// DFSAN_FLAG(Type, Name, DefaultValue, Description)
	// See COMMON_FLAG in sanitizer_flags.inc for more details.			// See COMMON_FLAG in sanitizer_flags.inc for more details.
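The `DFSAN_FLAG(Type, Name, DefaultValue, Description)` convention described in the comment above is an X-macro: the same list of entries is expanded several times with different definitions of the macro, as `RegisterDfsanFlags` does earlier in this diff. The sketch below illustrates the pattern under stated assumptions. To stay in one block it uses a list macro, `TOY_FLAG_LIST`, in place of an included `.inc` file, and `ToyFlags` and `DescribeFlag` are hypothetical names that are not part of the DFSan runtime.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the .inc file. Each entry has the same four fields as
// DFSAN_FLAG: (Type, Name, DefaultValue, Description).
#define TOY_FLAG_LIST(FLAG)                                             \
  FLAG(bool, zero_in_malloc, true, "Zero shadow of new allocations.")  \
  FLAG(bool, zero_in_free, true, "Zero shadow of freed memory.")       \
  FLAG(int, store_context_size, 20, "Depth limit of stack traces.")

struct ToyFlags {
  // First expansion: declare one data member per flag.
#define TOY_DECLARE(Type, Name, DefaultValue, Description) Type Name;
  TOY_FLAG_LIST(TOY_DECLARE)
#undef TOY_DECLARE

  // Second expansion: assign every flag its default value.
  void SetDefaults() {
#define TOY_DEFAULT(Type, Name, DefaultValue, Description) Name = DefaultValue;
    TOY_FLAG_LIST(TOY_DEFAULT)
#undef TOY_DEFAULT
  }
};

// Third expansion: map a flag name to its description, or nullptr if unknown.
const char *DescribeFlag(const char *name) {
  assert(name != nullptr);
#define TOY_DESCRIBE(Type, Name, DefaultValue, Description) \
  if (std::strcmp(name, #Name) == 0) return Description;
  TOY_FLAG_LIST(TOY_DESCRIBE)
#undef TOY_DESCRIBE
  return nullptr;
}
```

Adding a flag then means adding one entry to the list; the member, its default, and its description lookup all follow from the existing expansions.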

	DFSAN_FLAG(bool, warn_unimplemented, true,			DFSAN_FLAG(bool, warn_unimplemented, true,
				Lint (pre-merge checks): clang-tidy: error: unknown type name 'warn_unimplemented' [clang-diagnostic-error]; clang-tidy: error: expected ')' [clang-diagnostic-error]; clang-tidy: error: expected parameter declarator [clang-diagnostic-error]
	"Whether to warn on unimplemented functions.")			"Whether to warn on unimplemented functions.")
	DFSAN_FLAG(bool, warn_nonzero_labels, false,			DFSAN_FLAG(bool, warn_nonzero_labels, false,
				Lint (pre-merge checks): clang-tidy: error: expected function body after function declarator [clang-diagnostic-error]
	"Whether to warn on unimplemented functions.")			"Whether to warn on unimplemented functions.")
	DFSAN_FLAG(			DFSAN_FLAG(
	bool, strict_data_dependencies, true,			bool, strict_data_dependencies, true,
	"Whether to propagate labels only when there is an obvious data dependency"			"Whether to propagate labels only when there is an obvious data dependency"
	"(e.g., when comparing strings, ignore the fact that the output of the"			"(e.g., when comparing strings, ignore the fact that the output of the"
	"comparison might be data-dependent on the content of the strings). This"			"comparison might be data-dependent on the content of the strings). This"
	"applies only to the custom functions defined in 'custom.c'.")			"applies only to the custom functions defined in 'custom.c'.")
	DFSAN_FLAG(const char *, dump_labels_at_exit, "", "The path of the file where "			DFSAN_FLAG(const char *, dump_labels_at_exit, "", "The path of the file where "
	"to dump the labels when the "			"to dump the labels when the "
	"program terminates.")			"program terminates.")
	DFSAN_FLAG(			DFSAN_FLAG(
	int, origin_history_size, Origin::kMaxDepth,			int, origin_history_size, Origin::kMaxDepth,
	"The limit of origin chain length. Non-positive values mean unlimited.")			"The limit of origin chain length. Non-positive values mean unlimited.")
	DFSAN_FLAG(			DFSAN_FLAG(
	int, origin_history_per_stack_limit, 20000,			int, origin_history_per_stack_limit, 20000,
	"The limit of origin node's references count. "			"The limit of origin node's references count. "
	"Non-positive values mean unlimited.")			"Non-positive values mean unlimited.")
	DFSAN_FLAG(int, store_context_size, 20,			DFSAN_FLAG(int, store_context_size, 20,
	"The depth limit of origin tracking stack traces.")			"The depth limit of origin tracking stack traces.")
	DFSAN_FLAG(bool, check_origin_invariant, false,			DFSAN_FLAG(bool, check_origin_invariant, false,
	"Whether to check if the origin invariant holds.")			"Whether to check if the origin invariant holds.")
				DFSAN_FLAG(bool, zero_in_malloc, true,
				"Whether to zero shadow space of new allocated memory.")
				DFSAN_FLAG(bool, zero_in_free, true,
				"Whether to zero shadow space of deallocated memory.")

compiler-rt/lib/dfsan/dfsan_thread.h

	//===-- dfsan_thread.h -------------------------------------------*- C++			//===-- dfsan_thread.h -------------------------------------------*- C++
	//-*-===//			//-*-===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// This file is a part of DataFlowSanitizer.			// This file is a part of DataFlowSanitizer.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef DFSAN_THREAD_H			#ifndef DFSAN_THREAD_H
	#define DFSAN_THREAD_H			#define DFSAN_THREAD_H

				#include "dfsan_allocator.h"
	#include "sanitizer_common/sanitizer_common.h"			#include "sanitizer_common/sanitizer_common.h"

	namespace __dfsan {			namespace __dfsan {

	class DFsanThread {			class DFsanThread {
	public:			public:
	// NOTE: There is no DFsanThread constructor. It is allocated			// NOTE: There is no DFsanThread constructor. It is allocated
	// via mmap() and must be valid in zero-initialized state.			// via mmap() and must be valid in zero-initialized state.

	static DFsanThread *Create(void *start_routine_trampoline,			static DFsanThread *Create(void *start_routine_trampoline,
	thread_callback_t start_routine, void *arg,			thread_callback_t start_routine, void *arg,
	bool track_origins = false);			bool track_origins = false);
	static void TSDDtor(void *tsd);			static void TSDDtor(void *tsd);
	void Destroy();			void Destroy();

	void Init(); // Should be called from the thread itself.			void Init(); // Should be called from the thread itself.
	thread_return_t ThreadStart();			thread_return_t ThreadStart();

	uptr stack_top();			uptr stack_top();
	uptr stack_bottom();			uptr stack_bottom();
				uptr tls_begin() { return tls_begin_; }
				uptr tls_end() { return tls_end_; }
	bool IsMainThread() { return start_routine_ == nullptr; }			bool IsMainThread() { return start_routine_ == nullptr; }

	bool InSignalHandler() { return in_signal_handler_; }			bool InSignalHandler() { return in_signal_handler_; }
	void EnterSignalHandler() { in_signal_handler_++; }			void EnterSignalHandler() { in_signal_handler_++; }
	void LeaveSignalHandler() { in_signal_handler_--; }			void LeaveSignalHandler() { in_signal_handler_--; }

				DFsanThreadLocalMallocStorage &malloc_storage() { return malloc_storage_; }

	int destructor_iterations_;			int destructor_iterations_;

	private:			private:
	void SetThreadStackAndTls();			void SetThreadStackAndTls();
				void ClearShadowForThreadStackAndTLS();
	struct StackBounds {			struct StackBounds {
	uptr bottom;			uptr bottom;
	uptr top;			uptr top;
	};			};
	StackBounds GetStackBounds() const;			StackBounds GetStackBounds() const;

	bool AddrIsInStack(uptr addr);			bool AddrIsInStack(uptr addr);

	void *start_routine_trampoline_;			void *start_routine_trampoline_;
	thread_callback_t start_routine_;			thread_callback_t start_routine_;
	void *arg_;			void *arg_;
	bool track_origins_;			bool track_origins_;

	StackBounds stack_;			StackBounds stack_;

				uptr tls_begin_;
				uptr tls_end_;

	unsigned in_signal_handler_;			unsigned in_signal_handler_;

				DFsanThreadLocalMallocStorage malloc_storage_;
	};			};

	DFsanThread *GetCurrentThread();			DFsanThread *GetCurrentThread();
	void SetCurrentThread(DFsanThread *t);			void SetCurrentThread(DFsanThread *t);
	void DFsanTSDInit(void (*destructor)(void *tsd));			void DFsanTSDInit(void (*destructor)(void *tsd));
	void DFsanTSDDtor(void *tsd);			void DFsanTSDDtor(void *tsd);

	} // namespace __dfsan			} // namespace __dfsan

	#endif // DFSAN_THREAD_H			#endif // DFSAN_THREAD_H

compiler-rt/lib/dfsan/dfsan_thread.cpp

	#include "dfsan_thread.h"			#include "dfsan_thread.h"

	#include <pthread.h>			#include <pthread.h>

	#include "dfsan.h"			#include "dfsan.h"
				#include "sanitizer_common/sanitizer_tls_get_addr.h"

	namespace __dfsan {			namespace __dfsan {

	DFsanThread *DFsanThread::Create(void *start_routine_trampoline,			DFsanThread *DFsanThread::Create(void *start_routine_trampoline,
	thread_callback_t start_routine, void *arg,			thread_callback_t start_routine, void *arg,
	bool track_origins) {			bool track_origins) {
	uptr PageSize = GetPageSizeCached();			uptr PageSize = GetPageSizeCached();
	uptr size = RoundUpTo(sizeof(DFsanThread), PageSize);			uptr size = RoundUpTo(sizeof(DFsanThread), PageSize);
	DFsanThread *thread = (DFsanThread *)MmapOrDie(size, __func__);			DFsanThread *thread = (DFsanThread *)MmapOrDie(size, __func__);
	thread->start_routine_trampoline_ = start_routine_trampoline;			thread->start_routine_trampoline_ = start_routine_trampoline;
	thread->start_routine_ = start_routine;			thread->start_routine_ = start_routine;
	thread->arg_ = arg;			thread->arg_ = arg;
	thread->track_origins_ = track_origins;			thread->track_origins_ = track_origins;
	thread->destructor_iterations_ = GetPthreadDestructorIterations();			thread->destructor_iterations_ = GetPthreadDestructorIterations();

	return thread;			return thread;
	}			}

	void DFsanThread::SetThreadStackAndTls() {			void DFsanThread::SetThreadStackAndTls() {
	uptr tls_size = 0;			uptr tls_size = 0;
	uptr stack_size = 0;			uptr stack_size = 0;
	uptr tls_begin;			GetThreadStackAndTls(IsMainThread(), &stack_.bottom, &stack_size, &tls_begin_,
	GetThreadStackAndTls(IsMainThread(), &stack_.bottom, &stack_size, &tls_begin,
	&tls_size);			&tls_size);
	stack_.top = stack_.bottom + stack_size;			stack_.top = stack_.bottom + stack_size;
				tls_end_ = tls_begin_ + tls_size;

	int local;			int local;
	CHECK(AddrIsInStack((uptr)&local));			CHECK(AddrIsInStack((uptr)&local));
	}			}

	void DFsanThread::Init() { SetThreadStackAndTls(); }			void DFsanThread::ClearShadowForThreadStackAndTLS() {
				dfsan_set_label(0, (void *)stack_.bottom, stack_.top - stack_.bottom);
				if (tls_begin_ != tls_end_)
				dfsan_set_label(0, (void *)tls_begin_, tls_end_ - tls_begin_);
				DTLS *dtls = DTLS_Get();
				CHECK_NE(dtls, 0);
				ForEachDVT(dtls, [](const DTLS::DTV &dtv, int id) {
				dfsan_set_label(0, (void *)(dtv.beg), dtv.size);
				});
				}

				void DFsanThread::Init() {
				SetThreadStackAndTls();
				ClearShadowForThreadStackAndTLS();
				}

	void DFsanThread::TSDDtor(void *tsd) {			void DFsanThread::TSDDtor(void *tsd) {
	DFsanThread *t = (DFsanThread *)tsd;			DFsanThread *t = (DFsanThread *)tsd;
	t->Destroy();			t->Destroy();
	}			}

	void DFsanThread::Destroy() {			void DFsanThread::Destroy() {
				malloc_storage().CommitBack();
				// We also clear the shadow on thread destruction because
				// some code may still be executing in later TSD destructors
				// and we don't want it to have any poisoned stack.
				ClearShadowForThreadStackAndTLS();
	uptr size = RoundUpTo(sizeof(DFsanThread), GetPageSizeCached());			uptr size = RoundUpTo(sizeof(DFsanThread), GetPageSizeCached());
	UnmapOrDie(this, size);			UnmapOrDie(this, size);
				DTLS_Destroy();
	}			}
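
Note on the right-hand column: the new `ClearShadowForThreadStackAndTLS` zeroes the labels for the thread's stack, its static TLS block (only when non-empty), and each dynamic TLS block, and it runs both in `Init()` and in `Destroy()`. The following is a minimal standalone sketch of that pattern, not the runtime's code. `ToyShadow`, `ToyThread`, and `Range` are hypothetical names introduced here for illustration, and the one-label-byte-per-application-byte layout is an assumption; the real runtime goes through `dfsan_set_label`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical half-open address range [beg, end).
struct Range {
  std::size_t beg;
  std::size_t end;
};

// Hypothetical toy shadow: one label byte per application byte (assumption).
class ToyShadow {
 public:
  explicit ToyShadow(std::size_t n) : labels_(n, 0) {}

  // Stand-in for dfsan_set_label(label, addr, size).
  void SetLabel(std::uint8_t label, std::size_t beg, std::size_t size) {
    assert(beg + size <= labels_.size());
    if (size != 0) std::memset(labels_.data() + beg, label, size);
  }

  // True when every label in [beg, beg + size) is zero.
  bool IsClean(std::size_t beg, std::size_t size) const {
    for (std::size_t i = beg; i < beg + size; ++i)
      if (labels_[i] != 0) return false;
    return true;
  }

 private:
  std::vector<std::uint8_t> labels_;
};

// Hypothetical thread record holding the ranges the diff clears.
struct ToyThread {
  Range stack;
  Range tls;
  std::vector<Range> dtls;

  // Mirrors the order used in the diff: stack, static TLS, dynamic TLS.
  void ClearShadow(ToyShadow &shadow) const {
    shadow.SetLabel(0, stack.beg, stack.end - stack.beg);
    if (tls.beg != tls.end)  // skip an empty static TLS block
      shadow.SetLabel(0, tls.beg, tls.end - tls.beg);
    for (const Range &r : dtls)
      shadow.SetLabel(0, r.beg, r.end - r.beg);
  }
};
```

Calling `ClearShadow` at both thread start and thread teardown matches the comment in `Destroy()`: later TSD destructors may still run on the same stack, so stale labels are dropped before the thread object is unmapped.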

	thread_return_t DFsanThread::ThreadStart() {			thread_return_t DFsanThread::ThreadStart() {
	Init();			Init();

	if (!start_routine_) {			if (!start_routine_) {
	// start_routine_ == 0 if we're on the main thread or on one of the			// start_routine_ == 0 if we're on the main thread or on one of the
	// OS X libdispatch worker threads. But nobody is supposed to call			// OS X libdispatch worker threads. But nobody is supposed to call

[dfsan] Add a DFSan allocator (Closed, Public)

Diff 342585

compiler-rt/lib/dfsan/CMakeLists.txt

compiler-rt/lib/dfsan/dfsan.h

compiler-rt/lib/dfsan/dfsan.cpp

compiler-rt/lib/dfsan/dfsan_allocator.h

compiler-rt/lib/dfsan/dfsan_allocator.cpp

compiler-rt/lib/dfsan/dfsan_flags.inc

compiler-rt/lib/dfsan/dfsan_thread.h

compiler-rt/lib/dfsan/dfsan_thread.cpp