This is an archive of the discontinued LLVM Phabricator instance.

Add KMSAN instrumentation to MSan pass
ClosedPublic

Authored by glider on Jun 21 2017, 7:39 AM.

Details

Summary

Introduce the -msan-kernel flag, which enables the kernel instrumentation.

The main differences between KMSAN and MSan instrumentations are:

  • KMSAN implies msan-track-origins=2, msan-keep-going=true;
  • there're no explicit accesses to shadow and origin memory. Shadow and origin values for a particular X-byte memory location are read and written via pointers returned by __msan_metadata_ptr_for_load_X(u8 *addr) and __msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
  • TLS variables are stored in a single struct in per-task storage. A call to a function returning that struct is inserted into every instrumented function before the entry block;
  • __msan_warning() takes a 32-bit origin parameter;
  • local variables are poisoned with __msan_poison_alloca() upon function entry and unpoisoned with __msan_unpoison_alloca() before leaving the function;
  • the pass doesn't declare any global variables or add global constructors to the translation unit.

Diff Detail

Event Timeline

glider created this revision.Jun 21 2017, 7:39 AM
dvyukov edited edge metadata.Jun 22 2017, 1:28 AM

I know, not the feedback you was looking for :)

lib/Transforms/Instrumentation/MemorySanitizer.cpp
455

Looks like a bug in clang-format. Why is it 3 spaces here? All other members start with 2 and 2 is the proper style.

dvyukov added inline comments.Jun 22 2017, 1:45 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
234

I see that for interface part (flags, pass ctor) you copied KASAN approach, so this looks good to me.

638

ShadowOriginStruct seems to be used only in the loop below. So drop it and use the actual types in the loop.

679

The interface looks bulky because of distinction between arg/param/retval/va_arg. It feels that we put too much details into the interface, such interfaces are usually unstable. Is it possible to combine all that data into a single struct including shadow and origins (for user-space as well) and then have just single function that requests data at offsets of that struct? Or maybe just returns a pointer to the thread-local struct. I should also reduce amount of differences between kernel and userspace.

We could go further and reuse __kmsan_load/store_shadow_origin functions by giving then some magic address that mean TLS. Not sure if it will be good or not.

dvyukov added inline comments.Jun 22 2017, 1:56 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
679

Note that we can't version compiler and kernel atomically. This interface will have to stay basically forever. If we will want to improve it later, we will only make situation worse because we will have to support 2 interfaces.

Evgenii, please take a look at actual pass changes.

glider added inline comments.Jun 22 2017, 2:19 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
689

@eugenis: Dima suggested we merge all msan_XXX_tls globals into a single struct, this way I can replace all the kmsan_get_XXX_tls() callbacks with a single one.
WDYT?

glider added inline comments.Jun 22 2017, 2:28 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
455

I think this is being caused by me formatting the diff, not the whole file.
clang-format handles this case correctly in the wild.

638

This is a legacy, will remove.

eugenis added inline comments.Jun 23 2017, 5:00 PM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
679

That would be an ABI break on Linux. And hiding the arg/param/retval/etc details in a struct does not make the interface any more stable.
On the other hand, it makes the interface simpler, so if you think you can do it just for the kernel w/o complicating the runtime library too much, then go for it.

679

Are you sure there is no way to version compiler and the runtime atomically? As I imagine, the part that must be in sync with the kernel is basically shadow mapping and the bool are_we_initialized_already() function. Is it possible to make that the forever-stable interface, and tie everything on top of it to the compiler?

glider added inline comments.Jun 26 2017, 5:15 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
679

Are you sure there is no way to version compiler and the runtime atomically?

If we're talking about the runtime residing in the upstream kernel, then yes.
Clang versions live long, so if we ever introduce a new API version, then the kernel will have to support all the existing versions.

As I imagine, the part that must be in sync with the kernel is basically shadow mapping and the bool are_we_initialized_already() function.

Neither is currently part of the compiler API. The kernel allocates the shadow on per-page basis (i.e. shadow is non-contiguous), and these details better be abstracted from the instrumentation.

Is it possible to make that the forever-stable interface, and tie everything on top of it to the compiler?

If we treat the userspace shadow mapping as an example of such a stable interface, we see that it's not enough (e.g. handling va_args require additional globals or functions).
The question is basically whether right now MSan interface is stable enough, or will there be changes to it in the future.

679

On the other hand, it makes the interface simpler, so if you think you can do it just for the kernel w/o complicating the runtime library too much, then go for it.

Ok, I'll make this change just for the kernel.

glider updated this revision to Diff 104658.Jun 29 2017, 8:38 AM
glider edited the summary of this revision. (Show Details)
glider updated this revision to Diff 105098.Jul 3 2017, 9:14 AM
glider edited the summary of this revision. (Show Details)

Updated the patch.
Now all KMSAN-specific functions bear the "__msan_" prefix so that they can be used in other contexts (e.g. MSan instrumentation for Android).
I've also dropped aliasing functions that were working with argument shadow, except for __msan_load_arg_shadow() and __msan_store_arg_shadow().

Please take another look.

eugenis edited edge metadata.Jul 5 2017, 6:02 PM

In general, way too many small pieces of kernel-specific code. Please try to come up with a set of common utility functions that would capture the difference between kernel and userspace.

lib/Transforms/Instrumentation/MemorySanitizer.cpp
1139

There is some code duplication between this function and materializeStoresUserspace. It could be better to implement a kernel-specific storeOrigin() and keep this one common.

1154

When can origin be less than 32 bit?

1702

I think this is the wrong level of abstraction. You don't want kernel-specific handling of LoadInst, you want kernel-specific primitives for writing and reading shadow. You can use them in handleVectorLoadIntrinsic as well as other similar places. Ideally, there should be no checks for MS.CompileKernel outside of such common utility functgions.

3104

How can it be? The call to this function is inserted in the Kmsan prologue, and we specifically step over the prologue when running the visitor in runOnFunction.

glider updated this revision to Diff 124592.Nov 28 2017, 9:30 AM
glider marked 3 inline comments as done.
glider edited the summary of this revision. (Show Details)

Updated the diff.
This is still work in progress, and I'm not sure what's the best way to proceed with it.
The diff of the instrumentation pass patch itself is around 1000 LoC, plus I've moved around some IR tests.
(In particular the tests that behave differently with and without -msan-kernel=1 were put into separate files, while those agnostic to the memory layout remained in msan_common_basic.ll)

We can try moving forward with this patch as a whole, or I can start landing small chunks of it just for the userspace (e.g. replace F.getEntryBlock() with MSV.ActualFnStart everywhere, factor some code into helper functions) so that it'll be easier to add a kernel-specific implementation to them.

Please disregard. Looks like I've missed some opportunities to move the CompileKernel() checks to the lower level.

glider updated this revision to Diff 140971.Apr 4 2018, 9:00 AM
glider edited the summary of this revision. (Show Details)

Updated the diff.
This is more or less the final version of the KMSAN API. I'm still working on the tests and fixing minor nits.
I'll be OOO till the end of April, so take your time.

eugenis added inline comments.Apr 6 2018, 11:05 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
444

Lets make -mllvm flags take precedence, as in KASan (see ClRecover.getNumOccurrences() > 0).

675

What are these, why do they have "_arg_" in the name but not in the description, and why are they not used?

glider marked an inline comment as done.Jul 20 2018, 9:07 AM
glider added inline comments.
lib/Transforms/Instrumentation/MemorySanitizer.cpp
675

Removed these.

glider updated this revision to Diff 156510.Jul 20 2018, 9:09 AM

Major update after landing the kernel-inspecific parts.

eugenis added inline comments.Jul 23 2018, 1:58 PM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
754–758

Please invert this check.

919–920

Please rebase on ToT.

1096–1102

Invert this check, and a few others below.

1337

I think you want DL.getTypeStoreSize() here.

1520

std::ignore instead of CpOriginPtr to avoid unused variable warning? Or is the compiler not smart enough to see that it is unused?

3294

Could you explain why this function is passed return address (i.e. the caller pc)?

Why is this api so different from userspace?

3497

I think you can detect this by looking at function attributes.

3580

This is not about kernel at all, is it?
Could you fix it for userspace as well, as a separate change?

test/Instrumentation/MemorySanitizer/msan_kernel_basic.ll
17

AFAIK, in no-asserts build IR temporary names are lost, so this "%param_shadow" is not going to match.

45

Please add checks for actual arguments etc. For example, make sure that the right member of context_state is called for param_shadow. No need to repeat this in every test case, just once would be enough.

glider updated this revision to Diff 159504.Aug 7 2018, 7:25 AM

Rebased to r339138.

glider marked 6 inline comments as done.Aug 8 2018, 4:13 AM
glider added inline comments.
lib/Transforms/Instrumentation/MemorySanitizer.cpp
1520

Looks like it isn't.
But I've replaced the variable with std::ignore anyway.

3294

It's too expensive to unwind the stack for each alloca, and __builtin_return_address(1) is unreliable, therefore we take the caller's return address to get two stack frames instead of one.

3497

I've only found mentions of SSE in the "target-features" string, where they look like: "-sse,-sse2,-sse3,...", not sure it's easier to parse those.

3580

This is gonna change the userspace ABI, as we'll need to introduce the origin tls var.
I can sure fix it for the userspace, we'll just need to be accurate deploying the new version.

test/Instrumentation/MemorySanitizer/msan_kernel_basic.ll
17

Hm, I don't see any difference between files built with -f[no-]discard-value-names using Clang built with/without assertions.
I've replaced the value names with seven getelementptr instructions.

glider updated this revision to Diff 159682.Aug 8 2018, 4:15 AM
glider marked an inline comment as done.
glider retitled this revision from [RFC] Add KMSAN instrumentation to MSan pass to Add KMSAN instrumentation to MSan pass.
glider edited the summary of this revision. (Show Details)
glider added a comment.Aug 8 2018, 5:36 AM

Re: value names, there's code in Clang that explicitly overrides discarding them for ASan/MSan builds.

eugenis added inline comments.Aug 8 2018, 2:10 PM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
1520

Hmm, now it is clear that this chunk does not do anything at all.
Revert it (but keep the TODO)?

3294

Is that because you can not use frame pointers in the kernel for some reason?

Looking at the code for __msan_poison_alloca in kmsan, with the loops and the hash tables and origin setting (with even more loops), I find it hard to believe that taking one or even several steps through FP chain would slow it down.

Note that when a function has more than one local variable, you are doing some redundant work.

Having said that, I don't really mind this change.

It would be nice to change the function name though, because it has very different semantics from the userspace function.

3445

Remove this.

3497

It's not easier, but it is correct.

3580

It won't break old instrumented code running with new runtime library, right?

AFAIK that's all that we care about.

test/Instrumentation/MemorySanitizer/msan_kernel_basic.ll
17

Right. As you mention in the other changelist, we preserve value names in asan/msan builds.

Anyway, I like this new test more.

glider updated this revision to Diff 159910.Aug 9 2018, 6:26 AM
glider marked 2 inline comments as done.
glider edited the summary of this revision. (Show Details)
glider added inline comments.
lib/Transforms/Instrumentation/MemorySanitizer.cpp
3294

Is that because you can not use frame pointers in the kernel for some reason?

There are configurations that disable frame pointers and use a different unwinding scheme.

Looking at the code for __msan_poison_alloca in kmsan, with the loops and the hash tables and origin setting (with even more loops), I find it hard to believe that taking one or even several steps through FP chain would slow it down.

Fair point.

Note that when a function has more than one local variable, you are doing some redundant work.

It might be a good idea to cache the unwound stack somehow, e.g. call the unwinder at the function prologue.

Having said that, I don't really mind this change.

Looks like it really doesn't make any difference whether we pass the pc or unwind it.
I'd better drop it then.

It would be nice to change the function name though, because it has very different semantics from the userspace function.

It's already different :)
There's no __msan_poison_alloca in the userspace, we're defining it in createKernelApi()

3497

Ok, I'll send a separate patch along the lines of:

AMD64FpEndOffset = AMD64FpEndOffsetSSE;
for (const auto &Attr : F.getAttributes().getFnAttributes()) {
  if (Attr.isStringAttribute() && (Attr.getKindAsString() == "target-features")) {
    if (Attr.getValueAsString().contains("-sse"))
      AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
    break;
  }
}

I'm not sure though that we're guaranteed to have those attributes (e.g. we currently don't have them in the tests, so we default to 176 bytes)

3580

Guess it shouldn't.
Ok, will do.

eugenis added inline comments.Aug 9 2018, 2:24 PM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
3294

It's already different :)
There's no __msan_poison_alloca in the userspace, we're defining it in createKernelApi()

Indeed.

3497

Then we should do whatever the target does when it does not find the attribute.

glider updated this revision to Diff 160070.Aug 10 2018, 1:28 AM

Picked r339414, updated the test, PTAL

Looks like all that's left is to split va_arg origin copy into a separate change and enable it in userspace.

glider updated this revision to Diff 163046.Aug 29 2018, 4:18 AM

Incorporate https://reviews.llvm.org/D51412 (va_arg origin support in userspace)

glider updated this revision to Diff 163052.Aug 29 2018, 4:42 AM

Now for realz (pulled the missing bit of va_arg support patch)

LGTM, but you probably want to rebase it on top of D51412, instead of including that change in this one.

Sure, will do.

glider added inline comments.Aug 31 2018, 2:27 AM
test/Instrumentation/MemorySanitizer/store-origin.ll
64

TODO: replace "01-9a-z" with "_0-9a-z"

glider updated this revision to Diff 164171.Sep 6 2018, 2:12 AM

Rebased the CL on top of recent vararg changes

eugenis accepted this revision.Sep 6 2018, 2:56 PM

LGTM

This revision is now accepted and ready to land.Sep 6 2018, 2:56 PM
glider closed this revision.Sep 7 2018, 2:11 AM

Landed r341637, hope it sticks!