This is an archive of the discontinued LLVM Phabricator instance.

[PATCH] [sanitizers] Add MSan support for AArch64
ClosedPublic

Authored by zatrazz on Sep 8 2015, 2:14 PM.

Details

Summary

This patch adds support for msan on aarch64-linux for both 39 and
42-bit VMA. The support is enabled by defining the
SANITIZER_AARCH64_VMA compiler flag to either 39 or 42 at build time
for both clang/llvm and compiler-rt. The default VMA is 39 bits.

Diff Detail

Event Timeline

zatrazz updated this revision to Diff 34260.Sep 8 2015, 2:14 PM
zatrazz retitled this revision from to [PATCH] [sanitizers] Add MSan support for AArch64.
zatrazz updated this object.
zatrazz added a subscriber: llvm-commits.
eugenis added a subscriber: eugenis.Sep 8 2015, 2:39 PM
eugenis added inline comments.
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

What's that?
It's not even attached to any cmake setting.
Does it mean that a typical clang binary can target all of x86/arm/ppc/etc, 32/64, but only one fixed aarch64 kernel configuration?

I really think this should be a compiler flag.

And it should be possible to remove the corresponding macro in compiler-rt as well by exporting this setting through a global variable in the compiled objects (see __msan_keep_going or the ASAN_FLEXIBLE_MAPPING thing).

samsonov added inline comments.
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

+1.

I'm sorry that we didn't figure it out promptly during reviews where this define was added to another sanitizers. We should rework this - building compiler and runtimes with a secret define is not the way to go.

rengolin added inline comments.Sep 9 2015, 4:10 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

What's that?

An unfortunate choice. We're in need of a better one.

It's not even attached to any cmake setting.

CFLAGS. I know...

Does it mean that a typical clang binary can target all of x86/arm/ppc/etc, 32/64, but only one fixed aarch64 kernel configuration?

I think so. If we want to target both 39 and 42, we'd have to create two separate libraries every time. With 48 coming to be the default in a year or so, we'd have to keep three. If that's acceptable, than I think we should create them all every time. If not, we need a way to select them at build time.

I really think this should be a compiler flag.

Can you expand on that? This is a temporary implementation detail that is very specific to AArch64, and one that will disappear soon.

Currently, it *is* a compiler flag. Not a pretty one, but one that we expect to be used *very* rarely. In the future, when it all becomes 48 bits, it'll be used even less, and possibly deprecated altogether when we give up supporting the "old" VMA configurations.

Creating an -mvma or anything like that would be kinda pointless, and harder to revert later.

zatrazz added inline comments.Sep 9 2015, 11:48 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

What's that?
It's not even attached to any cmake setting.
Does it mean that a typical clang binary can target all of x86/arm/ppc/etc, 32/64, but only one fixed aarch64 kernel configuration?

I really think this should be a compiler flag.

And it should be possible to remove the corresponding macro in compiler-rt as well by exporting this setting through a global variable in the compiled objects (see __msan_keep_going or the ASAN_FLEXIBLE_MAPPING thing).

I though about adding it as compiler flag and let the compiler/user decide to which VMA to use. However, as pointed out by rengolin it would require to build multiple sanitizer libraries for each VMA and select them at link time. If it is the desired target I can evaluate this option, however I would like to avoid such complication from the fact recent kernel change default VMA to 48 bits and the idea is to use only this config.

eugenis added inline comments.Sep 9 2015, 12:39 PM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

It should be possible to have a single runtime library that would detect VMA at runtime. Maybe it would cost us a percent or two of performance.

Having said that, I don't think this macro, as it is, would cause us any problems in practice, and I don't mind if you go ahead with this patch.

Two questions:

  1. Does lower VMA setting work on kernels configured for higher VMA, or does it have to be an exact match?
  2. is there a runtime check for an incompatible or suboptimal VMA setting?

It should be possible to have a single runtime library that would detect VMA at runtime. Maybe it would cost us a percent or two of performance.

The performance issue is not the runtime detection, but how to disentangle all the macros and one VMA assumption
in sanitizer code. The MSAN compiler-rt patch relies on define both the internal allocator (which can be SizeClassAllocator32
or SizeClassAllocator64 depending of the available VMA size) and transformation macros (MEM_TO_SHADOW) that are
used extensively in a lot of places.

And there is the question if this work will be really useful, since the idea is indeed to have only one VMA as default (48-bit
for aarch64).

Having said that, I don't think this macro, as it is, would cause us any problems in practice, and I don't mind if you go ahead with this patch.

Two questions:

Does lower VMA setting work on kernels configured for higher VMA, or does it have to be an exact match?

It must be a exact match because compiler-rt sanitizer libraries mmap specific areas for shadow and origin memory
and they are calculated based on where the user regions are placed. For instance, for 39-bit VMA some text segment
is expected to be between 0x7000000000-0x8000000000 where for 42-bit VMA 0x3f000000000-0x40000000000. This
generates complete different mappings.

is there a runtime check for an incompatible or suboptimal VMA setting?

Not right now, but it is something I will implement.

rengolin edited edge metadata.Sep 10 2015, 5:45 AM

is there a runtime check for an incompatible or suboptimal VMA setting?

Not right now, but it is something I will implement.

That's a good idea, especially now that we have two "defaults", depending on the distro.

kristof.beyls added inline comments.
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

I'm a bit confused.
It seems to me that this is adding functionality to the code-generation part of the sanitizer, i.e. embedded in the compiler right? And we're making a decision at compiler-construction time about only supporting one variant of AArch64 linux kernels? That doesn't seem right to me as at compiler-construction time you may want to create a compiler that can target all the different variants of linux kernels.

eugenis added inline comments.Sep 10 2015, 11:52 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

Yeah, it does sound quite bad.
I'd also like to argue that lower VMA settings are unlikely to go away in the near future. Even when the linux kernel defaults to 48, there are platforms using older kernels, already released devices with at least a few years of lifetime. For example, a toolchain in the Android NDK would need to support VMA variants for a wide set of devices, and bundling a whole extra toolchain or two just for aarch64 is not an option.

I take back my earlier comment. I don't see a way around -mvma.

I agree that a runtime VMA detection (i.e. a single runtime library for all VMA variants) is not realistic. Sounds like we'll need to build 3 libraries and pick the correct one in the driver based on -mvma value.

rengolin added inline comments.Sep 11 2015, 2:20 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

I didn't mean 39/42 are going away, just that we'll probably move the default some time in the future.

In the long run, I agree that -mvma will be required, if anything, to pick one of the versions. We could also have the detection in the compiler itself, to act as a default, but due to cross-compilation issues, we'll need the flag anyway.

Kristof also mentioned about name mangling as a way to do some run time selection, but that would also require some magic in the run time, either a dispatch table, or self-modifying code at start up (like IFUNC). I don't think we're at the stage that we can take any of these decisions lightly.

Given the status of the sanitizer code, where a lot of it is either hard-coded or based on pre-processor macros, trying anything dynamic will be a big refactoring change. One that we may need, granted, but not at the expense of having the sanitizers working on the two VMA configurations we have today in some way.

As I said earlier, we're discussing this, and maybe we should put forward a BOF session in the US LLVM to discuss the refactoring, but meanwhile, this allows us to test it on both VMAs as we're doing for the other sanitizers.

We haven't taken any decision, nor we're not changing anything here, just adding functionality.

kristof.beyls added inline comments.Sep 11 2015, 2:37 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

I don't think that name mangling requires dispatch tables or self-modifying code.

I agree that name mangling or another mechanism to fail at binary build/link time rather than at run time can be done as part of a separate patch.

However, I do think that hard-coding/restricting which linux distribution you can target when building the compiler is evil, and this patch shouldn't do that. I therefore think that some command line option indeed is needed with a reasonable default before this patch can land.

rengolin added inline comments.Sep 11 2015, 2:47 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

There is a compilation option to control that, -DSANITIZER_AARCH64_VMA=42. I agree it's not the best, but being temporary until we can refactor all the *other* hard-coded stuff to make this work at run time without performance impacts, I think it's acceptable.

zatrazz added inline comments.Sep 11 2015, 5:33 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

This patch itself is a *enablement* one meant to make it possible to run and check msan output in current AArch64 scenarios. The idea is to be able to continuous check for regression by buildbots and add an initial support.

We do agree that current approach (using a compiler flag) is not the best option and it should go away with a better strategy. However, as pointed out I also do agree it should be done in another patch and incrementally. It requires some refactor from compiler-rt to build/generate multiple versions of the required symbols used by the sanitizers and the implementation of the flag mechanism.

My idea is to after finish the msan sanitizer (aarch64-linux now supports asan, tsan, and dfsan) is to work on a mechanism to select the vma at build time and make the compiler-rt transparent to it (either by using name mangling, runtime detection, or multiples compiler-rt builds).

So should we still block this patch based on the assumption that this mechanism will be one used for general deployment?

kristof.beyls added inline comments.Sep 11 2015, 5:44 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

My main objection is that - unless I misunderstand - the current patch doesn't offer a command line flag to clang and/or llvm to specify the vma size of the system you're targeting.
clang/llvm offers cross-compilation support out-of-the-box and only supporting a single vma size in built libraries just seems such a big regression over by-default cross-compilation support that I think this patch shouldn't land without a way to support instrumentation for different vma sizes.
The way the patch currently is, it's basically impossible to create a single clang binary that can offer native support across all linux-aarch distributions. I think that's just too wrong.
Furthermore, it seems like a minor change to this patch to not hard-code which vma size is supported in the llvm/clang binary being built, but instead having an llvm flag for this.

Basically, I object to using predefines like SANITIZER_AARCH64_VMA in the compiler source code - i.e. this patch.

rengolin added inline comments.Sep 11 2015, 6:00 AM
lib/Transforms/Instrumentation/MemorySanitizer.cpp
125

My main objection is that - unless I misunderstand - the current patch doesn't offer a command line flag to clang and/or llvm to specify the vma size of the system you're targeting.

It does, -DSANITIZER_AARCH64_VMA=42.

We are testing both 39 and 42 bits VMA on our buildbots, so it *is* possible to change it.

The way the patch currently is, it's basically impossible to create a single clang binary that can offer native support across all linux-aarch distributions. I think that's just too wrong.

That's not this patch, that's how the whole sanitizer is hard-coded to be. That's a very important thing we need to implement, which will be our next step, but this is not related to this patch.

This specific patch is not using anything that the other sanitizers are not, since we're not inventing the macro here, now.

Because all sanitizers use the same macro, once we fix for one, we fix for all. But this is not the point of this patch.

rengolin accepted this revision.Sep 15 2015, 7:45 AM
rengolin edited edge metadata.
This revision is now accepted and ready to land.Sep 15 2015, 7:45 AM
zatrazz updated this revision to Diff 34889.Sep 16 2015, 7:33 AM
zatrazz edited edge metadata.

I will push this patch.

zatrazz updated this revision to Diff 34890.Sep 16 2015, 7:35 AM

Wrong patch (this was meant for compiler-rt).

zatrazz closed this revision.Sep 16 2015, 8:14 AM