This is an archive of the discontinued LLVM Phabricator instance.

[PATCH] [compiler-rt] [sanitizers] Add VMA size check at runtime
ClosedPublic

Authored by zatrazz on Sep 10 2015, 8:03 AM.

Details

Summary

This patch adds a runtime check for asan, dfsan, msan, and tsan for
architectures that support multiple VMA size (like aarch64). Currently
the check only prints a warning indicating which is the VMA built and
expected against the one detected at runtime.

Diff Detail

Event Timeline

zatrazz updated this revision to Diff 34447.Sep 10 2015, 8:03 AM
zatrazz retitled this revision from to [PATCH] [compiler-rt] [sanitizers] Add VMA size check at runtime.
zatrazz updated this object.
zatrazz added reviewers: kcc, dvyukov, eugenis, pcc, rengolin.
zatrazz added a subscriber: llvm-commits.
rengolin added inline comments.Sep 10 2015, 8:57 AM
lib/sanitizer_common/sanitizer_posix.cc
336

I was expecting this to print things like 39/42, not the max size.

zatrazz added inline comments.Sep 10 2015, 9:57 AM
lib/sanitizer_common/sanitizer_posix.cc
336

I tried to be less intrusive as possible, since only GetMaxVirtualAddress is really defined in a multiplatform way. However if the idea is to show 39/42 I think we can create a symbol to return it as well.

rengolin added inline comments.Sep 10 2015, 10:02 AM
lib/sanitizer_common/sanitizer_posix.cc
336

I think the max size will be very uninformative. Maybe it would be good to have that CLZ function in the sanitizers after all.

zatrazz updated this revision to Diff 34484.Sep 10 2015, 1:41 PM

Updated patch to prints bits instead of total VMA range.

I would have assumed it would be relatively straightforward to support multiple virtual-memory address space sizes in a single compiler-rt library, by adding name mangling to the public functions in the library that need to behave differently for different VMA sizes. E.g. couldn't a name mangling scheme adding "_vma39/_vma42/_vma48" be used to get all the functions depending on VMA size into a single compiler-rt library?
Building different compiler-rt libraries just for this difference seems inconvenient to users to me. Furthermore, if for some reason another difference requires slightly different implementations in the sanitizers or in other functionality in compiler-rt, are we going to build the cartesian cross-product of all variants?

I haven't looked into the details of how the sanitizers are implemented, but I'm assuming there's a very good reason why the shadow regions need to be placed in an area depending on the VMA size?

dvyukov edited edge metadata.Sep 11 2015, 12:54 AM

Otherwise LGTM

lib/sanitizer_common/sanitizer_posix.cc
335

Add tool name in the beginning. There is a function SanitizerToolName() or something like that.
We generally prefix all output with tool name, because sometimes people run third_party programs or very large programs, and they may not know that this output comes from the tool.

rengolin edited edge metadata.Sep 11 2015, 1:17 AM

I would have assumed it would be relatively straightforward to support multiple virtual-memory address space sizes in a single compiler-rt library, by adding name mangling to the public functions in the library that need to behave differently for different VMA sizes. E.g. couldn't a name mangling scheme adding "_vma39/_vma42/_vma48" be used to get all the functions depending on VMA size into a single compiler-rt library?

Hi Kristof,

The main issue here is run time impact of the decisions, and how it led the development of the sanitizers, and that's a topic not suitable to a small patch like this. :)

It should be possible to come up with a clean design to make run time decisions cheaper and more elegant, and we're discussing this on IRC, mailing list and privately. There will be a session at Connect to discuss this with the ARM GCC team, and if you guys are interested, I'll send you the invite.

But even so, none of that will give us any final direction, it'll be just an informal talk. The real discussion will happen in the list, hopefully also in the US LLVM meeting, so we can come up with a final design that doesn't regress performance and gets us our run time VMA choice.

The sanitizers were developed with the mindset that there is only one memory map per architecture and that's hard-coded into the pre-processor macros. We'll need a bunch of small refactorings before we can start thinking about name mangling. For now, we need some mechanism that works.

cheers,
--renato

rengolin accepted this revision.Sep 11 2015, 1:18 AM
rengolin edited edge metadata.

With Dmitry's comments, LGTM, too.

This revision is now accepted and ready to land.Sep 11 2015, 1:18 AM

I would have assumed it would be relatively straightforward to support multiple virtual-memory address space sizes in a single compiler-rt library, by adding name mangling to the public functions in the library that need to behave differently for different VMA sizes. E.g. couldn't a name mangling scheme adding "_vma39/_vma42/_vma48" be used to get all the functions depending on VMA size into a single compiler-rt library?

Hi Kristof,

The main issue here is run time impact of the decisions, and how it led the development of the sanitizers, and that's a topic not suitable to a small patch like this. :)

It should be possible to come up with a clean design to make run time decisions cheaper and more elegant, and we're discussing this on IRC, mailing list and privately. There will be a session at Connect to discuss this with the ARM GCC team, and if you guys are interested, I'll send you the invite.

But even so, none of that will give us any final direction, it'll be just an informal talk. The real discussion will happen in the list, hopefully also in the US LLVM meeting, so we can come up with a final design that doesn't regress performance and gets us our run time VMA choice.

The sanitizers were developed with the mindset that there is only one memory map per architecture and that's hard-coded into the pre-processor macros. We'll need a bunch of small refactorings before we can start thinking about name mangling. For now, we need some mechanism that works.

cheers,
--renato

I don't see why name mangling would add run-time overhead - but it would avoid having to build different variants of compiler-rt for AArch64.
If name mangling does add run-time overhead, could you explain why?

I don't see why name mangling would add run-time overhead - but it would avoid having to build different variants of compiler-rt for AArch64.
If name mangling does add run-time overhead, could you explain why?

I'd rather discuss this outside of this patch, as it is just a safety check, not a decision making change. Feel free to send an email to the list if you want to start the discussion now.

I don't see why name mangling would add run-time overhead - but it would avoid having to build different variants of compiler-rt for AArch64.
If name mangling does add run-time overhead, could you explain why?

I'd rather discuss this outside of this patch, as it is just a safety check, not a decision making change. Feel free to send an email to the list if you want to start the discussion now.

Fair enough - I hadn't noticed we're already requiring different compiler-rt builds for different variants of AArch64-linux.

zatrazz closed this revision.Sep 11 2015, 7:02 AM