This is an archive of the discontinued LLVM Phabricator instance.

[sanitizer] Use same shadow offset for aarch64
ClosedPublic

Authored by zatrazz on Oct 15 2015, 12:52 PM.

Details

Summary

This patch makes ASAN for aarch64 use the same shadow offset for all
currently supported VMAs (39 and 42 bits). The shadow offset is the
one already used for 39-bit (36). Similar to the ppc64 port, the aarch64
transformation also requires an 'add' instead of an 'or' for 42-bit VMA.
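
As a rough illustration of why the 'or' stops working on 42-bit VMA, here is a minimal sketch. It assumes the usual ASAN shadow scale of 3 and the offset 1 << 36 named above; the function names are hypothetical and this is not the actual instrumentation code. The 'or' only matches the 'add' while bit 36 of the shifted address is clear, which holds for 39-bit addresses but not for high 42-bit ones.

```cpp
#include <cstdint>

// Assumption: one shadow byte covers 8 application bytes (scale 3).
constexpr unsigned kShadowScale = 3;
// Shadow offset from the summary: 1 << 36.
constexpr uint64_t kShadowOffset = 1ULL << 36;

// Hypothetical helper: shadow address computed with an add.
constexpr uint64_t ShadowWithAdd(uint64_t addr) {
  return (addr >> kShadowScale) + kShadowOffset;
}

// Hypothetical helper: shadow address computed with an or.
// Equal to the add only when bit 36 of (addr >> 3) is clear.
constexpr uint64_t ShadowWithOr(uint64_t addr) {
  return (addr >> kShadowScale) | kShadowOffset;
}

// A 39-bit address shifts down to at most 36 bits, so both forms agree.
static_assert(ShadowWithAdd(0x7f80000000ULL) == ShadowWithOr(0x7f80000000ULL),
              "39-bit: or and add agree");
// A high 42-bit address keeps bit 36 set after the shift, so they differ.
static_assert(ShadowWithAdd(0x3ff00000000ULL) != ShadowWithOr(0x3ff00000000ULL),
              "42-bit: or and add differ");
```

With the 'or', the high 42-bit address would map onto itself shifted, without being displaced by the offset, which is why a single offset for both VMAs needs the 'add'.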

No regressions found on 39-bit and 42-bit VMA. I have not checked on 48-bit
due to the lack of a working system.

Diff Detail

Event Timeline

zatrazz updated this revision to Diff 37507. Oct 15 2015, 12:52 PM
zatrazz retitled this revision from to [sanitizer] Use same shadow offset for aarch64.
zatrazz updated this object.
zatrazz added a subscriber: llvm-commits.
rengolin edited edge metadata. Oct 15 2015, 1:37 PM

Hi Adhemerval,

What is the impact of this change? I remember that using 39-bits config on 42-bits broke a lot of tests. Is the addition a cure for all those problems?

What about the other sanitizers? Some of them, like TSAN, run together with ASAN, shouldn't the settings be the same on both?

For the next steps, do you think this will work with all the other sanitizers?

cheers,
--renato

Hi Adhemerval,

What is the impact of this change? I remember that using 39-bits config on 42-bits broke a lot of tests. Is the addition a cure for all those problems?

I tested on 42-bit with no regressions. It will use different mappings, but I also sent a compiler-rt
patch to adjust this [1]. The only downside is that for 39 bits it will need to cover a 42-bit VMA in
TwoLevelByteMap and thus consume slightly more memory for the mapping (which is something I think we can live with).
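
To give a feel for the size of that increase, here is a back-of-the-envelope sketch. It assumes a two-level byte map whose first level holds one pointer per block of second-level entries. The region size (2^20 bytes) and second-level size (2^12 entries) are hypothetical parameters chosen for illustration, not values taken from this patch.

```cpp
#include <cstdint>

// Assumption: the allocator tracks regions of 2^20 bytes (hypothetical).
constexpr unsigned kRegionSizeLog = 20;
// Assumption: each second-level block covers 2^12 regions (hypothetical).
constexpr unsigned kSecondLevelLog = 12;

// Number of first-level slots needed to cover a VMA of the given width.
constexpr uint64_t FirstLevelEntries(unsigned vma_bits) {
  return 1ULL << (vma_bits - kRegionSizeLog - kSecondLevelLog);
}

// Bytes taken by the first level, one pointer per slot.
constexpr uint64_t FirstLevelBytes(unsigned vma_bits) {
  return FirstLevelEntries(vma_bits) * sizeof(void *);
}

// Covering 42 bits instead of 39 needs 8x as many first-level slots.
static_assert(FirstLevelEntries(42) == 8 * FirstLevelEntries(39),
              "three extra address bits mean 8x the slots");
```

Under these assumptions the always-allocated first level grows from 128 to 1024 pointer slots, while second-level blocks are still only needed for regions that are actually used.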

What about the other sanitizers? Some of them, like TSAN, run together with ASAN, shouldn't the settings be the same on both?

That's not true, tsan cannot be run with asan (trying to use -fsanitize=address,thread issues an error).
Also, conceptually it is not really possible: memory operations are handled differently by
these two sanitizers and the memory mapping is defined independently for each one. You can have some
sanitizers working together, like asan and lsan, but only because both use the same infrastructure.

For the next steps, do you think this will work with all the other sanitizers?

I am currently trying to make it happen for msan and it looks feasible: the strategy I am using is
to use the same instrumentation for both 39 and 42-bit VMA and use the same mapping for
both. The idea is to define a transformation that will translate 39-bit segments to 39-bit shadow
addresses and, for 39-bit VMA, only map segments up to 39 bits. For instance, using the
39-bit msan instrumentation scheme for a 42-bit VMA:

{0x05500000000ULL, 0x055FFFFFFFFULL, MappingDesc::SHADOW,  "app-1"},
{0x04000000000ULL, 0x04100000000ULL, MappingDesc::SHADOW,  "shadow-1"},
{0x04300000000ULL, 0x04400000000ULL, MappingDesc::ORIGIN,  "origin-1"},
{0x07000000000ULL, 0x07FFFFFFFFFULL, MappingDesc::SHADOW,  "app-2"},
{0x04100000000ULL, 0x04300000000ULL, MappingDesc::SHADOW,  "shadow-2"},
{0x04400000000ULL, 0x04600000000ULL, MappingDesc::ORIGIN,  "origin-2"},
{0x2AA00000000ULL, 0x2AAFFFFFFFFULL, MappingDesc::SHADOW,  "app-3"},
{0x2C300000000ULL, 0x2C400000000ULL, MappingDesc::SHADOW,  "shadow-3"},
{0x2C600000000ULL, 0x2C700000000ULL, MappingDesc::ORIGIN,  "origin-3"},
{0x2AA00000000ULL, 0x2AAFFFFFFFFULL, MappingDesc::SHADOW,  "app-4"},
{0x2C300000000ULL, 0x2C400000000ULL, MappingDesc::SHADOW,  "shadow-4"},
{0x2C600000000ULL, 0x2C700000000ULL, MappingDesc::ORIGIN,  "origin-4"},
{0x3F000000000ULL, 0x3FFFFFFFFFFULL, MappingDesc::SHADOW,  "app-5"},
{0x3C100000000ULL, 0x3C300000000ULL, MappingDesc::SHADOW,  "shadow-5"},
{0x3C400000000ULL, 0x3C600000000ULL, MappingDesc::ORIGIN,  "origin-5"},

For 39-bit VMA libsanitizer will only map the segments up to 0x8000000000, and for 42-bit
VMA it will map all the segments.
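
The selection rule described above can be sketched as follows. This is only an illustration: the Segment struct and helper names are hypothetical, the table is a subset of the list above with the MappingDesc type field dropped, and the real runtime logic may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for a mapping descriptor.
struct Segment {
  uint64_t start;
  uint64_t end;
  const char *name;
};

// Subset of the segments listed above.
const Segment kSegments[] = {
    {0x05500000000ULL, 0x055FFFFFFFFULL, "app-1"},
    {0x04000000000ULL, 0x04100000000ULL, "shadow-1"},
    {0x07000000000ULL, 0x07FFFFFFFFFULL, "app-2"},
    {0x2AA00000000ULL, 0x2AAFFFFFFFFULL, "app-3"},
    {0x3F000000000ULL, 0x3FFFFFFFFFFULL, "app-5"},
};

// A segment is usable only if it lies entirely below the VMA limit.
bool FitsInVma(const Segment &s, unsigned vma_bits) {
  return s.end <= (1ULL << vma_bits);
}

// Count how many of the listed segments would be mapped for a given VMA.
std::size_t CountMapped(unsigned vma_bits) {
  std::size_t n = 0;
  for (const Segment &s : kSegments)
    if (FitsInVma(s, vma_bits))
      ++n;
  return n;
}
```

With this subset, CountMapped(39) keeps only the segments below 0x8000000000 (app-1, shadow-1, app-2), while CountMapped(42) keeps all five.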

[1] http://reviews.llvm.org/D13782

thus consume slightly more memory in mapping (which is something I think we can live with).

When we proposed using the same mapping for both in January we were told that the memory increase was a real problem and that we should find a way that it could work best on both. That's why we started all this. :)

I'd like to have everyone agreeing that we can, indeed, live with it and finish this now. Last thing I want is to start a third round...

That's not true, tsan can not be run with asan (trying to use -fsanitize=address,thread issues an error).

I stand corrected.

You can have some
sanitizers working together, like asan and lsan, but only because both use the same infrastructure.

But my question still stands: Won't all the other sanitizers break when run together with ASAN with this change?

I thought we had cross tests like that already, so maybe my question is answered already...

cheers,
--renato

rengolin added a reviewer: samsonov.

thus consume slightly more memory in mapping (which is something I think we can live with).

When we proposed using the same mapping for both in January we were told that the memory increase was a real problem and that we should find a way that it could work best on both. That's why we started all this. :)

I will check the difference in memory consumption by the internal allocators with
and without this patch on 39-bit VMA.

I'd like to have everyone agreeing that we can, indeed, live with it and finish this now. Last thing I want is to start a third round...

That's not true, tsan can not be run with asan (trying to use -fsanitize=address,thread issues an error).

I stand corrected.

You can have some
sanitizers working together, like asan and lsan, but only because both use the same infrastructure.

But my question still stands: Won't all the other sanitizers break when run together with ASAN with this change?

I thought we had cross tests like that already, so maybe my question is answered already...

Afaik tests already cover the sanitizers that are meant to run concurrently, which for ASAN are LSAN and UBSAN.
MSAN and TSAN require different instrumentation and their mappings are defined independently of each other.

Afaik tests already cover the sanitizers that are meant to run concurrently, which for ASAN are LSAN and UBSAN.
MSAN and TSAN require different instrumentation and their mappings are defined independently of each other.

Sounds good, thanks!

eugenis accepted this revision. Oct 19 2015, 11:10 AM
eugenis edited edge metadata.
This revision is now accepted and ready to land. Oct 19 2015, 11:10 AM

Wow this is complicated :)
There's nothing bad about that, and the general approach sounds good.
Also, while you are at it, study all execution modes (PIE/non-PIE, ASLR enabled/disabled) and see if it is possible to devise a mapping that would support as many of those as possible. Also, consider MAP_32BIT - none of the regions on your list include the first 4GB of the address space.
See https://github.com/google/sanitizers/issues/579 for the recent linux/x86_64 mapping change.

zatrazz added a comment. Edited Oct 19 2015, 1:00 PM

I checked with ASLR enabled/disabled and the mappings seem fine:

  • For 39 bits, text segments are placed between 0x00400000-00XXXXXX for the binary itself, with [heap] randomized between 0x0000000-0xFFFFFFFF. High text addresses are either placed at 0x7fb7XXXXXXX or randomized within 0x7fXXXXXXXXX. For 42 bits, lower addresses follow the same pattern, with high addresses being randomized between 0x3f00000000-0x3fFFFFFFFFF.
  • I have not tested a PIE build yet; I will check that.
  • MAP_32BIT is valid only for x86-64 (64-bit programs).

For 39 bits, ASLR on/off without PIE maps the low addresses 0x00000000-0x10000000
(the executable's own segments) and 0x7f80000000-0x7f8fffffff (libraries, stack, vdso). PIE
builds for 39 bits move the main executable to 0x5500000000-0x5600000000.

For 42 bits, ASLR on/off without PIE also maps the executable in the same low address region
as 39 bits: 0x00000000-0x10000000. The high addresses used are different:
0x3ff000000000-0x3fffffffffff. PIE moves the main executable segments to
0x2aa00000000-0x2ab00000000.

I have tested with ASLR off/on and with/without PIE using MSAN's own tests and it shows
no regressions on 39 and 42 bits. I will push this when the compiler-rt counterpart patch
has been accepted.

zatrazz closed this revision. Nov 9 2015, 11:30 AM