Page MenuHomePhabricator

[LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA
Needs ReviewPublic

Authored by sebpop on Apr 3 2019, 7:58 PM.

Details

Summary

This patch fixes https://github.com/google/sanitizers/issues/703
On a 48-bit VMA aarch64 machine the time spent in LSan and ASan reduced from 2.5s to 0.01s when running

clang -fsanitize=leak compiler-rt/test/lsan/TestCases/sanity_check_pure_c.c && time ./a.out
clang -fsanitize=address compiler-rt/test/lsan/TestCases/sanity_check_pure_c.c && time ./a.out

With this patch, LSan and ASan create both the 32 and 64 allocators and select at run time between the two allocators following a global variable that is initialized at init time to whether the allocator64 can be used in the virtual address space.

Diff Detail

Event Timeline

sebpop created this revision.Apr 3 2019, 7:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2019, 7:58 PM
sebpop updated this revision to Diff 193646.Apr 3 2019, 8:08 PM

Rebased patch on today's trunk.

junbuml added a subscriber: junbuml.Apr 4 2019, 9:01 AM

I think the changes are minimal and make sense given the define() blocks for other arches. I'll defer to @kcc on approval though.

compiler-rt/lib/lsan/lsan_allocator.cc
117 ↗(On Diff #193646)

Nit: indent level of p = ... is different than line 133 below.

163 ↗(On Diff #193646)

spacing again.

kcc added a reviewer: eugenis.Apr 5 2019, 4:51 PM

Please don't use this many #ifdefs.
If should not need more than one ifdef for this patch,
Split the logic into separate files, when needed.

sebpop added a comment.Apr 5 2019, 5:10 PM

Ok, I will prepare an updated patch.
Thanks Brian and Kostya for your reviews.

sebpop updated this revision to Diff 194200.Apr 8 2019, 1:14 PM

Address review comments from Kostya: move AArch64 lsan allocator to a separate file to avoid #ifdefs.

kcc added a comment.Apr 8 2019, 1:19 PM

Hm... But this is so much code duplication... Can we have few #ifdefs but also not too much duplication?

Also, this is changing only the standalone lsan, not lsan used as part of asan. Right?
standalone lsan is not widely used, AFAICT.

sebpop added a comment.Apr 8 2019, 1:58 PM

Also, this is changing only the standalone lsan, not lsan used as part of asan. Right?
standalone lsan is not widely used, AFAICT.

Correct, asan is still very slow with this patch on.
I was trying to reach a patch that is in good shape to be accepted for lsan
before moving on to apply the same solution to asan.

Can we have few #ifdefs but also not too much duplication?

Avoiding code dup was the intent of the first version of the patch,
and I agree with you that the ifdefs with else clauses are ugly...
Do you have a suggestion?

I also tried to create a pointer that we would switch over between the two allocators 32/64.
The problem is to name a type that could point to both allocators:

using Allocator = CombinedAllocator<                                                                                                                                                          
  PrimeAlloc, AllocatorCache, SecondAlloc, LocalAddressSpaceView>;                                                                                                                            

using Allocator64 = CombinedAllocator<                                                                                                                                                        
  PrimeAlloc64, AllocatorCache64, SecondAlloc, LocalAddressSpaceView>;                                                                                                                        

Allocator a32;
Allocator64 a64;
Allocator32or64 *a;

if (47_bit_VMA)
  a = &a64;
else
  a = &a32;

In that case the rest of the code remains the same, only replacing a.op() with a->op().

sebpop updated this revision to Diff 194937.Apr 12 2019, 12:54 PM
sebpop retitled this revision from [LSan][AArch64] Speed-up leak-sanitizer on AArch64 for 47-bit VMA to [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 47-bit VMA .

This patch reduces the number of #ifdefs as suggested by Kostya, and speeds up both the leak and address sanitizers on aarch64.
Passes check-all on x86_64-linux and aarch64-linux is still under test.
Worked with Brian Rzycki @brzycki.

Hi @kcc. @sebpop and I decided on this approach because we were unable to create a base allocator class. We ran into pointer coercion issues for 32-bit or 64-bit pointers returned from some of the methods. We either had to mask this with (void *) casts at all the callsites. Even if we did this every routine would require a runtime check for 32 vs 64 on aarch64. This latest patch reduces the number of ifdefs and is still a compile-time check for every non-aarch64 platform.

sorry for delay.
Vitaly, can you give a recommendation on how to avoid #ifdefs and avoid too much code duplication in this case.

I propose to finalize one patch for either lsan or asan and then replicate it for another.

compiler-rt/lib/asan/asan_allocator.cc
456

allocated =

compiler-rt/lib/lsan/lsan_allocator.cc
40 ↗(On Diff #194937)

maybe

#if defined(SANITIZER_SELECT_ALLOCATOR_AT_RUNTIME)
bool useAllocator1 = false;

struct AllocatorCache {
  Type1 cache1;
  Type2 cache2;

  void Fn1() {
    if (useAllocator1)
      cache1.Fn1();
    else
      cache2.Fn1();
  }
}

struct Allocator {
  Type1 allocator1;
  Type2 allocator2;
  
  void Fn1() {
    if (useAllocator1)
      allocator1.Fn1();
    else
      allocator2.Fn1();
  }
};

#else

using Allocator = CombinedAllocator<
   PrimeAlloc, AllocatorCache, SecondAlloc, LocalAddressSpaceView>;
using AllocatorCache = AllocatorASVT<LocalAddressSpaceView>;

#endif

Would be nice to be able to enable SANITIZER_SELECT_ALLOCATOR_AT_RUNTIME
even on platforms where it's not needed, just to check the build

I've started some refactoring there. So maybe it would be easier after that.

I've started some refactoring there. So maybe it would be easier after that.

Hello @vitalybuka I plan on looking at your suggestions this week. Please let me know when you think you have the refactored code in a state you perfer and I will start working from that as the new base.

I've started some refactoring there. So maybe it would be easier after that.

Hello @vitalybuka I plan on looking at your suggestions this week. Please let me know when you think you have the refactored code in a state you perfer and I will start working from that as the new base.

Done.
maybe we can use something like: D61401

Hello @vitalybuka , thank you for the example in D61401. Do you have a suggestion what file and namespace the DoubleAllocator template class definition should reside? The most recent diff for D60243 was authored April 3 and no longer cleanly applies (the file lsan_allocator.h has changed considerably since then). This is complicated by @sebpop changing jobs and requiring legal approval before he can help work on this again. In order to make progress I'm essentially re-writing this patch based on the new Allocator layout in lsan_allocator.h.

Hello @vitalybuka , thank you for the example in D61401. Do you have a suggestion what file and namespace the DoubleAllocator template class definition should reside? The most recent diff for D60243 was authored April 3 and no longer cleanly applies (the file lsan_allocator.h has changed considerably since then). This is complicated by @sebpop changing jobs and requiring legal approval before he can help work on this again. In order to make progress I'm essentially re-writing this patch based on the new Allocator layout in lsan_allocator.h.

Probably namespace __sanitizer as you will need it for asan and lsan
have no idea about name. DoubleAllocator was for example but does not sound as helpful.
maybe DoubleAllocator -> SizeClassAllocatorPair in sanitizer_allocator_primary_pair.h.

Also instead of bool use1 maybe better to use virtual methods, but I suspect it will be more complicated. I am fine either way.
From D60243 probably you need only the part which decides which allocator to use and maybe make sure that we have appropriate AP32/AP64 for the new allocator

brzycki added a comment.EditedMay 9 2019, 8:55 AM

Hi @vitalybuka , I am attempting to use D61401 as you suggested but have run into an issue regarding use1 used inside static member functions. This error persists regardless if I attempt to make it a member variable or a global variable:

/work/b.rzycki/upstream/llvm-project/llvm/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_allocator_primary_pair.h:76:9: error: invalid use of member 'use1' in static member function
    if (use1)
        ^~~~
/work/b.rzycki/upstream/llvm-project/llvm/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_allocator_primary_pair.h:82:9: error: invalid use of member 'use1' in static member function
    if (use1)
        ^~~~

I don't quite understand why use1 is failing for me as a header, but not for your test case embedded in cc code. Here's the code that LLVM is unhappy about:

static bool use1 = false;

...

  static bool CanAllocate(uptr size, uptr alignment) {
    if (use1)
      return A1::CanAllocate(size, alignment);
    return A2::CanAllocate(size, alignment);
  }

  static uptr ClassID(uptr size) {
    if (use1)
      return A1::ClassID(size);
    return A2::ClassID(size);
  }

EDIT: Please disregard. For some reason I am no longer seeing the error. I consider it user error. :)

Hello @vitalybuka , I have uploaded a WIP diff in D61766 and I appreciate any insight you can give.

First, this approach still requires ifdefs for __aarch64__. The problem is I cannot use SizeClassAllocatorPair for arches where there is only one Allocator. This unfortunately prevents the request of using this approach to build this code even on platforms that do not need to select an allocator at runtime. This goes against the original intent of what you and @kcc asked for after the initial patch review.

Second, there are compile-time errors due to issues with replacing SizeClassAllocatorXX with the new SizeClassAllocatorPair class on Aaarch64. It's not a 1:1 drop-in replacement and in some cases I don't know what the correct answer is. For example, here's an LLVM error when building lsan_allocator.cc:

In file included from /home/cc/bmr/llvm-project/llvm/projects/compiler-rt/lib/lsan/lsan_allocator.cc:14:
In file included from /home/cc/bmr/llvm-project/llvm/projects/compiler-rt/lib/lsan/lsan_allocator.h:17:
In file included from /home/cc/bmr/llvm-project/llvm/projects/compiler-rt/lib/lsan/../sanitizer_common/sanitzer_allocator.h:77:
/home/cc/bmr/llvm-project/llvm/projects/compiler-rt/lib/lsan/../sanitizer_common/sanitizer_allocator_combine.h:28:53: error: no type named 'MapUnmapCallback' in '__sanitizer::SizeClassAllocatorPair<__lsan::AP64<__santizer::LocalAddressSpaceView>, __lsan::AP32<__sanitizer::LocalAddressSpaceView> >'
      LargeMmapAllocator<typename PrimaryAllocator::MapUnmapCallback,
                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
/home/cc/bmr/llvm-project/llvm/projects/compiler-rt/lib/lsan/lsan_allocator.h:122:24: note: in instantiationof template class '__sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocatorPair<__lsan::AP64<__sanitizr::LocalAddressSpaceView>, __lsan::AP32<__sanitizer::LocalAddressSpaceView> >, __sanitizer::LargeMmapAllocatrPtrArrayDynamic>' requested here
using AllocatorCache = Allocator::AllocatorCache;
                       ^

In SizeClassAllocatorXX the MapUnmapCallback is passed in as a typedef inside one of the two APXX structs. The SizeClassAllocatorPair class cannot select one of the two MapUnmapCallback A1 or A2 template classes until the runtime memory check determines the size of the VMA. But by then it's too late: we need to instantiate a CombinedAllocator class in order to create AllocatorCache in lsan_allocator.h. We are back to this needing to be a compile-time property for another dependent class. I could always select A1's typedef of MapUnmapCallback, similar to what you do with PrimaryAllocator::MapUnmapCallback but I am uncertain if this is correct.

Please let me know if I misunderstood your testcase in D61401 or if I'm inserting the templated class in the wrong location.

sebpop updated this revision to Diff 198980.May 9 2019, 10:39 PM
sebpop edited the summary of this revision. (Show Details)

I have verified that the updated patch compiles and it reduces the execution time of leak sanitized trivial example.

Thanks @vitalybuka for your guidance on how to avoid the #ifdefs.
Thanks @brzycki for your help on this patch.
I got the ok to update the patch.

If we are happy with this fix for LSan, I will send another patch to fix ASan.

@sebpop welcome back! I'm glad you received clearance to work on this. :) My comments are inline.

compiler-rt/lib/lsan/lsan_allocator.cc
38 ↗(On Diff #198980)

Needs to be under #if defined(__aarch64__) or UseAllocator32 needs to be moved outside of the Aarch64 define in lsan_allocator.h.

46 ↗(On Diff #198980)

Needs to be under #if defined(__aarch64__) or UseAllocator32 needs to be moved outside of the Aarch64 define in lsan_allocator.h.

compiler-rt/lib/lsan/lsan_allocator.h
137

Is this correct? This is part of what I was commenting on in my previous patch. We're always using the 32-bit versions of MapUnmapCallback and AddressSpaceView even if we switch at run-time to the 64-bit allocator. @vitalybuka would know better than I if this is guaranteed to be the same across 32/64 on the same arch or not.

brzycki added inline comments.May 10 2019, 8:09 AM
compiler-rt/lib/lsan/lsan_allocator.cc
38 ↗(On Diff #198980)

Nevermind, I now see what you're doing. It's fine as it is.

46 ↗(On Diff #198980)

Nevermind, I now see what you're doing. It's fine as it is.

sebpop marked 3 inline comments as done.May 10 2019, 8:24 AM
sebpop added inline comments.
compiler-rt/lib/lsan/lsan_allocator.h
75

Both AP64 and AP32 typedef MapUnmapCallback to be the same type: NoOpMapUnmapCallback.

137

See my comment above for MapUnmapCallback and the comment below for AddressSpaceView.

215

Both Allocator32ASVT and Allocator64ASVT are instantiated with the same AddressSpaceView, so
Allocator32::AddressSpaceView == Allocator64::AddressSpaceView.

brzycki added inline comments.May 10 2019, 8:42 AM
compiler-rt/lib/lsan/lsan_allocator.cc
43 ↗(On Diff #198980)

Compiling on x86_64 causes a shift-count-overflow warning to be emitted:

/tmp/tmp.rRhin33V2X/src/llvm/projects/compiler-rt/lib/lsan/lsan_allocator.cc:43:42: warning: shift count >= width of type [-Wshift-count-overflow]
  if (GetMaxVirtualAddress() < (((uptr)1 << 48) - 1))
                                         ^  ~~
1 warning generated.

Changing the comparison to the following line removes the warning:

if (GetMaxVirtualAddress() < (uptr)0xffffffffffffUL)

It's a bit less readable but it doesn't have the warning or the potential to overflow.

sebpop updated this revision to Diff 199085.May 10 2019, 2:26 PM
sebpop retitled this revision from [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 47-bit VMA to [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .
sebpop edited the summary of this revision. (Show Details)

Updated patch fixes ASan.

sebpop updated this revision to Diff 199111.May 10 2019, 5:16 PM

Fix the x86_64 overflow warning with 1ULL << 48.

LGTM. The number of if defined() and duplicated code regions is minimal. I'm curious to know what @vitalybuka and @kcc think of this iteration of the patch.

LGTM. The number of if defined() and duplicated code regions is minimal. I'm curious to know what @vitalybuka and @kcc think of this iteration of the patch.

Oh, I didn't realize that this is ready for review. I'll take a look.

vitalybuka added inline comments.May 13 2019, 11:00 AM
compiler-rt/lib/asan/asan_allocator.h
189

I see 3 copies of this template.
We need to extract that into separate header in sanitizer_common/

In shared extracted version I'd like to avoid using 32/64 in naming, just something generic first/second 1/2 ... etc.

200

we need to make UseAllocator32 as a static member of DoubleAllocator as we want to make difference instantiations if the template have own copy of this var.

compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cc
299

There are other tests which use directly or indirectly SizeClassAllocator32/SizeClassAllocator64.
they all should work with DoubleAllocator.

Could you please define Allocator32or64Compact in the top of the file and extend other tests?

using Allocator32or64Compact = DoubleAllocator<Allocator32Compact, Allocator64Compact>;

...

TEST(SanitizerCommon, SizeClassAllocator32or64Compact) {
  Allocator32or64Compact::use1 = false;
  TestSizeClassAllocator<Allocator32or64Compact>();
  Allocator32or64Compact::use1 = true;
  TestSizeClassAllocator<Allocator32or64Compact>();
}
824

we want to test shared template not the one we define here.

sebpop updated this revision to Diff 200171.May 19 2019, 1:17 AM

Addressed comments from @vitalybuka: factored up the 3 versions and added more tests.
Passes with no new fails ninja check-all on an AArch64 Graviton A1 instance.

vitalybuka added inline comments.May 21 2019, 1:28 PM
compiler-rt/lib/asan/asan_allocator.cc
36

same as lsan and preinit_arrays

116

get_allocator().getKMaxSize() ?
same for the rest

compiler-rt/lib/asan/asan_allocator.h
163

Historically it's Google style and we start function names with capitals.

169

can you make it get_allocator().getKMaxSize() ?
it would be nice to move them into a separate patch.

compiler-rt/lib/lsan/lsan_allocator.cc
32 ↗(On Diff #200171)

it's too late if we use SANITIZER_CAN_USE_PREINIT_ARRAY
Please move into DoubleAllocator::Init

compiler-rt/lib/sanitizer_common/sanitizer_doubleallocator.h
3 ↗(On Diff #200171)

Please update the first line and rename the file to sanitizer_double_allocator.h
we usually split words with _

18 ↗(On Diff #200171)

sorry for these naming in my sample patch
historically we use Google style here
a1, a2, UseAllocator1 -> first_, second_, use_first_

I am open for better recommendations

22 ↗(On Diff #200171)

is possible to make it non static?

sebpop marked 9 inline comments as done.Jun 11 2019, 1:13 PM

The updated patch I will post is addressing all the review comments.

sebpop updated this revision to Diff 204145.Jun 11 2019, 1:15 PM

The updated patch passes make check-lsan check-asan and is still under test for check-all on aarch64-linux.

sebpop updated this revision to Diff 204177.Jun 11 2019, 3:08 PM

For some reason asan/tests/asan_noinst_test.cc is not compiled on make check-asan and that has exposed a compile error that was not triggered by the other tests:

sanitizer_double_allocator.h:30:11: error: use of non-static data member 'use_first_' of 'DoubleAllocator' from nested type 'DoubleAllocatorCache'
      if (use_first_)
          ^~~~~~~~~~

The updated patch fixes this by accessing the non-static field of the enclosing class through a this pointer to one of the instances:

-      if (use_first_)
+      if (this->use_first_)

The updated patch passes check-all with no new fails on aarch64-linux.