Page MenuHomePhabricator

NFC: Reduce presence of "template <typename AddressSpaceView>"
Needs ReviewPublic

Authored by vitalybuka on Jan 25 2019, 7:50 PM.

Details

Summary

Goal is to make it more clear which parts of code rely on AddressSpaceView.

Also we this way we can remove some static_asserts which check that
CombinedAllocator instantiated with compatible types. Now
CombinedAllocator controls type by itself.

Event Timeline

vitalybuka created this revision.Jan 25 2019, 7:50 PM

removed type duplicate

If we're going to go down this road we really need to land https://reviews.llvm.org/D56964 first. That fixes additional behaviour of the allocators.

delcypher added inline comments.Jan 26 2019, 6:57 AM
compiler-rt/lib/sanitizer_common/sanitizer_allocator_bytemap.h
80

This looks wrong. In the out-of-process case where we've copied over a FlatByteMap from the target process map2 is pointing to memory that isn't valid in our local address space. Here it looks like you're actually dereferencing invalid memory (map2[idx % kSize2]) and then taking a reference to it.

Perhaps this is okay. I think it might be the implicit conversion of map2[idx %kSize2] to an rvalue that actually triggers the dereference normally. This might not be happening here (I hope the code isn't taking a reference to a temporary r-value), but I am incredibly suspicious of this code.

vitalybuka marked 2 inline comments as done.Jan 26 2019, 1:34 PM

If we're going to go down this road we really need to land https://reviews.llvm.org/D56964 first. That fixes additional behaviour of the allocators.

Sure, I'd like to land this patch as NFC. I don't mind if D56964 goes either before or after.
rebased D56964 is https://github.com/vitalybuka/llvm-project/commit/6eda6de4a4802bf3a830df270fe6dab279baf278
it required some manual interaction but very trivial.

compiler-rt/lib/sanitizer_common/sanitizer_allocator_bytemap.h
80

No, I just moved AddressSpaceView::Load out of the map, and return references here.
For random user this is still regular map. For allocator if we know that we are remote "copy" and we may need to fix-up reference with AddressSpaceView::Load
FlatByteMap does not need AddressSpaceView::Load but to make it general, I apply AddressSpaceView::Load even there.

Another option to add: const "u8* find(uptr) const {...}" where deference will be clearly out of the map, but I don't rely think references here are much worse here.

compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary32.h
235

Here is a map access.
Maybe AddressSpaceView::LoadByRef could be useful, but I afraid that it will accept references to temps. So pointer only interface will not let to removes refs from map::operator[]

vitalybuka retitled this revision from Reduce presence of "template <typename AddressSpaceView>" to NFC: Reduce presence of "template <typename AddressSpaceView>".Jan 26 2019, 1:40 PM
vitalybuka edited the summary of this revision. (Show Details)
vitalybuka edited the summary of this revision. (Show Details)

rebase

@vitalybuka Thanks for working on this.

Context for those new to the patch: We are looking at ways of converting a Allocator<LocalAddressSpaceView> into a Allocator<RemoteAddressSpaceView>. With the current design on master it is not safe to do a reinterpret_cast<>() because C++ doesn't guarantee that the two different types have the same data-layout or fields (AFAIK).

Over the weekend I investigated your first suggestion for fixing the reinterpret_cast problem for the allocators. I've updated my patch stack to use this new approach. Even though you're trying a different approach in this patch I thought it was important that I investigate other options. The key patches are...

  • https://reviews.llvm.org/D57306 - This removes AddressSpaceView from primary allocators params data structures. The patch explains more but basically this is needed for the next patch. It fixes what I consider to be a design mistake on my part and cleans up the code considerably.
  • https://reviews.llvm.org/D57307 - This add CopyFromTarget(...) static methods to all data types that take AddressSpaceView as a type parameter. It also adds a public templated type alias (ThisTASVT) that allows you to construct the RemoteAddressSpaceView counter-part from an existing type.
  • https://reviews.llvm.org/D56207 - Updates the unit tests to use the new CopyFromTarget(...) and use ThisTASVT for type construction. Note that the test is significantly simpler now.

I'd encourage you take a look because I think what we might end up doing may be a hybrid between this patch and what I've done in the patches I listed above.

I've made a short pro/con list (from my perspective) for the two approaches. Hopefully this will encourage some discussion that will allow us to converge on a design.

This patch

This patch effectively tries to separate data and code. Data in a parent allocator class that is not templated over AddressSpaceView, and then the methods that depend on AddressSpaceView are in the child class. I'm guessing the hope is that recasting a AllocatorView<LocalAddressSpaceView> as a AllocatorView<RemoteAddressSpaceView> should work and be safe because all the data is in the parent class (the data layout doesn't depend on AddressSpaceView).

Pros

  • Creating a AllocatorView<RemoteAddressSpaceView> from a AllocatorView<LocalAddressSpaceView> is a cheap operation because it's just a re-cast.
  • There is a clear separation between what methods depend on AddressSpaceView in an allocator, and what does not depend on it. In fact this separation will also create compilation errors if you get it wrong (method is parent class tries to call a method that only exists in the child class).

Cons

  • Restricts the design/behaviour of data structures contained within the allocator significantly. This design disallows fields of the allocator depending on AddressSpaceView (would require parent class taking AddressSpaceView as a template parameter which would break doing safe cheap casting). This is a very strong restriction because it means data structures inside the allocator cannot derefence pointers that point outside itself because the data structure has no access to AddressSpaceView.
  • The previous point means we can (and do have) horrible layering violations. Now that data structures in the allocator cannot derefence pointers, it means the data structure's interface must return pointers (or references that we can take the address of) and that the caller must decide whether or not to use AddressSpaceView::Load(). This is a layering violation because it requires that the allocator knows how the data structure is implemented (does the pointer/reference returned point to the local address space or the remote address space?). A concrete example of this in this patch is ByteMap.
  • The above problem is even worse if the used data-structure contains nested data-structures.

My CopyFromTarget(...) patch series.

Pros

  • There is no restriction on what data structures that an allocator can use because they are allowed to take AddressSpaceView as a type parameter.
  • There is no restriction on the design of nested data structures that use AddressSpaceView, other than that they implement CopyFromTarget(...) and use AddressSpaceView::Load(...).
  • Because of the above point, data structures recursively using AddressSpaceView as a template parameter are supported (used in my patches).

Cons

  • Creating a AllocatorView<RemoteAddressSpaceView> from a AllocatorView<LocalAddressSpaceView> is not cheap because we have to copy over all the fields (we might be able to improve this by using a constexpr() what checks if the data layout is the same).
  • The above is a maintenance burden because we have to manually maintain code that enumerates the fields of types that depend on AddressSpaceView. Note, if we forget to copy a field that actually matters for allocator enumeration our tests should catch this.

Thoughts

I personally prefer my CopyFromTarget(...) patch series. Although I think what @vitalybuka has done in this patch is quite clever (the separation of code and data is quite aesthetically pleasing), I think this patch paints us into a design corner. The restrictions it places on what data structures can be used in allocators and how they can be implemented is a very serious concern of mine. I wanted the AddressSpaceView interface to be as unintrusive as possible but his patch feels like it places very intrusive design restrictions on the allocators. I would rather take the performance hit that CopyFromTarget(...) adds and the minor maintenance burden it adds so that we can continue to allow the Allocators to use whatever data structure they wish.

@kcc @eugenis @cryptoad I'd appreciate your input here because this is quite an important design decision.

@vitalybuka Thanks for working on this.

Context for those new to the patch: We are looking at ways of converting a Allocator<LocalAddressSpaceView> into a Allocator<RemoteAddressSpaceView>. With the current design on master it is not safe to do a reinterpret_cast<>() because C++ doesn't guarantee that the two different types have the same data-layout or fields (AFAIK).

Now you can reinterpret/memcpy at base class level.

Over the weekend I investigated your first suggestion for fixing the reinterpret_cast problem for the allocators. I've updated my patch stack to use this new approach. Even though you're trying a different approach in this patch I thought it was important that I investigate other options. The key patches are...

  • https://reviews.llvm.org/D57306 - This removes AddressSpaceView from primary allocators params data structures. The patch explains more but basically this is needed for the next patch. It fixes what I consider to be a design mistake on my part and cleans up the code considerably.
  • https://reviews.llvm.org/D57307 - This add CopyFromTarget(...) static methods to all data types that take AddressSpaceView as a type parameter. It also adds a public templated type alias (ThisTASVT) that allows you to construct the RemoteAddressSpaceView counter-part from an existing type.
  • https://reviews.llvm.org/D56207 - Updates the unit tests to use the new CopyFromTarget(...) and use ThisTASVT for type construction. Note that the test is significantly simpler now.

    I'd encourage you take a look because I think what we might end up doing may be a hybrid between this patch and what I've done in the patches I listed above.

    I've made a short pro/con list (from my perspective) for the two approaches. Hopefully this will encourage some discussion that will allow us to converge on a design.
  • This patch

    This patch effectively tries to separate data and code. Data in a parent allocator class that is not templated over AddressSpaceView, and then the methods that depend on AddressSpaceView are in the child class. I'm guessing the hope is that recasting a AllocatorView<LocalAddressSpaceView> as a AllocatorView<RemoteAddressSpaceView> should work and be safe because all the data is in the parent class (the data layout doesn't depend on AddressSpaceView).
    1. Pros
  • Creating a AllocatorView<RemoteAddressSpaceView> from a AllocatorView<LocalAddressSpaceView> is a cheap operation because it's just a re-cast.
  • There is a clear separation between what methods depend on AddressSpaceView in an allocator, and what does not depend on it. In fact this separation will also create compilation errors if you get it wrong (method is parent class tries to call a method that only exists in the child class).
    1. Cons
  • Restricts the design/behaviour of data structures contained within the allocator significantly. This design disallows fields of the allocator depending on AddressSpaceView (would require parent class taking AddressSpaceView as a template parameter which would break doing safe cheap casting). This is a very strong restriction because it means data structures inside the allocator cannot derefence pointers that point outside itself because the data structure has no access to AddressSpaceView.

I rely hope we will try to avoid such fields with either design. So I don't think we need to optimize for this scenario.

  • The previous point means we can (and do have) horrible layering violations. Now that data structures in the allocator cannot derefence pointers, it means the data structure's interface must return pointers (or references that we can take the address of) and that the caller must decide whether or not to use AddressSpaceView::Load(). This is a layering violation because it requires that the allocator knows how the data structure is implemented (does the pointer/reference returned point to the local address space or the remote address space?). A concrete example of this in this patch is ByteMap.

This one can be easily solved making SizeClassAllocator32->SizeClassAllocator32State a pure data structure without methods and
SizeClassAllocator32::WithView ->

class SizeClassAllocator32<AddressSpaceView> {
  // logic
  SizeClassAllocator32State state_;
}

or even better:

class SizeClassAllocator32<AddressSpaceView> {
  // logic
  SizeClassAllocator32State* state_;
}

This way remote part will be able to swap state pointers without need to copy/reinterprete anything.

Anyway all this hidden in implementation and simplify interface. These consts are "possible" future issues which will likely never bother us. AddressSpaceView noise is current issue.
I am against more complicate design for sake of avoiding possible but unlikely future issues.

  • The above problem is even worse if the used data-structure contains nested data-structures.
  • My CopyFromTarget(...) patch series.
    1. Pros
  • There is no restriction on what data structures that an allocator can use because they are allowed to take AddressSpaceView as a type parameter.
  • There is no restriction on the design of nested data structures that use AddressSpaceView, other than that they implement CopyFromTarget(...) and use AddressSpaceView::Load(...).
  • Because of the above point, data structures recursively using AddressSpaceView as a template parameter are supported (used in my patches).
    1. Cons
  • Creating a AllocatorView<RemoteAddressSpaceView> from a AllocatorView<LocalAddressSpaceView> is not cheap because we have to copy over all the fields (we might be able to improve this by using a constexpr() what checks if the data layout is the same).
  • The above is a maintenance burden because we have to manually maintain code that enumerates the fields of types that depend on AddressSpaceView. Note, if we forget to copy a field that actually matters for allocator enumeration our tests should catch this.

It adds even more complexity for readers.

  1. Thoughts

    I personally prefer my CopyFromTarget(...) patch series. Although I think what @vitalybuka has done in this patch is quite clever (the separation of code and data is quite aesthetically pleasing), I think this patch paints us into a design corner. The restrictions it places on what data structures can be used in allocators and how they can be implemented is a very serious concern of mine. I wanted the AddressSpaceView interface to be as unintrusive as possible but his patch feels like it places very intrusive design restrictions on the allocators. I would rather take the performance hit that CopyFromTarget(...) adds and the minor maintenance burden it adds so that we can continue to allow the Allocators to use whatever data structure they wish.

    @kcc @eugenis @cryptoad I'd appreciate your input here because this is quite an important design decision.

@vitalybuka Thanks for working on this.

Context for those new to the patch: We are looking at ways of converting a Allocator<LocalAddressSpaceView> into a Allocator<RemoteAddressSpaceView>. With the current design on master it is not safe to do a reinterpret_cast<>() because C++ doesn't guarantee that the two different types have the same data-layout or fields (AFAIK).

Now you can reinterpret/memcpy at base class level.

Okay. As long as we encapsulate that in a method similar to what I have in https://reviews.llvm.org/D57307

template <typename DstTy, typename SrcTy>
const SrcTy *CopyFromTarget(DstTy *dst, const SrcTy *src)

then I'm fine with that.

Over the weekend I investigated your first suggestion for fixing the reinterpret_cast problem for the allocators. I've updated my patch stack to use this new approach. Even though you're trying a different approach in this patch I thought it was important that I investigate other options. The key patches are...

  • https://reviews.llvm.org/D57306 - This removes AddressSpaceView from primary allocators params data structures. The patch explains more but basically this is needed for the next patch. It fixes what I consider to be a design mistake on my part and cleans up the code considerably.
  • https://reviews.llvm.org/D57307 - This add CopyFromTarget(...) static methods to all data types that take AddressSpaceView as a type parameter. It also adds a public templated type alias (ThisTASVT) that allows you to construct the RemoteAddressSpaceView counter-part from an existing type.
  • https://reviews.llvm.org/D56207 - Updates the unit tests to use the new CopyFromTarget(...) and use ThisTASVT for type construction. Note that the test is significantly simpler now.

    I'd encourage you take a look because I think what we might end up doing may be a hybrid between this patch and what I've done in the patches I listed above.

    I've made a short pro/con list (from my perspective) for the two approaches. Hopefully this will encourage some discussion that will allow us to converge on a design.
  • This patch

    This patch effectively tries to separate data and code. Data in a parent allocator class that is not templated over AddressSpaceView, and then the methods that depend on AddressSpaceView are in the child class. I'm guessing the hope is that recasting a AllocatorView<LocalAddressSpaceView> as a AllocatorView<RemoteAddressSpaceView> should work and be safe because all the data is in the parent class (the data layout doesn't depend on AddressSpaceView).
    1. Pros
  • Creating a AllocatorView<RemoteAddressSpaceView> from a AllocatorView<LocalAddressSpaceView> is a cheap operation because it's just a re-cast.
  • There is a clear separation between what methods depend on AddressSpaceView in an allocator, and what does not depend on it. In fact this separation will also create compilation errors if you get it wrong (method is parent class tries to call a method that only exists in the child class).
    1. Cons
  • Restricts the design/behaviour of data structures contained within the allocator significantly. This design disallows fields of the allocator depending on AddressSpaceView (would require parent class taking AddressSpaceView as a template parameter which would break doing safe cheap casting). This is a very strong restriction because it means data structures inside the allocator cannot derefence pointers that point outside itself because the data structure has no access to AddressSpaceView.

I rely hope we will try to avoid such fields with either design. So I don't think we need to optimize for this scenario.

  • The previous point means we can (and do have) horrible layering violations. Now that data structures in the allocator cannot derefence pointers, it means the data structure's interface must return pointers (or references that we can take the address of) and that the caller must decide whether or not to use AddressSpaceView::Load(). This is a layering violation because it requires that the allocator knows how the data structure is implemented (does the pointer/reference returned point to the local address space or the remote address space?). A concrete example of this in this patch is ByteMap.

This one can be easily solved making SizeClassAllocator32->SizeClassAllocator32State a pure data structure without methods and
SizeClassAllocator32::WithView ->

class SizeClassAllocator32<AddressSpaceView> {
  // logic
  SizeClassAllocator32State state_;
}

or even better:

class SizeClassAllocator32<AddressSpaceView> {
  // logic
  SizeClassAllocator32State* state_;
}

This way remote part will be able to swap state pointers without need to copy/reinterprete anything.

I was going to ask if the performance impact of this indirection matters. However given that Scudo now has its own independent code base I guess this matters a lot less.

Anyway all this hidden in implementation and simplify interface. These consts are "possible" future issues which will likely never bother us. AddressSpaceView noise is current issue.
I am against more complicate design for sake of avoiding possible but unlikely future issues.

Okay. I left this open for a week to see if anyone would respond and it seems that no one else has a differing opinion on this other than me. Given that I am not the code owner the only real way forward is for me to follow the approach in this patch.
Please allow me some time to try rebasing all my patches on top of this one so I can investigate how well it works. I will report back with my findings.