Page MenuHomePhabricator

Introduce `RemoteAddressSpaceView` and `VMReadContext`.
Needs RevisionPublic

Authored by delcypher on Jan 2 2019, 10:35 AM.

Details

Summary

RemoteAddressSpaceView is the remote memory counter part to
LocalAddressSpaceView that implements the AddressSpaceView
interface.

The RemoteAddressSpaceView implementation is a simple wrapper around a
VMReadContext object. A VMReadContext object implements the platform
specific code to

  • Read memory from another process.
  • Cache those reads and manage the cache lifetime.
  • Manage the writability of the cached memory.

Two implementations are provided:

  • A Darwin only implementation that allows in-process and out-of-process

reads. The reads are cached by a Darwin specific framework
(CoreSymbolication) and thus the VMReadContext implementation does not
implement a cache.

  • A non-Darwin implementation that only allows in-process reads

(basically a no-op). This exists only for testing purposes.

Tests for this functionality will be added in later patches.

rdar://problem/45284065

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
vitalybuka added inline comments.Jan 16 2019, 1:37 PM
lib/sanitizer_common/sanitizer_remote_address_space_view.h
68

same for class

lib/sanitizer_common/sanitizer_vm_read_context.h
18

empty live between define and include

25

then just define process_vm_read_handle_t as task_t in sanitizer_internal_defs for Darwin?

31

please init target_process_ to something target_process_ as well

43

why copying is not acceptable?

74

We cannot calculate it. It needs to be part of the sate.

I read this as:

bool IsLocal() const {
  return config_ == nullptr || mach_task_self() == target_process_;
}
lib/sanitizer_common/sanitizer_vm_read_context_mac.cc
29

can you remove "config" argument and just initialize config_ to darwin specific stuff here?

30

if we keep it please remove {} for consistency

43

It still looks like better to split Read and ReadWriteble when we have different implementation for them.
So If you have/expect such patch quite soon, then fine, let's keep it. If this is just expectation that some-when in undefined future we will likely need to spit them, then I'd prefer we split when it's actually necessary.

51

why do we need config_?
from here looks like you expect a particular function
could you call it directly here?

delcypher marked 6 inline comments as done.Jan 18 2019, 5:26 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_vm_read_context.h
31

Not doing inline initialisation here is deliberate. The type is platform specific so we might not be able to initialise it using an int. This is why the platform specific constructors do the initialisation.

Technically we could initialise to 0 now because task_t happens to be a typedef for unsigned on Darwin but this might not work for other platforms in the future.

43

Because a VMReadContext object is supposed to manage the lifetime of memory it copies from another. The implementation might hold data structures to manage this memory and so copying a VMReadContext object is almost certainly a mistake.

For the Darwin implementation VMReadContext delegates this task to a Darwin specific framework that guarantees the lifetime of the copied memory will be longer than its client (VMReadContext). So technically nothing would go wrong right now if the VMReadContext copy constructor was called. However calling the VMReadContext is almost certainly a mistake and so it should be disallowed.

74
bool IsLocal() const {
 return config_ == nullptr || mach_task_self() == target_process_;
}

Oh I see. I thought you where talking about GetTargetProcess(). However in this latest revision IsLocal() has actually been removed because my internal implementation never called it.

I can probably bring it back though.

lib/sanitizer_common/sanitizer_vm_read_context_mac.cc
29

No. On Darwin config is a function pointer but on other platforms it can be anything (that's why it has type void*) that's needed to get access to the target process's memory. The config argument must be there to provide a consistent and flexible interface on all platforms.

There is no special function we can call from inside VMReadContext to get the function pointer. The client creating the VMReadContext on Darwin knows the value of the pointer and must pass it in.

30

Ok. I'll fix this.

51

No we cannot. It must be passed in via the VMReadContext constructor.

delcypher updated this revision to Diff 182510.Jan 18 2019, 7:21 AM

Minor tweaks.

delcypher marked 6 inline comments as done.Jan 18 2019, 7:24 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_vm_read_context.h
18

Fixed.

32

Fixed.

74

I've brought the implementation back and removed the is_local_ field.

lib/sanitizer_common/sanitizer_vm_read_context_mac.cc
30

Done.

delcypher marked 2 inline comments as done.Jan 18 2019, 7:30 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_vm_read_context.h
25

I tried doing this in my internal implementation. It breaks compilation because sanitizer_internal_defs gets included very early and task_t at that point is not defined. Let's just keep it.

delcypher marked an inline comment as done.Jan 18 2019, 7:49 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_vm_read_context_mac.cc
43

Using the read-only/writable information will land much later so I'm going to remove passing this information to VMReadContext.

delcypher updated this revision to Diff 182527.Jan 18 2019, 8:37 AM
  • Check thread id
  • Use atomic variables
delcypher marked an inline comment as done.Jan 18 2019, 8:38 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_remote_address_space_view.h
28

Okay I've added the thread check. I don't know if this code will be called on the main thread or not. Presumably that's irrelevant here?

@vitalybuka I've done another pass over the patch and fixed everything I reasonably could. Please make another pass when you have time.

vitalybuka added inline comments.Jan 18 2019, 1:18 PM
lib/sanitizer_common/sanitizer_remote_address_space_view.h
25

includes are not ordered

33

Why did you obfuscate context type?
There is not point in atomic if you are going to use single thread
maybe OK for owning_tid_ but we will crash immediately if this does not match

51

probably you should check if owning_tid_ is already set that they match

58

Please avoid reinterpret_cast, just return VMReadContext* as member

lib/sanitizer_common/sanitizer_vm_read_context.h
31

Maybe platform specific invalid_vm_read_handle near process_vm_read_handle_t?
also process_vm_read_handle_t is too specific, why not just some task_id?

43

"operator=" ?

48

(void*) -> reinterpret_cast<>

lib/sanitizer_common/sanitizer_vm_read_context_mac.cc
51

Instead of void* can you make it function pointer?
With type similar to memory_reader_t
If you make memory_reader_t return bool and handle error there, the only platform specific part of VMReadContext will be the callback.

also config_ -> native_reader_;

Rebase on master and adopt new license headers.

delcypher updated this revision to Diff 182930.Jan 22 2019, 8:39 AM
delcypher marked 4 inline comments as done.
  • Revert back to plain VMReadContext*
  • Check owning_tid_ in set_context()
  • Delete more methods from VMReadContext.
delcypher marked 2 inline comments as done.Jan 22 2019, 8:41 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_remote_address_space_view.h
25

Oops. Good catch. Fixed.

33

Originally I did that because I was going to make this a DCHECK and I didn't want it to be possible for the pointer to change non-atomically. You're right though. Now that the CHECK happens in all builds I should just use VMContext*.

I've fixed this now.

51

That would require that I guarantee that owning_tid is 0 initially (i.e. before RemoteAddressSpaceView is ever used). I think this might already be the case because atomic_uint64_t RemoteAddressSpaceView::owning_tid is a global but I wasn't 100% sure. I wasn't sure how I was supposed to write an initializer for the global in sanitizer_remote_address_space_view.cc so I just left it out.

If we can confidently assume that owning_tid_ is zero initialized then I'll add the check.

I've gone ahead and made the fix. Seems to work.

lib/sanitizer_common/sanitizer_vm_read_context.h
43

That would also be bad. I've deleted that and also the move variants.

delcypher updated this revision to Diff 182932.Jan 22 2019, 8:44 AM
  • Remove C-style cast.
delcypher marked 2 inline comments as done.Jan 22 2019, 8:45 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_vm_read_context.h
48

Fixed.

delcypher marked 2 inline comments as done.Jan 22 2019, 9:22 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_vm_read_context_mac.cc
51

Instead of void* can you make it function pointer?
With type similar to memory_reader_t
If you make memory_reader_t return bool and handle error there, the only platform specific part of VMReadContext will be the callback.

I could but I'm strongly against that because.

  • A function pointer isn't great. It can't hold any state. Even in Darwin's case I'd have to write a wrapper function that implements the callback interface. That function needs to know the address of the actual function to call. So the callback would need to be able to receive arbitrary data (i.e. passing a void*) so I could pass that address.
  • Having a function pointer encourages implementations to actually live outside of VMReadContext. This is the opposite of what VMReadContext() tries to do -- Hide as much platform specific behaviour as possible inside VMReadContext. It is a Darwin specific implementation detail that we have to pass a function pointer to VMReadContext, other platforms should not be forced to do that in-order to write their implementations. They should be able to pass whatever makes sense for their platform.
  • On other platforms the interface may need to work differently. More parameters might need to be passed to the reader. In this case the void* would be a pointer to a struct containing the extra data needed by the platform.
  • It ties the design of VMReadContext very closely to Darwin's memory_reader_t.

Besides, I could probably achieve the non-platform specific error handling simplify by refactoring Read(...) into the platform agnostic part Read(...) and the platform specific part ReadImpl(...).

I think we could clean up the interface in a better way. Instead of a void* argument to the constructor it could be

VMReadContextConfig* which would be

class VMReadContextConfig {
   public:
    __sanitizer::process_vm_read_handle_t target_process;
};

Then other platforms could sub-class this and add their own platform specific fields. For Darwin we'd add a function pointer field so that the Darwin specific function pointer can be passed.

That way

  • At most one argument needs to be passed VMReadContext
  • It allows for platform specific implementations in a way that's slightly more type safe than void*.
  • We can pass arbitrary data to be passed to VMReadContext.

    We don't have RTTI or LLVM's dyn_cast so we would have to statically cast inside each implementation to the expected VMReadContextConfig sub-class.
vitalybuka added inline comments.Jan 22 2019, 12:41 PM
lib/sanitizer_common/CMakeLists.txt
48

Please return )

lib/sanitizer_common/sanitizer_remote_address_space_view.h
57

So data race is possible when one thread is releasing context and another is setting.
For simplicity I think you should never reset owning_tid_
As soon as one thread used it, no other allowed to interact with that.

Otherwise we need to design proper synchronization which is inconvenient with all these statics. E.g. as soon someone called Load and saved returned pointer
we will need to make sure that caller release the pointers before letting other threads to interact with context.

And maybe it's not bad idea. But you have option to go with "never reset owning_tid_" now and switch to Load/Unload with mutexes later, when needed.

58

Nothing prevents one thread to create ScopedContext and another thread to do LoadWritable. We need to check owning_tid_ in Load and LoadWritable

103

: vmrc() is not needed

lib/sanitizer_common/sanitizer_vm_read_context.h
31

Maybe platform specific invalid_vm_read_handle near process_vm_read_handle_t?
also process_vm_read_handle_t is too specific, why not just some task_id?

I still think you should remove process_vm_read_handle_t from sanitizer_internal_defs.h and just define it in the Context. u64 everywhere, like for tid_t, is good enough for me.

lib/sanitizer_common/sanitizer_vm_read_context_mac.cc
57
bool VMReadContext::IsLocal() const {
  return config_ == nullptr || mach_task_self() == target_process_;
}
delcypher marked 2 inline comments as done.Jan 23 2019, 10:04 AM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_remote_address_space_view.h
57

So data race is possible when one thread is releasing context and another is setting.
For simplicity I think you should never reset owning_tid_
As soon as one thread used it, no other allowed to interact with that.

I'm fine with that. Currently RemoteAddressSpaceView should only be used within a single-thread.

I'll update the patch as you suggest.

58

Did I miss something? Won't the CHECK_EQ(atomic_load_relaxed(&owning_tid_), current_tid) fire inside get_context() fire due to LoadWritable() or Load() eventually triggering a call to get_context()?

  • Add extra return in CMakeLists.txt.
  • Only allow for owning_tid_ to be set once.
  • Clean up IsLocal() darwin implementation.
delcypher marked 2 inline comments as done and an inline comment as not done.Jan 23 2019, 10:38 AM

Add VMReadContextConfig to simplify usage of VMReadContext.

delcypher marked an inline comment as done.Jan 23 2019, 12:25 PM
delcypher added inline comments.
lib/sanitizer_common/sanitizer_vm_read_context_config.h
2

@vitalybuka I had a go at adding VMReadContextConfig (see the header file below). The advantage of this is I'm able to pull the definition of task_t out of sanitizer_internal_defs.h and make the constructor of VMReadContextConfig a little nicer by not having it take a void*.

Fix lint issues.

Fix comments so the VMReadContext::Read() actually complies
in the case we are doing local reads.

vitalybuka requested changes to this revision.Apr 24 2019, 11:14 AM

Removing stale patches from the "Ready to Review" list.

This revision now requires changes to proceed.Apr 24 2019, 11:14 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 24 2019, 11:14 AM