This is an archive of the discontinued LLVM Phabricator instance.

[TypeSystem] Guard the global `ASTSourceMap` with a mutex
ClosedPublic

Authored by spyffe on Jul 6 2017, 1:26 PM.

Details

Summary

s_source_map in ClangExternalASTSourceCommon.cpp is unguarded and therefore can break in multithreaded conditions. This can cause crashes in particular if multiple targets are being set up at once.

This patch wraps s_source_map in a function that ensures exclusivity, and makes every user of it use that function instead.

Diff Detail

Repository
rL LLVM

Event Timeline

spyffe created this revision.Jul 6 2017, 1:26 PM
lhames added inline comments.Jul 6 2017, 6:33 PM
source/Symbol/ClangExternalASTSourceCommon.cpp
24 ↗(On Diff #105516)

Does std::result_of<FnType>::type work as the return type?

http://en.cppreference.com/w/cpp/types/result_of

28 ↗(On Diff #105516)

Should this be on a context object of some kind (ASTContext?).

jingham added a subscriber: jingham.Jul 6 2017, 7:10 PM

This is an awfully complex solution which in the end doesn't actually enforce that you take the lock to get the SourceMap. You have to know to wrap the access in this WithExclusiveSourceMap. Wouldn't it be simpler to make GetSourceMap take a reference to a std::unique_lock which it fills in as, for instance:

ExecutionContext::ExecutionContext(const ExecutionContextRef &exe_ctx_ref,
                                   std::unique_lock<std::recursive_mutex> &lock)

does, constructing the unique_lock with its mutex:

lock = std::unique_lock<std::recursive_mutex>(m_target_sp->GetAPIMutex());

Then the caller will have to make this lock, pass it in, and the lock hold for the scope of the caller, e.g.:

SBCompileUnit SBFrame::GetCompileUnit() const {
  Log *log(GetLogIfAllCategoriesSet(LIBLLDB_LOG_API));
  SBCompileUnit sb_comp_unit;
  std::unique_lock<std::recursive_mutex> lock;
  ExecutionContext exe_ctx(m_opaque_sp.get(), lock);

...
}

This is much more readable AND enforces that you have to provide a lock to call the function.

labath added a subscriber: labath.Jul 7 2017, 1:49 AM

I would prefer the unique_lock idea also. Lamdas make the code look ugly.

Alternatively, we could start using thread safety annotations https://clang.llvm.org/docs/ThreadSafetyAnalysis.html, and let clang do the checking for us. I think a number of places could benefit from those. Not saying this is the best example, but since we're already discussing it...

jingham added a comment.EditedJul 7 2017, 10:07 AM

The ThreadSafety analysis stuff looks interesting, but just to be sure I understand: You said "alternatively we could..." but from what I could tell the thread safety defines are markup on already extant mechanisms. It didn't look like it provided any locking on it's own, it only enforces the discipline for your current locking strategy. Did I miss something?

Ah, maybe you meant applying the thread safety annotation to Sean's solution to enforce the contract. That's an interesting solution, but makes a non-straightforward solution even more non-straightforward, so I agree this isn't the best example...

Responded to Lang's comments inline.

Jim: you say this patch "doesn't actually enforce that you take the lock to get the SourceMap." How do you get the source map otherwise? It's static inside the function so nothing can see it outside. Do you mean that it's still possible to have the lambda make an alias to the reference and return?

Pavel: if I have GetSourceMap() take a unique_lock& and leave the return value as it is, then clients get an unguarded alias to s_source_map to play with regardless of whether they're holding the lock or not. We have to trust that they hold on to the lock as long as they need to – or use GUARDED_BY, adding more complexity for something that WithExclusiveSourceMap() enforced by default. (Creating an escaping alias is still an issue.)

I'm going to hold off for Lang's opinion on this. I feel like we have a few workable solutions, and since this isn't API for anything even outside the .cpp file picking one and going with it is fine.

source/Symbol/ClangExternalASTSourceCommon.cpp
24 ↗(On Diff #105516)

No, it doesn't. If I change the declaration to

template <typename FnType>
static typename std::result_of<FnType>::type
WithExclusiveSourceMap(FnType fn) {

then the compiler can no longer resolve calls:

…/lldb/source/Symbol/ClangExternalASTSourceCommon.cpp:25:1: Candidate template ignored: substitution failure [with FnType = (lambda at …/lldb/source/Symbol/ClangExternalASTSourceCommon.cpp:39:7)]: implicit instantiation of undefined template 'std::__1::result_of<(lambda at …/lldb/source/Symbol/ClangExternalASTSourceCommon.cpp:39:7)>'
28 ↗(On Diff #105516)

No, it shouldn't. This is intended to be global.

labath added a comment.Jul 8 2017, 2:21 AM

Ah, maybe you meant applying the thread safety annotation to Sean's solution to enforce the contract. That's an interesting solution, but makes a non-straightforward solution even more non-straightforward, so I agree this isn't the best example...

Actually, I meant applying the annotations to *your* solution :). The "alternatively" meant that instead of passing the by-reference locker argument to enforce the contract, you would annotate the function like:

static GetASTMap() REQUIRES_CAPABILITY(my_mutex)

Then the caller could just lock the mutex any way he likes and the compiler would smack him on the head if he forgets that.

However, after playing around with this more, I have become less enthusiastic about it -- it basically requires you to wrap all locking primitives with your own (annotated) classes (recent libc++'s have them out of the box, but only on std::mutex). So, maybe you should just ignore my comments :)

Responded to Lang's comments inline.

Jim: you say this patch "doesn't actually enforce that you take the lock to get the SourceMap." How do you get the source map otherwise? It's static inside the function so nothing can see it outside. Do you mean that it's still possible to have the lambda make an alias to the reference and return?

Nah, just missed you were hiding the map inside the wrapper amidst all the goo.

Pavel: if I have GetSourceMap() take a unique_lock& and leave the return value as it is, then clients get an unguarded alias to s_source_map to play with regardless of whether they're holding the lock or not. We have to trust that they hold on to the lock as long as they need to – or use GUARDED_BY, adding more complexity for something that WithExclusiveSourceMap() enforced by default. (Creating an escaping alias is still an issue.)

We do this all over the place in lldb. The fact that you have to provide a mutex to access a data structure is pretty clear documentation that you should only access it under the aegis of that mutex. And the scoped lockers make this extremely easy to do right. Do you really think this is an issue - and do you think lldb would be made more readable by replacing all functions that access mutex-protected variables with this sort of construct?

lhames edited edge metadata.Jul 10 2017, 6:25 PM

I prefer the lambda syntax: Locking with an argument requires you to know the idiom (which wasn't immediately obvious to me), whereas the lambda enforces it implicitly.

We could also consider writing an atomic access adapter for this (and other types). That would be generally useful, but would also require more consideration (i.e. we shouldn't block this patch on that discussion).

source/Symbol/ClangExternalASTSourceCommon.cpp
24 ↗(On Diff #105516)

Oh right, result_of only works for function types at the moment.

One minor tidy-up would be to use the 'auto' function declaration syntax so that you can use the name of the functor in the decltype expression, rather than having to declval one up. i.e.:

static decltype(std::declval<FnType>()(std::declval<ASTSourceMap&>()))
WithExclusiveSourceMap(FnType fn) {

becomes

static auto WithExclusiveSourceMap(FnType fn) -> declval(fn(std::declval<ASTSourceMap&>())) {
  ...

This is a fairly common idiom in lldb and seems to me quite obvious. If the API to get an object requires a lock guard of some sort, then you have to hold the lock while using the object.

As a general practice requiring a wrapper like this for every use of a should be locked object would make the code noisy and hard to read. The only error you would be protecting against is that somebody used the entity after the lock went out of scope. You could use some kind of markup to enforce this requirement, but you also have to be pretty sloppy to make this kind of error, so I'm not sure this it worth going to great lengths to protect against.

In this limited use, I guess I don't object seriously, but don't like this as a general pattern.

As a general practice requiring a wrapper like this for every use of a should be locked object would make the code noisy and hard to read. The only error you would be protecting against is that somebody used the entity after the lock went out of scope. You could use some kind of markup to enforce this requirement, but you also have to be pretty sloppy to make this kind of error, so I'm not sure this it worth going to great lengths to protect against.

In this limited use, I guess I don't object seriously, but don't like this as a general pattern.

+1. For limited use, I am fine with leaving this up to the owners discretion, but I would object to a more widespread usage.

source/Symbol/ClangExternalASTSourceCommon.cpp
24 ↗(On Diff #105516)

that should probably be something like std::result_of<FnType(ASTSourceMap&)>

55 ↗(On Diff #105516)

Since you're worrying about this being run concurrently, it looks like you should also protect the access to this global variable (possibly by making it atomic).

spyffe updated this revision to Diff 106087.Jul 11 2017, 1:30 PM

Upon reflection, it's not worth coming up with a new pattern if

  • that pattern does not considerably reduce the incidence of bugs in the source file I'm applying it in (and indeed that seems quite unlikely), and
  • that pattern is unlikely to be adopted elsewhere.

I'm going to simply pass in a std::unique_lock and swap into that. This has the pleasant side effect of guarding the g_totalSizeOfMetadata variable against at least some overrides, although to be honest I'd accept a separate patch that removes this variable.

jingham accepted this revision.Jul 24 2017, 5:11 PM

Ack! I thought when you said you were "going to simply pass..." you hadn't done it yet, so I didn't look at the source changes.

Anyway, this looks fine to me.

This revision is now accepted and ready to land.Jul 24 2017, 5:11 PM
This revision was automatically updated to reflect the committed changes.