This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix data race when interacting with python scripts
ClosedPublic

Authored by mib on Jun 30 2023, 3:57 PM.

Details

Summary

This patch should fix some data races when a python script (i.e. a
Scripted Process) has a nested call to another python script (i.e. a
OperatingSystem Plugin), which can cause concurrent writes to the python
lock count.

This patch also fixes a data race happening when resetting the operating
system unique pointer.

To address these issues, both accesses is guarded by a mutex.

rdar://109413039

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>

Diff Detail

Event Timeline

mib created this revision.Jun 30 2023, 3:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2023, 3:57 PM
mib requested review of this revision.Jun 30 2023, 3:57 PM

Does this have to be a recursive mutex? Can we simplify this by using a std::atomic<unsigned> instead?

mib updated this revision to Diff 536514.Jun 30 2023, 10:32 PM
mib edited the summary of this revision. (Show Details)

Address @JDevlieghere comment.

Making m_lock_count's type into std::atomic<uint32_t> makes sense to me, but I'm a little confused about why Process::LoadOperatingSystemPlugin is guarded by acquiring m_thread_mutex. My (admittedly limited) understanding of that is that it's a mutex that the Process holds for the ThreadList to manage concurrent modifications to the thread list. Is loading an Operating System plugin related to modifying the ThreadList? If not, perhaps it would be better served by having its own mutex?

Making m_lock_count's type into std::atomic<uint32_t> makes sense to me, but I'm a little confused about why Process::LoadOperatingSystemPlugin is guarded by acquiring m_thread_mutex. My (admittedly limited) understanding of that is that it's a mutex that the Process holds for the ThreadList to manage concurrent modifications to the thread list. Is loading an Operating System plugin related to modifying the ThreadList? If not, perhaps it would be better served by having its own mutex?

Yes, the OS plugins are named somewhat confusingly, but their purpose is to allow an operating systems (e.g. XNU) to specify threads when doing system level debugging, where otherwise we'd have just one thread per core.

lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPythonImpl.h
431–432

I was looking at how m_lock_count is used. IsExecutingPython` is loading the value once so that's fine. IncrementLockCount is using operator++ so that's fine too. The only problem is DecrementLockCount, which is loading the value multiple times:

uint32_t DecrementLockCount() {
  if (m_lock_count > 0)
    --m_lock_count;
  return m_lock_count;
}

We can either drop the "safe" behavior and implement DecrementLockCount as --m_lock_count or we'll have to use a (hopefully non-recursive) mutex after al.

mib added a comment.Jul 1 2023, 10:59 AM

Making m_lock_count's type into std::atomic<uint32_t> makes sense to me, but I'm a little confused about why Process::LoadOperatingSystemPlugin is guarded by acquiring m_thread_mutex. My (admittedly limited) understanding of that is that it's a mutex that the Process holds for the ThreadList to manage concurrent modifications to the thread list. Is loading an Operating System plugin related to modifying the ThreadList? If not, perhaps it would be better served by having its own mutex?

Yes, the OS plugins are named somewhat confusingly, but their purpose is to allow an operating systems (e.g. XNU) to specify threads when doing system level debugging, where otherwise we'd have just one thread per core.

@bulbazord I understand the confusion regarding OperatingSystem but it's used for "OS Thread Plugins", a way to mock a thread using python script, so it does change the ThreadList.

mib updated this revision to Diff 536540.Jul 1 2023, 11:03 AM
mib edited the summary of this revision. (Show Details)

Use std::atomic<T>::fetch_{add, sub} to address @JDevlieghere feedback.

mib updated this revision to Diff 536541.Jul 1 2023, 11:04 AM

Fix typo

mib updated this revision to Diff 536542.Jul 1 2023, 11:09 AM
mib added a subscriber: jasonmolenda.

Fix major typo, thanks @jasonmolenda ;)

JDevlieghere requested changes to this revision.Jul 1 2023, 4:32 PM
JDevlieghere added inline comments.
lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPythonImpl.h
389

Sorry if my previous comment wasn't clear. std::atomic provides these operators so the old code was fine: https://en.cppreference.com/w/cpp/atomic/atomic/operator_arith

395–396

What you do here isn't safe. Nothing is preventing the atomic value from being modified between the fetch_sub and the store and there is no atomic operation that does a less-than-compare-and-store. I don't think you can keep the current behavior with just an atomic.

This revision now requires changes to proceed.Jul 1 2023, 4:32 PM
mib added inline comments.Jul 2 2023, 12:50 AM
lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPythonImpl.h
389

Makes sense, we shouldn't touch the implementation of IncrementLockCount.

391–396

You're right, but I think we should at least keep the guardrails to prevent returning an underflow value and return 0 instead.

mib updated this revision to Diff 536573.Jul 2 2023, 1:10 AM

Make DecrementLockCount actually thread-safe and revert IncrementLockCount to its original implementation.

mib updated this revision to Diff 536850.Jul 3 2023, 11:08 AM

After chatting with Jonas, it turns out that my previous implementation didn't actually address the underflow issue using atomics, so we have to revert to using a std::mutex when incrementing/decrementing the python lock counter.

mib edited the summary of this revision. (Show Details)Jul 3 2023, 11:09 AM
JDevlieghere accepted this revision.Jul 3 2023, 11:18 AM

LGTM modulo one more lock.

lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPythonImpl.h
381–384

We should probably take the lock here as well. It won't be super meaningful (the value can change after the function returns) but I think TSan will complain about the potential race.

This revision is now accepted and ready to land.Jul 3 2023, 11:18 AM
mib updated this revision to Diff 536859.Jul 3 2023, 11:48 AM

Address @JDevlieghere last comment.

This revision was landed with ongoing or failed builds.Jul 3 2023, 11:49 AM
This revision was automatically updated to reflect the committed changes.