This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix data race in PipePosix
ClosedPublic

Authored by augusto2112 on Aug 10 2023, 2:27 PM.

Details

Summary

ThreadSanitizer reports the following data race:

WARNING: ThreadSanitizer: data race (pid=43201)
  Write of size 4 at 0x00010520c474 by thread T1 (mutexes: write M0, write M1):
    #0 lldb_private::PipePosix::CloseWriteFileDescriptor() PipePosix.cpp:242 (liblldb.18.0.0git.dylib:arm64+0x414700) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #1 lldb_private::PipePosix::Close() PipePosix.cpp:217 (liblldb.18.0.0git.dylib:arm64+0x4144e8) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #2 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:239 (liblldb.18.0.0git.dylib:arm64+0x40a620) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #3 lldb_private::Communication::Disconnect(lldb_private::Status*) Communication.cpp:61 (liblldb.18.0.0git.dylib:arm64+0x2a9318) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #4 lldb_private::process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1167 (liblldb.18.0.0git.dylib:arm64+0x8ed984) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)

  Previous read of size 4 at 0x00010520c474 by main thread (mutexes: write M2, write M3):
    #0 lldb_private::PipePosix::CanWrite() const PipePosix.cpp:229 (liblldb.18.0.0git.dylib:arm64+0x4145e4) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #1 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:212 (liblldb.18.0.0git.dylib:arm64+0x40a4a8) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #2 lldb_private::Communication::Disconnect(lldb_private::Status*) Communication.cpp:61 (liblldb.18.0.0git.dylib:arm64+0x2a9318) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #3 lldb_private::process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(StringExtractorGDBRemote&, lldb_private::Timeout<std::__1::ratio<1l, 1000000l>>, bool) GDBRemoteCommunication.cpp:373 (liblldb.18.0.0git.dylib:arm64+0x8b9c48) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #4 lldb_private::process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(StringExtractorGDBRemote&, lldb_private::Timeout<std::__1::ratio<1l, 1000000l>>, bool) GDBRemoteCommunication.cpp:243 (liblldb.18.0.0git.dylib:arm64+0x8b9904) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)

Fix this by adding a mutex to PipePosix.

Diff Detail

Event Timeline

augusto2112 created this revision. Aug 10 2023, 2:27 PM
Herald added a project: Restricted Project. · View Herald Transcript Aug 10 2023, 2:27 PM
augusto2112 requested review of this revision. Aug 10 2023, 2:27 PM
bulbazord added inline comments. Aug 10 2023, 3:11 PM
lldb/source/Host/posix/PipePosix.cpp
72–73

I think this lock needs to be acquired at the beginning. Scenario:

1. Thread 1 calls the move assignment operator and performs the base move assignment, then gets unscheduled before acquiring the lock.
2. Thread 2 calls the move assignment operator and performs the base move assignment, acquires the lock, fills in m_fds, and returns.
3. Thread 1 is scheduled again, grabs the lock, and fills in m_fds.

In that interleaving, the destination object has its PipeBase members filled in by the Thread 2 call while m_fds is filled in by the Thread 1 call. If the lock is taken at the beginning instead, one thread may still be surprised that the thing didn't get moved in as expected, but at least the object will be in a consistent state.

I think you also need to lock the mutex of pipe_posix at the same time. Imagine Thread 1 is trying to access something from pipe_posix while Thread 2 is trying to move it into another PipePosix object. (Not sure how likely this scenario is, but I'd rather be safe than sorry).

augusto2112 added inline comments. Aug 11 2023, 10:34 AM
lldb/source/Host/posix/PipePosix.cpp
72–73

> I think you also need to lock the mutex of pipe_posix at the same time. Imagine Thread 1 is trying to access something from pipe_posix while Thread 2 is trying to move it into another PipePosix object. (Not sure how likely this scenario is, but I'd rather be safe than sorry).

Yeah, I thought about this too. Since pipe_posix is an rvalue reference, I was thinking this shouldn't be necessary, but you're right, I'll lock it anyway.

Lock earlier in operator=

JDevlieghere requested changes to this revision. Aug 11 2023, 10:43 AM

Shouldn't we have two mutexes, one for the read FD and one for the write FD? The current approach is overly strict: it prevents concurrent reads and writes, because ReadWithTimeout and Write lock the same mutex.

This revision now requires changes to proceed. Aug 11 2023, 10:43 AM

Update to two shared mutexes

@JDevlieghere I updated this to two shared mutexes because I'm assuming you can have more than one concurrent read and more than one concurrent write. If this is wrong I can go back to two regular mutexes.

> @JDevlieghere I updated this to two shared mutexes because I'm assuming you can have more than one concurrent read and more than one concurrent write. If this is wrong I can go back to two regular mutexes.

According to POSIX [1], that would be unspecified:

The behavior of multiple concurrent reads on the same pipe, FIFO, or terminal device is unspecified.

A regular mutex should be fine.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html

Changed to regular mutex

This revision is now accepted and ready to land. Aug 12 2023, 8:17 AM
This revision was automatically updated to reflect the committed changes.