This is an archive of the discontinued LLVM Phabricator instance.

Protect DWARFCompileUnit::m_die_array by a new mutex
ClosedPublic

Authored by jankratochvil on Nov 26 2017, 5:12 AM.

Details

Summary

I find it is unrelated to DW_TAG_partial_unit. For example LLDB Python Reference says:

consider that lldb can run in a multithreaded environment, and another thread might call the "script" command, changing the value out from under you.

Therefore if BuildAddressRangeTable called ExtractDIEsIfNeeded(false), then another thread started processing data from m_die_array and then the first thread called final ClearDIEs() the second thread would crash, wouldn't it?

DW_TAG_partial_unit for DWZ would get solved even by this patch.

This patch requires D46810.

Some notes of mine:

ExtractDIEsIfNeeded ro/rw, 0->1, 1->n, 0->n
AddUnitDIE rw, called with lock by: DWARFUnit::ExtractDIEsIfNeeded
AppendDIEsWithTag ro, called by ParseCompileUnitFunctions,
                      called by Module::ParseAllDebugSymbols after ParseVariablesForContext using ExtractDIEsIfNeeded(false)
ClearDIEs ro/rw, called by BuildAddressRangeTable, SymbolFileDWARF::Index
GetDIE expand+ro, called by MANY, only using m_die_array after: ExtractDIEsIfNeeded(false)
IndexPrivate ro, called by Index, Index called only after: ExtractDIEsIfNeeded(false)
GetUnitDIEPtrOnly expand+ro, only using m_die_array after: ExtractDIEsIfNeeded(true)
                             called by BuildAddressRangeTable, ParseProducerInfo, GetLanguageType, GetIsOptimized
DIEPtr            expand+ro, only using m_die_array after: ExtractDIEsIfNeeded(false)
                             called by BuildAddressRangeTable, GetFunctionAranges

Diff Detail

Event Timeline

jankratochvil created this revision.Nov 26 2017, 5:12 AM
clayborg requested changes to this revision.Nov 27 2017, 10:39 AM
clayborg added inline comments.
source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp
123 ↗(On Diff #124296)

This isn't initialized and it is captured and used in already_parsed below?

123–137 ↗(On Diff #124296)

I really can't tell what any of the above changed code is doing besides adding the mutex. Seems like you are trying to emulate a condition variable to halt other threads trying to get access to the DIEs. But m_die_array_finished isn't initialized in the constructor or inlined in the header file and initial_die_array_size isn't initialized so I can't really tell what this is supposed to actually do.

128 ↗(On Diff #124296)

initial_die_array_size is used uninitialized when calling already_parsed()

131 ↗(On Diff #124296)

initial_die_array_size is used uninitialized when calling already_parsed()

source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.h
74 ↗(On Diff #124296)

This isn't initialized anywhere.

This revision now requires changes to proceed.Nov 27 2017, 10:39 AM
jankratochvil added inline comments.Feb 20 2018, 7:12 AM
source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp
123 ↗(On Diff #124296)

I am sorry but the missing initialization was just due to a wrong patch splitting. Next patch in the series did initialize it. Anyway it has been reworked now (after dropping whole ExtractDIEsIfNeededKind for its AllDiesButCuDieOnlyForPartialUnits).

123–137 ↗(On Diff #124296)

Condition variable is not required here I think. m_die_array_finished is initialized before any time it gets read.

jankratochvil retitled this revision from DWZ 07/12: Protect DWARFCompileUnit::m_die_array by a new mutex to DWZ 06/11: Protect DWARFCompileUnit::m_die_array by a new mutex.
jankratochvil edited the summary of this revision. (Show Details)
jankratochvil planned changes to this revision.Apr 14 2018, 4:21 AM
jankratochvil retitled this revision from DWZ 06/11: Protect DWARFCompileUnit::m_die_array by a new mutex to DWZ 03/07: Protect DWARFCompileUnit::m_die_array by a new mutex.

It is now reworked without FileOffset and the remapping to unique DIE offsets for each DW_TAG_partial_unit inclusion by DW_TAG_imported_unit, as discussed in: https://lists.llvm.org/pipermail/lldb-commits/Week-of-Mon-20180409/040324.html

clayborg requested changes to this revision.Apr 30 2018, 10:19 AM
clayborg added inline comments.
source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
56

Remove the "m_" prefix on this local variable as that indicates it is a member variable.

Might be better to use CleanUp utility from "lldb/Utility/CleanUp.h" instead of a shared pointer to void? Took me a bit to figure out what this was doing. If this is a very accepted C++ way to run a function that is scoped that is fine, but it seems a little unclear what this was doing initially.

This revision now requires changes to proceed.Apr 30 2018, 10:19 AM

Removed the m_die_array_size_atomic_set m_ prefix, used CleanUp.

jankratochvil marked an inline comment as done.Apr 30 2018, 10:46 AM
jankratochvil added inline comments.
source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
56

Done, done. I have googled the shared_ptr cleanup for this purpose at StackOverflow. Updated, thanks.

clayborg requested changes to this revision.Apr 30 2018, 10:52 AM
clayborg added inline comments.
source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
333–334

You can't really play with the m_die_array_size_atomic without locking the mutex, which really means m_die_array_size_atomic is not very useful. Seems like there is still a race condition:

  • thread 1 calls this function and executes "m_die_array.swap(tmp_array);" but not " m_die_array_size_atomic = 0;"
  • thread 2 calls DWARFUnit::ExtractDIEsIfNeeded() and calls "already_parsed()" which returns true because m_die_array_size_atomic is still some valid value
  • thread 1 clears the DIE array and sets m_die_array_size_atomic to zero
  • thread 2 has no DIEs when it tries to use them
This revision now requires changes to proceed.Apr 30 2018, 10:52 AM
jankratochvil marked an inline comment as done.
clayborg accepted this revision.Apr 30 2018, 2:31 PM

Much better.

This revision is now accepted and ready to land.Apr 30 2018, 2:31 PM
jankratochvil marked an inline comment as done.Apr 30 2018, 2:31 PM
jankratochvil added inline comments.
source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
333–334

I see I did screw up even that, sorry for that. I believe there still could be an atomic_t speedup but given you are fine with the performance of a mutex I have implemented it that way.

This revision was automatically updated to reflect the committed changes.
jankratochvil marked an inline comment as done.
labath added inline comments.May 1 2018, 1:19 AM
lldb/trunk/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
317 ↗(On Diff #144641)

You're accessing m_die_array without holding a lock.

Revert "Protect DWARFCompileUnit::m_die_array by a new mutex"
Pavel Labath found this patch is incomplete and racy. I think there needs to be some more mutexes even before considering DW_TAG_partial_unit.
This reverts commit 331229 which was: https://reviews.llvm.org/D40470
GIT commit cc242f2a5e10a8e6f8e11a268d376a76324c1d23

jankratochvil reopened this revision.May 13 2018, 12:39 PM
jankratochvil updated this revision to Diff 146523.
jankratochvil retitled this revision from DWZ 03/07: Protect DWARFCompileUnit::m_die_array by a new mutex to Protect DWARFCompileUnit::m_die_array by a new mutex.
jankratochvil edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.May 13 2018, 12:40 PM
jankratochvil edited the summary of this revision. (Show Details)May 13 2018, 12:44 PM
jankratochvil edited the summary of this revision. (Show Details)

I think we should expose the llvm::sys::RWMutex and lets clients like BuildAddressRangeTable and SymbolFileDWARF::Index use it and get rid of m_die_array_usecount all together.

source/Plugins/SymbolFile/DWARF/DWARFUnit.h
149

Why don't we expose the llvm::sys::RWMutex for BuildAddressRangeTable and SymbolFileDWARF::Index? Then they can just grab the read lock?

jankratochvil marked an inline comment as done.May 31 2018, 3:01 AM
jankratochvil added inline comments.
source/Plugins/SymbolFile/DWARF/DWARFUnit.h
149

Wouldn't you prefer an RAII lock (DWARFUnit::ScopedExtractDIEs) instead of the mutex exposure?

There are two mutexes (m_die_array_mutex and m_die_array_scoped_mutex), I tried it using a single mutex (as you IMO suggest) but then one needs to downgrade exclusive->shared lock there which requires a retry label there so I did not like the single lock (mutex2.patch).

jankratochvil marked an inline comment as done.

RAII lock DWARFUnit::ScopedExtractDIEs with mutexes m_die_array_mutex, m_die_array_scoped_mutex and m_first_die_mutex.

Provided variants with a new RAII lock guard vs. unchanged caller bool clear_dies variable. And also 1 vs. 2 mutexes for m_die_array. Please choose one, I do not mind any.

clayborg accepted this revision.Jun 4 2018, 9:16 AM
This revision is now accepted and ready to land.Jun 4 2018, 9:16 AM
This revision was automatically updated to reflect the committed changes.

FYI I have reordered m_cu->m_die_array_scoped_mutex.unlock_shared(); in DWARFUnit::ScopedExtractDIEs::~ScopedExtractDIEs() during commit (it was written this new way in some other of the 4 offered variants).