The following problem was occuring:
- broadcaster B had two listeners: L1 and L2 (thread T1)
- (T1) B has started to broadcast an event, it has locked a shared_ptr to L1 (in ListenerIterator())
- on another thread T2 the penultimate reference to L1 was destroyed (the transient object in B is now the last reference)
- (T2) the last reference to L2 was destroyed as well
- (T1) B has finished broadcasting the event to L1 and destroyed the last shared_ptr
- (T1) this triggered the destructor, which called into B->RemoveListener()
- (T1) all pointers in the m_listeners list were now stale, so RemoveListener emptied the list
- (T1) Eventually control returned to the ListenerIterator() for doing broadcasting, which was still in the middle of iterating through the list
- (T1) Only now, it was holding onto a dangling iterator. BOOM.
I fix this issue by making sure nothing can interfere with the
iterate-and-remove-expired-pointers loop, by moving this logic into a single function, which
first locks (or clears) the whole list and then returns the list of valid and locked Listeners
for further processing. Instead of std::list I use an llvm::SmallVector which should hopefully
offset the fact that we create a copy of the list for the common case where we have only a few
listeners (no heap allocations).
A slight difference in behaviour is that now RemoveListener does not remove an element from the
list -- it only sets it's mask to 0, which means it will be removed during the next iteration of
GetListeners(). This is purely an implementation detail and it should not be externally
I was not able to reproduce this bug reliably without inserting sleep statements into the code,
so I do not add a test for it. Instead, I add some unit tests for the functions that I do modify.