Bug fix for PR37808. The regression test is a reduced version of the original reproducer attached to the bug report. As stated in the report, the problem was that InsertedPHIs was keeping dangling pointers to deleted MemoryPhis. MemoryPhis are created eagerly and sometimes get zapped shortly afterwards. I've used WeakVH instead of an expensive removal operation from the active workset.
Diff Detail
- Repository: rL LLVM

Event Timeline
lib/Analysis/MemorySSAUpdater.cpp
208 ↗ (On Diff #152090): remove on a SmallSetVector is linear time, which might be a problem here.
lib/Analysis/MemorySSAUpdater.cpp
208 ↗ (On Diff #152090): On a SmallVector we'd have to do a linear lookup and then move the elements around, so it should be the same in terms of complexity. Moreover, I thought it's good to have a guarantee that InsertedPHIs does not contain duplicates. What is the suggestion here?
lib/Analysis/MemorySSAUpdater.cpp
208 ↗ (On Diff #152090): It's possible to construct a set with both constant-time removal and iteration in insertion order: essentially, make a DenseMap<AssertingVH<MemoryPhi>, size_t> where the value is an index into a SmallVector<Optional<AssertingVH<MemoryPhi>>>, or something like that. Or maybe there's some simpler way to handle this I'm not thinking of.
lib/Analysis/MemorySSAUpdater.cpp
208 ↗ (On Diff #152090): I am not convinced it's worth maintaining two data structures to avoid a linear lookup, which costs nothing on a SmallVector. We do linear lookups for NonOptPhis.erase() anyway. I have a patch in case you are still interested, but I am not in favour of it: in order for the DenseMap to stay up to date, we can't actually erase elements from InsertedPHIs but must mark them as stale instead, which complicates other parts of the algorithm like insertDef() and fixupDefs().
Thanks for this!
lib/Analysis/MemorySSAUpdater.cpp
208 ↗ (On Diff #152090): I'm surprised that we're removing anything here at all, given that we rely on stable indices in InsertedPhis on lines 299-305. Would keeping a SmallVector<WeakVH, N> (with appropriate null checks/conversions where necessary) work just as well? (I don't see how InsertedPHIs can get a duplicate, anyway. AFAICT, when we're inserting, either:
Am I missing something?)
> We do linear lookups for NonOptPhis.erase() anyway
SmallSet insertion and removal are log(N) for large N.
https://reviews.llvm.org/D48504 is a (almost) drop-in replacement for SetVector, with a constant time remove().
lib/Analysis/MemorySSAUpdater.cpp
304 ↗ (On Diff #152090): @efriedma thanks a lot for your effort on this :) It seems we need a random access iterator here:
    FixupList.append(InsertedPHIs.at(StartingPHISize), InsertedPHIs.end());
otherwise the following assertion fails:
    static_assert(IsRandomAccess, "The '-' operator is only defined for random access iterators.");
lib/Analysis/MemorySSAUpdater.cpp
208 ↗ (On Diff #152090): Just bumping this comment, since I don't think that the 'stable indices' bit is addressed by the OrderedSet ADT, and I don't see a clear path for addressing that (nor prose about why it's not a problem).
lib/Analysis/MemorySSAUpdater.cpp
208 ↗ (On Diff #152090): fixupDefs() can potentially insert/remove MemoryPhis. As long as it only removes newly created Phis (not existing ones in the ADT), the order of elements in the ADT shouldn't matter. Now looking at line 304, it would make more sense if it was:
    FixupList.append(InsertedPHIs.begin() + StartingPHISize, InsertedPHIs.end());
Changes to the prior revision:
- I've used std::set for InsertedPHIs to avoid linear-time lookups upon deletion.
- I've refactored insertDef() so that it no longer relies on insertion order iteration over InsertedPHIs.
I'm still in favor of just replacing SmallVector<MemoryPhi *, 8> InsertedPHIs with SmallVector<WeakVH, 8> InsertedPHIs, since:
- I think it meets all of our needs here,
- it's a very straightforward change (no need to deal with removal, new containers, etc.), and
- I don't see how we can ever get a duplicate in InsertedPHIs to start with.
In any case, a few questions/nits about the current patch.
include/llvm/Analysis/MemorySSAUpdater.h
64 ↗ (On Diff #154838): Please add a comment saying that we're using a std::set because we need insertion to not invalidate iterators, since we mutate this set while iterating over it. (I assume that's why we're using a set here, anyway.)
64 ↗ (On Diff #154838): Also, is it possible that this would cause us to visit Phis in a nondeterministic order (thus creating new Phis in a nondeterministic order, thus having unstable numbering for MemorySSA's print output)? If so, a straightforward fix would be to make a comparator that uses the Phi's ID.
lib/Analysis/MemorySSAUpdater.cpp
297 ↗ (On Diff #154838): Nit: If we only put Phis in here, please make it a set of MemoryPhis, not MemoryAccesses.
298 ↗ (On Diff #154838): Nit: Cnt
299 ↗ (On Diff #154838): In general, is there a "we'll never remove a MP from InsertedPhis if we visit it here" guarantee that I'm missing? If so, please add a comment about why we have that guarantee, since this loop appears to depend on it in a few ways.
305 ↗ (On Diff #154838): This is quadratic WRT the size of InsertedPHIs, no?
208 ↗ (On Diff #152090): Sure, but are you 100% sure that it can't remove a phi that's not newly-created? :) I'd be willing to buy it, but haven't reasoned about this case enough to know for certain yet.
Replacing MemoryPhi * with WeakVH won't work out of the box; it requires typecasts and null checks upon iteration. It also adds an additional overhead (similar to AssertingVH, though). In general, the idea of keeping null references in the ADT instead of actually deleting them seems like a workaround to me. I don't see why we should be afraid of making non-trivial changes in the code if there's a fundamental bug. On that note, I still believe line 303 of the original revision is wrong, since we push_back() when adding new elements:
    FixupList.append(InsertedPHIs.end() - StartingPHISize, InsertedPHIs.end());
Shouldn't it be like this?
    FixupList.append(InsertedPHIs.begin() + StartingPHISize, InsertedPHIs.end());
Answering some inline comments:
include/llvm/Analysis/MemorySSAUpdater.h
64 ↗ (On Diff #154838): We use std::set to avoid linear-time removal from SmallVector, a concern raised by @efriedma. This was the reason I created https://reviews.llvm.org/D49030. However, people from the community were tentative about adding a new ADT that is very similar to existing ones (MapVector, SetVector), although those are not a good fit for InsertedPHIs.
64 ↗ (On Diff #154838): Sounds like a good idea.
lib/Analysis/MemorySSAUpdater.cpp
297 ↗ (On Diff #154838): Good catch.
299 ↗ (On Diff #154838): No, we don't have such a guarantee; that's why we have to break the innermost loop and re-iterate.
305 ↗ (On Diff #154838): Possibly.
208 ↗ (On Diff #152090): No, I am not 100% sure we won't remove a Phi which is not newly-created. But this is no longer an issue with the latest revision of the patch, since we don't rely on insertion order iteration. However, I believe we need AssertingVH<MemoryPhi> to avoid considering a newly created Phi, that happens to get allocated at the same address as a deleted one, as Seen. I will experiment with a new patch and update my revision shortly.
> It also adds an additional overhead
Any targeted fix that makes us track the lifetime of these more closely is going to add additional overhead. If anything, I imagine that adding a DenseMap insertion/deletion per Phi would be cheaper than taking n^2 time to walk said Phis.
> In general, the idea of keeping null references in the ADT instead of actually deleting them seems like a workaround to me. I don't see why we should be afraid of making non trivial changes in the code if there's a fundamental bug.
I think the fundamental issue is that we create these non-minimal phi structures just to zap them shortly afterward. :) From the looks of this, though, that's a limitation imposed by the algorithm we're using. So, unless we want to try and improve that algorithm, or unless our goal is to swap to another algorithm without this limitation, anything we do here is going to boil down to a "workaround" of that quirk.
A Phi is being deleted and we need later iterations of this algorithm to take that into account. I don't understand why eagerly removing a pointer from our worklist is strictly less hacky than using WeakVH for this purpose.
If you still feel strongly that WeakVH isn't the right way, I'm happy to accept an eager removal approach if it doesn't substantially lose out in overall clarity, simplicity, and (algorithmic, within reason) performance over just using WeakVH. I don't think this patch is at that point yet, and I think we'd need a custom SetVector-but-with-constant-time-bounds, as was proposed, to get it there. (I also suspect that such a patch would end up losing out in perf/memory over WeakVH in the constant factor, but I'm sure we have lower hanging fruits than that in this code, if the speed/memory consumption here are becoming problematic.)
> I don't see why we should be afraid of making non-trivial changes in the code
I'm fine with making substantial changes to code if doing so is called for. My concern is more that it's not called for here. :)
> On that note, I still believe line 303 of the original revision is wrong
Agreed
include/llvm/Analysis/MemorySSAUpdater.h
64 ↗ (On Diff #154838): Please prefer DenseSet (or one of its variants) if all that's needed is fast set-like operations. std::set provides a number of guarantees above and beyond DenseSet (iterator validity, pointer stability, ...), and so generally ends up being slower. That said, if we go with using IDs as keys, we'll need the ordering guarantee that std::set provides, so please add a comment noting that we're using a set for consistent Phi ordering.
lib/Analysis/MemorySSAUpdater.cpp
103 ↗ (On Diff #154838): Please also assert that this insertion happened, so it's clear that deduplication is never a goal here.
299 ↗ (On Diff #154838): I don't think that works in the general case, since our Seen set will retain dangling pointers. If we delete a Phi and new up one at the same pointer, we'll fail to visit it here. It's admittedly a rare case, but certainly a possibility.
305 ↗ (On Diff #154838): With the std::set comparator change, I'd say "definitely". :)
Ok, I've convinced myself that using WeakVH is the most straightforward fix for now. We shouldn't be spending more time on this.